Modify my regex so that pattern search should not be case sensitive - php

I have the following PHP code that removes any whole word that matches the pattern:
$patterns = ["re", "get", "ER"];
$string = "You are definitely getting better today";
$alternations = implode('|', $patterns);
$re = '(?!(?<=\s)(?:'.$alternations.')(?=\s))\S*(?:'.$alternations.')\S*';
$string = preg_replace('#'.$re.'#', '', $string);
$string = preg_replace('#\h{2,}#', ' ', $string);
echo $string;
I want two modifications:
The pattern search should not be case sensitive, e.g. the pattern ER must remove better in $string.
If the removed word in $string has a line break before or after it, only one line break should be removed.
If $string is
You are definitely getting
better
today
Output must be
You definitely
today

You may use
$patterns = ["re", "get", "ER"];
$string = "You are definitely getting\nbetter\ntoday";
$alternations = implode('|', $patterns);
$re = '\R?(?!(?<=\s)(?:'.$alternations.')(?=\s))\S*(?:'.$alternations.')\S*';
$string = preg_replace('#'.$re.'#i', '', $string);
$string = preg_replace('#\h{2,}#', ' ', $string);
echo $string;
See the PHP demo.
While the i modifier makes the regex matching case insensitive, another, less obvious thing here is that you need to add an optional line break pattern.
That line break can be matched in various ways, but in PHP PCRE, you may easily match it with the \R construct.
By adding a ? quantifier after it, you make it match 1 or 0 times, i.e. make it optional, so that the whole pattern can still match at the start of the string.
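As a rough sketch of the approach above, the pattern can be wrapped in a small function. The function name removeWordsContaining and the preg_quote() escaping are assumptions added for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: removes every word containing one of $patterns,
// case-insensitively, together with at most one preceding line break.
function removeWordsContaining(array $patterns, string $string): string
{
    // Assumption: keywords are literal text, so escape regex metacharacters.
    $alternations = implode('|', array_map(
        fn (string $p): string => preg_quote($p, '#'),
        $patterns
    ));

    // \R? optionally consumes one line break before the removed word.
    $re = '\R?(?!(?<=\s)(?:' . $alternations . ')(?=\s))\S*(?:' . $alternations . ')\S*';

    $string = preg_replace('#' . $re . '#i', '', $string);

    // Collapse runs of horizontal whitespace left behind by the removals.
    return preg_replace('#\h{2,}#', ' ', $string);
}

$result = removeWordsContaining(
    ["re", "get", "ER"],
    "You are definitely getting\nbetter\ntoday"
);
var_dump($result);
```

Note that a single horizontal space can remain in front of a line break where a word was removed, because only runs of two or more spaces are collapsed.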


RegEx for adding a space in a special pattern

Quick note: I know markdown parsers don't care about this issue. It's for the sake of visual consistency in the md file and also experimentation.
Sample:
# this
##that
###or this other
Goal: read each line and, if a markdown header does not have a space after the pound/hashtag sign, add one so that it looks like:
# this
## that
### or this other
My non-regex attempt:
function inelegantFunction (string $string){
    $array = explode('#',$string);
    $num = count($array);
    $text = end($array);
    return str_repeat('#', $num-1)." ".$text;
}
echo inelegantFunction("###or this other");
// returns ### or this other
This works, but it has no mechanism to match the unlikely case of seven '#'.
Regardless of efficacy, I would like to figure out how to do this with regex in php (and perhaps javascript if that matters).
Try to match (?m)^#++\K\S, which matches the first non-whitespace character directly following one or more number signs at the start of a line, then replace it with a space followed by $0 in your function:
return preg_replace('~(?m)^#++\K\S~', ' $0', $string);
See live demo here
To limit the number of #s to six use:
(?m)^(?!#{7})#++\K\S
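A minimal sketch of how the six-sign variant could be used on a multi-line string. The function name addHeaderSpace and the sample input are made up for illustration.

```php
<?php
// Hypothetical helper: inserts a space after the leading run of 1 to 6 "#"
// when the run is immediately followed by a non-whitespace character.
function addHeaderSpace(string $markdown): string
{
    // (?m)      ^ matches at the start of every line
    // (?!#{7})  skip lines that start with seven or more "#"
    // #++       possessively consume the leading "#" run
    // \K        drop the "#" run from the reported match
    // \S        the character that needs a space in front of it
    return preg_replace('~(?m)^(?!#{7})#++\K\S~', ' $0', $markdown);
}

$input  = "# this\n##that\n###or this other\n#######seven";
$output = addHeaderSpace($input);
echo $output;
```

Lines that already have a space after the number signs are left alone, because \S cannot match the space and the possessive #++ does not give any characters back.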
I'm guessing that a simple expression with a character class on the right-hand side might work here, maybe:
(#)([a-z])
If other characters can follow the #, we can simply add them to [a-z].
Demo
Test
$re = '/(#)([a-z])/m';
$str = '#this
##that
###that
### or this other';
$subst = '$1 $2';
$result = preg_replace($re, $subst, $str);
echo "The result of the substitution is ".$result;

Split regex in to two regex: whole words only & words with substring matches only

I have the code below, which removes whole words that contain any of the patterns:
$patterns = ["are", "finite", "get", "er"];
$string = "You are definitely getting better today";
$re = '\S*('.implode('|', $patterns).')\S*';
$string = preg_replace('#'.$re.'#', '', $string);
$string = preg_replace('#\h{2,}#', ' ', $string);
echo $string;
the output of the above code is
You today
I want to split this code into two functions: the first should remove only words that exactly match a pattern, and the second should remove only words that contain a pattern without being an exact match.
I expect this output from the first function, which removes only whole words:
You definitely getting better today (**are** is removed)
and this output from the second function, which removes whole words that contain a pattern:
You are today (**definitely getting better** are removed)
The first part is basic: Only match whole keywords (actually, you can find dozens of Q&As like that, e.g this)
\b(?:are|finite|get|er)\b
Which can be applied to your code like this: $re = '\b('.implode('|', $patterns).')\b';
The second part is a bit more involved: while you keep expanding substring matches to match the entire word, you want to exclude words that are themselves whole keywords.
We can use a negative lookahead to achieve this, like this:
(?!\b(?:are|finite|get|er)\b)\S*(?:are|finite|get|er)\S*
Demo,
Sample Code:
$patterns = ["are", "finite", "get", "er"];
$string = "You are definitely getting better today";
$alternations = implode('|', $patterns);
$re = '(?!\b(?:'.$alternations.')\b)\S*(?:'.$alternations.')\S*';
$string = preg_replace('#'.$re.'#', '', $string);
If the \b does not work for you and you'd like to go with space as word boundary use lookarounds:
(?<=\s)(?:are|finite|get|er)(?=\s)
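The two patterns above could be packaged as the two functions the question asks for. This is only a sketch: the names buildAlternation, removeExactWords and removePartialMatches are hypothetical, and the preg_quote() escaping is an added assumption that the keywords are literal text.

```php
<?php
// Hypothetical helper: joins the keywords into a regex alternation.
function buildAlternation(array $patterns): string
{
    return implode('|', array_map(
        fn (string $p): string => preg_quote($p, '#'),
        $patterns
    ));
}

// Removes only words that are exactly one of the keywords.
function removeExactWords(array $patterns, string $string): string
{
    $alt = buildAlternation($patterns);
    $string = preg_replace('#\b(?:' . $alt . ')\b#', '', $string);
    return preg_replace('#\h{2,}#', ' ', $string);
}

// Removes words that contain a keyword but are not exactly a keyword.
function removePartialMatches(array $patterns, string $string): string
{
    $alt = buildAlternation($patterns);
    $re = '(?!\b(?:' . $alt . ')\b)\S*(?:' . $alt . ')\S*';
    $string = preg_replace('#' . $re . '#', '', $string);
    return preg_replace('#\h{2,}#', ' ', $string);
}

$patterns = ["are", "finite", "get", "er"];
$string   = "You are definitely getting better today";

$exact   = removeExactWords($patterns, $string);
$partial = removePartialMatches($patterns, $string);
echo $exact, "\n", $partial, "\n";
```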
Sample Code (updated) case 1.

How to not perform preg_replace if subject starts with quote

I'm trying to convert plain links to HTML links using preg_replace. However it's replacing links that are already converted.
To combat this I'd like it to ignore the replacement if the link starts with a quote.
I think a positive lookahead may be needed but everything I've tried hasn't worked.
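$string = '<a href="http://www.example.com">test</a> http://www.example.com';
$string = preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $string);
var_dump($string);
The above outputs:
<a href="<a href="http://www.example.com">http://www.example.com</a>">test</a> <a href="http://www.example.com">http://www.example.com</a>
When it should output:
<a href="http://www.example.com">test</a> <a href="http://www.example.com">http://www.example.com</a>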
You might get along with lookarounds.
Lookarounds are zero-width assertions that make sure to match/not to match anything immediately around the string in question. They do not consume any characters.
That being said, a negative lookbehind might be what you need in your situation:
(?<![">])\bhttps?://\S+\b
In PHP this would be:
<?php
$string = 'I want to be transformed to a proper link: http://www.google.com ';
$string .= 'But please leave me alone ';
$string .= '(https://www.google.com).';
$regex = '~ # delimiter
(?<![">]) # a neg. lookbehind
https?://\S+ # http:// or https:// followed by not a whitespace
\b # a word boundary
~x'; # verbose to enable this explanation.
$string = preg_replace($regex, "<a href='$0'>$0</a>", $string);
echo $string;
?>
See a demo on ideone.com. However, maybe a parser is more appropriate.
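To make the effect of the negative lookbehind easier to see, here is a small sketch with a made-up input that contains one plain URL and one URL that is already wrapped in an anchor. The sample string and variable names are assumptions for illustration only.

```php
<?php
// One bare URL, and one that already sits inside an <a> element.
$string = 'Plain: http://www.example.com and linked: '
        . '<a href="http://www.example.com">http://www.example.com</a>';

// (?<![">]) rejects a URL that directly follows a double quote (href="...)
// or a closing angle bracket (<a ...>URL</a>).
$regex = '~(?<![">])\bhttps?://\S+\b~';

$result = preg_replace($regex, '<a href="$0">$0</a>', $string);
echo $result;
```

Only the bare URL is wrapped. This heuristic depends on the URL directly following the quote or the bracket, so markup with extra whitespace would still need a parser-based approach.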
Since you can use Arrays in preg_replace, this might be convenient to use depending on what you want to achieve:
<?php
$string = 'test http://www.example.com';
$rx = array("&(<a.+https?:\/\/[\w]+[^ \,\"\n\r\t<]*>)(.*)(<\/a\>)&si", "&(\s){1,}(https?:\/\/[\w]+[^ \,\"\n\r\t<]*)&");
$rp = array("$1$2$3", "$2");
$string = preg_replace($rx,$rp, $string);
var_dump($string);
// DUMPS:
// 'testhttp://www.example.com'
The Idea
You can split your string at the already existing anchors, and only parse the pieces in between.
The Code
$input = '<a href="http://www.example.com">test</a> http://www.example.com';
// Split the string at existing anchors
// PREG_SPLIT_DELIM_CAPTURE flag includes the delimiters in the results set
$parts = preg_split('/(<a.*?>.*?<\/a>)/is', $input, -1, PREG_SPLIT_DELIM_CAPTURE);
// Use array_map to parse each piece, and then join all pieces together
$output = join(array_map(function ($key, $part) {
    // Because the delimiters are included in the result set,
    // every $part with an odd key is an existing anchor: leave it untouched
    // and only convert the links in the pieces in between.
    return $key % 2
        ? $part
        : preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $part);
}, array_keys($parts), $parts));

Preg_replace exact strings only

$string = "recruitment offer human resource IT for before of";
$string = str_replace(array('it', 'for', 'of'), '-', $string);
I want to remove some unnecessary words from the string (in this example, I want to replace it, for, and of with -), but I don't want other words to be affected (the above example will also affect the words recruITment, OFfer and beFORe).
Desired result: recruitment offer human resource - - before -
Note: I need a solution that is not limited to these particular words.
Use preg_replace() with \b, the word boundary assertion:
$string = preg_replace( '#\b(it|for|of)\b#i', '-', $string);
It's better to use lookarounds instead of word boundaries, because \b(it|for|of)\b would also match the it in a string like :it:. I think this is not what you want.
$string = preg_replace( '#(?<=\s|^)(?:it|for|of)(?=\s|$)#i', '-', $string);
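The difference between the two approaches can be seen with a short sketch. The extra :it: token appended to the sample string is an addition made here purely to show the contrast.

```php
<?php
// Sample sentence from the question plus a ":it:" token (added for illustration).
$string = "recruitment offer human resource IT for before of :it:";

// Word boundaries: ":" is a non-word character, so "it" inside ":it:" matches.
$withBoundary = preg_replace('#\b(?:it|for|of)\b#i', '-', $string);

// Whitespace lookarounds: the word must be delimited by whitespace
// or by the start/end of the string, so ":it:" is left alone.
$withLookaround = preg_replace('#(?<=\s|^)(?:it|for|of)(?=\s|$)#i', '-', $string);

echo $withBoundary, "\n", $withLookaround, "\n";
```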
DEMO

PHP trim and return a string from its right

I'm trying to take a string that's output from MySql like this (MySql outputs X characters):
$str = "Buddy you're a boy make a big noise Playin in the stre";
and trying to start from the right side, trim whatever is there up till the first space. Sounded simple when I got down to it, but now, it has my brain and fingers in knots.
The output I'm trying to achieve is simple:
$str = "Buddy you're a boy make a big noise Playin in the";
Notice, that characters starting from the right, till the first space, are removed.
Can you help?
My Fiddle
$str = 'Buddy you\'re a boy make a big noise Playin in the stre';
//echo rtrim($str,' ');
It's a useful idiom to remember on its own: to remove everything after the last occurrence of a specific character (including that character itself), use the following:
$trimmed = substr($str, 0, strrpos($str, ' '));
... where ' ' is that special character.
Demo
If you don't know, however, whether or not the character is present, you should check the result of strrpos first:
$last_space_index = strrpos($str, ' ');
$trimmed = $last_space_index !== false
? substr($str, 0, $last_space_index)
: $str;
And if more than one such character can precede the last word (e.g. several consecutive spaces before test in a 'hello there test' line), just rtrim the result:
$trimmed = rtrim(substr($str, 0, strrpos($str, ' ')), ' ');
In this case, however, a regex-based solution looks more appropriate:
$trimmed = preg_replace('/ +[^ ]*$/', '', $str);
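The strrpos-based steps above (guarding against a missing space, then trimming leftover spaces) could be combined as in the sketch below. The function name dropLastWord is hypothetical.

```php
<?php
// Hypothetical helper: removes the last space-delimited fragment of $str.
function dropLastWord(string $str): string
{
    $lastSpace = strrpos($str, ' ');

    // No space at all: there is nothing to cut, so return the input unchanged.
    if ($lastSpace === false) {
        return $str;
    }

    // Cut at the last space, then drop any further trailing spaces.
    return rtrim(substr($str, 0, $lastSpace), ' ');
}

$trimmed = dropLastWord("Buddy you're a boy make a big noise Playin in the stre");
echo $trimmed;
```

The explicit === false check matters because strrpos() can legitimately return 0 when the only space is the first character.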
I think your best option would be a regex replace:
preg_replace('/\s+\S*$/', '', $str);
which outputs Buddy you're a boy make a big noise Playin in the
And the Fiddle
It's probably easier to do it with regex, but I'm sooo bad with that! You should try this:
// Get all the words in an array
$strArray = explode(" ", $str);
// Remove the last word.
array_pop($strArray);
// Get it back into a sentence
$newString = implode(" ", $strArray);
There's a hundred ways to do this, here are some options:
array_pop'ing the last word off an array we create from explode:
$arr = explode(" ", $str);
array_pop($arr); // removes the last word from $arr; the returned word is not needed
$result = implode(" ", $arr);
Using regular expressions:
$result = preg_replace('/\s+\S*$/', '', $str);
and using strrpos and substr:
$spacePos = strrpos($str, ' ');
$result = substr($str, 0, $spacePos);
In MySQL, use
left(field, length)
to output only the first length characters;
right(field, length) has the opposite effect.
Otherwise, use substr($string, 0, $length) or a regex in PHP.
As a matter of regex performance comparison, the regex engine can move faster through the string when it can perform greedy matching with minimal backtracking.
/ +[^ ]*$/ uses 68 steps. (#raina77ow)
/(?:[^ ]+\K )+.*/ uses 56 steps. (#mickmackusa)
/(?:\K [^ ]*)+/ uses 48 steps. (#mickmackusa)
\s+\S*$ uses 34 steps. (#ChrisBornhoft and #RyanKempt)
/.*\K .*/ uses just 15 steps. (#mickmackusa)
Based on these comparisons, I recommend greedily matching any characters, then using \K to restart the full-string match just before the last occurring space, then matching zero or more characters until the end of the string.
Code: (Demo)
$string = "Buddy you're a boy make a big noise Playin in the stre";
var_export(
preg_replace('/.*\K .*/', '', $string)
);
Output:
'Buddy you\'re a boy make a big noise Playin in the'
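To illustrate what \K contributes, the sketch below compares it with an equivalent capture-group version and checks a string that has no space. The variable names are made up for this example.

```php
<?php
$string = "Buddy you're a boy make a big noise Playin in the stre";

// \K discards everything matched so far, so only " stre" is replaced.
$withK = preg_replace('/.*\K .*/', '', $string);

// Same idea without \K: capture the leading part and put it back.
$withGroup = preg_replace('/(.*) .*/', '$1', $string);

// Without any space the pattern cannot match and the input is returned as is.
$noSpace = preg_replace('/.*\K .*/', '', 'single');

var_export([$withK, $withGroup, $noSpace]);
```

Both variants rely on the greedy .* running to the end of the line and backtracking to the last space. Because . does not match a line break by default, a multi-line input would be processed line by line.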
