regex string substitution upper and lower case - php

Given the following simple function (for a PHP page), I am trying to match all the occurrences of the word $marker in a long text string. I need to highlight its occurrences.
The function works, but it has two problems:
1) it fails to match uppercase occurrences of $marker
2) it also matches partial occurrences: if $marker is "art", the function as written also matches "artistic" and "cart".
How can I correct these two inconveniences?
function highlightWords($string, $marker){
$string = str_replace($marker, "<span class='highlight success'>".$marker."</span>", $string);
return $string;
}

To solve the two problems you can use preg_replace() with a regular expression. Just add the i flag for case-insensitive search and add \b word boundaries around your search term, so it can't be part of another word, e.g.
function highlightWords($string, $marker){
$string = preg_replace("/(\b" . preg_quote($marker, "/") . "\b)/i", "<span class='highlight success'>$1</span>", $string);
return $string;
}
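As a quick check of the behaviour described above, here is a minimal usage sketch. The sample sentence is made up for illustration, and `$0` (the whole match) is used in place of the capture group, which should be equivalent here.

```php
<?php
// Same idea as the answer: case-insensitive (i flag), whole words only (\b).
function highlightWords(string $string, string $marker): string
{
    $pattern = '/\b' . preg_quote($marker, '/') . '\b/i';
    // $0 is the matched text, so the original casing ("Art" vs "art") is kept.
    return preg_replace($pattern, "<span class='highlight success'>$0</span>", $string);
}

// Hypothetical sample input.
echo highlightWords('Art is artistic; the art cart.', 'art');
// "Art" and "art" are wrapped; "artistic" and "cart" are left alone.
```

Because the replacement reuses the matched text rather than $marker, the highlighted word keeps whatever casing it had in the source string.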

Related

PHP Regex match one letter and one number

I'm trying to remove any occurrence of a single letter followed by a single digit in a string.
$word = 'AB001J1'; //or ZR010F2 or ZQ10B5
echo str_replace('/^(?=.*\pL)(?=.*\p{Nd})/', '', $word);
Trying to get the result AB001 //or ZR010 or ZQ10
A regex splitting approach works well here:
$word = 'AB001J1';
$output = preg_split("/(?<=[0-9])(?=[A-Z])/", $word, 2)[0];
echo $output; // AB001
The above strategy is to split the input string at any point in between a digit and uppercase letter (in that order). This separates the various terms, and we retain only the first one.
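The splitting strategy can be sketched against all three inputs from the question. The helper name firstSegment is hypothetical, introduced only for this illustration.

```php
<?php
// Keep everything before the first digit-to-uppercase-letter transition.
function firstSegment(string $word): string
{
    // (?<=[0-9]) : the previous character is a digit
    // (?=[A-Z])  : the next character is an uppercase letter
    // Limit 2 stops after the first split point.
    return preg_split('/(?<=[0-9])(?=[A-Z])/', $word, 2)[0];
}

foreach (['AB001J1', 'ZR010F2', 'ZQ10B5'] as $word) {
    echo firstSegment($word), "\n"; // AB001, ZR010, ZQ10
}
```

If the input has no digit-to-letter transition, preg_split() returns the whole string as the only element, so the function hands the input back unchanged.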

preg_match how to return matches?

According to PHP manual "If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on."
How can I return a value from a string when I know only the first few characters?
The string is dynamic and its contents will always change, but the first four characters will always be the same.
For example how could I return "Car" from this string "TmpsCar". The string will always have "Tmps" followed by something else.
From what I understand I can return using something like this
preg_match('/(Tmps+)/', $fieldName, $matches);
echo($matches[1]);
Should return "Car".
Your regex is flawed: (Tmps+) captures the literal "Tmps" (with one or more trailing "s"), not the text that follows it. Use this:
preg_match('/^Tmps(.+)$/', $fieldName, $matches);
echo($matches[1]);
$matches = []; // Initialize the matches array first
if (preg_match('/^Tmps(.+)/', $fieldName, $matches)) {
// if the regex matched the input string, echo the first captured group
echo($matches[1]);
}
Note that this task could easily be accomplished without regex at all (with better performance): See startsWith() and endsWith() functions in PHP.
"The string will always have "Tmps" followed by something else."
You don't need a regular expression, in that case.
$result = substr($fieldName, 4);
If the first four characters are always the same, just take the portion of the string after that.
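If the prefix is not guaranteed after all, the substr() approach can be guarded with a prefix check. This is only a sketch; the helper name stripPrefix and the choice to return null on a missing prefix are assumptions, not part of the original answer.

```php
<?php
// Return the part after $prefix, or null when $s does not start with it.
function stripPrefix(string $s, string $prefix): ?string
{
    $len = strlen($prefix);
    // strncmp() compares only the first $len bytes of both strings.
    if (strncmp($s, $prefix, $len) !== 0) {
        return null; // prefix missing
    }
    return substr($s, $len);
}

var_dump(stripPrefix('TmpsCar', 'Tmps'));  // string(3) "Car"
var_dump(stripPrefix('OtherCar', 'Tmps')); // NULL
```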
An alternative way is using the explode function
$fieldName= "TmpsCar";
$matches = explode("Tmps", $fieldName);
if(isset($matches[1])){
echo $matches[1]; // return "Car"
}
If the text you are searching contains more than just the string starting with Tmps, you can use the \w+ pattern, which matches one or more "word" characters.
This results in the following regular expression:
/Tmps(\w+)/
and, put together in PHP:
$text = "This TmpsCars is a test";
if (preg_match('/Tmps(\w+)/', $text, $m)) {
echo "Found:" . $m[1]; // this would return Cars
}
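When the text may contain several such words, the same pattern can be used with preg_match_all(). The sample text below is invented for illustration.

```php
<?php
// Hypothetical sample text containing two "Tmps"-prefixed words.
$text = 'This TmpsCars is a test, and so is TmpsBike.';

// preg_match_all() collects every match; $m[1] holds the first
// capture group of each match.
$found = [];
if (preg_match_all('/Tmps(\w+)/', $text, $m)) {
    $found = $m[1];
}

print_r($found); // Cars, Bike
```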

Check string for defined format and get part of it

How can I check if a string has the format [group|any_title] and give me the title back?
[group|This is] -> This is
[group|just an] -> just an
[group|example] -> example
I would do that with explode, using [group| as the delimiter, and then remove the trailing ]. If the length (of the explode result) is > 0, then the string has the correct format.
But I don't think that is a good approach, is it?
So you want to check if a string matches a regex?
if(preg_match('/^\[group\|(.+)\]$/', $string, $m)) {
$title = $m[1];
}
If the group part is supposed to be dynamic as well:
if(preg_match('/^\[(.+)\|(.+)\]$/', $string, $m)) {
$group = $m[1];
$title = $m[2];
}
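One thing to watch with the dynamic version is that .+ is greedy, so an input such as [a|b|c] would still match, with "a|b" as the group. A stricter sketch is shown below; it assumes that neither part may contain | or ], and the function name parseTag is hypothetical.

```php
<?php
// Parse "[group|title]" into its two parts, or return null if the format is wrong.
function parseTag(string $s): ?array
{
    // [^|\]]+ : one or more characters that are neither "|" nor "]",
    // so a second "|" or a stray "]" makes the match fail instead of
    // being absorbed by a greedy .+
    if (preg_match('/^\[([^|\]]+)\|([^|\]]+)\]$/', $s, $m)) {
        return ['group' => $m[1], 'title' => $m[2]];
    }
    return null;
}

var_dump(parseTag('[group|This is]')); // group => "group", title => "This is"
var_dump(parseTag('group|This is'));   // NULL
```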
Use regular expression matching with PHP's preg_match function.
You can use, for example, regexr.com to create and test a regular expression, and when you're done, implement it in your PHP script (replace the first parameter of preg_match with your regular expression):
$text = '[group|This is]';
// replace "pattern" with regular expression pattern
if (preg_match('/pattern/', $text, $matches)) {
// OK, you have parts of $text in $matches array
}
else {
// $text doesn't contain text in expected format
}
The specific regular expression pattern depends on how strictly you want to check your input string. It can be, for example, something like /^\[.+\|(.+)\]$/ or /\|([A-Za-z ]+)\]$/. The first checks that the string starts with [, ends with ], and contains any characters delimited by | in between. The second only checks that the string ends with | followed by upper- and lowercase letters and spaces, and finally ].
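To make the difference between the two patterns concrete, the following sketch runs both against a few inputs. The sample strings are hypothetical and were chosen so that the patterns disagree on two of them.

```php
<?php
$strict = '/^\[.+\|(.+)\]$/';    // must start with "[" and end with "]"
$loose  = '/\|([A-Za-z ]+)\]$/'; // only looks at the "|title]" tail

// Hypothetical inputs chosen to show where the two patterns disagree.
$samples = ['[group|This is]', '[group|abc 123]', 'x|abc]'];

$results = [];
foreach ($samples as $s) {
    $results[$s] = [
        'strict' => preg_match($strict, $s) === 1,
        'loose'  => preg_match($loose, $s) === 1,
    ];
}

var_dump($results);
// '[group|This is]' : strict true,  loose true
// '[group|abc 123]' : strict true,  loose false (digits are not in [A-Za-z ])
// 'x|abc]'          : strict false, loose true  (no leading "[")
```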

Regular expressions in php

I need a regular expression to check a string for uppercase letters. Wherever it finds an uppercase letter, it needs to add a space before it. I wrote some code for this, but it only works if there is exactly one uppercase letter in the string, and I need it to work for any number of uppercase letters. My code is below:
$regEx = preg_match('*[A-Z]*', $str, $matches, PREG_OFFSET_CAPTURE);
if(!empty($regEx)) {
$str = substr_replace($str,' ', $matches[0][1], 0);
}
"I need a regular expression to check a string for uppercase letters. Where it finds an uppercase letter, it needs to add white space before it."
preg_replace() sounds like a more suitable candidate to achieve this:
$str = preg_replace('/[A-Z]/', ' $0', $str);
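Note that the one-liner also puts a space before an uppercase letter at the very start of the string. If that is unwanted, a lookbehind can exclude position zero. This is a sketch of one possible variant; the function name spaceBeforeCaps is hypothetical.

```php
<?php
// Insert a space before every uppercase letter except one at the very start.
function spaceBeforeCaps(string $str): string
{
    // (?<!^) is a negative lookbehind: "not at the start of the string".
    return preg_replace('/(?<!^)[A-Z]/', ' $0', $str);
}

echo spaceBeforeCaps('HelloWorldFoo'), "\n";                // Hello World Foo
echo preg_replace('/[A-Z]/', ' $0', 'HelloWorldFoo'), "\n"; // " Hello World Foo" (leading space)
```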
Please try the code below, which checks whether the string contains no uppercase letters:
if(preg_match("/[A-Z]/", $string)===0) {
return true;
}

Masking all but first letter of a word using Regex

I'm attempting to create a bad word filter in PHP that will analyze the word and match against an array of known bad words, but keep the first letter of the word and replace the rest with asterisks. Example:
fook would become f***
shoot would become s****
The only part I don't know is how to keep the first letter in the string, and how to replace the remaining letters with something else while keeping the same string length.
$string = preg_replace("/\b(". $word .")\b/i", "***", $string);
Thanks!
$string = 'fook would become';
$word = 'fook';
$string = preg_replace("~\b". preg_quote($word, '~') ."\b~i", $word[0] . str_repeat('*', strlen($word) - 1), $string);
var_dump($string);
$string = preg_replace("/\b(".preg_quote($word[0], '/').')'.preg_quote(substr($word, 1), '/')."\b/i", '$1'.str_repeat('*', strlen($word) - 1), $string);
This can be done in many ways, with very weird auto-generated regexps...
But I believe using preg_replace_callback() would end up being more robust
<?php
# as already pointed out, your words *may* need sanitization
foreach($words as $k=>$v)
$words[$k]=preg_quote($v,'/');
# and to be collapsed into a **big regexpy goodness**
$words=implode('|',$words);
# after that, a single preg_replace_callback() would do
$string = preg_replace_callback('/\b('. $words .')\b/i', "my_beloved_callback", $string);
function my_beloved_callback($m)
{
$len=strlen($m[1])-1;
return $m[1][0].str_repeat('*',$len);
}
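The callback approach above can be wrapped into a self-contained sketch. The function name maskWords, the word list, and the input sentence are all hypothetical, and the code assumes single-byte (ASCII) words, since strlen() and $m[1][0] work on bytes.

```php
<?php
// Mask every listed word, keeping its first letter as it appears in the text.
function maskWords(string $string, array $words): string
{
    // Escape each word, then join them into one alternation: (fook|shoot)
    $quoted  = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/i';

    return preg_replace_callback($pattern, function ($m) {
        // $m[1] is the word as matched, so "Fook" keeps its capital F.
        return $m[1][0] . str_repeat('*', strlen($m[1]) - 1);
    }, $string);
}

// Hypothetical word list and input.
echo maskWords('Fook that, shoot! No fooking way.', ['fook', 'shoot']);
// F*** that, s****! No fooking way.
```

Because the mask is built from the matched text rather than from the word list, the output preserves both the original casing and the original length.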
Here is a Unicode-friendly regular expression for PHP:
function lowercase_except_first_letter($s) {
// the pattern skips the first letter of each word and passes every other letter to the callback
// \W in the lookbehind keeps the first letter even for words inside quotes or brackets
return preg_replace_callback('/(?<!^|\s|\W)(\w)/u', function($m) {
return mb_strtolower($m[1]);
}, $s);
}
