PHP string replace match whole word - php

I would like to replace only whole words using PHP.
Example:
If I have
$text = "Hello hello1 hello, Helloz";
and I use
$newtext = str_replace("Hello",'NEW',$text);
The new text should look like
NEW hello1 hello, Helloz
PHP returns
NEW hello1 hello, NEWz
Thanks.

You want to use regular expressions. The \b matches a word boundary.
$text = preg_replace('/\bHello\b/', 'NEW', $text);
If $text contains UTF-8 text, you'll have to add the Unicode modifier "u", so that non-latin characters are not misinterpreted as word boundaries:
$text = preg_replace('/\bHello\b/u', 'NEW', $text);
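As a quick check of the difference, the sketch below runs both calls on the sample text from the question. The variable names $plain and $whole are only illustrative.

```php
<?php
$text = "Hello hello1 hello, Helloz";

// str_replace() has no notion of word boundaries, so "Helloz" is touched too.
$plain = str_replace("Hello", "NEW", $text);

// \b only matches between a word character and a non-word character,
// so "Helloz" is left alone.
$whole = preg_replace('/\bHello\b/u', 'NEW', $text);

echo $plain, "\n"; // NEW hello1 hello, NEWz
echo $whole, "\n"; // NEW hello1 hello, Helloz
```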

Multiple words in a string can be replaced like this:
$String = 'Team Members are committed to delivering quality service for all buyers and sellers.';
echo $String;
echo "<br>";
$String = preg_replace(array('/\bTeam\b/','/\bfor\b/','/\ball\b/'),array('Our','to','both'),$String);
echo $String;
Result: Our Members are committed to delivering quality service to both buyers and sellers.

Array replacement list: In case your replacement strings are substituting each other, you need preg_replace_callback.
$pairs = ["one"=>"two", "two"=>"three", "three"=>"one"];
$r = preg_replace_callback(
"/\w+/", # only match whole words
function($m) use ($pairs) {
if (isset($pairs[$m[0]])) { # optional: strtolower
return $pairs[$m[0]];
}
else {
return $m[0]; # keep unreplaced
}
},
$source
);
For efficiency, /\w+/ could obviously be replaced with a key list such as /\b(one|two|three)\b/i (with the i flag, apply strtolower to the match before the lookup).
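A sketch of that key-list variant, assuming the same replacement table and a made-up $source string. The pattern is built from the array keys; preg_quote() is added as a precaution in case a key contains regex metacharacters, and the i flag is left out so the lookup needs no case folding.

```php
<?php
$pairs = ["one" => "two", "two" => "three", "three" => "one"];
$source = "one two three, twofold"; // sample input, not from the original answer

// Build /\b(?:one|two|three)\b/ from the keys.
$keys = array_map(function ($k) { return preg_quote($k, '/'); }, array_keys($pairs));
$pattern = '/\b(?:' . implode('|', $keys) . ')\b/';

// Each match is looked up exactly once, so replacements never feed into each other.
$r = preg_replace_callback($pattern, function ($m) use ($pairs) {
    return $pairs[$m[0]];
}, $source);

echo $r, "\n"; // two three one, twofold
```

Because only the listed keys can match, the callback no longer needs the isset() fallback.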

You can also use the T-Regx library, which quotes $ and \ characters in the replacement:
<?php
$text = pattern('\bHello\b')->replace($text)->all()->with('NEW');

Related

Space in # mention username and lowercase in link

I am trying to create a mention system and so far I've converted the #username in a link. But I wanted to see if it is possible for it to recognise whitespace for the names. For example: #Marie Lee instead of #MarieLee.
Also, I'm trying to convert the name in the link into lowercase letters (like: profile?id=marielee) while leaving the displayed mention with its uppercase letters, but I haven't been able to.
This is my code so far:
<?php
function convertHashtags($str) {
$regex = '/#+([a-zA-Z0-9_0]+)/';
$str = preg_replace($regex, strtolower('<a href="profile?id=$1">$0</a>'), $str);
return($str);
}
$string = 'I am #Marie Lee, nice to meet you!';
$string = convertHashtags($string);
echo $string;
?>
You may use this code with preg_replace_callback and an enhanced regex that will match all space separated words:
define("REGEX", '/#\w+(?:\h+\w+)*/');
function convertHashtags($str) {
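return preg_replace_callback(REGEX, function ($m) {
// Link format taken from the question: lowercase id without spaces, visible text unchanged
$id = strtolower(preg_replace('/\h+/', '', substr($m[0], 1)));
return "<a href='profile?id=$id'>{$m[0]}</a>";
}, $str);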
}
If you want to allow only 2 words then you may use:
define("REGEX", '/#\w+(?:\h+\w+)?/');
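For illustration, here is a minimal self-contained sketch of the two-word variant. The link format profile?id=... comes from the question; the function name convertMentions and the choice to strip spaces from the id are assumptions.

```php
<?php
function convertMentions($str) {
    // "#" plus one word, optionally followed by horizontal whitespace and a second word.
    return preg_replace_callback('/#(\w+(?:\h+\w+)?)/', function ($m) {
        // Lowercase the name and drop whitespace for the id; keep the visible text unchanged.
        $id = strtolower(preg_replace('/\h+/', '', $m[1]));
        return "<a href='profile?id=$id'>{$m[0]}</a>";
    }, $str);
}

echo convertMentions('I am #Marie Lee, nice to meet you!'), "\n";
// I am <a href='profile?id=marielee'>#Marie Lee</a>, nice to meet you!
```

Unlike a single preg_match() call, preg_replace_callback() converts every mention in the string, not just the first one.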
You can restrict usernames to letters, digits, underscores and spaces; there is nothing else to extract. Make sure at least one word character is matched before any spaces, so that a lone # followed by a space does not match. This works correctly for a username of up to two space-separated words that is followed by a non-word character other than a space.
<?php
function convertHashtags($str) {
$regex = '/#([a-zA-Z0-9_]+[\sa-zA-Z0-9_]*)/';
if(preg_match($regex,$str,$matches) === 1){
list($username,$name) = [$matches[0] , strtolower(str_replace(' ','',$matches[1]))];
return "<a href='profile?id=$name'>$username</a>";
}
throw new Exception('Unable to find username in the given string');
}
$string = 'I am #Marie Lee, nice to meet you!';
$string = convertHashtags($string);
echo $string;
Demo: https://3v4l.org/e2S8C
If you want the text to appear as is in the innerHTML of the anchor tag, you need to change
list($username,$name) = [$matches[0] , strtolower(str_replace(' ','',$matches[1]))];
to
list($username,$name) = [$str , strtolower(str_replace(' ','',$matches[1]))];
Demo: https://3v4l.org/dCQ4S

How to not perform preg_replace if subject starts with quote

I'm trying to convert plain links to HTML links using preg_replace. However it's replacing links that are already converted.
To combat this I'd like it to ignore the replacement if the link starts with a quote.
I think a positive lookahead may be needed but everything I've tried hasn't worked.
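$string = '<a href="http://www.example.com">test</a> http://www.example.com';
$string = preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $string);
var_dump($string);
The above outputs:
<a href="<a href="http://www.example.com">http://www.example.com</a>">test</a> <a href="http://www.example.com">http://www.example.com</a>
When it should output:
<a href="http://www.example.com">test</a> <a href="http://www.example.com">http://www.example.com</a>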
You might get along with lookarounds.
Lookarounds are zero-width assertions that make sure to match/not to match anything immediately around the string in question. They do not consume any characters.
That being said, a negative lookbehind might be what you need in your situation:
(?<![">])\bhttps?://\S+\b
In PHP this would be:
<?php
$string = 'I want to be transformed to a proper link: http://www.google.com ';
$string .= 'But please leave me alone ';
$string .= '(<a href="https://www.google.com">https://www.google.com</a>).';
$regex = '~ # delimiter
(?<![">]) # a neg. lookbehind
https?://\S+ # http:// or https:// followed by not a whitespace
\b # a word boundary
~x'; # verbose to enable this explanation.
$string = preg_replace($regex, "<a href='$0'>$0</a>", $string);
echo $string;
?>
See a demo on ideone.com. However, maybe a parser is more appropriate.
Since you can use Arrays in preg_replace, this might be convenient to use depending on what you want to achieve:
<?php
$string = 'test http://www.example.com';
$rx = array("&(<a.+https?:\/\/[\w]+[^ \,\"\n\r\t<]*>)(.*)(<\/a\>)&si", "&(\s){1,}(https?:\/\/[\w]+[^ \,\"\n\r\t<]*)&");
$rp = array("$1$2$3", "$2");
$string = preg_replace($rx,$rp, $string);
var_dump($string);
// DUMPS:
// 'testhttp://www.example.com'
The Idea
You can split your string at the already existing anchors, and only parse the pieces in between.
The Code
$input = '<a href="http://www.example.com">test</a> http://www.example.com';
// Split the string at existing anchors
// PREG_SPLIT_DELIM_CAPTURE flag includes the delimiters in the results set
// (flags are the fourth argument; -1 means no limit)
$parts = preg_split('/(<a.*?>.*?<\/a>)/is', $input, -1, PREG_SPLIT_DELIM_CAPTURE);
// Use array_map to parse each piece, and then join all pieces together
$output = join(array_map(function ($key, $part) {
// Because we return the delimiter in the results set,
// every $part with an uneven key is an anchor and is left untouched.
return $key % 2
? $part
: preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $part);
}, array_keys($parts), $parts));

Replace all the first character of words in a string using preg_replace()

I have a string as
This is a sample text. This text will be used as a dummy for "various" RegEx "operations" using PHP.
I want to select and replace the first character of each word (in the example: T,i,a,s,t,T,t,w,b,u,a,a,d,f,",R,",u,P). How do I do it?
I tried /\b.{1}\w+\b/. I read the expression as "select any character that has length of 1 followed by word of any length" but it didn't work.
You may try this regex as well:
(?<=\s|^)([a-zA-Z"])
Demo
Your regex, /\b.{1}\w+\b/, starts at a word boundary, then matches any single character (which can even be whitespace if there is a letter, digit or underscore in front of it), followed by one or more word characters (\w) up to the next word boundary.
That \b. is the culprit here.
If you plan to match any non-whitespace preceded with a whitespace, you can just use
/(?<!\S)\S/
Or
/(?<=^|\s)\S/
See demo
Then, replace with any symbol you need.
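For example, a sketch that replaces the first character of every whitespace-separated word with an underscore (the underscore and the sample string are arbitrary choices):

```php
<?php
$string = 'This is a "sample" text.';

// (?<!\S) asserts that the previous character is not a non-space,
// i.e. we are at the start of the string or right after whitespace.
$result = preg_replace('/(?<!\S)\S/', '_', $string);

echo $result, "\n"; // _his _s _ _sample" _ext.
```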
You may try to use the following regex:
(.)[^\s]*\s?
Use preg_match_all and implode capture group 1 of the results:
<?php
$string = 'This is a sample text. This text will be used as a dummy for '
. '"various" RegEx "operations" using PHP.';
$pattern = '/(.)[^\s]*\s?/';
$matches = [];
preg_match_all($pattern, $string, $matches);
$output = implode('', $matches[1]);
echo $output; //Output is TiastTtwbuaadf"R"uP
To replace instead, use preg_replace_callback, for example:
$pattern = '/(.)([^\s]*\s?)/';
$output2 = preg_replace_callback($pattern,
function($match) { return '_' . $match[2]; }, $string);
//result: _his _s _ _ample _ext. _his _ext _ill _e _sed _s _ _ummy _or _various" _egEx _operations" _sing _HP.

PHP Regex: Remove words less than 3 characters

I'm trying to remove all words of less than 3 characters from a string, specifically with RegEx.
The following doesn't work because it is looking for double spaces. I suppose I could convert all spaces to double spaces beforehand and then convert them back after, but that doesn't seem very efficient. Any ideas?
$text='an of and then some an ee halved or or whenever';
$text=preg_replace('# [a-z]{1,2} #',' ',' '.$text.' ');
echo trim($text);
Removing the Short Words
You can use this:
$replaced = preg_replace('~\b[a-z]{1,2}\b~', '', $yourstring);
In the demo, see the substitutions at the bottom.
Explanation
\b is a word boundary that matches a position where one side is a word character (letter, digit or underscore) and the other side is not (for instance a space character, or the beginning of the string)
[a-z]{1,2} matches one or two letters
\b another word boundary
Replace with the empty string.
Option 2: Also Remove Trailing Spaces
If you also want to remove the spaces after the words, we can add \s* at the end of the regex:
$replaced = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $yourstring);
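Applied to the sample string from the question, a sketch of option 2 might look like this. The final trim() is an addition, in case removing a word leaves stray whitespace at either end.

```php
<?php
$text = 'an of and then some an ee halved or or whenever';

// Remove every 1-2 letter word together with the whitespace that follows it.
$replaced = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $text);

echo trim($replaced), "\n"; // and then some halved whenever
```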
Reference
Word Boundaries
You can use the word boundary assertion \b:
Replace: \b[a-z]{1,2}\b with ''
Use this
preg_replace('/(\b.{1,2}\s)/','',$your_string);
Although some of the solutions here worked, they had a problem with my language's "multichar characters", such as "ch". A simple explode and implode worked for me.
$maxWordLength = 3;
$string = "my super string";
$exploded = explode(" ", $string);
foreach($exploded as $key => $word) {
if(mb_strlen($word) < $maxWordLength) unset($exploded[$key]);
}
$string = implode(" ", $exploded);
echo $string;
// outputs "super string"
To me, it seems that this hack works fine with most PHP versions:
$string2 = preg_replace("/\b[a-zA-Z0-9]{1,2}\b/i", "", trim($string1));
Where [a-zA-Z0-9] is the accepted range of letters and digits.

Masking all but first letter of a word using Regex

I'm attempting to create a bad word filter in PHP that will analyze the word and match against an array of known bad words, but keep the first letter of the word and replace the rest with asterisks. Example:
fook would become f***
shoot would become s****
The only part I don't know is how to keep the first letter in the string, and how to replace the remaining letters with something else while keeping the same string length.
$string = preg_replace("/\b(". $word .")\b/i", "***", $string);
Thanks!
$string = 'fook would become';
$word = 'fook';
$string = preg_replace("~\b". preg_quote($word, '~') ."\b~i", $word[0] . str_repeat('*', strlen($word) - 1), $string);
var_dump($string);
$string = preg_replace("/\b(".$word[0].')'.substr($word, 1)."\b/i", '$1'.str_repeat('*', strlen($word) - 1), $string);
This can be done in many ways, with very weird auto-generated regexps...
But I believe using preg_replace_callback() would end up being more robust
<?php
# as already pointed out, your words *may* need sanitization
foreach($words as $k=>$v)
$words[$k]=preg_quote($v,'/');
# and to be collapsed into a **big regexpy goodness**
$words=implode('|',$words);
# after that, a single preg_replace_callback() would do
$string = preg_replace_callback('/\b('. $words .')\b/i', "my_beloved_callback", $string);
function my_beloved_callback($m)
{
$len=strlen($m[1])-1;
return $m[1][0].str_repeat('*',$len);
}
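A complete run of this approach, with a hypothetical word list and input string (neither is from the original post), could look like this:

```php
<?php
$words = ['fook', 'shoot']; // hypothetical list of words to mask
$string = 'Fook happens, so shoot first. Shooting is fine.';

// Escape each word and join them into one alternation.
$alternation = implode('|', array_map(function ($w) { return preg_quote($w, '/'); }, $words));

$masked = preg_replace_callback('/\b(' . $alternation . ')\b/i', function ($m) {
    // Keep the first letter as typed, mask the rest with the same number of asterisks.
    return $m[1][0] . str_repeat('*', strlen($m[1]) - 1);
}, $string);

echo $masked, "\n"; // F*** happens, so s**** first. Shooting is fine.
```

The trailing \b is what keeps "Shooting" from being masked.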
Here is a Unicode-friendly regular expression for PHP:
function lowercase_except_first_letter($s) {
// The lookbehind skips the first letter of each word and passes every other letter to the callback.
// \W keeps the first letter even in words in quotes and brackets.
return preg_replace_callback('/(?<!^|\s|\W)(\w)/u', function($m) {
return mb_strtolower($m[1]);
}, $s);
}
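The same Unicode-aware idea can be applied to the masking problem itself. The following is only a sketch: it assumes the mbstring extension is available, and the helper names mask_word and mask_bad_words as well as the sample sentence are made up for illustration.

```php
<?php
function mask_word($word) {
    // Multibyte-safe: keep the first character, replace the rest with asterisks.
    return mb_substr($word, 0, 1, 'UTF-8') . str_repeat('*', mb_strlen($word, 'UTF-8') - 1);
}

function mask_bad_words($string, array $words) {
    $alternation = implode('|', array_map(function ($w) { return preg_quote($w, '/'); }, $words));
    // The u modifier makes the pattern and the subject be treated as UTF-8.
    return preg_replace_callback('/\b(?:' . $alternation . ')\b/iu', function ($m) {
        return mask_word($m[0]);
    }, $string);
}

echo mask_bad_words('Das ist Käse, ehrlich.', ['käse']), "\n"; // Das ist K***, ehrlich.
```

Using strlen() here instead of mb_strlen() would count bytes, so a word containing "ä" would get one asterisk too many.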
