Replace words and maintain case-sensitivity of the found string - php

I am creating a PHP application to format files, so I need a find-and-replace process that preserves the case of each match.
For example, I need to replace 'employees' with 'vehicles'.
$file_content = "Employees are employees_category MyEmployees kitEMPLOYEESMATCH";
$f = 'employees';
$r = 'vehicles';
echo str_ireplace($f, $r, $file_content);
Current Output:
vehicles are vehicles_category Myvehicles kitvehiclesMATCH
Desired Output:
Vehicles are vehicles_category MyVehicles kitVEHICLESMATCH

You could use something like this by replacing for each case separately:
<?php
$file_content = "Employees are employees_category MyEmployees kitEMPLOYEESMATCH";
$f = 'employees';
$r = 'vehicles';
$res = str_replace($f, $r, $file_content); // replace for lower case
$res = str_replace(strtoupper($f), strtoupper($r), $res); // replace for upper case
$res = str_replace(ucwords($f), ucwords($r), $res); // replace for proper case (capital in first letter of word)
echo $res;
?>

While SJ11's answer is attractive for its brevity, it is prone to making unintended replacements on already replaced substrings, although this cannot happen with the OP's sample data.
To ensure that replacements are not themselves replaced, you must make only one pass over the input string.
For utility, I will include preg_quote(), so that the pattern does not break when the $f value contains characters with special meaning in regex.
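To make the risk concrete, here is a small sketch of the multi-pass approach going wrong. The values 'it' and 'IT dept' are made up for illustration and do not come from the question.

```php
<?php
// Hypothetical values, chosen so that the output of the first pass
// contains the uppercase form of the search term.
$text = 'fix it';
$f = 'it';
$r = 'IT dept';

// Three passes, as in the shorter answer.
$res = str_replace($f, $r, $text);                        // 'fix IT dept'
$res = str_replace(strtoupper($f), strtoupper($r), $res); // 'fix IT DEPT dept'
$res = str_replace(ucwords($f), ucwords($r), $res);       // no 'It' present, unchanged

echo $res, "\n";
```

The second pass rewrites text that the first pass produced, which is exactly what a single pass over the input avoids.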
Code:
$file_content = "Employees are employees_category MyEmployees kitEMPLOYEESMATCH";
$f = 'employees';
$r = 'vehicles';
$pattern = '~('
. implode(
')|(',
[
preg_quote($f, '~'),
preg_quote(ucfirst($f), '~'),
preg_quote(strtoupper($f), '~')
]
) . ')~';
$lookup = [
1 => $r,
2 => ucfirst($r),
3 => strtoupper($r)
];
var_export(
preg_replace_callback(
$pattern,
function($m) use ($lookup) {
return $lookup[count($m) - 1];
},
$file_content
)
);
Output: (single quotes are from var_export())
'Vehicles are vehicles_category MyVehicles kitVEHICLESMATCH'
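A variation on the same single-pass idea, as a rough sketch: match case-insensitively and derive the case of the replacement from each match. The function name replace_keep_case() is hypothetical. Unlike the lookup approach above, it also matches mixed-case spellings such as eMPLOYEES and falls back to the replacement as written for them.

```php
<?php
// Hypothetical helper, not part of any answer above.
function replace_keep_case(string $find, string $replace, string $subject): string
{
    $pattern = '~' . preg_quote($find, '~') . '~i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($replace): string {
            $match = $m[0];
            if ($match === strtoupper($match)) {
                // All caps (also true for matches without letters).
                return strtoupper($replace);
            }
            if ($match === ucfirst(strtolower($match))) {
                // First letter capitalised, rest lower case.
                return ucfirst($replace);
            }
            // Lower case or mixed case: use the replacement as written.
            return $replace;
        },
        $subject
    );
}

$file_content = "Employees are employees_category MyEmployees kitEMPLOYEESMATCH";
echo replace_keep_case('employees', 'vehicles', $file_content), "\n";
```

With the sample data from the question this prints the desired output line.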

Related

Encode contents between Two Special Strings

All I want is to Get contents between two strings like the following line:
$content = '81Lhello82R 81Lmy82R 81Lwife82R';
I want to get all contents between 81L and 82R and then encode them to Base64 automatically, probably with preg_match. I've tried a few ways to do it but didn't get what I expected.
Base Form:
81Lhello82R 81Lmy82R 81Lwife82R
Output:
81LaGVsbG8=82R 81LbXk=82R 81Ld2lmZQ==82R
Hard rules:
$leftMask = '81L';
$rightMask = '82R';
$content = '81Lhello82R 81Lmy82R 81Lwife82R';
preg_match_all('#'.$leftMask.'(.*)'.$rightMask.'#U',$content, $out);
$output = [];
foreach($out[1] as $val){
$output[] = $leftMask.base64_encode($val).$rightMask;
}
$result = str_replace($out[0], $output, $content);
RegExp rules
$leftMask = '\d{2}L';
$rightMask = '\d{2}R';
$content = '81Lhello82R 81Lmy82R 81Lwife82R';
preg_match_all('#('.$leftMask.')(.*)('.$rightMask.')#U',$content, $out);
$output = [];
foreach($out[2] as $key=>$val){
$output[] = $out[1][$key].base64_encode($val).$out[3][$key];
}
$result = str_replace($out[0], $output, $content);
This is a job for preg_replace_callback:
$content = '81Lhello82R 81Lmy82R 81Lwife82R';
$output = preg_replace_callback(
'/(?<=\b\d\dL)(.+?)(?=\d\dR)/',
function($matches) {
return base64_encode($matches[1]); // encode the word and return it
},
$content);
echo $output,"\n";
Where
(?<=\b\d\dL) is a positive lookbehind that makes sure we have 2 digits and the letter L before the word to encode
(?=\d\dR) is a positive lookahead that makes sure we have 2 digits and the letter R after the word to encode
(.+?) is the capture group that contains the word to encode
Output:
81LaGVsbG8=82R 81LbXk=82R 81Ld2lmZQ==82R
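As a quick check of the lookaround pattern, the same callback shape can be sketched in the other direction, decoding what was encoded. This assumes the encoded payload never happens to contain two digits followed by R, which Base64 output could in principle produce.

```php
<?php
$encoded = '81LaGVsbG8=82R 81LbXk=82R 81Ld2lmZQ==82R';

$decoded = preg_replace_callback(
    '/(?<=\b\d\dL)(.+?)(?=\d\dR)/',
    function (array $matches): string {
        // Decode only the captured payload; the masks are outside the match.
        return base64_decode($matches[1]);
    },
    $encoded
);

echo $decoded, "\n";
```

Because the masks are matched by lookarounds, they are never part of the match and are left in place around the decoded text.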

Replacing a string while choosing the order in which the part of strings get replaced first

I'm trying to replace certain words in a string, but I also want the replacements to be applied in the order in which they appear in the array. For example, I want to replace "b c" before I try to replace "a b", without changing the positions in the original string. By the way, the letters are supposed to represent actual words, and they are not supposed to be part of another word. For example, the word "sun" is part of "sunflower", and the word "sunflower" must not be replaced just because the word "sun" is in it.
$text = "a b c";
$replacement = array("a b" => "ab","b c" => "bc");
$search = array_map(function($v){
return preg_quote($v, "/");
}, array_keys($replacement));
echo $text = preg_replace_callback("/\b(" . implode("|", $search) . ")\b/", function($m)use($replacement){
return $replacement[$m[1]];
}, $text);
First Result
ab c
Second Result
I switched the positions in the array around, thinking that it would affect the order in which the strings get replaced. Sadly, it doesn't work like that, and I got the same result.
$replacement = array("b c" => "bc","a b" => "ab");
ab c
At this point, I realized that it isn't the position in the array that determines which part of the string gets replaced first, but the order in which the parts appear in the original string.
So, my question is: is there a way to make the replacements happen in the order given by the array, or in some other chosen order? For example, I want to replace "b c" before I try to replace "a b", without changing the positions in the original string. Is that doable? Thanks.
[EDIT]
The idea is to cast the original text to an array (which starts with a single element, the text). Items at even indexes are split by each pattern in turn. Since the PREG_SPLIT_DELIM_CAPTURE option is used, delimiters always end up at odd indexes and stay untouched once they have been matched.
$text = 'a b c';
$rep = ['b c'=>'bc', 'a b'=>'ab', ];
$pats = array_map(function($i) {
return '~\b(' . preg_quote($i, '~') . ')\b~';
}, array_keys($rep));
$parts = (array)$text; // or $parts = [ $text ]; // it's the same
foreach ($pats as $pat) {
$temp = [];
foreach ($parts as $k=>$part) {
if ($k & 1)
$temp[] = $part;
else
$temp = array_merge(
$temp,
preg_split($pat, $part, -1, PREG_SPLIT_DELIM_CAPTURE)
);
}
$parts = $temp;
}
$result = '';
foreach ($parts as $k=>$part) {
$result .= ($k & 1) ? $rep[$part] : $part;
}
echo $result;
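The same split-and-protect technique can be wrapped in a function to see how the array order changes the result. This is only a sketch; the name ordered_replace() is hypothetical.

```php
<?php
// Hypothetical wrapper around the split technique shown above.
function ordered_replace(string $text, array $rep): string
{
    $parts = [$text];
    foreach (array_keys($rep) as $needle) {
        $pat = '~\b(' . preg_quote($needle, '~') . ')\b~';
        $next = [];
        foreach ($parts as $k => $part) {
            if ($k & 1) {
                // Odd index: an already matched delimiter, keep it as is.
                $next[] = $part;
                continue;
            }
            // Even index: plain text, split it and capture the delimiters.
            foreach (preg_split($pat, $part, -1, PREG_SPLIT_DELIM_CAPTURE) as $piece) {
                $next[] = $piece;
            }
        }
        $parts = $next;
    }

    $out = '';
    foreach ($parts as $k => $part) {
        $out .= ($k & 1) ? $rep[$part] : $part;
    }
    return $out;
}

echo ordered_replace('a b c', ['b c' => 'bc', 'a b' => 'ab']), "\n"; // a bc
echo ordered_replace('a b c', ['a b' => 'ab', 'b c' => 'bc']), "\n"; // ab c
echo ordered_replace('sun sunflower', ['sun' => 'star']), "\n";      // star sunflower
```

Each split of a text part yields an odd number of pieces, so text stays at even indexes and captured delimiters at odd ones throughout.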
I changed your code to represent what (I think) you wanted:
$text = "a b c a b";
$replacement = array("b c" => "bc", "a b" => "ab");
$search = array_map(function($v){
return preg_quote($v, "/");
}, array_keys($replacement));
for($i = 0; $i < count($replacement); $i++) {
$regex = "/\b(" . $search[$i] . ")\b/";
echo $text = preg_replace_callback($regex, function($m)use($replacement){
return $replacement[$m[1]];
}, $text);
echo "<br>";
}
Basically, instead of relying on a single regex to do the work, I loop over the replacements and build one regex per entry. That way the order of the array matters. I also changed the initial $text to test that it works.

preg_replace replace a prefix with a suffix plus itself

I have a list of mobile phone $numbers and I have to change them by prefixing the number 39 whenever the number itself starts with one of the entries in the $prefixes array.
I don't know how to back-reference the found prefix, or (which is the same) how to get the matched prefix. I've tried the following, but it's not working:
$numbers = array('3284121532', '3478795687'); // the subject
$prefixes = array('328', '347'); // (will be) the pattern
// Build regex for each element of $prefix array
$pattern = array_map(function($s) { return "/^$s/"; }, $prefixes);
$replace = "39\{$1}";
var_dump(preg_replace($pattern, $replace, $numbers));
Any help would be appreciated, thanks.
$numbers = array(3284121532, 3478795687);
$prefixes = implode('|',array(328, 347));
$numbers = array_map(function($n) use ($prefixes) {
return preg_replace("/^($prefixes)/", '39$1', $n);
}, $numbers);
print_r($numbers);
The above will output
Array
(
[0] => 393284121532
[1] => 393478795687
)
If you want to include the whole match in your replacement, you can use $0:
$replace = '39$0';
Just use $1 within single quotes (the pattern must contain a capturing group for $1 to hold anything).
$replace = '39$1';
You can do that with
$results = array_map(function($s) use ($prefixes) {
return preg_replace("/^((?:" . join('|', $prefixes) . ")\d+)/", '39$1', $s);
}, $numbers);
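Pulling these answers together, here is a minimal sketch that quotes the prefixes and uses the whole match instead of a capturing group. The function name add_country_code() and the third sample number are assumptions for illustration; it also assumes $prefixes is not empty.

```php
<?php
// Hypothetical helper: prepend $code to every number that starts with one of $prefixes.
function add_country_code(array $numbers, array $prefixes, string $code = '39'): array
{
    // Quote each prefix so that it is matched literally.
    $alternatives = implode('|', array_map(
        function ($p) {
            return preg_quote((string) $p, '/');
        },
        $prefixes
    ));

    // ${0} refers to the whole match, so no capturing group is needed.
    $pattern = '/^(?:' . $alternatives . ')/';

    // preg_replace() accepts an array subject and returns an array.
    return preg_replace($pattern, $code . '${0}', array_map('strval', $numbers));
}

$result = add_country_code(['3284121532', '3478795687', '3331234567'], ['328', '347']);
print_r($result);
```

Numbers that start with none of the prefixes, like the third one here, are returned unchanged.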

mb_eregi_replace multiple matches get them

$string = 'test check one two test3';
$result = mb_eregi_replace ( 'test|test2|test3' , '<$1>' ,$string ,'i');
echo $result;
This should deliver: <test> check one two <test3>
Is it possible to find out that test and test3 were matched, without using another match function?
You can use preg_replace_callback instead:
$string = 'test check one two test3';
$matches = array();
$result = preg_replace_callback('/test|test2|test3/i' , function($match) use (&$matches) {
$matches[] = $match;
return '<'.$match[0].'>';
}, $string);
echo $result;
Here preg_replace_callback will call the passed callback function for each match of the pattern (note that its syntax differs from POSIX). In this case the callback function is an anonymous function that adds the match to the $matches array and returns the substitution string that the matches are to be replaced by.
Another approach would be to use preg_split to split the string at the matched delimiters while also capturing the delimiters:
$parts = preg_split('/(test|test2|test3)/i', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
The result is an array of alternating non-matching and matching parts.
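A minimal sketch of the callback approach that also records what was found. Two details here are assumptions rather than part of the answer: the collecting array is captured by reference so it is visible after the call, and the longer alternatives are listed first so that test3 is not matched as test followed by a stray 3.

```php
<?php
$string = 'test check one two test3';
$found = [];

$result = preg_replace_callback(
    '/test3|test2|test/i', // longest alternatives first
    function (array $m) use (&$found): string {
        $found[] = $m[0];  // remember the matched text
        return '<' . $m[0] . '>';
    },
    $string
);

echo $result, "\n";
print_r($found);
```

After the call, $found holds the matched words in the order they occurred in the string.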
As far as I know, eregi is deprecated.
You could do something like this:
<?php
$str = 'test check one two test3';
$to_match = array("test", "test2", "test3");
$rep = array();
foreach($to_match as $val){
$rep[$val] = "<$val>";
}
echo strtr($str, $rep);
?>
This too allows you to easily add more strings to replace.
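A short sketch of why strtr() behaves well here: with an array argument it tries the longest keys first and does not rescan text it has already replaced, whereas str_replace() with arrays applies the pairs one after another. The variable names are illustrative only.

```php
<?php
$str = 'test check one two test3';
$rep = ['test' => '<test>', 'test2' => '<test2>', 'test3' => '<test3>'];

// Longest key wins at each position; replaced text is not searched again.
$with_strtr = strtr($str, $rep);

// Pairs are applied in array order, so 'test' is replaced inside 'test3' first.
$with_str_replace = str_replace(array_keys($rep), array_values($rep), $str);

echo $with_strtr, "\n";       // <test> check one two <test3>
echo $with_str_replace, "\n"; // <test> check one two <test>3
```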
The following function checks whether every word in a list occurs in the string:
<?php
function searchword($string, $words)
{
$matchFound = count($words); // number of words you want to search for
$tempMatch = 0;
foreach ( $words as $word )
{
preg_match('/'.preg_quote($word, '/').'/',$string,$matches);
//print_r($matches);
if(!empty($matches))
{
$tempMatch++;
}
}
if($tempMatch==$matchFound)
{
return "found";
}
else
{
return "notFound";
}
}
$string = "test check one two test3";
/*** an array of words to highlight ***/
$words = array('test', 'test3');
$string = searchword($string, $words);
echo $string;
?>
If your string is UTF-8, you could use preg_replace instead:
$string = 'test check one two test3';
$result = preg_replace('/test|test2|test3/ui' , '<$0>' ,$string);
echo $result;
Obviously, with this kind of data to match, the result will be suboptimal:
<test> check one two <test>3
You'll need a longer approach than a direct search and replace with regular expressions (especially if some of your patterns are prefixes of other patterns).
To begin with, the code you want to enhance does not seem to do what it was meant to do (at least not on my computer). You can try something like this:
$string = 'test check one two test3';
$result = mb_eregi_replace('(test|test2|test3)', '<\1>', $string);
echo $result;
I've removed the i flag (which of course makes little sense here). You'd still need to make the expression greedy.
As for the original question, here's a little proof of concept:
function replace($match){
$GLOBALS['matches'][] = $match;
return "<$match>";
}
$string = 'test check one two test3';
$matches = array();
$result = mb_eregi_replace('(test|test2|test3)', 'replace(\'\1\')', $string, 'e');
var_dump($result, $matches);
Please note this code is horrible and potentially insecure. I'd honestly go with the preg_replace_callback() solution proposed by Gumbo.

Create acronym from a string containing only words

I'm looking for a way that I can extract the first letter of each word from an input field and place it into a variable.
Example: if the input field is "Stack-Overflow Questions Tags Users" then the output for the variable should be something like "SOQTU"
$s = 'Stack-Overflow Questions Tags Users';
echo preg_replace('/\b(\w)|./', '$1', $s);
The same as codaddict's, but shorter.
For Unicode support, add the u modifier to the regex: preg_replace('...../u', ...)
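A small sketch of the u modifier in action, assuming the source file and the input are UTF-8 encoded. The accented sample string is borrowed from a later answer in this thread.

```php
<?php
$s = 'Jean-Claude Vandàmme';

// With /u the accented letter is treated as one word character,
// so it does not start a new "word" in the middle of the surname.
$acronym = preg_replace('/\b(\w)|./u', '$1', $s);

echo $acronym, "\n";
```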
Something like:
$s = 'Stack-Overflow Questions Tags Users';
if(preg_match_all('/\b(\w)/',strtoupper($s),$m)) {
$v = implode('',$m[1]); // $v is now SOQTU
}
I'm using the regex \b(\w) to match the word-char immediately following the word boundary.
EDIT:
To ensure all the characters of your acronym are uppercase, you can use strtoupper as shown.
Just to be completely different:
$input = 'Stack-Overflow Questions Tags Users';
$acronym = implode('',array_diff_assoc(str_split(ucwords($input)),str_split(strtolower($input))));
echo $acronym;
$initialism = preg_replace('/\b(\w)\w*\W*/', '\1', $string);
If the words are separated only by spaces and nothing else, this is how you can do it:
function acronym($longname)
{
$letters=array();
$words=explode(' ', $longname);
foreach($words as $word)
{
$word = (substr($word, 0, 1));
array_push($letters, $word);
}
$shortname = strtoupper(implode($letters));
return $shortname;
}
Regular expression matching as codaddict says above, or str_word_count() with 1 as the second parameter, which returns an array of found words. See the examples in the manual. Then you can get the first letter of each word any way you like, including substr($word, 0, 1)
The str_word_count() function might do what you are looking for:
$words = str_word_count ('Stack-Overflow Questions Tags Users', 1);
$result = "";
for ($i = 0; $i < count($words); ++$i)
$result .= $words[$i][0];
function initialism($str, $as_space = array('-'))
{
$str = str_replace($as_space, ' ', trim($str));
$ret = '';
foreach (explode(' ', $str) as $word) {
$ret .= strtoupper($word[0]);
}
return $ret;
}
$phrase = 'Stack-Overflow Questions IT Tags Users Meta Example';
echo initialism($phrase);
// SOQITTUME
$s = "Stack-Overflow Questions IT Tags Users Meta Example";
$sArr = explode(' ', ucwords(strtolower($s)));
$sAcr = "";
foreach ($sArr as $key) {
$firstAlphabet = substr($key, 0,1);
$sAcr = $sAcr.$firstAlphabet ;
}
This builds on the answer from #codaddict.
I also considered the case where the input is already an abbreviation, e.g. DPR rather than Development Petroleum Resources. Such a word would be reduced to just D, which doesn't make much sense.
function AbbrWords($str,$amt){
// Keep at most $amt leading characters
return substr($str,0,$amt);
}
function AbbrSent($str,$amt){
if(preg_match_all('/\b(\w)/',strtoupper($str),$m)) {
$v = implode('',$m[1]); // $v now holds the uppercase initials
if(strlen($v) < 2){
if(strlen($str) < 5){
return $str;
}else{
return AbbrWords($str,$amt);
}
}else{
return AbbrWords($v,$amt);
}
}
}
As an alternative to #user187291's preg_replace() pattern, here is the same functionality without needing a reference in the replacement string.
It works by matching the first word character, then forgetting it with \K; it then matches zero or more word characters followed by zero or more non-word characters. This consumes all of the unwanted characters and leaves only the first character of each word. This is ideal because there is no need to implode an array of matches. The u modifier ensures that accented/multibyte characters are treated as whole characters by the regex engine.
Code:
$tests = [
'Stack-Overflow Questions Tags Users',
'Stack Overflow Close Vote Reviewers',
'Jean-Claude Vandàmme'
];
var_export(
preg_replace('/\w\K\w*\W*/u', '', $tests)
);
Output:
array (
0 => 'SOQTU',
1 => 'SOCVR',
2 => 'JCV',
)
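Building on that pattern, here is a rough sketch of a wrapper that also drops leading non-word characters and uppercases the result. The name acronym_from() is hypothetical. Note that strtoupper() is only relied on for ASCII letters here; non-ASCII initials would need something like mb_strtoupper() from the mbstring extension.

```php
<?php
// Hypothetical wrapper around the \K pattern above.
function acronym_from(string $phrase): string
{
    // ^\W+ removes leading punctuation or spaces;
    // \w\K\w*\W* keeps the first character of each word and drops the rest.
    $initials = preg_replace('/^\W+|\w\K\w*\W*/u', '', $phrase);

    return strtoupper($initials);
}

echo acronym_from('Stack-Overflow Questions Tags Users'), "\n"; // SOQTU
echo acronym_from('  "stack overflow" questions'), "\n";        // SOQ
```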
