php preg_replace not putting dash back in?

So, I'm doing some manipulation on lat/long pairs, and I need to turn this:
39.1889375383777,-94.48019109594397
into:
39.1889375383777 -94.48019109594397
I can't use str_replace, unless I want to have an array of 10 search and 10 replace strings, so I was hoping to use preg_replace:
$query1 = preg_replace( "/([0-9-]),([0-9-])/", "\1 \2", $query );
The problem is that the "-" gets lost:
39.1889375383777 94.48019109594397
Note, that I have a string containing a list of these, trying to do all at once:
[[39.1889375383777,-94.48019109594397],[39.18425796890108,-94.28288005131176],[39.41972019529712,-94.19956344733345],[39.41412315915102,-94.41932608390658],[39.34785744845041,-94.4893603307242],[39.1889375383777,-94.48019109594397]]
I managed to make this work with preg_replace_callback:
$str = preg_replace_callback( "/([0-9-]),([0-9-])/",
    function ($matches) {return $matches[1] . " " . $matches[2];},
    $query
);
But I'm still not sure why the simpler preg_replace didn't work?

Your main issue is that "\1 \2" defines the string "\x1\x20\x2": inside a double-quoted PHP string, \1 and \2 are octal escapes, so the first character is SOH and the third one is STX (see the ASCII table), and the regex engine never sees a backreference. To write backreferences, either double the backslash ("\\1 \\2") or, better, use the $n notation, preferably inside a single-quoted string literal.
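To make the difference concrete, here is a minimal sketch. The shortened $query value is made up for illustration; the pattern is the one from the question.

```php
<?php
// Shortened sample input, made up for illustration.
$query = '[[39.18,-94.48],[39.41,-94.19]]';

// In a double-quoted string, "\1" and "\2" are octal escapes (bytes 0x01 and 0x02),
// so the matched digit, comma and dash are replaced by control characters.
$broken = preg_replace("/([0-9-]),([0-9-])/", "\1 \2", $query);

// In a single-quoted string the text reaches preg_replace() unchanged,
// and $1 / $2 refer to the two capture groups.
$fixed = preg_replace('/([0-9-]),([0-9-])/', '$1 $2', $query);

var_dump("\1 \2" === "\x01\x20\x02"); // bool(true)
echo $fixed, "\n";                    // [[39.18 -94.48],[39.41 -94.19]]
```

Note that the broken version also swallows the digit before the comma; it is easy to miss because the control characters are usually invisible when printed.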
You can also use a solution without backreferences:
preg_replace('~(?<=\d),(?=-?\d)~', ' ', $str)
Details:
(?<=\d) - a location that is immediately preceded by a digit
, - a comma
(?=-?\d) - a location that is immediately followed by an optional - and a digit.
See the PHP demo:
$str = '[[39.1889375383777,-94.48019109594397],[39.18425796890108,-94.28288005131176],[39.41972019529712,-94.19956344733345],[39.41412315915102,-94.41932608390658],[39.34785744845041,-94.4893603307242],[39.1889375383777,-94.48019109594397]]';
echo preg_replace('~(?<=\d),(?=-?\d)~', ' ', $str);
// => [[39.1889375383777 -94.48019109594397],[39.18425796890108 -94.28288005131176],[39.41972019529712 -94.19956344733345],[39.41412315915102 -94.41932608390658],[39.34785744845041 -94.4893603307242],[39.1889375383777 -94.48019109594397]]

Related

Avoid backreference replacement in php's preg_replace

Consider the below use of preg_replace
$str='{{description}}';
$repValue='$0.0 $00.00 $000.000 $1.1 $11.11 $111.111';
$field = 'description';
$pattern = '/{{'.$field.'}}/';
$str =preg_replace($pattern, $repValue, $str );
echo $str;
// Expected output: $0.0 $00.00 $000.000 $1.1 $11.11 $111.111
// Actual output: {{description}}.0 {{description}}.00 {{description}}0.000 .1 .11 1.111
It's clear to me that the actual output differs from the expected one because preg_replace treats $0, $00, $00, $1, $11, and $11 as backreferences to matched groups: $0 and $00 are replaced with the full match, and $1 and $11 with an empty string, since there are no capture groups 1 or 11.
How can I prevent preg_replace from treating prices in my replacement value as back references and attempting to fill them?
Note that $repValue is dynamic and its content will not be known before the operation.
Escape the dollar characters beforehand using a character translation (strtr):
$repValue = strtr('$0.0 $00.00 $000.000 $1.1 $11.11 $111.111', ['$'=>'\$']);
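Put together with the code from the question, the escaping step might look like the sketch below. It assumes the replacement value contains no backslashes of its own; the field name is hard-coded into the pattern only to keep the example short.

```php
<?php
$str      = '{{description}}';
$repValue = '$0.0 $00.00 $000.000 $1.1 $11.11 $111.111';
$pattern  = '/{{description}}/';

// Put a backslash in front of every dollar sign so that preg_replace()
// treats it as a literal character instead of the start of a group reference.
$escaped = strtr($repValue, ['$' => '\$']);

echo preg_replace($pattern, $escaped, $str), "\n";
// $0.0 $00.00 $000.000 $1.1 $11.11 $111.111
```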
For more complicated cases (with dollars and escaped dollars) you can do this kind of substitution (totally waterproof this time):
$str = strtr($str, ['%'=>'%%', '$'=>'$%', '\\'=>'\\%']);
$repValue = strtr($repValue, ['%'=>'%%', '$'=>'$%', '\\'=>'\\%']);
$pattern = '/{{' . strtr($field, ['%'=>'%%', '$'=>'$%', '\\'=>'\\%']) . '}}/';
$str = preg_replace($pattern, $repValue, $str );
echo strtr($str, ['%%'=>'%', '$%'=>'$', '\\%'=>'\\']);
Note: if $field contains only a literal string (not a subpattern), you don't need preg_replace at all. You can use str_replace instead, and in that case you don't have to escape anything.
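A minimal sketch of the str_replace variant, with made-up sample values; it assumes the placeholder is plain text of the form {{field}}.

```php
<?php
// Sample values made up for illustration.
$str      = 'Price list: {{description}}';
$repValue = '$0.0 $1.1 \1';
$field    = 'description';

// str_replace() does plain substring replacement:
// neither "$" nor "\" has any special meaning in the replacement.
$out = str_replace('{{' . $field . '}}', $repValue, $str);

echo $out, "\n"; // Price list: $0.0 $1.1 \1
```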

Regex rules in an array

Maybe this can't be solved the way I want, but maybe you can help me.
I have a lot of malformed words in my product names.
Some of them have a leading ( and a trailing ), or only one of these; the same goes for the / and " signs.
What I do is explode the product name on spaces and examine each word.
I want to replace these characters with nothing. But a hard drive could be named 40GB ATA 3.5" hard drive. I need to process every word, but I can't use the same method for 3.5" as for () or //, because the quote in 3.5" is valid.
So I need to remove the quotes only when there is one at the start of the string AND one at the end of the string.
$cases = [
'(testone)',
'(testtwo',
'testthree)',
'/otherone/',
'/othertwo',
'otherthree/',
'"anotherone',
'anothertwo"',
'"anotherthree"',
];
$patterns = [
'/^\(/',
'/\)$/',
'~^/~',
'~/$~',
//Here is what I can not imagine, how to add the rule for `"`
];
$result = preg_replace($patterns, '', $cases);
This works well, but can it be done with a single pattern in one preg_replace() call? If so, can somebody help me with the pattern(s) for the quotes?
Result for quotes should be this:
'"anotherone', //no quote at end leave the leading
'anothertwo"', //no quote at start leave the trailing
'anotherthree', //there are quotes on start and end so remove them.
You may use another approach: rather than define an array of patterns, use one single alternation based regex:
preg_replace('~^[(/]|[/)]$|^"(.*)"$~s', '$1', $s)
Details:
^[(/] - a literal ( or / at the start of the string
| - or
[/)]$ - a literal ) or / at the end of the string
| - or
^"(.*)"$ - a " at the start of the string, then any 0+ characters (due to /s option, the . matches a linebreak sequence, too) that are captured into Group 1, and " at the end of the string.
The replacement pattern is $1 that is empty when the first 2 alternatives are matched, and contains Group 1 value if the 3rd alternative is matched.
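As a quick check, the single pattern can be run against the $cases array from the question. This is only a sketch; it relies on preg_replace() accepting an array as the subject and returning an array of results.

```php
<?php
$cases = [
    '(testone)', '(testtwo', 'testthree)',
    '/otherone/', '/othertwo', 'otherthree/',
    '"anotherone', 'anothertwo"', '"anotherthree"',
];

// Every match in each element is replaced; $1 is empty unless the
// third alternative (quotes at both ends) was the one that matched.
$result = preg_replace('~^[(/]|[/)]$|^"(.*)"$~s', '$1', $cases);

print_r($result);
```

Only the last element loses its quotes; '"anotherone' and 'anothertwo"' come back unchanged.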
Note: In case you need to replace until no match is found, use a preg_match with preg_replace together (see demo):
$s = '"/some text/"';
$re = '~^[(/]|[/)]$|^"(.*)"$~s';
$tmp = '';
while (preg_match($re, $s) && $tmp != $s) {
    $tmp = $s;
    $s = preg_replace($re, '$1', $s);
}
echo $s;
This also works, as long as the quoted alternative is tried first:
preg_replace('~^"(.+)"$|^[/(]?(.+?)[/)]?$~', '$1$2', $string)

PHP Array str_replace Whole Word

I'm doing str_replace on a very long string and my $search is an array.
$search = array(
" tag_name_item ",
" tag_name_item_category "
);
$replace = array(
" tag_name_item{$suffix} ",
" tag_name_item_category{$suffix} "
);
echo str_replace($search, $replace, $my_really_long_string);
The reason why I added spaces on both $search and $replace is because I want to only match whole words. As you would have guessed from my code above, if I removed the spaces and my really long string is:
...
tag_name_item ...
tag_name_item_category ...
...
Then I would get something like
...
tag_name_item_sfx ...
tag_name_item_sfx_category ...
...
This is wrong because I want the following result:
...
tag_name_item_sfx ...
tag_name_item_category_sfx ...
...
So what's wrong?
Nothing really, it works. But I don't like it. Looks dirty, not well coded, inefficient.
I realized I can do something like this using regular expressions using the \b modifier but I'm not good with regex and so I don't know how to preg_replace.
A possible approach using regular expressions would/could look like this:
$result = preg_replace(
'/\b(tag_name_item(_category)?)\b/',
'$1' . $suffix,
$string
);
How it works:
\b: As you say, these are word boundaries; this ensures we're only matching whole words, not word parts
(: We want to use part of our match in the replacement string (tag_name_item has to be replaced with itself + a suffix). That's why we use a match group, so we can refer back to the match in the replacement string
tag_name_item is a literal match for that string.
(_category)?: Another literal match, grouped and made optional through use of the ? operator. This ensures that we're matching both tag_name_item and tag_name_item_category
): end of the first group (the optional _category match is the second group). This group, essentially, holds the entire match we're going to replace
\b: word boundary again
These matches are replaced with '$1' . $suffix. The $1 is a reference to the first match group (everything inside the outer brackets in the expression). You could refer to the second group using $2, but we're not interested in that group right now.
That's all there is to it really
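A runnable sketch of the above, with a made-up $suffix and input string. It uses the braced form '${1}' instead of '$1', which is an addition to the answer: it keeps the reference unambiguous if $suffix ever starts with a digit.

```php
<?php
$suffix = '_sfx'; // hypothetical value
$string = 'x tag_name_item y tag_name_item_category z tag_name_item_other';

// '${1}' refers to the first capture group, exactly like '$1'.
$result = preg_replace(
    '/\b(tag_name_item(_category)?)\b/',
    '${1}' . $suffix,
    $string
);

echo $result, "\n";
// x tag_name_item_sfx y tag_name_item_category_sfx z tag_name_item_other
```

tag_name_item_other is left alone because there is no word boundary between "item" and "_other". If $suffix could contain $ or \ characters, they would need escaping first, since they are special in the replacement string.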
More generic:
So, you're trying to suffix all strings starting with tag_name, which judging by your example, can be followed by any number of snake_cased words. A more generic regex for that would look something like this:
$result = preg_replace(
'/\b(tag_name[a-z_]*)\b/',
'$1' . $suffix,
$string
);
Like before, the use of \b, () and the tag_name literal remains the same. What changed is this:
[a-z_]*: This is a character class. It matches characters a-z (a to z), and underscores zero or more times (*). It matches _item and _item_category, just as it would match _foo_bar_zar_fefe.
These regexes are case-sensitive. If you want to match things like tag_name_XYZ, you'll probably want to use the i flag (case-insensitive): /\b(tag_name[a-z_]*)\b/i
Like before, the entire match is grouped, and used in the replacement string, to which we add $suffix, whatever that might be
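The effect of the i flag can be seen in a small sketch; the input string and $suffix are made up for illustration.

```php
<?php
$suffix = '_sfx'; // hypothetical value
$string = 'tag_name_item tag_name_foo_bar TAG_NAME_XYZ my_tag_name';

// Case-sensitive: the upper-case tag is not matched.
$caseSensitive = preg_replace('/\b(tag_name[a-z_]*)\b/', '${1}' . $suffix, $string);

// Case-insensitive: the i flag makes both the literal and the class match upper case.
$caseInsensitive = preg_replace('/\b(tag_name[a-z_]*)\b/i', '${1}' . $suffix, $string);

echo $caseSensitive, "\n";
// tag_name_item_sfx tag_name_foo_bar_sfx TAG_NAME_XYZ my_tag_name
echo $caseInsensitive, "\n";
// tag_name_item_sfx tag_name_foo_bar_sfx TAG_NAME_XYZ_sfx my_tag_name
```

my_tag_name is untouched in both cases, because the underscore before "tag" is a word character and so \b does not match there.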
To avoid the problem, you can use strtr that parses the string only once and chooses the longest match:
$pairs = [ " tag_name_item " => " tag_name_item{$suffix} ",
" tag_name_item_category " => " tag_name_item_category{$suffix} " ];
$result = strtr($str, $pairs);
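Because strtr prefers the longest key at each position and never rescans text it has already replaced, the padding spaces are not strictly needed to avoid the double-suffix problem. A minimal sketch with made-up values:

```php
<?php
$suffix = '_sfx'; // hypothetical value
$str    = 'tag_name_item and tag_name_item_category';

$pairs = [
    'tag_name_item'          => 'tag_name_item' . $suffix,
    'tag_name_item_category' => 'tag_name_item_category' . $suffix,
];

// At each position the longest matching key wins, so
// tag_name_item_category is never split into tag_name_item + _category.
$result = strtr($str, $pairs);

echo $result, "\n"; // tag_name_item_sfx and tag_name_item_category_sfx
```

Without the spaces this is no longer a whole-word match, though: a string such as tag_name_items would still have its tag_name_item prefix replaced.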
This function removes whole words (not substrings) that match any pattern in an array:
<?php
function removePrepositions($text) {
    // Each pattern matches one whole word, case-insensitively.
    $patterns = array('/\b,\b/i', '/\bthe\b/i', '/\bor\b/i');
    foreach ($patterns as $pattern) {
        $text = preg_replace($pattern, '', trim($text));
    }
    // Always return a defined value, even if the pattern list is empty.
    return trim($text);
}
?>

PHP Remove Brackets on outside of string

I need to remove the outside brackets of a string, but not the inside ones.
For example:
"(-58)" -> "-58"
"('test')" -> "'test'"
"('st())" -> "st()"
" (hd)h(l() ) " -> "hd)h(l() " --> removed all chars up to the bracket
Hopefully you can see what I mean.
I know how to remove all the brackets inside a string, but am not sure how to remove just the first and last ones. I also need it to remove all the chars UP TO the bracket, as there could be a space before/after the bracket which I do not want.
Any help would be greatly appreciated.
One way is to use preg_replace(). This regex replaces leading and trailing brackets (only one) and spaces according to your examples:
/(^\s*\()|(\)\s*$)/
You can use it like this:
$string = ' (hd)h(l() ) ';
$pattern = '/(^\s*\()|(\)\s*$)/';
$replacement = '';
echo preg_replace($pattern, $replacement, $string); // Output: "hd)h(l() "
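Wrapped in a small helper, the same pattern can be applied to several of the sample strings from the question. The function name stripOuterBrackets is made up for this sketch.

```php
<?php
// Hypothetical helper name; the pattern is the one shown above.
function stripOuterBrackets(string $s): string
{
    // Remove one leading "(" (plus whitespace before it) and
    // one trailing ")" (plus whitespace after it).
    return preg_replace('/(^\s*\()|(\)\s*$)/', '', $s);
}

foreach (["(-58)", "('test')", " (hd)h(l() ) "] as $s) {
    var_export(stripOuterBrackets($s));
    echo "\n";
}
// '-58'
// '\'test\''
// 'hd)h(l() '
```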
Using php's trim() may lead to unexpected over-matching. Consider the following implementation:
$strings=[
    "(-58)", // -> "-58"
    "('test')", // -> "'test'"
    "('st())", // -> "st()"
    " (hd)h(l() ) ", // -> "hd)h(l() " --> removed all chars up to the bracket
    " ((2x parentheses))" // -> assumed to be "(2x parentheses)"
];
foreach($strings as $s){
    // use double trim() to strip leading/trailing spaces, then parentheses
    var_export(trim(trim($s,' '),'()'));
    echo "\n";
}
Output:
'-58'
'\'test\''
'\'st'
'hd)h(l() '
'2x parentheses' // notice this string had two sets of parentheses removed!
Although several string manipulating functions could be used to generate the desired string, using regex is a more direct approach.
Given the following input data:
(-58)
('test')
('st())
(hd)h(l() )
((2x parentheses))
Gergo's pattern will accurately replace the desired characters with an empty string in 261 steps.
I would like to suggest a more efficient and brief pattern with equivalent accuracy given the OP's sample strings.
/^ ?\(|\) ?$/ #142 steps
Enhancements:
Capture groups are not needed for the pipe to separate the two alternatives. (improves efficiency and brevity)
Replace white-space character \s with literal space character. (improves brevity *and potentially accuracy)
Reduce quantifier on spaces from zero or more to zero or one. (more literal to sample data)
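The difference between the double trim() and the shorter pattern shows up on the doubly parenthesised sample. A minimal sketch:

```php
<?php
$s = ' ((2x parentheses))';

// trim() strips every leading/trailing character from its list,
// so both layers of parentheses disappear.
$viaTrim = trim(trim($s, ' '), '()');

// The regex removes at most one "(" at the start and one ")" at the end.
$viaRegex = preg_replace('/^ ?\(|\) ?$/', '', $s);

var_export($viaTrim);  // '2x parentheses'
echo "\n";
var_export($viaRegex); // '(2x parentheses)'
echo "\n";
```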
You can use trim() function as shown below
$a="(-58)";
$str=trim($a,"()");
echo $str;

Regular expression to change user input

I am doing a search on my database. If the user enters eg. "foo bar" I want to display rows that have both "foo" AND "bar".
My SQL syntax looks like this:
SELECT * FROM t1 WHERE MATCH (t1.foo_desc, t2.bar_desc) AGAINST ('+foo* +bar*')
How can I change the user input "foo bar" into "+foo* +bar*" using a regular expression in PHP?
$match = '+' . implode('* +', explode(' ', $input)) . '*';
This assumes that the input isn't an empty string.
Edit: As @Bart S points out, str_replace would be even simpler. It is binary-safe, so it also handles multibyte UTF-8 input (PHP has no built-in mb_str_replace):
$match = '+' . str_replace(' ', '* +', $input) . '*';
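Both one-liners give the same result for single-spaced input. The sketch below, with made-up sample input, also shows what happens with a doubled space, which is the kind of input either version would need to be protected against.

```php
<?php
$input = 'foo bar'; // sample input

$viaImplode = '+' . implode('* +', explode(' ', $input)) . '*';
$viaReplace = '+' . str_replace(' ', '* +', $input) . '*';

echo $viaImplode, "\n"; // +foo* +bar*
echo $viaReplace, "\n"; // +foo* +bar*

// Caveat: every single space becomes a separator, so a doubled
// space produces an empty term.
$doubled = '+' . str_replace(' ', '* +', 'foo  bar') . '*';
echo $doubled, "\n";    // +foo* +* +bar*
```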
You should use \b and \B to identify the word/not-word boundaries, and thus insert your +/*.
First, trim the user input and remove anything strange (punctuation, quotes, basically everything "\W", except white space of course).
Then, substitute:
(?<=\w)\b
with
"*"
and:
\b(?=\w)
with
"+"
For those interested in the MATCH...AGAINST syntax: this article is a decent starting point if you haven't used it before.
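The clean-up step and the two boundary substitutions described above could be sketched as follows. The sample input and the exact clean-up pattern are assumptions made for illustration; the two boundary patterns are the ones given in the answer.

```php
<?php
$input = ' foo, "bar"! '; // sample input, made up for illustration

// Step 1: drop everything that is neither a word character nor whitespace, then trim.
$clean = trim(preg_replace('/[^\w\s]+/', '', $input));

// Step 2: insert "*" at the end of each word, then "+" at the start of each word.
// With array arguments, preg_replace() applies the pairs one after the other.
$match = preg_replace(
    ['/(?<=\w)\b/', '/\b(?=\w)/'],
    ['*', '+'],
    $clean
);

echo $match, "\n"; // +foo* +bar*
```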
I would use a regular expression to replace the spaces between two words by * +:
'+' . preg_replace('/\s+/', '* +', trim($input)) . '*'
Edit: Basically, this is an implementation of what Tomalak said, just with \s instead of \W.
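A slightly more defensive sketch of the same idea, wrapped in a function. The name toBooleanTerms and the empty-input check are additions made for this example, not part of the answer.

```php
<?php
// Hypothetical helper name; returns '' when there is nothing to search for.
function toBooleanTerms(string $input): string
{
    $input = trim($input);
    if ($input === '') {
        return '';
    }
    // Any run of whitespace between two words becomes a single "* +" separator.
    return '+' . preg_replace('/\s+/', '* +', $input) . '*';
}

echo toBooleanTerms("  foo \t bar "), "\n"; // +foo* +bar*
```

The resulting string should still be passed to the database as a bound parameter rather than concatenated into the SQL text.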
