Regex escape specific characters - php

I'm using preg_split to make array with some values.
If I have a value such as 'This*Value', preg_split splits it into array('This', 'Value') because of the * in the value, but I want to split only where I specified, not on the * inside the value. How can I escape the value so that symbols in the string have no effect on the expression?
Example:
// Cut into {$1:$2}
$str = "{Some:Value*Here}";
$result = preg_split("/[\{(.*)\:(.*)\}]+/", $str, -1, PREG_SPLIT_NO_EMPTY);
// Result:
Array(
'Some',
'Value',
'Here'
);
// Results wanted:
Array(
'Some',
'Value*Here'
);

The [ and ] delimit a character class, so every character listed inside them (including the *) is matched as a delimiter. Try this pattern instead, but don't split on it; use preg_match and read the match's captured groups.
"/(\{([^:]*)\:([^:]*)\})+/"
Original answer (which does not apply to the OP's problem):
If you want to escape * in your values with \ like this\*value, you can split on this regex:
(?<!\\)\*
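As a rough sketch of the preg_match approach suggested at the top of this answer: the pattern below is slightly adapted so that the second group also stops at the closing brace, and the variable names are only illustrative.

```php
<?php
// Sketch: capture the two parts with preg_match instead of splitting.
$str = "{Some:Value*Here}";

$result = [];
// Group 1: everything up to the first colon.
// Group 2: everything after it, up to the closing brace.
if (preg_match('/\{([^:]*):([^:}]*)\}/', $str, $m)) {
    $result = [$m[1], $m[2]];
}

print_r($result); // 'Some' and 'Value*Here'
```

Because the * sits inside a captured group rather than in a delimiter class, it is kept as part of the value.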

Your current regular expression is a little... wild. Most special characters inside a character class are treated literally, so it can be greatly simplified:
$str = "{Some:Value*Here}";
$result = preg_split("/[{}:]+/", $str, -1, PREG_SPLIT_NO_EMPTY);
And now $result looks like this:
array(2) {
[0] => string(4) "Some"
[1] => string(10) "Value*Here"
}

The correct and safest solution to your problem is to use preg_quote. If the string contains chars that shall not be quoted, you need to str_replace them back after quoting.
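A minimal sketch of what preg_quote does, assuming the value arrives from outside and should be matched literally. The sample strings and variable names are made up for illustration.

```php
<?php
// Sketch: use an arbitrary value as a literal delimiter in preg_split.
$value   = 'This*Value';       // contains a regex metacharacter
$subject = 'a,This*Value,b';   // hypothetical input

// The second argument tells preg_quote to also escape the pattern delimiter "/".
$quoted  = preg_quote($value, '/');
$pattern = '/' . $quoted . '/';

$parts = preg_split($pattern, $subject);

var_dump($quoted); // This\*Value
print_r($parts);   // 'a,' and ',b'
```

Without the quoting step, the * would be read as a quantifier on the preceding s and the literal value would not match.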

Related

Php regexp for escaping characters

I have a string that the user may split manually using commas.
For example, the string value1,value2,value3 should result in the array:
["value1", "value2", "value3"]
Now what if the user wishes to allow a comma as a substring? I would like to solve that problem by letting the user escape a comma using two commas or a backslash. For example, the string
"Hi, Stackoverflow" would be written as "Hi,, Stackoverflow" or "Hi\, Stackoverflow".
I find it difficult to evaluate such a string, however. I have attempted splitting with preg_split, but there is no way to check whether a run of characters in a lookbehind or lookahead has an even or odd length. Furthermore, the backslashes and doubled commas used for escaping must be removed as well, which probably requires an additional replace step.
$text = 'Hello, World \,asdas, 123';
$data = preg_split('/(?<=[^\\\]),/',$text);
print_r($data);
Result
Array ( [0] => Hello [1] => World \,asdas [2] => 123 )
For this I would use preg_replace_callback, which lets you count the escape characters and decide what to do with them. If it turns out that a comma is not escaped, replace it with some non-printable character that the user should not use in the input, and then explode on that character:
<?php
$str = "One,Two\\, Two\\\\,Three";
$delimiter = chr(0x0B); // vertical tab, hope you do not expect it in the input?
$escaped = preg_replace_callback('/(\\\\)*,?/', function($m) use($delimiter){
if(!isset($m[1]) || strlen($m[0])%2) {
return str_replace(',',$delimiter,preg_replace('/\\\\{2}/','\\',$m[0]));
} else {
return str_replace('\\,',',', preg_replace('/\\\\{2}/','\\',$m[0]));
}
}, $str);
$array = explode($delimiter, $escaped);
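As an alternative sketch (not the method used in the answer above), the backtracking control verbs (*SKIP)(*F) can make preg_split step over any backslash-escaped character, so only unescaped commas act as delimiters. This assumes the backslash-escape convention only (not the doubled-comma one), and the variable names are illustrative.

```php
<?php
// Sketch: split on unescaped commas, then remove the escaping backslashes.
$str = "One,Two\\, Two\\\\,Three"; // One,Two\, Two\\,Three

// "\\." consumes a backslash plus the escaped character and is then
// discarded via (*SKIP)(*F); only a bare "," is left as a real delimiter.
$parts = preg_split('/\\\\.(*SKIP)(*F)|,/s', $str);

// Unescape: drop each escaping backslash and keep the character after it.
$array = array_map(function ($part) {
    return preg_replace('/\\\\(.)/s', '$1', $part);
}, $parts);

print_r($array); // 'One', 'Two, Two\' and 'Three'
```

This avoids choosing a placeholder delimiter, at the cost of a pattern that is harder to read.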

Regular expression needed for PHP preg_split

I need help with a regular expression in PHP.
I have one string containing a lot of data and the format could be like this.
key=value,e4354ahj\,=awet3,asdfa\=asdfa=23f23
So I have 2 delimiters , and = where , is the set of key and value. The thing is that key and value can contain the same symbols , and = but they will always be escaped. So I cant use explode. I need to use preg_split but I am no good at regular expressions.
Could someone give me a hand with this one?
You need to use negative lookbehind:
// 4 backslashes because they are in a PHP string, so PHP translates them to \\
// and then the regex engine translates the \\ to a literal \
$keyValuePairs = preg_split('/(?<!\\\\),/', $input);
This will split on every , that is not escaped, so you get key-value pairs. You can do the same for each pair to separate the key and value:
list($key, $value) = preg_split('/(?<!\\\\)=/', $pair);
See it in action.
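Putting the two splits together, a minimal sketch might look like this. The $input value is the sample string from the question, $result is a name chosen for illustration, and the limit of 2 on the inner split is an addition so that a pair never yields more than a key and a value.

```php
<?php
// Sketch: parse "key=value" pairs where "," and "=" may be backslash-escaped.
$input = 'key=value,e4354ahj\,=awet3,asdfa\=asdfa=23f23';

$result = [];
// Outer split: every comma that is not preceded by a backslash.
foreach (preg_split('/(?<!\\\\),/', $input) as $pair) {
    // Inner split: the first "=" that is not preceded by a backslash.
    list($key, $value) = preg_split('/(?<!\\\\)=/', $pair, 2);
    $result[$key] = $value;
}

print_r($result);
```

Note that a single-character lookbehind cannot tell an escaped backslash followed by a delimiter apart from an escaped delimiter, so this sketch assumes backslashes themselves are never escaped.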
@Jon's answer is awesome. I thought of providing a solution that matches the string instead:
preg_match_all('#(.*?)(?<!\\\\)=(.*?)(?:(?<!\\\\),|$)#', $string, $m);
// You'll find the keys in $m[1] and the values in $m[2]
$array = array_combine($m[1], $m[2]);
print_r($array);
Output:
Array
(
[key] => value
[e4354ahj\,] => awet3
[asdfa\=asdfa] => 23f23
)
Explanation:
(.*?)(?<!\\\\)= : match anything and group it until = not preceded by \
(.*?)(?:(?<!\\\\),|$) : match anything and group it until , not preceded by \ or end of line.

Regex To Break String Apart By Capitalization

So I have a regex that breaks a string apart that assumes camelCase or PascalCase and converts it into lowercase_with_underscores. That regex looks like this (php):
strtolower(preg_replace('/(?!^)[[:upper:]]/','_\0', $string));
I want to modify this so that it also treats a run of consecutive capital letters as one unit when breaking up the string. For example, I would like to be able to break up the following strings:
'GUID' => 'guid'
'SOME_VALUES' => 'some_value'
'someThingELSE' => 'some_thing_else'
Any suggestions on how to modify the regex to do this?
How about:
$result = strtolower(preg_replace('/([a-z])([A-Z])/', '$1_$2', $string));
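To check the suggestion against the sample inputs from the question, here is a small sketch. The wrapper function to_snake_case is a hypothetical name introduced only for this illustration.

```php
<?php
// Sketch: insert "_" only between a lowercase letter and an uppercase letter,
// so a run of capitals stays together as one unit.
function to_snake_case($string)
{
    return strtolower(preg_replace('/([a-z])([A-Z])/', '$1_$2', $string));
}

$samples = ['GUID', 'SOME_VALUES', 'someThingELSE'];
$converted = array_map('to_snake_case', $samples);

print_r($converted); // 'guid', 'some_values', 'some_thing_else'
```

Note that 'SOME_VALUES' comes out as 'some_values', with the trailing s kept, since the pattern only inserts underscores and never removes characters.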

Get integer value from malformed query string

I'm looking for a way to parse a substring using PHP, and have come across preg_match; however, I can't seem to work out the rule that I need.
I am parsing a web page and need to grab a numeric value from the string, the string is like this
producturl.php?id=736375493?=tm
I need to be able to obtain this part of the string:
736375493
$matches = array();
preg_match('/id=([0-9]+)\?/', $url, $matches);
This stays safe if the format changes. slandau's answer won't work if the URL ever contains any other numbers.
php.net/preg-match
<?php
$string = "producturl.php?id=736375493?=tm";
preg_match('~id=(\d+)~', $string, $m );
var_dump($m[1]); // $m[1] is your string
?>
$string = "producturl.php?id=736375493?=tm";
$number = preg_replace("/[^0-9]/", '', $string);
Unfortunately, you have a malformed url query string, so a regex technique is most appropriate. See what I mean.
There is no need for capture groups. Just match id=, then forget those characters with \K, then isolate the following one or more digit characters.
Code (Demo)
$str = 'producturl.php?id=736375493?=tm';
echo preg_match('~id=\K\d+~', $str, $out) ? $out[0] : 'no match';
Output:
736375493
For completeness, there is another way to scan the formatted string and explicitly return an int-typed value. (Demo)
var_dump(
sscanf($str, '%*[^?]?id=%d')[0]
);
The %*[^?] means: greedily match one or more non-question mark characters, but do not capture the substring. The remainder of the format parameter matches the literal sequence ?id=, then greedily captures one or more numbers. The returned value will be cast as an integer because of the %d placeholder.
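The practical difference between the two techniques in this answer is the type of the result. A short side-by-side sketch, with $fromRegex and $fromScan as names chosen for illustration:

```php
<?php
// Sketch: compare the result types of preg_match with \K and sscanf with %d.
$str = 'producturl.php?id=736375493?=tm';

// preg_match always yields strings in the match array.
$fromRegex = preg_match('~id=\K\d+~', $str, $out) ? $out[0] : null;

// sscanf converts the %d placeholder to an integer.
$fromScan = sscanf($str, '%*[^?]?id=%d')[0];

var_dump($fromRegex); // string(9) "736375493"
var_dump($fromScan);  // int(736375493)
```

If the value is later used in arithmetic or strict comparisons, the sscanf variant saves an explicit cast.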

Split text using multiple delimiters into an array of trimmed values

I've got a group of strings which I need to chunk into an array.
The string needs to be split on either /, ,, with, or &.
Unfortunately it is possible for a string to contain two of the strings which needs to be split on, so I can't use split() or explode().
For example, a string could say first past/ going beyond & then turn, so I am trying to get an array that would return:
array('first past', 'going beyond', 'then turn')
The code I am currently using is
$splittersArray=array('/', ',', ' with ','&');
foreach($splittersArray as $splitter){
if(strpos($string, $splitter)){
$splitString = split($splitter, $string);
foreach($splitString as $split){
I can't seem to find a function in PHP that allows me to do this.
Do I need to be passing the string back into the top of the funnel, and continue to go through the foreach() after the string has been split again and again?
This doesn't seem very efficient.
Use a regular expression and preg_split.
In the case you mention, you would get the split array with:
$splitString = preg_split('/(\/|\,| with |\&)/', $string);
To write the pattern concisely, use a character class for the single-character delimiters and add the with delimiter as an alternative after the pipe (the "or" character in regex). Allow an optional space on either side of the group of delimiters so that the values in the output don't need to be trimmed.
I am using the PREG_SPLIT_NO_EMPTY function flag in case a delimiter occurs at the start or end of the string and you don't want to have any empty elements generated.
Code: (Demo)
$string = 'first past/ going beyond & then turn with everyone';
var_export(
preg_split('~ ?([/,&]|with) ?~', $string, 0, PREG_SPLIT_NO_EMPTY)
);
Output:
array (
0 => 'first past',
1 => 'going beyond',
2 => 'then turn',
3 => 'everyone',
)
