Extract all string values from code - php

Hi everyone. I have a problem and I can't resolve it.
Pattern: \'(.*?)\'
Source string: 'abc', 'def', 'gh\'', 'ui'
I need [abc], [def], [gh\'], [ui]
But I get [abc], [def], [gh\], [, ] etc.
Is it possible? Thanks in advance

PHP Code: Using negative lookbehind
$s = "'abc', 'def', 'ghf\\\\', 'jkl\'f'";
echo "$s\n";
if (preg_match_all("~'.*?(?<!(?:(?<!\\\\)\\\\))'~", $s, $arr))
var_dump($arr[0]);
OUTPUT:
array(4) {
[0]=>
string(5) "'abc'"
[1]=>
string(5) "'def'"
[2]=>
string(7) "'ghf\\'"
[3]=>
string(8) "'jkl\'f'"
}
Live Demo: http://ideone.com/y80Gas

Yes, those matches are possible.
But if you mean to ask whether it's possible to get what's inside the quotes, the easiest here would be to split by comma (through a CSV parser preferably) and trim any trailing spaces.
Otherwise, you could try something like:
\'((?:\\\'|[^\'])+)\'
Which will match either \' or a non-quote character, but will fail against stuff like \\'...
A longer, and slower regex you might use for this case is:
\'((?:(?<!\\)(?:\\\\)*\\\'|[^\'])+)\'
In PHP:
preg_match_all('/\'((?:(?<!\\)\\\'|[^\'])+)\'/', $text, $match);
Or if you use double quotes:
preg_match_all("/'((?:(?<!\\\)\\\'|[^'])+)'/", $text, $match);
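The single-quoted version errors on (?<!\\) (where I really mean one literal backslash) because PHP collapses \\ to a single backslash before the pattern reaches the regex engine. The engine then sees (?<!\), where \) is an escaped parenthesis, so the lookbehind is never closed. It works if the pattern is changed to (?<!\\\\), which the engine receives as (?<!\\).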
ideone demo
EDIT: Found a simpler, better, faster regex:
preg_match_all("/'((?:[^'\\\\]|\\\\.)+)'/", $text, $match);
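To see that last pattern at work on the string from the question, here is a minimal sketch. The variable names are mine, and the input is written as a nowdoc only so that PHP does not touch the backslash. Note the four backslashes in the PHP source: PHP turns them into two, and the regex engine reads those two as one literal backslash.

```php
<?php
// The source string from the question, taken literally (nowdoc = no escaping).
$text = <<<'TXT'
'abc', 'def', 'gh\'', 'ui'
TXT;

// Between the quotes, accept either a character that is neither a quote nor
// a backslash, or a backslash followed by any single character.
$pattern = "/'((?:[^'\\\\]|\\\\.)+)'/";

preg_match_all($pattern, $text, $match);

// $match[1] holds the contents of each quoted string, escapes left intact.
print_r($match[1]);
```

Because a backslash always consumes the character after it, an escaped quote can never be mistaken for the closing quote, and no lookbehind is needed.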

<?php
// string to extract data from
$string = "'abc', 'def', 'gh\'', 'ui'";
// make the string into an array with a comma as the delimiter
$strings = explode(",", $string);
// OPTION 1: keep the escaped single quote (\')
// strip every single quote and space, then turn the remaining backslash back into \'
$replacee = ["'", " "];
$strings = str_replace($replacee, "", $strings);
$strings = str_replace("\\", "\'", $strings);
// OPTION 2: drop the escaped quote as well (uncomment the triple-slash lines)
// strip the single quotes, the backslash, and the spaces in one pass
/// $replacee = ["'", "\\", " "];
/// $strings = str_replace($replacee, "", $strings);
var_dump($strings);
?>

Instead you should use str_getcsv
str_getcsv("'abc', 'def', 'gh\'', 'ui'", ",", "'");

Related

PHP - How to remove specific character at a string?

I have several + symbols in my string, and I want to remove the last one and every character after it.
Example:
Giza+badrashen+test
You can explode your string on '+' and then join it back together, dropping the last element of the split (array_slice with a negative length), like this (assuming $str is your string):
$result = join('+', array_slice(explode('+', $str), 0, -1));
In case you suspect your string may not contain a '+', you can check for its presence first with strpos:
if (strpos($str, '+') !== false) {
    $result = join('+', array_slice(explode('+', $str), 0, -1));
}
A simple regex solution:
Assuming you have Giza+badrashen+test and want Giza+badrashen as result.
echo preg_replace("/\+[^\+]*$/", "", "Giza+badrashen+test");
Tests:
var_dump(preg_replace("/\+[^\+]*$/", "", "Giza+badrashen+test"));
var_dump(preg_replace("/\+[^\+]*$/", "", "Giza+badrashen+test+"));
var_dump(preg_replace("/\+[^\+]*$/", "", "Giza"));
Output:
string(14) "Giza+badrashen"
string(19) "Giza+badrashen+test"
string(4) "Giza"
$string = "abc1234+12+3455+xzyabc";
$string = substr($string, 0, strrpos($string,"+"));
echo $string;
> abc1234+12+3455
EDIT: and gives an empty string if there is no + but it doesn't crash/fail
EDIT2: I slightly mis-read the question the first time, my edited answer is even simpler
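If you would rather get the original string back when there is no + at all, a small wrapper around the same strrpos/substr idea could look like this. The function name stripAfterLastPlus is just something I made up for the example.

```php
<?php
// Hypothetical helper: cut off the last '+' and everything after it.
// If the string contains no '+', it is returned unchanged.
function stripAfterLastPlus(string $s): string
{
    $pos = strrpos($s, '+');          // position of the last '+', or false
    if ($pos === false) {
        return $s;                    // nothing to cut
    }
    return substr($s, 0, $pos);       // keep everything before the last '+'
}

echo stripAfterLastPlus('abc1234+12+3455+xzyabc'), "\n";
echo stripAfterLastPlus('Giza'), "\n";
```

The strict === false check matters, because a + at position 0 would otherwise be treated as "not found".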
Regex with a negative lookahead might be the most compact solution:
$myString="foo-bar+foo+foobar";
$result = preg_split("/\+(?!.*\+)/", $myString);
echo $result[0];
//result: foo-bar+foo
There is no need for an additional check, because if no + is found it just gives back the original string.
Just note that the + must be escaped, since it has a special meaning in all flavors of regex.

Explode the string to array in php

I have a string
string(22) ""words,one","words2""
and need to explode to an array having structure
array( [0] => words,one ,[1] => words2)
To continue on the explode option you mentioned trying, you could try the following:
$str = '"words,one","words2"';
$arr = explode('","', trim($str, '"'));
print_r($arr);
Notice the trim to remove the beginning and ending quote marks, while explode uses the inner quote marks as part of the delimiter.
Output
Array
(
[0] => words,one
[1] => words2
)
I assume your "" is a typo for "\" or '".
I use regex to capture what is inside the " with (.*?), where the ? makes the match lazy.
I escape the " as \" so that PHP reads them literally inside the double-quoted pattern.
You will have your words in $m[1].
$str = '"words,one","words2"';
preg_match_all("/\"(.*?)\"/", $str, $m);
var_dump($m);
https://3v4l.org/G4m4f
In case that is not a typo you can use this:
preg_match_all("/\"+(.*?)\"+/", $str, $m);
Here I add a + after each " which means "one or more of them".
Using preg_split you can try :
$str = '"words,one","words2"';
$matches = preg_split("/\",\"/", trim($str, '"'));
print_r($matches);
check : https://eval.in/945572
Assuming the input string can be broken down as follows:
The surrounding double-quotes are always present and consist of one double-quote each.
"words,one","words2" is left after removing the surrounding double-quotes.
We can extract a csv formatted string that fgetcsv can parse.
Trimming the original and wrapping it in a stream allows us to use fgetcsv. See sample code on eval.in
$fullString= '""words,one","words2""';
$innerString = substr($fullString, 1, -1);
$tempFileHandle = fopen("php://memory", 'r+');
fputs($tempFileHandle , $innerString);
rewind($tempFileHandle);
$explodedString = fgetcsv($tempFileHandle, 0, ',', '"');
fclose($tempFileHandle);
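If you would rather skip the temporary stream, str_getcsv parses a CSV string directly. This is only a sketch under the same assumption as above (exactly one extra double quote on each side); the variable names are mine, and the escape character is passed explicitly because newer PHP versions complain when it is left out.

```php
<?php
$fullString = '""words,one","words2""';

// Drop the single surrounding double quote on each side.
$innerString = substr($fullString, 1, -1);

// Parse what is left as one CSV line: comma delimiter, double-quote enclosure,
// backslash escape (spelled out rather than relying on the default).
$explodedString = str_getcsv($innerString, ',', '"', '\\');

print_r($explodedString);
```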
This method also supports other similarly formatted strings:
""words,one","words2""
""words,one","words2","words3","words,4""

RegEx Fails on Known Good Value

I have a regex designed to detect plausible Base64 strings. It works in tests at https://regex101.com for all expected test values.
~^((?:[a-zA-Z0-9/+]{4})*(?:(?:[a-zA-Z0-9/+]{3}=)|(?:[a-zA-Z0-9/+]{2}==))?)$~
However, when I use this pattern in PHP, I find some values inexplicably fail.
$tests = array(
'MFpGQkVBJTNkJTNkfTxCUj4NCg0KICAgIDwvZm9udD4=',
'MFpGRkVBJTNkJTNkfTxCUj4NCg0KICAgIDwvZm9udD4=',
'MFpGSkVBJTNkJTNkfTxCUj4NCg0KICAgIDwvZm9udD4=',
);
foreach ($tests as $str) {
$result = preg_match(
'~^((?:[a-zA-Z0-9/+]{4})*(?:(?:[a-zA-Z0-9/+]{3}=)|(?:[a-zA-Z0-9/+]{2}==))?)$~i',
preg_replace('~[\s\R]~u', "", $str)
);
var_dump($result);
}
results:
int(1)
int(0)
int(1)
Question: Why does this pattern fail for the second test string?
The problem is in your preg_replace call:
preg_replace('~[\s\R]~u', "", $str)
Inside a character class, \R loses its newline meaning and matches a literal R. It therefore removes the R from the 2nd element in the array, which causes preg_match to fail.
Change it to:
preg_replace('~\s|\R~u', "", $str)
Since \s already matches the usual newline characters (\r and \n), you can just do:
preg_replace('~\s+~u', "", $str)
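For what it's worth, here is a quick sketch that reruns the three test values from the question with the whitespace stripped by \s+ instead. I dropped the outer capturing group and the i flag from the pattern since neither is needed here. That is my simplification, not part of the original.

```php
<?php
// Plausible-Base64 check: groups of 4 characters, optional padded tail.
$pattern = '~^(?:[a-zA-Z0-9/+]{4})*(?:[a-zA-Z0-9/+]{3}=|[a-zA-Z0-9/+]{2}==)?$~';

$tests = [
    'MFpGQkVBJTNkJTNkfTxCUj4NCg0KICAgIDwvZm9udD4=',
    'MFpGRkVBJTNkJTNkfTxCUj4NCg0KICAgIDwvZm9udD4=', // contains a literal R
    'MFpGSkVBJTNkJTNkfTxCUj4NCg0KICAgIDwvZm9udD4=',
];

$results = [];
foreach ($tests as $str) {
    // Remove whitespace only; no character class, so no stray literal R.
    $clean = preg_replace('~\s+~u', '', $str);
    $results[] = preg_match($pattern, $clean);
}

var_dump($results);
```

All three values should now come back as a match, because the R in the second one is no longer removed before validation.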

php preg_match get numbers between two strings

Hi I'm starting to learn php regex and have the following problem:
I need to extract the numbers inside $string.
The regex I use returns "NULL".
$string = 'Clasificación</a> (2194) </li>';
$regex = '/Clasificación</a>((.*?))</li>/';
preg_match($regex , $string, $match);
var_dump($match);
Thanks in advance.
There are three problems with your regex:
You aren't escaping the forward slash. You're using the forward slash as a delimiter, so if you want to use it as a literal character inside the expression, you need to escape it
((.*?)) doesn't do what you think it does. It creates two capturing groups -- one nested inside the other. I assume, you're trying to capture what's inside the parentheses. For that, you'll need to escape the ( and ) characters. The expression would become: \((.*?)\)
Your expression doesn't handle whitespace. In the string you've given, there is whitespace between the </a> and the beginning of the number -- </a> (2194). To ignore the whitespace and capture just the number, you need to use \s (which matches any whitespace character). For that, you need to write \s*\((.*?)\)\s*.
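The difference between the nested groups and the escaped parentheses from the second point is easy to see on a tiny input. This is just an illustrative sketch; the variable names are mine.

```php
<?php
$input = '(2194)';

// Unescaped: two nested capturing groups around a lazy .*?, which is
// satisfied by matching nothing at all at the very start of the string.
preg_match('~((.*?))~', $input, $nested);

// Escaped: literal ( and ) around a single capturing group.
preg_match('~\((.*?)\)~', $input, $escaped);

var_dump($nested[0]);   // the overall match is an empty string
var_dump($escaped);     // overall match plus the captured number
```

With the escaped version, $escaped[0] is the whole "(2194)" and $escaped[1] is just "2194".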
The final regular expression after fixing all the above errors, will look like:
$regex = '~Clasificación</a>\s*\((.*?)\)\s*</li>~';
Full code:
$string = 'Clasificación</a> (2194) </li>';
$regex = '~Clasificación</a>\s*\((.*?)\)\s*</li>~';
preg_match($regex , $string, $match);
var_dump($match);
Output:
array(2) {
[0]=>
string(32) "Clasificación</a> (2194) </li>"
[1]=>
string(4) "2194"
}
Demo.
You forgot to escape the / in your regex, since you're using / as the delimiter:
$regex = '/Clasificación<\/a>((.*?))<\/li>/';
//        ^ opening delimiter             ^ closing delimiter
//                       ^^          ^^ escaped / inside the pattern
Another way can be to change that delimiter, and then you will not have to escape it:
$regex = '#Clasificación</a>((.*?))</li>#';
See the PHP documentation for more information.
You will have to escape the special characters that you want to match literally:
$regex = '/Clasificación<\/a> \((.*?)\) <\/li>/';
You may also want to make your match a little more specific where it matters (depending on your use case):
$regex = '/Clasificación<\/a>\s*\(([0-9]+)\)\s*<\/li>/';
That allows zero or more whitespace characters before and after the (1234) and only matches if there are only digits inside the ().
I just tried this in php:
php > preg_match($regex , $string, $match);
php > var_dump($match);
array(2) {
[0]=>
string(30) "Clasificacin</a> (2194) </li>"
[1]=>
string(4) "2194"
}

Regex with multiple newlines in sequence

I'm trying to use PHP's split() (preg_split() is also an option if your answer works with it) to split up a string on 2 or more \r\n's. My current effort is:
split("(\r\n){2,}",$nb);
The problem with this is it matches every time there is 2 or 3 \r\n's, then goes on and finds the next one. This is ineffective with 4 or more \r\n's.
I need all instances of two or more \r\n's to be treated the same as two \r\n's. For example, I'd need
Hello\r\n\r\nMy\r\n\r\n\r\n\r\n\r\n\r\nName is\r\nShadow
to become
array('Hello','My','Name is\r\nShadow');
preg_split() should do it with
$pattern = "/(\\r\\n){2,}/";
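As a rough check, here is that pattern run against the example from the question. Because {2,} is greedy, the whole run of six \r\n's is consumed as one delimiter, so no empty elements show up in the middle.

```php
<?php
// Double quotes here so that \r\n are real carriage return / line feed characters.
$nb = "Hello\r\n\r\nMy\r\n\r\n\r\n\r\n\r\n\r\nName is\r\nShadow";

// In the pattern, \\r\\n reaches the regex engine as \r\n, which it interprets itself.
$pattern = "/(\\r\\n){2,}/";

$parts = preg_split($pattern, $nb);

var_dump($parts);
```

The single \r\n between "Name is" and "Shadow" is left alone, since it takes at least two in a row to split.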
What about the following suggestion:
$nb = implode("\r\n", array_filter(explode("\r\n", $nb)));
It works for me:
$nb = "Hello\r\n\r\nMy\r\n\r\n\r\n\r\n\r\n\r\nName is\r\nShadow";
$parts = split("(\r\n){2,}",$nb);
var_dump($parts);
var_dump($parts === array('Hello','My',"Name is\r\nShadow"));
Prints:
array(3) {
[0]=>
string(5) "Hello"
[1]=>
string(2) "My"
[2]=>
string(15) "Name is
Shadow"
}
bool(true)
Note the double quotes in the second test to get the characters represented by \r\n.
Adding the PREG_SPLIT_NO_EMPTY flag to preg_split() with Tomalak's pattern of "/(\\r\\n){2,}/" accomplished this for me.
\R is shorthand for matching newline sequences across different operating systems. You can prevent empty elements being created at the start and end of your output array by using the PREG_SPLIT_NO_EMPTY flag or you could call trim() on the string before splitting.
Code: (Demo)
$string = "\r\n\r\nHello\r\n\r\nMy\r\n\r\n\r\n\r\n\r\n\r\nName is\r\nShadow\r\n\r\n\r\n\r\n";
var_export(preg_split('~\R{2,}~', $string, 0, PREG_SPLIT_NO_EMPTY));
echo "\n---\n";
var_export(preg_split('~\R{2,}~', trim($string)));
Output from either technique:
array (
0 => 'Hello',
1 => 'My',
2 => 'Name is
Shadow',
)
