Get all words from a string with a regular expression - PHP

I need a preg_match call that finds all the words in a string.
For example:
$str = "string: hi, it is string.";
I would like to get this:
[0] => string
[1] => hi
[2] => it
[3] => is
[4] => string
I am using '/[a-z]+/ui', but I get this:
[0] => string:
[1] => hi,
[2] => it
[3] => is
[4] => string.

You mention preg_match(), but you should be using preg_match_all(), which returns every match rather than only the first. There is also no need for the u modifier in your regular expression here, since the pattern only contains ASCII letters.
$str = "string: hi, it is string.";
preg_match_all('/[a-z]+/i', $str, $matches);
print_r($matches[0]);
Output
Array
(
[0] => string
[1] => hi
[2] => it
[3] => is
[4] => string
)
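The u modifier does start to matter once the words may contain non-ASCII letters. A minimal sketch, assuming the script and the input are both UTF-8 encoded; the sample sentence and the variable names are made up for illustration:

```php
<?php
// Hypothetical sample input containing accented letters (assumes UTF-8).
$text = "héllo wörld: hi.";

// ASCII-only class: accented letters are not in a-z, so words get cut apart.
preg_match_all('/[a-z]+/i', $text, $ascii);

// \p{L} matches any Unicode letter; the u modifier makes PCRE read UTF-8.
preg_match_all('/\p{L}+/u', $text, $unicode);

print_r($ascii[0]);   // h, llo, w, rld, hi
print_r($unicode[0]); // héllo, wörld, hi
```

For plain English input like the string in the question, the simpler '/[a-z]+/i' is enough.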

Related

PHP preg_match() doesn't match all subpatterns

I have a preg_match() call that matches the pattern but does not return the expected matches (in the third parameter).
My regex pattern has multiple subpatterns.
$pattern = "~^&multi&[^&]+(&(?:(p-(?<sad>[1-9]\d*)|page-(?<sad>[1-9]\d*))))?&[^&]+(&(?:(p-(?<gogosi>[1-9]\d*)|page-(?<gogosi>[1-9]\d*))))?&?$~J";
$string = "&multi&mickael&p-23&george&page-34";
preg_match($pattern, $string, $matches);
This is what $matches contains:
Array
(
[0] => &multi&mickael&p-23&george&page-34
[1] => &p-23
[2] => p-23
[sad] =>
[3] => 23
[4] =>
[5] => &page-34
[6] => page-34
[gogosi] => 34
[7] =>
[8] => 34
)
The problem is that [sad] should have the value 23.
If I leave the second page (page-34) out of $string, because it is optional [...]
$string = "&multi&mickael&p-23&george";
[...] then $matches is correct, because [sad] gets its value:
Array
(
[0] => &multi&mickael&p-23&george
[1] => &p-23
[2] => p-23
[sad] => 23
[3] => 23
)
But I want the regex to return the proper values even when both paginations are in $string.
What should I do so that all subpatterns get their values?
Note: The words ('p', 'page') are only examples. Any words can appear there.
Note: The data above is just an example. Please don't give me workarounds, but something that works for any input data.
You may use a branch reset group, (?|...|...), in which every alternative numbers its capturing groups from the same starting index:
'~^&multi&[^&]+(&((?|p-(?<sad>[1-9]\d*)|page-(?<sad>[1-9]\d*))))?&[^&]+(&((?|p-(?<gogosi>[1-9]\d*)|page-(?<gogosi>[1-9]\d*))))?&?$~J'
See the PHP demo:
$pattern = "~^&multi&[^&]+(&((?|p-(?<sad>[1-9]\d*)|page-(?<sad>[1-9]\d*))))?&[^&]+(&((?|p-(?<gogosi>[1-9]\d*)|page-(?<gogosi>[1-9]\d*))))?&?$~J";
$string = "&multi&mickael&p-23&george&page-34";
if (preg_match($pattern, $string, $matches)) {
print_r($matches);
}
Output:
Array
(
[0] => &multi&mickael&p-23&george&page-34
[1] => &p-23
[2] => p-23
[sad] => 23
[3] => 23
[4] => &page-34
[5] => page-34
[gogosi] => 34
[6] => 34
)
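The difference is easier to see on a stripped-down pattern. The sketch below uses unnamed groups and a made-up one-token input, not the full pattern from the question: with plain alternation each branch gets its own group number, so the branch that did not match leaves an empty slot, while the branch reset group makes both branches share group 1.

```php
<?php
// Plain alternation: p-(\d+) is group 1, page-(\d+) is group 2.
preg_match('~^(?:p-(\d+)|page-(\d+))$~', 'page-34', $plain);

// Branch reset: both alternatives capture into group 1.
preg_match('~^(?|p-(\d+)|page-(\d+))$~', 'page-34', $reset);

print_r($plain); // [0] => page-34, [1] => (empty), [2] => 34
print_r($reset); // [0] => page-34, [1] => 34
```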

split string and make array, regular expression

$content = "[2][6][11]";
I would like to split this into an array with the values [2], [6] and [11].
preg_split("/\[*\]/i", $content);
Wrong output: Array ( [0] => [2 [1] => [5 [2] => )
Can anyone tell me what is wrong with the regular expression?
Thanks.
You can use lookarounds for this split:
$content = "[2][6][11]";
print_r(preg_split('/(?<=\])(?=\[)/', $content));
Output:
Array
(
[0] => [2]
[1] => [6]
[2] => [11]
)
Lookarounds let you test which characters surround the position you want to find without consuming them.
print_r(preg_split('~(?<=])(?=\[)~', $content));
Note that if you already know how your string is formatted, you can also use preg_match_all with a simpler pattern: ~\[\d+]~
You can also do it with preg_match_all:
$content = "[2][6][11]";
preg_match_all("/\[.*\]/Ui", $content, $matches);
$result = $matches[0];
print_r($result);
Output:
Array
(
[0] => [2]
[1] => [6]
[2] => [11]
)
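As for why the pattern in the question misbehaves: in /\[*\]/ the * applies to \[, so the pattern means "zero or more [ followed by ]", and the split only ever consumes the closing brackets. A small sketch comparing it with the two approaches suggested above (variable names are illustrative):

```php
<?php
$content = "[2][6][11]";

// Original attempt: only each "]" is matched and removed.
$broken = preg_split('/\[*\]/', $content);

// Zero-width split between "]" and "[": nothing is consumed.
$split = preg_split('/(?<=\])(?=\[)/', $content);

// Match each bracketed number directly instead of splitting.
preg_match_all('/\[\d+\]/', $content, $m);

print_r($broken); // [2, [6, [11 and a trailing empty string
print_r($split);  // [2], [6], [11]
print_r($m[0]);   // [2], [6], [11]
```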

What is the regex for the text between quotes?

OK, I have tried looking at other answers, but couldn't solve my problem. So here is the input:
{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}
I need to get every second value in the quotes (as the "name" values are constant). I worked out that I need the text between :" and ", but I can't manage to write a regex for that.
EDIT: I'm using preg_match_all in PHP. And it's between :" and ", not " and " as someone else edited.
Why on earth would you attempt to parse JSON with regular expressions? PHP already parses JSON properly, with built-in functionality.
Code:
<?php
$input = '{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}';
print_r(json_decode($input, true));
?>
Output:
Array
(
[chg] => -0.71
[vol] => 40700
[time] => 11.08.2011 12:29:09
[high] => 1.417
[low] => 1.360
[last] => 1.400
[pcl] => 1.410
[turnover] => 56,560.25
)
You may need to escape characters or add delimiters (such as forward slashes) at the front and back, depending on your language. But it's basically:
:"([^"]*)"
or
/:"([^"]*)"/
I've tested this in Groovy as below and it works.
import java.util.regex.*;
String test='{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}'
// Compile a pattern that captures the text between :" and the next "
Pattern p = Pattern.compile(':"([^"]*)"');
// Walk through every match in the input
Matcher m = p.matcher(test);
while (m.find())
// group(1) is the captured value; stripping ":" from group() would also remove the colons inside the time value
System.out.println("Found comment: "+m.group(1));
Output was:
Found comment: -0.71
Found comment: 40700
Found comment: 11.08.2011 12:29:09
Found comment: 1.417
Found comment: 1.360
Found comment: 1.400
Found comment: 1.410
Found comment: 56,560.25
PHP Example
<?php
$subject = '{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}';
$pattern = '/(?<=:")[^"]*/';
preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
?>
Output is:
Array ( [0] => Array ( [0] => Array ( [0] => -0.71 [1] => 8 ) [1] => Array ( [0] => 40700 [1] => 22 ) [2] => Array ( [0] => 11.08.2011 12:29:09 [1] => 37 ) [3] => Array ( [0] => 1.417 [1] => 66 ) [4] => Array ( [0] => 1.360 [1] => 80 ) [5] => Array ( [0] => 1.400 [1] => 95 ) [6] => Array ( [0] => 1.410 [1] => 109 ) [7] => Array ( [0] => 56,560.25 [1] => 128 ) ) )
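To compare the two routes discussed above, here is a small sketch on a shortened copy of the input. The second string, with an escaped quote inside a value, is a made-up example that is not in the original data; it illustrates one case where a pattern based on [^"]* stops early while json_decode handles the escape.

```php
<?php
$subject = '{"chg":"-0.71","vol":"40700"}';

// Regex route: group 1 holds the text between :" and the next ".
preg_match_all('/:"([^"]*)"/', $subject, $matches);
$fromRegex = $matches[1];

// Parser route: decode into an array, then drop the keys.
$fromJson = array_values(json_decode($subject, true));

// Hypothetical value containing an escaped quote (still valid JSON).
$tricky = '{"note":"say \"hi\""}';
preg_match_all('/:"([^"]*)"/', $tricky, $t);
$regexTricky = $t[1][0];                          // cut off at the first escaped quote
$jsonTricky = json_decode($tricky, true)['note']; // say "hi"

print_r($fromRegex); // -0.71, 40700
print_r($fromJson);  // -0.71, 40700
```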

How to preg_split using PREG_SPLIT_DELIM_CAPTURE

$str = "blabla and, some more blah";
$delimiters = " ,¶.\n";
$char_buff = preg_split("/(,) /", $str, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($char_buff);
I get:
Array (
[0] => blabla and
[1] => ,
[2] => some more blah
)
I was able to figure out how to use the parenthesis to get the comma to show up in its own array element -- but how can I do this with multiple different delimiters (for example, those in the $delimiters variable)?
You need to create a character class by wrapping the delimiters with [ and ].
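<?php
$str = "blabla and, some more blah. Blah.\nSecond line.";
$delimiters = " ,¶.\n";
// preg_quote() escapes any regex metacharacters among the delimiters;
// the u modifier makes the multibyte ¶ count as a single character.
$char_buff = preg_split('/([' . preg_quote($delimiters, '/') . '])/u', $str, -1,
PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print_r($char_buff);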
You also need to use PREG_SPLIT_NO_EMPTY so that in places where you get two matches in a row, for instance a comma followed by a space, you don't get an empty match.
Output
Array
(
[0] => blabla
[1] =>
[2] => and
[3] => ,
[4] =>
[5] => some
[6] =>
[7] => more
[8] =>
[9] => blah
[10] => .
[11] =>
[12] => Blah
[13] => .
[14] =>
[15] => Second
[16] =>
[17] => line
[18] => .
)
Depending on what you are doing, using strtok may be a more appropriate way of doing it though.
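The effect of PREG_SPLIT_NO_EMPTY is easier to see on a shorter input. A minimal sketch; the input string and variable names are made up for illustration:

```php
<?php
$input = "a, b";       // hypothetical sample: a comma directly followed by a space
$pattern = '/([ ,])/'; // character class wrapped in a capturing group

// Without NO_EMPTY: the empty string between "," and " " is kept.
$withEmpty = preg_split($pattern, $input, -1, PREG_SPLIT_DELIM_CAPTURE);

// With NO_EMPTY: empty pieces are dropped, delimiters are still returned.
$noEmpty = preg_split($pattern, $input, -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

print_r($withEmpty); // a, ",", (empty), " ", b
print_r($noEmpty);   // a, ",", " ", b
```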
Use something like:
'/([,.])/'
That is, put each delimiter inside the square brackets.
Each delimiter expression needs to be inside its own group.
print_r(preg_split('/2\d4/' , '12345', -1, PREG_SPLIT_DELIM_CAPTURE));
Array ( [0] => 1 [1] => 5 )
print_r(preg_split('/(2)(\d)(4)/', '12345', -1, PREG_SPLIT_DELIM_CAPTURE));
Array ( [0] => 1 [1] => 2 [2] => 3 [3] => 4 [4] => 5 )

preg_match return all parts in array

I've got the following PHP code:
$match = array();
if (preg_match("%^(/\d+)(/test)(/\w+)*$%", "/25/test/t1/t2/t3/t4", $match))
print_r($match);
I'm getting this result:
Array ( [0] => /25/test/t1/t2/t3/t4 [1] => /25 [2] => /test [3] => /t4 )
What do I need to change in my regexp to get this result:
Array ( [0] => /25/test/t1/t2/t3/t4 [1] => /25 [2] => /test [3] => /t1 [4] => /t2 [5] => /t3 [6] => /t4)
You need preg_match_all:
preg_match_all( '~(/\w+)~', $str, $matches );
In your situation you can use explode too.
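A rough sketch of the explode route, assuming the input always starts with a slash and the segments themselves never contain one; the variable names are illustrative:

```php
<?php
$str = '/25/test/t1/t2/t3/t4';

// Drop the leading slash first, otherwise explode() yields an empty first element.
$segments = explode('/', ltrim($str, '/'));

// Put the slash back in front of each segment to mirror the regex matches.
$withSlash = array_map(function ($s) { return '/' . $s; }, $segments);

// The preg_match_all variant, for comparison.
preg_match_all('~/\w+~', $str, $matches);

print_r($segments);  // 25, test, t1, t2, t3, t4
print_r($withSlash); // /25, /test, /t1, /t2, /t3, /t4
```

Unlike the regex, explode does no validation, so it accepts segments that \w+ would reject.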
<?php
$str = '/a/b/1/2/3/4';
if(preg_match('/^(\/\w+)*$/', $str) && preg_match_all('/\/\w+/', $str, $matches)) {
$matches = $matches[0];
print_r($matches);
}
?>
Prints:
Array
(
[0] => /a
[1] => /b
[2] => /1
[3] => /2
[4] => /3
[5] => /4
)
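The same validate-then-extract idea can be applied to the pattern from the question. A repeated capturing group such as (/\w+)* only keeps its last iteration, which is why [3] came back as /t4. One possible sketch, not the only way to do it, captures the whole repeated tail as a single group and splits it in a second step:

```php
<?php
$path = '/25/test/t1/t2/t3/t4';
$parts = [];

// Step 1: validate the shape and capture the whole repeated tail as group 3.
if (preg_match('%^(/\d+)(/test)((?:/\w+)*)$%', $path, $m)) {
    // Step 2: break the tail into its individual /segment pieces.
    preg_match_all('%/\w+%', $m[3], $tail);
    $parts = array_merge([$m[0], $m[1], $m[2]], $tail[0]);
}

print_r($parts);
// /25/test/t1/t2/t3/t4, /25, /test, /t1, /t2, /t3, /t4
```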
Using your original example, you could use a recursive expression:
"%(/\w+)(?>[^(/\w+)]?|(?R))%"
This works by matching (/\w+) subexpressions in turn. Therefore the match for
"/a/b/1/2/3/4"
would be:
Array
(
[0] => Array
(
[0] => /a [1] => /b [2] => /1 [3] => /2 [4] => /3 [5] => /4
)
...
However, your later examples complicate things. A simple 0 or more match will only return the last (greedy) or first (ungreedy) match - not all submatches. preg_match_all won't be able to handle your dynamic expression.
You will have to clarify what you're trying to achieve in more detail before a suitable solution can be provided.
