I've been trying the following regular expression
$pattern = '/([0-9]{1,2}[.]{1}([a-z|A-Z|0-9]+[\s]*)+)/';
in order to detect occurrences of the form [number].[text] in a text,
for example: 1.text text text
For the text "test number one 1.first 2.second" the result is
array(3) { [0]=> string(9) "1.first 2" [1]=> string(9) "1.first 2" [2]=> string(1) "2" }
Where is the problem in the regular expression I wrote?
Thanks in advance !
Is this what you are trying to do:
\d+[.][\w\d\s]+?(?=$|\s\d+[.])
It will match all the 3 occurrences in this text: "1. text test 2. some more22 3. and more"
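As a quick check of the lookahead pattern above, here is a minimal sketch that runs it through preg_match_all on the same sample text. The variable names are only illustrative.

```php
<?php
$text = "1. text test 2. some more22 3. and more";

// Number, dot, then as little text as possible up to either the end of the
// string or the next " <number>." (the lookahead does not consume it).
$pattern = '/\d+[.][\w\d\s]+?(?=$|\s\d+[.])/';

preg_match_all($pattern, $text, $m);
print_r($m[0]);
```

The lazy quantifier together with the lookahead is what lets an item contain spaces without running into the next numbered item.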
Your problem is the plus at the end: it repeats the inner group, so after the whitespace the group can match again on the "2" of the next item. It should be:
$pattern = '/([0-9]{1,2}[.]{1}([a-z|A-Z|0-9]+[\s]*))/';
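To see the effect of that trailing plus, here is a small sketch comparing the original pattern with the corrected one on the asker's text. The third pattern is not from the thread; it is a tighter variant that assumes the intent is "one or two digits, a dot, then letters or digits" (note that a | inside a character class is a literal pipe, not an alternation).

```php
<?php
$str = "test number one 1.first 2.second";

// Original: the trailing + lets the inner group match again on "2".
preg_match_all('/([0-9]{1,2}[.]{1}([a-z|A-Z|0-9]+[\s]*)+)/', $str, $original);

// Corrected: the inner group matches once per item (trailing space included).
preg_match_all('/([0-9]{1,2}[.]{1}([a-z|A-Z|0-9]+[\s]*))/', $str, $fixed);

// Assumed tighter variant: no groups, no stray | in the class, no trailing space.
preg_match_all('/\d{1,2}\.[a-zA-Z0-9]+/', $str, $tight);

print_r($original[0]);
print_r($fixed[0]);
print_r($tight[0]);
```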
Try this:
$pattern = '/\d\.\w+/';
$str = "test number one 1.first 2.second 3.third 4.xyz 5.abcd";
preg_match_all($pattern, $str, $m);
print_r( $m );
The output is:
Array
(
[0] => Array
(
[0] => 1.first
[1] => 2.second
[2] => 3.third
[3] => 4.xyz
[4] => 5.abcd
)
)
Is that what you need? I'm not sure I understand what you want to match, so please explain it a bit more.
You could say, for example:
I would like to match x digits followed by . followed by n letters
or something like that.
I am trying to extract the words from this string with PHP :
$string= '\'banana\', "orange", "apple"';
Using this pattern :
/([\'"])(.*?)\1/
But it gives me these results:
array(3) {
[0] array(3) {
[0] "'banana'"
[1] ""orange""
[2] ""apple""
}
[1] array(3) {
[0] "'"
[1] ""
[2] ""
}
[2] array(3) {
[0] "banana"
[1] "orange"
[2] "apple"
}
}
Is there a way I can clean it up to just:
array(3) {
[0] "banana"
[1] "orange"
[2] "apple"
}
Thanks for your help.
With regex, you can extract just the word characters using \w+.
PHP has no global flag; preg_match_all does that job and collects every match, which I believe gets you what you want :)
See this example
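For what it's worth, the pattern from the question already captures the bare words: they are in the second capture group, so index 2 of the result is the array being asked for. A minimal sketch (the $words name is only illustrative):

```php
<?php
$string = '\'banana\', "orange", "apple"';

// Group 1 is the opening quote, \1 requires the same quote to close,
// and group 2 is the text in between.
preg_match_all('/([\'"])(.*?)\1/', $string, $matches);

$words = $matches[2];
print_r($words);
```

Unlike a plain \w+ match, this keeps the check that each word is closed by the same kind of quote that opened it.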
Use preg_split and describe all you want to remove and where you want to cut the string, then add the flag PREG_SPLIT_NO_EMPTY to filter empty parts.
$subject = '\'banana\', "orange", "apple"';
$pattern = '~["\'](?:,\s*["\']|\z)|\A["\']~';
$result = preg_split($pattern, $subject, -1, PREG_SPLIT_NO_EMPTY);
print_r($result);
Obviously this way doesn't check if each item is well enclosed between the same type of quotes, but is it really necessary?
I want my php to recognize multiple strings in a string starting with the # symbol. Those shall then be converted into variables
//whole string
$string = "hello my name is #mo and their names are #tim and #tia.";
//while loop now?
#mo #tim #tia shall then be converted to variables like:
$user1 = "mo";
$user2 = "tim";
$user3 = "tia";
Is there a php command you can use to collect them all in an array?
Regular expressions are a very flexible tool for pattern recognition:
<?php
$subject = "hello my name is #mo and their names are #tim and #tia.";
$pattern = '/#(\w+)/';
preg_match_all($pattern, $subject, $tokens);
var_dump($tokens);
The output is:
array(2) {
[0] =>
array(3) {
[0] =>
string(3) "#mo"
[1] =>
string(4) "#tim"
[2] =>
string(4) "#tia"
}
[1] =>
array(3) {
[0] =>
string(2) "mo"
[1] =>
string(3) "tim"
[2] =>
string(3) "tia"
}
}
So $tokens[1] is the array you are interested in.
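If separate variables are really wanted, the captured names can be unpacked from that array. This is only a sketch: the destructuring line assumes exactly three names were found, and keeping them in an array ($users here, an illustrative name) is usually the safer choice.

```php
<?php
$subject = "hello my name is #mo and their names are #tim and #tia.";

preg_match_all('/#(\w+)/', $subject, $tokens);

// Index 1 holds the first capture group of every match, i.e. the names without "#".
$users = $tokens[1];

// Array destructuring (PHP 7.1+); only sensible when the count is known in advance.
[$user1, $user2, $user3] = $users;

print_r($users);
```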
Perhaps you could use a regex to match all the strings starting with "#" and put them in an array?
preg_match_all("|\#(.*)[ .,]|U",
"hello my name is #mo and their names are #tim and #tia.",
$out, PREG_PATTERN_ORDER);
$out now holds the matched strings.
PS: I am not a PHP developer. I just tried this out using an online
compiler.
I'm terrible at regex and find it hard to understand, so I need some help. I have a variable which looks something like this:
["...=", "...=", "...="]
Those are 3 values which I want to split into an array. The way I see it, I want to split it at the comma which comes after a quote ", ". Can someone please help me with the regex for preg_split?
You could try the below code to split the input string according to ", "
<?php
$yourstring = '["...=", "...=", "...="]';
$regex = '~", "~';
$splits = preg_split($regex, $yourstring);
print_r($splits);
?>
Output:
Array
(
[0] => ["...=
[1] => ...=
[2] => ...="]
)
If you don't want the brackets and quotes in the output, you could try the code below.
<?php
$data = '["...=", "...=", "...="]';
$regex = '~(?<=\["|", ")[^"]*~';
preg_match_all($regex, $data, $matches);
print_r($matches);
?>
Output:
Array
(
[0] => Array
(
[0] => ...=
[1] => ...=
[2] => ...=
)
)
$string = '["...=", "...=", "...="]';
$parts = preg_split('/,\s/', $string);
var_dump($parts);
Program output:
array(3) {
[0]=>
string(34) ""...=""
[1]=>
string(36) ""...=""
[2]=>
string(37) ""...=""
}
So long as the double-quote symbol cannot occur within the double-quotes that contain the content, this pattern should validate and capture the three values:
^\["([^"]+)", "([^"]+)", "([^"]+)"\]$
If double-quotes can appear within the content, or the number of values is variable, then this pattern will not work.
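A minimal sketch of validating and capturing with an anchored pattern, written for the exact input shown in the question (one pair of brackets around three quoted values). The $values name is illustrative only.

```php
<?php
$input = '["...=", "...=", "...="]';

// Anchored at both ends, so the whole string must have exactly this shape.
$pattern = '/^\["([^"]+)", "([^"]+)", "([^"]+)"\]$/';

if (preg_match($pattern, $input, $m)) {
    // $m[0] is the whole match; the three values are groups 1 to 3.
    $values = array_slice($m, 1);
} else {
    $values = [];
}

print_r($values);
```

Because the pattern is anchored, a string with two or four values simply fails to match instead of being partially captured.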
I'm using PHP and I have text like:
first [abc] middle [xyz] last
I need to get what's inside and outside of the brackets. Searching in StackOverflow I found a pattern to get what's inside:
preg_match_all('/\[.*?\]/', $m, $s)
Now I'd like to know the pattern to get what's outside.
Regards!
You can use preg_split for this as:
$input ='first [abc] middle [xyz] last';
$arr = preg_split('/\[.*?\]/',$input);
print_r($arr);
Output:
Array
(
[0] => first
[1] => middle
[2] => last
)
This leaves some surrounding spaces in the output. If you don't want them you can use:
$arr = preg_split('/\s*\[.*?\]\s*/',$input);
preg_split splits the string based on a pattern. The pattern here is [ followed by anything followed by ]. The regex to match anything is .*. Since [ and ] are regex metacharacters used for character classes, we need to escape them to match them literally, which gives \[.*\].
By default .* is greedy and will try to match as much as possible. In this case it would match abc] middle [xyz. To avoid this we make it non-greedy by appending a ?, which gives \[.*?\].
Since "anything" here really means anything other than ], we can also use \[[^]]*?\]
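The difference between the greedy, non-greedy and negated-class forms can be checked with a short sketch on the same input (variable names are illustrative):

```php
<?php
$input = 'first [abc] middle [xyz] last';

// Greedy: .* runs to the last ] in the string.
preg_match_all('/\[.*\]/', $input, $greedy);

// Non-greedy: .*? stops at the first ] it can.
preg_match_all('/\[.*?\]/', $input, $lazy);

// Negated class: [^]] cannot cross a ], so no laziness is needed.
preg_match_all('/\[[^]]*\]/', $input, $negated);

print_r($greedy[0]);
print_r($lazy[0]);
print_r($negated[0]);
```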
EDIT:
If you want to extract words that are both inside and outside the [], you can use:
$arr = preg_split('/\[|\]/',$input);
which splits the string on a [ or a ]
$inside = '\[.+?\]';
$outside = '[^\[\]]+';
$or = '|';
preg_match_all(
"~ $inside $or $outside~x",
"first [abc] middle [xyz] last",
$m);
print_r($m);
or, less verbosely:
preg_match_all("~\[.+?\]|[^\[\]]+~", $str, $matches)
Use preg_split instead of preg_match.
preg_split('/\[.*?\]/', 'first [abc] middle [xyz] last');
Result:
array(3) {
[0]=>
string(6) "first "
[1]=>
string(8) " middle "
[2]=>
string(5) " last"
}
Everyone says you should use preg_split, but only one person replied with an expression that meets your needs, and I think it was a little verbose (though he has since updated his answer to address that).
This expression is what most of the replies have suggested:
/\[.*?\]/
But that only prints out
Array
(
[0] => first
[1] => middle
[2] => last
)
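and you stated you wanted what's inside and outside the brackets, so an update would be the following (note that the outer brackets make this a character class, so it splits on any single [ or ], and also on a literal ., * or ?):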
/[\[.*?\]]/
This gives you:
Array
(
[0] => first
[1] => abc
[2] => middle
[3] => xyz
[4] => last
)
but as you can see, it is capturing whitespace as well, so let's go a step further and get rid of that:
/[\s]*[\[.*?\]][\s]*/
This will give you a desired result:
Array
(
[0] => first
[1] => abc
[2] => middle
[3] => xyz
[4] => last
)
This, I think, is the expression you're looking for.
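One caveat, shown as a sketch rather than a correction of the result above: because [\[.*?\]] is a character class, it also splits on a literal dot, asterisk or question mark. A tighter equivalent, assuming the goal is only to split on brackets and swallow the surrounding whitespace, lists just the two brackets. The second sample string is made up for illustration.

```php
<?php
$input = 'first [abc] middle [xyz] last';

// Split on a single [ or ], together with any whitespace around it.
$parts = preg_split('/\s*[\[\]]\s*/', $input);
print_r($parts);

// The character class from the answer also treats "." as a delimiter.
$surprise = preg_split('/[\[.*?\]]/', 'a.b [c]');
print_r($surprise);
```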
I have a string of the form "a-b""c-d""e-f"...
Using preg_match, how could I extract them and get an array as:
Array
(
[0] =>a-b
[1] =>c-d
[2] =>e-f
...
[n-times] =>xx-zz
)
Thanks
You can do:
$str = '"a-b""c-d""e-f"';
if(preg_match_all('/"(.*?)"/',$str,$m)) {
var_dump($m[1]);
}
Output:
array(3) {
[0]=>
string(3) "a-b"
[1]=>
string(3) "c-d"
[2]=>
string(3) "e-f"
}
Regular expressions are not always the fastest solution:
$string = '"a-b""c-d""e-f""g-h""i-j"';
$string = trim($string, '"');
$array = explode('""',$string);
print_r($array);
Array
(
[0] => a-b
[1] => c-d
[2] => e-f
[3] => g-h
[4] => i-j
)
Here's my take on it.
$string = '"a-b""c-d""e-f"';
if ( preg_match_all( '/"(.*?)"/', $string, $matches ) )
{
print_r( $matches[1] );
}
And a breakdown of the pattern
" // match a double quote
( // start a capture group
. // match any character
* // zero or more times
? // but do so in an ungreedy fashion
) // close the captured group
" // match a double quote
The reason you look in $matches[1] and not $matches[0] is that preg_match_all() stores the entire pattern match at index 0 and each capture group at index 1 and up. Since we only want the content of the capture group (in this case, the first and only one), we look at $matches[1].
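The two indexes can be compared side by side with a short sketch on the same string, using the default PREG_PATTERN_ORDER layout:

```php
<?php
$string = '"a-b""c-d""e-f"';

preg_match_all('/"(.*?)"/', $string, $matches);

// Index 0: every full match, quotes included.
print_r($matches[0]);

// Index 1: the first capture group of every match, quotes stripped.
print_r($matches[1]);
```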