php string regex - php

My code is:
$tt='This is a tomato test';
$rr=preg_match('/is(.*)to/',$tt,$match);
print_r($match);
From this I am trying to get only " a toma" as output, but it is giving me:
Array
(
[0] => is is a tomato
[1] => is a toma
)
For this regex how can I make it not display the "is" at the beginning of the output strings?

The simplest solution is to note that "This" includes the substring "is", so the pattern starts matching inside the first word. Require a space before "is":
$tt='This is a tomato test';
$rr=preg_match('/ is(.*)to/',$tt,$match); // add a space before is.
print_r($match);
And [1] will be " a toma" (with the leading space).

Another trick is to use a lookbehind assertion, (?<=...), whose contents are not part of the resulting match:
preg_match('/(?<=\bis)(.*)to/', $tt, $match);

What you need is called a lookbehind, described here: http://www.regular-expressions.info/lookaround.html
Specifically, you want something like '/(?<= is)(.*)to/'
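As a rough sketch of the lookbehind idea, assuming the goal is only the text between the standalone word "is" and the last "to": adding a lookahead keeps both delimiters out of the match, so no capture group is needed. The variable name $between and the extra \b after "is" are illustrative choices, not part of the answers above.

```php
<?php
$tt = 'This is a tomato test';

// (?<=\bis\b) requires the whole word "is" just before the match,
// so the "is" inside "This" is skipped.
// (?=to) requires "to" just after the match without consuming it.
preg_match('/(?<=\bis\b).*(?=to)/', $tt, $match);

// The whole match is the wanted text: ' a toma' (note the leading space).
$between = $match[0];
echo $between, PHP_EOL;
```

Because .* is greedy, the match runs up to the last "to" in the string; wrap the result in trim() if the leading space is unwanted.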

Related

regex - finding multiple occurrences of a pattern and extracting a string [duplicate]

I have tried the non capturing group option ?:
Here is my data:
hello:"abcdefg"},"other stuff
Here is my regex:
/hello:"(.*?)"}/
Here is what it returns:
Array
(
[0] => Array
(
[0] => hello:"abcdefg"}
)
[1] => Array
(
[0] => abcdefg
)
)
I wonder, how can I make it so that [0] => abcdefg and that [1] => doesn't exist?
Is there any way to do this? I feel like it would be much cleaner and improve my performance. I understand that regex is simply doing what I told it to do, that is showing me the whole string that it found, and the group inside the string. But how can I make it only return abcdefg, and nothing more? Is this possible to do?
Thanks.
EDIT: I am using the regex on a website that says it uses Perl regex. I am not actually using the Perl interpreter.
EDIT Again: Apparently I misread the website. It is indeed using PHP, and it is calling it with this function: preg_match_all('/hello:"(.*?)"}/', 'hello:"abcdefg"},"other stuff', $arr, PREG_PATTERN_ORDER);
I apologize for this error; I have fixed the tags.
EDIT Again 2: This is the website http://www.solmetra.com/scripts/regex/index.php
preg_match_all
If you want a different captured string, you need to change your regex. Here I'm looking for anything that is not a double quote (") between two double-quote characters, where the opening quote is preceded by a colon (:).
<?php
$string = 'hello:"abcdefg"},"other stuff';
$pattern = '!(?<=:")[^"]+(?=")!';
preg_match_all($pattern,$string,$matches);
echo $matches[0][0];
?>
Output
abcdefg
If you were to print_r($matches) you would see an outer array whose element [0] is itself an array holding the full matches. So to access the string you need $matches[0][0], which supplies both keys. You will always have to deal with nested arrays when you use preg_match_all.
Array
(
[0] => Array
(
[0] => abcdefg
)
)
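PCRE's \K escape is another way to get the same single-level result without a lookbehind. This is a minimal sketch, assuming the value is always introduced by hello:" and closed by "}; the variable name $values is only illustrative.

```php
<?php
$string = 'hello:"abcdefg"},"other stuff';

// \K discards everything matched so far from the reported match,
// so the prefix hello:" is required but not returned.
// (?="}) checks for the closing "} without including it.
preg_match_all('/hello:"\K[^"]*(?="})/', $string, $matches);

// With no capture groups, $matches has only index 0.
$values = $matches[0];
print_r($values);
```

Unlike a lookbehind, the part before \K may have variable length, which helps when the prefix is not a fixed string.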
preg_replace
Alternatively, if you were to use preg_replace instead, you could replace all of the contents of the string except for your capture group, and then you wouldn't need to deal with arrays (but you need to know a little more about regex).
<?php
$string = 'hello:"abcdefg"},"other stuff';
$pattern = '!^[^:]+:"([^"]+)".+$!s';
$new_string = preg_replace($pattern,"$1",$string);
echo $new_string;
?>
Output
abcdefg
preg_match_all is returning exactly what it is supposed to.
The first element holds the entire strings that matched the regex. The remaining elements hold the capture groups.
If you just want the capture group, then just ignore the first element.
preg_match_all('/hello:"(.*?)"}/', 'hello:"abcdefg"},"other stuff', $arr, PREG_PATTERN_ORDER);
$firstMatch = $arr[1][0]; // $arr[1] holds every capture of group 1; [0] is the first of them
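To make the array layout described above concrete, here is a small sketch comparing the two ordering flags of preg_match_all. The input string with two occurrences is a made-up example, and the variable names are illustrative.

```php
<?php
// Hypothetical input with two occurrences, to make the layout visible.
$string = 'hello:"abc"},hello:"def"}';
$pattern = '/hello:"(.*?)"}/';

// PREG_PATTERN_ORDER (the default): index 0 lists all full matches,
// index 1 lists all captures of group 1.
preg_match_all($pattern, $string, $byPattern, PREG_PATTERN_ORDER);
$captures = $byPattern[1];

// PREG_SET_ORDER: one sub-array per match, each holding
// [0] the full match and [1] the capture of group 1.
preg_match_all($pattern, $string, $bySet, PREG_SET_ORDER);
$firstCapture = $bySet[0][1];

print_r($captures);
echo $firstCapture, PHP_EOL;
```

Either way the full match is still present; the flag only changes how the nesting is arranged.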

a regular expression in preg_match that returns 3 matches

My input is as below:
*test*
The output I want is content inside two asterisks (test).
My code is as below:
preg_match('/^(\*(.*)\*)$/','*test*',$matches);
Its output is:
Array ( [0] => *test* [1] => *test* [2] => test )
The third one is the one I want. I know why it does this, but I don't know how to solve it. How can I write a regex that returns just test and nothing else?
You can use look-ahead and look-behind assertions:
/(?<=^\*).*(?=\*$)/
You can also try the explode function:
<?php
$str = '*test*';
$str1 = explode('*', $str);
echo $str1[1];
?>
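The two suggestions above can be compared side by side. This is a sketch under the assumption that the input is always a single word wrapped in one pair of asterisks; trim() is added here as a further alternative and is not taken from the answers.

```php
<?php
$input = '*test*';

// Lookbehind and lookahead keep the asterisks out of the match,
// so the result array has a single element.
preg_match('/(?<=^\*).*(?=\*$)/', $input, $m);
$inner = $m[0];

// Without regex: strip asterisks from both ends.
// Note that trim() does not check that they were actually present.
$trimmed = trim($input, '*');

echo $inner, PHP_EOL;
echo $trimmed, PHP_EOL;
```

The regex version fails to match when the asterisks are missing, which can double as validation; trim() simply returns the input unchanged in that case.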

regex not matching pattern correctly

The data looks like this
cityID=123456789&sharing=blahblahblah
Currently doing
$cityID = preg_grep("/cityID=.\d\&$/", $sometext);
print_r($cityID);
Currently printing
array(
)
I want it to print
123456789
The problem is that $ marks the end of the line, whereas this pattern isn't necessarily at the end of a line. Also, \d alone does not allow more than one digit before the ampersand, so I added a +. (Also, be aware that . matches any character; it's not clear that is what you want, which is why I asked above.)
This should match for you:
preg_match("/cityID=\d+&/", $input_line, $output_array);
To experiment more with this pattern, visit http://www.phpliveregex.com/p/1WH
You could use preg_match_all()
$str = "cityID=123456789&sharing=blahblahblahcityID=123456789&sharing=blahblahblahcityID=123456789&sharing=blahblahblah";
// or
// $str = "cityID=123456789&sharing=blahblahblah
// cityID=123456789&sharing=blahblahblah
// cityID=123456789&sharing=blahblahblah";
$result = preg_match_all("/cityID=(\d+)/", $str, $matches);
print_r($matches[1]);
Output:
Array ( [0] => 123456789 [1] => 123456789 [2] => 123456789 )
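For the single-value case in the question, a capture group returns just the digits. The sketch below also shows parse_str() as a non-regex alternative, on the assumption that the data really is a URL-style query string; the variable names are illustrative.

```php
<?php
$input = 'cityID=123456789&sharing=blahblahblah';

// Capture one or more digits after "cityID="; preg_match returns 1 on a match.
$cityID = preg_match('/cityID=(\d+)/', $input, $m) === 1 ? $m[1] : null;

// Alternative: let PHP parse the query string into an array.
parse_str($input, $params);
$cityFromQuery = $params['cityID'] ?? null;

echo $cityID, PHP_EOL;
echo $cityFromQuery, PHP_EOL;
```

Note that preg_grep(), used in the question, filters an array of strings and returns the matching entries whole, so it is not the right tool for extracting part of a single string.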

PHP regex alphanumeric bounded by nonalpha numeric

I would like to get all occurrences of
#something
bounded by any nonalphanumeric character or space.
I tried
[^A-Za-z0-9\s]#(\S)[^A-Za-z0-9]
but it keeps including space after word.
I'll be glad for any help, thanks.
Edit:
To make the issue clear, from the line
Line start #word1 something #word2,#word3
I want to match all of '#word1', '#word2' and '#word3'.
Is this what you want?
#\w+
Demo
preg_match_all('#(#\w+)#', 'Line start #word1 something #word2,#word3', $matches);
print_r($matches[1]);
Following Madbreak's comment, to exclude a # that is preceded by a word character, use this instead:
(?<!\w)#\w+(?=\b)
This
preg_match_all('/[^#]*#(\S*)/', 'blabla #something1 blabla #something2 blabla', $matches);
print_r($matches[1]);
prints
Array
(
[0] => something1
[1] => something2
)
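The lookbehind variant suggested above can be tried on the sample line from the question. This is a minimal sketch; the second input, 'abc#def', is a made-up counter-example, and / is used as the delimiter so that # needs no escaping.

```php
<?php
$line = 'Line start #word1 something #word2,#word3';

// (?<!\w) rejects a # that directly follows a letter, digit or underscore.
preg_match_all('/(?<!\w)#\w+/', $line, $matches);
$tags = $matches[0];

// Hypothetical counter-example: the # is glued to a preceding word.
$count = preg_match_all('/(?<!\w)#\w+/', 'abc#def', $none);

print_r($tags);
```

Since \w+ is greedy and stops at the first non-word character, the trailing (?=\b) in the pattern above is not strictly needed.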
