PHP regex: alphanumeric bounded by non-alphanumeric characters

I would like to get all occurrences of
#something
bounded by any nonalphanumeric character or space.
I tried
[^A-Za-z0-9\s]#(\S)[^A-Za-z0-9]
but it keeps including the space after the word.
I'll be glad for any help, thanks.
Edit:
To make the issue clear: from
Line start #word1 something #word2,#word3
I want to match all of '#word1', '#word2', '#word3'.

Is this what you want?
#\w+
// Use / as the delimiter: with # as the delimiter, the # inside the pattern would end it early.
preg_match_all('/(#\w+)/', 'Line start #word1 something #word2,#word3', $matches);
print_r($matches[1]);
Following Madbreak's comment: to exclude a # that is directly preceded by a word character (for example the # in abc#def), use this instead
(?<!\w)#\w+(?=\b)
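A minimal sketch of what the lookbehind changes, assuming an input where one # sits directly after a letter (the trailing foo#bar is made up for illustration and is not in the original question):

```php
<?php
// Hypothetical input: "foo#bar" is not a standalone tag and should be skipped.
$input = 'Line start #word1 something #word2,#word3 foo#bar';

// Without the lookbehind, every # followed by word characters matches.
preg_match_all('/#\w+/', $input, $plain);

// With (?<!\w), a # directly preceded by a letter, digit or underscore is skipped.
preg_match_all('/(?<!\w)#\w+/', $input, $bounded);

print_r($plain[0]);   // #word1, #word2, #word3, #bar
print_r($bounded[0]); // #word1, #word2, #word3
```

The trailing (?=\b) is left out here because a greedy \w+ already stops at a word boundary.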

This
preg_match_all('/[^#]*#(\S*)/', 'blabla #something1 blabla #something2 blabla', $matches);
print_r($matches[1]);
prints
Array
(
[0] => something1
[1] => something2
)

Related

Matching whole words between commas, or a comma at the beginning, or a comma at the end with Regex

I have a string like this:
page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags
I made this regex that I expect to get the whole tags with:
(?<=\,)(rss-latest|listing-latest-no-category|category-128|page-9000)(?=\,)
I want it to match all the occurrences.
In this case:
page-9000 and rss-latest.
This regex matches whole words between commas just fine, but it ignores the first and the last tag because they are not between commas (obviously).
I've also tried checking for a comma on both sides OR a comma only at the beginning OR a comma only at the end, but that gives false positives, since it would match:
category-128
while the string contains:
page-category-128
Any help?
Try using the following pattern:
(?<=,|^)(rss-latest|listing-latest-no-category|category-128|page-9000)(?=,|$)
The only change I have made is to add the anchors ^ and $ to the lookarounds, so that they also match at the start and end of the input.
Script:
$input = "page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags";
preg_match_all("/(?<=,|^)(rss-latest|listing-latest-no-category|category-128|page-9000)(?=,|$)/", $input, $matches);
print_r($matches[1]);
This prints:
Array
(
[0] => page-9000
[1] => rss-latest
)
Here is a non-regex way using explode and array_intersect:
$arr1 = explode(',', 'page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags');
$arr2 = explode('|', 'rss-latest|listing-latest-no-category|category-128|page-9000');
print_r(array_intersect($arr1, $arr2));
Output:
Array
(
[0] => page-9000
[6] => rss-latest
)
The (?<=\,) and (?=\,) lookarounds require a comma on both sides of the match. You also want to match at the start and end of the string, so you need either to say so explicitly (a comma or the start/end of the string), or to use double negation: a negated character class inside a negative lookaround.
You may use
(?<![^,])(?:rss-latest|listing-latest-no-category|category-128|page-9000)(?![^,])
Here, (?<![^,]) matches at the start of the string or right after a comma, and (?![^,]) matches at the end of the string or right before a comma.
Now, you do not even need a capturing group, you may get rid of its overhead using a non-capturing group, (?:...). preg_match_all won't have to allocate memory for the submatches and the resulting array will be much cleaner.
PHP demo:
$re = '/(?<![^,])(?:rss-latest|listing-latest-no-category|category-128|page-9000)(?![^,])/m';
$str = 'page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags';
if (preg_match_all($re, $str, $matches)) {
print_r($matches[0]);
}
// => Array ( [0] => page-9000 [1] => rss-latest )
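If the list of tags comes from somewhere else, the alternation can be built from an array instead of being typed by hand. This is only a sketch: the $wanted array and its name are assumptions, and preg_quote() is used so that any regex metacharacters in the tags are treated literally.

```php
<?php
// Hypothetical list of tags to look for.
$wanted = ['rss-latest', 'listing-latest-no-category', 'category-128', 'page-9000'];

// Escape each tag for use inside a /.../ pattern, then join with |.
$alternation = implode('|', array_map(function ($tag) {
    return preg_quote($tag, '/');
}, $wanted));

// Same double-negation lookarounds as above: comma or string edge on both sides.
$re = '/(?<![^,])(?:' . $alternation . ')(?![^,])/';
$str = 'page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags';

preg_match_all($re, $str, $matches);
print_r($matches[0]); // page-9000, rss-latest
```

category-128 is not reported because, inside page-category-128, it is preceded by a hyphen rather than a comma.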

Matching all characters except spaces in regex

Right now I have a regex, and I want to change one part of the regex.
(.{3,}?) ~
^---This part of the code, where it says, (any characters that are 3 or more in length, and matches up to the nearest space), I want to change it to (any characters, except spaces , that are 3 or more in length, and matches up to the nearest space). How would I say that in regex?
$text = "my name is to habert";
$regex = "~(?:my name is |my name\\\'s |i am |i\\\'m |it is |it\\\'s |call me )?(.{3,}?) ~i";
preg_match($regex, $text, $match);
print_r($match);
Result:
Array ( [0] => my name [1] => my name )
Need Result:
Array ( [0] => name [1] => name )
Gravedigger here... Since this question does not have an answer yet, I'll post mine.
(\S{3,}) will work for your needs
Regex Explanation:
( Open capture group
\S Everything but whitespaces (same as [^\s], you can use [^ ] too, but the latter works only for spaces.)
{3,} Must contain three or more characters
) Close capture group
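A quick sketch of the pattern explained above, run against the text from the question. The ~ delimiters are just a choice made here, and the preg_match_all() call is an addition to show every run, not something the question asked for:

```php
<?php
$text = 'my name is to habert';

// \S{3,} finds the first run of three or more non-whitespace characters.
preg_match('~(\S{3,})~', $text, $match);
print_r($match); // [0] => name, [1] => name

// preg_match_all() collects every such run in the string.
preg_match_all('~\S{3,}~', $text, $all);
print_r($all[0]); // name, habert
```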

How to split 123abcd#abcd.com like 123 and abcd#abcd.com in PHP

I have a bunch of lines like that in a txt file.
I just want to split off the leading number and the email, as shown above.
Could someone write a function or something for that? I would be very thankful.
Any other suggestions are also welcome!
You could try the below code.
<?php
$yourstring = "123abcd#abcd.com";
$regex = '~^\d+\K~';
$splits = preg_split($regex, $yourstring);
print_r($splits);
?>
Output:
Array
(
[0] => 123
[1] => abcd#abcd.com
)
Explanation:
^ Asserts that we are at the start.
\d+ Matches one or more digits.
\K resets the start of the reported match, discarding the characters matched so far. So after ^\d+\K, the match is an empty string sitting at the boundary between the leading number and the email id. Splitting at that boundary gives the desired result.
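If \K feels too obscure, the same split can be sketched with two capture groups and preg_match(). This assumes every line starts with at least one digit followed by at least one more character; the variable names are made up for the example.

```php
<?php
$line = '123abcd#abcd.com';

// Group 1: the leading digits. Group 2: everything after them.
if (preg_match('~^(\d+)(.+)$~', $line, $m)) {
    $number = $m[1];
    $email  = $m[2];
    echo $number, "\n"; // 123
    echo $email, "\n";  // abcd#abcd.com
}
```

A line that does not start with a digit simply fails the if, which makes bad input easy to spot.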

regex not matching pattern correctly

The data looks like this
cityID=123456789&sharing=blahblahblah
Currently doing
$cityID = preg_grep("/cityID=.\d\&$/", $sometext);
print_r($cityID);
Currently printing
array(
)
I want it to print
123456789
The problem is that $ marks the end of the line, whereas this pattern isn't necessarily at the end of a line. Also, \d does not allow for more than one digit before the ampersand, so I added a +. (Also, be aware that . matches any character; it's not clear that is what you want, which is why I asked above.)
This should match for you:
preg_match("/cityID=\d+&/", $input_line, $output_array);
To experiment more with this pattern, visit http://www.phpliveregex.com/p/1WH
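To actually print the number rather than just test for a match, the digits can be wrapped in a capture group. A small sketch, assuming the input is a single string as in the question (note that preg_grep() returns the array entries that match, not the captured part, so preg_match() is the better fit here):

```php
<?php
$input_line = 'cityID=123456789&sharing=blahblahblah';

// (\d+) captures one or more digits; there is no $ anchor because & is not at the end of the line.
if (preg_match('/cityID=(\d+)&/', $input_line, $output_array)) {
    echo $output_array[1]; // 123456789
}
```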
You could use preg_match_all()
$str = "cityID=123456789&sharing=blahblahblahcityID=123456789&sharing=blahblahblahcityID=123456789&sharing=blahblahblah";
// or
// $str = "cityID=123456789&sharing=blahblahblah
// cityID=123456789&sharing=blahblahblah
// cityID=123456789&sharing=blahblahblah";
$result = preg_match_all("/cityID=(\d+)/", $str, $matches);
print_r($matches[1]);
Output:
Array ( [0] => 123456789 [1] => 123456789 [2] => 123456789 )

php string regex

my code is:
$tt='This is a tomato test';
$rr=preg_match('/is(.*)to/',$tt,$match);
print_r($match);
From this I am trying to get " a toma" output only...but it is giving me:
Array
(
[0] => is is a tomato
[1] => is a toma
)
For this regex how can I make it not display the "is" at the beginning of the output strings?
The simplest solution is to note that "this" includes the substring "is" so...
$tt='This is a tomato test';
$rr=preg_match('/ is(.*)to/',$tt,$match); // add a space before is.
print_r($match);
And [1] will be " a toma" (with the leading space).
Another trick is to use a lookbehind assertion, (?<=...), whose contents will not be part of the resulting match:
preg_match('/(?<=\bis)(.*)to/', $tt, $match);
What you need is called a lookbehind, described here http://www.regular-expressions.info/lookaround.html
Specifically, you want something like '/(?<= is)(.*)to/'
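A short sketch of that lookbehind applied to the string from the question; var_dump() is used only so the leading space is visible:

```php
<?php
$tt = 'This is a tomato test';

// " is" must sit right before the match, so the "is" inside "This" is skipped.
// The lookbehind itself is not included in the match.
preg_match('/(?<= is)(.*)to/', $tt, $match);

var_dump($match[0]); // string(9) " a tomato"
var_dump($match[1]); // string(7) " a toma"
```

If the leading space is unwanted, trim($match[1]) gives "a toma".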
