php regex for detecting #number

I have the following regex, with which I am trying to detect #x, where x is a number. It works when the match (group 2) is surrounded by whitespace, but it breaks when anything else is directly adjacent to it. Can someone help me make this work both ways?
/(\G|\s+|^)#(\d+)((?=\s+)|(?=::)|$)/i
That works with the line
This is a test #1234 end test
but it does not work with
This is a test #1234end test
This is a test#1234 end test
This is a test.#1234 end test
This is a test #1234. End test
Does anyone know what needs to be changed to achieve this?
Edit: I am trying to allow anything but alphanumeric characters in the 3rd group; right now it only allows :: and whitespace. Is there a way to combine these into one check that rejects letters and digits?

Running preg_match with /#\d+/i should get you what you are looking for. Running the following:
$items = [
    "This is a test #1234end test",
    "This is a test#1234 end test",
    "This is a test.#1234 end test",
    "This is a test #1234. End test"
];
foreach ($items as $test) {
    preg_match("/#\d+/i", $test, $matches);
    var_dump($matches);
}
You will get this result:
array(1) {
[0]=>
string(5) "#1234"
}
array(1) {
[0]=>
string(5) "#1234"
}
array(1) {
[0]=>
string(5) "#1234"
}
array(1) {
[0]=>
string(5) "#1234"
}
If you don't want the # in the results, you can capture the digits in a subpattern: /#(\d+)/i
Which will then result in the following:
array(2) {
[0]=>
string(5) "#1234"
[1]=>
string(4) "1234"
}
array(2) {
[0]=>
string(5) "#1234"
[1]=>
string(4) "1234"
}
array(2) {
[0]=>
string(5) "#1234"
[1]=>
string(4) "1234"
}
array(2) {
[0]=>
string(5) "#1234"
[1]=>
string(4) "1234"
}

(\G|\s+|^)#(\d+)((?=[^[:alnum:]])|$)
I wanted to keep the three groups that I had, so I only changed the 3rd group. I removed the :: and \s (whitespace) alternatives from the 3rd group and replaced them with a single NOT-alphanumeric lookahead, which covers both of those conditions as well.
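(\G|\s+|^) : group 1, unchanged
#(\d+) : literal "#" followed by group 2, one or more digits
((?=[^[:alnum:]])|$) : group 3, the next character is not alphanumeric, or end of string
[^[:alnum:]] : any single character that is not a letter or digit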
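
The pattern above still requires whitespace or the start of the string in front of the #, so an input such as "test.#1234" is not covered. If the intent is "no letter or digit on either side" (an assumption based on the edit to the question), one possible sketch replaces groups 1 and 3 with a lookbehind and a lookahead. The variable names here are illustrative only:

```php
<?php
// Assumption: "#<digits>" counts only when no letter or digit touches it on either side.
$pattern = '/(?<![[:alnum:]])#(\d+)(?![[:alnum:]])/';

$lines = [
    'This is a test #1234 end test',
    'This is a test #1234end test',
    'This is a test#1234 end test',
    'This is a test.#1234 end test',
    'This is a test #1234. End test',
];

$found = [];
foreach ($lines as $line) {
    // Group 1 holds the digits; null marks a line without a match.
    $found[] = preg_match($pattern, $line, $m) ? $m[1] : null;
}
var_dump($found);
```

Under that assumption the second and third lines are rejected because a letter touches the token, while the other three yield 1234. If letters directly before the # should be accepted, dropping the lookbehind would be the adjustment.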

Related

preg_match with multiple find

I have this code:
$a='-t40-';
preg_match('/^-t(.*?)-$/', $a,$match);
var_dump($match);
Result:
array(2) { [0]=> array(1) { [0]=> string(5) "-t40-" }
[1]=> array(1) { [0]=> string(2) "40" } }
If I add some text after the last "-", the pattern no longer matches.
If $a='-t40-some text'; I need a result similar to:
array(3) { [0]=> array(1) { [0]=> string(5) "-t40-" }
[1]=> array(1) { [0]=> string(2) "40" }
[2]=> array(1) { [0]=> string(9) "some text" }}
How do I edit the pattern to also capture "some text"?
Thanks in advance.

$a='-t40-some text';
preg_match('/^-t(.*?)-(.*?)$/', $a,$match);
var_dump($match);
Output:
array(3) {
[0]=>
string(14) "-t40-some text"
[1]=>
string(2) "40"
[2]=>
string(9) "some text"
}
Explanation:
^ : beginning of line
-t : literally "-t"
(.*?) : group 1, 0 or more of any character but newline, not greedy
- : literally "-"
(.*?) : group 2, 0 or more of any character but newline, not greedy
$ : end of line
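
As a small illustration of the non-greedy behavior described above, the following sketch uses an input whose trailing text itself contains a "-". The second pattern is a hypothetical stricter variant that assumes the first field is always numeric; that assumption is not stated in the question:

```php
<?php
$a = '-t40-some-text';

// Lazy (.*?) stops at the first "-" that lets the rest of the pattern match,
// so group 1 is "40" and group 2 keeps the remaining dash.
preg_match('/^-t(.*?)-(.*?)$/', $a, $match);

// Hypothetical stricter variant, assuming the first field is always digits.
preg_match('/^-t(\d+)-(.*)$/', $a, $strict);

var_dump($match, $strict);
```

Both calls fill their array with the whole string, "40", and "some-text". The stricter form would simply fail to match if the first field contained anything other than digits.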

PHP Regex - Match multiple possibilities (pipe)

I want to grab all IDs (integers) from several URLs within a text. These URLs could look like these:
http://url.tld/index.php/p1
http://url.tld/p2#abc
http://url.tld/index.php/Page/3-xxx
http://url.tld/Page/4
For this, I've built two regexes (the URLs are enclosed by an URL bbcode):
\[url\](http\://url\.tld/index\.php/p(\d+).*?)\[/url\]
\[url\](http\://url\.tld(?:/index\.php)?/Page/(\d+).*?)\[/url\]
However, if i do a preg_match_all with every single regex, I get an array that looks like this (and which is correct):
array(3) {
[0]=>
array(2) {
[0]=>
string(62) "[url]http://url.tld/index.php/Page/6-fdgfh/[/url]"
[1]=>
string(50) "[url]http://url.tld/Page/7[/url]"
}
[1]=>
array(2) {
[0]=>
string(51) "http://url.tld/index.php/Page/6-fdgfh/"
[1]=>
string(39) "http://url.tld/Page/7"
}
[2]=>
array(2) {
[0]=>
string(1) "6"
[1]=>
string(1) "7"
}
}
But if I combine both regexes with a pipe:
\[url\](http\://url\.tld/index\.php/p(\d+).*?|http\://url\.tld(?:/index\.php)?/Page/(\d+).*?)\[/url\]
it builds an array like this (which is wrong):
array(4) {
[0]=>
array(3) {
[0]=>
string(71) "[url]http://url.tld/index.php/p9-abc#hashtag[/url]"
[1]=>
string(62) "[url]http://url.tld/index.php/Page/6-fdgfh/[/url]"
[2]=>
string(50) "[url]http://url.tld/Page/7[/url]"
}
[1]=>
array(3) {
[0]=>
string(60) "http://url.tld/index.php/p9-abc#hashtag"
[1]=>
string(51) "http://url.tld/index.php/Page/6-fdgfh/"
[2]=>
string(39) "http://url.tld/Page/7"
}
[2]=>
array(3) {
[0]=>
string(1) "9"
[1]=>
string(0) ""
[2]=>
string(0) ""
}
[3]=>
array(3) {
[0]=>
string(0) ""
[1]=>
string(1) "6"
[2]=>
string(1) "7"
}
}
====
So, my question is: How can I fix this? What I need is the array structure from the first example, while using both regular expressions as one regular expression, because I need a consistent structure to do a preg_replace_callback later.

I think you're looking for the Branch Reset group:
\[url]((?|http://url\.tld/index\.php/p(\d+).*?|http://url\.tld(?:/index\.php)?/Page/(\d+).*?))\[/url]
Or, for the line-noise-challenged among us:
\[url]
(
  (?|
    http://url\.tld/index\.php/p(\d+)[^[]*
  |
    http://url\.tld(?:/index\.php)?/Page/(\d+)[^[]*
  )
)
\[/url]
This captures the numbers in group #2, no matter which part of the regex matched it. The whole URL is still captured in group #1.
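
A minimal sketch of the branch reset group in use with preg_match_all follows. The sample text is made up from the URLs in the question, and the ~ delimiter is chosen only so the slashes need no escaping:

```php
<?php
// Sample input assembled from the URL shapes shown in the question.
$text = '[url]http://url.tld/index.php/p9-abc#hashtag[/url] '
      . '[url]http://url.tld/index.php/Page/6-fdgfh/[/url] '
      . '[url]http://url.tld/Page/7[/url]';

// (?| ... ) makes both alternatives number their capture group as group 2.
$pattern = '~\[url]((?|http://url\.tld/index\.php/p(\d+)[^[]*'
         . '|http://url\.tld(?:/index\.php)?/Page/(\d+)[^[]*))\[/url]~';

preg_match_all($pattern, $text, $m);

$urls = $m[1]; // full URL of each match
$ids  = $m[2]; // numeric ID, regardless of which alternative matched
var_dump($urls, $ids);
```

Because every ID lands in group 2, a later preg_replace_callback can read the same index for either URL form.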

PHP - REGEX TO ARRAY like MP3TAG

I would like to ask how to convert a string to an array using
a string pattern, like Mp3tag does:
%ALBUM% - %SOMETHING% - %SOMETHING%,
The ' - ' separators are custom characters that are not static.
In case I didn't make myself clear:
I want to turn a custom string into an array,
where the pattern is custom, not static.
Is this possible in PHP, and if so, how?

$str = "%ALBUM% & %SOMETHING% (ノ゜-゜)ノ ︵ ┬──┬ %SOMETHING%,";
preg_match_all("/%([a-z]+)%/i", $str, $matches);
var_dump($matches);
Outputs
array(2) {
[0]=>
array(3) {
[0]=>
string(7) "%ALBUM%"
[1]=>
string(11) "%SOMETHING%"
[2]=>
string(11) "%SOMETHING%"
}
[1]=>
array(3) {
[0]=>
string(5) "ALBUM"
[1]=>
string(9) "SOMETHING"
[2]=>
string(9) "SOMETHING"
}
}
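
The answer above extracts the placeholder names from the format string. If the goal is also to parse an actual value with such a format, one possible sketch turns the format into a regex with named groups. The function name formatToRegex, the placeholder names, and the sample value are all hypothetical, and the sketch assumes each placeholder name appears only once in the format:

```php
<?php
// Hypothetical helper: converts "%ALBUM% - %TITLE%" into an anchored regex
// with one lazy named group per placeholder.
function formatToRegex(string $format): string
{
    // Odd indexes hold placeholder names, even indexes hold literal text.
    $parts = preg_split('/%([a-z]+)%/i', $format, -1, PREG_SPLIT_DELIM_CAPTURE);
    $regex = '';
    foreach ($parts as $i => $part) {
        $regex .= ($i % 2 === 1)
            ? '(?<' . strtolower($part) . '>.+?)'
            : preg_quote($part, '~');
    }
    return '~^' . $regex . '$~u';
}

$format = '%ALBUM% - %ARTIST% - %TITLE%';
$value  = 'Abbey Road - The Beatles - Come Together';

$fields = [];
if (preg_match(formatToRegex($format), $value, $m)) {
    // Keep only the named keys, dropping the numeric duplicates.
    $fields = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
}
var_dump($fields);
```

Since the separators are taken from the format string itself and escaped with preg_quote, they can be any characters. A value that contains the separator inside a field would be split at its first occurrence, which may or may not be what is wanted.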

Regex quantified capture

php > preg_match("#/m(/[^/]+)+/t/?#", "/m/part/other-part/t", $m);
php > var_dump($m);
array(2) {
[0]=>
string(20) "/m/part/other-part/t"
[1]=>
string(11) "/other-part"
}
php > preg_match_all("#/m(/[^/]+)+/t/?#", "/m/part/other-part/t", $m);
php > var_dump($m);
array(2) {
[0]=>
array(1) {
[0]=>
string(20) "/m/part/other-part/t"
}
[1]=>
array(1) {
[0]=>
string(11) "/other-part"
}
}
In this example I would like the capture to match both /part and /other-part. Unfortunately, the regex /m(/[^/]+)+/t/? does not capture both, as I expected it would.
The capture should not be tied to this sample; it should capture an arbitrary number of repetitions of the group, e.g. /m/part/other-part/and-another/more/t
UPDATE:
Given that this is expected behavior, my question still stands: how can I achieve this kind of matching?

Try this one out:
preg_match_all("#(?:/m)?/([^/]+)(?:/t)?#", "/m/part/other-part/another-part/t", $m);
var_dump($m);
It gives:
array(2) {
[0]=>
array(3) {
[0]=>
string(7) "/m/part"
[1]=>
string(11) "/other-part"
[2]=>
string(15) "/another-part/t"
}
[1]=>
array(3) {
[0]=>
string(4) "part"
[1]=>
string(10) "other-part"
[2]=>
string(12) "another-part"
}
}
//EDIT
IMO the best way to do what you want is to use the preg_match() pattern from @stema and explode the result by / to get the list of parts you want.

That's the way capturing groups work: a repeated capturing group stores only its last match once the regex has finished. In your test that is "/other-part".
Try this instead
/m((?:/[^/]+)+)/t/?
See it here on Regexr, while hovering over the match, you can see the content of the capturing group.
Just make your group non-capturing by adding a ?: at the start, and put a new capturing group around the whole repetition.
In PHP:
preg_match_all("#/m((?:/[^/]+)+)/t/?#", "/m/part/other-part/t", $m);
var_dump($m);
Output:
array(2) {
[0]=> array(1) {
[0]=>
string(20) "/m/part/other-part/t"
}
[1]=> array(1) {
[0]=>
string(16) "/part/other-part"
}
}
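
Combining this pattern with the explode() idea from the other answer, a minimal sketch could look like the following. The variable names are illustrative, and the longer sample path is the one from the question:

```php
<?php
$subject = '/m/part/other-part/and-another/more/t';

$parts = [];
// Group 1 captures the whole repeated section between "/m" and "/t".
if (preg_match('#/m((?:/[^/]+)+)/t/?#', $subject, $m)) {
    // Strip the leading slash so explode() does not yield an empty first item.
    $parts = explode('/', ltrim($m[1], '/'));
}
var_dump($parts);
```

This yields part, other-part, and-another and more as separate array items, whatever the number of repetitions.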

As already written in a comment, you can't do this in one pass, because preg_match does not return every repetition of the same subgroup (as you can get with .NET, see Get repeated matches with preg_match_all()). So you can divide the operation into multiple steps:
Match the subject and extract the part you're interested in.
Match the individual parts within that extracted string only.
Code:
$subject = '/m/part/other-part/t';
$subpattern = '/[^/]+';
$pattern = sprintf('~/m(?<path>(?:%s)+)/t/?~', $subpattern);
$r = preg_match($pattern, $subject, $matches);
if (!$r) return;
$r = preg_match_all("~$subpattern~", $matches['path'], $matches);
var_dump($matches);
Output:
array(1) {
[0]=>
array(2) {
[0]=>
string(5) "/part"
[1]=>
string(11) "/other-part"
}
}

preg_match not returning expected results

I'm attempting to use a regex to parse a search string that may, from time to time, contain special syntax. The syntax I'm looking for is [special keyword : value], and I want each match put into an array. Keep in mind that the search string will contain other text that is not intended to be parsed.
$searchString = "[StartDate:2010-11-01][EndDate:2010-11-31]";
$specialKeywords = array();
preg_match("/\[{1}.+\:{1}.+\]{1}/", $searchString, $specialKeywords);
var_dump($specialKeywords);
Output:
array(1) { [0]=> string(43) "[StartDate:2010-11-01] [EndDate:2010-11-31]" }
Desired Output:
array(2) { [0]=> string() "[StartDate:2010-11-01]"
[1]=> string() "[EndDate:2010-11-31]"}
Please let me know if I am not being clear enough.

Your .+ matches across the boundaries between the two [...] parts because it matches any character, and as many of them as possible. You could be more restrictive about which characters may be matched. Also {1} is redundant and can be dropped.
/\[[^:]*:[^\]]*\]/
should work more reliably.
Explanation:
\[ # match a [
[^:]* # match any number of characters except :
: # match a :
[^\]]* # match any number of characters except ]
\] # match a ]
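
As a sketch of how this pattern could be applied, the example below uses preg_match_all and adds capture groups around the keyword and the value, so the result can be turned into a keyword => value array. The surrounding words in the sample string are made up to stand in for the "other text" mentioned in the question:

```php
<?php
// Sample search string with unrelated text around the special syntax.
$searchString = 'shoes [StartDate:2010-11-01] red [EndDate:2010-11-31]';

// Same pattern as above, with groups added around the keyword and the value.
preg_match_all('/\[([^:]*):([^\]]*)\]/', $searchString, $m);

$tokens   = $m[0];                        // the full [keyword:value] tokens
$keywords = array_combine($m[1], $m[2]);  // keyword => value
var_dump($tokens, $keywords);
```

One caveat: [^:]* can also run across a "]" when a bracket pair contains no colon. Excluding "]" from that first class as well, as in [^:\]]*, would be one way to guard against that.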

This:
$searchString = "[StartDate:2010-11-01][EndDate:2010-11-31]";
preg_match_all('/\[.*?\]/', $searchString, $match);
print_r($match);
gives the expected result, I'm not sure if it matches all the constraints.

Try the following:
$searchString = "[StartDate:2010-11-01][EndDate:2010-11-31]";
$specialKeywords = array();
preg_match_all("/\[\w+:\d{4}-\d\d-\d\d\]/i", $searchString, $specialKeywords);
var_dump($specialKeywords[0]);
Outputs:
array(2) {
[0]=>
string(22) "[StartDate:2010-11-01]"
[1]=>
string(20) "[EndDate:2010-11-31]"
}

Use this regex: "/\[(.*?)\:(.*?)\]{1}/" together with preg_match_all. It will return:
array(3) {
[0]=>
array(2) {
[0]=>
string(22) "[StartDate:2010-11-01]"
[1]=>
string(20) "[EndDate:2010-11-31]"
}
[1]=>
array(2) {
[0]=>
string(9) "StartDate"
[1]=>
string(7) "EndDate"
}
[2]=>
array(2) {
[0]=>
string(10) "2010-11-01"
[1]=>
string(10) "2010-11-31"
}
}

/\[.+?\:.+?\]/
I suggest this pattern; it is less complex, but it handles the same input as Tim's.
