PHP regular expression with nth occurrence - php

Here's a string:
n%3A171717%2Cn%3A%747474%2Cn%3A555666%2Cn%3A1234567&bbn=555666
How can I extract 1234567 from this string? I need good logic or syntax for it.
I guess preg_match would be a better option than the explode function in PHP.
This is for a PHP script that extracts data. The numbers can vary and the number of occurrences can vary as well; only %2Cn%3A will always be there in front of the numbers. The end will always have a &bbn=anyNumber.

That looks like part of an encoded URL, so there are bound to be better ways to do it, but after urldecode() your string looks like:
n:171717,n:t7474,n:555666,n:1234567&bbn=555666
So:
preg_match_all('/n:(\d+)/', urldecode($string), $matches);
echo array_pop($matches[1]);
Parenthesized matches are in $matches[1] so just array_pop() to get the last element.
If &bbn= can be anywhere (except for at the beginning) then:
preg_match('/n:(\d+)&bbn=/', urldecode($string), $matches);
echo $matches[1];
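The first snippet above can be folded into one small helper. This is only a sketch: the function name lastNValue is made up for illustration, and it assumes the input keeps the n%3A<digits> shape described in the question.

```php
<?php
// Hypothetical helper (the name is not from the answer above): returns the
// last run of digits that follows "n:" in the decoded string, or null if
// there is no such run.
function lastNValue(string $encoded): ?string
{
    $decoded = urldecode($encoded);
    if (!preg_match_all('/n:(\d+)/', $decoded, $matches)) {
        return null;
    }
    // $matches[1] holds the captured digit runs in order of appearance.
    return end($matches[1]);
}

$string = 'n%3A171717%2Cn%3A%747474%2Cn%3A555666%2Cn%3A1234567&bbn=555666';
echo lastNValue($string), "\n"; // 1234567
```

Returning null instead of echoing an undefined index makes the "no match" case explicit for the caller.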

only %2Cn%3A will always be there in front of the numbers
The urldecoded equivalent of %2Cn%3A is ,n: and the last "enclosing boundary" &bbn remains as is.
The preg_match function will do the job:
preg_match("/(?<=,n:)\d+(?=&bbn)/", urldecode("n%3A171717%2Cn%3A%747474%2Cn%3A555666%2Cn%3A1234567&bbn=555666"), $m);
print_r($m[0]); // "1234567"

Related

preg_replace - similar patterns

I have a string that contains something like "LAB_FF, LAB_FF12" and I'm trying to use preg_replace to look for both patterns and replace them with different strings, using this pattern:
/LAB_[0-9A-F]{2}|LAB_[0-9A-F]{4}/
So input would be
LAB_FF, LAB_FF12
and the output would need to be
DAB_FF, HAD_FF12
Problem is, for the second string, it interprets it as "LAB_FF" instead of "LAB_FF12" and so the output is
DAB_FF, DAB_FF
I've tried splitting the input line out using 2 different preg_match statements, the first looking for the {2} pattern and the second looking for the {4} pattern. This sort of works in that I can get the correct output into 2 separate strings but then can't combine the two strings to give the single amended output.
\b is a word boundary, so the pattern only matches where the word actually ends instead of matching the start of a longer word.
https://regex101.com/r/upY0gn/1
$pattern = "/\bLAB_[0-9A-F]{2}\b|\bLAB_[0-9A-F]{4}\b/";
Regarding the comment on the other answer about how to replace the string, this is one way.
The pattern leaves an empty entry in the output array for each capture group that fails before the one that matches.
In this case there is one (the first).
Then it's just a matter of substr.
$re = '/(\bLAB_[0-9A-F]{2}\b)|(\bLAB_[0-9A-F]{4}\b)/';
$str = 'LAB_FF12';
preg_match($re, $str, $matches);
var_dump($matches);
$substitutes = ["", "DAB", "HAD"];
for ($i = 1; $i < count($matches); $i++) {
    if ($matches[$i] != "") {
        $result = $substitutes[$i] . substr($matches[$i], 3);
        break;
    }
}
echo $result;
https://3v4l.org/gRvHv
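The snippet above handles a single token. For the whole input line from the question, one possible approach is preg_replace_callback, which lets the replacement depend on what was matched. This is a sketch, not part of the answer above; the DAB_/HAD_ mapping is taken from the expected output in the question.

```php
<?php
// Sketch only: a 2-digit label becomes DAB_xx and a 4-digit label becomes
// HAD_xxxx, following the input/output pair given in the question.
$input = 'LAB_FF, LAB_FF12';

$output = preg_replace_callback(
    // The 4-digit alternative comes first so it is tried before the 2-digit
    // one; \b keeps a 2-digit match from stopping inside a longer label.
    '/\bLAB_([0-9A-F]{4}|[0-9A-F]{2})\b/',
    function (array $m): string {
        $prefix = strlen($m[1]) === 4 ? 'HAD_' : 'DAB_';
        return $prefix . $m[1];
    },
    $input
);

echo $output, "\n"; // DAB_FF, HAD_FF12
```

This does the replacement in one pass, so there is no need to combine two separately processed strings afterwards.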
You can specify exact amounts in one set of curly braces, e.g. {2,4}.
Just tested this and seems to work:
/LAB_[0-9A-F]{2,4}/
LAB_FF, LAB_FFF, LAB_FFFF
EDIT: My mistake, that actually matches between 2 and 4. If you change the order of your selections it matches the first it comes to, e.g.
/LAB_([0-9A-F]{4}|[0-9A-F]{2})/
LAB_FF, LAB_FFFF
EDIT2: The following will match LAB_even_amount_of_characters:
/LAB_([0-9A-F]{2})+/
LAB_FF, LAB_FFFF, LAB_FFFFFF...

Get specific string content inside big string

I have a big string like this:
[/az_column_text][/vc_column_inner][vc_column_inner width="3/4"]
[az_latest_posts post_layout="listed-layout" post_columns_count="2clm" post_categories="assemblea-soci-2015"]
[/vc_column_inner][/vc_row_inner][/vc_column]
What I need to extract:
assemblea-soci-2015
Of course this value can change, and also the big string can change too. I need a regex or something else to extract this value (it will be always from post_categories="my-value-to-extract") from this big string.
My idea is to take post_categories=" as the start of the substring and the next " character as its end, but I have no idea how to do this.
Is there an elegant way to do this that also works for future values of different lengths?
You can use this regex in PHP:
post_categories="\K[^"]+
You can use this regex:
(?<=post_categories=")[^"]+(?=")
(?<=...) (lookbehind) requires post_categories=" immediately before the desired match, and (?=...) (lookahead) requires " immediately after it.
[^"]+ matches the value itself (which is assumed not to contain any ").
Example PHP code:
$text='[/az_column_text][/vc_column_inner][vc_column_inner width="3/4"]
[az_latest_posts post_layout="listed-layout" post_columns_count="2clm" post_categories="assemblea-soci-2015"]
[/vc_column_inner][/vc_row_inner][/vc_column]';
preg_match ("/(?<=post_categories=\")[^\"]+(?=\")/", $text,$matches);
echo $matches[0];
Output:
assemblea-soci-2015
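If other attributes need to be read the same way, the \K pattern from the first answer can be wrapped in a small function. This is a sketch; shortcodeAttribute is a made-up name, and it assumes attribute values are always double-quoted and never contain a " character.

```php
<?php
// Hypothetical helper: returns the value of the named attribute, or null
// when the attribute is not present in the text.
function shortcodeAttribute(string $text, string $name): ?string
{
    // preg_quote() keeps any regex metacharacters in $name literal.
    $pattern = '/' . preg_quote($name, '/') . '="\K[^"]+/';
    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

$text = '[az_latest_posts post_layout="listed-layout" post_columns_count="2clm" post_categories="assemblea-soci-2015"]';
echo shortcodeAttribute($text, 'post_categories'), "\n"; // assemblea-soci-2015
```

Because \K resets the start of the reported match, the value ends up in $m[0] without needing a capture group.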
This should extract what you want.
preg_match("/post_categories=\"(.*?)\"\]/", $text_you_want_to_use, $matches);
The value is then in $matches[1].

Regex to extract substring

I'm really struggling with this... hopefully someone can put me on the right path to a solution.
My input string is structured like this:
66-2141-A-AC107-7
I'm interested in extracting the string 'AC107' using a single regular expression. I know how to do this with other PHP string functions, but I have to do this with a regular expression.
What I need is to extract all data between the third and fourth hyphens. The structure of each section is not fixed (i.e, 66 may be 8798709 and 2141 may be 38). The presence of the number of hyphens is guaranteed (i.e., there will always be a total of four (4) hyphens).
Any help/guidance is greatly appreciated!
This will do what you need:
(?:[^-]*-){3}([^-]+)
Explanation:
(?:[^-]*-) Look for zero or more non-hyphen characters followed by a hyphen
{3} Look for three of the blocks just described
([^-]+) Capture all the consecutive non-hyphen characters from that point forward (will automatically cut off before the next hyphen)
You can use it in PHP like this:
$str = '66-2141-A-AC107-7';
preg_match('/^(?:[^-]*-){3}([^-]+)/', $str, $matches);
echo $matches[1]; // prints AC107
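The same pattern generalizes to any field position by making the repetition count a parameter. This is a sketch under the assumption that the index is a non-negative integer; hyphenField is a hypothetical name.

```php
<?php
// Hypothetical helper: returns the hyphen-separated field at the given
// zero-based index, or null if the string has too few fields.
function hyphenField(string $str, int $index): ?string
{
    // Skip $index "field-" blocks from the start, then capture the next field.
    $pattern = '/^(?:[^-]*-){' . $index . '}([^-]+)/';
    return preg_match($pattern, $str, $m) === 1 ? $m[1] : null;
}

echo hyphenField('66-2141-A-AC107-7', 3), "\n";    // AC107
echo hyphenField('8798709-38-A-AC107-7', 3), "\n"; // AC107
```

The second call uses the alternative section lengths mentioned in the question to show that only the hyphen count matters.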
This looks for anything followed by a hyphen, three times, and then captures your value in group 2 (the second set of parentheses), followed by another hyphen and anything else.
/^(.*-){3}(.*)-(.*)/
You can access it by using $2. In PHP, it would be like this:
$string = '66-2141-A-AC107-7';
preg_match('/^(.*-){3}(.*)-(.*)/', $string, $matches);
$special_id = $matches[2];
print $special_id;

pregmatch between characters and any numeric

I'm stuck writing a preg_match
I have a string:
XPMG_ar121023.txt
and need to extract the 2 letters between XPMG_ and the first digit - be it a 0-9
$str = 'XPMG_ar121023.txt';
preg_match('/('XPMG_')|[0-9\,]))/', $str, $match);
print_r($match);
Maybe this isn't the best option: My characters will always be
You can just do
$str = "XPMG_ar121023.txt" ;
preg_match('/_([a-z]+)/i', $str, $match);
var_dump($match[1]);
Output
string 'ar' (length=2)
This is too simple for a regular expression. Just $match = substr($str, 5, 2) would get what you're asking for.
Let me walk through this step by step so as to help you solve similar problems in the future. Suppose we have the following format for our filenames:
XPMG_ar121023.txt
We know what we want to capture, we want the "ar" right after the _ and just before the numbers begin. So our expression would look something like this:
_[a-z]+
This is pretty straightforward. We're starting by looking for an underscore, followed by one or more letters between a and z. The square brackets define a character class. Our class consists of the alphabet, but you can push specific numbers in there and more if you like.
Now because we want to capture only the letters, we need to put parentheses around that part of the pattern:
_([a-z]+)
In the result we will now have access to only that subpattern. Next we put our delimiters in place to specify where our pattern begins, and ends:
/_([a-z]+)/
And lastly, after our closing delimiter we can add some modifiers. As it is written, our pattern only looks for lower-case letters. We can add the i modifier to make this case-insensitive:
/_([a-z]+)/i
Voila, we're done. Now we can pass it into preg_match to see what it spits out:
preg_match( "/_([a-z]+)/i", "XPMG_ar121023.txt", $match );
This function takes a pattern as the first parameter, a string to match it against as the second, and lastly a variable to spit the results into. When all is said and done, we can check $match for our data.
The results of this operation follow:
array(2) {
[0]=> string(3) "_ar"
[1]=> string(2) "ar"
}
This is the contents of $match. Notice our full pattern is found in the first index of the array, and our captured portion is provided in the second index of the array.
echo $match[1]; // ar
Hope this helps.
Well, why not:
$letters = $str[5].$str[6];
:)
After all, you'll always need the 2 chars after the fixed prefix, there are many ways that do not require a regexp (substr() being the best anyway)
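To compare the two families of answers side by side, here is a short sketch. It assumes the prefix is always exactly "XPMG_" (5 characters), which is what the fixed-offset approach relies on; the anchored regex is a variation on the patterns above, not taken from any of the answers.

```php
<?php
$str = 'XPMG_ar121023.txt';

// Fixed-offset approach: "XPMG_" is 5 characters long, so the two letters
// start at offset 5. This breaks if the prefix length ever changes.
$bySubstr = substr($str, 5, 2);

// Regex approach: letters after the prefix, up to the first digit. This
// still works if the number of letters is not exactly two.
preg_match('/^XPMG_([a-z]+)\d/i', $str, $match);
$byRegex = $match[1];

echo $bySubstr, ' ', $byRegex, "\n"; // ar ar
```

substr() is simpler when the layout is truly fixed; the regex is the safer choice if the letter count might vary.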

Regex to get string from last numeric values

I have some PHP strings like below:
abc-1987-mp3-songs
xyz-1999-india-mp3-songs
dec-2001-mp3-songs
ryu-2012-freemp3-songs
Now I want these strings split at the last numeric value found, like below:
abc-1987
xyz-1999
dec-2001
ryu-2012
Please help me find which regex can be used to do this. Thanks.
Ok, I had a look (do take some time to learn regex - but meanwhile):
$split = (preg_split('/(^.*?[0-9]+)\-?[^0-9]+/', 'foo-xyz-1999-india-mp3-songs', -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY));
echo $split[0];//<--- foo-xyz-1999, just like you wanted
This dumps an array with foo-xyz-1999 as its first value, which is what you need.
The only difference is that, though the whole string becomes its own delimiter, there are two delimiters (the first part, always ending on a series of numbers and the rest of the string, that doesn't contain any more digits)
Use explode instead of a regular expression, for example:
$str="abc-1987-mp3-songs";
$f=explode("-",$str);
echo $final_result=$f[0]."-".$f[1];
Or, if you want to use a regular expression:
<?php
$str="abc-1987-mp3-songs";
echo $f=preg_replace('/[^0-9]/','', $str);
?>
The above code gives you all the numeric digits in your string (19873 for this input).
This matches from the start of the string up to the last hyphen that is directly followed by digits, including those digits:
([\w\d-]*-[\d]+)
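For the four sample strings, a simpler pattern may be enough. Note that "mp3" also contains a digit, so the expected output in the question really corresponds to the first number in each string, not the last. The sketch below relies on that assumption.

```php
<?php
// Sketch: keep everything from the start through the first run of digits.
// Assumes the year is the first number in each string, as in the samples.
$names = [
    'abc-1987-mp3-songs',
    'xyz-1999-india-mp3-songs',
    'dec-2001-mp3-songs',
    'ryu-2012-freemp3-songs',
];

$prefixes = [];
foreach ($names as $name) {
    // \D* consumes non-digits, \d+ then takes the first run of digits.
    if (preg_match('/^\D*\d+/', $name, $m) === 1) {
        $prefixes[] = $m[0];
    }
}

print_r($prefixes); // abc-1987, xyz-1999, dec-2001, ryu-2012
```

If a string could contain an earlier number before the year, this pattern would stop too soon and a stricter one (for example, requiring four digits) would be needed.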
