Extract last instance of a regular expression from string in PHP - php

I have a URL that is in the following structure: http://somewebsite.com/directory1/directory2/directory3...
I'm trying to get the last directory name from this URL, but the depth of the URL isn't constant, so I don't think I can use a simple substr or preg_match call. Is there a function to get the last instance of a regular expression match from a string?

Just use:
basename( $url )
It should have the desired effect

Torben's answer is the correct way to handle this specific case. But for posterity, here is how you get the last instance of a regular expression match:
preg_match_all('/pattern/', 'subject', $matches, PREG_SET_ORDER);
$last_match = end($matches); // or array_pop(), but it modifies the array
$last_match[0] contains the complete match, $last_match[1] contains the first parenthesized subpattern, etc.
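Another point of interest: a regular expression like '/\/([^\/]+)$/' also works on its own, because the $ anchors it to the end of the string. Note the escaped slash inside the character class and the + quantifier: with / as the delimiter, an unescaped / ends the pattern, and without a quantifier the class matches only a single character.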
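As a minimal sketch of the preg_match_all approach, assuming the example URL from the question and a pattern chosen here purely for illustration (it captures each path segment after a slash):

```php
<?php
// Example URL taken from the question.
$url = 'http://somewebsite.com/directory1/directory2/directory3';

// Illustrative pattern: a slash followed by one or more non-slash characters.
// Using # as the delimiter avoids having to escape the slashes.
preg_match_all('#/([^/]+)#', $url, $matches, PREG_SET_ORDER);

// end() returns the last match set without removing it from $matches.
$last_match = end($matches);

echo $last_match[1], "\n";   // first capture group of the last match
echo basename($url), "\n";   // the simpler route for this specific case
```

With PREG_SET_ORDER each element of $matches is one complete match set, which is why end() gives the last occurrence directly.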

Related

Get last parameter from the URL

I want to get the last parameter from the following type of structure:
$current_url = "/wp/author/admin/1";
So, from above url I will like to get "1"
The following code will return it correctly, but here I'm specifying the exact position of the variable. How can I get the last parameter without specifying its position (eg. no matter how many parameters are in the URL, just get the last one):
$parts = explode('/', $current_url);
var_dump($parts[4]);
I would suggest using a regular expression for this, as you can do quite a few nice things, e.g. also allow URLs that end in /:
if (preg_match('/\/([^\/]*)\/?$/', $current_url, $matches)) {
    $lastComponent = $matches[1];
} else {
    // do something if the URL does not match the pattern
}
What's happening here? The regular expression matches if it can find a forward slash (the \/) followed by any number of characters that are not slashes (the ([^\/]*)), which may then optionally be followed by another slash (the \/?), and then arrives at the end of the string (the $).
The function returns a value that evaluates to false if the regular expression did not match, so you are prepared for garbage input and may emit a warning if appropriate. Notice the parentheses in ([^\/]*), which take all the characters matched there (everything between the last slash and the end of the input string, or between the last two slashes if the URL ends in /) and put them into their own match ($matches[1]).
I recommend you try regexpal.com if you want to debug and check your regular expressions. Regular expressions are very powerful tools and quite underused in programming, especially in PHP, where you get nice functions for them (e.g. preg_match, preg_match_all, and preg_split).
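To see how that pattern behaves with and without a trailing slash, here is a small sketch. The helper name last_component is made up for this example; only the pattern comes from the answer.

```php
<?php
// Hypothetical helper wrapping the pattern from the answer above.
function last_component(string $url): ?string
{
    // Slash, then any run of non-slash characters (captured),
    // then an optional trailing slash, then end of string.
    if (preg_match('/\/([^\/]*)\/?$/', $url, $matches)) {
        return $matches[1];
    }
    return null; // the input contained no slash at all
}

var_dump(last_component('/wp/author/admin/1'));   // no trailing slash
var_dump(last_component('/wp/author/admin/1/'));  // trailing slash
var_dump(last_component('no-slashes-here'));      // does not match
```

Returning null for non-matching input is a design choice of this sketch; emitting a warning or throwing would work equally well.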
After you explode the string, use the end() function. It always returns the last element of the array.
http://us1.php.net//manual/en/function.end.php
I'm sure there are other methods, but I would use array_pop:
$parts = explode('/', $current_url);
var_dump(array_pop($parts));
http://php.net/manual/en/function.array-pop.php
"array_pop() pops and returns the last value of the array, shortening the array by one element."
The last part of that note is important: array_pop() changes the contents of the $parts array.
$parts = explode("/", $url);
echo end($parts);
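One caveat with the explode() approaches is that a URL ending in / yields an empty last element. A possible workaround, sketched here with a made-up helper name (last_path_segment), is to trim trailing slashes first:

```php
<?php
// Hypothetical helper: returns the last non-empty path segment.
function last_path_segment(string $url): string
{
    // rtrim() drops any trailing slashes so explode() does not
    // produce an empty final element.
    $parts = explode('/', rtrim($url, '/'));

    // end() needs a variable (it takes its argument by reference)
    // and leaves $parts itself unchanged apart from the internal pointer.
    return end($parts);
}

echo last_path_segment('/wp/author/admin/1'), "\n";
echo last_path_segment('/wp/author/admin/1/'), "\n";
```

Unlike array_pop(), end() does not shorten the array, which matters if $parts is used again later.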

Error trying to pass regex match to function

I'm getting: Syntax error, unexpected T_LNUMBER, expecting T_VARIABLE or '$'
This is the code I'm using:
function wpse44503_filter_content( $content ) {
    $regex = '#src=("|\')'.
        '(/images/(19|20)(0-9){2}/(0|1)(0-9)/[^.]+\.(jpg|png|gif|bmp|jpeg))'.
        '("|\')#';
    $replace = 'src="'.get_site_url( $2 ).'"';
    $output = preg_replace( $regex, $replace, $content );
    return $output;
}
This is the line where I'm getting that error: $replace = 'src="'.get_site_url( $2 ).'"';
Can anyone help me to fix it?
Thanks
You can't have '$2' as a variable name. It must start with a letter or underscore.
http://php.net/manual/en/language.variables.basics.php
Variable names follow the same rules as other labels in PHP. A valid variable name starts with a letter or underscore, followed by any number of letters, numbers, or underscores. As a regular expression, it would be expressed thus: '[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*'
Edit: The above was my original answer, and it is the correct answer to the simple "syntax error" question. A more in-depth answer follows.
You are trying to use $2 to represent "the second capture group", but you haven't done anything at that point to match your regex. Even if $2 were a valid PHP variable name, it still wouldn't be set at that point in your script. This tells you that you are using preg_replace improperly and that it may not suit your actual needs.
Note that the preg_replace documentation doesn't support using $n as a separate variable outside of the replacement operation. In other words, 'foo' . $1 . 'bar' is not a valid replacement string, but 'foo$1bar' is.
Depending on the complexity of get_site_url, you have 2 options:
If get_site_url is simply adding a root directory or server name, you could change your replacement string to src="/myotherlocation$2". This will effectively replace "/images/..." with "/myotherlocation/images/..." in the img src. This will not work if get_site_url is doing something more complex.
If get_site_url is complex, you should use preg_replace_callback per other answers. Give the documentation a read and post a new question (or I guess update this question?) if you have trouble with the implementation.
What you're trying to do (i.e. replacing the matched string with the result of a function call) can't be done using preg_replace. You'll need to use preg_replace_callback instead to get a function called for every match.
A short example of preg_replace_callback:
// Callback that returns the replacement for each match
$get_site_url = function($row) {
    return '!'.$row[1].'!'; // $row[1] is the first capture group
};
$str = 'olle';
$regex = '/(ll)/'; // Pattern to match
// Replace each match with the return value of $get_site_url
$output = preg_replace_callback($regex, $get_site_url, $str);
var_dump($output); // output "o!ll!e"
PHP variable names can't begin with a number.
$2 is not a valid PHP variable. If you meant the second group in the regex then you want to put \2 in a string. However, since you're passing it to a function then you'll need to use preg_replace_callback() instead and substitute appropriately in the callback.
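Applied to the code in the question, the callback approach might look roughly like the sketch below. Two things here are assumptions rather than anything from the original post: example_site_url is a made-up stand-in for WordPress's get_site_url() so the example runs on its own, and (0-9) has been changed to [0-9] on the guess that a digit class was intended, since (0-9) matches the literal text "0-9".

```php
<?php
// Hypothetical stand-in for get_site_url(); assumed to prefix a base URL.
function example_site_url(string $path): string
{
    return 'http://example.com' . $path;
}

function filter_content_sketch(string $content): string
{
    // Assumption: [0-9] was intended where the question has (0-9).
    $regex = '#src=("|\')'
           . '(/images/(19|20)[0-9]{2}/(0|1)[0-9]/[^.]+\.(jpg|png|gif|bmp|jpeg))'
           . '("|\')#';

    $result = preg_replace_callback(
        $regex,
        function (array $m): string {
            // $m[2] is the second capture group: the image path.
            return 'src="' . example_site_url($m[2]) . '"';
        },
        $content
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $content;
}

$html = '<img src="/images/2012/03/photo.jpg">';
echo filter_content_sketch($html), "\n";
```

Inside the callback the capture groups arrive as an ordinary array, so the second group is $m[2] rather than $2.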
If an object property name begins with a number, use the curly-brace syntax. I ran into this with a result set from a third-party API.
Code that works:
$stockInfo->original->data[0]->close_yesterday
Code that fails:
$stockInfo->original->data[0]->52_week_low
Solution:
$stockInfo->original->data[0]->{'52_week_low'}
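The brace syntax can be tried out with a small object whose property names start with a digit. The JSON payload below is invented for illustration and is not the real API response.

```php
<?php
// Made-up payload; json_decode() turns it into a stdClass object.
$stockInfo = json_decode('{"close_yesterday": 10.5, "52_week_low": 8.2}');

// An ordinary property name can be accessed directly.
echo $stockInfo->close_yesterday, "\n";

// A name starting with a digit must be wrapped in {'...'}.
echo $stockInfo->{'52_week_low'}, "\n";
```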

How to get my regular expression to extract information, not just check

I have a regular expression for checking whether a string is a zip/postal code or not. But I would really like to also be able to extract the code from a full address (or, if possible, any string).
Here is my current regular expression:
/^((\d{5}-\d{4})|(\d{5})|([a-zA-Z]\d[a-zA-Z]\s\d[a-zA-Z]\d)|([a-zA-Z]\d[a-zA-Z]\d[a-zA-Z]\d))$/
If necessary I'm willing to settle for a function (I'm checking with PHP) but I'd rather the regexp do the work if possible.
preg_match, which I assume you're already using when you're checking a string against your regular expression, also gives you back the actual text that matched your pattern.
preg_match($regex, $input, $matches);
echo $matches[0];
The third argument is filled with the results of trying to match the regex against your input. $matches[0] will contain text that matched the whole pattern, while higher indexes will contain text that matched against capturing subpatterns (the parts of the pattern enclosed in parentheses).
However, in your case, you've enclosed your pattern with the start-of-input ^ and end-of-input $ characters, which means that any matches must include the entire input string (or an entire line in multiline mode). You'd have to get rid of the ^ and $ before trying to use this pattern to extract a postal code from a larger string.
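A rough sketch of that change: the pattern below keeps the four alternatives from the question but drops ^ and $. The \b word boundaries and the sample addresses are additions made for this example, not part of the original pattern.

```php
<?php
// Same alternatives as in the question, without ^ and $.
// \b (an addition in this sketch) stops the pattern from matching
// inside a longer run of digits or letters.
$pattern = '/\b(?:\d{5}-\d{4}|\d{5}'
         . '|[a-zA-Z]\d[a-zA-Z]\s\d[a-zA-Z]\d'
         . '|[a-zA-Z]\d[a-zA-Z]\d[a-zA-Z]\d)\b/';

// Sample addresses invented for illustration.
$us = '1600 Amphitheatre Parkway, Mountain View, CA 94043-1351';
$ca = '10 Main St, Ottawa ON K1A 0B1';

preg_match($pattern, $us, $usMatches);
preg_match($pattern, $ca, $caMatches);

echo $usMatches[0], "\n";
echo $caMatches[0], "\n";
```

A five-digit street number would also match, so for real addresses it may be safer to collect every candidate with preg_match_all and take the last one.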
PHP will extract the groupings in () into an array with preg_match():
$matches = array();
$pattern = "/^((\d{5}-\d{4})|(\d{5})|([a-zA-Z]\d[a-zA-Z]\s\d[a-zA-Z]\d)|([a-zA-Z]\d[a-zA-Z]\d[a-zA-Z]\d))$/";
preg_match($pattern, $your_source, $matches);
print_r($matches);
Since you're working with a full address, why not rely on a service that can accurately extract and verify an address and parse its components (including the full ZIP Code), providing a nice response? It would certainly eliminate any guessing. SmartyStreets offers a tool that can extract addresses from all sorts of text. In the interest of full disclosure, I'm a software developer at SmartyStreets.
https://smartystreets.com/account/extract

How can use a match in the same regex in php?

I have this string (that is a serialized variable in php):
s:12:"hello "world";
and I want to find "hello "world" using only a regex. I tried this, but it doesn't seem to work:
(s:(?P<num>[0-9]+):".{\k{num}}";)
I only want to know how I can use the "num" result within the same regex.
This regex is part of a bigger regex, so I can't check for the end of the string.
Thanks in advance!
You can use your named capturing groups as backreferences. According to php.net:
Back references to the named subpatterns can be achieved by (?P=name)
or, since PHP 5.2.2, also by \k<name> or \k'name'. Additionally PHP
5.2.4 added support for \k{name} and \g{name}.
But I think a backreference can only be used to match the captured text again, not as a number in a quantifier. (At least I didn't get it to work.)
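Since the captured number cannot drive a quantifier, one workaround is to split the job in two: let the regex find the length prefix, then cut out that many bytes with substr(). This is only a sketch of that idea, using the string from the question; the variable names are made up.

```php
<?php
$text = 's:12:"hello "world";';

// Step 1: match the prefix and capture the declared length.
// PREG_OFFSET_CAPTURE makes each entry a [text, offset] pair.
if (preg_match('/s:(?P<num>[0-9]+):"/', $text, $m, PREG_OFFSET_CAPTURE)) {
    $len   = (int) $m['num'][0];              // declared length: 12
    $start = $m[0][1] + strlen($m[0][0]);     // first byte after the opening quote

    // Step 2: take exactly $len bytes, ignoring any quotes inside.
    $value = substr($text, $start, $len);
    echo $value, "\n";
}
```

Because the value is cut by length rather than matched, the embedded double quote causes no trouble.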
You can use preg_match function, which will populate an array of matches:
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
More information about preg_match: PHP: preg_match
$text = 's:12:"hello "world";s:12:"good bue world";';
$pattern = "(.*:[0-9]+:\"(.*)\";.*)U";
preg_match_all($pattern,$text,$r);

preg_match returning weird results

I am searching a string for urls...and my preg_match is giving me an incorrect amount of matches for my demo string.
String:
Hey there, come check out my site at www.example.com
Function:
preg_match("#(^|[\n ])([\w]+?://[\w]+[^ \"\n\r\t<]*)#ise", $string, $links);
echo count($links);
The result comes out as 3.
Can anybody help me solve this? I'm new to REGEX.
$links is the array of sub matches:
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
The matches of the two groups plus the match of the full regular expression results in three array items.
Maybe you rather want all matches using preg_match_all.
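The difference in counts can be seen in a short sketch. The sample string is invented for this example, and the pattern is the one from the question with the e modifier dropped (it only ever applied to preg_replace).

```php
<?php
// Invented sample text containing two URLs.
$string = 'See http://www.example.com and https://example.org/page for details';
$pattern = '#(^|[\n ])([\w]+?://[\w]+[^ "\n\r\t<]*)#is';

// preg_match stops at the first match; $links holds the full match
// plus one entry per capture group, so count() is 3, not 1.
preg_match($pattern, $string, $links);
echo count($links), "\n";

// preg_match_all returns the number of full matches and groups the
// results by capture group: $all[2] holds every URL.
$total = preg_match_all($pattern, $string, $all);
echo $total, "\n";
print_r($all[2]);
```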
If you use preg_match_all (as Gumbo suggested), please note that if you run your regex against this string, it will match both the value of your anchor's "href" attribute and the linked text, which in this case happens to contain a URL. This makes TWO matches.
It would be wise to run array_unique on your result set :)
In addition to the advice on how to use preg_match, I believe there is something seriously wrong with the regular expression you are using. You may want to try something like this instead:
preg_match("_([a-zA-Z]+://)?([0-9a-zA-Z$-\_.+!*'(),]+\.)?([0-9a-zA-Z]+)+\.([a-zA-Z]+)_", $string, $links);
This should handle most cases (although it wouldn't work if there was a query string after the top-level domain). In the future, when writing regular expressions, I recommend the following websites: http://www.regular-expressions.info/ for reference, and especially http://regexpal.com/ for testing your patterns as you write them.
