I'm using the following pattern to match URLs in a string:
$pattern = '%\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))%s';
This works pretty well. However, the match fails with URLs like this:
https://twitter.com/search/from:username(exclude:replies)min_faves:20
It seems to stop at the parentheses. Any ideas on how I could modify the pattern to match this type of URL? Thanks in advance!
Take the parentheses out of your negated character class and it works.
[^\s<>]+
https://regex101.com/r/4amF6u/1/
Full version:
$pattern = '%\b(([\w-]+://?|www[.])[^\s<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))%s';
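As a quick check, here is a minimal sketch that runs both the original and the modified pattern against the URL from the question. The surrounding sentence in $text and the variable names are my own assumptions, added only for illustration.

```php
<?php
// Original pattern: parentheses are excluded from the URL body.
$old = '%\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))%s';
// Modified pattern: parentheses are allowed in the URL body.
$new = '%\b(([\w-]+://?|www[.])[^\s<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))%s';

// Only the URL comes from the question; the rest of the sentence is made up.
$text = 'See https://twitter.com/search/from:username(exclude:replies)min_faves:20 for details.';

preg_match($old, $text, $oldMatch);
preg_match($new, $text, $newMatch);

echo $oldMatch[0], PHP_EOL; // stops before the opening parenthesis
echo $newMatch[0], PHP_EOL; // the whole URL
```

Note that allowing parentheses also means a URL wrapped in parentheses in running text may pick up the closing one, so this is a trade-off rather than a free fix.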
Related
I use the following regex patterns in PHP (Symfony framework) to match URLs:
^/api/v1/account/verify
^/api/v1/account/register
^/api/v1/account/forgot-password
I now have the following URL:
/api/v1/payment/{token}/success/jJBePenWo0eN
{token} will consist of dynamic values, but the string "jJBePenWo0eN" will always be static. How do I write a pattern that matches the above URL?
Update: 1
I am looking for something like this
^/api/v1/payment/[a-zA-Z0-9]/success/jJBePenWo0eN
However, this is not working.
You can make a regex like below:
'/^\/api\/v1\/payment\/(.+?)\/success\/jJBePenWo0eN$/'
where we match the string from start to end (notice the ^ and $ symbols) and capture the {token} value lazily in the (.+?) group.
Snippet:
<?php
if(preg_match('/^\/api\/v1\/payment\/(.+?)\/success\/jJBePenWo0eN$/','/api/v1/payment/{token}/success/jJBePenWo0eN',$matches) === 1){
print_r($matches);
}
Demo: https://3v4l.org/2pjjq
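If the token should never span more than one path segment, a stricter variant may be preferable. The sketch below is an assumption on my part, not part of the answer above: it uses # as the delimiter so the slashes need no escaping, and [^/]+ instead of (.+?). The token values "abc123" and "abc/extra" are made up. (The attempt in the update fails because [a-zA-Z0-9] has no quantifier, so it matches exactly one character.)

```php
<?php
// '#' as delimiter avoids escaping every slash.
// [^/]+ limits the token to a single path segment.
$pattern = '#^/api/v1/payment/([^/]+)/success/jJBePenWo0eN$#';

// Hypothetical token value "abc123".
$ok = preg_match($pattern, '/api/v1/payment/abc123/success/jJBePenWo0eN', $m);

// A token containing a slash is rejected here, whereas (.+?) would accept it.
$bad = preg_match($pattern, '/api/v1/payment/abc/extra/success/jJBePenWo0eN');

var_dump($ok, $m[1], $bad);
```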
I am working on SEO for a project and need to pick the best URL from a set of possible URLs, so I am trying to match the request URL against a URL pattern using preg_match().
Can anyone please help me create a regular expression that matches only specific URLs? I am a newbie with regular expressions; please see my attempt below.
Take the following 3 URLs:
1). http://domain.com/pressrelease/index/1
2). http://domain.com/pressrelease/123
3). http://domain.com/pressrelease/blah
I want to match URLs 2 and 3, i.e. URLs that do not contain index/(:num) after pressrelease.
I created this regular expression, but it does not work:
(\/pressrelease\/)+((?!index).)*$
Since you're passing the regex to preg_match, the below regex would be fine.
\/pressrelease\/(?!index\/\d+\b).*
(?!index\/\d+\b) is a negative lookahead assertion. It asserts that the matched /pressrelease/ is not followed by a string of the form index/number.
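To see the lookahead in action, here is a small sketch that runs the pattern over the three URLs from the question. The ~ delimiter and the variable names are my own choices; the pattern itself is the one above, just without the escaped slashes.

```php
<?php
// Same pattern as above; '~' as delimiter so the slashes need no escaping.
$pattern = '~/pressrelease/(?!index/\d+\b).*~';

$urls = [
    'http://domain.com/pressrelease/index/1',
    'http://domain.com/pressrelease/123',
    'http://domain.com/pressrelease/blah',
];

// Collect every URL the pattern accepts.
$matched = [];
foreach ($urls as $url) {
    if (preg_match($pattern, $url) === 1) {
        $matched[] = $url;
    }
}

print_r($matched); // only the second and third URL remain
```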
It actually works, but it doesn't match the first part of the URL.
^.*\/pressrelease\/(?!index).*$
Take a look at this demo on Rubular to check it.
I have a regex that's matching urls and converting them into html links.
If the URL is already part of a link, I don't want it to match. For example:
http://stackoverflow.com/questions/ask
Should match, but:
Stackoverflow
Shouldn't match
How can I create a regex to do this?
If your URL-matching regular expression is $URL, you can use the following pattern:
(?<!href=[\"'])$URL
In PHP you'd write:
preg_match("/(?<!href=[\"'])$URL/", $text, $matches);
You can use a negative lookbehind to assert that the url is not preceded by href="
(?<!href=")
(Your url-matching pattern should go immediately after that.)
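A minimal sketch of the lookbehind idea follows. The URL sub-pattern in $url is a deliberately simple stand-in that I made up, not a complete URL matcher, and the sample text (including example.com) is hypothetical.

```php
<?php
// Simplified stand-in for "your URL pattern" (an assumption for this sketch).
$url = 'https?://[^\s"\'<>]+';

// The negative lookbehind rejects any URL that directly follows href="
$pattern = '~(?<!href=")' . $url . '~';

// Hypothetical input: one bare URL and one URL that is already inside a link.
$text = 'Visit http://stackoverflow.com/questions/ask or <a href="http://example.com/">Example</a>.';

preg_match_all($pattern, $text, $matches);
print_r($matches[0]); // only the bare URL is reported
```

This only covers double-quoted href attributes written without extra whitespace; for arbitrary HTML a DOM parser is likely the more robust route.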
This link provides information. The accepted solution is like so:
<a\s
(?:(?!href=|target=|>).)*
href="http://
(?:(?!target=|>).)*
By removing the references to "target" this should work for you.
Try this
/(?:(([^">']+|^)https?\:\/\/[^\s]+))/m
I am matching URLs of this form:
/any_string/any_string/any_number
with this regular expression:
/(\w+).(\w+).(\d+)/
It works, but now I need to match this URL:
/specific_string/any_string/any_string/any_number
And I don't know how to get it. Thanks.
/(specific_string).(\w+).(\w+).(\d+)/
Though note that the .s in your regular expression technically match any character, not just the /:
/(specific_string)\/(\w+)\/(\w+)\/(\d+)/
This will have it match only slashes.
This one will match the second url:
"/(\w+)\/(\w+)\/(\w+)\/(\d+)/"
/\/specific_string\/(\w+).(\w+).(\d+)/
Just insert the specific_string in the regexp:
/specific_string\/(\w+)\/(\w+)\/(\d+)/
Another variant with the outer delimiters changed to avoid extraneous escaping:
preg_match("#/FIXED_STRING/(\w+)/(\w+)/(\d+)#", $_SERVER["REQUEST_URI"], $matches);
I would use something like this:
"/\/specific_string\/([^\/]+)\/([^\/]+)\/(\d+)/"
I use [^\/]+ because that will match anything that is not a slash. \w+ will work almost all the time, but this will also work if there is an unexpected character in the path somewhere. Also note that my regex requires the leading slash.
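To illustrate the difference, here is a small sketch comparing the two approaches on a path with a hyphen in one segment. The path and the variable names are hypothetical, and I use # as the delimiter to keep the patterns readable.

```php
<?php
// \w+ only accepts letters, digits and underscores in a segment.
$wordOnly = '#/specific_string/(\w+)/(\w+)/(\d+)#';
// [^/]+ accepts anything up to the next slash.
$anySegment = '#/specific_string/([^/]+)/([^/]+)/(\d+)#';

// Hypothetical path with a hyphen in the first dynamic segment.
$path = '/specific_string/foo-bar/baz/7';

$a = preg_match($wordOnly, $path);        // the hyphen breaks \w+
$b = preg_match($anySegment, $path, $m);  // [^/]+ copes with it

var_dump($a, $b);
print_r($m); // full match followed by the three captured segments
```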
If you want to get a little more complicated, the following regex will match both of the patterns you provided (the prefix group is optional, so it may appear at most once):
"/^(?:\/specific_string)?\/([^\/]+)\/([^\/]+)\/(\d+)$/"
This will match:
"/any_string/any_string/any_number"
"/specific_string/any_string/any_string/any_number"
but it will not match
"/some_other_string/any_string/any_string/any_number"
I have a regular expression to remove certain parts from a URI. However, it doesn't handle multiple parts correctly. Can somebody assist?
$regex = '~/{(.*?)}\*~';
$uri = '/user/{action}/{id}*/{subAction}*';
$newuri = preg_replace($regex, '' , $uri);
//$newuri = /user/
//Should be: $newuri = /user/{action}/
I know it matches the following part as one match:
/{action}/{id}/{subAction}
But it should match the following two separately:
/{id}*
/{subAction}*
To me it looks like your {(.*?)}\* test is matching all of {action}/{id}*, which judging from what you've written isn't what you want.
So stop the Kleene closure from running past a closing brace by matching only non-brace characters:
'~/{([^}]*)}\*~'
But do you really need to capture the part inside the curly braces? Seems to me you could go with this one instead:
'~/{[^}]*}\*~'
Either way, the [^}]* part cannot cross a closing brace, which guarantees that the expression will not match {action}/ because {action} is not followed by an asterisk.
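For comparison, here is a short sketch that applies both the original pattern and the non-capturing fix to the URI from the question. The variable names $lazy and $fixed are mine. Because the pattern includes the leading slash of each optional part, that slash is removed together with the part.

```php
<?php
$uri = '/user/{action}/{id}*/{subAction}*';

// Original pattern: .*? may run across "}/{" until it finds "}*".
$lazy = preg_replace('~/{(.*?)}\*~', '', $uri);

// Fixed pattern: [^}]* stops at the first closing brace.
$fixed = preg_replace('~/{[^}]*}\*~', '', $uri);

echo $lazy, PHP_EOL;  // {action} is swallowed along with {id}*
echo $fixed, PHP_EOL; // {action} survives
```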