preg_match an url without the sublinks

I have an expression that should match only /settings, not a path that contains a sublink such as /settings/test1.
Right now my expression matches only the paths that contain a sublink, not the one I want.
^\/settings\/
/settings
/settings/test1
/settings/test2
http://www.phpliveregex.com/p/jPr

Your pattern matches only the sublinks because it ends with \/, which requires a slash after settings. Remove that trailing \/ and anchor the pattern at the end of the string with the $ anchor:
^\/settings$
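A minimal sketch of the anchored pattern applied to the three sample paths from the question. The ~ delimiter and the variable names are choices made for this example, not part of the answer.

```php
<?php
// Anchored pattern from the answer; with ~ as delimiter the slash needs no escaping.
$pattern = '~^/settings$~';

$paths = ['/settings', '/settings/test1', '/settings/test2'];

// Collect only the paths that match the whole pattern.
$exact = [];
foreach ($paths as $path) {
    if (preg_match($pattern, $path) === 1) {
        $exact[] = $path;
    }
}

print_r($exact);
```

By default $ also matches just before a trailing newline, so if the input may end with a line break and that should be rejected, the D modifier or \z can be used instead.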

Related

PHP preg_replace_callback match string but exclude urls

What I'm trying to do is find all the matches within a content block, but ignore anything that is inside tags, for use inside preg_replace_callback().
For example:
test
test title
test
In this case, I want the first line to match, and the third line to match, but NOT the url match, nor the title match in between the a tags.
I've got a regex that I feel like is close:
#(?!<.*?)(\btest\b)(?![^<>]*?>)#si
(and this will not match the url part)
But how do I modify the regex to also exclude the "test" between a and /a?

If it's always the same pattern you can use [A-Z] or a combination like [A-Za-z]

I ended up solving it myself. This regex pattern will do what I wanted:
#(?!<a[^>]*?>)(\btest\b)(?![^<]*?<\/a>)#si
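
A minimal sketch of how this pattern might be used with preg_replace_callback(). The sample markup is an assumption, because the example in the question lost its tags; the href value and the bracket-wrapping callback are hypothetical.

```php
<?php
// Assumed input: plain "test", a link whose URL and text both contain "test", plain "test".
$html = "test\n<a href=\"http://test.com\">test title</a>\ntest";

// Pattern from the self-answer above.
$pattern = '#(?!<a[^>]*?>)(\btest\b)(?![^<]*?<\/a>)#si';

// Hypothetical callback: wrap every accepted match in square brackets.
$result = preg_replace_callback(
    $pattern,
    function (array $m): string {
        return '[' . $m[1] . ']';
    },
    $html
);

echo $result, "\n";
```

The second lookahead only sees text up to the next <, so a link that contains nested tags would not be excluded. For arbitrary HTML a DOM parser is likely the safer route.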

PHP Regex to match url contains url fragment

I have a URL fragment, page/login, and I need to know whether another URL contains it.
These should match:
/admin/page/login/
/admin/page/login
admin/page/login
http://www.dot.com/admin/page/login
/admin/page/login?id=10
/admin/page/login/id/10
/admin/page/login/?id=10
/admin/page/login/user?id=10
/admin/page/login/user/?id=10
page/login
page/login/
page/login/id/10
/page/login/id/10
And these should not:
/admin/firstpage/login
admin/page/loginOk
/admin/page/loginOk/id/10
mypage/login/id/10
/mypage/login/id/10
mypage/login
I tried page\/login[\/\s\?] and \/?page\/login[\/\s\?], without success.

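You can use a word boundary so that partial matches such as mypage/login are rejected.
\bpage\/login[\/\s?]
Demo: https://regex101.com/r/yhNsdw/1/
Note that the character class needs one character after login. In the demo that character is the line break, but a single string ending in login will not match unless you also allow the end of the string, e.g. (?:[\/\s?]|$).
Also, if you change your delimiter, none of the forward slashes need to be escaped.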
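
A minimal sketch that checks some of the sample URLs one string at a time. The function name is hypothetical, and the |$ branch is an addition to the answer's pattern so that a string ending right after login also matches.

```php
<?php
// Hypothetical helper: true when $url contains page/login preceded by a word boundary
// and followed by /, whitespace, ? or the end of the string.
function containsLoginFragment(string $url): bool
{
    // ~ as delimiter, so the forward slashes need no escaping.
    return preg_match('~\bpage/login(?:[/\s?]|$)~', $url) === 1;
}

$shouldMatch = [
    '/admin/page/login/',
    '/admin/page/login',
    'http://www.dot.com/admin/page/login',
    '/admin/page/login?id=10',
    'page/login',
    '/page/login/id/10',
];
$shouldNotMatch = [
    '/admin/firstpage/login',
    'admin/page/loginOk',
    'mypage/login/id/10',
    'mypage/login',
];

foreach (array_merge($shouldMatch, $shouldNotMatch) as $url) {
    printf("%-40s %s\n", $url, containsLoginFragment($url) ? 'match' : 'no match');
}
```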

Regular expression to replace all url from string but skip one

I have a regular expression that removes all URLs from a string, but I want to change it to add an exception for my own site's link.
$url = 'This is url for example to remove www.somewbsite.com but i want to skip removing this url www.mywebsite.com';
$no_url = preg_replace("/(https|http|ftp)\:\/\/|([a-z0-9A-Z]+\.[a-z0-9A-Z]+\.[a-zA-Z]{2,4})|([a-z0-9A-Z]+\.[a-zA-Z]{2,4})|\?([a-zA-Z0-9]+[\&\=\#a-z]+)/i", "★", $url);

First of all, since you are replacing with a hard-coded symbol, and you are using a case-insensitive modifier, your regex can be reduced to
'~(?:https?|ftp)://|(?:[a-z0-9]+\.)?[a-z0-9]+\.[a-z]{2,4}|\?[a-z0-9]+[&=#a-z]+~i'
whatever it is meant to match. Note that two of the alternatives, ([a-z0-9A-Z]+\.[a-z0-9A-Z]+\.[a-zA-Z]{2,4}) and ([a-z0-9A-Z]+\.[a-zA-Z]{2,4}), were nearly identical, so they are merged into one with the help of an optional non-capturing group, (?:[a-z0-9]+\.)?.
Now, if you want to avoid matching a specific pattern, you may use a SKIP-FAIL technique: match what you want to preserve and skip it.
'~www\.mywebsite\.com(*SKIP)(*FAIL)|(?:https?|ftp)://|(?:[a-z0-9]+\.)?[a-z0-9]+\.[a-z]{2,4}|\?[a-z0-9]+[&=#a-z]+~i'
See this regex demo.
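
A minimal sketch of the SKIP-FAIL pattern above, applied to the string from the question. Only the variable $pattern is new; everything else is taken from the question and the answer.

```php
<?php
$url = 'This is url for example to remove www.somewbsite.com but i want to skip removing this url www.mywebsite.com';

// The first alternative matches the URL to keep; (*SKIP)(*FAIL) then discards that
// match and stops the engine from retrying inside the skipped text.
$pattern = '~www\.mywebsite\.com(*SKIP)(*FAIL)|(?:https?|ftp)://|(?:[a-z0-9]+\.)?[a-z0-9]+\.[a-z]{2,4}|\?[a-z0-9]+[&=#a-z]+~i';

$no_url = preg_replace($pattern, '★', $url);

echo $no_url, "\n";
```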

How do I extract one group from a URL using regex for use in a redirect?

I've read the Best RegEx Trick Ever and tried to wrap my head around the other answers here on Stack Exchange and just can't seem to get it right. Take these three strings:
http://www.test.com/newyork/class-schedule
http://www.test.com/location/newyork/class-schedule
http://www.test.com/location/newyork/training
I need a regex that will extract the newyork from the first string and save it for a replace later, but will NOT match any part of the other strings. Also, for obscure reasons, I can not include http://www.test.com as a condition for matching (so I can't use anything before the slash that precedes newyork). Note that in this scenario, newyork could easily be chicago, atlanta, or any other city name with no spaces or punctuation.
The only thing I've been able to figure out that isolates only newyork in the first string is the following:
/.*\.com\/(.[^\/]*)\/class-schedule/g
However, this relies on using the URL first which I can't use.
Any ideas on how to achieve this WITHOUT using the URL?
[EDIT]
To clarify what I'm looking for, I'm trying to take the results from the first string and add "location" to it, still using regex. So:
http://www.test.com/newyork/class-schedule
would become
http://www.test.com/location/newyork/class-schedule
using something like
http://www.test.com/location/$1/class-schedule

Try this: ~/(\w+)/[-a-z]+?/?(?:\?.*?)*(:?\s|$)~gm
See it working here: https://regex101.com/r/4VMazZ/3.
It works from the end of the URL instead of the beginning and captures only the second-to-last path segment. A trailing query string does not break it.
[EDIT 1]
I had swapped two characters at the end, writing (:? instead of (?:, so the pattern captured one extra group. The corrected pattern is /(\w+)/[-a-z]+?/?(?:\?.*?)*(?:\s|$), here: https://regex101.com/r/4VMazZ/4
If you use preg_match($pattern, $string, $matches), the result you want (newyork) will be in $matches[1]; $matches[0] contains the whole match.
You can see the captures in the 'MATCH INFORMATION' panel on regex101 in my example.
[EDIT 2] after your comment.
If you want to replace the whole URL, you have to match the whole URL. Something like .*?/(\w+)/[-a-z]+?/?(?:\?.*?)*(?:\s|$) will do in this example. See it working here: https://regex101.com/r/4VMazZ/5
[EDIT 3] Add capturing of the last part for the replacement.
Since you want to reuse the last part, you need to add capturing parentheses around it: .*?/(\w+)/([-a-z]+?)/?(?:\?.*?)*(?:\s|$).
See it working here: https://regex101.com/r/4VMazZ/6
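
A minimal sketch of the EDIT 3 pattern in PHP, run on the first sample URL. The replacement string follows the form given in the question, with $2 standing in for the captured last segment; the variable names are hypothetical.

```php
<?php
// Pattern from EDIT 3: group 1 is the second-to-last path segment, group 2 the last one.
// PHP has no g modifier (preg_replace already replaces every match), so it is left out.
$pattern = '~.*?/(\w+)/([-a-z]+?)/?(?:\?.*?)*(?:\s|$)~';

$url = 'http://www.test.com/newyork/class-schedule';

// Extract the city on its own.
preg_match($pattern, $url, $matches);
echo $matches[1], "\n";

// Rebuild the URL with "location" inserted; the host is hard-coded as in the question.
$redirect = preg_replace($pattern, 'http://www.test.com/location/$1/$2', $url);
echo $redirect, "\n";
```

Because the pattern looks only at the last two path segments, it also matches the other two sample URLs and captures newyork there, so URLs that must be left alone would need to be filtered out separately.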

Could this work? See it here.
(?<=location\/|\.\w{3}\/|\.\w{2}\/)(?!location).*?(?=\/|$)
It matches everything following .xxx/ or .xx/ or location/. I don't know whether one-letter domains exist; if they do, you can add |\.\w\/ to the lookbehind at the start of the regex.
(?<=location\/|\.\w{3}\/|\.\w{2}\/) is a lookbehind, so the pattern that follows matches only if it is preceded by location/ or .xxx/ or .xx/
(?!location) is a negative lookahead, so the match cannot start with location
.*? matches every character (lazy)
(?=\/|$) ends the match if the next character is / or at the end of the line
Note: If location is counted as part of the url, I don't think what you are asking is possible in regex, as the city name could be anywhere in string. If so, then you could have a list of cities and check what part of the url matches one of them.
EDIT: You need the multiline m flag so $ also matches end of line
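
A minimal sketch that runs this lookbehind pattern against the three URLs from the question. The ~ delimiter and the variable names are choices made for this example.

```php
<?php
// Pattern from the answer above, with the m flag mentioned in its EDIT.
$pattern = '~(?<=location\/|\.\w{3}\/|\.\w{2}\/)(?!location).*?(?=\/|$)~m';

$urls = [
    'http://www.test.com/newyork/class-schedule',
    'http://www.test.com/location/newyork/class-schedule',
    'http://www.test.com/location/newyork/training',
];

// Record the first match found in each URL.
$found = [];
foreach ($urls as $url) {
    if (preg_match($pattern, $url, $m) === 1) {
        $found[] = $m[0];
        echo $url, ' => ', $m[0], "\n";
    }
}
```

The pattern finds newyork in all three URLs, so it isolates the city name rather than telling the first URL apart from the other two.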

php Regular Expression Issues - Can't remove/strip out and replace a string within a string

I have never worked with regular expressions before and I need them now and I am having some issues getting the expected outcome.
Consider this for example:
[x:3xerpz1z]Some Text[/x:3xerpz1z] Some More Text
Using the php preg_replace() function, I want to replace [x:3xerpz1z] with <start> and [/x:3xerpz1z] with </end> but I can't figure this out. I have read some regular expression tutorials but I am still confused.
I have tried this for the starting tag:
preg_replace('/(.*)\[x:/','<start>', $source_string);
The above would return:<start>3xerpz1z
As you can see, the "3xerpz1z" isn't getting removed and it needs to be stripped out. I can't hard code and search and replace "3xerpz1z" because the "3xerpz1z" chars are randomly generated and the characters are always different but the length of the tag is the same.
This is the desired output I want:
<start>Some Text</end> Some More Text
I haven't even tried processing [/x:3xerpz1z] because I can't get the first tag working.

You must use capturing groups (....):
$data = '[x:3xerpz1z]Some Text[/x:3xerpz1z] Some More Text';
$result = preg_replace('~\[x:([^]]+)](.*?)\[/x:\1]~s', '<start>$2</end>', $data);
pattern details:
~ # pattern delimiter: better than / here (no need to escape slashes)
\[x:
([^]]+) # capture group 1: all that is not a ]
]
(.*?) # capture group 2: content
\[/x:\1] # \1 is a backreference to the first capturing group
~s # s allows the dot to match newlines
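
The annotated breakdown above can be kept in the code itself by adding the x modifier, which makes PCRE ignore whitespace and # comments inside the pattern. The sketch below assumes that approach; the function name is hypothetical.

```php
<?php
// Hypothetical helper: replace each [x:ID]...[/x:ID] pair with <start>...</end>.
function replaceTags(string $data): string
{
    $pattern = '~
        \[x:          # literal opening of the start tag
        ([^]]+)       # group 1: the tag id, everything up to the next closing bracket
        ]             # literal closing bracket of the start tag
        (.*?)         # group 2: the content, as short as possible
        \[/x:\1]      # end tag, which must repeat the id captured in group 1
    ~sx';

    return preg_replace($pattern, '<start>$2</end>', $data);
}

echo replaceTags('[x:3xerpz1z]Some Text[/x:3xerpz1z] Some More Text'), "\n";
```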

We Keep Coding
