Including a literal string in the regex

Current URLs:
http://domain.com/index.php?route=common/home
http://domain.com/index.php?route=account/register
http://domain.com/index.php?route=checkout/cart
http://domain.com/index.php?route=checkout/checkout
Desired URLs:
http://domain.com/home
http://domain.com/register
http://domain.com/cart
http://domain.com/checkout
Regex:
(?=\=)(.*?)(?<=\/).+$
... almost works, but for the last URL, for example, it matches only =checkout/, whereas I need it to match index.php?route= as well so that I can remove the whole index.php?route=checkout/ from the URL.
I tried index.php?route=(?=\=)(.*?)(?<=\/).+$ but of course it doesn't work.

To remove the unwanted part, substitute:
index\.php\?route=[^\/]*\/
with an empty string. The dots are escaped so that they match a literal dot rather than any character.
To be stricter, you can anchor the match with a lookbehind:
(?<=http:\/\/domain\.com\/)index\.php\?route=[^\/]*\/
Check the regex here: https://regex101.com/r/sY7aV6/1
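
As a minimal sketch of that substitution in PHP, assuming the four URLs from the question and using ~ as the delimiter so that the slashes need no escaping (the variable names are illustrative only):

```php
<?php
// Matches "index.php?route=" followed by one path segment and its slash.
$pattern = '~index\.php\?route=[^/]*/~';

$urls = [
    'http://domain.com/index.php?route=common/home',
    'http://domain.com/index.php?route=account/register',
    'http://domain.com/index.php?route=checkout/cart',
    'http://domain.com/index.php?route=checkout/checkout',
];

// preg_replace() accepts an array subject and returns an array of results.
$clean = preg_replace($pattern, '', $urls);

print_r($clean);
```

Only the first path segment after route= is removed, so checkout/checkout becomes checkout rather than an empty string.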

Related

PHP preg_replace_callback match string but exclude urls

What I'm trying to do is find all the matches within a content block, but ignore anything that is inside tags, for use inside preg_replace_callback().
For example:
test
test title
test
In this case, I want the first line and the third line to match, but NOT the URL or the title inside the <a> tag, nor the text between the <a> tags.
I've got a regex that I feel like is close:
#(?!<.*?)(\btest\b)(?![^<>]*?>)#si
(and this will not match the url part)
But how do I modify the regex to also exclude the "test" between <a> and </a>?

If it's always the same pattern you can use [A-Z] or a combination like [A-Za-z]

I ended up solving it myself. This regex pattern will do what I wanted:
#(?!<a[^>]*?>)(\btest\b)(?![^<]*?<\/a>)#si
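
A short sketch of how this pattern might be used with preg_replace_callback(). The sample markup in $html is an assumption, since the original example did not survive, and the uppercase callback is only there to make the replaced occurrences visible:

```php
<?php
// Hypothetical input: "test" appears in plain text, in attributes, and in link text.
$html = 'test <a href="http://test.com" title="test">test title</a> test';

$pattern = '#(?!<a[^>]*?>)(\btest\b)(?![^<]*?<\/a>)#si';

$result = preg_replace_callback(
    $pattern,
    function (array $m): string {
        // $m[1] is the captured word; mark it so the effect is easy to see.
        return strtoupper($m[1]);
    },
    $html
);

echo $result, PHP_EOL;
```

The trailing negative lookahead rejects any occurrence that can reach </a> without crossing another <, which covers both the attribute values and the link text in this sample.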

PHP regex last occurrence of words

My string is: /var/www/domain.com/public_html/foo/bar/folder/another/..
I want to remove the root folder from this string to get only the public folder, because some servers host multiple websites.
My actual regex is: /^(.*?)(www|public_html|public|html)/s
My actual result is: /domain.com/public_html/foo/bar/folder/another/..
But I want to remove everything up to the last occurrence and get something like this: /foo/bar/folder/another/..
Thanks!

You have to use a greedy quantifier and to check if the alternative is enclosed between slashes using lookarounds:
/^.*(?<![^\/])(?:www|public(?:_html)?|html)(?![^\/])/
About the lookarounds: I use negative lookarounds with a negated character class to check if there is a slash or the limit of the string at the same time. This way you are sure that for instance html is a folder and not the part of another folder name.
I removed the s modifier that is useless. I removed the capture groups too since the goal is to replace all with an empty string.
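
A minimal sketch of this replacement, using the path from the question. The second path is a made-up example to show that a folder name merely starting with html is not treated as the public folder:

```php
<?php
// '~' as delimiter, so the slashes inside the pattern need no escaping.
$pattern = '~^.*(?<![^/])(?:www|public(?:_html)?|html)(?![^/])~';

$path = '/var/www/domain.com/public_html/foo/bar/folder/another/..';
$relative = preg_replace($pattern, '', $path);

// Hypothetical path: "htmlstuff" is not a whole folder name, so only "www" qualifies.
$other = preg_replace($pattern, '', '/var/www/site/htmlstuff/x');

echo $relative, PHP_EOL;
echo $other, PHP_EOL;
```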

The ? makes your expression non-greedy which is not actually what you want here. Try:
^(.*)(www|public_html|public|html)
which should keep going until the last match.
Demo: https://regex101.com/r/v5WbB3/1/
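
To make the difference concrete, here is a small sketch comparing the lazy and greedy versions on the path from the question (variable names are illustrative):

```php
<?php
$path = '/var/www/domain.com/public_html/foo/bar/folder/another/..';

// Lazy: stops at the first alternative it can find ("www").
$lazy = preg_replace('~^(.*?)(www|public_html|public|html)~', '', $path);

// Greedy: consumes as much as possible, so the last alternative wins.
$greedy = preg_replace('~^(.*)(www|public_html|public|html)~', '', $path);

echo $lazy, PHP_EOL;
echo $greedy, PHP_EOL;
```

Unlike the lookaround version, this pattern does not check that the matched word is a whole folder name.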

PHP Regex to match url contains url fragment

I have a URL fragment, page/login, and I need to know whether another URL contains it.
These, will match:
/admin/page/login/
/admin/page/login
admin/page/login
http://www.dot.com/admin/page/login
/admin/page/login?id=10
/admin/page/login/id/10
/admin/page/login/?id=10
/admin/page/login/user?id=10
/admin/page/login/user/?id=10
page/login
page/login/
page/login/id/10
/page/login/id/10
And these not:
/admin/firstpage/login
admin/page/loginOk
/admin/page/loginOk/id/10
mypage/login/id/10
/mypage/login/id/10
mypage/login
I tried: page\/login[\/\s\?], \/?page\/login[\/\s\?] without any result

You can use a word boundary so partial matches aren't matched.
\bpage\/login[\/\s?]
Demo: https://regex101.com/r/yhNsdw/1/
Also if you change your delimiter none of the forward slashes will need to be escaped.
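
One caveat when each URL is tested on its own rather than in a multi-line block: the character class requires a character after login, so a string that ends right after login would not match. A possible variation, sketched below, swaps the class for a negative lookahead that also accepts the end of the string. The helper name containsPageLogin is hypothetical, and this is one way to handle the case, not the answer's exact pattern:

```php
<?php
// Hypothetical helper: true if $url contains the fragment "page/login"
// as whole path segments.
function containsPageLogin(string $url): bool
{
    // \b rejects "mypage/login"; the lookahead rejects "loginOk"
    // while still allowing "/", "?", whitespace, or end of string.
    return preg_match('~\bpage/login(?![^/?\s])~', $url) === 1;
}

$shouldMatch = ['/admin/page/login/', 'admin/page/login', '/admin/page/login?id=10', 'page/login'];
$shouldFail  = ['/admin/firstpage/login', 'admin/page/loginOk', 'mypage/login/id/10'];

foreach (array_merge($shouldMatch, $shouldFail) as $url) {
    echo $url, ' => ', containsPageLogin($url) ? 'match' : 'no match', PHP_EOL;
}
```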

Regular expression to replace all url from string but skip one

I have a regular expression that removes all URLs from a string, but I want to change it and add an exception for my own site's link.
$url = 'This is url for example to remove www.somewbsite.com but i want to skip removing this url www.mywebsite.com';
$no_url = preg_replace("/(https|http|ftp)\:\/\/|([a-z0-9A-Z]+\.[a-z0-9A-Z]+\.[a-zA-Z]{2,4})|([a-z0-9A-Z]+\.[a-zA-Z]{2,4})|\?([a-zA-Z0-9]+[\&\=\#a-z]+)/i", "★", $url);

First of all, since you are replacing with a hard-coded symbol and using the case-insensitive modifier, your regex can be reduced to
'~(?:https?|ftp)://|(?:[a-z0-9]+\.)?[a-z0-9]+\.[a-z]{2,4}|\?[a-z0-9]+[&=#a-z]+~i'
whatever it is meant to match. Note that two of the alternatives, ([a-z0-9A-Z]+\.[a-z0-9A-Z]+\.[a-zA-Z]{2,4}) and ([a-z0-9A-Z]+\.[a-zA-Z]{2,4}), were too similar; they are merged into one with the help of an optional non-capturing group, (?:[a-z0-9]+\.)?.
Now, if you want to avoid matching a specific pattern, you may use a SKIP-FAIL technique: match what you want to preserve and skip it.
'~www\.mywebsite\.com(*SKIP)(*FAIL)|(?:https?|ftp)://|(?:[a-z0-9]+\.)?[a-z0-9]+\.[a-z]{2,4}|\?[a-z0-9]+[&=#a-z]+~i'
See this regex demo.
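
A minimal sketch of the SKIP-FAIL pattern applied to the string from the question (the variable names follow the question's code):

```php
<?php
$url = 'This is url for example to remove www.somewbsite.com but i want to skip removing this url www.mywebsite.com';

// The first alternative matches the URL to keep, then (*SKIP)(*FAIL) discards
// that match and resumes scanning after it; the rest match URLs to replace.
$pattern = '~www\.mywebsite\.com(*SKIP)(*FAIL)|(?:https?|ftp)://|(?:[a-z0-9]+\.)?[a-z0-9]+\.[a-z]{2,4}|\?[a-z0-9]+[&=#a-z]+~i';

$no_url = preg_replace($pattern, '★', $url);

echo $no_url, PHP_EOL;
```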

Regex pattern to match any character except the last one

I am trying to match a string using two different patterns to work together.
My source string is something like this:
Text, white-spaces, new lines and more text then ^^^^<customtag>
I need to get a group (the second one) that would capture one caret or none then a formatted HTML-like tag. So the first group would capture anything else.
It means that the string above should output this:
(Group 1)Text, white-spaces, new lines and more text then ^^^
(Group 2)^<customtag>
In the source string there may be no carets, one, or up to two thousand.
I need a good pattern that matches all those carets except the last one.
The code below is what I tried.
preg_match_all('/([\s\S]*\^*)(\^?<\w+>)$/', $string, $matches);
Please note: I used [\s\S] instead of the dot to match any character as well as white-spaces and new lines too.

You may follow the below regex:
(?s)(.*)((\^|(?<!\^))<[^>]+>)
Live demo
PHP code:
preg_match_all('/(?s)(.*)((\^|(?<!\^))<[^>]+>)/', $string, $matches);
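
The pattern in the question fails because the greedy first group consumes every caret and the optional \^? then matches nothing. A short sketch of the suggested pattern on an assumed sample string, using preg_match() for a single result (the sample text and variable names are illustrative):

```php
<?php
// Assumed sample input with a newline and four carets before the tag.
$string = "Text,\nmore text then ^^^^<customtag>";

// (.*) is greedy and backtracks only as far as needed: the tag must be
// preceded either by one captured caret or by no caret at all.
preg_match('/(?s)(.*)((\^|(?<!\^))<[^>]+>)/', $string, $m);

echo $m[1], PHP_EOL; // everything before the last caret
echo $m[2], PHP_EOL; // the last caret, if any, plus the tag

// With no caret in front of the tag, group 2 holds the tag alone.
preg_match('/(?s)(.*)((\^|(?<!\^))<[^>]+>)/', 'plain text <customtag>', $n);
echo $n[2], PHP_EOL;
```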

You can use this:
preg_match_all('/(.*)((\^<[^>]*>)|([^\^]<[^>]*>))$/', $string, $matches);
See it working here: http://regexr.com?383g9
In this other link it is working fine: http://regex101.com/r/eQ3vV7

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.
