I have written the following function to replace a URL with a div that carries the tweet's ID.
function twitterIzer($string){
$pattern = '~https?://twitter\.com/.*?/status/(\d+)~';
$string = preg_replace($pattern, "<div class='tweet' id='tweet$1' tweetid='$1'></div>", $string);
return $string;
}
It works well when I use this type of URL:
https://twitter.com/Minsa_Peru/status/1260658846143401984
but it leaves an extra ?s=20 behind when I use this URL:
https://twitter.com/Minsa_Peru/status/1262730246668922885?s=20
How can I remove this ?s=20 text so that my function works? All I know is that I need to improve my regex pattern. Thank you.
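If you want a regex-only fix:
$pattern = '/https?:\/\/twitter\.com\/.*?\/status\/(\d+)(.*)?/';
Because ? is not a digit, (\d+) stops there and (.*) captures everything that follows, in this case ?s=xyz. The final question mark ? makes that group optional, so the URL matches whether or not the query string is present. Note that (.*) runs to the end of the line, so any text after the URL on the same line is removed as well.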
Learn regex
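A narrower variant can be sketched as follows. It is only an illustration, not the answerer's pattern: it assumes the query string never contains whitespace or <, and it swaps .*? for [^/\s]+ so the user name cannot span several path segments.

```php
<?php
// Sketch: replace a tweet URL, including an optional query string such as ?s=20.
function twitterIzer(string $string): string
{
    // (?:\?[^\s<]*)? consumes the query string but stops at whitespace or "<"
    // (an assumption about how the URLs appear in the surrounding text).
    $pattern = '~https?://twitter\.com/[^/\s]+/status/(\d+)(?:\?[^\s<]*)?~';
    return preg_replace($pattern, "<div class='tweet' id='tweet$1' tweetid='$1'></div>", $string);
}

echo twitterIzer('See https://twitter.com/Minsa_Peru/status/1262730246668922885?s=20 now');
```

Text following the URL on the same line ("now" in the example) is left in place, because the optional group ends at the first whitespace character.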
I apply two regexes to my input:
// replace a URL with a link which is like this pattern: [LinkName](LinkAddress)
$str= preg_replace("/\[([^][]*)]\(([^()]*)\)/", "<a href='$2' target='_blank'>$1</a>", $str);
// replace a regular URL with a link
$str = preg_replace("/(\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|])/i", "<a href='$1' target='_blank'>untitled</a>", $str);
Now there is a collision. Regular URLs work fine, but for pattern-based URLs the first regex creates a link, and then the second regex creates another link out of that link's href attribute value.
How can I fix it?
Edit: Following the comments, how can I use a single regex instead of those two (using preg_replace_callback)? I tried, but it doesn't work for either kind of URL.
Is combining them even possible? Their outputs aren't the same: the first one has a LinkName, while the second one uses the constant string untitled as its LinkName.
$str = preg_replace_callback('/\[([^][]*)]\(([^()]*)\)|(\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|])/i',
    function($matches) {
        if (isset($matches[3])) {
            // replace a regular URL with a link
            return "<a href='".$matches[3]."' target='_blank'>untitled</a>";
        } else {
            // replace a URL written as [LinkName](LinkAddress) with a link
            return "<a href='".$matches[2]."' target='_blank'>".$matches[1]."</a>";
        }
    }, $str);
echo $str;
One way would be to do it like this. You merge your two expressions with the alternation operator |. Then, in your callback function, you check whether the third capture group is set (isset($matches[3])). If it is, your second regular expression matched and you replace a normal link; otherwise you replace with link/linktext.
I hope this is clear and helps.
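As a rough, self-contained sketch of the merged approach, the callback can be wrapped in a function and run on a sample string. The function name linkify and the sample text are hypothetical, and the htmlspecialchars() calls are an added precaution that is not part of the answer.

```php
<?php
// Sketch: one pass over the input, so a link created from [LinkName](LinkAddress)
// is never processed a second time by the plain-URL alternative.
function linkify(string $str): string
{
    $pattern = '/\[([^][]*)]\(([^()]*)\)'
             . '|(\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|])/i';

    return preg_replace_callback($pattern, function (array $m): string {
        if (($m[3] ?? '') !== '') {
            // Plain URL: use the constant link text "untitled".
            $href = htmlspecialchars($m[3], ENT_QUOTES);
            return "<a href='" . $href . "' target='_blank'>untitled</a>";
        }
        // [LinkName](LinkAddress): group 1 is the name, group 2 the address.
        $href = htmlspecialchars($m[2], ENT_QUOTES);
        $name = htmlspecialchars($m[1], ENT_QUOTES);
        return "<a href='" . $href . "' target='_blank'>" . $name . "</a>";
    }, $str);
}

echo linkify('See [PHP](https://www.php.net) and https://example.com/a?b=1.');
```

Because both alternatives live in one pattern, each stretch of the input is matched at most once, which is what avoids the collision described in the question.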
I have a URL that looks something like this:
for-sale/stuff/state/used-bla-bla2-bla3-bla4-(bla5)---f10-85934.html
I'm trying to validate the format in my function using this regex:
if (preg_match('/(?:^|(?:\-))(\w+)/g', $pathInfo, $matches)) {
echo $digit = $matches[0];
}
$pathInfo is the url given above.
Basically, I want to check that:
the directory is for-sale/stuff/
the file name (used-bla-bla2-bla3-bla4-(bla5)---f10-85934.html) starts with either used or new and ends with an integer followed by .html
no spaces are present.
After validating, I want to get the ID, which in this case is 85934.
It seems you want something like this:
'~^for-sale/stuff/\S+/(?:used|new)\S*?(\d+)\.html$~'
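A minimal sketch of how that pattern could be used to validate the path and pull out the ID. The function name extractListingId is hypothetical, and returning null for an invalid path is an assumption about the desired behaviour.

```php
<?php
// Sketch: validate the path and return the trailing numeric ID, or null if invalid.
function extractListingId(string $pathInfo): ?int
{
    $pattern = '~^for-sale/stuff/\S+/(?:used|new)\S*?(\d+)\.html$~';
    if (preg_match($pattern, $pathInfo, $matches) === 1) {
        // Group 1 holds the digits directly before ".html".
        return (int) $matches[1];
    }
    return null;
}

var_dump(extractListingId('for-sale/stuff/state/used-bla-bla2-bla3-bla4-(bla5)---f10-85934.html'));
```

Since \S never matches whitespace, a path containing a space fails the check as a whole.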
I'd suggest this sample piece of code and the following regex:
$re = "~\\bfor\\-sale\\/stuff\\/[^<> ]*?\\/(?:used|new)[^/ ]*?\\-(\\d+)\\.html\\b~";
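$str = "for-sale/stuff/state/used-bla-bla2-bla3-bla4-(bla5)---f10-85934.html\n"; // sample URL from the question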
preg_match_all($re, $str, $matches);
Regex: \bfor\-sale\/stuff\/[^<> ]*?\/(?:used|new)[^/ ]*?\-(\d+)\.html\b
I assume you have several URLs to validate inside a larger string of text, so I suggest using \b. I also assume each URL sits inside some tag, so I use [^<> ]*? to keep the match inside that tag.
The ID will be in the first capturing group (captured by \d+).
Spaces are also disallowed by [^<> ]*? and [^/ ]*?.
I've created my own newsletter module and have come across one (big) problem.
The system appends extra parameters to all URLs to keep track of the clicks in Google Analytics.
e.g.
A url like this
http://www.domain.com
becomes like this
http://www.domain.com/&utm_source=newsletter&utm_medium=e-mail&utm_campaign=test
and a url like this
http://www.domain.com/?page=1
becomes like this
http://www.domain.com/?page=1&utm_source=newsletter&utm_medium=e-mail&utm_campaign=test
The first example is broken. I know the first ampersand has to be replaced by a question mark, and that's where the problem occurs.
I'm using this pattern to extract URLs:
$pattern = array('#[a-zA-Z]+://([-]*[.]?[a-zA-Z0-9_/-?&%\{\}])*#');
$replace = array('\\0&utm_source=newsletter&utm_medium=e-mail&utm_campaign=test');
$body = preg_replace($pattern,$replace,$body);
Can anybody help me with a correct, working regex, so that the first URL parameter is always preceded by a question mark instead of an ampersand?
Just use:
if(strpos($string,'?') !== false)
//add with ampersand
else
//add with question mark
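The check above can be combined with the URL extraction in a single callback. The following is only a sketch: the function name addTracking and the simplified URL pattern are assumptions, and it does not handle URLs with a #fragment or trailing punctuation.

```php
<?php
// Sketch: append the tracking parameters with "?" or "&", depending on
// whether the matched URL already has a query string.
function addTracking(string $body): string
{
    $params = 'utm_source=newsletter&utm_medium=e-mail&utm_campaign=test';

    return preg_replace_callback(
        '#https?://[^\s"\'<>]+#i', // simplified URL pattern (assumption)
        function (array $m) use ($params): string {
            $separator = (strpos($m[0], '?') === false) ? '?' : '&';
            return $m[0] . $separator . $params;
        },
        $body
    );
}

echo addTracking('http://www.domain.com/'), "\n";
echo addTracking('http://www.domain.com/?page=1'), "\n";
```

Deciding on the separator while the URL is being matched avoids a second clean-up pass over the body.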
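This works on a single URL without a complex regex. It checks for a ? and, if none is found, changes the first & to a question mark (str_replace cannot limit the number of replacements, so preg_replace with a limit of 1 is used):
$url = (strpos($url, '?') !== false) ? $url : preg_replace('/&/', '?', $url, 1);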
A very simple approach would be to look for a string like http://...& where the ... contains no ? question mark or other delimiters:
$src = preg_replace('#(http://[^\s"\'<>?&]+)&#', '$1?', $src);
But it's probably best if you use a restricted instead of a negated character class:
$src = preg_replace('#(http://[\w/.]+)&#', '$1?', $src);
This solution fixes all urls which have a query beginning with a & (and are missing the ?):
$re = '%([a-zA-Z]+://[^?&\s]+)&(utm_source=newsletter)%';
$body = preg_replace($re, '$1?$2', $body);
I use a preg_replace to auto insert HTML links within paragraphs.
Here's what I currently use:
$pattern = "~(?!(?:[^<\[]+[>\]]|[^>\]]+<\/a>))(".preg_quote($find_keyword, '/').")\b~msUi";
$replacement = "\$0";
$article_content = preg_replace($pattern, $replacement, stripslashes($article_content), 1, $added );
It works great, except 1 problem:
It doesn't match and replace if the keyword is a URL.
If: $find_keyword="http://www.mysite.com/" it won't come up with any matches even though it's in the content.
I already tried escaping $find_keyword with preg_quote, which didn't make any difference.
Does any regex expert know a solution? Thanks.
The forward slashes in your $find_keyword are not escaped, which breaks the pattern.
You can run $find_keyword through preg_quote:
$find_keyword=preg_quote("http://www.mysite.com/", '/');
http://www.php.net/manual/en/function.preg-quote.php
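Two details may be worth checking here, sketched below under stated assumptions. First, the delimiter passed to preg_quote() should be the one the pattern actually uses. Second, \b cannot match between a trailing "/" and a following space, since neither is a word character, so a keyword ending in "/" never satisfies it. The sample $content, the simplified pattern (without the tag-skipping lookahead) and the [$0] placeholder replacement are illustrative only.

```php
<?php
$find_keyword = 'http://www.mysite.com/';
$content = 'Visit http://www.mysite.com/ today.'; // hypothetical sample text

// Quote the keyword for the delimiter used below (~).
$quoted = preg_quote($find_keyword, '~');

// (?!\w) means "not followed by a word character"; unlike \b it also
// holds when the keyword ends in a non-word character such as "/".
$pattern = '~' . $quoted . '(?!\w)~i';

// [$0] stands in for the real link markup.
$result = preg_replace($pattern, '[$0]', $content, 1, $added);

echo $result, "\n";
```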
I have a feeling that I might be missing something very basic. Anyway, here's the scenario:
I'm using preg_replace to convert ===inputA===inputB=== to <a href="inputB">inputA</a>
This is what I'm using:
$new = preg_replace('/===(.*?)===(.*?)===/', '<a href="$2">$1</a>', $old);
It works fine, but I also need to restrict inputB further, like this:
preg_replace('/[^\w]/', '', every Link or inputB);
So basically, where you see $2 in the first snippet, I need to process that $2 so that it only contains \w characters, as in the second snippet. The final result should look like this:
Convert ===The link===link's page=== to <a href="linkspage">The link</a>
I have no idea how to do this. What should I do?
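Although there is already an accepted answer: this is what the /e modifier or preg_replace_callback() are for (the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so only the callback version works on current PHP):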
echo preg_replace(
'/===(.*?)===(.*?)===/e',
'"$1"',
'===inputA===in^^putB===');
//Output: inputA
Or:
function _my_url_func($vars){
return ''.$vars[2].'';
}
echo preg_replace_callback(
'/===(.*?)===(.*?)===/',
'_my_url_func',
'===inputA===inputB===');
//Output: inputB
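Put together, the callback route might look like the sketch below. It assumes the desired output is an HTML anchor whose href is the cleaned second part; that markup, like the function name convertLinks, is an assumption and not confirmed by the thread.

```php
<?php
// Sketch: clean the second capture before building the replacement.
function convertLinks(string $old): string
{
    return preg_replace_callback(
        '/===(.*?)===(.*?)===/',
        function (array $m): string {
            // Keep only word characters (\w) in the second part.
            $target = preg_replace('/[^\w]/', '', $m[2]);
            // Assumed output format: an anchor pointing at the cleaned value.
            return '<a href="' . $target . '">' . $m[1] . '</a>';
        },
        $old
    );
}

echo convertLinks("===The link===link's page===");
```

The callback receives the same capture groups that $1 and $2 refer to in a plain replacement string, which is why arbitrary processing can be applied to each one before the result is assembled.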
Try preg_match with the first pattern to get the two matches into variables, and then use preg_replace() on the one that needs further checks.
Why don't you extract the matches from the first regex (with preg_match), process those results, and then put them back into HTML form?