Regex syntax issue [duplicate] - php

This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
Finetune Regex to skip tags
Currently my function looks like this. It converts plain text URLs into HTML links.
function UrlsToLinks($text){
    return preg_replace('#(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.-]*(\?\S+)?)?)?)#', '<a href="$1">$1</a>', $text);
}
But there are some problems. What I'm trying to do is skip existing links, the src attribute in <img> tags, etc. I can't figure out what I need to modify in this function.

This would work, assuming that the URLs we want to replace do not also appear inside a tag (for example in an href or src attribute), because str_replace() replaces every occurrence in the original text.
function UrlsToLinks($text){
    $matches = array();
    // look for URLs only in the text that remains after removing the tags
    $strippedText = strip_tags($text);
    preg_match_all('#(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.-]*(\?\S+)?)?)?)#', $strippedText, $matches);
    foreach ($matches[0] as $match) {
        if (filter_var($match, FILTER_VALIDATE_URL)) {
            $text = str_replace($match, '<a href="'.$match.'">'.$match.'</a>', $text);
        }
    }
    return $text;
}
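One way to skip existing links and tag attributes in a single pass is to let the regex consume them first and then discard that match with the PCRE verbs (*SKIP)(*FAIL). The following is only a sketch: the function name urlsToLinksSkippingTags is hypothetical, the URL part is deliberately simplified compared to the pattern above, and a real HTML parser would be more robust on messy markup.

```php
<?php
// Sketch: link bare URLs, leaving <a>...</a> elements and tag attributes untouched.
function urlsToLinksSkippingTags(string $text): string
{
    // 1st alternative: a whole existing link, matched and then thrown away.
    // 2nd alternative: any other tag (this covers src="..." in <img>), also thrown away.
    // 3rd alternative: a bare URL (simplified pattern, an assumption of this sketch).
    $pattern = '~<a\b[^>]*>.*?</a>(*SKIP)(*FAIL)|<[^>]+>(*SKIP)(*FAIL)|https?://[^\s<>"\']+~is';

    return preg_replace($pattern, '<a href="$0">$0</a>', $text);
}

$html = 'See http://example.com/a and <a href="http://x.org">http://x.org</a> <img src="http://y.org/i.png">';
echo urlsToLinksSkippingTags($html), "\n";
```

Only the first URL is wrapped here; the existing link and the image source are left as they were.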

Related

PHP Remove anything after specific characters (file extensions) [duplicate]

This question already has answers here:
How to remove the querystring and get only the URL?
(16 answers)
Closed 2 years ago.
I would like to remove anything that follows a specific set of characters (i.e. file types / extensions). I have tried numerous scripts I found online, but none really do what I need: they either remove the file extension as well, or keep parts of the arguments that follow.
$urls = array(
'http://www.example.com/images/image1.jpg',
'http://www.example.com/images/image2.png?arg=value',
'http://www.example.com/images/image3.jpg?foo=bar',
'http://www.example.com/images/image4.gif?v=1',
'http://www.example.com/images/image5.bmp?x=y',
'http://www.example.com/images/image6.tiff?werdfs=234234'
);
Desired outcome:
http://www.example.com/images/image1.jpg
http://www.example.com/images/image2.png
http://www.example.com/images/image3.jpg
http://www.example.com/images/image4.gif
http://www.example.com/images/image5.bmp
http://www.example.com/images/image6.tiff
Maybe this will help you.
// greedy match up to the last dot that is followed by letters (the extension)
$re = '/^.*(?:\.)[a-zA-Z]+/m';
$urls = array(
    'http://www.example.com/images/image1.jpg',
    'http://www.example.com/images/image2.png?arg=value',
    'http://www.example.com/images/image3.jpg?foo=bar',
    'http://www.example.com/images/image4.gif?v=1',
    'http://www.example.com/images/image5.bmp?x=y',
    'http://www.example.com/images/image6.tiff?werdfs=234234',
    'asdasd'
);
foreach ($urls as $url) {
    preg_match($re, $url, $matches);
    if ($matches) {
        echo $matches[0];
        echo "\n";
    }
}
Output
http://www.example.com/images/image1.jpg
http://www.example.com/images/image2.png
http://www.example.com/images/image3.jpg
http://www.example.com/images/image4.gif
http://www.example.com/images/image5.bmp
http://www.example.com/images/image6.tiff
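How about PHP's parse_url()? It splits the URL into components, so the scheme, host, and path can be joined back together without the query string:
$inName = $urls[0]; // for example
// basename() is not needed here: the path component already ends with the file name
$newName = parse_url($inName, PHP_URL_SCHEME) . '://'
    . parse_url($inName, PHP_URL_HOST)
    . parse_url($inName, PHP_URL_PATH);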
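If the only goal is to drop the query string, a regex is not strictly needed. As a minimal sketch, assuming every unwanted part starts at the first "?", each URL can be split once at that character:

```php
<?php
$urls = [
    'http://www.example.com/images/image1.jpg',
    'http://www.example.com/images/image2.png?arg=value',
    'http://www.example.com/images/image6.tiff?werdfs=234234',
];

$clean = array_map(function (string $url): string {
    // Keep everything before the first "?" (the whole string if there is none).
    return explode('?', $url, 2)[0];
}, $urls);

print_r($clean);
```

This does not validate the URL or handle "#fragment" parts; parse_url() is the better fit when those matter.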

get the word inside a special character and append to an array [duplicate]

This question already has answers here:
Extracting all values between curly braces regex php
(3 answers)
Closed 3 years ago.
I have an email template that contains placeholder strings like this:
<p>My name is {name}</p>
Later on, when I trigger the email function, I can just call file_get_contents() and use str_replace() to replace that part with the correct value:
$temp = file_get_contents(__DIR__.'/template/advertise.html');
$temp = str_replace('{img}','mysite.com/assets/img/face.jpg',$temp);
What I want this time is to get the words inside {} and append them to an array $words = [];
What I've tried is the explode technique, e.g.
$word = 'Hello {img}, age {age}, I live in {address}';
$word = explode('}',$word);
$words = [];
foreach( $word as $w ){
    $words[] = explode('{',$w)[1];
}
print_r($words);
Is there a better way to do this? Any ideas?
Try the following:
$word = 'Hello {img}, age {age}, I live in {address}';
if (preg_match_all('/\{([^\}]+)\}/', $word, $matches, PREG_PATTERN_ORDER)) {
    var_dump($matches[1]);
}
It should return all the variable names inside the curly braces.
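The same pattern can also drive the replacement step, which avoids one str_replace() call per placeholder. This is a sketch only: the $values array and its contents are made up for illustration, and unknown placeholders are left untouched by choice.

```php
<?php
$template = 'Hello {name}, age {age}, I live in {address}';

// Collect the placeholder names, as in the answer above.
preg_match_all('/\{([^{}]+)\}/', $template, $matches);
$words = $matches[1];

// Hypothetical replacement values; {address} is intentionally missing.
$values = ['name' => 'Ann', 'age' => '30'];

$filled = preg_replace_callback('/\{([^{}]+)\}/', function (array $m) use ($values): string {
    // $m[1] is the name inside the braces, $m[0] the whole "{name}" token.
    return array_key_exists($m[1], $values) ? $values[$m[1]] : $m[0];
}, $template);

print_r($words);
echo $filled, "\n";
```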

How to clean duplicate BB tags [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
[i][b][i][b](This is a paragraph with BBcode.)[/b][/i][/b][/i]
Some of my BB code has duplicated tags. What's the best way to remove them?
I've tried a few things, mostly regex, but I'm honestly a novice when it comes to regex.
This is absolutely horrible, but it works.
<?php
$bb = '[i][b][i][b](This is a paragraph with BBcode.)[/b][/i][/b][/i]';
// regex with start, paragraph, and end capture groups
$regex = '#(?<start>(\[[a-z]*\])*+)(?<paragraph>.*)(?<end>(\[\/[a-z]*\])*+)#U';
// put matches into $matches array
preg_match_all($regex, $bb, $matches);
// get the stuff we need
$start = $matches['start'][0]; // string(12) "[i][b][i][b]"
$paragraph = implode('', $matches['paragraph']);
// now we will grab each tag
$regex = '#\[(?<tag>[a-z])\]#';
preg_match_all($regex, $start, $matches);
$tags = array_unique($matches['tag']);
// and build up the new string
$newString = '';
foreach($tags as $tag) {
$newString .= '[' . $tag . ']';
}
// create the end tags
$end = str_replace('[', '[/', $newString);
// put it all together
$newString .= $paragraph . $end;
echo $newString; // [i][b](This is a paragraph with BBcode.)[/i][/b]
Which gives you [i][b](This is a paragraph with BBcode.)[/i][/b]
Check it here https://3v4l.org/O8UHO
You can try to extract all the tags with a regexp (maybe like /\[.*\]/U) and then iterate through them, removing all the duplicates.
Quick example:
$bb = '[i][b][i][b](This is a paragraph with BBcode.)[/b][/i][/b][/i]';
// PHP has no "g" modifier; preg_match_all() already finds every match
preg_match_all('/(\[.*\])/U', $bb, $tags);
$metTags = [];
foreach($tags[0] as $tag) {
    // tag has not been met yet, save it
    if (in_array($tag, $metTags) === false) {
        $metTags[] = $tag;
    // tag has already been met
    } else {
        // remove it ONCE
        $bb = preg_replace('#'.preg_quote($tag, '#').'#', '', $bb, 1);
    }
}
echo $bb;
This is probably not the best solution because it relies on preg_replace(), but it does the job.
Edit: added code.
So I wrote a long-winded method in PHP only. It basically loops through all the characters, and when I find a "[" I check the rest for accepted tags. I end up with an array of all the styles and the plain text, which allows me to remove the duplicate styles. I would show the code, but I don't want to get laughed at :D
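The code for that last approach was not posted, so the following is only a guess at what a tokenizing version might look like. The function name dedupeBbTags and the list of accepted tags are assumptions. It drops an opening tag that is already open and then drops the next closing tag of the same name, which keeps the remaining tags properly nested.

```php
<?php
// Hypothetical helper, not the answerer's code.
function dedupeBbTags(string $bb, array $allowed = ['b', 'i', 'u']): string
{
    // Split into tag tokens and text tokens, keeping the tags.
    $tokens = preg_split('#(\[/?[a-z]+\])#', $bb, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $open = [];    // tag name => true while that tag is open
    $skipped = []; // tag name => number of duplicate openers dropped so far
    $out = '';

    foreach ($tokens as $token) {
        if (!preg_match('#^\[(/?)([a-z]+)\]$#', $token, $m) || !in_array($m[2], $allowed, true)) {
            $out .= $token; // plain text or a tag we do not handle
            continue;
        }
        $isClosing = $m[1] === '/';
        $name = $m[2];

        if (!$isClosing) {
            if (!empty($open[$name])) {
                // Duplicate opener: drop it and remember to drop one closer.
                $skipped[$name] = ($skipped[$name] ?? 0) + 1;
                continue;
            }
            $open[$name] = true;
            $out .= $token;
        } else {
            if (($skipped[$name] ?? 0) > 0) {
                $skipped[$name]--; // closer that belongs to a dropped opener
                continue;
            }
            unset($open[$name]);
            $out .= $token;
        }
    }
    return $out;
}

echo dedupeBbTags('[i][b][i][b](This is a paragraph with BBcode.)[/b][/i][/b][/i]'), "\n";
```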

Regex to extract youtube embed url from an iframe in a string [duplicate]

This question already has answers here:
how to extract links and titles from a .html page?
(6 answers)
Closed 3 years ago.
I know there are already a lot of questions about this, but I've tried many of the answers and can't get it quite where I need it.
I need a regex that will extract a youtube url from a string that contains an iframe.
Sample text:
<p>
</p><p>Garbage text</p><p><iframe width="560" height="315" src="//www.youtube.com/embed/PZlJFGgFTfA" frameborder="0" allowfullscreen=""></iframe></p>
Here's the regex I came up with:
(\bhttps?:)?\/\/[^,\s()<>]+(?:\([\w\d]+\)|(?:[^,[:punct:]\s]|\/))
Regex101 test
I'm using it in a function and it returns an empty array. Does anyone have an idea what's wrong with my function?
function extractEmbedYT($str) {
    preg_match('/(\bhttps?:)?\/\/[^,\s()<>]+(?:\([\w\d]+\)|(?:[^,[:punct:]\s]|\/))/', $str, $matches, PREG_OFFSET_CAPTURE, 0);
    return $matches;
}
EDIT 1: Changed the capture group in my regex so it doesn't capture the last character.
EDIT 2: Added some PHP code for context, since it works in Regex101 but not in my script.
You need to convert the capturing group to a non-capturing one:
/(\bhttps?:)?\/\/[^,\s()<>]+(?:\(\w+\)|(?:[^,[:punct:]\s]|\/))/s
^^^
Also, in the code, you need to pass $string to the function, not $str:
function stripEmptyTags ($result)
{
    $regexps = array (
        '~<(\w+)\b[^\>]*>([\s]|&nbsp;)*</\\1>~',
        '~<\w+\s*/>~',
    );
    do
    {
        $string = $result;
        $result = preg_replace ($regexps, '', $string);
    }
    while ($result != $string);
    return $result;
}
function extractEmbedYT($str) {
    // Find all URLS in $str
    preg_match_all('/(\bhttps?:)?\/\/[^,\s()<>]+(?:\(\w+\)|(?:[^,[:punct:]\s]|\/))/s', $str, $matches);
    // Remove all iframes from $str
    $str = preg_replace('/<iframe.*?<\/iframe>/i','', $str);
    $str = stripEmptyTags($str);
    return [$str, $matches[0]];
}
$string = '<p>
</p><p>UDA Stagiaire</p><p><iframe width="560" height="315" src="//www.youtube.com/embed/PZlJFGgFTfA" frameborder="0" allowfullscreen=""></iframe></p>';
$results = extractEmbedYT($string);
print_r($results);
See the online PHP demo.
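If only the embed URL is needed, a narrower pattern anchored on the iframe's src attribute may be easier to reason about than a general URL regex. This is a sketch under assumptions: the function name extractYoutubeEmbedUrls is hypothetical, the attribute value is assumed to be quoted, and a DOM parser would be the more robust choice for arbitrary HTML.

```php
<?php
// Hypothetical helper: return the src of every iframe that points at a YouTube embed.
function extractYoutubeEmbedUrls(string $html): array
{
    // Group 1 is the quote character, group 2 the attribute value.
    preg_match_all('~<iframe\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1~is', $html, $m);

    // Keep only sources that look like YouTube embed URLs.
    return array_values(array_filter($m[2], function (string $src): bool {
        return strpos($src, 'youtube.com/embed/') !== false;
    }));
}

$string = '<p>Garbage text</p><p><iframe width="560" height="315" src="//www.youtube.com/embed/PZlJFGgFTfA" frameborder="0" allowfullscreen=""></iframe></p>';
print_r(extractYoutubeEmbedUrls($string));
```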

Using a PHP function in conjunction with a regular expression match [duplicate]

This question already has answers here:
calling function inside preg_replace thats inside a function
(2 answers)
Closed 8 years ago.
I'm trying to come up with a way to find all RegEx matches in a string, then run all of those matches through a function that I've written, but I'm having no luck.
Specifically, I'm trying to find all email addresses in a string and then use a function to convert those addresses into useful mailto links that hide the addresses from spam bots.
So I start with plain old RegEx to turn the addresses into mailto links, just so I know that the matches are working.
$pattern = '#([0-9a-z]([-_.]?[0-9a-z])*@[0-9a-z]([-.]?[0-9a-z])*\\.[a-wyz][a-z](fo|g|l|m|mes|o|op|pa|ro|seum|t|u|v|z)?)#i';
$replacement = "<a href='mailto:\\1'>\\1</a>";
$description = preg_replace($pattern, $replacement, $description);
Works great. So far, so good. But when I try to use my function to manipulate the address string, email addresses are no longer matched.
$pattern = '#([0-9a-z]([-_.]?[0-9a-z])*@[0-9a-z]([-.]?[0-9a-z])*\\.[a-wyz][a-z](fo|g|l|m|mes|o|op|pa|ro|seum|t|u|v|z)?)#i';
$replacement = myFunction('\\1');
$description = preg_replace($pattern, $replacement, $description);
What am I doing wrong?
$description = preg_replace_callback($pattern, 'myFunction', $description);
Check preg_replace_callback() in the PHP manual to understand how it works.
Your myFunction() function should be written like this:
function myFunction($matches)
{
    // $matches[1] holds the text captured by the first group (the address)
    return sprintf('<a href="mailto:%s">%s</a>', $matches[1], $matches[1]);
}
Using preg_replace_callback() with a closure (PHP 5.3+), for a myFunction() that takes the address string instead of the match array:
$description = preg_replace_callback($pattern, function ($matches) {
    return myFunction($matches[1]);
}, $description);
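Putting the pieces together, here is a small end-to-end sketch. The obfuscate() helper is hypothetical (the asker's own function was not shown) and simply encodes each character as a decimal HTML entity; the email pattern is a simplified stand-in for the one in the question.

```php
<?php
// Hypothetical obfuscator: encode every character as a decimal HTML entity.
function obfuscate(string $s): string
{
    $out = '';
    foreach (str_split($s) as $ch) {
        $out .= '&#' . ord($ch) . ';';
    }
    return $out;
}

// Simplified email pattern (an assumption of this sketch).
$pattern = '/[0-9a-z][-_.0-9a-z]*@[0-9a-z][-.0-9a-z]*\.[a-z]{2,}/i';
$description = 'Contact bob@example.com for details.';

// The callback runs once per match; $m[0] is the matched address.
$result = preg_replace_callback($pattern, function (array $m): string {
    $encoded = obfuscate($m[0]);
    return '<a href="' . obfuscate('mailto:') . $encoded . '">' . $encoded . '</a>';
}, $description);

echo $result, "\n";
```

The reason the original attempt failed is that myFunction('\\1') runs once, before preg_replace() is called, on the literal string \1; the callback form defers the call until a match is available.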
