Match and replace with the captured/matched characters - php

I currently have code that finds URLs and turns them into complete HTML links. It works fine, but now I need to update it so that an image URL is converted into an HTML img tag and displayed instead. The function I'm using now is...
function auto_link_text($text) {
$pattern = '#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#';
$callback = create_function('$matches', '
$url = array_shift($matches);
$url_parts = parse_url($url);
return sprintf(\'<a rel="nowfollow" target="_blank" href="%s">%s</a>\', $url, $url);
');
return preg_replace_callback($pattern, $callback, $text);
}
Got it from...
How to add anchor tag to a URL from text input
Here is an example of the text I would like it to process...
asdf
http://google.com/
asfd
http://yahoo.com/logo.jpg
http://www.apple.com/sdfsd.php?page_id=13&id=18210&status=active#1
http://youtube.com/logo.png
I would like the updated function to output...
asdf
<a rel="nowfollow" target="_blank" href="http://google.com/">http://google.com/</a>
asfd
<img src="http://yahoo.com/logo.jpg" class="example">
<a rel="nowfollow" target="_blank" href="http://www.apple.com/sdfsd.php?page_id=13&id=18210&status=active#1">http://www.apple.com/sdfsd.php?page_id=13&id=18210&status=active#1</a>
<img src="http://youtube.com/logo.png" class="example">
Big thanks in advance!

You can use this for example:
function create_anchor_tag($url, $text = false) {
if ($text===false) $text = $url;
return '<a rel="nofollow" target="_blank" href="' . $url . '">'
. $text . '</a>';
}
function create_image_tag($url) {
return '<img src="' . $url . '"/>';
}
function auto_link_text($text) {
$pattern = '~\b(?:(?:ht|f)tps?://|www\.)\S+(?<=[\PP?])~i';
$callback = function ($m) {
$img_ext = array('jpg', 'jpeg', 'gif', 'png');
$path = parse_url($m[0], PHP_URL_PATH);
$ext = substr(strrchr($path, '.'), 1);
if (in_array(strtolower($ext), $img_ext))
return create_image_tag($m[0]);
return create_anchor_tag($m[0]);
};
return preg_replace_callback($pattern, $callback, $text);
}
I used several functions to make it more clea[rn], but you can easily adapt it as you like.
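The functions above insert the matched URL into the markup unescaped and leave out the class attribute the question asked for. A minimal sketch of one way to handle both follows. The names render_url() and auto_link_text_escaped() are hypothetical, the "example" class is taken from the question's desired output, and the pattern is the one from the answer.

```php
<?php
// Sketch only: render_url() and auto_link_text_escaped() are hypothetical names.
function render_url(string $url, string $imgClass = 'example'): string
{
    $imgExt = ['jpg', 'jpeg', 'gif', 'png'];
    // parse_url() returns null when the URL has no path, so cast to string.
    $path = (string) parse_url($url, PHP_URL_PATH);
    $ext  = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    // Escape once so characters such as & and " are safe in attributes and text.
    $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

    if (in_array($ext, $imgExt, true)) {
        return '<img src="' . $safe . '" class="' . $imgClass . '">';
    }
    return '<a rel="nofollow" target="_blank" href="' . $safe . '">' . $safe . '</a>';
}

function auto_link_text_escaped(string $text): string
{
    // Same pattern as in the answer above.
    $pattern = '~\b(?:(?:ht|f)tps?://|www\.)\S+(?<=[\PP?])~i';
    return preg_replace_callback($pattern, function (array $m): string {
        return render_url($m[0]);
    }, $text);
}

echo auto_link_text_escaped("asdf http://yahoo.com/logo.jpg asfd"), "\n";
```

The extension check runs on the path only, so a query string after the file name does not hide an image URL.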

Here is a nice post about the most suitable regex pattern for matching valid URLs. I picked one from there to capture all the URLs.
Online demo
Steps to follow:
Simply extract the URL.
Check the URL and, based on your own logic, substitute the appropriate tag, as shown in the demo.
Sample code (captures all the valid URLs in groups; read them from index 1):
$re = "/(([A-Za-z]{3,9}:(?:\\/\\/)?(?:[-;:&=\\+\\$,\\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\\+\\$,\\w]+#)[A-Za-z0-9.-]+)((?:\\/[\\+~%\\/.\\w-_]*)?\\??(?:[-\\+=&;%#.\\w_]*)#?(?:[\\w]*))?)/";
$str = "...";
preg_match_all($re, $str, $matches);
Sample code (substitutes an anchor tag, or whatever you want to add):
$re = "/(([A-Za-z]{3,9}:(?:\\/\\/)?(?:[-;:&=\\+\\$,\\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\\+\\$,\\w]+#)[A-Za-z0-9.-]+)((?:\\/[\\+~%\\/.\\w-_]*)?\\??(?:[-\\+=&;%#.\\w_]*)#?(?:[\\w]*))?)/";
$str = "...";
$subst = '$1';
$result = preg_replace($re, $subst, $str);

Related

How to extract m3u8 of youtube by regex?

I have a PHP file that uses regex to extract the m3u8 link from YouTube. It worked fine until last week.
I pass the YouTube ID like this:
http://server.com/youtube.php?id=youtbueid
$string = get_data('https://www.youtube.com/watch?v=' . $channelid);
if(preg_match('#"hlsManifestUrl.":."(.*?m3u8)#', $string, $match)) {
$var1=$match[1];
$var1=str_replace("\/", "/", $var1);
$man = get_data($var1);
//echo $man;
preg_match_all('/(https:\/.*\/95\/.*index.m3u8)/U',$man,$matches, PREG_PATTERN_ORDER);
$var2=$matches[1][0];
header("Content-type: application/vnd.apple.mpegurl");
header("Location: $var2");
}
else {
preg_match_all('#itag.":([^,]+),."url.":."(.*?).".*?qualityLabel.":."(.*?)p."#', $string, $match);
//preg_match_all('#itag.":([^,]+),."url.":."(.*?).".*?bitrate.":.([^,]+),#', $string, $match);
$filter_keys = array_filter($match[3], function($element) {
return $element <= 720;
});
//print_r($filter_keys);
$max_key = array_keys($filter_keys, max($filter_keys))[0];
//print_r($max_key);
$urls = $match[2];
foreach($urls as &$url) {
$url = str_replace('\/', '/', $url);
$url = str_replace('\\\u0026', '&', $url);
}
print_r($urls[$max_key]);
header('location: ' . $urls[$max_key]);
}
How do I solve this problem?
Based on this post, I'm guessing that the desired URLs might look like:
and we can write a simple expression such as:
(.+\?v=)(.+)
We can also add more boundaries to it if necessary.
RegEx
If this expression isn't what you want, you can modify it at regex101.com.
RegEx Circuit
You can also visualize your expressions in jex.im:
PHP Test
$re = '/(.+\?v=)(.+)/m';
$str = ' https://www.youtube.com/watch?v=_Gtc-GtLlTk';
$subst = '$2';
$result = preg_replace($re, $subst, $str);
echo $result;
JavaScript Demo
This snippet shows that we likely have a valid expression:
const regex = /(.+\?v=)(.+)/gm;
const str = ` https://www.youtube.com/watch?v=_Gtc-GtLlTk`;
const subst = `$2`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
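The expression above captures everything after ?v=, including any parameters that follow the ID. As a sketch of adding more boundaries, the hypothetical helper extract_video_id() below stops at the first character that is not a word character or a hyphen. This assumes video IDs contain only those characters.

```php
<?php
// Hypothetical helper: returns the value of the v parameter, or null if absent.
function extract_video_id(string $url): ?string
{
    // [?&] anchors the match to a query parameter named v;
    // [\w-]+ stops at the next "&", "#", or any other delimiter.
    if (preg_match('/[?&]v=([\w-]+)/', $url, $m)) {
        return $m[1];
    }
    return null;
}

var_dump(extract_video_id('https://www.youtube.com/watch?v=_Gtc-GtLlTk&feature=share'));
```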

Joomla 3 content plugin: Foreach preg_replace $row->fulltext duplicates first string

I am setting up a new content plugin for Joomla 3 that should replace plugin tags with HTML content. Everything works fine until I run preg_replace on the plugin tags in $row->fulltext.
Here is the plugin code
public function onContentPrepare($context, &$row, &$params, $page = 0) {
$pattern = '#\{uni\}(.*){\/uni\}#sU';
preg_match_all($pattern, $row->fulltext, $matches, PREG_PATTERN_ORDER);
foreach($matches[1] as $k=>$uni){
preg_match('/\{uni-title\}(.*)[\{]/Ui', $uni, $unititle);
preg_match('/\{uni-text\}(.*)/si', $uni, $unitext);
$titleID = str_replace(' ', '_', trim($unititle[1]));
$newString = '<span id="'.$titleID.'">'.$unititle[1].'</span><div class="university-info-holder"><div class="university-info"><i class="icon icon-close"></i>'.$unitext[1].'</div></div>';
$row->fulltext = preg_replace($pattern,$newString,$row->fulltext);
}
}
Any ideas why it duplicates the first match as many times as the foreach loop runs?
Just to mention, if I do:
echo $unititle[1];
inside the foreach, the items aren't duplicated and are rendered as they should be.
There are a few problems with the original code.
It should be using $row->text instead of $row->fulltext. This is because Joomla merges the introtext and fulltext fields when rendering an article.
It's a mistake to use $pattern for the matching when making the substitution, because $pattern matches all of the items, so the first pass through the loop replaces every tag. Instead, use $matches[0][$k] to do the replacement. Use str_replace instead of preg_replace, because you are now matching the exact string and don't need a regex.
Here's the code for the whole thing.
class PlgContentLivefilter extends JPlugin{
public function onContentPrepare($context, &$row, &$params, $page = 0) {
return $this->renderUniInfo($row, $params, $page);
}
private function renderUniInfo(&$row, &$params, $page = 0) {
$pattern = '#\{uni\}(.*){\/uni\}#sU';
preg_match_all($pattern, $row->text, $matches);
foreach($matches[0] as $k=>$uni){
preg_match('/\{uni-title\}(.*)[\{]/Ui', $uni, $unititle);
preg_match('/\{uni-text\}(.*)/si', $uni, $unitext);
print_r($unititle[1]);
$title = $unititle[1];
$text = $unitext[1];
if (preg_match('#(?:http://)?(?:https://)?(?:www\.)?(?:youtube\.com/(?:v/|embed/|watch\?v=)|youtu\.be/)([\w-]+)?#i', $unitext[1], $match)) {
$video_id = $match[1];
$video_string = '<div class="videoWrapper"><iframe src="http://youtube.com/embed/'.$video_id.'?rel=0"></iframe></div>';
$unitext[1] = preg_replace('#(?:http://)?(?:https://)?(?:www\.)?(?:youtube\.com/(?:v/|embed/|watch\?v=)|youtu\.be/)([\w-]+)?#i', $video_string, $unitext[1]);
$text = $unitext[1];
}
$titleID = str_replace(' ', '_', trim($title));
$newString = '<span id="'.$titleID.'">'.$title.'</span><div class="university-info-holder"><div class="university-info"><i class="icon icon-close"></i>'.$text.'</div></div>';
$row->text = str_replace($matches[0][$k],$newString,$row->text);
}
}
}
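The difference between replacing by pattern and replacing the exact matched string can be seen in isolation. This is a minimal sketch on made-up sample text. Only the {uni} tag syntax comes from the question, and the b tags stand in for the real markup.

```php
<?php
// Made-up sample text; the {uni} tag syntax is from the question.
$text    = '{uni}A{/uni} and {uni}B{/uni}';
$pattern = '#\{uni\}(.*)\{/uni\}#sU';

preg_match_all($pattern, $text, $matches);

// Replacing by pattern inside the loop: the first iteration rewrites every tag,
// so later iterations find nothing left to replace.
$byPattern = $text;
foreach ($matches[1] as $inner) {
    $byPattern = preg_replace($pattern, '<b>' . $inner . '</b>', $byPattern);
}

// Replacing the exact matched string: each iteration touches only its own tag.
$byString = $text;
foreach ($matches[0] as $k => $whole) {
    $byString = str_replace($whole, '<b>' . $matches[1][$k] . '</b>', $byString);
}

echo $byPattern, "\n"; // <b>A</b> and <b>A</b>
echo $byString, "\n";  // <b>A</b> and <b>B</b>
```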

Remove specific line in PHP

I'm trying to get a YouTube video thumbnail with this
echo '<img src="http://i1.ytimg.com/vi/' . $video_id = $video_id[1]. '/maxresdefault.jpg" alt="' . $video_title . '" />';
but that returns:
http://i1.ytimg.com/vi/zbZu8cOTh_4&feature=youtube_gdata_player/maxresdefault.jpg
but I want it to return:
http://i1.ytimg.com/vi/zbZu8cOTh_4/maxresdefault.jpg
So I have to find a way to remove
"&feature=youtube_gdata_player"
from the url.
How do I do that?
I've tried
$video_id = str_replace('&feature=youtube_gdata_player', '', $video_id)
but I can't get that to work.
(I'm new to PHP, so I probably made some stupid error.)
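You can do it with a regex:
$string = "zbZu8cOTh_4&feature=youtube_gdata_player";
$pattern = '/(&[\w\W]+$)/i';
$replacement = '';
// preg_replace() returns the new string, so the result must be assigned
$result = preg_replace($pattern, $replacement, $string); // "zbZu8cOTh_4"
or with strrpos:
$string = "zbZu8cOTh_4&feature=youtube_gdata_player";
$p = strrpos($string, "&");
// substr() also returns a new string rather than modifying $string
$result = substr($string, 0, $p); // "zbZu8cOTh_4"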
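One caveat with the substr approach: if the string contains no "&", the position is false and substr() returns an empty string. A minimal sketch that guards this case is shown below. The helper name strip_query_suffix() is hypothetical, and it cuts at the first "&" rather than the last.

```php
<?php
// Hypothetical helper: keeps everything before the first "&", if there is one.
function strip_query_suffix(string $id): string
{
    $pos = strpos($id, '&');
    // strpos() returns false when "&" is absent; return the input unchanged then.
    return $pos === false ? $id : substr($id, 0, $pos);
}

echo strip_query_suffix('zbZu8cOTh_4&feature=youtube_gdata_player'), "\n";
echo strip_query_suffix('zbZu8cOTh_4'), "\n";
```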

Regex: Modify img src tags

I'm trying to replace all images in an HTML document with inline images (data: URIs).
I have some sample code that does not work:
function data_uri($filename) {
$mime = mime_content_type($filename);
$data = base64_encode(file_get_contents($filename));
return "data:$mime;base64,$data";
}
function img_handler($matches) {
$image_element = $matches[1];
$pattern = '/(src=["\'])([^"\']+)(["\'])/';
$image_element = preg_replace($pattern, create_function(
$matches,
$matches[1] . data_uri($matches[2]) . $matches[3]),
$image_element);
return $image_element;
}
$content = (many) different img tags
$search = '(<img\s+[^>]+>)';
$content = preg_replace_callback($search, 'img_handler', $content);
Could somebody check this code? Thanks!
UPDATE:
(...) Warning file_get_contents() [function.file-get-contents]: Filename cannot be empty (...)
That means the src url is not in the handler :(
UPDATE 2
<?php
function data_uri($filename) {
$mime = mime_content_type($filename);
$data = base64_encode(file_get_contents($filename));
return "data:$mime;base64,$data";
}
function img_handler($matches) {
$image_element = $matches[0];
$pattern = '/(src=["\'])([^"\']+)(["\'])/';
$image_element = preg_replace_callback($pattern, create_function(
$matchess,
$matchess[1] . data_uri($matchess[2]) . $matchess[3]),
$image_element);
return $image_element;
}
$content = '<img class="alignnone" src="http://upload.wikimedia.org/wikipedia/commons/thumb/4/44/Googlelogoi.png/180px-Googlelogoi.png" alt="google" width="580" height="326" title="google" />';
$search = '(<img\s+[^>]+>)';
$content = preg_replace_callback($search, 'img_handler', $content);
echo $content;
?>
I've uploaded this test file -> http://goo.gl/vWl9B
Your regex is all right, but you are using create_function() incorrectly, so the inner preg_replace_callback() doesn't work. The call to data_uri() happens before any regex replacement takes place, hence the empty-filename error.
Use a proper callback function:
$image_element = preg_replace_callback($pattern, "data_uri_callback", $image_element);
Then move your code into there:
function data_uri_callback($matchess) {
return $matchess[1] . data_uri($matchess[2]) . $matchess[3];
}
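Putting the pieces together with closures instead of named callbacks might look like the sketch below. The function inline_images() and its $toDataUri parameter are hypothetical. The demo passes a stand-in encoder so it runs without touching files, and the question's data_uri() could be passed in its place.

```php
<?php
// Sketch: $toDataUri is any callable that maps a src value to a data: URI.
function inline_images(string $html, callable $toDataUri): string
{
    $imgPattern = '(<img\s+[^>]+>)';                 // pattern from the question
    $srcPattern = '/(src=["\'])([^"\']+)(["\'])/';   // pattern from the question

    return preg_replace_callback($imgPattern, function (array $img) use ($srcPattern, $toDataUri): string {
        // $img[0] is the whole <img ...> tag; rewrite only its src attribute.
        return preg_replace_callback($srcPattern, function (array $m) use ($toDataUri): string {
            return $m[1] . $toDataUri($m[2]) . $m[3];
        }, $img[0]);
    }, $html);
}

// Stand-in encoder for the demo: it encodes the src string itself, not a file.
$fakeEncoder = function (string $src): string {
    return 'data:text/plain;base64,' . base64_encode($src);
};

echo inline_images('<p><img alt="x" src="a.png"></p>', $fakeEncoder), "\n";
```

Because the callbacks run only when a match is found, the encoder is never called with an empty file name.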

Extracting the text from a link

I have a small function that goes through a block of text, looks for any URLs in text form, and converts them into HTML anchor tags:
e.g
normal text lipsum etc http://www.somewebsitelink.com lipsum lipsum
becomes:
normal text lipsum etc <a target="_blank" href="http://www.somewebsitelink.com">http://www.somewebsitelink.com</a> lipsum lipsum
My function is as follows:
function linkify($text)
{
$text = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9#:%_\+.~#?&//=]+)',
'<a target="_blank" href="\\1">\\1</a>', $text);
$text = eregi_replace('([[:space:]()[{}])(www.[-a-zA-Z0-9#:%_\+.~#?&//=]+)',
'\\1<a target="_blank" href="http://\\2">\\2</a>', $text);
return $text;
}
This all works OK, but I print the HTML in a limited-width space, and sometimes links end up much bigger than can fit, so I end up with overflow.
I'm wondering how I might go about doing two things:
a. Remove the unnecessary parts of the text, i.e. 'http://', so I would end up with
www.somewebsitelink.com
and
b. If the text is longer than 20 characters, cut off everything after that and add a few dots, e.g.:
www.somewebsitelin...
I'm wondering if I might have to do this without regular expressions, but then again my understanding of regex is fairly limited.
$link = 'http://www.somewebsitelink.com';
function linkify($text, $maxLen = 15)
{
return preg_replace_callback('(((f|ht){1}tp://)([-a-zA-Z0-9#:%_\+.~#?&//=]+))', function($t) use ($maxLen) {
if ( strlen($t[3]) > $maxLen)
$t[3] = substr_replace($t[3], '...', $maxLen);
return sprintf('<a target="_blank" href="%s">%s</a>', $t[0], $t[3]);
}, $text);
}
header('content-type: text/plain');
echo linkify($link);
Code for PHP <= 5.2
$link = 'http://www.somewebsitelink.com';
function linkify($text, $maxLen = 15)
{
$funcBody = <<<FUNC
if ( strlen(\$t[3]) > \$maxLen)
\$t[3] = substr_replace(\$t[3], '...', \$maxLen);
return sprintf('<a target="_blank" href="%s">%s</a>', \$t[0], \$t[3]);
FUNC;
$func = create_function(
'$t, $maxLen =' . $maxLen,
$funcBody
);
return preg_replace_callback('(((f|ht){1}tp://)([-a-zA-Z0-9#:%_\+.~#?&//=]+))', $func, $text);
}
header('content-type: text/plain');
echo linkify($link);
Results in
<a target="_blank" href="http://www.somewebsitelink.com">www.somewebsite...</a>
I think this will do what you need. It takes a URL as its only parameter and removes the leading "http://www.", then returns the string if it is 20 characters or fewer. If it is longer than 20 characters, it returns the first 16 characters and appends '...'.
function get_formatted_url($url){
$url = trim($url);
$url = str_replace("http://www.", "", strtolower($url));
if(strlen($url) > 20){
return substr($url, 0, 16) . '...';
}else{
return $url;
}
}
Edit: An example using preg_replace_callback()
function get_formatted_url($url){
$url = $url[0];
$formatted_url = trim($url);
$formatted_url = str_replace("http://www.", "", strtolower($formatted_url));
if(strlen($formatted_url) > 20){
return '<a href="'.$url.'">'. substr($formatted_url, 0, 16) . '...</a> ';
}else{
return '<a href="'.$url.'">'. $formatted_url . '</a> ';
}
}
function linkify($text){
$reg = '(((f|ht)tp://)[-a-zA-Z0-9#:%_\+.~#?&//=]+)';
$text = preg_replace_callback($reg, "get_formatted_url", $text);
return $text;
}
$text = "abcd http://www.abc.com?hg=alkdjfa;lkdjfa;lkdjfa;lkdsjfa;ldks abcdefg http://www.abc.com";
echo linkify($text);
Use PHP's substr function. For example:
<?php echo substr($text, 0, 10); ?>
