substr() to preg_replace() matches php - php

I have two functions in PHP, trimmer($string,$number) and toUrl($string). I want to trim the link text of the URLs extracted with toUrl() to, for example, 20 characters: from https://www.youtube.com/watch?v=HU3GZTNIZ6M to https://www.youtube.com/wa...
function trimmer($string,$number) {
$string = substr ($string, 0, $number);
return $string."...";
}
function toUrl($string) {
$regex="/[^\W ]+[^\s]+[.]+[^\" ]+[^\W ]+/i";
$string= preg_replace($regex, "<a href='\\0'>".trimmer("\\0",20)."</a>",$string);
return $string;
}
The problem is that the match is only available as the back-reference \\0, not as a variable like $url that could easily be trimmed with trimmer(). When trimmer("\\0",20) runs, it receives the literal string "\\0", not the matched URL.
The question is: how do I apply substr() to \\0, something like substr("\\0",0,20)?

What you want is preg_replace_callback:
function _toUrl_callback($m) {
return "<a href='" . $m[0] . "'>" . trimmer($m[0], 20) . "</a>";
}
function toUrl($string) {
$regex = "/[^\W ]+[^\s]+[.]+[^\" ]+[^\W ]+/i";
$string = preg_replace_callback($regex, "_toUrl_callback", $string);
return $string;
}
Also note (side notes on your question):
You have an error: '$regex' in single quotes is not going to work, because variables are not interpolated in single-quoted strings.
You may want to look for a better regex to match URLs; you'll find plenty of them with a quick search.
You may want to run your matches through htmlspecialchars() (the main problem is "&", but that depends on how you escape the rest of the string).
EDIT: Made it more PHP 4 friendly, as requested by the asker.
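As a rough end-to-end sketch of the advice above (callback plus htmlspecialchars()), something like the following could be used. The URL pattern here is a deliberately simplified stand-in, not the pattern from the question, and trimmer() is tweaked so that it only appends "..." when text was actually cut; both are assumptions made for illustration.

```php
<?php
// Truncate $string to $number characters; "..." is added only if something was cut.
// (Assumption: the original trimmer() always appended "...".)
function trimmer($string, $number) {
    if (strlen($string) <= $number) {
        return $string;
    }
    return substr($string, 0, $number) . "...";
}

// Callback for preg_replace_callback(): $m[0] holds the full matched text.
function toUrlCallback($m) {
    $href  = htmlspecialchars($m[0], ENT_QUOTES);
    $label = htmlspecialchars(trimmer($m[0], 20), ENT_QUOTES);
    return "<a href='" . $href . "'>" . $label . "</a>";
}

function toUrl($string) {
    // Simplified pattern (assumption): http:// or https:// followed by non-space characters.
    $regex = '~https?://\S+~i';
    return preg_replace_callback($regex, 'toUrlCallback', $string);
}

echo toUrl("See https://www.youtube.com/watch?v=HU3GZTNIZ6M now");
```

The href keeps the full URL while only the visible link text is shortened, which is what the question was after.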

Related

PHP:preg_replace function

$text = "
<tag>
<html>
HTML
</html>
</tag>
";
I want to run all the text inside the tags through htmlspecialchars(). I tried this:
$regex = '/<tag>(.*?)<\/tag>/s';
$code = preg_replace($regex,htmlspecialchars($regex),$text);
But it doesn't work.
The output is the htmlspecialchars() of the regex pattern itself. I want the htmlspecialchars() of the data that matches the pattern instead.
What should I do?
You're replacing the match with the pattern itself, you're not using the back-references and the e-flag, but in this case, preg_replace_callback would be the way to go:
$code = preg_replace_callback($regex,'htmlspecialchars',$text);
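This passes the matches to htmlspecialchars and uses its return value as the replacement. However, the callback always receives an array (index 0 is the full match, followed by the captured groups), which htmlspecialchars cannot handle directly, so wrap it in a function of your own: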
function replaceCallback($matches)
{
if (is_array($matches))
{
$matches = implode ('', array_slice($matches, 1));//first element is full string
}
return htmlspecialchars($matches);
}
Or, if your PHP version permits it:
$code = preg_replace_callback($regex, function($matches)
{
$return = '';
for ($i=1, $j = count($matches); $i<$j;$i++)
{//loop like this, skips first index, and allows for any number of groups
$return .= htmlspecialchars($matches[$i]);
}
return $return;
}, $text);
Try any of the above until you find something that works. Incidentally, if all you want to remove is <tag> and </tag>, why not go for the much faster:
echo htmlspecialchars(str_replace(array('<tag>','</tag>'), '', $text));
That's just keeping it simple, and it'll almost certainly be faster, too.
If you want to isolate the actual contents as defined by your pattern, you could use preg_match($regex,$text,$hits);. This gives you an array of hits: $hits[0] contains the whole matched string, and the bits that were between the parentheses in the pattern start at $hits[1]. You can then manipulate these matches, possibly using htmlspecialchars, and combine them again into $code.
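To make the difference between the two approaches concrete, here is a minimal sketch using the $text and $regex from the question. The callback name escapeTagContents is made up for this example.

```php
<?php
$text  = "<tag>\n<html>\nHTML\n</html>\n</tag>";
$regex = '/<tag>(.*?)<\/tag>/s';

// $matches[0] is the whole match, $matches[1] the text captured between the tags.
function escapeTagContents($matches) {
    return htmlspecialchars($matches[1]);
}

// Variant 1: replace every <tag>...</tag> block with its escaped contents.
$code = preg_replace_callback($regex, 'escapeTagContents', $text);
echo $code;

// Variant 2: only extract the first block's contents, then escape them.
$extracted = '';
if (preg_match($regex, $text, $hits)) {
    $extracted = htmlspecialchars($hits[1]);
}
echo $extracted;
```

With a single <tag> block both variants produce the same string; the callback version is the one that scales to several blocks in one input.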

php replace regular expression instead of string replace

I'm trying to give my client the ability to call a function that returns various code snippets by inserting a shortcode in their WYSIWYG editor.
For example, they will write something like...
[getSnippet(1)]
This will call my getSnippet($id) php function and output the appropriate 'chunk'.
It works when I hard code the $id like this...
echo str_replace('[getSnippet(1)]',getSnippet(1),$rowPage['sidebar_details']);
However, I really want to make the '1' dynamic. I'm sort of on the right track with something like...
function getSnippet($id) {
if ($id == 1) {
echo "car";
}
}
$string = "This [getSnippet(1)] is a sentence.This is the next one.";
$regex = '#([getSnippet(\w)])#';
$string = preg_replace($regex, '. \1', $string);
//If you want to capture more than just periods, you can do:
echo preg_replace('#(\.|,|\?|!)(\w)#', '\1 \2', $string);
Not quite working :(
Firstly in your regex you need to add literal parentheses (the ones you have just capture \w but that will not match the parentheses themselves):
$regex = '#(\[getSnippet\((\w)\)\])#';
I also escaped the square brackets, otherwise they will open a character class. Also be aware that this captures only one character for the parameter!
But I recommend you use preg_replace_callback, with a regex like this:
function getSnippet($id) {
if ($id == 1) {
return "car";
}
}
function replaceCallback($matches) {
return getSnippet($matches[1]);
}
$string = preg_replace_callback(
'#\[getSnippet\((\w+)\)\]#',
'replaceCallback',
$string
);
Note that I changed the echo in your getSnippet to a return.
Within the callback, $matches[1] will contain the first captured group, which in this case is your parameter (which now allows for multiple characters). Of course, you could also adjust your getSnippet function to read the id from the $matches array instead of redirecting through replaceCallback.
But this approach here is slightly more flexible, as it allows you to redirect to multiple functions. Just as an example, if you changed the regex to #\[(getSnippet|otherFunction)\((\w+)\)\]# then you could find two different functions, and replaceCallback could find out the name of the function in $matches[1] and call the function with the parameter $matches[2]. Like this:
function getSnippet($id) {
...
}
function otherFunction($parameter) {
...
}
function replaceCallback($matches) {
return $matches[1]($matches[2]);
}
$string = preg_replace_callback(
'#\[(getSnippet|otherFunction)\((\w+)\)\]#',
'replaceCallback',
$string
);
It really depends on where you want to go with this. The important thing is, there is no way of processing an arbitrary parameter in a replacement without using preg_replace_callback.
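Because the shortcode text comes from an editor, calling whatever name appears in $matches[1] can be risky. One possible hardening, sketched below, is to check the name against an explicit allow-list before calling it. The bodies of getSnippet() and otherFunction() are placeholders invented for this example.

```php
<?php
function getSnippet($id) {
    // Placeholder content (assumption): the real function would look the snippet up.
    return $id == 1 ? "car" : "";
}

function otherFunction($parameter) {
    // Hypothetical second shortcode handler, for illustration only.
    return strtoupper($parameter);
}

function replaceCallback($matches) {
    // Only names listed here may be called from user-supplied text.
    $allowed = array('getSnippet', 'otherFunction');
    if (!in_array($matches[1], $allowed, true)) {
        return $matches[0]; // leave unknown shortcodes untouched
    }
    return call_user_func($matches[1], $matches[2]);
}

$string = "This [getSnippet(1)] is a sentence. This is [otherFunction(next)].";
$result = preg_replace_callback('#\[(\w+)\((\w+)\)\]#', 'replaceCallback', $string);
echo $result;
```

Here the regex accepts any function-like name and the allow-list decides what is actually executed, so adding a new shortcode only means adding one entry to $allowed.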

What is wrong with my Regex Function?

It seems like such a simple matter. At the bottom of my code is an array that uses the result of this function. It doesn't seem to be working, and I think the problem might be my regex for scanning meta keywords. So basically, I want to know what my function is doing wrong, or how to write a fully working regex.
function getKeywords($link) {
$str2 = file_get_contents($link);
if (strlen($str2)>0) {
preg_match_all( '(?i)<meta\\s+name=\"keywords\"\\s+content=\"(.*?)\">', $str2, $keywords);
return $keywords[1];
}
}
Try this:
function getKeywords($link) {
$str2 = file_get_contents($link);
if (strlen($str2)>0) {
if(preg_match( '/<meta\s+name="keywords"\s+content="(.*?)">/i', $str2, $keywords))
return $keywords[1];
else
return "";
}
}
You had multiple problems with your expression:
1). Too many escape characters for \s and \"
2). You didn't start and end the pattern with a / delimiter
3). You were using preg_match_all instead of preg_match
4). You didn't handle the case where no keywords meta tag is found.
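The four points above can be tried out without a network request by working on an HTML string directly. The sketch below is one way to do that; extractKeywords() is a made-up helper name, and splitting the content on commas is an assumption about what the caller wants.

```php
<?php
// Works on an HTML string so it can be tested without file_get_contents().
function extractKeywords($html) {
    // Delimited with /.../ and made case-insensitive with the i modifier.
    $pattern = '/<meta\s+name="keywords"\s+content="(.*?)"\s*\/?>/i';
    if (preg_match($pattern, $html, $m)) {
        // Split the comma-separated list and trim each entry.
        return array_map('trim', explode(',', $m[1]));
    }
    // No keywords meta tag found.
    return array();
}

$html = '<head><meta name="keywords" content="php, regex ,preg_match"></head>';
print_r(extractKeywords($html));
```

Like the regex in the answer, this only matches when name comes directly before content; for arbitrary HTML a real parser would be more robust.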

Linkify Regex Function PHP Daring Fireball Method

So, I know there are a ton of related questions on SO, but none of them are quite what I'm looking for. I'm trying to implement a PHP function that will convert text URLs from a user-generated post into links. I'm using the 'improved' Regex from Daring Fireball towards the bottom of the page: http://daringfireball.net/2010/07/improved_regex_for_matching_urls
The function does not return anything, and I'm not sure why.
<?php
if ( false === function_exists('linkify') ):
function linkify($str) {
$pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
return preg_replace($pattern, "<a href=\"\\0\">\\0</a>", $str);
}
endif;
?>
Can someone please help me get this to work?
Thanks!
Try this:
$pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`\!()\[\]{};:\'".,<>?«»“”‘’]))';
return preg_replace("!$pattern!i", "<a href=\"\\0\">\\0</a>", $str);
PHP's preg functions require delimiters around the pattern. The i at the end makes it case-insensitive.
Update
If you use # as the delimiter, you won't need to escape the ! in the pattern, so you can use the original pattern string as is (the pattern does not contain a #): "#$pattern#i"
Update 2
To ensure that the links are correct, do this:
$pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
return preg_replace_callback("#$pattern#i", function($matches) {
$input = $matches[0];
$url = preg_match('!^https?://!i', $input) ? $input : "http://$input";
return '<a href="' . $url . '">' . $input . '</a>';
}, $str);
This now prepends http:// to URLs that lack a scheme, so that the browser doesn't treat them as relative links.
I was looking to just get the URLs from a string, using the same regex from the answer above by d_inevitable. I wasn't looking to turn them into links and didn't care about the rest of the string; I only wanted the URLs within it, so this is what I did. Hope it helps.
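For a quick check of the delimiter and scheme handling described above, here is a small self-contained sketch. It uses a much simpler stand-in pattern instead of the Daring Fireball regex, and linkifySimple() is a made-up name, so treat it as an illustration of the mechanics only.

```php
<?php
function linkifySimple($str) {
    // Simplified stand-in pattern (assumption): http(s):// URLs or bare www. hosts.
    $pattern = '(?:https?://|www\.)[^\s<>"]+';
    // '#' is the delimiter, so the slashes in the pattern need no escaping.
    return preg_replace_callback("#$pattern#i", function ($matches) {
        $input = $matches[0];
        // Add a scheme when the match has none, so the link is not treated as relative.
        $url = preg_match('!^https?://!i', $input) ? $input : "http://$input";
        return '<a href="' . htmlspecialchars($url) . '">' . htmlspecialchars($input) . '</a>';
    }, $str);
}

echo linkifySimple("Go to www.example.com/page or https://example.org");
```

The visible link text stays as the user typed it, while the href always carries a scheme.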
/**
* Returns the urls in an array from a string.
* This does NOT return the string, only the urls within it.
*/
function get_urls($str){
$regex = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
preg_match_all("#$regex#i", $str, $matches);
$urls = $matches[0];
return $urls;
}

regex for breadcrumb in php

I am currently building breadcrumb. It works for example for
http://localhost/researchportal/proposal/
<?php
$url_comp = explode('/',substr($url,1,-1));
$end = count($url_comp);
print_r($url_comp);
foreach($url_comp as $breadcrumb) {
$landing="http://localhost/";
$surl .= $breadcrumb.'/';
if(--$end)
echo '
<a href='.$landing.''.$surl.'>'.$breadcrumb.'</a>»';
else
echo '
'.$breadcrumb.'';
};?>
But when I typed in http://localhost////researchportal////proposal//////////
All the formatting was gone as it confuses my code.
I need to have the site path in an array like ([1]->researchportal, [2]->proposal)
regardless of how many slashes I put.
So can $url_comp = explode('/',substr($url,1,-1)); be turned into a regular expression to get my desired output?
You don't need a regex. Look at htmlentities() and stripslashes() in the PHP manual. A regex match only tells you whether the string matches the pattern, so on its own it won't really help you achieve what you are trying to do: all it lets you say is "if the string matches, do something". If you use a regex requiring at least 2 characters between each slash, then any time someone puts more than one consecutive slash in there, the match fails and the if statement stops.
http://ca3.php.net/manual/en/function.stripslashes.php
http://ca3.php.net/manual/en/function.htmlentities.php
Found this in the PHP manual.
It uses simple str_replace() calls; modifying it should achieve what your post is asking.
<?php
function stripslashes2($string) {
$string = str_replace("\\\"", "\"", $string);
$string = str_replace("\\'", "'", $string);
$string = str_replace("\\\\", "\\", $string);
return $string;
}
?>
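For the repeated-slash case in the question itself, one possible regex-based sketch is to split on runs of slashes and drop empty pieces with preg_split(). The helper name pathSegments() is made up, and the input is assumed to be the path part of the URL, as $url appears to be in the question.

```php
<?php
// Hypothetical helper: returns the non-empty segments of a path,
// no matter how many consecutive slashes separate them.
function pathSegments($path) {
    // '#/+#' matches one or more slashes; PREG_SPLIT_NO_EMPTY drops empty pieces.
    return preg_split('#/+#', $path, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(pathSegments('////researchportal////proposal//////////'));
print_r(pathSegments('/researchportal/proposal/'));
```

Both calls yield the same two segments, indexed from 0, so the breadcrumb loop from the question can run over the result unchanged.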
