What is wrong with my Regex Function? - php

It seems like such a simple matter. Anyways, at the bottom of my code is an array which uses the outcome of the function. Now it doesn't seem to be working and i think it might be my regex for scanning Meta Keywords. So basically, I want to know what my function is doing wrong or how to create a fully working regex code
function getKeywords($link) {
$str2 = file_get_contents($link);
if (strlen($str2)>0) {
preg_match_all( '(?i)<meta\\s+name=\"keywords\"\\s+content=\"(.*?)\">', $str2, $keywords);
return $keywords[1];
}
}

try this:
function getKeywords($link) {
$str2 = file_get_contents($link);
if (strlen($str2)>0) {
if(preg_match( '/<meta\s+name="keywords"\s+content="(.*?)">/i', $str2, $keywords))
return $keywords[1];
else
return "";
}
}
You had multiple problems with your expression:
1). To many escape chars for \s and \"
2). You didn't lead with a / and end with a /
3). You were using preg_match_all instead of preg_match
4). You didn't handle the case when no keyword meta tag can be found.

Related

substr() to preg_replace() matches php

I have two functions in PHP, trimmer($string,$number) and toUrl($string). I want to trim the urls extracted with toUrl(), to 20 characters for example. from https://www.youtube.com/watch?v=HU3GZTNIZ6M to https://www.youtube.com/wa...
function trimmer($string,$number) {
$string = substr ($string, 0, $number);
return $string."...";
}
function toUrl($string) {
$regex="/[^\W ]+[^\s]+[.]+[^\" ]+[^\W ]+/i";
$string= preg_replace($regex, "<a href='\\0'>".trimmer("\\0",20)."</a>",$string);
return $string;
}
But the problem is that the value of the match return \\0 not a variable like $url which could be easily trimmed with the function trimmer().
The Question is how do I apply substr() to \\0 something like this substr("\\0",0,20)?
What you want is preg_replace_callback:
function _toUrl_callback($m) {
return "" . trimmer($m[0], 20) ."";
}
function toUrl($string) {
$regex = "/[^\W ]+[^\s]+[.]+[^\" ]+[^\W ]+/i";
$string = preg_replace_callback($regex, "_toUrl_callback", $string);
return $string;
}
Also note that (side notes wrt your question):
You have a syntax error, '$regex' is not going to work (they don't replace var names in single-quoted strings)
You may want to look for better regexps to match URLs, you'll find plenty of them with a quick search
You may want to run through htmlspecialchars() your matches (mainly problems with "&", but that depends how you escape the rest of the string.
EDIT: Made it more PHP 4 friendly, requested by the asker.

php replace regular expression instead of string replace

I'm trying to give my client the ability to call a function that has various code snippets by inserted a short code in their WYSIWYG editor.
For example, they will write something like...
[getSnippet(1)]
This will call my getSnippet($id) php function and output the appropriate 'chunk'.
It works when I hard code the $id like this...
echo str_replace('[getSnippet(1)]',getSnippet(1),$rowPage['sidebar_details']);
However, I really want to make the '1' dynamic. I'm sort of on the right track with something like...
function getSnippet($id) {
if ($id == 1) {
echo "car";
}
}
$string = "This [getSnippet(1)] is a sentence.This is the next one.";
$regex = '#([getSnippet(\w)])#';
$string = preg_replace($regex, '. \1', $string);
//If you want to capture more than just periods, you can do:
echo preg_replace('#(\.|,|\?|!)(\w)#', '\1 \2', $string);
Not quite working :(
Firstly in your regex you need to add literal parentheses (the ones you have just capture \w but that will not match the parentheses themselves):
$regex = '#(\[getSnippet\((\w)\)\])#';
I also escaped the square brackets, otherwise they will open a character class. Also be aware that this captures only one character for the parameter!
But I recommend you use preg_replace_callback, with a regex like this:
function getSnippet($id) {
if ($id == 1) {
return "car";
}
}
function replaceCallback($matches) {
return getSnippet($matches[1]);
}
$string = preg_replace_callback(
'#\[getSnippet\((\w+)\)\]#',
'replaceCallback',
$string
);
Note that I changed the echo in your getSnippet to a return.
Within the callback $matches[1] will contain the first captured group, which in this case is your parameter (which now allows for multiple characters). Of course, you could also adjust you getSnippet function to read the id from the $matches array instead of redirecting through the replaceCallback.
But this approach here is slightly more flexible, as it allows you to redirect to multiple functions. Just as an example, if you changed the regex to #\[(getSnippet|otherFunction)\((\w+)\)\]# then you could find two different functions, and replaceCallback could find out the name of the function in $matches[1] and call the function with the parameter $matches[2]. Like this:
function getSnippet($id) {
...
}
function otherFunction($parameter) {
...
}
function replaceCallback($matches) {
return $matches[1]($matches[2]);
}
$string = preg_replace_callback(
'#\[(getSnippet|otherFunction)\((\w+)\)\]#',
'replaceCallback',
$string
);
It really depends on where you want to go with this. The important thing is, there is no way of processing an arbitrary parameter in a replacement without using preg_replace_callback.

Function which searches for a word in a text and highlights all the words which contain it

This function searches for words (from the $words array) inside a text and highlights them.
function highlightWords(Array $words, $text){ // Loop through array of words
foreach($words as $word){ // Highlight word inside original text
$text = str_replace($word, '<span class="highlighted">' . $word . '</span>', $text);
}
return $text; // Return modified text
}
Here is the problem:
Lets say the $words = array("car", "drive");
Is there a way for the function to highlight not only the word car, but also words which contain the letters "car" like: cars, carmania, etc.
Thank you!
What you want is a regular expression, preg_replace or peg_replace_callback more in particular (callback in your case would be recommended)
<?php
$searchString = "The car is driving in the carpark, he's not holding to the right lane.\n";
// define your word list
$toHighlight = array("car","lane");
Because you need a regular expression to search your words and you might want or need variation or changes over time, it's bad practice to hard code it into your search words. Hence it's best to walk over the array with array_map and transform the searchword into the proper regular expression (here just enclosing it with / and adding the "accept everything until punctuation" expression)
$searchFor = array_map('addRegEx',$toHighlight);
// add the regEx to each word, this way you can adapt it without having to correct it everywhere
function addRegEx($word){
return "/" . $word . '[^ ,\,,.,?,\.]*/';
}
Next you wish to replace the word you found with your highlighted version, which means you need a dynamic change: use preg_replace_callback instead of regular preg_replace so that it calls a function for every match it find and uses it to generate the proper result. Here we enclose the found word in its span tags
function highlight($word){
return "<span class='highlight'>$word[0]</span>";
}
$result = preg_replace_callback($searchFor,'highlight',$searchString);
print $result;
yields
The <span class='highlight'>car</span> is driving in the <span class='highlight'>carpark</span>, he's not holding to the right <span class='highlight'>lane</span>.
So just paste these code fragments after the other to get the working code, obviously. ;)
edit: the complete code below was altered a bit = placed in routines for easy use by original requester. + case insensitivity
complete code:
<?php
$searchString = "The car is driving in the carpark, he's not holding to the right lane.\n";
$toHighlight = array("car","lane");
$result = customHighlights($searchString,$toHighlight);
print $result;
// add the regEx to each word, this way you can adapt it without having to correct it everywhere
function addRegEx($word){
return "/" . $word . '[^ ,\,,.,?,\.]*/i';
}
function highlight($word){
return "<span class='highlight'>$word[0]</span>";
}
function customHighlights($searchString,$toHighlight){
// define your word list
$searchFor = array_map('addRegEx',$toHighlight);
$result = preg_replace_callback($searchFor,'highlight',$searchString);
return $result;
}
I haven't tested it, but I think this should do it:-
$text = preg_replace('/\W((^\W)?$word(^\W)?)\W/', '<span class="highlighted">' . $1 . '</span>', $text);
This looks for the string inside a complete bounded word and then puts the span around the whole lot using preg_replace and regular expressions.
function replace($format, $string, array $words)
{
foreach ($words as $word) {
$string = \preg_replace(
sprintf('#\b(?<string>[^\s]*%s[^\s]*)\b#i', \preg_quote($word, '#')),
\sprintf($format, '$1'), $string);
}
return $string;
}
// courtesy of http://slipsum.com/#.T8PmfdVuBcE
$string = "Now that we know who you are, I know who I am. I'm not a mistake! It
all makes sense! In a comic, you know how you can tell who the arch-villain's
going to be? He's the exact opposite of the hero. And most times they're friends,
like you and me! I should've known way back when... You know why, David? Because
of the kids. They called me Mr Glass.";
echo \replace('<span class="red">%s</span>', $string, [
'mistake',
'villain',
'when',
'Mr Glass',
]);
Sine it's using an sprintf format for the surrounding string, you can change your replacement accordingly.
Excuse the 5.4 syntax

Linkify Regex Function PHP Daring Fireball Method

So, I know there are a ton of related questions on SO, but none of them are quite what I'm looking for. I'm trying to implement a PHP function that will convert text URLs from a user-generated post into links. I'm using the 'improved' Regex from Daring Fireball towards the bottom of the page: http://daringfireball.net/2010/07/improved_regex_for_matching_urls
The function does not return anything, and I'm not sure why.
<?php
if ( false === function_exists('linkify') ):
function linkify($str) {
$pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
return preg_replace($pattern, "\\0", $str);
}
endif;
?>
Can someone please help me get this to work?
Thanks!
Try this:
$pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`\!()\[\]{};:\'".,<>?«»“”‘’]))';
return preg_replace("!$pattern!i", "\\0", $str);
PHP's preg function do need delimiters. The i at the end makes it case-insensitive
Update
If you use # as the delimiter, you wan't need to escape the ! in the pattern as such use the original pattern string (the pattern does not have a #): "#$pattern#i"
Update 2
To ensure that the links are correct, do this:
$pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
return preg_replace_callback("#$pattern#i", function($matches) {
$input = $matches[0];
$url = preg_match('!^https?://!i', $input) ? $input : "http://$input";
return '' . "$input";
}, $str);
This will now append http:// to the urls so that browser doesn't think it is a relative link.
I was looking to just get the urls from a string using the same regex from the answer above by d_inevitable and wasn't looking to turn them into links or care about the rest of the string, I only wanted the urls with in the string so this is what I did. Hope it helps.
/**
* Returns the urls in an array from a string.
* This dos NOT return the string, only the urls with-in.
*/
function get_urls($str){
$regex = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
preg_match_all("#$regex#i", $str, $matches);
$urls = $matches[0];
return $urls;
}

regex for breadcrumb in php

I am currently building breadcrumb. It works for example for
http://localhost/researchportal/proposal/
<?php
$url_comp = explode('/',substr($url,1,-1));
$end = count($url_comp);
print_r($url_comp);
foreach($url_comp as $breadcrumb) {
$landing="http://localhost/";
$surl .= $breadcrumb.'/';
if(--$end)
echo '
<a href='.$landing.''.$surl.'>'.$breadcrumb.'</a>»';
else
echo '
'.$breadcrumb.'';
};?>
But when I typed in http://localhost////researchportal////proposal//////////
All the formatting was gone as it confuses my code.
I need to have the site path in an array like ([1]->researchportal, [2]->proposal)
regardless of how many slashes I put.
So can $url_comp = explode('/',substr($url,1,-1)); be turned into a regular expression to get my desired output?
You don't need regex. Look at htmlentities() and stripslashes() in the PHP manual. A regex will return a boolean value of whatever it says, and won't really help you achieve what you are trying to do. All the regex can let you do is say if the string matches the regex do something. If you put in a regex requiring at least 2 characters between each slash, then any time anyone puts more than one consecutive slash in there, the if statement will stop.
http://ca3.php.net/manual/en/function.stripslashes.php
http://ca3.php.net/manual/en/function.htmlentities.php
Found this on the php manual.
It uses simple str_replace statements, modifying this should achieve exactly what your post was asking.
<?
function stripslashes2($string) {
$string = str_replace("\\\"", "\"", $string);
$string = str_replace("\\'", "'", $string);
$string = str_replace("\\\\", "\\", $string);
return $string;
}
?>

Categories