PHP Highlight Search Keywords Using preg_replace With an Array - php

I'm using this function from here, which is:
// highlight search keywords
function highlight($title, $search) {
preg_match_all('~\w+~', $search, $m);
if(!$m)
return $title;
$re = '~\\b(' . implode('|', $m[0]) . ')\\b~i';
return preg_replace($re, '<span style="background-color: #ffffcc;">$0</span>', $title);
}
Which works great, but only for titles. I want to be able to pass an array that contains $title and $description.
I was trying something like this:
$replacements = array($title, $description);
// highlight search keywords
function highlight($replacements, $search) {
preg_match_all('~\w+~', $search, $m);
if(!$m)
return $replacements;
$re = '~\\b(' . implode('|', $m[0]) . ')\\b~i';
return preg_replace($re, '<span style="background-color: #ffffcc;">$0</span>', $replacements);
}
It isn't working. It's passing an array as the title, and not highlighting the description (although it is actually returning a description). Any idea how to get this working?

I would personally leave the original function as only operating on one parameter rather than an array. It would make your calling code nice and clear;
$titleHighlighted = highlight($title, $searchKeywords);
$descriptionHighlighted = highlight($title, $searchKeywords);
However, I would rewrite your function to use str_ireplace rather than preg_replace;
function highlight($contentBlock, array $keywords) {
$highlightedContentBlock = $contentBlock;
foreach ($keywords as $singleKeyword) {
$highlightedKeyword = '<span class = "keyword">' . $singleKeyword . '</span>';
$highlightedContentBlock = str_ireplace($singleKeyword, $highlightedKeyword, $highlightedContentBlock);
}
return $highlightedContentBlock;
}
This rewritten function should be more simple to read and does not have the overhead of compiling the regular expressions. You can call it as many times as you like for any content block (title, description, etc);
$title = "The quick brown fox jumper over ... ";
$searchKeywords = array("quick", "fox");
$titleHighlighted = highlight($title, $searchKeywords);
echo $titleHighlighted; // The <span class = "keyword">quick</span> brown ...

have you try to change ?
$m[0]
with
$m[0][0]

Related

Automatically add links to strings of keywords

In my article, I want to automatically add links to keywords.
My keywords array:
$keywords = [
0=>['id'=>1,'slug'=>'getName','url'=>'https://example.com/1'],
1=>['id'=>2,'slug'=>'testName','url'=>'https://example.com/2'],
2=>['id'=>3,'slug'=>'ign','url'=>'https://example.com/3'],
];
This is my code:
private function keywords_replace(string $string, array $key_array)
{
$array_first = $key_array;
$array_last = [];
foreach ($array_first as $key=>$value)
{
$array_last[$key] = [$key, $value['slug'], '<a target="_blank" href="' . $value['url'] . '" title="' . $value['slug'] . '">' . $value['slug'] . '</a>'];
}
$count = count($array_last);
for ($i=0; $i<$count;$i++)
{
for ($j=$count-1;$j>$i;$j--)
{
if (strlen($array_last[$j][1]) > strlen($array_last[$j-1][1]))
{
$tmp = $array_last[$j];
$array_last[$j] = $array_last[$j-1];
$array_last[$j-1] = $tmp;
}
}
}
$keys = $array_last;
foreach ($keys as $key)
{
$string = str_ireplace($key[1],$key[0],$string);
}
foreach ($keys as $key)
{
$string = str_ireplace($key[0],$key[2],$string);
}
return $string;
}
result:
$str = "<p>Just a test: getName testName";
echo $this->keywords_replace($str,$keywords);
like this:Just a test: getName testName
very import: If the string has no spaces, it will not match.Because I will use other languages, sentences will not have spaces like English. Like Wordpress key words auto link
I think my code is not perfect,Is there a better algorithm to implement this function? Thanks!
You can use array_reduce and preg_replace to replace all occurrences of the slug words in your string with the corresponding url values:
$keywords = [
0=>['id'=>1,'slug'=>'getName','url'=>'https://www.getname.com'],
1=>['id'=>2,'slug'=>'testName','url'=>'https://www.testname.com'],
2=>['id'=>3,'slug'=>'ign','url'=>'https://www.ign.com'],
];
$str = "<p>Just a test: getName testName";
echo array_reduce($keywords, function ($c, $v) { return preg_replace('/\\b(' . $v['slug'] . ')\\b/', $v['url'], $c); }, $str);
Output:
<p>Just a test: https://www.getname.com https://www.testname.com
Demo on 3v4l.org
Update
To change the text into links, you need to use this:
echo array_reduce($keywords,
function ($c, $v) {
return preg_replace('/\\b(' . $v['slug'] . ')\\b/',
'$1', $c);
},
$str);
Output:
<p>Just a test: getName testName
Updated demo
Update 2
Because some of the links that are being substituted include words that are also values of slug, it's necessary to do all the replacements at once using the array format of strtr. We build an array of patterns and replacements using array_column, array_combine and array_map, then pass that to strtr:
$reps = array_combine(array_column($keywords, 'slug'),
array_map(function ($k) { return '' . $k['slug'] . ''; }, $keywords
));
$newstr = strtr($str, $reps);
New demo
First you need to change structure of array to key/value using loop that result stored in $newKeywords. Then using preg_replace_callback() select every word in string and check that it exist in key of array. If exist, wrap it in anchor tag.
$newKeywords = [];
foreach ($keywords as $keyword)
$newKeywords[$keyword['slug']] = $keyword['url'];
$newStr = preg_replace_callback("/(\w+)/", function($m) use($newKeywords){
return isset($newKeywords[$m[0]]) ? "<a href='{$newKeywords[$m[0]]}'>{$m[0]}</a>" : $m[0];
}, $str);
Output:
<p>Just a test: <a href='https://www.getname.com'>getName</a> <a href='https://www.testname.com'>testName</a></p>
Check result in demo
My answer uses preg_replace as does Nick's above.
It relies on the patterns and replacements being equally sized arrays, with corresponding patterns and replacements.
Word boundaries need to be respected, which I doubt you can do with a simple string replacement.
<?php
$keywords = [
0=>['id'=>1,'slug'=>'foo','url'=>'https://www.example.com/foo'],
1=>['id'=>2,'slug'=>'bar','url'=>'https://www.example.com/bar'],
2=>['id'=>3,'slug'=>'baz','url'=>'https://www.example.com/baz'],
];
foreach ($keywords as $item)
{
$patterns[] = '#\b(' . $item['slug'] . ')\b#i';
$replacements[] = '$1';
}
$html = "<p>I once knew a barbed man named <i>Foo</i>, he often visited the bar.</p>";
print preg_replace($patterns, $replacements, $html);
Output:
<p>I once knew a barbed man named <i>Foo</i>, he often visited the bar.</p>
This is my answer: thanks for #Nick
$content = array_reduce($keywords , function ($c, $v) {
return preg_replace('/(>[^<>]*?)(' . $v['slug'] . ')([^<>]*?<)/', '$1$2$3', $c);
}, $str);

Joomla 3 content plugn: Foreach preg_replace $row->fulltext duplicates first string

I am setting up a new content plugin for Joomla 3, that should replace plugin tags with html content. Everything works fine till the moment when i am preg_replace plugin tags in $row->fulltext.
Here is the plugin code
public function onContentPrepare($context, &$row, &$params, $page = 0) {
$pattern = '#\{uni\}(.*){\/uni\}#sU';
preg_match_all($pattern, $row->fulltext, $matches, PREG_PATTERN_ORDER);
foreach($matches[1] as $k=>$uni){
preg_match('/\{uni-title\}(.*)[\{]/Ui', $uni, $unititle);
preg_match('/\{uni-text\}(.*)/si', $uni, $unitext);
$titleID = str_replace(' ', '_', trim($unititle[1]));
$newString = '<span id="'.$titleID.'">'.$unititle[1].'</span><div class="university-info-holder"><div class="university-info"><i class="icon icon-close"></i>'.$unitext[1].'</div></div>';
$row->fulltext = preg_replace($pattern,$newString,$row->fulltext);
}
}
Any ideas, why it duplicates first found match, as many times as foreach goes?
Just to mention, if i do:
echo $unititle[1];
inside foreach, items aren't duplicated, but are rendered as it should be.
There are a few problems with the original code.
It should be using $row->text instead of $row->fulltext. This is because when rendering an article Joomla merges tht introtext and fulltext fields.
It's a mistake to use $pattern for the matching when making the substitution. That's because the $pattern matches all of the items. Instead use the $match[0][$k] to do the replacement. Use str_replace instead of preg_replace because now you are matching the exact string and don't need to do a regex.
Here's the code for the whole thing.
class PlgContentLivefilter extends JPlugin{
public function onContentPrepare($context, &$row, &$params, $page = 0) {
return $renderUniInfo = $this->renderUniInfo($row, $params, $page = 0);
}
private function renderUniInfo(&$row, &$params, $page = 0) {
$pattern = '#\{uni\}(.*){\/uni\}#sU';
preg_match_all($pattern, $row->text, $matches);
foreach($matches[0] as $k=>$uni){
preg_match('/\{uni-title\}(.*)[\{]/Ui', $uni, $unititle);
preg_match('/\{uni-text\}(.*)/si', $uni, $unitext);
print_r($unititle[1]);
$title = $unititle[1];
$text = $unitext[1];
if (preg_match('#(?:http://)?(?:https://)?(?:www\.)?(?:youtube\.com/(?:v/|embed/|watch\?v=)|youtu\.be/)([\w-]+)?#i', $unitext[1], $match)) {
$video_id = $match[1];
$video_string = '<div class="videoWrapper"><iframe src="http://youtube.com/embed/'.$video_id.'?rel=0"></iframe></div>';
$unitext[1] = preg_replace('#(?:http://)?(?:https://)?(?:www\.)?(?:youtube\.com/(?:v/|embed/|watch\?v=)|youtu\.be/)([\w-]+)?#i', $video_string, $unitext[1]);
$text = $unitext[1];
}
$titleID = str_replace(' ', '_', trim($title));
$newString = '<span id="'.$titleID.'">'.$title.'</span><div class="university-info-holder"><div class="university-info"><i class="icon icon-close"></i>'.$text.'</div></div>';
$row->text = str_replace($matches[0][$k],$newString,$row->text);
}
}
}

How can I avoid adding href to an overlapping keyword in string?

Using the following code:
$text = "أطلقت غوغل النسخة المخصصة للأجهزة الذكية العاملة بنظام أندرويد من الإصدار “25″ لمتصفحها الشهير كروم.ولم تحدث غوغل تطبيق كروم للأجهزة العاملة بأندرويد منذ شهر تشرين الثاني العام الماضي، وهو المتصفح الذي يستخدمه نسبة 2.02% من أصحاب الأجهزة الذكية حسب دراسة سابقة. ";
$tags = "غوغل, غوغل النسخة, كروم";
$tags = explode(",", $tags);
foreach($tags as $k=>$v) {
$text = preg_replace("/\b{$v}\b/u","$0",$text, 1);
}
echo $text;
Will give the following result:
I love PHP">love PHP</a>, but I am facing a problem
Note that my text is in Arabic.
The way is to do all in one pass. The idea is to build a pattern with an alternation of tags. To make this way work, you must before sort the tags because the regex engine will stop at the first alternative that succeeds (otherwise 'love' will always match even if it is followed by 'php' and 'love php' will never be matched).
To limit the replacement to the first occurence of each word you can remove tag from the array once it has been found and you test if it is always present in the array inside the replacement callback function:
$text = 'I love PHP, I love love but I am facing a problem';
$tagsCSV = 'love, love php, facing';
$tags = explode(', ', $tagsCSV);
rsort($tags);
$tags = array_map('preg_quote', $tags);
$pattern = '/\b(?:' . implode('|', $tags) . ')\b/iu';
$text = preg_replace_callback($pattern, function ($m) use (&$tags) {
$mLC = mb_strtolower($m[0], 'UTF-8');
if (false === $key = array_search($mLC, $tags))
return $m[0];
unset($tags[$key]);
return '<a href="index.php?s=news&tag=' . rawurlencode($mLC)
. '">' . $m[0] . '</a>';
}, $text);
Note: when you build an url you must encode special characters, this is the reason why I use preg_replace_callback instead of preg_replace to be able to use rawurlencode.
If you have to deal with an utf8 encoded string, you need to add the u modifier to the pattern and you need to replace strtolower with mb_strtolower)
the preg_split way
$tags = explode(', ', $tagsCSV);
rsort($tags);
$tags = array_map('preg_quote', $tags);
$pattern = '/\b(' . implode('|', $tags) . ')\b/iu';
$items = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
$itemsLength = count($items);
$i = 1;
while ($i<$itemsLength && count($tags)) {
if (false !== $key = array_search(mb_strtolower($items[$i], 'UTF-8'), $tags)) {
$items[$i] = '<a href="index.php?s=news&tag=' . rawurlencode($tags[$key])
. '">' . $items[$i] . '</a>';
unset($tags[$key]);
}
$i+=2;
}
$result = implode('', $items);
Instead of calling preg_replace multiple times, call it a single time with a regexp that matches any of the tags:
$tags = explode(",", tags);
$tags_re = '/\b(' . implode('|', $tags) . ')\b/u';
$text = preg_replace($tags_re, '$0', $text, 1);
This turns the list of tags into the regexp /\b(love|love php|facing)\b/u. x|y in a regexp means to match either x or y.

Preg Replace in PHP for Heading Tags

I have a markdown text content which I have to replace without using library functions.So I used preg replace for this.It works fine for some cases.For cases like heading
for eg Heading
=======
should be converted to <h1>Heading</h1> and also
##Sub heading should be converted to <h2>Sub heading</h2>
###Sub heading should be converted to <h3>Sub heading</h3>
I have tried
$text = preg_replace('/##(.+?)\n/s', '<h2>$1</h2>', $text);
The above code works but I need to have count of hash symbol and based on that I have to assign heading tags.
Anyone help me please....
Try using preg_replace_callback.
Something like this -
$regex = '/(#+)(.+?)\n/s';
$line = "##Sub heading\n ###sub-sub heading\n";
$line = preg_replace_callback(
$regex,
function($matches){
$h_num = strlen($matches[1]);
return "<h$h_num>".$matches[2]."</h$h_num>";
},
$line
);
echo $line;
The output would be something like this -
<h2>Sub heading</h2> <h3>sub-sub heading</h3>
EDIT
For the combined problem of using = for headings and # for sub-headings, the regex gets a bit more complicated, but the principle remains the same using preg_replace_callback.
Try this -
$regex = '/(?:(#+)(.+?)\n)|(?:(.+?)\n\s*=+\s*\n)/';
$line = "Heading\n=======\n##Sub heading\n ###sub-sub heading\n";
$line = preg_replace_callback(
$regex,
function($matches){
//var_dump($matches);
if($matches[1] == ""){
return "<h1>".$matches[3]."</h1>";
}else{
$h_num = strlen($matches[1]);
return "<h$h_num>".$matches[2]."</h$h_num>";
}
},
$line
);
echo $line;
Whose Output is -
<h1>Heading</h1><h2>Sub heading</h2> <h3>sub-sub heading</h3>
Do a preg_match_all like this:
$string = "#####asdsadsad";
preg_match_all("/^#/", $string, $matches);
var_dump ($matches);
And based on count of matches you can do whatever you want.
Or, use the preg_replace_callback function.
$input = "#This is my text";
$pattern = '/^(#+)(.+)/';
$mytext = preg_replace_callback($pattern, 'parseHashes', $input);
var_dump($mytext);
function parseHashes($input) {
var_dump($input);
$matches = array();
preg_match_all('/(#)/', $input[1], $matches);
var_dump($matches[0]);
var_dump(count($matches[0]));
$cnt = count($matches[0]);
if ($cnt <= 6 && $cnt > 0) {
return '<h' . $cnt . ' class="if you want class here">' . $input[2] . '</h' . $cnt . '>';
} else {
//This is not a valid h tag. Do whatever you want.
return false;
}
}

str_replace not replacing correctly

I have the following, simple code:
$text = str_replace($f,''.$u.'',$text);
where $f is a URL, like http://google.ca, and $u is the name of the URL (my function names it 'Google').
My problem is, is if I give my function a string like
http://google.ca http://google.ca
it returns
Google" target="_blank">Google</a> Google" target="_blank">Google</a>
Which obviously isn't what I want. I want my function to echo out two separate, clickable links. But str_replace is replacing the first occurrence (it's in a loop to loop through all the found URLs), and that first occurrence has already been replaced.
How can I tell str_replace to ignore that specific one, and move onto the next? The string given is user input, so I can't just give it a static offset or anything with substr, which I have tried.
Thank you!
One way, though it's a bit of a kludge: you can use a temporary marker that (hopefully) won't appear in the string:
$text = str_replace ($f, '' . $u . '',
$text);
That way, the first substitution won't be found again. Then at the end (after you've processed the entire line), simply change the markers back:
$text = str_replace ('XYZZYPLUGH', $f, $text);
Why not pass your function an array of URLs, instead?
function makeLinks(array $urls) {
$links = array();
foreach ($urls as $url) {
list($desc, $href) = $url;
// If $href is based on user input, watch out for "javascript: foo;" and other XSS attacks here.
$links[] = '<a href="' . htmlentities($href) . '" target="_blank">'
. htmlentities($desc)
. '</a>';
}
return $links; // or implode('', $links) if you want a string instead
}
$urls = array(
array('Google', 'http://google.ca'),
array('Google', 'http://google.ca')
);
var_dump(makeLinks($urls));
If i understand your problem correctly, you can just use the function sprintf. I think something like this should work:
function urlize($name, $url)
{
// Make sure the url is formatted ok
if (!filter_var($url, FILTER_VALIDATE_URL))
return '';
$name = htmlspecialchars($name, ENT_QUOTES);
$url = htmlspecialchars($url, ENT_QUOTES);
return sprintf('%s', $url, $name);
}
echo urlize('my name', 'http://www.domain.com');
// my name
I havent test it though.
I suggest you to use preg_replace instead of str_replace here like this code:
$f = 'http://google.ca';
$u = 'Google';
$text='http://google.ca http://google.ca';
$regex = '~(?<!<a href=")' . preg_quote($f) . '~'; // negative lookbehind
$text = preg_replace($regex, ''.$u.'', $text);
echo $text . "\n";
$text = preg_replace($regex, ''.$u.'', $text);
echo $text . "\n";
OUTPUT:
Google Google
Google Google

Categories