Multiple word search and match using RegEx - php

I've created a function that highlights single words within a string. It looks like this:
function highlight($input, $keywords) {
preg_match_all('~[\w\'"-]+~', $keywords, $match);
if(!$match) { return $input; }
$result = '~\\b(' . implode('|', $match[0]) . ')\\b~i';
return preg_replace($result, '<strong>$0</strong>', $input);
}
I need the function to work with an array of different words supporting a space in the search.
Example:
$search = array("this needs", "here", "can high-light the text");
$string = "This needs to be in here so that the search variable can high-light the text";
echo highlight($string, $search);
Here's what I have so far to amend the function to work how I need it to:
function highlight($input, $keywords) {
foreach($keywords as $keyword) {
preg_match_all('~[\w\'"-]+~', $keyword, $match);
if(!$match) { return $input; }
$result = '~\\b(' . implode('|', $match[0]) . ')\\b~i';
$output .= preg_replace($result, '<strong>$0</strong>', $keyword);
}
return $output;
}
Obviously this doesn't work, and I'm not sure how to fix it (regular expressions are not my strong point).
Another point that may be a problem: how would the function deal with overlapping matches? For example, with $search = array("in here", "here so"); the result would be something like:
This needs to be <strong>in <strong>here</strong> so</strong> that the search variable can high-light the text
But this needs to be:
This needs to be <strong>in here so</strong> that the search variable can high-light the text

Description
Can you take your array of terms, join them with the regex alternation operator |, and wrap them in a capture group? The \b anchors on either side help ensure you're not capturing word fragments.
\b(this needs|here|can high-light the text)\b
Then run this as a replacement using the capture group \1?
Example
In PHP I'd do something like this:
<?php
$sourcestring="This needs to be in here so that the search variable can high-light the text";
echo preg_replace('/\b(this needs|here|can high-light the text)\b/i','<strong>\1</strong>',$sourcestring);
?>
$sourcestring after replacement:
<strong>This needs</strong> to be in <strong>here</strong> so that the search variable <strong>can high-light the text</strong>
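The single alternation above does not address the overlapping case raised in the question ("in here" and "here so"). One possible way to handle it, sketched here under assumptions, is to collect the byte ranges of every match first, merge ranges that overlap, and only then insert the tags. The function name highlightPhrases() is hypothetical, and the sketch assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: wraps every phrase occurrence in <strong>,
// merging overlapping or touching matches into one highlighted span.
function highlightPhrases(string $input, array $phrases): string
{
    $ranges = [];
    foreach ($phrases as $phrase) {
        if ($phrase === '') {
            continue; // an empty phrase would match everywhere
        }
        // Escape the phrase, including the "~" delimiter used here
        $pattern = '~\b' . preg_quote($phrase, '~') . '\b~i';
        if (preg_match_all($pattern, $input, $m, PREG_OFFSET_CAPTURE)) {
            foreach ($m[0] as [$text, $offset]) {
                $ranges[] = [$offset, $offset + strlen($text)]; // byte offsets
            }
        }
    }

    // Sort by start offset, then merge ranges that overlap or touch
    usort($ranges, function ($a, $b) { return $a[0] <=> $b[0]; });
    $merged = [];
    foreach ($ranges as [$start, $end]) {
        $last = count($merged) - 1;
        if ($last >= 0 && $start <= $merged[$last][1]) {
            $merged[$last][1] = max($merged[$last][1], $end);
        } else {
            $merged[] = [$start, $end];
        }
    }

    // Rebuild the string, wrapping each merged range
    $out = '';
    $pos = 0;
    foreach ($merged as [$start, $end]) {
        $out .= substr($input, $pos, $start - $pos)
              . '<strong>' . substr($input, $start, $end - $start) . '</strong>';
        $pos = $end;
    }
    return $out . substr($input, $pos);
}

$string = "This needs to be in here so that the search variable can high-light the text";
echo highlightPhrases($string, ["in here", "here so"]);
// This needs to be <strong>in here so</strong> that the search variable can high-light the text
```

Because the text is cut out of the original string by offset, the original capitalisation is preserved even though matching is case-insensitive.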

Related

How match and mark several keywords in string with php

This is my current code:
# highlight keywords in string
function highlight($string, $keyword) {
return preg_replace("/".preg_quote($keyword)."/ui", "<span class=\"h\">$0</span>", $string);
}
If I execute this code:
$string = "The house is very big.";
echo highlight($string, "hous");
It will return:
The <span class="h">hous</span>e is very big.
Now I'm trying to send several keywords to the 2nd parameter of the function as an array, and all those matches should be highlighted. Example:
echo highlight($string, array("hous", "big"));
...should return:
The <span class="h">hous</span>e is very <span class="h">big</span>.
Any ideas? Thanks.
Check if $keywords contains an array of values rather than being of a string type:
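function highlight($string, $keywords) {
// Escape regex metacharacters, including the "/" delimiter, in every keyword
$quote = function ($keyword) { return preg_quote($keyword, '/'); };
$keywords = is_array($keywords)
? implode('|', array_map($quote, $keywords))
: $quote($keywords);
return preg_replace("/$keywords/ui", "<span class=\"h\">$0</span>", $string);
}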
You could simply do this:
function highlight($string, $keywords) {
return preg_replace("/".implode('|', $keywords)."/ui", "<span class=\"h\">$0</span>", $string);
}
$string = "The house is very big.";
echo highlight($string, ["hous", "big"]);
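Note: implode('|', $keywords) joins the keywords with |, the regex alternation operator, so the pattern matches any one of them. The keywords are not escaped here, so this is only safe when they contain no regex metacharacters.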
For more insight, see: http://www.regular-expressions.info/alternation.html
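As a minimal sketch of why escaping matters when the keywords come from user input, the variant below quotes each keyword before joining them. The name highlightKeywords() and the sample keywords are made up for illustration.

```php
<?php
// Hypothetical variant: escapes each keyword so characters such as "+" or "?"
// are matched literally instead of being read as regex operators.
function highlightKeywords(string $text, array $keywords): string
{
    // Drop empty keywords; an empty alternative would match at every position
    $keywords = array_filter($keywords, function ($k) { return $k !== ''; });
    if (!$keywords) {
        return $text;
    }
    $quoted = array_map(function ($k) { return preg_quote($k, '/'); }, $keywords);
    return preg_replace('/' . implode('|', $quoted) . '/ui', '<span class="h">$0</span>', $text);
}

echo highlightKeywords("Is 1+1 equal to 2?", ["1+1", "2?"]);
// Is <span class="h">1+1</span> equal to <span class="h">2?</span>
```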

Replace all the matches in a string that matches array values

I have a string that I am checking for matches using my array and if there are any matches I want to replace those matches with the same words, but just styled red and then return all the string with the colored words included in one piece. This is what I have tried:
$string = 'This is a brovn fox wit legs.';
$misspelledOnes = array('wit', 'brovn');
echo '<p>' . str_replace($misspelledOnes,"<span style='color:red'>". $misspelledOnes . "</span>". '</p>', $string;
But of course this doesn't work, because $misspelledOnes is an array and cannot simply be concatenated into the replacement string. How can I overcome this?
The most basic approach would be a foreach loop over the check words:
$string = 'This is a brovn fox wit legs.';
$misspelledOnes = array('wit', 'brovn');
foreach ($misspelledOnes as $check) {
$string = str_replace($check, "<span style='color:red'>$check</span>", $string);
}
echo "<p>$string</p>";
Note that this does a simple substring search. For example, if you spelled "with" properly, it would still get caught by this. Once you get a bit more familiar with PHP, you could look at something using regular expressions which can get around this problem:
$string = 'This is a brovn fox wit legs.';
$misspelledOnes = array('wit', 'brovn');
$check = implode("|", $misspelledOnes);
$string = preg_replace("/\b($check)\b/", "<span style='color:red'>$1</span>", $string);
echo "<p>$string</p>";
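To make the substring caveat concrete, here is a small comparison of the two approaches on a sentence that contains both "with" and "wit". The sample sentence and the bracket markers are only for illustration.

```php
<?php
$text = 'A fox with legs, wit and all.';

// Plain substring replacement also hits the "wit" inside "with"
$substring = str_replace('wit', '[wit]', $text);

// \b word boundaries restrict the match to the standalone word
$wholeWord = preg_replace('/\b(wit)\b/', '[$1]', $text);

echo $substring, "\n"; // A fox [wit]h legs, [wit] and all.
echo $wholeWord, "\n"; // A fox with legs, [wit] and all.
```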

Find part of a string and output the whole string

I would like to find part of a string and, if it is found, output the whole of the string that contains it.
Below is an example:
$Towns = "Eccleston, Aberdeen, Glasgow";
$Find = "Eccle";
if(strstr($Towns, $Find)){
echo outputWholeString($Find, $Towns); // Result: Eccleston.
}
Can anyone shed some light on how to do this? Bear in mind that these will not be static values; the $Towns and $Find variables will be dynamically assigned in my live script.
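Use explode() and strpos() as follows:
$Towns = "Eccleston, Aberdeen, Glasgow";
$data = explode(",", $Towns); // split the list into individual towns
$Find = "Eccle";
foreach ($data as $town) {
if (strpos($town, $Find) !== false) {
echo trim($town); // trim the space left over after each comma
}
}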
You have to use strpos() to search for a string inside another one:
if( strpos($Towns, $Find) !== false ) {
echo $Towns;
}
Note that you have to use the strict "!==" comparison to tell a return value of false (not found) apart from 0 (found at the start of the string).
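The difference between false and 0 is easy to see with the values from the question, where the needle sits at the very start of the haystack. The helper name contains() is hypothetical and only wraps the strict comparison.

```php
<?php
// Hypothetical helper: true when $needle occurs anywhere in $haystack
function contains(string $haystack, string $needle): bool
{
    return strpos($haystack, $needle) !== false;
}

$towns = "Eccleston, Aberdeen, Glasgow";

var_dump(strpos($towns, "Eccle"));        // int(0): found at the first character
var_dump((bool) strpos($towns, "Eccle")); // bool(false): a loose check misses it
var_dump(contains($towns, "Eccle"));      // bool(true)
var_dump(contains($towns, "London"));     // bool(false)
```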
The solution using preg_match function:
$Towns = "Eccleston, Aberdeen, Glasgow";
$Find = "Eccle";
preg_match("/\b\w*?$Find\w*?\b/", $Towns, $match);
$result = (!empty($match))? $match[0] : "";
print_r($result); // "Eccleston"
Assuming that you will always have $Towns separated by ", ", you could do something like this:
$Towns = "Eccleston, Aberdeen, Glasgow";
$Find = "Eccle";
$TownsArray = explode(", ", $Towns);
foreach($TownsArray as $Town)
{
if(stristr($Town, $Find) !== false) // stristr() takes the haystack first, then the needle
{
echo $Town; break;
}
}
The above code will output the Town once it finds the needle and exit the foreach loop. You could remove the "break;" to continue letting the script run to see if it finds more results.
Using preg_match(), it is possible to search for Eccle and return the Eccleston word.
I use the Perl Compatible Regular Expression (PCRE) '#\w*' . $Find . '\w*#' in the code below and the demo code.
The # characters are PCRE delimiters. The pattern searched is inside these delimiters. Some people prefer / as delimiter.
The \w indicates word characters.
The * indicates 0 or more repetitions of the preceding element.
So, the #\w*Eccle\w*# PCRE searches for a string containing Eccle surrounded by zero or more word characters (letters, digits, or underscores).
<?php
$Towns = "Eccleston, Aberdeen, Glasgow";
$Find = "Eccle";
if (preg_match('#\w*' . $Find . '\w*#', $Towns, $matches)) {
print_r($matches[0]);
}
?>
Running code: http://sandbox.onlinephpfunctions.com/code/4e4026cbbd93deaf8fef0365a7bc6cf6eacc2014
Note: '#\w*' . $Find . '\w*#' is the same as "#\w*$Find\w*#" (note the surrounding single or double quotes).
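If $Find comes from user input, it may be worth escaping it and collecting every matching word rather than only the first. The following is a sketch under those assumptions; the extra town "Eccles" is added to the sample data purely to show a second hit.

```php
<?php
$towns = "Eccleston, Aberdeen, Glasgow, Eccles"; // "Eccles" added for illustration
$find  = "Eccle";

// preg_quote() escapes regex metacharacters and the "#" delimiter in the search term
$pattern = '#\w*' . preg_quote($find, '#') . '\w*#i';

// preg_match_all() returns every whole word that contains the search term
preg_match_all($pattern, $towns, $matches);
print_r($matches[0]); // Eccleston and Eccles
```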
You were nearly there...
This is probably what you are looking for:
<?php
$Towns = "Eccleston, Aberdeen, Glasgow";
$Find = "Eccle";
if(stripos($Towns, $Find) !== false) { // a match at position 0 is falsy, so compare strictly
echo $Towns;
}
The output is: Eccleston, Aberdeen, Glasgow which is what I would call "the whole string".
If, however, you only want to output the partially matched item of "the whole string", take a look at this example:
<?php
$Towns = "Eccleston, Aberdeen, Glasgow";
$Find = "Eccle";
foreach (explode(',', $Towns) as $Town) {
if(stripos($Town, $Find) !== false) { // a match at position 0 is falsy, so compare strictly
echo trim($Town);
}
}
The output of that obviously is: Eccleston...
Two general remarks:
The strpos() / stripos() functions are better suited here, since they return only a position rather than the whole matched string, which is enough for the given purpose.
Using stripos() instead of strpos() performs a case-insensitive search, which probably makes sense for the task.
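Putting these remarks together, a rough stand-in for the outputWholeString() idea from the question could look like the sketch below. The name findTowns() is hypothetical, and it returns every matching town as an array instead of echoing the first one.

```php
<?php
// Hypothetical helper: returns every town in a comma-separated list
// that contains $find, ignoring case.
function findTowns(string $find, string $towns): array
{
    $hits = [];
    if ($find === '') {
        return $hits; // treat an empty search term as "no match"
    }
    foreach (explode(',', $towns) as $town) {
        $town = trim($town); // remove the space that follows each comma
        if (stripos($town, $find) !== false) {
            $hits[] = $town;
        }
    }
    return $hits;
}

print_r(findTowns("eccle", "Eccleston, Aberdeen, Glasgow")); // Eccleston
```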

mb_eregi_replace multiple matches get them

$string = 'test check one two test3';
$result = mb_eregi_replace ( 'test|test2|test3' , '<$1>' ,$string ,'i');
echo $result;
This should deliver: <test> check one two <test3>
Is it possible to find out that test and test3 were matched, without calling another match function?
You can use preg_replace_callback instead:
$string = 'test check one two test3';
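$matches = array();
// Longer alternatives come first so "test3" is not matched as "test"; &$matches lets the closure record each hit
$result = preg_replace_callback('/test3|test2|test/i' , function($match) use (&$matches) {
$matches[] = $match[0];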
return '<'.$match[0].'>';
}, $string);
echo $result;
Here preg_replace_callback will call the passed callback function for each match of the pattern (note that its syntax differs from POSIX). In this case the callback function is an anonymous function that adds the match to the $matches array and returns the substitution string that the matches are to be replaced by.
Another approach would be to use preg_split to split the string at the matched delimiters while also capturing the delimiters. Note that PREG_SPLIT_DELIM_CAPTURE only returns delimiters that are wrapped in a capture group:
$parts = preg_split('/(test3|test2|test)/i', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
The result is an array of alternating non-matching and matching parts.
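As an illustration of the preg_split idea, the sketch below assumes the pattern is wrapped in a single capture group (required for PREG_SPLIT_DELIM_CAPTURE to return the delimiters) and lists the longer alternatives first. With exactly one group, the matched words end up at the odd indexes of the result.

```php
<?php
$string = 'test check one two test3';

// One capture group around the alternation; longer alternatives first
$parts = preg_split('/(test3|test2|test)/i', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
// $parts is ['', 'test', ' check one two ', 'test3', '']

$found = [];
foreach ($parts as $i => $part) {
    if ($i % 2 === 1) { // odd index = captured delimiter, i.e. a matched word
        $found[] = $part;
        $parts[$i] = '<' . $part . '>';
    }
}

echo implode('', $parts), "\n"; // <test> check one two <test3>
print_r($found);                // test, test3
```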
As far as I know, eregi is deprecated.
You could do something like this:
<?php
$str = 'test check one two test3';
$to_match = array("test", "test2", "test3");
$rep = array();
foreach($to_match as $val){
$rep[$val] = "<$val>";
}
echo strtr($str, $rep);
?>
This too allows you to easily add more strings to replace.
The following function checks whether every word in a list occurs in the string:
<?php
function searchword($string, $words)
{
$matchFound = count($words); // number of words that must all be found
$tempMatch = 0;
foreach ( $words as $word )
{
preg_match('/'.$word.'/',$string,$matches);
//print_r($matches);
if(!empty($matches))
{
$tempMatch++;
}
}
if($tempMatch==$matchFound)
{
return "found";
}
else
{
return "notFound";
}
}
$string = "test check one two test3";
/*** an array of words to highlight ***/
$words = array('test', 'test3');
$string = searchword($string, $words);
echo $string;
?>
If your string is UTF-8, you could use preg_replace instead:
$string = 'test check one two test3';
$result = preg_replace('/(test|test2|test3)/ui' , '<$1>' ,$string);
echo $result;
Obviously, with this kind of data the result will be suboptimal:
<test> check one two <test>3
When some patterns are prefixes of other patterns, as here, you'll need either to list the longer alternatives first or to take a longer approach than a direct search and replace with regular expressions.
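One way to cope with patterns that are prefixes of other patterns, sketched here as an assumption and not as the only fix, is to sort the words by length so that longer alternatives are tried first. The helper name buildAlternation() is hypothetical, and the arrow functions require PHP 7.4 or later.

```php
<?php
// Hypothetical helper: builds an alternation with the longest words first,
// so that "test3" is tried before its prefix "test".
function buildAlternation(array $words): string
{
    usort($words, fn($a, $b) => strlen($b) <=> strlen($a));
    return implode('|', array_map(fn($w) => preg_quote($w, '/'), $words));
}

$string  = 'test check one two test3';
$pattern = '/' . buildAlternation(['test', 'test2', 'test3']) . '/ui';

echo preg_replace($pattern, '<$0>', $string);
// <test> check one two <test3>
```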
To begin with, the code you want to enhance does not seem to do what it is meant to (at least not on my computer). You can try something like this:
$string = 'test check one two test3';
$result = mb_eregi_replace('(test|test2|test3)', '<\1>', $string);
echo $result;
I've removed the i flag (which of course makes little sense here). Even so, you'd still need to make the expression greedy so that test3 is matched in full rather than as test.
As for the original question, here's a little proof of concept:
function replace($match){
$GLOBALS['matches'][] = $match;
return "<$match>";
}
$string = 'test check one two test3';
$matches = array();
$result = mb_eregi_replace('(test|test2|test3)', 'replace(\'\1\')', $string, 'e');
var_dump($result, $matches);
Please note this code is horrible and potentially insecure. I'd honestly go with the preg_replace_callback() solution proposed by Gumbo.

PHP Regular expression: exclude href anchor tags

I'm creating a simple search for my application.
I'm using PHP regular expression replacement (preg_replace) to look for a search term (case insensitive) and add <strong> tags around the search term.
preg_replace('/'.$query.'/i', '<strong>$0</strong>', $content);
Now I'm not the greatest with regular expressions. So what would I add to the regular expression to not replace search terms that are in a href of an anchor tag?
That way if someone searched "info" it wouldn't change a link to "http://something.com/this_<strong>info</strong>/index.html"
I believe you will need conditional subpatterns for this purpose:
$query = "link";
$query = preg_quote($query, '/');
$p = '/((<)(?(2)[^>]*>)(?:.*?))*?(' . $query . ')/smi';
$r = "$1<strong>$3</strong>";
$str = '<a href="/Link/foo/the_link.htm">'."\n".'A Link</a>'; // multi-line text
$nstr = preg_replace($p, $r, $str);
var_dump( $nstr );
$str = 'Its not a Link'; // non-link text
$nstr = preg_replace($p, $r, $str);
var_dump( $nstr );
Output: (view source)
string(61) "<a href="/Link/foo/the_link.htm">
A <strong>Link</strong></a>"
string(31) "Its not a <strong>Link</strong>"
PS: The above regex also handles multi-line input and, more importantly, it skips matches not only in href but inside any HTML tag enclosed in < and >.
EDIT: If you just want to exclude hrefs rather than all HTML tags, use this pattern instead of the one above:
$p = '/((<)(?(2).*?href=[^>]*>)(?:.*?))*?(' . $query . ')/smi';
I'm not 100% sure what you are ultimately after here, but from what I can tell, it's a sort of "search phrase" highlighting facility, which highlights keywords, so to speak. If so, I suggest having a look at the Text Helper in CodeIgniter. It provides a nice little function called highlight_phrase, which could do what you are looking for.
The function is as follows.
function highlight_phrase($str, $phrase, $tag_open = '<strong>', $tag_close = '</strong>')
{
if ($str == '')
{
return '';
}
if ($phrase != '')
{
return preg_replace('/('.preg_quote($phrase, '/').')/i', $tag_open."\\1".$tag_close, $str);
}
return $str;
}
You may use conditional subpatterns, see explanation here: http://cz.php.net/manual/en/regexp.reference.conditional.php
preg_replace("/(?(?<=href=\")([^\"]*\")|($query))/i","\\1<strong>\\2</strong>",$x);
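In your case, if you have whole HTML rather than just href="", there is an easier solution using the 'e' modifier, which lets you run PHP code on each match. Note that the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, where preg_replace_callback() takes its place: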
function termReplacer($found) {
$found = stripslashes($found);
if(substr($found,0,5)=="href=") return $found;
return "<strong>$found</strong>";
}
echo preg_replace("/(?:href=)?\S*$query/e","termReplacer('\\0')",$x);
See example #4 here http://cz.php.net/manual/en/function.preg-replace.php
If your expression is even more complex, you can use regExp even inside termReplacer().
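There is a minor quirk in PHP: with the e modifier, quotes in the matched text are escaped with backslashes, so the $found parameter in termReplacer() needs to be passed through stripslashes()!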
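A different technique from the conditional subpatterns above, shown here only as a rough sketch, is to split the content on tags and highlight just the text between them. The function name highlightOutsideTags() is hypothetical, and the simple <[^>]*> tag pattern is an assumption that breaks on attribute values containing ">"; for arbitrary HTML a real parser such as DOMDocument is the safer route.

```php
<?php
// Hypothetical helper: highlights $query only in text between tags,
// leaving tag contents such as href attributes untouched.
function highlightOutsideTags(string $html, string $query): string
{
    if ($query === '') {
        return $html;
    }
    $pattern = '/' . preg_quote($query, '/') . '/i';

    // The capture group keeps the tags in the result at the odd indexes
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) { // even index = text outside any tag
            $parts[$i] = preg_replace($pattern, '<strong>$0</strong>', $part);
        }
    }
    return implode('', $parts);
}

$content = '<a href="http://something.com/this_info/index.html">More info</a>';
echo highlightOutsideTags($content, 'info');
// <a href="http://something.com/this_info/index.html">More <strong>info</strong></a>
```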
