I have homework to do this program in PHP, that take a matrix and the keywords, so it can find them in the matrix diagonally:
So this is how the first matrix looks, there are many with different keywords, here for instance keywords are "beef" and "pork". So I made a program for an input that looks like these 2 examples:
There correct input for these two is here:
Here is my code that works only in the first case, I don't know how to make a loop that will make more and more conditions instead of writing down all these conditions with && in if statement.
Please give me some tips on how to do it:
<?PHP
$input_line = trim(fgets(STDIN));
$input_array = explode(" ",$input_line);
// get mojiban
$mojiban = array();
for($ix=0; $ix<$input_array[0];$ix++){
$mojiban[] = str_split(trim(fgets(STDIN)));
}
//get words
$words = array();
for( $kx = 0; $kx < $input_array[1]; $kx++ ){
$words[] = trim(fgets(STDIN));
}
////check verticales
//get word
$wordCharArray = array();
for($dx=0; $dx < $input_array[1]; $dx++){
$wordCharArray = str_split($words[$dx]);
//looping and checking
for ( $line = 0; $line < $input_array[0] ; $line++) {
for ( $column = 0; $column < $input_array[0] ; $column++ ){
// for ($wordsNumber = 0; $wordsNumber < $input_array[1] ; $wordsNumber++){
if ($mojiban[$line][$column] == $wordCharArray[0] && $mojiban[$line+1][$column+1] == $wordCharArray[1]) {
echo ($column+1)." ".($line+1)."\n";
//}
}
}
}
Thanks in Advance!
Here's a solution to your problem right now it only checks for diagonal elements. You can refactor as you want from below.
There are many different solutions but the solution for your code I have pasted first checks for the character match in the array. If there is a match, it proceeds to check diagonally for the element from the position that it found the first element.
<?php
// First martix
// $searchWords = ["BEEF", "PORK"];
// $matrix = ["HPPLLM", "UROQUV", "FBSRZY", "DPEFKT", "GBBEUY", "EMCQFY"];
// Second martix
$searchWords = ["ABA", "BAB"];
$matrix = ["ACEG", "HBDF", "EGAC", "DFHB"];
// returns true if it diagonally matches the element in matrix from start row
function checkDiagonallyForString(string $search, array $matrix, int $startRow, int $firstMatchPosition)
{
$endRow = $startRow + (strlen($search) - 2);
$finding = true;
foreach (range($startRow, $endRow) as $searchIndex => $rowValue) {
if (!$finding) {
break;
}
$char = $search[$searchIndex + 1];
$finding = $matrix[$rowValue][$firstMatchPosition + $searchIndex] == $char;
}
return $finding;
}
// format: [word: [column, row]]
$found = [];
foreach ($matrix as $row => $matrixString) {
if (!count($searchWords)) {
break;
}
foreach ($searchWords as $wordRow => $word) {
$position = strpos($matrixString, $word[1]);
if ($position != false) {
if (checkDiagonallyForString($word, $matrix, $row, $position)) {
// $position = column of matrix
// $row = row of matrix
$found[$word] = [$position, $row];
unset($searchWords[$wordRow]);
}
}
}
}
// Pretty output
echo "<pre>";
print_r($found);
foreach ($found as $word => $indexes) {
echo $word . " " . implode(" ", $indexes) . "\n";
}
The results are:
Note: This is a complete answer but before copying and pasting this please try to get a general idea and attempt to solve it yourself. This is not the only way of doing it.
We can use implode tactics from comments. Here is example: click.
Code:
// First martix
$searchWords = ["BEEF", "PORK"];
$matrix = ["HPPLLM", "UROQUV", "FBSRZY", "DPEFKT", "GBBEUY", "EMCQFY"];
// Second martix
// $searchWords = ["ABA", "BAB"];
// $matrix = ["ACEG", "HBDF", "EGAC", "DFHB"];
$matrixStr = implode('', $matrix);
$matrixSize = count($matrix);
$positions = [];
foreach ($searchWords as $wordIdx => $word) {
$wordSize = strlen($word);
$found = false;
for ($y = 0; $y <= $matrixSize - $wordSize && !$found; $y++) {
for ($x = 0; $x <= $matrixSize - $wordSize && !$found; $x++) {
$allLettersOk = true;
for ($l = 0; $l < $wordSize; $l++) {
if ($matrixStr[$y * $matrixSize + $x + $l * $matrixSize + $l] != $word[$l]) {
$allLettersOk = false;
break;
}
}
if ($allLettersOk) {
$positions[$word] = [$x + 1, $y + 1];
$found = true;
}
}
}
}
foreach ($positions as $word => $indexes) {
echo $word . " " . implode(" ", $indexes) . "\n";
}
If you need a description to this code - write a comment.
// First martix
$searchWords = ["BEEF", "PORK"];
$matrix = ["HPPLLM", "UROQUV", "FBSRZY", "DPEFKT", "GBBEUY", "EMCQFY"];
// Second martix
// $searchWords = ["ABA", "BAB"];
// $matrix = ["ACEG", "HBDF", "EGAC", "DFHB"];
$matrixStr = implode('', $matrix);
$matrixSize = count($matrix);
$positions = [];
foreach ($searchWords as $wordIdx => $word) {
$wordSize = strlen($word);
$found = false;
for ($y = 0; $y <= $matrixSize - $wordSize && !$found; $y++) {
for ($x = 0; $x <= $matrixSize - $wordSize && !$found; $x++) {
$allLettersOk = true;
for ($l = 0; $l < $wordSize; $l++) {
if ($matrixStr[$y * $matrixSize + $x + $l * $matrixSize + $l] != $word[$l]) {
$allLettersOk = false;
break;
}
}
if ($allLettersOk) {
$positions[$word] = [$x + 1, $y + 1];
$found = true;
}
}
}
}
foreach ($positions as $word => $indexes) {
echo $word . " " . implode(" ", $indexes) . "\n";
}
Your $input_array contains your matrix, at each position you have a character, like this
$input_array = [
['H', 'P', 'P', 'L', 'L', 'M'],
['U', 'R', 'O', 'Q', 'U', 'V'],
['F', 'B', 'S', 'R', 'Z', 'Y'],
['D', 'P', 'E', 'F', 'K', 'T'],
['G', 'B', 'B', 'E', 'U', 'Y'],
['E', 'M', 'C', 'Q', 'F', 'Y'],
];
For the sake of simplicity, let's assume that you also have an array of arrays. I know you have strings, but string operations would make this solution difficult to read and we are mostly interested in the algorithm, so, for the sake of understandability I will not start from your actual data structure, but will answer questions if you have difficulty implementing this into your solution:
$words = [
['B', 'E', 'E', 'F'],
['P', 'O', 'R', 'K']
];
Notice that beef and pork are of the same length, but let's not assume that all words are of the same length.
//Computing row count so it can be reused
$rowCount = count($input_array);
//Computing the column count so it can be reused
//We assume that each row of the matrix has the same number of columns
$colCount = count($input_array[0]);
//Looping the rows
for ($row = 0; $row < $rowCount; $row++) {
//Looping the columns of the row
for ($column = 0; $column < $colCount; $column++) {
//Computing the max length of allowed words
$maxLength = min($rowCount - $row, $colCount - $col);
//Looping the words
for ($wIndex = 0; $wIndex < count($words); $wIndex++) {
//Storing the length of the current word for later use
$charCount = count($words[$wIndex]);
//We avoid checking for the presence of words that are
//Longer than the current position allows
if ($charCount <= $maxLength) {
//match is initialilzed with true and will be set false at
//the first mismatch. If no mistmatch is found, then we know
//that the word is present at the current position
$match = true;
//Looping the word's characters and check for possible
//mismatches
//Notice that we stop the loop either at the first mismatch
//or at the last character if there is no mismatch
for ($cIndex = 0; $match && ($cIndex < $charCount); $cIndex++) {
//If the word character at cIndex offset mismatches
//the character of input array at the same offset, starting
//from the [$row][$column] position, then it mismatches
if ($words[$wIndex][$cIndex] !== $input_array[$row + $cIndex][$column + $cIndex]) $match = false;
}
if ($match) {
//Say the word
echo implode("", $words[$wIndex]) . " found at row " . ($row + 1) . ", column " . ($column + 1);
}
}
}
}
}
Related
I have a string composed by many letters, at some point, one letter from a group can be used and this is represented by letters enclosed in []. I need to expand these letters into its actual strings.
From this:
$str = 'ABCCDF[GH]IJJ[KLM]'
To this:
$sub[0] = 'ABCCDFGIJJK';
$sub[1] = 'ABCCDFHIJJK';
$sub[2] = 'ABCCDFGIJJL';
$sub[3] = 'ABCCDFHIJJL';
$sub[4] = 'ABCCDFGIJJM';
$sub[5] = 'ABCCDFHIJJM';
UPDATE:
Thanks to #Barmar for the very valuable suggestions.
My final solution is:
$str = '[GH]DF[IK]TF[ADF]';
function parseString(string $str) : array
{
$i = 0;
$is_group = false;
$sub = array();
$chars = preg_split('//', $str, -1, PREG_SPLIT_NO_EMPTY);
foreach ($chars as $key => $value)
{
if(ctype_alpha($value))
{
if($is_group){
$sub[$i][] = $value;
} else {
if(!isset($sub[$i][0])){
$sub[$i][0] = $value;
} else {
$sub[$i][0] .= $value;
}
}
} else {
$is_group = !$is_group;
++$i;
}
}
return $sub;
}
The recommended function for combinations is (check the related post):
function array_cartesian_product($arrays)
{
$result = array();
$arrays = array_values($arrays);
$sizeIn = sizeof($arrays);
$size = $sizeIn > 0 ? 1 : 0;
foreach ($arrays as $array)
$size = $size * sizeof($array);
for ($i = 0; $i < $size; $i++) {
$result[$i] = array();
for ($j = 0; $j < $sizeIn; $j++)
array_push($result[$i], current($arrays[$j]));
for ($j = ($sizeIn - 1); $j >= 0; $j--) {
if (next($arrays[$j]))
break;
elseif (isset($arrays[$j]))
reset($arrays[$j]);
}
}
return $result;
}
Check the solution with:
$combinations = array_cartesian_product(parseString($str));
$sub = array_map('implode', $combinations);
var_dump($sub);
Convert your string into a 2-dimensional array. The parts outside brackets become single-element arrays, while each bracketed trings becomes an array of single characters. So your string would become:
$array =
array(array('ABCCDF'),
array('G', 'H', 'I'),
array('IJJ'),
array('K', 'L', 'M'));
Then you just need to compute all the combinations of those arrays; use one of the answers at How to generate in PHP all combinations of items in multiple arrays. Finally, you concatenate each of the resulting arrays with implode to get an array of strings.
$combinations = combinations($array);
$sub = array_map('implode', $combinations);
I coded this snippet:
$a = array('X', 'X', 'X', 'O', 'O', 'O');
$rk = array_rand($a, 3);
$l = $a[$rk[0]].$a[$rk[1]].$a[$rk[2]];
if($l == 'OOO' || $l == 'XXX'){
echo $l . 'Winner';
} else {
echo = $l . 'Loser';
}
In the array you will notice multiples of the same values this is because three are chosen randomly and if matched you win (its a basic game).
My question is how can it be coded without having to add multiples of the same value in the array?
Update:
Answer to Yanick Rochon question
as it stands the array is:
$a = array('X', 'X', 'X', 'O', 'O', 'O');
is is possible somehow to have it as just
$a = array('X', 'O');
and still have 3 values returned?
With a basic rand:
<?php
$a = array('X', 'O', 'R');
$size = count($a)-1;
$l = $a[rand(0,$size)].$a[rand(0,$size)].$a[rand(0,$size)];
if($l == 'OOO' || $l == 'XXX'){
echo $l . ' Winner';
} else {
echo $l . ' Loser';
}
?>
Version with XXX or OOO or RRR win:
<?php
$a = array('X', 'O', 'R');
$size = count($a)-1;
$l = $a[rand(0,$size)].$a[rand(0,$size)].$a[rand(0,$size)];
if($l === 'OOO' || $l === 'XXX' || $l === 'RRR'){
echo $l . ' Winner';
} else {
echo $l . ' Loser';
}
?>
To procedurally create an entire program, here is a code snippet. You can test it here.
function randomArray($multiple, $length = 4) {
if ($length <= $multiple) {
throw new Exception('Length must be greater than multiple');
}
// define possible characters
$possible = 'ABCEFGHIJKLMNPQRSTUVWXYZ';
$value = str_repeat(substr($possible, mt_rand(0, strlen($possible)-1), 1), $multiple);
$i = strlen($value);
// add random characters to $value until $length is reached
while ($i < $length) {
// pick a random character from the possible ones
$char = substr($possible, mt_rand(0, strlen($possible)-1), 1);
// we don't want this character if it's already in the string
if (!strstr($value, $char)) {
$value .= $char;
$i++;
}
}
// done!
$value = str_split($value);
shuffle($value);
return $value;
}
// return a random selection from the given array
// if $allowRepick is true, then the same array value may be picked multiple times
function arrayPick($arr, $count, $allowRepick = false) {
$value = array();
if ($allowRepick) {
for ($i = 0; $i < $count; ++$i) {
array_push($value, $arr[mt_rand(0, count($arr) - 1)]);
}
} else {
shuffle($arr);
$value = array_slice($arr, 0, $count);
}
return $value;
}
// generate an array with 4 values, from which 3 are the same
$values = randomArray(3, 4);
// pick any 3 distinct values from the array
$choices = arrayPick($values, 3, false);
// convert to strings
$valuesStr = implode('', $values);
$choicesStr = implode('', $choices);
echo 'Value : ' . $valuesStr . "\n";
//echo 'Choice : ' . $choicesStr . "\n";
if (preg_match('/^(.)\1*$/', $choicesStr)) {
echo $choicesStr . ' : Winner!';
} else {
echo $choicesStr . ' : Loser!';
}
Note : the function randomArray will fail if $length - $multiple > count($possible). For example, this call randomArray(1, 27) will fail, and the script will run indefinitely. Just so you know. :)
You mean something like this?
$letter = false;
$winner = true;
foreach($rk as $key)
{
if($letter === false) $letter = $a[$key];
elseif($a[$key] != $letter) $winner = false;
}
if($winner) echo 'Winner';
else echo 'Loser';
$num = 3;
$unique_vals = array('X','O');
$a = array();
foreach ($unique_vals as $val)
{
for ($i = 0; $i < $num; $i++)
{
$a[] = $val;
}
}
I'm writing a function to count syllables of a word as follows:
function estimate_syllables($word) {
$total_count=0;
foreach($word as $w) {
$syllable_count = count_english_vowels($w);
$total_count += $syllable_count;
}
return $total_count;
}
function count_english_vowels($word) {
static $english_vowels = array('A', 'E', 'I', 'O', 'U', 'Y');
$vowel_count = 0;
$letters = preg_replace('/[^a-z0-9]+/i', "", $word);
$len = strlen($letters);
$letters = str_split(strtoupper($letters));
$currPosition = -2;
$prevPosition = -1;
for($i = 0; $i < $len; $i++) {
if (in_array($letters[$i], $english_vowels)) {
if($i != $currPosition + 1) {
if ($letters[$i] == 'E' && $i != ($len -1))
$vowel_count++;
$prevPosition = $currPosition;
$currPosition = $i;
}
}
}
return $vowel_count;
}
I'm really confused: if you pass count_english_vowels a word like water, it reaches the inner loop to the correct syllable twice, yet the counter only reports a 1? Super confused, can anyone figure out what's going wrong?
It's only reporting one because of this:
if ($letters[$i] == 'E' && $i != ($len -1))
$vowel_count++;
$prevPosition = $currPosition;
$currPosition = $i;
There is only one E in water, thus it's only executing $vowel_count++; once. Based on your indenting of this code, I'd assume you need {} brackets around those lines, as without them PHP will only apply the first line after an if without them, so if that code is correct your indentation should look like this:
if ($letters[$i] == 'E' && $i != ($len -1))
$vowel_count++;
$prevPosition = $currPosition;
$currPosition = $i;
How can I write a function that gives me number of the character that is passed to it
For example, if the funciton name is GetCharacterNumber and I pass B to it then it should give me 2
GetCharacterNumber("A") // should print 1
GetCharacterNumber("C") // should print 3
GetCharacterNumber("Z") // should print 26
GetCharacterNumber("AA") // should print 27
GetCharacterNumber("AA") // should print 27
GetCharacterNumber("AC") // should print 29
Is it even possible to achieve this ?
There is a function called ord which gives you the ASCII number of the character.
ord($chr) - ord('A') + 1
gives you the correct result for one character. For longer strings, you can use a loop.
<?php
function GetCharacterNumber($str) {
$num = 0;
for ($i = 0; $i < strlen($str); $i++) {
$num = 26 * $num + ord($str[$i]) - 64;
}
return $num;
}
GetCharacterNumber("A"); //1
GetCharacterNumber("C"); //3
GetCharacterNumber("Z"); //26
GetCharacterNumber("AA"); //27
GetCharacterNumber("AC"); //29
GetCharacterNumber("BA"); //53
?>
Not very efficient but gets the job done:
function get_character_number($end)
{
$count = 1;
$char = 'A';
$end = strtoupper($end);
while ($char !== $end) {
$count++;
$char++;
}
return $count;
}
echo get_character_number('AA'); // 27
demo
This works because when you got something like $char = 'A' and do $char++, it will change to 'B', then 'C', 'D', … 'Z', 'AA', 'AB' and so on.
Note that this will become the slower the longer $end is. I would not recommend this for anything beyond 'ZZZZ' (475254 iterations) or if you need many lookups like that.
An better performing alternative would be
function get_character_number($string) {
$number = 0;
$string = strtoupper($string);
$dictionary = array_combine(range('A', 'Z'), range(1, 26));
for ($pos = 0; isset($string[$pos]); $pos++) {
$number += $dictionary[$string[$pos]] + $pos * 26 - $pos;
}
return $number;
}
echo get_character_number(''), PHP_EOL; // 0
echo get_character_number('Z'), PHP_EOL; // 26
echo get_character_number('AA'), PHP_EOL; // 27
demo
Use range and strpos:
$letter = 'z';
$alphabet = range('a', 'z');
$position = strpos($alphabet, $letter);
For double letters (eg zz) you'd probably need to create your own alphabet using a custom function:
$alphabet = range('a', 'z');
$dictionary = range('a','z');
foreach($alphabet AS $a1){
foreach($alphabet AS $a2) {
$dictionary[] = $a1 . $a2;
}
}
Then use $dictionary in place of $alphabet.
Here is the full code that does what you want.
Tested it and works perfectly for the examples you gave.
define('BASE', 26);
function singleOrd($chr) {
if (strlen($chr) == 1) {
return (ord($chr)-64);
} else{
return 0;
}
}
function multiOrd($string) {
if (strlen($string) == 0) {
return 0;
} elseif (strlen($string) == 1) {
return singleOrd($string);
} else{
$sum = 0;
for($i = strlen($string) - 1; $i >= 0; $i--) {
$sum += singleOrd($string[$i]) * pow(BASE, $i);
}
}
return $sum;
}
I think ord should be a more efficient way to have your number :
$string = strtolower($string);
$result = 0;
$length = strlen($string);
foreach($string as $key=>$value){
$result = ($length -$key - 1)*(ord($value)-ord(a)+1);
}
and result would contain what you want.
I have a small search engine doing its thing, and want to highlight the results. I thought I had it all worked out till a set of keywords I used today blew it out of the water.
The issue is that preg_replace() is looping through the replacements, and later replacements are replacing the text I inserted into previous ones. Confused? Here is my pseudo function:
public function highlightKeywords ($data, $keywords = array()) {
$find = array();
$replace = array();
$begin = "<span class=\"keywordHighlight\">";
$end = "</span>";
foreach ($keywords as $kw) {
$find[] = '/' . str_replace("/", "\/", $kw) . '/iu';
$replace[] = $begin . "\$0" . $end;
}
return preg_replace($find, $replace, $data);
}
OK, so it works when searching for "fred" and "dagg" but sadly, when searching for "class" and "lass" and "as" it strikes a real issue when highlighting "Joseph's Class Group"
Joseph's <span class="keywordHighlight">Cl</span><span <span c<span <span class="keywordHighlight">cl</span>ass="keywordHighlight">lass</span>="keywordHighlight">c<span <span class="keywordHighlight">cl</span>ass="keywordHighlight">lass</span></span>="keywordHighlight">ass</span> Group
How would I get the latter replacements to only work on the non-HTML components, but to also allow the tagging of the whole match? e.g. if I was searching for "cla" and "lass" I would want "class" to be highlighted in full as both the search terms are in it, even though they overlap, and the highlighting that was applied to the first match has "class" in it, but that shouldn't be highlighted.
Sigh.
I would rather use a PHP solution than a jQuery (or any client-side) one.
Note: I have tried to sort the keywords by length, doing the long ones first, but that means the cross-over searches do not highlight, meaning with "cla" and "lass" only part of the word "class" would highlight, and it still murdered the replacement tags :(
EDIT: I have messed about, starting with pencil & paper, and wild ramblings, and come up with some very unglamorous code to solve this issue. It's not great, so suggestions to trim/speed this up would still be greatly appreciated :)
public function highlightKeywords ($data, $keywords = array()) {
$find = array();
$replace = array();
$begin = "<span class=\"keywordHighlight\">";
$end = "</span>";
$hits = array();
foreach ($keywords as $kw) {
$offset = 0;
while (($pos = stripos($data, $kw, $offset)) !== false) {
$hits[] = array($pos, $pos + strlen($kw));
$offset = $pos + 1;
}
}
if ($hits) {
usort($hits, function($a, $b) {
if ($a[0] == $b[0]) {
return 0;
}
return ($a[0] < $b[0]) ? -1 : 1;
});
$thisthat = array(0 => $begin, 1 => $end);
for ($i = 0; $i < count($hits); $i++) {
foreach ($thisthat as $key => $val) {
$pos = $hits[$i][$key];
$data = substr($data, 0, $pos) . $val . substr($data, $pos);
for ($j = 0; $j < count($hits); $j++) {
if ($hits[$j][0] >= $pos) {
$hits[$j][0] += strlen($val);
}
if ($hits[$j][1] >= $pos) {
$hits[$j][1] += strlen($val);
}
}
}
}
}
return $data;
}
I've used the following to address this problem:
<?php
$protected_matches = array();
function protect(&$matches) {
global $protected_matches;
return "\0" . array_push($protected_matches, $matches[0]) . "\0";
}
function restore(&$matches) {
global $protected_matches;
return '<span class="keywordHighlight">' .
$protected_matches[$matches[1] - 1] . '</span>';
}
preg_replace_callback('/\x0(\d+)\x0/', 'restore',
preg_replace_callback($patterns, 'protect', $target_string));
The first preg_replace_callback pulls out all matches and replaces them with nul-byte-wrapped placeholders; the second pass replaces them with the span tags.
Edit: Forgot to mention that $patterns was sorted by string length, longest to shortest.
Edit; another solution
<?php
function highlightKeywords($data, $keywords = array(),
$prefix = '<span class="hilite">', $suffix = '</span>') {
$datacopy = strtolower($data);
$keywords = array_map('strtolower', $keywords);
$start = array();
$end = array();
foreach ($keywords as $keyword) {
$offset = 0;
$length = strlen($keyword);
while (($pos = strpos($datacopy, $keyword, $offset)) !== false) {
$start[] = $pos;
$end[] = $offset = $pos + $length;
}
}
if (!count($start)) return $data;
sort($start);
sort($end);
// Merge and sort start/end using negative values to identify endpoints
$zipper = array();
$i = 0;
$n = count($end);
while ($i < $n)
$zipper[] = count($start) && $start[0] <= $end[$i]
? array_shift($start)
: -$end[$i++];
// EXAMPLE:
// [ 9, 10, -14, -14, 81, 82, 86, -86, -86, -90, 99, -103 ]
// take 9, discard 10, take -14, take -14, create pair,
// take 81, discard 82, discard 86, take -86, take -86, take -90, create pair
// take 99, take -103, create pair
// result: [9,14], [81,90], [99,103]
// Generate non-overlapping start/end pairs
$a = array_shift($zipper);
$z = $x = null;
while ($x = array_shift($zipper)) {
if ($x < 0)
$z = $x;
else if ($z) {
$spans[] = array($a, -$z);
$a = $x;
$z = null;
}
}
$spans[] = array($a, -$z);
// Insert the prefix/suffix in the start/end locations
$n = count($spans);
while ($n--)
$data = substr($data, 0, $spans[$n][0])
. $prefix
. substr($data, $spans[$n][0], $spans[$n][1] - $spans[$n][0])
. $suffix
. substr($data, $spans[$n][1]);
return $data;
}
I had to revisit this subject myself today and wrote a better version of the above. I'll include it here. It's the same idea only easier to read and should perform better since it uses arrays instead of concatenation.
<?php
function highlight_range_sort($a, $b) {
$A = abs($a);
$B = abs($b);
if ($A == $B)
return $a < $b ? 1 : 0;
else
return $A < $B ? -1 : 1;
}
function highlightKeywords($data, $keywords = array(),
$prefix = '<span class="highlight">', $suffix = '</span>') {
$datacopy = strtolower($data);
$keywords = array_map('strtolower', $keywords);
// this will contain offset ranges to be highlighted
// positive offset indicates start
// negative offset indicates end
$ranges = array();
// find start/end offsets for each keyword
foreach ($keywords as $keyword) {
$offset = 0;
$length = strlen($keyword);
while (($pos = strpos($datacopy, $keyword, $offset)) !== false) {
$ranges[] = $pos;
$ranges[] = -($offset = $pos + $length);
}
}
if (!count($ranges))
return $data;
// sort offsets by abs(), positive
usort($ranges, 'highlight_range_sort');
// combine overlapping ranges by keeping lesser
// positive and negative numbers
$i = 0;
while ($i < count($ranges) - 1) {
if ($ranges[$i] < 0) {
if ($ranges[$i + 1] < 0)
array_splice($ranges, $i, 1);
else
$i++;
} else if ($ranges[$i + 1] < 0)
$i++;
else
array_splice($ranges, $i + 1, 1);
}
// create substrings
$ranges[] = strlen($data);
$substrings = array(substr($data, 0, $ranges[0]));
for ($i = 0, $n = count($ranges) - 1; $i < $n; $i += 2) {
// prefix + highlighted_text + suffix + regular_text
$substrings[] = $prefix;
$substrings[] = substr($data, $ranges[$i], -$ranges[$i + 1] - $ranges[$i]);
$substrings[] = $suffix;
$substrings[] = substr($data, -$ranges[$i + 1], $ranges[$i + 2] + $ranges[$i + 1]);
}
// join and return substrings
return implode('', $substrings);
}
// Example usage:
echo highlightKeywords("This is a test.\n", array("is"), '(', ')');
echo highlightKeywords("Classes are as hard as they say.\n", array("as", "class"), '(', ')');
// Output:
// Th(is) (is) a test.
// (Class)es are (as) hard (as) they say.
OP - something that's not clear in the question is whether $data can contain HTML from the get-go. Can you clarify this?
If $data can contain HTML itself, you are getting into the realms attempting to parse a non-regular language with a regular language parser, and that's not going to work out well.
In such a case, I would suggest loading the $data HTML into a PHP DOMDocument, getting hold of all of the textNodes and running one of the other perfectly good answers on the contents of each text block in turn.