Keyword highlight is highlighting the highlights in PHP preg_replace()

I have a small search engine doing its thing, and want to highlight the results. I thought I had it all worked out till a set of keywords I used today blew it out of the water.
The issue is that preg_replace() is looping through the replacements, and later replacements are replacing the text I inserted into previous ones. Confused? Here is my pseudo function:
public function highlightKeywords ($data, $keywords = array()) {
$find = array();
$replace = array();
$begin = "<span class=\"keywordHighlight\">";
$end = "</span>";
foreach ($keywords as $kw) {
$find[] = '/' . str_replace("/", "\/", $kw) . '/iu';
$replace[] = $begin . "\$0" . $end;
}
return preg_replace($find, $replace, $data);
}
OK, so it works when searching for "fred" and "dagg" but sadly, when searching for "class" and "lass" and "as" it strikes a real issue when highlighting "Joseph's Class Group"
Joseph's <span class="keywordHighlight">Cl</span><span <span c<span <span class="keywordHighlight">cl</span>ass="keywordHighlight">lass</span>="keywordHighlight">c<span <span class="keywordHighlight">cl</span>ass="keywordHighlight">lass</span></span>="keywordHighlight">ass</span> Group
How would I get the latter replacements to only work on the non-HTML components, but to also allow the tagging of the whole match? e.g. if I was searching for "cla" and "lass" I would want "class" to be highlighted in full as both the search terms are in it, even though they overlap, and the highlighting that was applied to the first match has "class" in it, but that shouldn't be highlighted.
Sigh.
I would rather use a PHP solution than a jQuery (or any client-side) one.
Note: I have tried to sort the keywords by length, doing the long ones first, but that means the cross-over searches do not highlight, meaning with "cla" and "lass" only part of the word "class" would highlight, and it still murdered the replacement tags :(
EDIT: I have messed about, starting with pencil & paper, and wild ramblings, and come up with some very unglamorous code to solve this issue. It's not great, so suggestions to trim/speed this up would still be greatly appreciated :)
public function highlightKeywords ($data, $keywords = array()) {
$find = array();
$replace = array();
$begin = "<span class=\"keywordHighlight\">";
$end = "</span>";
$hits = array();
foreach ($keywords as $kw) {
$offset = 0;
while (($pos = stripos($data, $kw, $offset)) !== false) {
$hits[] = array($pos, $pos + strlen($kw));
$offset = $pos + 1;
}
}
if ($hits) {
usort($hits, function($a, $b) {
if ($a[0] == $b[0]) {
return 0;
}
return ($a[0] < $b[0]) ? -1 : 1;
});
$thisthat = array(0 => $begin, 1 => $end);
for ($i = 0; $i < count($hits); $i++) {
foreach ($thisthat as $key => $val) {
$pos = $hits[$i][$key];
$data = substr($data, 0, $pos) . $val . substr($data, $pos);
for ($j = 0; $j < count($hits); $j++) {
if ($hits[$j][0] >= $pos) {
$hits[$j][0] += strlen($val);
}
if ($hits[$j][1] >= $pos) {
$hits[$j][1] += strlen($val);
}
}
}
}
}
return $data;
}

I've used the following to address this problem:
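<?php
// Matched keywords, stored so the second pass can restore them
$protected_matches = array();
// First pass: replace each match with a NUL-wrapped, 1-based index
function protect($matches) {
    global $protected_matches;
    return "\0" . array_push($protected_matches, $matches[0]) . "\0";
}
// Second pass: swap each placeholder for the highlighted original text
function restore($matches) {
    global $protected_matches;
    return '<span class="keywordHighlight">' .
        $protected_matches[$matches[1] - 1] . '</span>';
}
// $patterns and $target_string are supplied by the caller
$highlighted = preg_replace_callback('/\x0(\d+)\x0/', 'restore',
    preg_replace_callback($patterns, 'protect', $target_string));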
The first preg_replace_callback pulls out all matches and replaces them with nul-byte-wrapped placeholders; the second pass replaces them with the span tags.
Edit: Forgot to mention that $patterns was sorted by string length, longest to shortest.
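The two-pass placeholder idea described above can also be sketched without globals. This is only an illustration under some assumptions: the helper name highlightWithPlaceholders() is made up, the keywords are joined into a single alternation pattern (a variation on the pattern array used above), and the input is assumed to be valid UTF-8 with no NUL bytes of its own.

```php
<?php
// Hypothetical helper, not part of the original answer.
function highlightWithPlaceholders(string $text, array $keywords): string
{
    $keywords = array_filter($keywords, 'strlen'); // drop empty keywords
    if (!$keywords) {
        return $text;
    }
    // Longest keywords first, so "class" wins over "lass" and "as".
    usort($keywords, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    // preg_quote() escapes regex metacharacters and the "/" delimiter.
    $alternatives = array_map(function ($kw) {
        return preg_quote($kw, '/');
    }, $keywords);
    $pattern = '/' . implode('|', $alternatives) . '/iu';

    $protected = [];
    // Pass 1: swap every match for a NUL-wrapped index into $protected.
    $masked = preg_replace_callback($pattern, function ($m) use (&$protected) {
        $protected[] = $m[0];
        return "\0" . (count($protected) - 1) . "\0";
    }, $text);

    // Pass 2: swap each placeholder for the wrapped original text.
    return preg_replace_callback('/\x00(\d+)\x00/', function ($m) use ($protected) {
        return '<span class="keywordHighlight">' . $protected[(int) $m[1]] . '</span>';
    }, $masked);
}

echo highlightWithPlaceholders("Joseph's Class Group", ['class', 'lass', 'as']), "\n";
// Joseph's <span class="keywordHighlight">Class</span> Group
```

Note that this variation does not merge overlapping keywords: with "cla" and "lass" only "Cla" would be wrapped, because the regex engine consumes the first match before looking for the next one.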
Edit: another solution:
<?php
function highlightKeywords($data, $keywords = array(),
$prefix = '<span class="hilite">', $suffix = '</span>') {
$datacopy = strtolower($data);
$keywords = array_map('strtolower', $keywords);
$start = array();
$end = array();
foreach ($keywords as $keyword) {
$offset = 0;
$length = strlen($keyword);
while (($pos = strpos($datacopy, $keyword, $offset)) !== false) {
$start[] = $pos;
$end[] = $offset = $pos + $length;
}
}
if (!count($start)) return $data;
sort($start);
sort($end);
// Merge and sort start/end using negative values to identify endpoints
$zipper = array();
$i = 0;
$n = count($end);
while ($i < $n)
$zipper[] = count($start) && $start[0] <= $end[$i]
? array_shift($start)
: -$end[$i++];
// EXAMPLE:
// [ 9, 10, -14, -14, 81, 82, 86, -86, -86, -90, 99, -103 ]
// take 9, discard 10, take -14, take -14, create pair,
// take 81, discard 82, discard 86, take -86, take -86, take -90, create pair
// take 99, take -103, create pair
// result: [9,14], [81,90], [99,103]
// Generate non-overlapping start/end pairs
$a = array_shift($zipper);
$z = $x = null;
while ($x = array_shift($zipper)) {
if ($x < 0)
$z = $x;
else if ($z) {
$spans[] = array($a, -$z);
$a = $x;
$z = null;
}
}
$spans[] = array($a, -$z);
// Insert the prefix/suffix in the start/end locations
$n = count($spans);
while ($n--)
$data = substr($data, 0, $spans[$n][0])
. $prefix
. substr($data, $spans[$n][0], $spans[$n][1] - $spans[$n][0])
. $suffix
. substr($data, $spans[$n][1]);
return $data;
}

I had to revisit this subject myself today and wrote a better version of the above. I'll include it here. It's the same idea only easier to read and should perform better since it uses arrays instead of concatenation.
<?php
function highlight_range_sort($a, $b) {
$A = abs($a);
$B = abs($b);
if ($A == $B)
return $a < $b ? 1 : 0;
else
return $A < $B ? -1 : 1;
}
function highlightKeywords($data, $keywords = array(),
$prefix = '<span class="highlight">', $suffix = '</span>') {
$datacopy = strtolower($data);
$keywords = array_map('strtolower', $keywords);
// this will contain offset ranges to be highlighted
// positive offset indicates start
// negative offset indicates end
$ranges = array();
// find start/end offsets for each keyword
foreach ($keywords as $keyword) {
$offset = 0;
$length = strlen($keyword);
while (($pos = strpos($datacopy, $keyword, $offset)) !== false) {
$ranges[] = $pos;
$ranges[] = -($offset = $pos + $length);
}
}
if (!count($ranges))
return $data;
// sort offsets by abs(), positive
usort($ranges, 'highlight_range_sort');
// combine overlapping ranges by keeping lesser
// positive and negative numbers
$i = 0;
while ($i < count($ranges) - 1) {
if ($ranges[$i] < 0) {
if ($ranges[$i + 1] < 0)
array_splice($ranges, $i, 1);
else
$i++;
} else if ($ranges[$i + 1] < 0)
$i++;
else
array_splice($ranges, $i + 1, 1);
}
// create substrings
$ranges[] = strlen($data);
$substrings = array(substr($data, 0, $ranges[0]));
for ($i = 0, $n = count($ranges) - 1; $i < $n; $i += 2) {
// prefix + highlighted_text + suffix + regular_text
$substrings[] = $prefix;
$substrings[] = substr($data, $ranges[$i], -$ranges[$i + 1] - $ranges[$i]);
$substrings[] = $suffix;
$substrings[] = substr($data, -$ranges[$i + 1], $ranges[$i + 2] + $ranges[$i + 1]);
}
// join and return substrings
return implode('', $substrings);
}
// Example usage:
echo highlightKeywords("This is a test.\n", array("is"), '(', ')');
echo highlightKeywords("Classes are as hard as they say.\n", array("as", "class"), '(', ')');
// Output:
// Th(is) (is) a test.
// (Class)es are (as) hard (as) they say.
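For comparison, the same merge-the-ranges idea can be sketched with explicit [start, end] pairs instead of signed offsets. This is a rough sketch, not a drop-in replacement: highlightMerged() is a hypothetical name, offsets are byte offsets (so it assumes single-byte text), and ranges that merely touch are merged as well.

```php
<?php
// Hypothetical helper illustrating interval merging; byte offsets only.
function highlightMerged(string $data, array $keywords, string $prefix = '(', string $suffix = ')'): string
{
    // Collect a [start, end) pair for every case-insensitive hit.
    $ranges = [];
    foreach ($keywords as $kw) {
        if ($kw === '') {
            continue;
        }
        $offset = 0;
        while (($pos = stripos($data, $kw, $offset)) !== false) {
            $ranges[] = [$pos, $pos + strlen($kw)];
            $offset = $pos + 1; // also find hits that overlap themselves
        }
    }
    sort($ranges); // arrays compare element by element: by start, then by end

    // Merge ranges that overlap or touch.
    $merged = [];
    foreach ($ranges as [$start, $end]) {
        $last = count($merged) - 1;
        if ($last >= 0 && $start <= $merged[$last][1]) {
            $merged[$last][1] = max($merged[$last][1], $end);
        } else {
            $merged[] = [$start, $end];
        }
    }

    // Insert from the end so the earlier offsets stay valid.
    for ($i = count($merged) - 1; $i >= 0; $i--) {
        [$start, $end] = $merged[$i];
        $data = substr($data, 0, $start) . $prefix
            . substr($data, $start, $end - $start) . $suffix
            . substr($data, $end);
    }
    return $data;
}

echo highlightMerged("Joseph's Class Group", ['cla', 'lass']), "\n";
// Joseph's (Class) Group
```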

OP - something that's not clear in the question is whether $data can contain HTML from the get-go. Can you clarify this?
If $data can contain HTML itself, you are getting into the realm of attempting to parse a non-regular language with a regular-language parser, and that's not going to work out well.
In such a case, I would suggest loading the $data HTML into a PHP DOMDocument, getting hold of all of the text nodes and running one of the other perfectly good answers on the contents of each text block in turn.
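A minimal sketch of that DOMDocument approach might look like the following. It is an illustration only: highlightInHtml() is a hypothetical helper, it handles a single keyword, it requires the dom extension, and it assumes ASCII content so that byte offsets from stripos() line up with the character offsets that splitText() expects.

```php
<?php
// Hypothetical helper: wraps one keyword inside text nodes only,
// so tag names and attribute values are never touched.
function highlightInHtml(string $html, string $keyword): string
{
    if ($keyword === '') {
        return $html;
    }
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate imperfect markup
    // The wrapping <div> gives the fragment a single root element.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // The query result is a snapshot, so nodes created below are not revisited.
    $textNodes = $xpath->query('//text()[not(ancestor::script) and not(ancestor::style)]');
    foreach ($textNodes as $node) {
        while (($pos = stripos($node->nodeValue, $keyword)) !== false) {
            $match = $node->splitText($pos);              // $node keeps the text before the hit
            $rest = $match->splitText(strlen($keyword));  // $match now holds only the hit
            $span = $doc->createElement('span');
            $span->setAttribute('class', 'keywordHighlight');
            $match->parentNode->replaceChild($span, $match);
            $span->appendChild($match);
            $node = $rest;                                // keep searching after the hit
        }
    }

    // Serialize the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo highlightInHtml('<a class="class">My class</a>', 'class'), "\n";
```

Multiple keywords could be supported by running one of the range-merging functions on each text node's value instead of the simple stripos() loop.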

Related

Increasing the conditions in IF statement with loop PHP, Diagonal Check

I have homework to write a PHP program that takes a matrix and a list of keywords and finds the keywords in the matrix diagonally:
This is how the first matrix looks. There are many matrices with different keywords; here, for instance, the keywords are "beef" and "pork". I wrote a program for input that looks like these two examples:
The correct input for these two is here:
Here is my code, which works only in the first case. I don't know how to write a loop that adds more and more conditions instead of writing them all out with && in the if statement.
Please give me some tips on how to do it:
<?PHP
$input_line = trim(fgets(STDIN));
$input_array = explode(" ",$input_line);
// get mojiban
$mojiban = array();
for($ix=0; $ix<$input_array[0];$ix++){
$mojiban[] = str_split(trim(fgets(STDIN)));
}
//get words
$words = array();
for( $kx = 0; $kx < $input_array[1]; $kx++ ){
$words[] = trim(fgets(STDIN));
}
////check verticales
//get word
$wordCharArray = array();
for($dx=0; $dx < $input_array[1]; $dx++){
$wordCharArray = str_split($words[$dx]);
//looping and checking
for ( $line = 0; $line < $input_array[0] ; $line++) {
for ( $column = 0; $column < $input_array[0] ; $column++ ){
// for ($wordsNumber = 0; $wordsNumber < $input_array[1] ; $wordsNumber++){
if ($mojiban[$line][$column] == $wordCharArray[0] && $mojiban[$line+1][$column+1] == $wordCharArray[1]) {
echo ($column+1)." ".($line+1)."\n";
//}
}
}
}
} // closes the outer loop over the words
Thanks in Advance!
Here's a solution to your problem. Right now it only checks for diagonal matches; you can refactor it as you need.
There are many possible solutions. The one below, adapted to your code, first looks for a matching character in each row. If there is a match, it checks diagonally from the position where it found that first character.
<?php
// First matrix
// $searchWords = ["BEEF", "PORK"];
// $matrix = ["HPPLLM", "UROQUV", "FBSRZY", "DPEFKT", "GBBEUY", "EMCQFY"];
// Second matrix
$searchWords = ["ABA", "BAB"];
$matrix = ["ACEG", "HBDF", "EGAC", "DFHB"];
// Returns true if $search runs down-right along the diagonal that starts at
// ($startRow, $startColumn); both indexes are 0-based
function checkDiagonallyForString(string $search, array $matrix, int $startRow, int $startColumn)
{
    for ($i = 0, $n = strlen($search); $i < $n; $i++) {
        // Fail when the diagonal leaves the matrix or a character differs
        if (!isset($matrix[$startRow + $i][$startColumn + $i])
            || $matrix[$startRow + $i][$startColumn + $i] !== $search[$i]) {
            return false;
        }
    }
    return true;
}
// format: [word: [column, row]], both 1-based
$found = [];
foreach ($matrix as $row => $matrixString) {
    if (!count($searchWords)) {
        break;
    }
    foreach ($searchWords as $wordRow => $word) {
        // Try every occurrence of the word's first letter in this row
        $position = strpos($matrixString, $word[0]);
        while ($position !== false) {
            if (checkDiagonallyForString($word, $matrix, $row, $position)) {
                $found[$word] = [$position + 1, $row + 1];
                unset($searchWords[$wordRow]);
                break;
            }
            $position = strpos($matrixString, $word[0], $position + 1);
        }
    }
}
// Pretty output
echo "<pre>";
print_r($found);
foreach ($found as $word => $indexes) {
echo $word . " " . implode(" ", $indexes) . "\n";
}
The results are:
Note: This is a complete answer but before copying and pasting this please try to get a general idea and attempt to solve it yourself. This is not the only way of doing it.
We can use the implode approach suggested in the comments.
Code:
// First matrix
$searchWords = ["BEEF", "PORK"];
$matrix = ["HPPLLM", "UROQUV", "FBSRZY", "DPEFKT", "GBBEUY", "EMCQFY"];
// Second matrix
// $searchWords = ["ABA", "BAB"];
// $matrix = ["ACEG", "HBDF", "EGAC", "DFHB"];
$matrixStr = implode('', $matrix);
$matrixSize = count($matrix);
$positions = [];
foreach ($searchWords as $wordIdx => $word) {
$wordSize = strlen($word);
$found = false;
for ($y = 0; $y <= $matrixSize - $wordSize && !$found; $y++) {
for ($x = 0; $x <= $matrixSize - $wordSize && !$found; $x++) {
$allLettersOk = true;
for ($l = 0; $l < $wordSize; $l++) {
if ($matrixStr[$y * $matrixSize + $x + $l * $matrixSize + $l] != $word[$l]) {
$allLettersOk = false;
break;
}
}
if ($allLettersOk) {
$positions[$word] = [$x + 1, $y + 1];
$found = true;
}
}
}
}
foreach ($positions as $word => $indexes) {
echo $word . " " . implode(" ", $indexes) . "\n";
}
If you need an explanation of this code, leave a comment.
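The index arithmetic in the code above may be easier to follow in isolation. After implode(), the cell at column x and row y of a square matrix sits at offset y * size + x, so each diagonal step adds size + 1. The following sketch uses a hypothetical readDiagonal() helper, with 0-based coordinates, to show that mapping:

```php
<?php
$matrix = ["HPPLLM", "UROQUV", "FBSRZY", "DPEFKT", "GBBEUY", "EMCQFY"];
$flat = implode('', $matrix); // rows laid end to end
$size = count($matrix);       // assumes a square matrix

// Hypothetical helper: reads $length cells down-right from column $x, row $y.
function readDiagonal(string $flat, int $size, int $x, int $y, int $length): string
{
    $out = '';
    for ($l = 0; $l < $length; $l++) {
        // Row ($y + $l), column ($x + $l) in the flattened string
        $out .= $flat[($y + $l) * $size + ($x + $l)];
    }
    return $out;
}

echo readDiagonal($flat, $size, 1, 2, 4), "\n"; // BEEF
echo readDiagonal($flat, $size, 1, 0, 4), "\n"; // PORK
```

The caller has to keep $x + $length and $y + $length within $size, which is what the loop bounds $matrixSize - $wordSize take care of in the answer's code.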
Your $input_array contains your matrix, at each position you have a character, like this
$input_array = [
['H', 'P', 'P', 'L', 'L', 'M'],
['U', 'R', 'O', 'Q', 'U', 'V'],
['F', 'B', 'S', 'R', 'Z', 'Y'],
['D', 'P', 'E', 'F', 'K', 'T'],
['G', 'B', 'B', 'E', 'U', 'Y'],
['E', 'M', 'C', 'Q', 'F', 'Y'],
];
For the sake of simplicity, let's assume that you also have an array of arrays. I know you have strings, but string operations would make this solution difficult to read and we are mostly interested in the algorithm, so, for the sake of understandability I will not start from your actual data structure, but will answer questions if you have difficulty implementing this into your solution:
$words = [
['B', 'E', 'E', 'F'],
['P', 'O', 'R', 'K']
];
Notice that beef and pork are of the same length, but let's not assume that all words are of the same length.
//Computing row count so it can be reused
$rowCount = count($input_array);
//Computing the column count so it can be reused
//We assume that each row of the matrix has the same number of columns
$colCount = count($input_array[0]);
//Looping the rows
for ($row = 0; $row < $rowCount; $row++) {
//Looping the columns of the row
for ($column = 0; $column < $colCount; $column++) {
//Computing the max length of allowed words
$maxLength = min($rowCount - $row, $colCount - $column);
//Looping the words
for ($wIndex = 0; $wIndex < count($words); $wIndex++) {
//Storing the length of the current word for later use
$charCount = count($words[$wIndex]);
//We avoid checking for the presence of words that are
//Longer than the current position allows
if ($charCount <= $maxLength) {
//match is initialized with true and will be set false at
//the first mismatch. If no mismatch is found, then we know
//that the word is present at the current position
$match = true;
//Looping the word's characters and check for possible
//mismatches
//Notice that we stop the loop either at the first mismatch
//or at the last character if there is no mismatch
for ($cIndex = 0; $match && ($cIndex < $charCount); $cIndex++) {
//If the word character at cIndex offset mismatches
//the character of input array at the same offset, starting
//from the [$row][$column] position, then it mismatches
if ($words[$wIndex][$cIndex] !== $input_array[$row + $cIndex][$column + $cIndex]) $match = false;
}
if ($match) {
//Say the word
echo implode("", $words[$wIndex]) . " found at row " . ($row + 1) . ", column " . ($column + 1) . "\n";
}
}
}
}
}

How can I split the string in php?

I have a string like $str.
$str = "00016cam 321254300022cam 321254312315300020cam 32125433153";
I want to split it into an array like this. The number before 'cam' is the length of each substring.
$splitArray = ["00016cam 3212543", "00022cam 3212543123153", "00020cam 32125433153"];
I have tried the following code:
$lengtharray = array();
while ($str != null)
{
$sublength = substr($str, $star, $end);
$star += (int)$sublength; // echo $star."<br>"; // echo $sublength."<br>";
if($star == $total)
{
exit;
}
else
{
}
array_push($lengtharray, $star); // echo
print_r($lengtharray);
}
You can try this one-line solution:
$str = explode('**', preg_replace('/\**cam/', 'cam', $str)) ;
If your string doesn't contain stars then I'm afraid you need to write a simple parser that will:
1. take characters from the left until it reaches one that is not numeric
2. do a substr using that length
3. repeat the previous steps on the unconsumed part of the string
<?php
$input = "00016cam 321254300022cam 321254312315300020cam 32125433153";
function parseArray(string $input)
{
$result = [];
while ($parsed = parseItem($input)) {
$input = $parsed['rest'];
$result[] = $parsed['item'];
}
return $result;
}
function parseItem(string $input)
{
$sLen = strlen($input);
$len = '';
$pos = 0;
while ($pos < $sLen && is_numeric($input[$pos])) {
$len .= $input[$pos];
$pos++;
}
if ((int) $len == 0) {
return null;
}
return [
'rest' => substr($input, $len),
'item' => substr($input, 0, $len)
];
}
var_dump(parseArray($input));
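If the length prefix always has the same width, the parser can be reduced to a single loop. The sketch below assumes a fixed five-digit prefix that counts itself as part of the length, which matches the sample string but is not stated in the question; splitLengthPrefixed() is a hypothetical name.

```php
<?php
// Hypothetical helper; the fixed prefix width of 5 is an assumption.
function splitLengthPrefixed(string $input, int $prefixWidth = 5): array
{
    $items = [];
    $offset = 0;
    $total = strlen($input);
    while ($offset + $prefixWidth <= $total) {
        // The prefix holds the length of the whole item, prefix included.
        $length = (int) substr($input, $offset, $prefixWidth);
        if ($length < $prefixWidth || $offset + $length > $total) {
            break; // malformed or truncated item: stop rather than loop forever
        }
        $items[] = substr($input, $offset, $length);
        $offset += $length;
    }
    return $items;
}

print_r(splitLengthPrefixed("00016cam 321254300022cam 321254312315300020cam 32125433153"));
```

Unlike the character-by-character parser, this version still works when the payload itself begins with a digit.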
This code works for me. Hope it helps.
$str = "**00016**cam 3212543**00022**cam 3212543123153**00020**cam 32125433153";
$arr = explode("**", $str);
for ($i=1; $i < sizeof($arr); $i=$i+2)
$arr_final[]=$arr[$i].$arr[$i+1];

Creating licence plates from different but similar strings

I need help creating a licence plate (6 characters long) from several strings of equal or unequal length.
Example 1:
$str1 = "YE37";
$str2 = "TE37";
$str3 = "LYTE";
When I combine them, it should give me "LYTE37". I must use all of them to formulate a plate. I can find that the longest common sequence between $str1 and $str2 is "E37", but I am unsure whether "Y" or "T" comes first (i.e., "YTE37" or "TYE37"). Then I can combine the result with $str3 using the longest common sequence ("YTE"), which is supposed to give me "LYTE37".
Example 2: "YLF3", "EYLF" and "YLF37" should give me "EYLF37".
I use the following function that finds the longest common sequence
$string_1="YE37";
$string_2="TE37";
$S =get_longest_common_subsequence($string_1, $string_2); // $S is "E37"
function get_longest_common_subsequence($string_1, $string_2)
{
$string_1_length = strlen($string_1);
$string_2_length = strlen($string_2);
$return = '';
if ($string_1_length === 0 || $string_2_length === 0)
{
// No similarities
return $return;
}
$longest_common_subsequence = array();
// Initialize the CSL array to assume there are no similarities
$longest_common_subsequence = array_fill(0, $string_1_length, array_fill(0, $string_2_length, 0));
$largest_size = 0;
for ($i = 0; $i < $string_1_length; $i++)
{
for ($j = 0; $j < $string_2_length; $j++)
{
// Check every combination of characters
if ($string_1[$i] === $string_2[$j])
{
// These are the same in both strings
if ($i === 0 || $j === 0)
{
// It's the first character, so it's clearly only 1 character long
$longest_common_subsequence[$i][$j] = 1;
}
else
{
// It's one character longer than the string from the previous character
$longest_common_subsequence[$i][$j] = $longest_common_subsequence[$i - 1][$j - 1] + 1;
}
if ($longest_common_subsequence[$i][$j] > $largest_size)
{
// Remember this as the largest
$largest_size = $longest_common_subsequence[$i][$j];
// Wipe any previous results
$return = '';
// And then fall through to remember this new value
}
if ($longest_common_subsequence[$i][$j] === $largest_size)
{
// Remember the largest string(s)
$return = substr($string_1, $i - $largest_size + 1, $largest_size);
}
}
// Else, $CSL should be set to 0, which it was already initialized to
}
}
// Return the list of matches
return $return;
}
I need an algorithm that uses these strings and creates a licence plate.
Could this be the algorithm you are looking for?
<?php
$str1 = "YE37";
$str2 = "TE37";
$str3 = "LYTE";
$strA = "YLF3";
$strB = "EYLF";
$strC = "YLF37";
function generatePlateNumber($str1, $str2, $str3) {
$plateNumber = '';
$arr = array($str1, $str2, $str3);
$arrStr = array();
foreach($arr as $str){
if(!preg_match("#\d#", $str)){
$arrStr[] = $str;
}
}
foreach($arr as $str){
if(preg_match("#\d#", $str)){
$arrStr[] = $str;
}
}
$chars = array_merge(str_split($arrStr[0]),
str_split($arrStr[1]),
str_split($arrStr[2]) );
$alphabets = [];
$numbers = [];
foreach($chars as $char){
if(is_numeric($char)){
$numbers[] = $char;
}else{
$alphabets[] = $char;
}
}
$alphabets = array_unique($alphabets);
$numbers = array_unique($numbers);
// BUILD THE PLATE NUMBER:
$plateNumber .= implode($alphabets) . implode($numbers);
return $plateNumber;
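}
// Example usage with the strings from the question
echo generatePlateNumber($str1, $str2, $str3) . "\n"; // LYTE37
echo generatePlateNumber($strA, $strB, $strC) . "\n"; // EYLF37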

replace all matches in string after given position without regular expressions

I have a string abcxdefxghix. I wish to remove all "x"s except the first one. I can easily find the position of the first "x" using strpos(), so I wish to remove all "x"s after that position. str_replace() performs a replacement of a given string with another, but doesn't allow a start position. substr_replace() gives a start position, but doesn't have the search parameter. I realize this can be done using preg_replace(), but it seems like it should also be possible without regular expressions (or without some crazy split/replace/assemble strategy).
You could do something like this:
list($first,$remainder) = explode($searchString,$subjectString,2);
$remainder = str_replace($searchString,$replacementString,$remainder);
$resultString = $first.$searchString.$remainder;
I'd most likely do it the old-fashioned way:
$index = strpos($input, $needle);
if ($index !== false) {
    // Keep everything up to and including the first match
    $keep = $index + strlen($needle);
    $input = substr($input, 0, $keep) .
        str_replace($needle, $replacement, substr($input, $keep));
}
I thought there had to be an easier way, but evidently there isn't. My homespun function was negligibly faster, but if there is an "old-fashioned way", it is probably the way to go.
function replace_all_but_first_1($search,$replace,$subject){
$pos=strpos($subject,$search);
return ($pos===false)?$subject:substr($subject,0,$pos+1).str_replace($search,$replace,substr($subject, $pos));
}
function replace_all_but_first_2($search,$replace,$subject){
$index = strpos($subject, $search);
if ($index !== false) {
$subject = substr($subject, 0, $index + 1).
str_replace($search, $replace, substr($subject, $index + 1));
}
return $subject;
}
function replace_all_but_first_3($search,$replace,$subject){
list($first,$remainder) = explode($search,$subject,2);
$remainder = str_replace($search,$replace,$remainder);
$resultString = $first.$search.$remainder;
return $resultString;
}
function replace_all_but_first($search,$replace,$subject){
echo('Replace "'.$search.'" with "'.$replace.'" in "'.$subject.'"<br><br>');
echo('replace_all_but_first_1: '.replace_all_but_first_1($search,$replace,$subject)."<br>");
echo('replace_all_but_first_2: '.replace_all_but_first_2($search,$replace,$subject)."<br>");
echo('replace_all_but_first_3: '.replace_all_but_first_3($search,$replace,$subject)."<br>");
$time=microtime(true);
for ($i = 1; $i <= 100000; $i++) {$x=replace_all_but_first_1($search,$replace,$subject);}
echo('replace_all_but_first_1 '.(microtime(true)-$time).'<br>');
$time=microtime(true);
for ($i = 1; $i <= 100000; $i++) {$x=replace_all_but_first_2($search,$replace,$subject);}
echo('replace_all_but_first_2 '.(microtime(true)-$time).'<br>');
$time=microtime(true);
for ($i = 1; $i <= 100000; $i++) {$x=replace_all_but_first_3($search,$replace,$subject);}
echo('replace_all_but_first_3 '.(microtime(true)-$time).'<br>');
echo('<br><br><br>');
}
replace_all_but_first('x','','abcxdefxghix');
replace_all_but_first('x','','xabcxdefxghix');
replace_all_but_first('z','','abcxdefxghix');
Replace "x" with "" in "abcxdefxghix"
replace_all_but_first_1: abcxdefghi
replace_all_but_first_2: abcxdefghi
replace_all_but_first_3: abcxdefghi
replace_all_but_first_1 0.18973803520203
replace_all_but_first_2 0.19031405448914
replace_all_but_first_3 0.19151902198792
Replace "x" with "" in "xabcxdefxghix"
replace_all_but_first_1: xabcdefghi
replace_all_but_first_2: xabcdefghi
replace_all_but_first_3: xabcdefghi
replace_all_but_first_1 0.18725895881653
replace_all_but_first_2 0.19358086585999
replace_all_but_first_3 0.19228482246399
Replace "z" with "" in "abcxdefxghix"
replace_all_but_first_1: abcxdefxghix
replace_all_but_first_2: abcxdefxghix
replace_all_but_first_3: abcxdefxghixz
replace_all_but_first_1 0.074465036392212
replace_all_but_first_2 0.075581073760986
replace_all_but_first_3 0.71253705024719
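In the last run, replace_all_but_first_3 returns abcxdefxghixz: when the search string is not found, explode() returns a single element, so the function appends the search string to the unchanged subject. A guarded variant might look like this sketch (replaceAllButFirstExplode() is a hypothetical name, not one of the benchmarked functions):

```php
<?php
// Hypothetical guarded version of the explode-based approach.
function replaceAllButFirstExplode(string $search, string $replace, string $subject): string
{
    if ($search === '') {
        return $subject; // explode() does not accept an empty delimiter
    }
    $parts = explode($search, $subject, 2);
    if (count($parts) < 2) {
        return $subject; // search string not found: nothing to replace
    }
    // $parts[0] is everything before the first match, $parts[1] everything after it.
    return $parts[0] . $search . str_replace($search, $replace, $parts[1]);
}

echo replaceAllButFirstExplode('x', '', 'abcxdefxghix'), "\n";  // abcxdefghi
echo replaceAllButFirstExplode('x', '', 'xabcxdefxghix'), "\n"; // xabcdefghi
echo replaceAllButFirstExplode('z', '', 'abcxdefxghix'), "\n";  // abcxdefxghix
```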

Reverse letters in each word of a string without using native splitting or reversing functions [duplicate]

This task has already been asked/answered, but I recently had a job interview that imposed some additional challenges to demonstrate my ability to manipulate strings.
Problem: How to reverse words in a string? You can use strpos(), strlen() and substr(), but not other very useful functions such as explode(), strrev(), etc.
Example:
$string = "I am a boy";
Answer:
I ma a yob
Below is my working coding attempt that took me 2 days [sigh], but there must be a more elegant and concise solution.
Intention:
1. get number of words
2. based on word count, grab each word and store into array
3. loop through array and output each word in reverse order
Code:
$str = "I am a boy";
echo reverse_word($str) . "\n";
function reverse_word($input) {
//first find how many words in the string based on whitespace
$num_ws = 0;
$p = 0;
while(strpos($input, " ", $p) !== false) {
$num_ws ++;
$p = strpos($input, ' ', $p) + 1;
}
echo "num ws is $num_ws\n";
//now start grabbing word and store into array
$p = 0;
for($i=0; $i<$num_ws + 1; $i++) {
$ws_index = strpos($input, " ", $p);
//if no more ws, grab the rest
if($ws_index === false) {
$word = substr($input, $p);
}
else {
$length = $ws_index - $p;
$word = substr($input, $p, $length);
}
$result[] = $word;
$p = $ws_index + 1; //move onto first char of next word
}
print_r($result);
//append reversed words
$str = '';
for($i=0; $i<count($result); $i++) {
$str .= reverse($result[$i]) . " ";
}
return $str;
}
function reverse($str) {
$a = 0;
$b = strlen($str)-1;
while($a < $b) {
swap($str, $a, $b);
$a ++;
$b --;
}
return $str;
}
function swap(&$str, $i1, $i2) {
$tmp = $str[$i1];
$str[$i1] = $str[$i2];
$str[$i2] = $tmp;
}
$string = "I am a boy";
$reversed = "";
$tmp = "";
for($i = 0; $i < strlen($string); $i++) {
if($string[$i] == " ") {
$reversed .= $tmp . " ";
$tmp = "";
continue;
}
$tmp = $string[$i] . $tmp;
}
$reversed .= $tmp;
print $reversed . PHP_EOL;
>> I ma a yob
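For reference, here is a sketch that stays strictly within the functions the interview allowed (strpos(), strlen() and substr()). The function name reverseWords() is made up for this example, and only a single space is treated as a word boundary.

```php
<?php
// Hypothetical helper using only strpos(), strlen() and substr().
function reverseWords(string $input): string
{
    $out = '';
    $start = 0;
    $len = strlen($input);
    while ($start <= $len) {
        // Find the end of the current word: the next space, or the end of the string.
        $space = strpos($input, ' ', $start);
        $end = ($space === false) ? $len : $space;
        // Append the word's characters from last to first.
        for ($i = $end - 1; $i >= $start; $i--) {
            $out .= substr($input, $i, 1);
        }
        if ($space === false) {
            break; // that was the last word
        }
        $out .= ' ';
        $start = $space + 1;
    }
    return $out;
}

echo reverseWords("I am a boy"), "\n"; // I ma a yob
```

Because each space is copied straight to the output, runs of spaces and leading or trailing spaces are preserved.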
Whoops! Mis-read the question. Here you go (Note that this will split on all non-letter boundaries, not just space. If you want a character not to be split upon, just add it to $wordChars):
function revWords($string) {
    // We need to find word boundaries
    $wordChars = 'abcdefghijklmnopqrstuvwxyz';
    $buffer = '';
    $return = '';
    $len = strlen($string);
    $i = 0;
    while ($i < $len) {
        $chr = $string[$i];
        // ord() gives the byte value; applying & to the string itself would not test its bits
        $byte = ord($chr);
        if (($byte & 0xC0) == 0xC0) {
            // UTF-8 lead byte
            if (($byte & 0xF0) == 0xF0) {
                // 4-byte sequence
                $chr .= substr($string, $i + 1, 3);
                $i += 3;
            } elseif (($byte & 0xE0) == 0xE0) {
                // 3-byte sequence
                $chr .= substr($string, $i + 1, 2);
                $i += 2;
            } else {
                // 2-byte sequence
                $i++;
                $chr .= $string[$i];
            }
        }
        if (stripos($wordChars, $chr) !== false) {
            $buffer = $chr . $buffer;
        } else {
            $return .= $buffer . $chr;
            $buffer = '';
        }
        $i++;
    }
    return $return . $buffer;
}
Edit: Now it's a single function, and stores the buffer naively in reversed notation.
Edit2: Now handles UTF8 characters (just add "word" characters to the $wordChars string)...
My answer is to count the string length, split the letters into an array, and then loop over it backwards. This is also a good way to check whether a word is a palindrome. It can only be used for regular strings and numbers.
preg_split can be changed to explode() as well.
/**
* Code snippet to reverse a string (LM)
*/
$words = array('one', 'only', 'apple', 'jobs');
foreach ($words as $d) {
    $strlen = strlen($d);
    $splits = preg_split('//', $d, -1, PREG_SPLIT_NO_EMPTY);
    $reverse = '';
    // Walk the letters from last to first
    for ($i = $strlen - 1; $i >= 0; $i = $i - 1) {
        $reverse .= $splits[$i];
    }
    echo "Regular: {$d}".PHP_EOL;
    echo "Reverse: {$reverse}".PHP_EOL;
    echo "-----".PHP_EOL;
}
Without using any function.
$string = 'I am a boy';
$newString = '';
$temp = '';
$i = 0;
while (isset($string[$i]))
{
if($string[$i] == ' ') {
$newString .= $temp . ' ';
$temp = '';
}
else {
$temp = $string[$i] . $temp;
}
$i++;
}
$newString .= $temp . ' ';
echo $newString;
Output: I ma a yob
