PHP: how to get numeric value of a string with multiple numbers?

Let's say in PHP I have a string-variable : "This takes between 5 and 7 days"
I need to store some sensible information about the time it takes in an integer.
I'm satisfied if the result would be 5.
I tried stripping non-numeric characters, but end up with 57 then.
How can this be done in a better way?

Use preg_match to match the first digit group using a regex:
$subject = 'This takes between 5 and 7 days';
if (preg_match('/\d+/', $subject, $matches)) {
echo 'First number is: ' . $matches[0];
} else {
echo 'No number found';
}
Using preg_match_all you could match all digit groups (5 and 7 in this example):
$subject = 'This takes between 5 and 7 days';
if (preg_match_all('/\d+/', $subject, $matches)) {
echo 'Matches found:<br />';
print_r($matches);
} else {
echo 'No number found';
}
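A minimal sketch building on preg_match_all: it collects every digit group as an integer so the caller can pick the first, smallest or largest value. The helper name extractNumbers is made up for illustration and is not part of the answer above.

```php
<?php
// Hypothetical helper: return every digit group in $text as an integer.
function extractNumbers(string $text): array
{
    if (!preg_match_all('/\d+/', $text, $matches)) {
        return []; // no digits found (or the regex failed)
    }
    return array_map('intval', $matches[0]);
}

$numbers = extractNumbers('This takes between 5 and 7 days');
// $numbers is [5, 7]; pick whichever value suits the use case.
echo $numbers ? min($numbers) : 'No number found';
```

Returning an array rather than a single value leaves the choice (first, minimum, average) to the caller.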

If you want to quantify the numbers appropriately, I would suggest the following:
<?php
$subject = "This takes between 5 and 7 days";
// The capture group holds the number directly in front of the unit.
$dayspattern = '/(\d+(?:\.\d+)?) ?days?/';
$hourspattern = '/(\d+(?:\.\d+)?) ?hours?/';
$hours = -1; // -1 means no duration was found
if (preg_match($dayspattern, $subject, $matches)) {
    // For the example subject this matches "7 days", so $hours becomes 168.
    $hours = (float) $matches[1] * 24;
} elseif (preg_match($hourspattern, $subject, $matches)) {
    $hours = (float) $matches[1];
}
?>
You would need to consider:
what happens when no numbers are found, or numbers are given as text instead.
what happens when someone says '1 day and 5 hours'
Hopefully this gives you enough information to do the rest yourself.
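One possible way to handle the '1 day and 5 hours' case is to match every number-plus-unit pair and add them up. This is only a sketch: the function name durationInHours is hypothetical, numbers written as words are not handled, and in a range such as "5 and 7 days" only the number next to the unit (7) is counted.

```php
<?php
// Hypothetical helper: total hours in strings such as "1 day and 5 hours".
// Returns null when no "<number> day(s)" or "<number> hour(s)" is present.
function durationInHours(string $text): ?float
{
    $pattern = '/(\d+(?:\.\d+)?)\s*(day|hour)s?\b/i';
    if (!preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        return null;
    }
    $hours = 0.0;
    foreach ($matches as $match) {
        // $match[1] is the number, $match[2] is the unit without the plural "s".
        $factor = strtolower($match[2]) === 'day' ? 24 : 1;
        $hours += (float) $match[1] * $factor;
    }
    return $hours;
}

var_dump(durationInHours('1 day and 5 hours'));
var_dump(durationInHours('soon'));
```

Returning null instead of -1 makes the "nothing found" case harder to confuse with a real duration.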

If you would like to extract a price, e.g. €7.50 or $50,
this regular expression did the job for me:
preg_match('/\d+\.?\d*/', $price_str, $matches);
echo $matches[0];
This outputs 7.50 or 50.

Separate the subject into words by splitting the string into an array. Then loop over the words and keep only the numeric ones, casting each to an int. A try-catch block does not help here, because casting a non-numeric string to int in PHP does not throw; it just yields 0.
$words = explode(' ', $subject);
$numbers = array();
foreach ($words as $word) {
    // is_numeric() filters out ordinary words such as "days"
    if (is_numeric($word)) {
        $numbers[] = (int) $word;
    }
}
The $numbers array now contains only the numbers, and $numbers[0] is the first one.

Related

PHP - get total number of array items with a specific number sequence in value

So, basically I'm trying to count the number of landline phone numbers in a list of both landlines and mobile phone numbers $mobile_list (071234567890,02039989435,0781...)
$mobile_array = explode(",",$mobile_list); // turn into an array
$landlines = array_count_values($mobile_array); // create count variable
echo $landlines["020..."]; // print the number of numbers
So, I get the basic count specific elements function, but I don't see where I can specify if an element 'starts with' or 'contains' a sequence. With the above you can only specify an exact phone number (obviously not useful).
Any help would be great!
I don't see any reason to first explode the string to an array, and then check each array item.
That is a complete waste of performance!
I suggest using preg_match_all and match with word boundary "020".
That means the "word" has to start with 020.
$mobile_list = "071234567890,02039989435,0781,020122,123020";
preg_match_all("/\b020\d+\b/", $mobile_list, $m);
var_dump($m);
echo count($m[0]); // 2
https://3v4l.org/ucSDm
The lightest and fastest method I have found is to explode on ",020".
Item 0 of the returned array is ambiguous: we don't know whether it is a 020 number, so I have to check it separately.
$temp = explode(",020", $mobile_list);
$cnt = count($temp);
if(substr($temp[0],0,3) != "020") $cnt--;
echo $cnt;
A small scale test shows this as the fastest method.
https://3v4l.org/rD54d
You can use array_reduce() to count the occurrences of strings beginning with '020'
$mobile_list = "02039619491,07143502893,02088024526,07351261813,02095694897";
$mobile_array = explode(',', $mobile_list);
function landlineCount($carry, $item)
{
if (substr($item, 0, 3) === '020') {
return $carry + 1;
}
return $carry;
}
$count = array_reduce($mobile_array, 'landlineCount', 0); // start at 0 so an empty result is 0, not null
echo $count;
prints 3
I'm sure the OP finished what they needed to do hours ago, but for fun here is a faster way to count the landlines.
I hadn't spotted that the question's original code was exploding the string.
That isn't necessary: you can just count the substrings with substr_count(). Counting ",020" would miss a first entry, which has no comma before it, so I check for that separately with substr().
If you need the total count of all numbers, you can count the commas with substr_count() again and add one.
$count = substr($mobile_list, 0, 3) === '020' ? 1 : 0;
$count += substr_count($mobile_list, ",020");
$totalCount = substr_count($mobile_list, ",") + 1;
echo $count;
echo $totalCount;
Here is the benchmark, run 1000 times to get an average.
https://3v4l.org/Sma66
Use the array_filter() or preg_grep() functions to find all numbers that contain or start with a given number sequence.
Note: other answers offer an easier and better solution if you only need values that start with a given sequence.
Because you mentioned "but I don't see where I can specify if an element 'starts with' or 'contains' a sequence", my code assumes that you want to find any occurrence of the sequence, not only at the start of each item.
$mobile_list = '02000, 02032435, 039002300, 00305600';
$mobile_array = explode(",",$mobile_list); // turn into an array
$sequence = '020'; // the sequence to look for
function filter_phone_numbers($mobile_array, $sequence){
return array_filter($mobile_array, function ($item) use ($sequence) {
if (stripos($item, $sequence) !== false) {
return true;
}
return false;
});
}
$filtered_items = array_unique (filter_phone_numbers($mobile_array, $sequence)); //use array_unique in case we find same number that both contains or starts with sequence
echo count($filtered_items);
Or with preg_grep():
$mobile_list = '02000, 02032435, 039002300, 00305600';
$mobile_array = explode(",",$mobile_list); // turn into an array
$sequence = preg_quote('020', '~'); // the sequence to look for, escaped for use in the regex
function grep_phone_numbers($mobile_array, $sequence){
return preg_grep('~' . $sequence . '~', $mobile_array);
}
//use array_unique in case we find same number that both contains or starts with sequence
$filtered_items = array_unique(grep_phone_numbers($mobile_array, $sequence));
echo count($filtered_items);
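If only the 'starts with' case matters, the same preg_grep() idea can be anchored to the start of each entry. A minimal sketch, assuming a comma-separated list as in the question; the function name countWithPrefix is hypothetical.

```php
<?php
// Hypothetical helper: count list entries that begin with $prefix.
function countWithPrefix(string $list, string $prefix): int
{
    // Trim each entry so a list written as "a, b, c" works as well as "a,b,c".
    $numbers = array_map('trim', explode(',', $list));
    // ^ anchors the match to the start of the entry.
    $pattern = '~^' . preg_quote($prefix, '~') . '~';
    return count(preg_grep($pattern, $numbers));
}

echo countWithPrefix('02000, 02032435, 039002300, 00305600', '020');
```

Dropping the ^ anchor turns this back into a 'contains' check.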
I recommend doing this in the database. A database is designed to manage data and can do so far more efficiently than PHP. You can put the condition into a query and get the result you want in one go:
SELECT * FROM phone_numbers WHERE number LIKE '020%'
If you get the data from the database anyway, the LIKE adds a little time to the query, but less than it takes PHP to loop, call strpos and store the results. Also, because you return a smaller dataset, fewer resources are used.

Split a comma separated string but only split by a comma

Hi I have a long string
0BV,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD,1AG,1AKF.......
I want to show it on a page split into substrings, like this:
0BV,0BW,100,102,108,112,146
163191,192,193,1D94,19339
1A1,1AA,1AE,1AFD,1AG,1AKF
What I want to do is create substrings of about 100 characters from the string, but if the 100th character is not a comma, I want to find the next comma in the string and split there.
I tried array_chunk() to split by item count, but since the resulting substring lengths differ, the output looks uneven on the page.
$db_ocode = $row["option_code"];
$exclude_options_array = explode(",",$row["option_code"]);
$exclude_options_chunk_array = array_chunk($exclude_options_array,25);
$exclude_options_string = '';
foreach($exclude_options_chunk_array as $exclude_options_chunk)
{
$exclude_options_string .= implode(",",$exclude_options_chunk);
$exclude_options_string .= "</br>";
}
Please help . thanks in advance
Take the string, set the cutoff position. If that position does not contain a comma then find the first comma after that position and cut off there. Simple
<?php
$string="0BV,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD";
$cutoff=30;
if($string[$cutoff]!=",")
$cutoff=strpos($string,",",$cutoff);
echo substr($string,0,$cutoff);
(.{99})(?=,),|([^,]*),
Instead of splitting, you can grab the captures, which is much easier. See the demo for 20 characters.
https://regex101.com/r/sH8aR8/37
Using Hanky Panky's answer I was able to put together a complete solution to my problem. Thank you very much, Hanky Panky. If my code is not efficient, please edit it.
$string="0BV,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD";
for($start=0;$start<strlen($string);) {
$cutoff=30;
if(isset($string[$start+$cutoff]) && $string[$start+$cutoff]!=",")
{
$cutoff=strpos($string,",",$start+$cutoff);
}
else if(($start+$cutoff) >= strlen($string))
{
$cutoff = strlen($string);
}
else if($start >= 30)
{
$cutoff = $start + $cutoff;
}
echo substr($string,$start,$cutoff-$start)."\n";
$start=$cutoff+1;
}
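The same loop can be wrapped in a function that also copes with the case where no further comma exists. This is a sketch under the assumption that each line should be at least a given number of characters and break only at a comma; the name splitAtCommas is made up for illustration.

```php
<?php
// Hypothetical helper: split $list into lines of at least $minLength
// characters, breaking only at a comma.
function splitAtCommas(string $list, int $minLength): array
{
    $lines = [];
    $start = 0;
    $total = strlen($list);
    while ($start < $total) {
        $probe = $start + $minLength;
        // Look for the first comma at or after the probe position.
        $comma = $probe < $total ? strpos($list, ',', $probe) : false;
        if ($comma === false) {
            // No comma left: the remainder is the last line.
            $lines[] = substr($list, $start);
            break;
        }
        $lines[] = substr($list, $start, $comma - $start);
        $start = $comma + 1; // skip the comma itself
    }
    return $lines;
}

$string = "0BV,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD";
echo implode("\n", splitAtCommas($string, 30));
```

Checking strpos() against false avoids the edge case in which the last chunk contains no comma after the cutoff.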
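In case you need it in Python:
line_no = 1
text = '0BVAa,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD,1AG,1AKF'
for count, char in enumerate(text, start=1):
    print(char, end='')
    # break the line at the first comma after each multiple of 10 characters
    if count / 10 >= line_no and char == ',':
        print()
        line_no += 1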

Simple regex trying to extract 3 or 4 numbers from "dirty" time string

Despite some help earlier on I am still floundering in regex problems and now in array problems.
I am trying to allow users to put time in as 205pm 1405 14:05 2.05 pm and so on.
Previously I had times stored as 14:05 (standard mySQL TIME format) but users were not liking that but if I convert to 2:05 pm then, when the updated values are entered (in similar format), that obviously breaks the database.
I have NO TROUBLE going 14:05 to 2:05 pm but I am having a nightmare going in the opposite direction.
I have fudged things a bit with a cascading IF statement to get the string length but I have spent literally hours trying to get at the output.
IE if I get 2-05 pm, to start off with I just want to get 205.
Here is my atrocious code:
if ($_POST['xxx']='yyy')
{
$stuff=$_POST['stuff'];
$regex='/^\d\D*\d\D*\d\D*\d\D*\d\D*$/';
if (preg_match($regex, $stuff, $matches)) {echo " More than 4 digits. This cannot be a time."; }
else{
$regex='/^\d\D*\d\D*\d\D*\d\D*$/';
if (preg_match($regex, $stuff, $matches)) {echo " >>4 digits";}
else{
$regex='/^\d\D*\d\D*\d\D*$/';
if (preg_match($regex, $stuff, $matches)) {echo " >>3 digits";}
else{
$regex='/^\d\D*\d\D*$/';
if (preg_match($regex, $stuff, $matches)) {echo " Less than 3 digits. This cannot be a time.";}
}
}
}
}
debug ($matches,"mat1");
$NEWmatches = implode($matches);
debug ($matches,"matN1");
preg_match_all('!\d+!', $NEWmatches, $matches);
debug ($matches,"mat2");
$matches = implode($matches);
debug ($matches,"mat3");
echo "<br> Matches $matches"; /// I hoped to get the digits only here
?>
Thanks for any help.
$times = array(
'205pm', '1405', '4:05', '2.05 pm'
);
foreach($times as $time)
{
// parsing string into array with 'h' - hour, 'm' - minutes and 'ap' keys
preg_match('/(?P<h>\d{1,2})\D?(?P<m>\d{2})\s*(?P<ap>(a|p)m)?/i', $time, $matches);
// construction below is not necessary, it just removes extra values from array
$matches = array_intersect_key($matches,
array_flip(array_filter(array_keys($matches), 'is_string')));
// output the result
var_dump($matches);
}
If you are passing that string to strtotime, it is easier to just reformat it into a format strtotime understands, like this:
$times = array(
'205pm', '1405', '4:05', '2.05 pm'
);
var_dump(preg_replace('/(\d{1,2})\D?(\d{2})(\s*(a|p)m)?/i', '$1:$2$3', $times));
PS: for more complex situations I would suggest reformatting the time and doing something like this; otherwise the regexp can become a nightmare.
$times = array(
'9 pm', '205pm', '1405', '4:05', '2.05 pm'
);
$times = preg_replace('/(\d{1,2})\D?(\d{2})(\s*(a|p)m)?/i', '$1:$2$3', $times);
foreach($times as $time)
{
$date = strtotime($time);
if ($date === false) { echo 'Unable to parse the time ' . $time . "\n"; continue; }
$hour = date('G', $date);
$minutes = date('i', $date);
echo $hour . " : " . $minutes . "\n";
}
For your given example "2-05 or 14:05" you can use this RegEx:
^(?<HOUR>[0-9]{1,2})\s{0,}((-|:|\.)\s{0,})?(?<MIN>[0-9]{2})\s{0,}(?<MODE>(a|p)m)?$
HOUR will hold the first one or two digits of the string, MIN will always hold the last two digits, and MODE will hold am or pm if present.
You can then combine them into a single string at the end. You could also just strip the separator with a simple Replace("-","").
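As a rough illustration of how the named groups could be combined in PHP, here is a sketch that converts the input to 24-hour HH:MM. The function name normaliseTime and the validation rules are assumptions, and the pattern is a slightly condensed variant of the regex above.

```php
<?php
// Hypothetical helper: normalise inputs such as "205pm", "14:05" or
// "2-05 pm" to 24-hour "HH:MM". Returns null for anything it cannot parse.
function normaliseTime(string $input): ?string
{
    $pattern = '/^(?<HOUR>[0-9]{1,2})\s*(?:[-:.]\s*)?(?<MIN>[0-9]{2})\s*(?<MODE>[ap]m)?$/i';
    if (!preg_match($pattern, trim($input), $m)) {
        return null;
    }
    $hour = (int) $m['HOUR'];
    $minute = (int) $m['MIN'];
    $mode = strtolower($m['MODE'] ?? ''); // MODE is absent for inputs like "1405"
    if ($mode !== '' && ($hour < 1 || $hour > 12)) {
        return null; // e.g. "14:05 pm" is not a valid 12-hour time
    }
    if ($mode === 'pm' && $hour < 12) {
        $hour += 12;
    } elseif ($mode === 'am' && $hour === 12) {
        $hour = 0;
    }
    if ($hour > 23 || $minute > 59) {
        return null;
    }
    return sprintf('%02d:%02d', $hour, $minute);
}

foreach (['205pm', '1405', '14:05', '2.05 pm', '2-05 pm'] as $time) {
    echo $time, ' => ', normaliseTime($time), "\n";
}
```

The result is in the HH:MM form that the question says was previously stored in the MySQL TIME column.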

Can the for loop be eliminated from this piece of PHP code?

I have a range of whole numbers that might or might not have some numbers missing. Is it possible to find the smallest missing number without using a loop structure? If there are no missing numbers, the function should return the maximum value of the range plus one.
This is how I solved it using a for loop:
$range = [0,1,2,3,4,6,7];
// sort just in case the range is not in order
asort($range);
$range = array_values($range);
$first = true;
for ($x = 0; $x < count($range); $x++)
{
// don't check the first element
if ( ! $first )
{
if ( $range[$x - 1] + 1 !== $range[$x])
{
echo $range[$x - 1] + 1;
break;
}
}
// if we're on the last element, there are no missing numbers
if ($x + 1 === count($range))
{
echo $range[$x] + 1;
}
$first = false;
}
Ideally, I'd like to avoid looping completely, as the range can be massive. Any suggestions?
Algo solution
There is a way to check whether a number is missing using a formula. If we need to add the numbers from 1 to 100, we don't have to sum them one by one; we just compute (100 * (100 + 1)) / 2. So how does this solve our issue?
We take the first and the last element of the array and calculate the expected sum with this formula. We then use array_sum() to calculate the actual sum. If the results are the same, no number is missing. Otherwise we can "backtrack" the missing number by subtracting the actual sum from the expected one. This of course only works if exactly one number is missing, and fails if several are. So let's put this in code:
$range = range(0,7); // Creating an array
echo check($range) . "\r\n"; // check
unset($range[3]); // unset offset 3
echo check($range); // check
function check($array){
    if($array[0] == 0){
        unset($array[0]); // get rid of the zero
    }
    sort($array); // sorting
    $first = reset($array); // get the first value
    $last = end($array); // get the last value
    // expected sum of all integers from $first to $last
    $sum = (($last - $first + 1) * ($first + $last)) / 2;
    $actual_sum = array_sum($array); // the actual sum
    if($sum == $actual_sum){
        return $last + 1; // no missing number
    }else{
        return $sum - $actual_sum; // missing number
    }
}
Output
8
3
If there are several numbers missing, then just use array_map() or something similar to do an internal loop.
Regex solution
Let's take this to a new level and use regex ! I know it's nonsense, and it shouldn't be used in real world application. The goal is to show the true power of regex :)
So first let's make a string out of our range in the following format: I,II,III,IIII for the range 1 to 4.
$range = range(0,7);
if($range[0] === 0){ // get rid of 0
unset($range[0]);
}
$str = implode(',', array_map(function($val){return str_repeat('I', $val);}, $range));
echo $str;
The output should be something like: I,II,III,IIII,IIIII,IIIIII,IIIIIII.
I've come up with the following regex: ^(?=(I+))(^\1|,\2I|\2I)+$. So what does this mean ?
^ # match begin of string
(?= # positive lookahead, we use this to not "eat" the match
(I+) # match I one or more times and put it in group 1
) # end of lookahead
( # start matching group 2
^\1 # match begin of string followed by what's matched in group 1
| # or
,\2I # match a comma, with what's matched in group 2 (recursive !) and an I
| # or
\2I # match what's matched in group 2 and an I
)+ # repeat one or more times
$ # match end of line
Let's see what's actually happening ....
I,II,III,IIII,IIIII,IIIIII,IIIIIII
^
(I+) do not eat but match I and put it in group 1
I,II,III,IIII,IIIII,IIIIII,IIIIIII
^
^\1 match what was matched in group 1, which means I gets matched
I,II,III,IIII,IIIII,IIIIII,IIIIIII
 ^^^ ,\2I match a comma, then what was matched in group 2 (one I in this case), then one more I
I,II,III,IIII,IIIII,IIIIII,IIIIIII
    ^^^^ \2I match what was matched previously in group 2 (,II in this case) and add an I to it
I,II,III,IIII,IIIII,IIIIII,IIIIIII
        ^^^^^ \2I match what was matched previously in group 2 (,III in this case) and add an I to it
We keep moving forward since there is a + sign, which means match one or more times;
group 2 refers back to its own previous match on every repetition.
We put the $ to make sure it's the end of the string.
If the number of I's doesn't correspond, the regex will fail.
See it working and failing. And Let's put it in PHP code:
$range = range(0,7);
if($range[0] === 0){
unset($range[0]);
}
$str = implode(',', array_map(function($val){return str_repeat('I', $val);}, $range));
if(preg_match('#^(?=(I*))(^\1|,\2I|\2I)+$#', $str)){
echo 'works !';
}else{
echo 'fails !';
}
Now let's take in account to return the number that's missing, we will remove the $ end character to make our regex not fail, and we use group 2 to return the missed number:
$range = range(0,7);
if($range[0] === 0){
unset($range[0]);
}
unset($range[2]); // remove 2
$str = implode(',', array_map(function($val){return str_repeat('I', $val);}, $range));
preg_match('#^(?=(I*))(^\1|,\2I|\2I)+#', $str, $m); // REGEEEEEX !!!
$n = strlen($m[2]); // length of the last group 2 match: a comma plus the I's
if(strlen($m[0]) === strlen($str)){
    echo $n; // the whole string matched: no missing number, so this is max + 1
}else{
    echo $n - 1; // the match stopped at the gap: this is the missing number
}
EDIT: NOTE
This question is about performance. Functions like array_diff and array_filter are not magically fast. They can add a huge time penalty. Replacing a loop in your code with a call to array_diff will not magically make things fast, and will probably make things slower. You need to understand how these functions work if you intend to use them to speed up your code.
This answer uses the assumption that no items are duplicated and no invalid elements exist to allow us to use the position of the element to infer its expected value.
This answer is theoretically the fastest possible solution if you start with a sorted list. The solution posted by Jack is theoretically the fastest if sorting is required.
In the series [0,1,2,3,4,...], the n'th element has the value n if no elements before it are missing. So we can spot-check at any point to see if our missing element is before or after the element in question.
So you start by cutting the list in half and checking to see if the item at position x = x
[ 0 | 1 | 2 | 3 | 4 | 5 | 7 | 8 | 9 ]
^
Yup, list[4] == 4. So move halfway from your current point to the end of the list.
[ 0 | 1 | 2 | 3 | 4 | 5 | 7 | 8 | 9 ]
^
Uh-oh, list[6] == 7. So somewhere between our last checkpoint and the current one, one element was missing. Divide the difference in half and check that element:
[ 0 | 1 | 2 | 3 | 4 | 5 | 7 | 8 | 9 ]
^
In this case, list[5] == 5
So we're good there. So we take half the distance between our current check and the last one that was abnormal. And oh.. it looks like cell n+1 is one we already checked. We know that list[6]==7 and list[5]==5, so the element number 6 is the one that's missing.
Since each step divides the number of elements to consider in half, you know that your worst-case performance is going to check no more than log2 of the total list size. That is, this is an O(log(n)) solution.
If this whole arrangement looks familiar, it's because you learned it back in your second year of college in a Computer Science class. It's a minor variation on the binary search algorithm, one of the most widely used index schemes in the industry. Indeed, this question appears to be a perfectly contrived application for this searching technique.
You can of course repeat the operation to find additional missing elements, but since you've already tested the values at key elements in the list, you can avoid re-checking most of the list and go straight to the interesting ones left to test.
Also note that this solution assumes a sorted list. If the list isn't sorted then obviously you sort it first. Except, binary searching has some notable properties in common with quicksort. It's quite possible that you can combine the process of sorting with the process of finding the missing element and do both in a single operation, saving yourself some time.
Finally, to sum up the list, that's just a stupid math trick thrown in for good measure. The sum of a list of numbers from 1 to N is just N*(N+1)/2. And if you've already determined that any elements are missing, then obviously just subtract the missing ones.
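The halving procedure described above could look roughly like this in PHP. It is a sketch, not the answerer's code: it assumes a zero-indexed array that is already sorted, has no duplicates, and represents the series 0, 1, 2, ... with gaps. The function name firstMissing is hypothetical.

```php
<?php
// Hypothetical helper: smallest missing value in a sorted, duplicate-free
// array that is expected to hold 0, 1, 2, ... at indexes 0, 1, 2, ...
function firstMissing(array $sorted): int
{
    $lo = 0;
    $hi = count($sorted);
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($sorted[$mid] === $mid) {
            $lo = $mid + 1; // everything up to $mid is in place, look right
        } else {
            $hi = $mid;     // the gap is at or before $mid, look left
        }
    }
    // $lo is the first index whose value is not equal to the index.
    // If nothing is missing, this is count($sorted), i.e. the maximum plus one.
    return $lo;
}

echo firstMissing([0, 1, 2, 3, 4, 5, 7, 8, 9]);
```

Each iteration halves the interval, so the function inspects about log2(n) elements.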
Technically, you can't really do without the loop (unless you only want to know if there's a missing number). However, you can accomplish this without first sorting the array.
The following algorithm uses O(n) time with O(n) space:
$range = [0, 1, 2, 3, 4, 6, 7];
$N = count($range);
$temp = str_repeat('0', $N); // assume all values are out of place
foreach ($range as $value) {
if ($value < $N) {
$temp[$value] = 1; // value is in the right place
}
}
// count number of leading ones
echo strspn($temp, '1'), PHP_EOL;
It builds an ordered identity map of N entries, marking each value against its position as "1"; in the end all entries must be "1", and the first "0" entry is the smallest value that's missing.
Btw, I'm using a temporary string instead of an array to reduce physical memory requirements.
I honestly don't get why you wouldn't want to use a loop. There's nothing wrong with loops. They're fast, and you simply can't do without them. However, in your case, there is a way to avoid having to write your own loops, using PHP core functions. They do loop over the array, though, but you simply can't avoid that.
Anyway, I gather what you're after, can easily be written in 3 lines:
function highestPlus(array $in)
{
$compare = range(min($in), max($in));
$diff = array_diff($compare, $in);
return empty($diff) ? max($in) + 1 : reset($diff); // array_diff preserves keys, so $diff[0] may not exist
}
Tested with:
echo highestPlus(range(0,11));//echoes 12
$arr = array(9,3,4,1,2,5);
echo highestPlus($arr);//echoes 6
And now, to shamelessly steal Pé de Leão's answer (but "augment" it to do exactly what you want):
function highestPlus(array $range)
{//an unreadable one-liner... horrid, so don't, but know that you can...
    //range() runs to max + 1, so the diff is never empty and no fallback is needed
    return min(array_diff(range(0, max($range) + 1), $range));
}
How it works:
$compare = range(min($in), max($in));//range(lowest value in array, highest value in array)
$diff = array_diff($compare, $in);//get all values present in $compare, that aren't in $in
return empty($diff) ? max($in) + 1 : reset($diff);
//-------------------------------------------------
// read as:
if (empty($diff))
{//every number in min-max range was found in $in, return highest value +1
    return max($in) + 1;
}
//there were numbers in min-max range, not present in $in, return first missing number
//(reset() is used because array_diff keeps the original keys):
return reset($diff);
That's it, really.
Of course, if the supplied array might contain null or falsy values, or even strings, and duplicate values, it might be useful to "clean" the input a bit:
function highestPlus(array $in)
{
$clean = array_filter(
$in,
'is_numeric'//or even is_int
);
$compare = range(min($clean), max($clean));
$diff = array_diff($compare, $clean);//duplicates aren't an issue here
return empty($diff) ? max($clean) + 1 : reset($diff);
}
Useful links:
The array_diff man page
The max and min functions
Good Ol' range, of course...
The array_filter function
The array_map function might be worth a look
Just as array_sum might be
$range = array(0,1,2,3,4,6,7);
// sort just in case the range is not in order
asort($range);
$range = array_values($range);
$indexes = array_keys($range);
$diff = array_diff($indexes,$range);
echo reset($diff); // >> will print: 5 (array_diff preserves keys, so $diff[0] is not set)
// if $diff is an empty array - you can print
// the "maximum value of the range plus one": $range[count($range)-1]+1
echo min(array_diff(range(0, max($range)+1), $range));
Simple
$array1 = array(0,1,2,3,4,5,6,7);// array with actual number series
$array2 = array(0,1,2,4,6,7); // array with your custom number series
$missing = array_diff($array1,$array2);
sort($missing);
echo $missing[0];
$range = array(0,1,2,3,4,6,7);
$max=max($range);
$expected_total=($max*($max+1))/2; // sum if no number was missing.
$actual_total=array_sum($range); // sum of the input array.
if($expected_total==$actual_total){
echo $max+1; // the sums match, so no number is missing: echo the maximum plus one.
}else{
echo $expected_total-$actual_total; // the difference will be the missing number.
}
you can use array_diff() like this
<?php
$range = array("0","1","2","3","4","6","7","9");
asort($range);
$len=count($range);
if($range[$len-1]==$len-1){
$r=$range[$len-1];
}
else{
$ref= range(0,$len-1);
$result = array_diff($ref,$range);
$r=implode($result);
}
echo $r;
?>
function missing( $v ) {
static $p = -1;
$d = $v - $p - 1;
$p = $v;
return $d?1:0;
}
$result = array_search( 1, array_map( "missing", $ARRAY_TO_TEST ) );

How to compare two very large strings [duplicate]

How can I compare the two large strings of size 50Kb each using php. I want to highlight the differentiating bits.
Differences between two strings can also be found using XOR:
$s = 'the sky is falling';
$t = 'the pie is failing';
$d = $s ^ $t;
echo $s, "\n";
for ($i = 0, $n = strlen($d); $i != $n; ++$i) {
echo $d[$i] === "\0" ? ' ' : '#';
}
echo "\n$t\n";
Output:
the sky is falling
    ###      #
the pie is failing
The XOR operation results in a string that has a NUL byte ("\0") where both strings are the same and some other byte where they differ. It won't be faster than just comparing both strings character by character, but it is useful if you just want to know the position of the first character that's different, which strspn() can give you.
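The strspn() idea can be sketched as follows. The function name firstDifference is made up for illustration; note that XOR on two strings yields a result as long as the shorter one, so a pure length difference is not reported here.

```php
<?php
// Hypothetical helper: offset of the first differing byte, or null when
// the strings are identical over the length of the shorter one.
function firstDifference(string $a, string $b): ?int
{
    $xor = $a ^ $b;             // as long as the shorter input
    $same = strspn($xor, "\0"); // length of the leading run of NUL bytes
    return $same === strlen($xor) ? null : $same;
}

var_dump(firstDifference('the sky is falling', 'the pie is failing'));
```

For the two example strings the first difference is at offset 4, where "sky" and "pie" begin.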
Do you want to output like diff?
Perhaps this is what you want https://github.com/paulgb/simplediff/blob/5bfe1d2a8f967c7901ace50f04ac2d9308ed3169/simplediff.php
ADDED:
Or if you want to highlight every character that is different, you can use a PHP script like this:
for($i=0;$i<strlen($string1);$i++){
    if($string1[$i]!=$string2[$i]){
        echo "Char $i is different ({$string1[$i]}!={$string2[$i]})<br />\n";
    }
}
Perhaps if you can tell us in detail how you would like to compare, or give us some examples, it would be easier for us to decide the answer.
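The character-by-character comparison above assumes both strings have the same length. A sketch of a variant that tolerates different lengths and escapes the output for HTML is shown below. The function name highlightDiff is hypothetical, and the code works byte by byte, so it assumes single-byte (ASCII) text; multibyte text would need the mb_* functions.

```php
<?php
// Hypothetical helper: wrap the characters of $a that differ from $b in
// <mark> tags. Assumes single-byte (ASCII) input.
function highlightDiff(string $a, string $b): string
{
    $out = '';
    for ($i = 0, $n = strlen($a); $i < $n; $i++) {
        $char = htmlspecialchars($a[$i]);
        // A position past the end of $b counts as a difference.
        $same = isset($b[$i]) && $a[$i] === $b[$i];
        $out .= $same ? $char : "<mark>$char</mark>";
    }
    return $out;
}

echo highlightDiff('sample', 's4mple!'), "<br />\n", highlightDiff('s4mple!', 'sample');
```

Building one output string with .= avoids allocating an array element per character, which may matter for 50 KB inputs.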
A little modification to @Alvin's script:
I tested it on my local server with a 50 KB lorem ipsum string in which I replaced every "a" with "4", and it highlighted them. It runs pretty fast.
<?php
$string1 = "This is a sample text to test a script to highlight the differences between 2 strings, so the second string will be slightly different";
$string2 = "This is 2 s4mple text to test a scr1pt to highlight the differences between 2 strings, so the first string will be slightly different";
for($i=0;$i<strlen($string1);$i++){
if($string1[$i]!=$string2[$i]){
$string3[$i] = "<mark>{$string1[$i]}</mark>";
$string4[$i] = "<mark>{$string2[$i]}</mark>";
}
else {
$string3[$i] = "{$string1[$i]}";
$string4[$i] = "{$string2[$i]}";
}
}
$string3 = implode("",$string3);
$string4 = implode("",$string4);
echo "$string3". "<br />". $string4;
?>
