Trying to understand array_diff_uassoc optimization - php

It seems that the arrays are sorted before being compared with each other inside array_diff_uassoc.
What is the benefit of this approach?
Test script
function compare($a, $b)
{
echo("$a : $b\n");
return strcmp($a, $b);
}
$a = array('a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5);
$b = array('v' => 1, 'w' => 2, 'x' => 3, 'y' => 4, 'z' => 5);
var_dump(array_diff_uassoc($a, $b, 'compare'));
$a = array('a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5);
$b = array('d' => 1, 'e' => 2, 'f' => 3, 'g' => 4, 'h' => 5);
var_dump(array_diff_uassoc($a, $b, 'compare'));
$a = array('a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5);
$b = array('a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5);
var_dump(array_diff_uassoc($a, $b, 'compare'));
$a = array('a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5);
$b = array('e' => 5, 'd' => 4, 'c' => 3, 'b' => 2, 'a' => 1);
var_dump(array_diff_uassoc($a, $b, 'compare'));
http://3v4l.org/DKgms#v526
P.S. It seems that the sorting algorithm changed in PHP 7.

The sorting algorithm didn't change in PHP 7. The elements are just passed to the sorting algorithm in a different order, for some performance improvements.
Well, the benefit could be an eventually faster execution. You really hit the worst case when both arrays have completely different keys.
Worst case complexity is sorting both arrays and then comparing each key of one array with each key of the other: O(n*m + n * log(n) + m * log(m))
Best case is sorting both arrays and then just as many comparisons as there are elements in the smaller array: O(min(m, n) + n * log(n) + m * log(m))
In case of a match, you wouldn't have to compare against the full array again, but only from the key after the match onwards.
But in the current implementation, the sorting is just redundant. The implementation in php-src needs some improvement, I think. There's no outright bug, but the implementation is just bad. If you understand some C: http://lxr.php.net/xref/PHP_TRUNK/ext/standard/array.c#php_array_diff
(Note that that function is called via php_array_diff(INTERNAL_FUNCTION_PARAM_PASSTHRU, DIFF_ASSOC, DIFF_COMP_DATA_INTERNAL, DIFF_COMP_KEY_USER); from array_diff_uassoc)

Theory
Sorting allows for a few shortcuts to be made; for instance:
A | B
-------+------
1,2,3 | 4,5,6
Each element of A will only be compared against B[0], because the other elements are known to be at least as big.
Another example:
A | B
-------+-------
4,5,6 | 1,2,6
In this case, A[0] is compared against all elements of B, but A[1] and A[2] are compared against B[2] only.
If any element of A is bigger than all elements in B you will get the worst performance.
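The shortcut in the two examples above can be sketched in plain PHP. This is only an illustration of the idea, not the php-src implementation: sortedDiff() and its $comparisons counter are hypothetical names, and unlike array_diff() the sketch returns re-indexed, sorted values.

```php
<?php
// Hypothetical sketch: diff two lists by sorting both and walking them
// together, counting how many element comparisons are needed.
function sortedDiff(array $a, array $b, int &$comparisons): array
{
    sort($a);
    sort($b);
    $diff = [];
    $j = 0;              // position in $b; it never moves backwards
    $m = count($b);
    foreach ($a as $value) {
        $found = false;
        while ($j < $m) {
            $comparisons++;
            $cmp = $b[$j] <=> $value;
            if ($cmp < 0) {
                $j++;    // $b[$j] is too small for this and all later values
                continue;
            }
            $found = ($cmp === 0);
            break;       // $b[$j] is equal or bigger: stop looking
        }
        if (!$found) {
            $diff[] = $value;
        }
    }
    return $diff;
}

$n = 0;
print_r(sortedDiff([1, 2, 3], [4, 5, 6], $n)); // 1, 2, 3 remain
echo $n, "\n";                                 // 3 comparisons

$n = 0;
print_r(sortedDiff([4, 5, 6], [1, 2, 6], $n)); // 4, 5 remain
echo $n, "\n";                                 // 5 comparisons
```

In the first call every element of A is checked against B[0] only; in the second, A[0] walks through all of B and the remaining elements are checked against B[2] only.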
Practice
While the above works well for the standard array_diff() or array_udiff(), once a key comparison function is used it falls back to O(n * m) performance because of this change, which was made while trying to fix this bug.
The aforementioned bug describes how custom key comparison functions can cause unexpected results when used with arrays that have mixed keys (i.e. numeric and string key values). I personally feel that this should've been addressed via the documentation, because you would get equally strange results with ksort().

Related

Get dense rank and gapped rank for all items in array

I want to calculate and store the dense rank and gapped rank for all entries in an array using PHP.
I want to do this in PHP, not MySQL, because I am dealing with dynamic combinations (100,000 to 900 combinations per week); that's why I cannot use MySQL to make that many tables.
My code to find the dense ranks is working, but the gapped ranks are not correct.
PHP code
$members = [
['num' => 2, 'rank' => 0, 'dense_rank' => 0],
['num' => 2, 'rank' => 0, 'dense_rank' => 0],
['num' => 3, 'rank' => 0, 'dense_rank' => 0],
['num' => 3, 'rank' => 0, 'dense_rank' => 0],
['num' => 3, 'rank' => 0, 'dense_rank' => 0],
['num' => 3, 'rank' => 0, 'dense_rank' => 0],
['num' => 3, 'rank' => 0, 'dense_rank' => 0],
['num' => 5, 'rank' => 0, 'dense_rank' => 0],
['num' => 9, 'rank' => 0, 'dense_rank' => 0],
['num' => 9, 'rank' => 0, 'dense_rank' => 0],
['num' => 9, 'rank' => 0, 'dense_rank' => 0]
];
$rank=0;
$previous_rank=0;
$dense_rank=0;
$previous_dense_rank=0;
foreach($members as &$var){
//start of rank
if($var['num']==$previous_rank){
$var['rank']=$rank;
}else{
$var['rank']=++$rank;
$previous_rank=$var['num'];
}//end of rank
//start of rank_dense
if($var['num']===$previous_dense_rank){
$var['dense_rank']=$dense_rank;
++$dense_rank;
}else{
$var['dense_rank']=++$dense_rank;
$previous_dense_rank=$var['num'];
}
//end of rank_dense
echo $var['num'].' - '.$var['rank'].' - '.$var['dense_rank'].'<br>';
}
?>
My flawed output is:
num | rank | dynamic rank
----+------+-------------
2   | 1    | 1
2   | 1    | 1
3   | 2    | 3
3   | 2    | 3
3   | 2    | 4
3   | 2    | 5
3   | 2    | 6
5   | 3    | 8
9   | 4    | 9
9   | 4    | 9
9   | 4    | 10
Notice that when the error happens and a higher number then appears in the num column, the error is corrected in that row. See where the number goes from 3 to 5.
Given that your results are already sorted in an ascending fashion...
For dense ranking, you need to only increment your counter when a new score is encountered.
For gapped ranking, you need to unconditionally increment your counter and use the counter value for all members with the same score.
??= is the "null coalescing assignment" operator (a breed of "combined operator"). It only allows the right side operand to be executed/used if the left side operand is not declared or is null. This is a technique of performing conditional assignments without needing to write a classic if condition.
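As a small, isolated illustration of that operator (it needs PHP 7.4 or later; the $ranks array and its keys are made up for the demo):

```php
<?php
$ranks = [];

$ranks['x'] ??= 1;   // 'x' is not set yet, so 1 is assigned
$ranks['x'] ??= 99;  // 'x' already holds 1, so the right side is ignored

// The same conditional assignment written with a classic if:
if (!isset($ranks['y'])) {
    $ranks['y'] = 2;
}

var_dump($ranks); // 'x' => 1, 'y' => 2
```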
Code: (Demo)
$denseRank = 0;
$gappedRank = 0;
foreach ($members as &$row) {
$denseRanks[$row['num']] ??= ++$denseRank;
$row['dense_rank'] = $denseRanks[$row['num']];
++$gappedRank;
$gappedRanks[$row['num']] ??= $gappedRank;
$row['rank'] = $gappedRanks[$row['num']];
// for better presentation:
echo json_encode($row) . "\n";
}
Output:
{"num":2,"rank":1,"dense_rank":1}
{"num":2,"rank":1,"dense_rank":1}
{"num":3,"rank":3,"dense_rank":2}
{"num":3,"rank":3,"dense_rank":2}
{"num":3,"rank":3,"dense_rank":2}
{"num":3,"rank":3,"dense_rank":2}
{"num":3,"rank":3,"dense_rank":2}
{"num":5,"rank":8,"dense_rank":3}
{"num":9,"rank":9,"dense_rank":4}
{"num":9,"rank":9,"dense_rank":4}
{"num":9,"rank":9,"dense_rank":4}
For the record, if you are dealing with huge volumes of data, I would be using SQL instead of PHP for this task.
It seems like you want the dynamic rank to be sequential?
Your sample data appears to be sorted. If this remains true for your real data, then you can remove the conditional and just increment the variable as you assign it:
//start of rank_dense
$var['dense_rank']=++$dense_rank;
//end of rank_dense
It sounds like you're saying you won't be implementing a database.
Databases like MySQL can easily handle the workload numbers you outlined and they can sort your data as well. You may want to reconsider.

PHP Check if two arrays have the same keys and same count of keys [duplicate]

This question already has answers here:
PHP - Check if two arrays are equal
(19 answers)
Check if two arrays have the same values (regardless of value order) [duplicate]
(13 answers)
How to check if PHP associative arrays are equal, ignoring key ordering?
(1 answer)
Closed 4 years ago.
I'm trying to match 2 arrays that look like below.
$system = array('blue' => 1, 'red' => 2, 'green' => 3, 'purple' => 4);
$public = array('blue' => 1, 'red' => 2, 'green' => 3, 'purple' => 4);
My problem is, I need the array keys of both arrays to be the same value and same count.
Which means:
// passes - both arrays have the same key values and same counts of each key
$system = array('blue' => 1, 'red' => 2, 'green' => 3, 'purple' => 4);
$public = array('blue' => 1, 'red' => 2, 'green' => 3, 'purple' => 4);
// fails - $public does not have 'blue' => 1
$system = array('blue' => 1, 'red' => 2, 'green' => 3, 'purple' => 4);
$public = array('red' => 2, 'green' => 3, 'purple' => 4);
// should fail - $public has 2 'blue' => 1
$system = array('blue' => 1, 'red' => 2, 'green' => 3, 'purple' => 4);
$public = array('blue' => 1, 'blue' => 1, 'red' => 2, 'green' => 3, 'purple' => 4);
I've tried using array_diff_key, array_diff and other PHP functions, but none can catch extra keys with the same value (i.e. if 'blue' => 1 is repeated, it still passes).
What's a good way to solve this?
When you write two values with the same key in PHP, the second one overwrites the first (and this is not an error). Below is what I did in the PHP interactive CLI (run it with php -a):
php > $x = ["x" => 1, "x" => 2, "y" => 2];
php > var_dump($x);
array(2) {
["x"]=>
int(2)
["y"]=>
int(2)
}
So array_diff seems to be working correctly. You are just expecting PHP to behave in a different way than it actually does!
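For the first two cases in the question (same keys, or a key missing), a minimal sketch could look like the following. haveSameKeys() is a hypothetical helper name; the third case cannot be detected at all, because the duplicated 'blue' key has already collapsed into one entry by the time the array exists.

```php
<?php
// Hypothetical helper: true when both arrays contain exactly the same keys,
// regardless of key order and of the values stored under them.
function haveSameKeys(array $a, array $b): bool
{
    // Equal counts plus "no key of $a is missing in $b" implies equal key sets.
    return count($a) === count($b)
        && array_diff_key($a, $b) === [];
}

$system = array('blue' => 1, 'red' => 2, 'green' => 3, 'purple' => 4);

$public = array('purple' => 4, 'green' => 3, 'red' => 2, 'blue' => 1);
var_dump(haveSameKeys($system, $public)); // bool(true)

$public = array('red' => 2, 'green' => 3, 'purple' => 4);
var_dump(haveSameKeys($system, $public)); // bool(false)
```

If the values have to match as well, comparing the arrays with == checks for the same key/value pairs regardless of order.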

Storing JSON string to shopware store cookies

How can I pass a JSON string to a cookie?
I have something like this in the Subscriber folder of my Shopware plugin:
$arr = array('a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5);
$this->setCookie($controller,json_encode($arr));
But it stores the following string:
%7B%22a%22%3A1%2C%22b%22%3A2%2C%22c%22%3A3%2C%22d%22%3A4%2C%22e%22%3A5%7D
I know that the problem is in php-json connection.
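The stored string is simply the URL-encoded form of the JSON, so nothing has been lost. A minimal sketch in plain PHP, without the plugin's setCookie() method (whose internals are not shown here), of how the value round-trips:

```php
<?php
$arr = array('a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5);

$json   = json_encode($arr);     // {"a":1,"b":2,"c":3,"d":4,"e":5}
$stored = rawurlencode($json);   // the string seen in the cookie store
echo $stored, "\n";

// Decoding reverses both steps and gives the original array back.
$decoded = json_decode(rawurldecode($stored), true);
var_dump($decoded === $arr);     // bool(true)
```

Whether the decoding step is needed by hand depends on how the cookie is read; this sketch assumes the raw stored string is what you get back.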

Data relationship - looking for solutions

I have a problem I need to solve and I'm sure there is a way of doing this, I'm just not exactly sure "what to search for" and how to find it.
I was thinking of doing this either in Excel or I could maybe try to make a PHP script to do it.
So basically, I have a set of substances. Each pair of substances is either compatible or incompatible with another one. So what I have is a table with rows and columns where there is either 0 or 1, i.e. compatible/incompatible.
Now what I want to do is try to find groups of substances, where all substances in that group are compatible with each other. And the goal is to find as large group as possible, or ideally, find the largest, second largest etc. and sort them from largest to smallest (given there could be some limitation for the minimum number of elements in that group).
I hope it makes sense, the problem is that I'm not sure how to solve it, but I think this is something that should be relatively commonly done and so I doubt the only way is writing a script/macro from scratch that would use brute force to do this. This would also probably not be very efficient as I have a table with over 30 elements.
So just to make it more clear, for example here is a simplified table of what my data looks like:
Substance A B C D
A 0 1 1 1
B 1 0 0 1
C 1 0 0 0
D 1 0 0 0
If you use only PHP without a database, you can use uasort to sort the substances by the sum of all elements of each related array.
<?php
$substances = [
'A' => [
'A' => 0,
'B' => 1,
'C' => 1,
'D' => 0,
],
'B' => [
'A' => 1,
'B' => 0,
'C' => 1,
'D' => 1,
],
'C' => [
'A' => 0,
'B' => 1,
'C' => 0,
'D' => 0,
]
];
uasort ($substances, function ($a, $b) {
$a = array_sum($a);
$b = array_sum($b);
if ($a == $b) {
return 0;
}
return ($a > $b) ? -1 : 1;
});
var_export($substances);
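Finding groups in which every pair is compatible is known as finding maximal cliques in a graph. A possible sketch using the basic Bron-Kerbosch algorithm follows; it is a different technique from the sort above. The assumptions are that 1 means "compatible", that a pair only counts when both directions are 1, and that the matrix is a made-up symmetric example; neighbours() and bronKerbosch() are hypothetical names.

```php
<?php
// Hypothetical, symmetric compatibility matrix (1 = compatible).
$matrix = [
    'A' => ['A' => 0, 'B' => 1, 'C' => 1, 'D' => 1],
    'B' => ['A' => 1, 'B' => 0, 'C' => 0, 'D' => 1],
    'C' => ['A' => 1, 'B' => 0, 'C' => 0, 'D' => 0],
    'D' => ['A' => 1, 'B' => 1, 'C' => 0, 'D' => 0],
];

// Build, for every substance, the set of substances it is compatible with.
function neighbours(array $matrix): array
{
    $adj = [];
    foreach ($matrix as $x => $row) {
        $adj[$x] = [];
        foreach ($row as $y => $flag) {
            if ($x !== $y && $flag === 1 && ($matrix[$y][$x] ?? 0) === 1) {
                $adj[$x][$y] = true;
            }
        }
    }
    return $adj;
}

// Basic Bron-Kerbosch: $r is the group built so far, $p the candidates that
// are compatible with everything in $r, $x the candidates already handled.
function bronKerbosch(array $r, array $p, array $x, array $adj, array &$cliques): void
{
    if (!$p && !$x) {
        $cliques[] = array_keys($r);   // nothing can be added: maximal group
        return;
    }
    foreach (array_keys($p) as $v) {
        bronKerbosch(
            $r + [$v => true],
            array_intersect_key($p, $adj[$v]),
            array_intersect_key($x, $adj[$v]),
            $adj,
            $cliques
        );
        unset($p[$v]);
        $x[$v] = true;
    }
}

$adj = neighbours($matrix);
$cliques = [];
bronKerbosch([], array_fill_keys(array_keys($adj), true), [], $adj, $cliques);

// Largest groups first.
usort($cliques, fn(array $a, array $b) => count($b) <=> count($a));

foreach ($cliques as $clique) {
    echo implode(', ', $clique), "\n";
}
// A, B, D
// A, C
```

The worst case is exponential, but for a table of around 30 substances this kind of search is usually feasible; a minimum group size can be applied by filtering $cliques on count().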

get integer / float from string in PHP

I ran into an issue with a data feed I need to import where for some reason the feed producer has decided to provide data that should clearly be either INT or FLOAT as strings-- like this:
$CASES_SOLD = "THREE";
$CASES_STOCKED = "FOUR";
Is there a way in PHP to interpret the text string as the actual integer?
EDIT: I should be more clear-- I need to have the $cases_sold etc. as an integer-- so I can then manipulate them as digits, store in database as INT, etc.
Use an associative array, for example:
$map = array("ONE" => 1, "TWO" => 2, "THREE" => 3, "FOUR" => 4);
$CASES_SOLD = $map["THREE"]; // 3
If you are only interested in "converting" one to nine, you may use the following code:
$convert = array('one' => 1,
'two' => 2,
'three' => 3,
'four' => 4,
'five' => 5,
'six' => 6,
'seven' => 7,
'eight' => 8,
'nine' => 9
);
echo $convert[strtolower($CASES_SOLD)]; // will display 3
If you only need the base 10 numerals, just make a map
$numberMap = array(
'ONE' => 1
, 'TWO' => 2
, 'THREE' => 3
// etc..
);
$number = $numberMap[$CASES_SOLD];
// $number == 3
If you need something more complex, like interpreting Four Thousand Two Hundred Fifty Eight into 4258 then you'll need to roll up your sleeves and look at this related question.
Impress your fellow programmers by handling this in a totally obtuse way:
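<?php
// Note: ereg() was deprecated in PHP 5.3 and removed in PHP 7.0,
// so this only runs on old PHP versions.
$text = 'four';
if(ereg("[[.$text.]]", "0123456789", $m)) {
$value = (int) $m[0];
echo $value;
}
?>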
You need a list of English number words and their values. The words 'hundred', 'thousand' and 'million' need separate handling, because they multiply what came before them instead of adding to it, and 'and' as well as hyphens (as in 'fifty-eight') have to be skipped.
This is simple code for the basic cases; play with it and extend the if/else chain if you need more functionality:
function conv($string)
{
    // Values of the plain number words.
    $conv = array(
        'ONE' => 1,
        'TWO' => 2,
        'THREE' => 3,
        'FOUR' => 4,
        'FIVE' => 5,
        'SIX' => 6,
        'SEVEN' => 7,
        'EIGHT' => 8,
        'NINE' => 9,
        'TEN' => 10,
        'ELEVEN' => 11,
        'TWELVE' => 12,
        'THIRTEEN' => 13,
        'FOURTEEN' => 14,
        'FIFTEEN' => 15,
        'SIXTEEN' => 16,
        'SEVENTEEN' => 17,
        'EIGHTEEN' => 18,
        'NINETEEN' => 19,
        'TWENTY' => 20,
        'THIRTY' => 30,
        'FORTY' => 40,
        'FIFTY' => 50,
        'SIXTY' => 60,
        'SEVENTY' => 70,
        'EIGHTY' => 80,
        'NINETY' => 90,
    );
    // Words that close a group and multiply it.
    $scales = array(
        'THOUSAND' => 1000,
        'MILLION' => 1000000,
    );
    // Split on whitespace and hyphens, so 'FIFTY-EIGHT' becomes two words.
    $words = preg_split('/[\s-]+/', strtoupper(trim($string)), -1, PREG_SPLIT_NO_EMPTY);
    $total = 0;   // finished thousand/million groups
    $current = 0; // group that is being read right now
    foreach ($words as $word)
    {
        if (isset($conv[$word]))
        {
            $current += $conv[$word];
        }
        elseif ($word == 'HUNDRED')
        {
            // A bare 'HUNDRED' counts as one hundred.
            $current = max($current, 1) * 100;
        }
        elseif (isset($scales[$word]))
        {
            $total += max($current, 1) * $scales[$word];
            $current = 0;
        }
        // 'AND' and unknown words are skipped.
    }
    return $total + $current;
}
Basically what you want is to write a parser for the formal grammar that represents written numbers (up to some finite upper bound). Depending on how high you need to go, the parser could be as trivial as
$numbers = array('zero', 'one', 'two', 'three');
$input = 'TWO';
$result = array_search(strtolower($input), $numbers);
...or as involved as a full-blown parser generated by a tool such as ANTLR. Since you probably only need to process relatively small numbers, the most practical solution might be to hand-code a small parser. You can take a look here for the ready-made grammar and implement it in PHP.
This is similar to Converting words to numbers in PHP
PHP doesn't have built in conversion functionality. You'd have to build your own logic based on switch statements or otherwise.
Or use an existing library like:
http://www.phpclasses.org/package/7082-PHP-Convert-a-string-of-English-words-to-numbers.html
