I am doing this to echo the minimum value in an array...
$array = [
[
'a' => 0,
'f' => 0,
'f' => 0,
'l' => 61.60
],
[
'a' => 38,
'f' => 0,
'f' => 0,
'l' => 11.99
],
[
'a' => 28,
'f' => 0,
'f' => 0,
'l' => 3.40
]
];
$min = min(array_column($array, 'a'));
echo $min;
Now I want to exclude 0 from the results. I know I can use array_filter to achieve this, but do I need to process the array twice?
Yes, this will do:
$min = min(array_filter(array_column($array, 'a')));
It will iterate the array three times, once for each function.
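A self-contained version of the one-liner, as a sketch. The helper name minNonZero() is made up for illustration. The guard is there because min() on an empty array throws a ValueError in PHP 8 (earlier versions emit a warning and return false), which is what happens when every value in the column is 0.

```php
<?php
// Hypothetical helper: smallest non-zero value of a column, or null if there is none.
function minNonZero(array $rows, string $column)
{
    // array_column() extracts the column, array_filter() drops falsy values (0, null, '').
    $values = array_filter(array_column($rows, $column));

    // min() must not be called on an empty array.
    return $values ? min($values) : null;
}

$array = [
    ['a' => 0,  'l' => 61.60],
    ['a' => 38, 'l' => 11.99],
    ['a' => 28, 'l' => 3.40],
];

echo minNonZero($array, 'a'), "\n";      // 28
var_dump(minNonZero([['a' => 0]], 'a')); // NULL
```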
You can use array_reduce to do it in one iteration:
$min = array_reduce($array, function ($min, $val) {
return ($val['a'] && ($min === null || $val['a'] < $min)) ? $val['a'] : $min;
});
Whether that's faster or not must be benchmarked; a PHP callback function may, after all, be slower than three functions in C.
A somewhat more efficient solution without the overhead of a function call would be a good ol' loop:
$min = null;
foreach ($array as $val) {
if ($val['a'] && ($min === null || $val['a'] < $min)) {
$min = $val['a'];
}
}
In the end you need to benchmark and decide on the correct tradeoff of performance vs. readability. In practice, unless you have positively humongous datasets, the first one-liner will probably do just fine.
A solution using array_reduce() to walk the array only once.
$min = array_reduce(
$array,
function($acc, array $item) {
return min($acc, $item['a'] ?: INF);
},
INF
);
How it works:
It starts with +INF as the partial minimum value. All the values it encounters in the array are, theoretically, smaller than that.
The callback function ignores the items having 0 (or another value that is equal to FALSE when evaluated as boolean). The expression $item['a'] ?: INF uses INF (infinity) instead of $item['a'] to avoid altering the partial result (to ignore 0 values).
It returns the minimum between the current partial minimum (passed by array_reduce() in parameter $acc) and the value of the current item, as explained above.
The value in $min is the minimum of the not FALSE-ey values in column a of the items in $array. If all these values are 0 (FALSE), the value returned in $min is INF.
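Because INF doubles as the "nothing found" marker, callers may want to translate it before using the result. A small sketch of that idea; the wrapper name minNonZeroOrNull() is an assumption for illustration and not part of the answer above.

```php
<?php
// Hypothetical wrapper around the reduce: returns null instead of INF
// when the column holds no truthy value.
function minNonZeroOrNull(array $rows, string $column)
{
    $min = array_reduce(
        $rows,
        function ($acc, array $item) use ($column) {
            // Falsy values (0, null, '') are replaced by INF, so they never win.
            return min($acc, $item[$column] ?: INF);
        },
        INF
    );

    return $min === INF ? null : $min;
}

var_dump(minNonZeroOrNull([['a' => 0], ['a' => 38], ['a' => 28]], 'a')); // int(28)
var_dump(minNonZeroOrNull([['a' => 0], ['a' => 0]], 'a'));               // NULL
```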
This is not an answer but the format of its content cannot be provided by a comment. It also cannot stay in my answer as it is technically not part of it.
I generated a benchmark for the three solutions provided by #deceze and my solution and ran it using PHP 7.0. Everything below applies only to PHP 7.x.
PHP 5 runs much slower and it requires more memory.
I started by running the code 1,000,000 times over a small list of 100 items, then I repeatedly divided the number of iterations by 10 while multiplying the list length by 10.
Here are the results:
$ php bench.php 100 1000000
Generating 100 elements... Done. Time: 0.000112 seconds.
array_filter(): 3.265538 seconds/1000000 iterations. 0.000003 seconds/iteration.
foreach : 3.771463 seconds/1000000 iterations. 0.000004 seconds/iteration.
reduce #deceze: 6.869162 seconds/1000000 iterations. 0.000007 seconds/iteration.
reduce #axiac : 8.599051 seconds/1000000 iterations. 0.000009 seconds/iteration.
$ php bench.php 1000 100000
Generating 1000 elements... Done. Time: 0.000750 seconds.
array_filter(): 3.024423 seconds/100000 iterations. 0.000030 seconds/iteration.
foreach : 3.997505 seconds/100000 iterations. 0.000040 seconds/iteration.
reduce #deceze: 6.669426 seconds/100000 iterations. 0.000067 seconds/iteration.
reduce #axiac : 8.342756 seconds/100000 iterations. 0.000083 seconds/iteration.
$ php bench.php 10000 10000
Generating 10000 elements... Done. Time: 0.002643 seconds.
array_filter(): 2.913948 seconds/10000 iterations. 0.000291 seconds/iteration.
foreach : 4.190049 seconds/10000 iterations. 0.000419 seconds/iteration.
reduce #deceze: 9.649768 seconds/10000 iterations. 0.000965 seconds/iteration.
reduce #axiac : 11.236113 seconds/10000 iterations. 0.001124 seconds/iteration.
$ php bench.php 100000 1000
Generating 100000 elements... Done. Time: 0.042237 seconds.
array_filter(): 90.369577 seconds/1000 iterations. 0.090370 seconds/iteration.
foreach : 15.487466 seconds/1000 iterations. 0.015487 seconds/iteration.
reduce #deceze: 19.896064 seconds/1000 iterations. 0.019896 seconds/iteration.
reduce #axiac : 15.056250 seconds/1000 iterations. 0.015056 seconds/iteration.
For lists up to about 10,000 elements, the results are consistent and they match the expectations: array_filter() is the fastest, foreach comes close, then come the array_reduce() solutions, ordered by the number of functions they call (#deceze's is faster as it doesn't call any function, mine calls min() once). Even the total running time feels consistent.
The value of 90 seconds for the array_filter() solution for 100,000 items in the list looks out of place but it has a simple explanation: both array_filter() and array_column() generate new arrays. They allocate memory and copy the data and this takes time. Add the time needed by the garbage collector to free all the small memory blocks used by a list of 10,000 small arrays and the running time will go up faster.
Another interesting result for the 100,000-item array is that my solution using array_reduce() is as fast as the foreach solution and better than #deceze's solution using array_reduce(). I don't have an explanation for this result.
I tried to find out some thresholds when these things start to happen. For this I ran the benchmark with different list sizes, starting from 5,000 and increasing the size by 1,000 while keeping the total number of visited items to 100,000,000. The results can be found here.
The results are surprising. For some sizes of the list (8,000, 11,000, 12,000, 13,000, 17,000 items), the array_filter() solution needs about 10 times more time to complete than any solution that uses array_reduce(). For other list sizes, however, it gets back on track and completes the 100 million node visits in about 3 seconds, while the time needed by the other solutions constantly increases as the list length increases.
I suspect the culprit for the jumps in the time needed by the array_filter() solution is PHP's memory allocation strategy. For some lengths of the initial array, the temporary arrays returned by array_column() and array_filter() probably trigger more memory allocation and garbage cleanup cycles than for other sizes. Of course, it is possible that the same behaviour happens on other sizes I didn't test.
Somewhere around 16,000 to 17,000 items in the list, my solution starts running faster than #deceze's solution using array_reduce(), and around 25,000 it starts performing as fast as the foreach solution (and sometimes even faster).
Also for lists longer than 16,000-17,000 items the array_filter() solution consistently needs more time to complete than the others.
The benchmark code can be found here. Unfortunately it cannot be executed on 3v4l.org for lists larger than 15,000 elements because it reaches the memory limit imposed by the system.
Its results for lists larger than 5,000 items can be found here.
The code was executed using PHP 7.0.20 CLI on Linux Mint 18.1. No APC or other kind of cache was involved.
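For readers who want to repeat this kind of measurement, here is a minimal timing harness. It is a sketch only and not the bench.php used above: the function names, the fixed seed, the list size and the iteration count are all assumptions. It checks that the variants agree before anyone trusts the timings.

```php
<?php
// Build $size rows with a random 'a' column. The first row is forced to a
// non-zero value so every variant has something to return.
function makeRows(int $size): array
{
    mt_srand(42); // fixed seed: repeatable data
    $rows = [];
    for ($i = 0; $i < $size; $i++) {
        $rows[] = ['a' => mt_rand(0, 1000), 'l' => mt_rand() / mt_getrandmax()];
    }
    $rows[0]['a'] = 7;
    return $rows;
}

$variants = [
    'array_filter' => function (array $rows) {
        return min(array_filter(array_column($rows, 'a')));
    },
    'foreach' => function (array $rows) {
        $min = null;
        foreach ($rows as $row) {
            if ($row['a'] && ($min === null || $row['a'] < $min)) {
                $min = $row['a'];
            }
        }
        return $min;
    },
    'reduce + INF' => function (array $rows) {
        return array_reduce($rows, function ($acc, array $row) {
            return min($acc, $row['a'] ?: INF);
        }, INF);
    },
];

// Run one variant $iterations times and return [last result, elapsed seconds].
function timeVariant(callable $fn, array $rows, int $iterations): array
{
    $result = null;
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $result = $fn($rows);
    }
    return [$result, microtime(true) - $start];
}

$rows = makeRows(1000);
$results = [];
foreach ($variants as $name => $fn) {
    list($result, $seconds) = timeVariant($fn, $rows, 200);
    $results[$name] = $result;
    printf("%-14s min=%d  %.6f seconds/200 iterations\n", $name, $result, $seconds);
}
```

Absolute numbers from such a harness depend heavily on the PHP version and the machine, so they are only comparable within one run.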
Conclusion
For small lists, up to 5,000 items, use the array_filter(array_column()) solution as it performs well for this size of the list and it looks neat.
For lists larger than 5,000 items, switch to the foreach solution. It doesn't look as neat but it runs fast and it doesn't need extra memory. Stick to it as the list size increases.
For hackathons, interviews and to look smart to your colleagues, use any array_reduce() solution. It shows your knowledge about PHP array functions and your understanding of the "callback" programming concept.
Try array_flip() together with unset(), like this:
$array = [
[
'a' => 0,
'f' => 0,
'f' => 0,
'l' => 61.60
],
[
'a' => 38,
'f' => 0,
'f' => 0,
'l' => 11.99
],
[
'a' => 28,
'f' => 0,
'f' => 0,
'l' => 3.40
]
];
$min = array_flip(array_column($array, 'a'));
unset($min[0]);
$min=min(array_flip($min));
Output:
28
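You can use sort (this assumes the column contains exactly one 0 and no negative values, so that the second element after sorting is the smallest non-zero value):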
$array = [
[
'a' => 0,
'f' => 0,
'f' => 0,
'l' => 61.60
],
[
'a' => 38,
'f' => 0,
'f' => 0,
'l' => 11.99
],
[
'a' => 28,
'f' => 0,
'f' => 0,
'l' => 3.40
]
];
$array = array_column($array, 'a');
sort($array);
echo $array[1];
I have an array with data like below:
$data=array(
array(1,1),
array(1,2),
array(1,3),
array(1,4),
array(1,5),
array(1,6),
array(1,7)
);
And I want to apply some operations to groups of data, for example
(pseudo-code)
// take the second element of every row and divide it by 10
$d = $data[all indexes][1] /10 ; // divided by 10
// multiply the second element of the first to fifteenth rows by 2
$d2=$data[0-15][1] * 2;
I know I can use foreach or any other loop, but I'm looking for a better way.
Well, I don't really get why you don't want to use loops, but if it makes you feel better you can use a "loop in disguise"!
PHP offers some functions to perform a repetitive task in an array, for example:
array_map - Keeps the base array intact
array_walk - Changes the base array
Example with array_map
Code
$newData = array_map (function($subArray)
{
return $subArray[1] / 10;
}, $data);
output
array (size=7)
0 => float 0.1
1 => float 0.2
2 => float 0.3
3 => float 0.4
4 => float 0.5
5 => float 0.6
6 => float 0.7
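The second operation from the question (only the first rows, multiplied by 2) can be sketched the same way by slicing first. This assumes "first to fifteenth" means the first 15 rows; the variable names are illustrative. The last statement shows the array_walk flavour, which changes $data itself.

```php
<?php
$data = [[1, 1], [1, 2], [1, 3], [1, 4], [1, 5], [1, 6], [1, 7]];

// Second element of every row, divided by 10.
$d = array_map(
    function ($value) { return $value / 10; },
    array_column($data, 1)
);

// Second element of the first (up to) 15 rows, multiplied by 2.
// array_slice() simply returns fewer rows when the array is shorter.
$d2 = array_map(
    function ($value) { return $value * 2; },
    array_column(array_slice($data, 0, 15), 1)
);

// array_walk() variant: doubles the second element in place.
array_walk($data, function (array &$row) { $row[1] *= 2; });

print_r($d2);
print_r($data[6]);
```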
Answer to OP comment:
If it's a performance problem, then array_map is actually slower than foreach or for loops.
I don't think PHP has a built-in library for matrix operations. However, a quick search in PEAR turned up this package:
http://pear.php.net/package/Math_Matrix
I've never used it, so I don't know if it's any good, and the package isn't maintained anymore.
Here is a snippet of code where I'm using gmp_prob_prime. Even though I'm currently only testing numbers in the 10^6 range this function VERY regularly "fails" my QuickTest and ends up needing to do a manual check of $NumberToTest for primality.
Is gmp_prob_prime not very robust? I didn't expect it to suggest "probable prime" until I was in the 10^9 or even 10^12 range.
Here is the snippet of my code's function that is being called:
function IsPrime($DocRoot, $NumberToTest, $PowOf2)
{
// First a quick test...
// 0 = composite
// 1 = probable prime
// 2 = definite prime
$Reps = 15;
$QuickTest = gmp_prob_prime($NumberToTest,$Reps);
if( $QuickTest == 0 )
{
return 0;
}
if ( $QuickTest == 2 )
{
return 1;
}
// If we get to here then gmp_prob_prime isn't sure whether the $NumberToTest is prime or not.
print "Consider increasing the Reps for gmp_prob_prime.\n";
// Find the sqrt of $NumberToTest;
... code continues ...
I had the same behavior when calling mpz_probab_prime_p directly from C++, but I can't recall if the below information fixed it or not (copied from the manual).
Function: int mpz_probab_prime_p (const mpz_t n, int reps)
Determine whether n is prime. Return 2 if n is definitely prime, return 1 if n is probably prime (without being certain), or return 0 if n is definitely composite.
This function does some trial divisions, then some Miller-Rabin probabilistic primality tests. The argument reps controls how many such tests are done; a higher value will reduce the chances of a composite being returned as “probably prime”. 25 is a reasonable number; a composite number will then be identified as a prime with a probability of less than 2^(-50).
Miller-Rabin and similar tests can be more properly called compositeness tests. Numbers which fail are known to be composite but those which pass might be prime or might be composite. Only a few composites pass, hence those which pass are considered probably prime.
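To make the "compositeness test" idea concrete, here is a plain-PHP sketch of Miller-Rabin that does not use GMP. It relies on the known result that testing bases 2 and 3 gives a definite answer for every n below 1,373,653, which covers the 10^6 range from the question. The function names are made up, and the code assumes 64-bit integers so the intermediate products do not overflow.

```php
<?php
// Modular exponentiation by repeated squaring: ($base ** $exp) % $mod.
function powMod(int $base, int $exp, int $mod): int
{
    $result = 1;
    $base %= $mod;
    while ($exp > 0) {
        if ($exp & 1) {
            $result = ($result * $base) % $mod;
        }
        $base = ($base * $base) % $mod;
        $exp >>= 1;
    }
    return $result;
}

// Hypothetical helper: definite primality answer for n < 1,373,653.
function isPrimeBelowLimit(int $n): bool
{
    if ($n >= 1373653) {
        throw new InvalidArgumentException('bases 2 and 3 are only sufficient below 1,373,653');
    }
    if ($n < 2) {
        return false;
    }
    foreach ([2, 3, 5, 7] as $p) { // quick trial division
        if ($n % $p === 0) {
            return $n === $p;
        }
    }
    // Write n - 1 as d * 2^s with d odd.
    $d = $n - 1;
    $s = 0;
    while ($d % 2 === 0) {
        $d = intdiv($d, 2);
        $s++;
    }
    foreach ([2, 3] as $a) {
        $x = powMod($a, $d, $n);
        if ($x === 1 || $x === $n - 1) {
            continue; // this base does not prove compositeness
        }
        $composite = true;
        for ($r = 1; $r < $s; $r++) {
            $x = ($x * $x) % $n;
            if ($x === $n - 1) {
                $composite = false;
                break;
            }
        }
        if ($composite) {
            return false; // base $a is a witness: n is definitely composite
        }
    }
    return true;
}

var_dump(isPrimeBelowLimit(999983)); // bool(true)
var_dump(isPrimeBelowLimit(2047));   // bool(false), 2047 = 23 * 89 passes base 2 but not base 3
```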
Sorry for the title. I wasn't sure how to ask this question.
I have a form on a website that asks a question. The answers are in check box form. Each answer is saved into my database with a 'score', the values look like this:
Allergy 1
Cardiology 2
Chest Disease 4
Dermatology 8
Emergency Room 16
Ambulance Trips 32
Gastroenterology 64
General Medicine 128
Gynecology 256
Hematology 512
Neurology 1024
Obstetrics 2048
Opthamology 4096
Orthopedics 8192
Physical Therapy 16384
Plastic Surgery 32768
Podiatry 65536
Proctology 131072
Psychiatry 262144
Surgery Performed 524288
Thoracic Surgery 1048576
Urology 2097152
Outside X-Rays 4194304
Diagnostic Tests (outside) 8388608
As you can see, the score is the previous value times two. When a user fills out the form, the answer is saved in the database as one value - all the answers added together.
For example, a user selected the values: Allergy, General Medicine, Hematology, Obstetrics. In the database, the answer for this question is saved as 2689.
Is there a way to figure out what answers have been selected by only having the answer to the question?
For example, I would query my database and pull the 2689 value, and I need to determine what answers were checked.
edit: I was hoping to reverse engineer the answer in PHP.
Yes, this is a common pattern called bit masking. Use your language's binary AND operator on the value corresponding to a given answer and the value submitted from the form to see if the given answer was one of the selected choices. For example, if the answer submitted and saved is 2689 as in your example, you can check whether "chest disease" was one of the selected choices by seeing if 2689 & 4 is nonzero. (& should be substituted with whatever the binary AND operator is in your language of choice.)
Note that this only works as long as all the values corresponding to individual choices are distinct powers of 2. In general, the question posed in your title, about finding out which numbers from a given set have been added to come up with a given sum, is an instance of the subset sum problem (a special case of the knapsack problem). It is NP-complete: no efficient general algorithm is known, and the brute-force approach of checking every possible combination is very inefficient.
You can find the values by ANDing with powers of 2.
2^0 = 1
2^1 = 2
2^2 = 4
2^3 = 8
...
2^23 = 8388608
You can find out the value of 2^n using binary shifting like this: 1 << n
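PHP code:
// Names in score order, so that $item[$power] corresponds to the score 2^$power.
$item = array(
"Allergy", "Cardiology", "Chest Disease", "Dermatology", "Emergency Room", "Ambulance Trips",
"Gastroenterology", "General Medicine", "Gynecology", "Hematology", "Neurology", "Obstetrics",
"Opthamology", "Orthopedics", "Physical Therapy", "Plastic Surgery", "Podiatry", "Proctology",
"Psychiatry", "Surgery Performed", "Thoracic Surgery", "Urology", "Outside X-Rays", "Diagnostic Tests (outside)"
);
$answer = 2689;
for ( $power = 0; $power < count($item); $power++ ) {
// (1 << $power) is 2^$power; a non-zero AND means that answer was selected
if ( (1 << $power) & $answer ) {
echo $item[$power] . "\n";
}
}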
Edit: made it more php friendly
Yes, there is. Note that each k'th "score" is of the form 2^(k - 1), which corresponds to a bitstring with only the k'th bit set. If you know which bits are set, you can reconstruct the sum.
Taking 2689 as an example, we first need to write it out in binary:
2689 = 101010000001b
Counting from the right, we see that the first, eighth, tenth and twelfth bits are set, so (as you can verify)
2689 = 2^0 + 2^7 + 2^9 + 2^11
= 1 + 128 + 512 + 2048
The actual implementation of this can be done efficiently using bitwise operations. By taking the AND of the value and each of the "scores" in turn, then checking whether that gives a non-zero value, we can check which scores went into the sum.
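A minimal sketch of that AND-in-turn check, together with the reverse direction. The function names are made up for illustration, and only a few of the 24 entries from the question are listed to keep it short.

```php
<?php
// A subset of the name => score table from the question.
$scores = [
    'Allergy'          => 1,
    'Cardiology'       => 2,
    'Chest Disease'    => 4,
    'General Medicine' => 128,
    'Hematology'       => 512,
    'Obstetrics'       => 2048,
];

// Hypothetical helper: names whose bit is set in the stored value.
function decodeAnswers(int $stored, array $scores): array
{
    $selected = [];
    foreach ($scores as $name => $score) {
        if (($stored & $score) !== 0) {
            $selected[] = $name;
        }
    }
    return $selected;
}

// Hypothetical helper: combine the selected names back into one value.
function encodeAnswers(array $names, array $scores): int
{
    $stored = 0;
    foreach ($names as $name) {
        $stored |= $scores[$name]; // OR sets the bit; same as adding for distinct powers of 2
    }
    return $stored;
}

print_r(decodeAnswers(2689, $scores));
// Allergy, General Medicine, Hematology, Obstetrics
```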
This will do exactly what you wanted:
<?php
Print bindecValues("2689");
function bindecValues($decimal, $reverse=false, $inverse=false) {
$bin = decbin($decimal);
if ($inverse) {
$bin = str_replace("0", "x", $bin);
$bin = str_replace("1", "0", $bin);
$bin = str_replace("x", "1", $bin);
}
$total = strlen($bin);
$stock = array();
for ($i = 0; $i < $total; $i++) {
if ($bin[$i] != 0) {
$bin_2 = str_pad($bin[$i], $total - $i, "0");
array_push($stock, bindec($bin_2));
}
}
$reverse ? rsort($stock):sort($stock);
return implode(", ", $stock);
}
?>
Happy coding
Remember that integers are stored in binary - so each of these flags (Allergy = 1) etc. will correspond to a single bit being true or false in the binary representation of the sum.
For example, 2689 in binary is 0000 1010 1000 0001 which, if you think of it as an array of bits, where the least significant bit (right most in that array) is the least significant flag (allergy) then we can easily see that the first (allergy), eighth (gen. medicine), tenth (hematology) and twelfth (obs) slots of the array are marked with a 1 for true.
The largest value in your array of flags is the 24th bit in a 32-bit integer. You could define up to 8 more flags in this system before having to use a larger integer.
Since all your numbers seem to be powers of two, you just need to store the input value in a long enough integer to hold it, then bit mask.
if( value & 1 ) then 1 was part of the selection
if( value & 2 ) then 2 was part of the selection
if( value & 4 ) then 4 was part of the selection
and so on
I am doing this programming challenge which can be found at www.interviewstreet.com (it's the first challenge, worth 30 points).
When I submitted the solution, I was returned a result which said that the answer was wrong because it only passed 1/11 test cases. However, I feel I have tested various cases and do not understand what I am doing wrong. It would be helpful to know what those test cases could be so that I can test my program.
Here is the question (in between the grey lines below):
Quadrant Queries (30 points)
There are N points in the plane. The ith point has coordinates (xi, yi). Perform the following queries:
1) Reflect all points between point i and j both including along the X axis. This query is represented as "X i j"
2) Reflect all points between point i and j both including along the Y axis. This query is represented as "Y i j"
3) Count how many points between point i and j both including lie in each of the 4 quadrants. This query is represented as "C i j"
Input:
The first line contains N, the number of points. N lines follow.
The ith line contains xi and yi separated by a space.
The next line contains Q the number of queries. The next Q lines contain one query each, of one of the above forms.
All indices are 1 indexed.
Output:
Output one line for each query of the type "C i j". The corresponding line contains 4 integers; the number of points having indices in the range [i..j] in the 1st,2nd,3rd and 4th quadrants respectively.
Constraints:
1 <= N <= 100000
1 <= Q <= 100000
You may assume that no point lies on the X or the Y axis.
All (xi,yi) will fit in a 32-bit signed integer
In all queries, 1 <=i <=j <=N
Sample Input:
4
1 1
-1 1
-1 -1
1 -1
5
C 1 4
X 2 4
C 3 4
Y 1 2
C 1 3
Sample Output:
1 1 1 1
1 1 0 0
0 2 0 1
Explanation:
When a query says "X i j", it means that take all the points between indices i and j both including and reflect those points along the X axis. The i and j here have nothing to do with the co-ordinates of the points. They are the indices. i refers to point i and j refers to point j
'C 1 4' asks you to 'Consider the set of points having index in {1,2,3,4}. Amongst those points, how many of them lie in the 1st,2nd,3rd and 4th quads respectively?'
The answer to this is clearly 1 1 1 1.
Next we reflect the points between indices '2 4' along the X axis. So the new coordinates are :
1 1
-1 -1
-1 1
1 1
Now 'C 3 4' is 'Consider the set of points having index in {3,4}. Amongst those points, how many of them lie in the 1st,2nd,3rd and 4th quads respectively?' Point 3 lies in quadrant 2 and point 4 lies in quadrant 1.
So the answer is 1 1 0 0
I'm coding in PHP and the method for testing is with STDIN and STDOUT.
Any ideas on difficult test cases to test my code with? I don't understand why I am failing 10 / 11 test cases.
Also, here is my code if you're interested:
// The global variable that will be changed
$points = array();
/******** Functions ********/
// This function returns the number of points in each quadrant.
function C($beg, $end) {
// $quad_count is a local array and not global as this gets reset for every C operation
$quad_count = array("I" => 0, "II" => 0, "III" => 0, "IV" => 0);
for($i=$beg; $i<$end+1; $i++) {
$quad = checkquad($i);
$quad_count[$quad]++;
}
return $quad_count["I"]." ".$quad_count["II"]." ".$quad_count["III"]." ".$quad_count["IV"];
}
// Reflecting over the x-axis means taking the negative value of y for all given points
function X($beg, $end) {
global $points;
for($i=$beg; $i<$end+1; $i++) {
$points[$i]["y"] = -1*($points[$i]["y"]);
}
}
// Reflecting over the y-axis means taking the negative value of x for all given points
function Y($beg, $end) {
global $points;
for($i=$beg; $i<$end+1; $i++) {
$points[$i]["x"] = -1*($points[$i]["x"]);
}
}
// Determines which quadrant a given point is in
function checkquad($i) {
global $points;
$x = $points[$i]["x"];
$y = $points[$i]["y"];
if ($x > 0) {
if ($y > 0) {
return "I";
} else {
return "IV";
}
} else {
if ($y > 0) {
return "II";
} else {
return "III";
}
}
}
// First, retrieve the number of points that will be provided. Make sure to check constraints.
$no_points = intval(fgets(STDIN));
if ($no_points > 100000) {
fwrite(STDOUT, "The number of points cannot be greater than 100,000!\n");
exit;
}
// Remember the points are 1 indexed so begin key from 1. Store all provided points in array format.
for($i=1; $i<$no_points+1; $i++) {
global $points;
list($x, $y) = explode(" ",fgets(STDIN)); // Get the string returned from the command line and convert to an array
$points[$i]["x"] = intval($x);
$points[$i]["y"] = intval($y);
}
// Retrieve the number of operations that will be provided. Make sure to check constraints.
$no_operations = intval(fgets(STDIN));
if($no_operations > 100000) {
fwrite(STDOUT, "The number of operations cannot be greater than 100,000!\n");
exit;
}
// Retrieve the operations, determine the type and send to the appropriate functions. Make sure i <= j.
for($i=0; $i<$no_operations; $i++) {
$operation = explode(" ",fgets(STDIN));
$type = $operation[0];
if($operation[1] > $operation[2]) {
fwrite(STDOUT, "Point j must be further in the sequence than point i!\n");
exit;
}
switch ($type) {
case "C":
$output[$i] = C($operation[1], $operation[2]);
break;
case "X":
X($operation[1], $operation[2]);
break;
case "Y":
Y($operation[1], $operation[2]);
break;
default:
$output[$i] = "Sorry, but we do not recognize this operation. Please try again!";
}
}
// Print the output as a string
foreach($output as $line) {
fwrite(STDOUT, $line."\n");
}
UPDATE:
I finally found a test case for which my program fails. Now I am trying to determine why. This is a good lesson on testing with large numbers.
10
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
12
C 1 10
X 1 3
C 5 5
Y 2 10
C 10 10
C 1 10
X 1 3
C 5 5
Y 2 10
C 10 10
X 3 7
C 9 9
I am going to test this properly by initializing an error array and determining which operations are causing an issue.
I discovered a test case that failed and understood why. I am posting this answer here so it's clear to everyone.
I placed a constraint on the program so that j must be greater than i, otherwise an error should be returned. I noticed an error with the following test case:
10
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1
C 2 10
The error was returned for the operation C. Essentially the program believed that "2" was greater than "10". The reason for this, I discovered, was the following:
fgets() returns the line as a string, including the trailing newline. After explode(), the fields are still strings, and the last one is "10\n" rather than "10". Because "10\n" is not a numeric string (at least before PHP 8, which accepts trailing whitespace), PHP compares "2" and "10\n" as strings, character by character, and "2" sorts after "1".
One solution to this is to use the sscanf() function and basically tell the program to expect a number. Example: for "C 2 10" you could use:
$operation_string = fgets(STDIN);
list($type, $begpoint, $endpoint) = sscanf($operation_string, "%s %d %d");
I submitted the new solution using sscanf() and now have 3/11 test cases passed. It did not check any more test cases because the CPU time limit was exceeded. So, now I have to go back and optimize my algorithm.
Back to work! :)
To answer, "What are those test cases?" Try this "solution":
<?php
$postdata = http_build_query(
array(
'log' => file_get_contents('php://stdin')
)
);
$opts = array('http' =>
array(
'method' => 'POST',
'header' => 'Content-type: application/x-www-form-urlencoded',
'content' => $postdata
)
);
$context = stream_context_create($opts);
file_get_contents('http://myserver/answer.php', false, $context);
?>
On your server:
<?php
$fp = fopen('/tmp/answers.log', 'a');
fputs($fp, $_POST['log']."\n");
fclose($fp);
?>
Edit:
I did that. And came up with this being your main problem (I think):
$operation = explode(" ",fgets(STDIN));
Change that to:
$operation = explode(" ",trim(fgets(STDIN)));
Because otherwise "9" > "41\n" due to string comparison: the untrimmed line still ends with a newline, so the last field is not treated as a number. You should make that fix in any place you read a line.
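The pitfall and the fix can be reproduced without STDIN. This is a sketch with a hard-coded line; parseQuery() is a hypothetical helper name. strcmp() is used to show the string ordering explicitly, because the result of a plain > on these strings depends on the PHP version.

```php
<?php
$line = "C 2 10\n"; // what fgets(STDIN) hands back, newline included

// Without trim() the last field keeps the newline (3 bytes instead of 2)...
$raw = explode(" ", $line);
var_dump(strlen($raw[2]));

// ...and a byte-wise string comparison puts "2" after "10\n".
var_dump(strcmp($raw[1], $raw[2]) > 0); // bool(true)

// Hypothetical helper: parse one query line into a type and two integers.
function parseQuery(string $line): array
{
    $parts = explode(" ", trim($line));
    return [$parts[0], (int) $parts[1], (int) $parts[2]];
}

list($type, $i, $j) = parseQuery($line);
var_dump($i > $j); // bool(false): 2 and 10 are compared as integers

// sscanf() with %d reaches the same result in one call.
$scanned = sscanf($line, "%s %d %d");
```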
As far as I can tell, this solution won't work. Even if you solve the Wrong Answer problem, the solution will time out.
I was able to figure out a way to return the quadrant counts in O(1) time,
but I was not able to make the reflections any faster. :(
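One technique that is commonly used for this kind of "range update, range count" problem is a segment tree with lazy propagation, which brings both the reflections and the counts down to O(log N) per query. None of the posts above describe it, so treat the following as a swapped-in sketch under that assumption; the class and method names are made up, and it has only been checked against the sample from the problem statement.

```php
<?php
// Each node keeps the number of points per quadrant [I, II, III, IV] for its
// index range, plus two flags for reflections not yet passed to its children.
final class QuadrantTree
{
    private $n;
    private $cnt = [];
    private $flipX = [];
    private $flipY = [];

    public function __construct(array $points) // list of [x, y] pairs
    {
        $this->n = count($points);
        $this->build(1, 1, $this->n, array_values($points));
    }

    // "X i j" or "Y i j", 1-indexed and inclusive.
    public function reflect(string $axis, int $i, int $j): void
    {
        $this->update(1, 1, $this->n, $i, $j, $axis === 'X', $axis === 'Y');
    }

    // "C i j": counts for quadrants I..IV.
    public function count(int $i, int $j): array
    {
        return $this->query(1, 1, $this->n, $i, $j);
    }

    private static function sum(array $l, array $r): array
    {
        return [$l[0] + $r[0], $l[1] + $r[1], $l[2] + $r[2], $l[3] + $r[3]];
    }

    private function build(int $node, int $lo, int $hi, array $points): void
    {
        $this->flipX[$node] = $this->flipY[$node] = false;
        if ($lo === $hi) {
            list($x, $y) = $points[$lo - 1];
            $quadrant = $x > 0 ? ($y > 0 ? 0 : 3) : ($y > 0 ? 1 : 2);
            $this->cnt[$node] = [0, 0, 0, 0];
            $this->cnt[$node][$quadrant] = 1;
            return;
        }
        $mid = intdiv($lo + $hi, 2);
        $this->build(2 * $node, $lo, $mid, $points);
        $this->build(2 * $node + 1, $mid + 1, $hi, $points);
        $this->cnt[$node] = self::sum($this->cnt[2 * $node], $this->cnt[2 * $node + 1]);
    }

    // Reflect a whole node: swap its counts and toggle its pending flags.
    private function apply(int $node, bool $fx, bool $fy): void
    {
        list($a, $b, $c, $d) = $this->cnt[$node];
        if ($fx) { list($a, $b, $c, $d) = [$d, $c, $b, $a]; } // I<->IV, II<->III
        if ($fy) { list($a, $b, $c, $d) = [$b, $a, $d, $c]; } // I<->II, III<->IV
        $this->cnt[$node] = [$a, $b, $c, $d];
        $this->flipX[$node] = ($this->flipX[$node] !== $fx);
        $this->flipY[$node] = ($this->flipY[$node] !== $fy);
    }

    // Hand pending reflections down to the two children.
    private function push(int $node): void
    {
        if ($this->flipX[$node] || $this->flipY[$node]) {
            $this->apply(2 * $node, $this->flipX[$node], $this->flipY[$node]);
            $this->apply(2 * $node + 1, $this->flipX[$node], $this->flipY[$node]);
            $this->flipX[$node] = $this->flipY[$node] = false;
        }
    }

    private function update(int $node, int $lo, int $hi, int $i, int $j, bool $fx, bool $fy): void
    {
        if ($j < $lo || $hi < $i) { return; }
        if ($i <= $lo && $hi <= $j) { $this->apply($node, $fx, $fy); return; }
        $this->push($node);
        $mid = intdiv($lo + $hi, 2);
        $this->update(2 * $node, $lo, $mid, $i, $j, $fx, $fy);
        $this->update(2 * $node + 1, $mid + 1, $hi, $i, $j, $fx, $fy);
        $this->cnt[$node] = self::sum($this->cnt[2 * $node], $this->cnt[2 * $node + 1]);
    }

    private function query(int $node, int $lo, int $hi, int $i, int $j): array
    {
        if ($j < $lo || $hi < $i) { return [0, 0, 0, 0]; }
        if ($i <= $lo && $hi <= $j) { return $this->cnt[$node]; }
        $this->push($node);
        $mid = intdiv($lo + $hi, 2);
        return self::sum(
            $this->query(2 * $node, $lo, $mid, $i, $j),
            $this->query(2 * $node + 1, $mid + 1, $hi, $i, $j)
        );
    }
}

// The sample from the problem statement.
$tree = new QuadrantTree([[1, 1], [-1, 1], [-1, -1], [1, -1]]);
echo implode(' ', $tree->count(1, 4)), "\n"; // 1 1 1 1
$tree->reflect('X', 2, 4);
echo implode(' ', $tree->count(3, 4)), "\n"; // 1 1 0 0
$tree->reflect('Y', 1, 2);
echo implode(' ', $tree->count(1, 3)), "\n"; // 0 2 0 1
```

The two reflections commute, which is why a pair of boolean flags per node is enough to describe any pending combination.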