PHP array memory consumption

Please explain how this works. Why does adding a value to an array from a variable, instead of from a literal, increase memory consumption roughly tenfold?
PHP 7.1.17
First example:
<?php
ini_set('memory_limit', '1G');
$array = [];
$row = 0;
while ($row < 2000000) {
$array[] = [1];
if ($row % 100000 === 0) {
echo (memory_get_usage(true) / 1000000) . PHP_EOL;
}
$row++;
}
Total memory usage ~70MB
Second example:
<?php
ini_set('memory_limit', '1G');
$array = [];
$a = 1;
$row = 0;
while ($row < 2000000) {
$array[] = [$a];
if ($row % 100000 === 0) {
echo (memory_get_usage(true) / 1000000) . PHP_EOL;
}
$row++;
}
Total memory usage ~785MB
Also, there is no difference in memory consumption if the resulting array is one-dimensional.

The key thing here is that [1], although it's a complex value, is a constant - the compiler can trivially know that it's the same every time it's used.
Since PHP uses a "copy on write" system when multiple variables have the same value, the compiler can actually construct the "zval" structure for the array before the code is run, and just increment its reference counter each time a new variable or array value points to it. (If any of them are modified later, they will be "separated" into a new zval before modification, so at that point an extra copy will be made anyway.)
So (using 42 to stand out more), this:
$bar = [];
$bar[] = [42];
Compiles to this (VLD output generated with https://3v4l.org):
compiled vars: !0 = $bar
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
3 0 E > ASSIGN !0, <array>
4 1 ASSIGN_DIM !0
2 OP_DATA <array>
3 > RETURN 1
Note that the 42 doesn't even show up in the VLD output, it's implicit in the second <array>. So the only memory usage is for the outer array to store a long list of pointers, which all happen to point to the same zval.
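The copy-on-write behaviour described above can be observed directly with memory_get_usage(). This is a minimal sketch rather than part of the original answer: the function name cowDemo and the array size are arbitrary choices for illustration. Assigning an array to a second variable should allocate almost nothing, and memory should only grow once the copy is written to.

```php
<?php
// Hypothetical demo: measures memory growth when an array is assigned
// to a second variable, and again after that second variable is modified.
function cowDemo(int $size): array
{
    $original = range(1, $size);
    $copy = null;
    $start = memory_get_usage();

    $copy = $original;                          // shares the same array, refcount + 1
    $afterAssign = memory_get_usage() - $start;

    $copy[] = 0;                                // the write forces a separate copy
    $afterWrite = memory_get_usage() - $start;

    return [$afterAssign, $afterWrite];
}

list($afterAssign, $afterWrite) = cowDemo(100000);
echo "after assignment: $afterAssign bytes", PHP_EOL;
echo "after write:      $afterWrite bytes", PHP_EOL;
```

The exact byte counts depend on the PHP version and build, so only the relative difference between the two numbers is meaningful.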
When using a variable like [$a], on the other hand, there is no guarantee that the values will all be the same. It's possible to analyse the code and deduce that they will be, so OpCache might apply some optimisations, but on its own:
$a = 42;
$foo = [];
$foo[] = [$a];
Compiles to:
compiled vars: !0 = $a, !1 = $foo
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
3 0 E > ASSIGN !0, 42
4 1 ASSIGN !1, <array>
5 2 INIT_ARRAY ~5 !0
3 ASSIGN_DIM !1
4 OP_DATA ~5
5 > RETURN 1
Note the extra INIT_ARRAY opcode - that's a new zval being created with the value of [$a]. This is where all your extra memory goes - every iteration will create a new array that happens to have the same contents.
It's relevant to point out here that if $a were itself a complex value - an array or object - it would not be copied on each iteration, as it would have its own reference counter. You'd still be creating a new array each time around the loop, but those arrays would all contain a copy-on-write pointer to $a, not a copy of it. This doesn't happen for integers (in PHP 7) because it's actually cheaper to store the integer directly than to store a pointer to somewhere else that stores the integer.
One more variation worth looking at, because it may be an optimisation you can make by hand:
$a = 42;
$b = [$a];
$foo = [];
$foo[] = $b;
VLD output:
compiled vars: !0 = $a, !1 = $b, !2 = $foo
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
3 0 E > ASSIGN !0, 42
4 1 INIT_ARRAY ~4 !0
2 ASSIGN !1, ~4
5 3 ASSIGN !2, <array>
6 4 ASSIGN_DIM !2
5 OP_DATA !1
7 6 > RETURN 1
Here, we have an INIT_ARRAY opcode when we create $b, but not when we add it to $foo. The ASSIGN_DIM will see that it's safe to reuse the $b zval each time, and increment its reference counter. I haven't tested, but I believe this will take you back to the same memory usage as the constant [1] case.
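The answer leaves that last claim untested. One rough way to check it is sketched below. The helper measureRows and the row count are assumptions made for illustration, and the value is passed in as a parameter so that it cannot be treated as a compile-time constant.

```php
<?php
// Hypothetical helper: appends $n rows to an array and returns the bytes
// allocated while doing so. With $reuse = true one array is built before
// the loop and shared; otherwise [$value] is created afresh each iteration.
function measureRows(int $n, int $value, bool $reuse): int
{
    $shared = [$value];
    $rows = [];
    $before = memory_get_usage();
    for ($i = 0; $i < $n; $i++) {
        if ($reuse) {
            $rows[] = $shared;      // only bumps the reference counter
        } else {
            $rows[] = [$value];     // INIT_ARRAY: a new array every time
        }
    }
    return memory_get_usage() - $before;
}

$fresh = measureRows(100000, 42, false);
$reused = measureRows(100000, 42, true);
echo "fresh array per row: $fresh bytes", PHP_EOL;
echo "one shared array:    $reused bytes", PHP_EOL;
```

If the explanation above holds, the second figure should be a small fraction of the first, since only the outer array grows. Absolute numbers will differ between PHP versions.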
A final way to verify that copy-on-write is in use here is to use debug_zval_dump, which shows the reference count of a value. The exact numbers are always a bit off, because passing the variable to the function itself creates one or more references, but you can get a good idea from the relative values:
Constant array:
$foo = [];
for($i=0; $i<100; $i++) {
$foo[] = [42];
}
debug_zval_dump($foo[0]);
Shows refcount of 102, as value is shared across 100 copies.
Identical but not constant array:
$a = 42;
$foo = [];
for($i=0; $i<100; $i++) {
$foo[] = [$a];
}
debug_zval_dump($foo[0]);
Shows refcount of 2, as each value has its own zval.
Array constructed once and reused explicitly:
$a = 42;
$b = [$a];
$foo = [];
for($i=0; $i<100; $i++) {
$foo[] = $b;
}
debug_zval_dump($foo[0]);
Shows refcount of 102, as value is shared across 100 copies.
Complex value inside (also try $a = new stdClass etc):
$a = [1,2,3,4,5];
$foo = [];
for($i=0; $i<100; $i++) {
$foo[] = [$a];
}
debug_zval_dump($foo[0]);
Shows refcount of 2, but the inner array has a refcount of 102: there's a separate array for every outer item, but they all contain pointers to the zval created as $a.

Related

Pre Increment in the IF statement has no effect. Why?

The pre-increment in the IF statement has no effect. Can anyone please explain?
<?php
$x = 10;
if($x < ++$x)
{
echo "Hello World";
}
?>
The opcode dump below shows what happens:
$x is assigned the value 10
$x is then incremented to 11 in place
the comparison for the if is executed
Therefore, by the time the comparison runs you are effectively comparing $x (which is already 11) with the result of the increment (also 11), not the values 10 and 11.
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
2 0 E > ASSIGN !0, 10
3 1 PRE_INC ~2 !0
2 IS_SMALLER !0, ~2
3 > JMPZ ~3, ->5
5 4 > ECHO 'Hello+World'
7 5 > > RETURN 1
What your code really does is the following:
<?php
$x = 10;
++$x;
if ($x < $x) { // 11 < 11 is always false
    echo "Hello World";
}
The order in which the operands are evaluated is not guaranteed, which means $x may be incremented when you expect, or at some other time. The behavior of an increment inside a comparison is not well defined.
To fix this, do the increment in its own statement, where its behavior is well defined:
<?php
$x = 10;
$old = $x; // remember the value before incrementing
++$x;
if ($old < $x) {
    echo "hello world";
}
I say the pre-increment behavior is not well defined because it varies between versions of the PHP interpreter. Here's an example of the code working "as you'd think is expected": https://3v4l.org/n0v6n#v5.0.5
and here's how it "fails": https://3v4l.org/n0v6n#v7.0.25

Why do both of these post-increment expressions in PHP give the same answer?

I am trying to run the following code in PHP on localhost, but it's giving unexpected output!
<?php
$a = 1;
echo ($a+$a++); // 3
?>
// the answer is 3, but it should be 2 because of the post-increment
Here is another piece of code, and it gives the same answer! Why?
<?php
$a = 1;
echo ($a+$a+$a++);
?>
//answer is still 3 !!!
The PHP manual says the following:
Operator precedence and associativity only determine how expressions are grouped, they do not specify an order of evaluation. PHP does not (in the general case) specify in which order an expression is evaluated and code that assumes a specific order of evaluation should be avoided, because the behavior can change between versions of PHP or depending on the surrounding code.
What this comes down to is that PHP doesn't explicitly define the end result of these kinds of statements, and it may even change between PHP versions. We call this undefined behavior, and you shouldn't rely on it.
You might be able to find an exact reason somewhere in the source why this order is chosen, but there might not be any logic to it.
Your two examples are being evaluated as follows:
<?php
$a = 1;
echo ($a + $a++); // 3
?>
Really becomes:
<?php
$a = 1;
$b = $a++;
echo ($a + $b); // a = 2, b = 1
?>
Your second example:
<?php
$a = 1;
echo ($a + $a + $a++); // 3
?>
Becomes:
<?php
$a = 1;
$b = $a + $a; // 2
$c = $a++; // $c = 1, $a = 2
echo $b + $c; // 3
?>
I hope this kind of makes sense. You're right that there's no hard logic behind this.
The behavior of an increment operation on the same line as a calculation is not defined!
The compiler can generate different code than you expect.
A simple rule from my teacher:
Never use an increment/decrement operator on the same line as a calculation!
Its behavior is undefined: computers calculate in a different order than humans.
GOOD:
$d = $i++;
$i++;
for ($i = 0; $i < 5; $i++)
CAN BE A PROBLEM (you can read it wrong):
$d = $array[$i++];
WILL BE A PROBLEM:
$d = $i++ + 5 - --$k;
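As a sketch of how the last expression could be split up so that nothing depends on evaluation order, each increment or decrement can get its own statement. The starting values 3 and 7 are arbitrary assumptions chosen for illustration.

```php
<?php
$i = 3;
$k = 7;

// Same effect as: $d = $i++ + 5 - --$k;
--$k;               // pre-decrement: $k changes before it is used
$d = $i + 5 - $k;   // uses the old $i and the new $k
$i++;               // post-increment: $i changes after it was used

echo "$d $i $k", PHP_EOL; // 2 4 6
```

The three statements are longer, but each one reads in a single, unambiguous order.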
EDIT:
I wrote the same code in C++, checked the assembler output, and the result is as I said:
Math with increments is not intuitive for people, but you can't say that someone implemented it wrong.
As someone posted in a comment:
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
2 0 E > ASSIGN !0, 1
3 1 POST_INC ~2 !0
2 ADD ~3 !0, ~2
3 ECHO ~3
16 4 > RETURN 1
//$a = 1;
//echo ($a+$a++);
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
2 0 E > ASSIGN !0, 1
3 1 ADD ~2 !0, !0
2 POST_INC ~3 !0
3 ADD ~4 ~2, ~3
4 ECHO ~4
5 > RETURN 1
//$a = 1;
//echo ($a+$a+$a++);
Translated to human language:
$a = 1;
echo ($a + $a++);
// After changing to 'computer logic' (ASM):
$c = 1; // for increment operation
$b = $a; // keep value 'before' add +1 operation
$a += $c; // add +1 operation
$d = $a + $b; // calculate value for 'echo'
echo $d; // print result
It's because ++ is an increment operator that modifies the variable itself. So with
$a = 1
$a++ evaluates to 1 and leaves $a set to 2
($a+$a++) = (2+1) = 3
That's why it shows 3 as the answer.

I don't understand the pre-increment part in this code

Can anyone explain to me why the output of this code is 22, not 21?
$x=10;
$x+=++$x;
echo $x;
$x += ++$x;
The right-hand side of this assignment is evaluated first:
increment $x → $x is now 11, result of expression ++$x is 11
take the value of $x (11) and add the result of step 1 to it → 22
assign the result of step 2 to $x
x is being incremented by the stored incremented value of x.
x += (x = x + 1).
Internally, this is evaluated as:
# op ext return operands
-------------------------------------
1 ASSIGN !0, 10
2 PRE_INC $2 !0
3 ASSIGN_ADD 0 !0, $2
4 ECHO !0
Assign 10 to $x (referred to above as !0)
Pre-increment $x, i.e.
Add 1 to $x
Return the new result (11)
Increase $x (now 11) by the return value from step 2 (also 11)
Echo the result (22)
(Edited the VLD output for readability, see the full version here: https://3v4l.org/mftI4/vld#output)
It's all about order of operations.
$a+=$b is just a shorthand for $a = $a + $b. So now, "unroll" your 2nd line with that knowledge:
$x = $x + (++$x);
To assign value to $x, we must 1st evaluate right side of the assignment. To do that, we need first to perform the ++ operation, only then our variables on the right are ready to be added.
So what is operator ++ in this context? It is in turn a shorthand for a function, that does something similar to this:
function preIncrement(&$variable) {
$variable = $variable + 1;
return $variable;
}
Note that variable is a reference (&$variable, rather than $variable). What that means is that inside that function, if we modify variable, it will modify the variable that was passed to it, OUTSIDE. So when we pass $x, the function increases $x and then returns some number value. That number value is being replaced in the right side of the assignment.
So what that line really looks like is:
$x = $x + preIncrement($x);
So, to evaluate it we first need to execute the function in the assignment and get its return value. It happens to be 11. Great, now we know we need to add 11 to $x.
$x = $x + 11;
Great, let's just read the current value of $x and we can assign. $x is 11: the preIncrement function increased it to 11 when we executed it. So:
$x = 11 + 11;
So now, $x is 22.
Let's compare that to post incrementation:
$x+=$x++;
Unrolling += ...
$x = $x + $x++;
As before, we need to get return value of $x++ before we can evaluate. Post incrementation looks something like this:
function postIncrement(&$variable) {
$oldValue = $variable;
$variable = $variable + 1;
return $oldValue;
}
So it takes our $x, increases its value, but returns the original value. As a result, the $x++ gets evaluated as 10. Now we arrive at:
$x = $x + 10;
OK, let's evaluate $x. $x is 11. Post increment increased it when we executed it. So:
$x = 11 + 10;
So $x is 21 in that case.
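The two walkthroughs above can be run side by side. This sketch uses the preIncrement and postIncrement functions from this answer (the int type declarations are my addition) and calls them in separate statements, so the result does not depend on how PHP orders the operands.

```php
<?php
function preIncrement(int &$variable): int
{
    $variable = $variable + 1;
    return $variable;
}

function postIncrement(int &$variable): int
{
    $oldValue = $variable;
    $variable = $variable + 1;
    return $oldValue;
}

// $x += ++$x, one step at a time
$x = 10;
$rhs = preIncrement($x);   // $x is now 11, $rhs is 11
$pre = $x + $rhs;          // 11 + 11

// $x += $x++, one step at a time
$x = 10;
$rhs = postIncrement($x);  // $x is now 11, $rhs is 10
$post = $x + $rhs;         // 11 + 10

echo $pre, ' ', $post, PHP_EOL; // 22 21
```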
Hope this helps you.

Make an output of 1 2 4 8 11 12 14 18... using one for loop

How can I generate a sequence of numbers like
1 2 4 8 11 12 14 18 ...
(plus 10 every 4 numbers) with the following additional requirements:
using only one loop
output should stop when a value in the sequence is greater than a specified input
Examples
$input = 24;
1 2 4 8 11 12 14 18 21 22 24
$input = 20;
1 2 4 8 11 12 14 18
Here's what I tried so far:
<?php
// sample user input
$input = 20;
$number = 1;
$counter = 0;
$array = array();
//conditions
while ($counter < 4) {
$counter++;
array_push($array, $number);
$number += $number;
}
//outputs
for ($x = 0; $x < count($array); $x++) {
echo $array[$x];
echo " ";
}
Code: (Demo)
function arrayBuilder($max,$num=1){
$array=[];
for($i=0; ($val=($num<<$i%4)+10*floor($i/4))<=$max; ++$i){
$array[]=$val;
}
return $array;
}
echo implode(',',arrayBuilder(28)),"\n"; // 1,2,4,8,11,12,14,18,21,22,24,28
echo implode(',',arrayBuilder(28,2)),"\n"; // 2,4,8,16,12,14,18,26,22,24,28
echo implode(',',arrayBuilder(20)),"\n"; // 1,2,4,8,11,12,14,18
echo implode(',',arrayBuilder(24)),"\n"; // 1,2,4,8,11,12,14,18,21,22,24
This method is very similar to localheinz's answer, but uses a technique introduced to me by beetlejuice which is faster and php version safe. I only read localheinz's answer just before posting; this is a matter of nearly identical intellectual convergence. I am merely satisfying the brief with the best methods that I can think of.
How/Why does this work without a lookup array or if statements?
When you call arrayBuilder(), you must send a $max value (representing the highest possible value in the returned array) and optionally, you can nominate $num (the first number in the returned array) otherwise the default value is 1.
Inside arrayBuilder(), $array is declared as an empty array. This is important if the user's input value(s) do not permit a single iteration in the for loop. This line of code is essential for good coding practices to ensure that under no circumstances should a Notice/Warning/Error occur.
A for loop is the most complex loop in php (so says the manual), and its three expressions are the perfect way to package the techniques that I use.
The first expression $i=0; is something that php developers see all of the time. It is a one-time declaration of $i equalling 0 which only occurs before the first iteration.
The second expression is the only tricky/magical aspect of my entire code block. This expression is evaluated before every iteration. I'll try to break it down (the parentheses are vital to this expression, to avoid unintended results due to operator precedence):
( open parenthesis to contain leftside of comparison operator
$val= declare $val for use inside loop on each iteration
($num<<$i%4) because of precedence this is the same as $num<<($i%4), meaning: find the remainder of $i divided by 4, then use the bitwise "shift left" operator to double $num once for each unit of that remainder. This is a very fast way of achieving the 4-number pattern of [don't double], [double once], [double twice], [double three times] to create: 1,2,4,8 (or 2,4,8,16 when $num is 2), and so on. Bitwise shifts are generally at least as cheap as the equivalent arithmetic. The use of the arithmetic operator modulo ensures that the intended core number pattern repeats every four iterations.
+ add (not concatenation in case there is any confusion)
10*floor($i/4) round down $i divided by 4 then multiply by 10 so that the first four iterations get a bonus of 0, the next four get 10, the next four get 20, and so on.
) closing parenthesis to contain leftside of comparison operator
<=$max allow iteration until the $max value is exceeded.
++$i is pre-incrementing $i at the end of every iteration.
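The breakdown above can also be written out one piece at a time. This is only an illustrative sketch: sequenceTerm is a hypothetical helper that is not part of the answer, and it uses intdiv() (PHP 7+) in place of floor() so that the result stays an integer.

```php
<?php
// Hypothetical helper: computes the term at zero-based position $i.
function sequenceTerm(int $i, int $num = 1): int
{
    $doubled = $num << ($i % 4);   // 1, 2, 4, 8, repeating every four steps
    $bonus = 10 * intdiv($i, 4);   // 0 for the first four terms, then 10, 20, ...
    return $doubled + $bonus;
}

$terms = [];
for ($i = 0; $i < 8; $i++) {
    $terms[] = sequenceTerm($i);
}
echo implode(' ', $terms), PHP_EOL; // 1 2 4 8 11 12 14 18
```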
Complex solution using while loop:
$input = 33;
$result = [1]; // result array
$k = 0; // coefficient
$n = 1;
while ($n < $input) {
$size = count($result); // current array size
if ($size < 4) { // filling 1st 4 values (i.e. 1, 2, 4, 8)
$n += $n;
$result[] = $n;
}
if ($size % 4 == 0) { // determining each 4-values sequence
$multiplier = 10 * ++$k;
}
if ($size >= 4) {
$n = $multiplier + $result[$size - (4 * $k)];
if ($n > $input) { // stop only once the value exceeds the input
break;
}
$result[] = $n;
}
}
print_r($result);
The output:
Array
(
[0] => 1
[1] => 2
[2] => 4
[3] => 8
[4] => 11
[5] => 12
[6] => 14
[7] => 18
[8] => 21
[9] => 22
[10] => 24
[11] => 28
[12] => 31
[13] => 32
)
On closer inspection, each value in the sequence of values you desire can be calculated by adding the corresponding values of two sequences.
Sequence A
0 0 0 0 10 10 10 10 20 20 20 20
Sequence B
1 2 4 8 1 2 4 8 1 2 4 8
Total
1 2 4 8 11 12 14 18 21 22 24 28
Solution
Prerequisite
The index of each sequence starts at 0. Alternatively, it could start at 1, but then we would have to subtract 1, so to keep things simple, we start at 0.
Sequence A
$a = 10 * floor($n / 4);
The function floor() accepts a numeric value, and will cut off the fraction.
Also see https://en.wikipedia.org/wiki/Floor_and_ceiling_functions.
Sequence B
$b = 2 ** ($n % 4);
The operator ** combines a base with the exponent and calculates the result of raising base to the power of exponent.
In PHP versions prior to PHP 5.6 you will have to resort to using pow(), see http://php.net/manual/en/function.pow.php.
The operator % combines two values and calculates the remainder of dividing the former by the latter.
Total
$value = $a + $b;
Putting it together
$input = 20;
// sequence a
$a = function ($n) {
return 10 * floor($n / 4);
};
// sequence b
$b = function ($n) {
return 2 ** ($n % 4);
};
// collect values in an array
$values = [];
// use a for loop, stop looping when value is greater than input
for ($n = 0; $input >= $value = $a($n) + $b($n) ; ++$n) {
$values[] = $value;
}
echo implode(' ', $values);
For reference, see:
http://php.net/manual/en/control-structures.for.php
http://php.net/manual/en/function.floor.php
http://php.net/manual/en/language.operators.arithmetic.php
http://php.net/manual/en/function.implode.php
For an example, see:
https://3v4l.org/pp9Ci

Which construction is faster?

Which construction is faster:
$a = $b * $c ? $b * $c : 0;
or
$i = $b * $c;
$a = $i ? $i : 0;
All variables are local ones.
Does the speed differ for multiplication, addition, subtraction and division?
Update:
Here's some clarification:
This is a theoretical question about writing speed-optimized code from scratch. Not about "searching bottlenecks".
I can measure code speed myself, but this was not a homework question about using microtime(). It was a question about how the PHP interpreter works (which I tried to figure out by searching Google myself, but was unsuccessful).
Moreover, I did measure it myself and was a little confused: different starting values of $a, $b and $c (combinations of zero, negative, positive, integer and float values) produce different results between the constructions.
BoltClock provided me with useful info, but user576875 made my day by posting a link to an opcode decoder! His answer also contains a direct answer to my question. Thanks!
If you have PHP 5.3, this is faster:
$a = $b * $c ?: 0;
This is the same as $a = $b * $c ? $b * $c : 0;, but the $b * $c calculation is done only once. Also, it doesn't do additional assignments as in your second solution.
Using Martin v. Löwis's benchmark script I get the following times:
$a = $b * $c ?: 0; 1.07s
$a = $b * $c ? $b * $c : 0; 1.16s
$i = $b * $c; $a = $i ? $i : 0; 1.39s
Now these are micro-optimizations, so there are probably many other ways of optimizing your code before doing this :)
If it is not the case, you may also want to compare generated PHP OP codes:
1 $a = $b * $c ? $b * $c : 0; :
number of ops: 8
compiled vars: !0 = $a, !1 = $b, !2 = $c
line # op fetch ext return operands
-------------------------------------------------------------------------------
1 0 MUL ~0 !1($b), !2($c)
1 JMPZ ~0, ->5
2 MUL ~1 !1($b), !2($c)
3 QM_ASSIGN ~2 ~1
4 JMP ->6
5 QM_ASSIGN ~2 0
6 ASSIGN !0($a), ~2
7 RETURN null
2 $i = $b * $c; $a = $i ? $i : 0;
number of ops: 8
compiled vars: !0 = $i, !1 = $b, !2 = $c, !3 = $a
line # op fetch ext return operands
-------------------------------------------------------------------------------
1 0 MUL ~0 !1($b), !2($c)
1 ASSIGN !0($i), ~0
2 JMPZ !0($i), ->5
3 QM_ASSIGN ~2 !0($i)
4 JMP ->6
5 QM_ASSIGN ~2 0
6 ASSIGN !3($a), ~2
7 RETURN null
3 $a = $b * $c ?: 0; :
number of ops: 5
compiled vars: !0 = $a, !1 = $b, !2 = $c
line # op fetch ext return operands
-------------------------------------------------------------------------------
1 0 MUL ~0 !1($b), !2($c)
1 ZEND_JMP_SET ~1 ~0
2 QM_ASSIGN ~1 0
3 ASSIGN !0($a), ~1
4 RETURN null
These opcode listings were generated by the VLD extension.
<?php
function run(){
$b=10;
$c=10;
$start=gettimeofday(TRUE);
for($k=0;$k<10000000;$k++){
$a = $b * $c ? $b * $c : 0;
}
printf("%f\n", gettimeofday(TRUE)-$start);
$start=gettimeofday(TRUE);
for($k=0;$k<10000000;$k++){
$i = $b * $c;
$a = $i ? $i : 0;
}
printf("%f\n", gettimeofday(TRUE)-$start);
}
run();
?>
On my system (PHP 5.3.3, Linux, Core i7 2.8GHz), I get
1.593521
1.512892
So the separate assignment is slightly faster. For addition, (+ instead of *), I get the reverse result:
1.386522
1.450358
So you really need to measure these on your own system - with a different PHP version, the outcome may change again.
Your two pieces of code have a drawback each. One does an additional assignment; the other does an additional mathematical operation. Best would be to do neither, which, with the ternary operator in PHP 5.3, you can:
$a = $b * $c ?: 0;
Omitting the second part of the ternary causes PHP to put the result of the first part there instead.
Using Martin v. Löwis's benchmarking code, I reckon this is about 25% faster than either.
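Before comparing speeds, it may be worth confirming that the three constructions really produce the same value. The sketch below wraps each one in a function; the function names and sample inputs are assumptions made for illustration only, and the script says nothing about timing.

```php
<?php
// Hypothetical wrappers around the three constructions discussed above.
function fullTernary($b, $c)
{
    return $b * $c ? $b * $c : 0;   // multiplies twice when the product is truthy
}

function viaTemporary($b, $c)
{
    $i = $b * $c;                   // one multiplication, one extra assignment
    return $i ? $i : 0;
}

function shortTernary($b, $c)
{
    return $b * $c ?: 0;            // one multiplication, no temporary (PHP 5.3+)
}

$inputs = [[10, 10], [0, 5], [-3, 4], [2.5, 4]];
$allAgree = true;
foreach ($inputs as $pair) {
    $expected = fullTernary($pair[0], $pair[1]);
    if (viaTemporary($pair[0], $pair[1]) !== $expected
        || shortTernary($pair[0], $pair[1]) !== $expected) {
        $allAgree = false;
    }
}
echo $allAgree ? 'all three agree' : 'results differ', PHP_EOL;
```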
