PHP memory usage

It looks like PHP needs about 213 bytes to store one integer. Is that true?
Please take a look at the following code:
$N = 10000;
echo memory_get_usage()."\n";
$v = array();
for ($i = 0; $i < $N; $i++) {
    $v[] = $i;
}
echo memory_get_usage()."\n";
unset($v);
echo memory_get_usage()."\n";
The output is:
641784
2773768
642056
So the difference is 2773768 - 641784 = 2131984 bytes, or about 213 bytes per integer.
Why so much? 4 bytes should be more than enough.

4 bytes is only enough if you simply store an integer value somewhere in memory, without making any allowance for the fact that it is a variable, which needs a datatype identifier, flags to indicate whether there are any other references to that variable, the name of that variable, and so on, all of which require additional memory.
PHP stores the value in a zval*, so there are all the additional bytes used to store the zval details in addition to the actual value.
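To reproduce the measurement from the question in a reusable form, here is a minimal sketch. The function name bytesPerElement is made up for illustration, and the number it reports depends on the PHP version and platform, so no particular output is claimed.

```php
<?php
// Hypothetical helper: average memory cost per appended integer.
function bytesPerElement(int $n): float
{
    $before = memory_get_usage();
    $values = [];
    for ($i = 0; $i < $n; $i++) {
        $values[] = $i; // each append stores one integer in the array
    }
    $after = memory_get_usage();

    // Total growth divided by the element count
    return ($after - $before) / $n;
}

echo bytesPerElement(10000), " bytes per integer\n";
```

Whatever the exact figure, it should come out well above the 4 bytes the question expected, because every element carries type information and array bookkeeping.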

Related

Memory footprint way too large

I measured the memory usage of some simple variables and got unexpected results. Please see this code:
$datetimes = [];
$memory_before = memory_get_usage();
for ($x = 0; $x < 1000; $x++) {
    $datetimes[] = new \DateTime();
}
var_dump('DateTimes: ' . (memory_get_usage() - $memory_before));
$ints = [];
$memory_before = memory_get_usage();
for ($x = 0; $x < 1000; $x++) {
    $ints[] = $x;
}
var_dump('Integers: ' . (memory_get_usage() - $memory_before));
I get this output (on PHP 7.4, 64bit):
string(17) "DateTimes: 350504"
string(15) "Integers: 37160"
37 KB of memory for 1000 ints does not make sense to me. I'd expect 8000 bytes plus some array overhead.
My experiment scales: for a million ints, I get 33558808 bytes of memory usage.
I have disabled xdebug.
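This is how PHP works, and it is the price of dynamically typed variables.
The integer is not stored as a raw machine integer; it is stored in a zval, PHP's internal value container.
1000 x (64 * 2) bits = 128,000 bits, so 16,000 bytes for the zvals alone.
Add to that the array itself. On PHP 7 (64-bit), each array slot is a 32-byte bucket (the 16-byte zval plus an 8-byte hash and an 8-byte key pointer), and the capacity grows in powers of two, so 1000 elements occupy 1024 x 32 = 32,768 bytes. That accounts for most of the 37,160 bytes measured.
In memory, a zval is represented as two 64-bit words. The first word keeps the value, and the second word keeps the type, type_flags, extra, and reserved fields.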
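The measured figures can be checked with a little arithmetic. The sketch below assumes that an append-only array starts at 8 slots, doubles its capacity when full, and uses 32 bytes per slot (a 16-byte zval plus key bookkeeping on 64-bit PHP 7.x). The function name and the default slot size are assumptions for illustration; other PHP versions may use a different slot size.

```php
<?php
// Hypothetical helper: estimates the storage for an append-only integer array.
// Assumption: capacity doubles from a minimum of 8 slots, 32 bytes per slot.
function estimatePackedArrayBytes(int $count, int $slotBytes = 32): int
{
    $capacity = 8;
    while ($capacity < $count) {
        $capacity *= 2; // round up to the next power of two
    }
    return $capacity * $slotBytes;
}

echo estimatePackedArrayBytes(1000), "\n";    // 32768
echo estimatePackedArrayBytes(1000000), "\n"; // 33554432
```

Both estimates land within a few kilobytes of the 37160 and 33558808 bytes reported in the question, which suggests the remaining difference is fixed allocator and array-header overhead.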

Efficient way for manipulating binary data

I have a string containing bitmap data. Basically, it holds cycles of 1 byte of red, green, blue for each pixel in an image.
I want to manipulate the 8-bit integer value of each color channel in the bitmap. Currently I do this using unpack('C'.strlen($string), $string), which gives me an array of integers. This is very slow because it converts a lot of data.
Is there a more efficient way to access and modify data from a string as integers?
By more efficient, I mean more efficient than this:
for ($i = 0, $l = strlen($string); $i < $l; $i++)
    $string[$i] = chr(floor(ord($string[$i]) / 2));
The manipulation where I divide by 2 is simply an example. The manipulations would be more advanced in a real example.
The problem with the above example is that for, say, 1 million pixels, you get 3 million loop iterations, each making three function calls from PHP (chr, floor, ord). On my server, this amounts to about 1.4 seconds for an example picture.
PHP strings essentially are byte arrays. The fastest way would simply be:
for ($i = 0, $length = strlen($data); $i < $length; $i++) {
    $byte = $data[$i];
    ...
}
If you manipulate the byte with bitwise operators, you'll hardly get more efficient than that. You can write the byte back into the string with $data[$i] = $newValue. If you need to turn a single byte into an int, use ord(), the other way around use chr().
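As a concrete version of the loop above, here is a minimal sketch that halves every channel value in place, using a right shift instead of floor() and a division. The function name halveChannels is made up for illustration.

```php
<?php
// Hypothetical helper: halves every 8-bit channel value in a raw bitmap string.
function halveChannels(string $data): string
{
    for ($i = 0, $length = strlen($data); $i < $length; $i++) {
        // ord() gives 0..255; a right shift by one is an integer divide by 2
        $data[$i] = chr(ord($data[$i]) >> 1);
    }
    return $data;
}

$pixels = "\x10\xFF\x00";                   // one pixel: R=16, G=255, B=0
echo bin2hex(halveChannels($pixels)), "\n"; // 087f00
```

This still calls ord() and chr() once per byte, but it drops the floor() call and keeps all arithmetic on integers.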

Generating unique 6 digit code

I'm generating a 6 digit code from the following characters. These will be used to stamp on stickers.
They will be generated in batches of 10k or less (before printing) and I don't envisage there will ever be more than 1-2 million total (probably much less).
After I generate the batches of codes, I'll check the MySQL database of existing codes to ensure there are no duplicates.
// exclude problem chars: B8G6I1l0OQDS5Z2
$characters = 'ACEFHJKMNPRTUVWXY4937';
$string = '';
for ($i = 0; $i < 6; $i++) {
    $string .= $characters[rand(0, strlen($characters) - 1)];
}
return $string;
Is this a solid approach to generating the code?
How many possible permutations would there be (a 6-character code from a pool of 21 characters)? Sorry, math isn't my strong point.
21^6 = 85766121 possibilities.
Using a DB and storing used values is a poor fit here. If you only need the codes to look random, you can use the following:
Reduce the pool to 19 characters and make use of the fact that the integers modulo m under addition form a cyclic group, which any step co-prime to m generates.
Take the group of order 7^19 with a generator co-prime to 7^19 (I'll pick 13^11; you can choose anything not divisible by 7).
Then the following works:
$previous = 0;
function generator(&$previous)
{
    $generator = pow(13, 11);
    $modulus = pow(7, 19); // needs 64-bit integers
    $possibleChars = "ACEFHJKMNPRTUVWXY49";
    // $previous is taken by reference so the next call continues the sequence
    $previous = ($previous + $generator) % $modulus;
    $output = '';
    $temp = $previous;
    for ($i = 0; $i < 6; $i++) {
        $output .= $possibleChars[$temp % 19]; // .= appends; += would add numerically
        $temp = intdiv($temp, 19);             // integer division keeps $temp an int
    }
    return $output;
}
It will cycle through all possible values and look a little random unless they go digging. An even safer alternative would be multiplicative groups but I forget my math already :(
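One caveat with the approach above: the output only uses the value modulo 19^6, so codes are only guaranteed to be distinct if the modulus itself is 19^6. The sketch below is a variant under that assumption, not the answer's exact method. It maps a sequence number to a code, so only a counter needs to be stored. The function name and the step value 15485863 are arbitrary choices for illustration; any step not divisible by 19 would do.

```php
<?php
// Hypothetical helper: maps a sequence number (0, 1, 2, ...) to a 6-character code.
function codeForIndex(int $n): string
{
    $alphabet = 'ACEFHJKMNPRTUVWXY49'; // the 19 characters from the answer above
    $base = strlen($alphabet);
    $modulus = $base ** 6;              // 19^6 = 47,045,881 distinct codes
    $step = 15485863;                   // assumed step; must not be divisible by 19

    // n -> n * step mod 19^6 is a bijection because gcd(step, 19^6) = 1
    $value = (($n % $modulus) * $step) % $modulus;

    // Write the value as six base-19 digits using the alphabet
    $code = '';
    for ($i = 0; $i < 6; $i++) {
        $code .= $alphabet[$value % $base];
        $value = intdiv($value, $base);
    }
    return $code;
}

echo codeForIndex(0), "\n"; // AAAAAA
```

Because the mapping is a bijection on 0 .. 19^6 - 1, no two sequence numbers in that range produce the same code. The sequence is easy to predict for anyone who knows the step, so it only looks random; it is not secure.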
There are a lot of possible combinations, with or without repetition, so your logic would be sufficient.
Collisions would be frequent because you are using rand; see str_shuffle and randomness.
Change rand to mt_rand.
Use fast storage like memcached or redis, not MySQL, when checking.
Total possibilities
21 ^ 6 = 85,766,121
85,766,121 should be OK. To add a lookup check to this generation, try:
Example
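$prefix = "stamp.";
$cache = new Memcache();
$cache->addServer("127.0.0.1");
$stamp = myRand(6);
// Regenerate until the code is not already recorded in the cache
while ($cache->get($prefix . $stamp)) {
    $stamp = myRand(6);
}
$cache->set($prefix . $stamp, true); // record the code so it is not issued again
echo $stamp;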
Function Used
function myRand($no, $str = "", $chr = 'ACEFHJKMNPRTUVWXY4937') {
    $length = strlen($chr);
    while ($no--) {
        // Square brackets: the $chr{...} offset syntax was removed in PHP 8
        $str .= $chr[mt_rand(0, $length - 1)];
    }
    return $str;
}
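Since the codes are produced in batches before printing, duplicates inside a batch can be removed in memory before anything touches the database. The sketch below is one way to do that, assuming PHP 7+ and swapping in random_int for mt_rand. The function name generateBatch is made up for illustration, and the batch still has to be checked against the codes already stored.

```php
<?php
// Hypothetical helper: builds a batch of distinct codes in memory.
function generateBatch(int $count, string $alphabet = 'ACEFHJKMNPRTUVWXY4937', int $length = 6): array
{
    $max = strlen($alphabet) - 1;
    $codes = [];
    while (count($codes) < $count) {
        $code = '';
        for ($i = 0; $i < $length; $i++) {
            $code .= $alphabet[random_int(0, $max)];
        }
        $codes[$code] = true; // array keys deduplicate within the batch
    }
    // All-digit codes become integer keys, so cast every key back to a string
    return array_map('strval', array_keys($codes));
}

$batch = generateBatch(10);
echo count($batch), " codes generated\n"; // 10 codes generated
```

The loop keeps drawing until the batch holds the requested number of distinct codes, so a repeated draw costs one extra iteration rather than a duplicate sticker.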
As Baba said, generating a string on the fly will result in many collisions. The closer you get to 80 million codes already generated, the harder it becomes to get an available string.
Another solution could be to generate all possible combinations once and store each of them in the database, with a boolean column that marks whether a row/token is already used.
Then, to get one of them:
SELECT * FROM tokens WHERE tokenIsUsed = 0 ORDER BY RAND() LIMIT 0,1
and then mark it as already used
UPDATE tokens SET tokenIsUsed = 1 WHERE token = ...
You would have 21 ^ 6 codes = 85 766 121 ~ 85.8 million codes!
To generate them all (which would take some time), look at the selected answer to this question: algorithm that will take numbers or words and find all possible combinations.
I had the same problem and found a very impressive open source solution:
http://www.hashids.org/php/
You can use it as is; it's also worth looking at its source code to understand what's happening under the hood.
Or... you can hash username+datetime with md5 and save it to the database; this will almost certainly produce a unique code ;)

Does a boolean value in PHP take up only 1 bit of memory?

As the question states, would the following array require 5 bits of memory?
$flags = array(true, false, true, false, false);
[EDIT]: Apologies just found this duplicate.
Each element in the array is stored in a separate memory location, and you also need to store the hashtable for the array, along with the keys, so no, it's going to be a lot more.
No. PHP has internal metadata attached to every variable/array element defined. PHP does not support bit fields directly, so the smallest actual allocation is a byte, plus metadata overhead.
I doubt there is an application that uses less than the system architecture's data word as a minimum data storage unit.
But I am sure it shouldn't be your concern at all.
It depends on the php interpreter. The standard interpreter is extremely wasteful, although this is not uncommon for a dynamic language. The massive overhead is caused by garbage collection, and the dynamic nature of every value; since the contents of an array can take arbitrary values of arbitrary types (i.e. you can write $ar[1] = 's';), the type and additional metainformation must be stored.
With the following test script:
<?php
$n = 20000000;
$ar = array();
$i = 0;
$before = memory_get_usage();
for ($i = 0; $i < $n; $i++) {
    $ar[] = ($i % 2 == 0);
}
$after = memory_get_usage();
echo 'Using ' . ($after - $before) . ' Bytes for ' . $n . ' values';
echo ', per value: ' . (($after - $before) / $n) . "\n";
I get about 150 Bytes per array entry (x64, php 5.4.0-2). This seems to be at the higher end of implementations; ideone reports 73 Bytes/entry (php 5.2.11), and so does codepad.
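If compact storage really matters, the flags can be packed by hand into a single integer with bitwise operators, since PHP has no built-in bit fields. This is a minimal sketch; the function names packFlags and hasFlag are made up for illustration, and it assumes no more than 63 flags on a 64-bit build.

```php
<?php
// Hypothetical helpers: pack a list of booleans into one integer and read them back.
function packFlags(array $flags): int
{
    $bits = 0;
    foreach (array_values($flags) as $index => $flag) {
        if ($flag) {
            $bits |= (1 << $index); // set bit number $index
        }
    }
    return $bits;
}

function hasFlag(int $bits, int $index): bool
{
    return (($bits >> $index) & 1) === 1;
}

$bits = packFlags([true, false, true, false, false]);
echo $bits, "\n";            // 5 (binary 00101)
var_dump(hasFlag($bits, 2)); // bool(true)
```

The integer itself still lives in a full PHP value, so this stores five bits of information in one variable rather than in five array elements; it does not get down to five bits of memory.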

Reading /dev/urandom and generating a random integer

I am trying to create a function that generates a random integer out of the bytes I get from /dev/urandom. I am doing this in PHP and it currently looks like:
public static function getRandomInteger($min, $max)
{
    // First we need to determine how many bytes we need to construct $min-$max range.
    $difference = $max-$min;
    $bytesNeeded = ceil($difference/256);
    $randomBytes = self::getRandomBytes($bytesNeeded);
    // Let's sum up all bytes.
    $sum = 0;
    for ($a = 0; $a < $bytesNeeded; $a++)
        $sum += ord($randomBytes[$a]);
    // Make sure we don't push the limits.
    $sum = $sum % ($difference);
    return $sum + $min;
}
Everything works great except that I think it's not calculating the values exactly fairly. For example, if you want a random value between 0 and 250, it receives one byte and mods it with 250, so the values 0-5 are more likely to appear than the values 6-249. What should I do to fix this?
a) If you don't need cryptographically secure random numbers, simply use mt_rand. It will probably suffice for your needs.
b) If you want to stick with your algorithm: Do some remapping: return round($min + $sum / pow(256, $bytesNeeded) * ($max - $min)).
c) As you can see, this requires rounding. I think that will lead to a distribution that is not perfectly uniform (though I am not sure about this). Probably the best way is to get the random number as a float and then scale it, though I have no idea how you would get a float from /dev/urandom. That's why I stick with mt_rand and lcg_value.
I would read $difference bytes from /dev/urandom mod $difference and then add $min
Then make sure $max isn't higher than that number.
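A common way to remove the modulo bias described in the question is rejection sampling: throw away raw values that fall into the incomplete last block, then reduce the rest. The sketch below is not taken from the answers above. It assumes 64-bit PHP 7+ and uses random_bytes() as a stand-in for the question's getRandomBytes(); the function name unbiasedRandomInt is made up for illustration. On PHP 7+, the built-in random_int($min, $max) covers the same need.

```php
<?php
// Hypothetical helper: uniform random integer in [$min, $max] via rejection sampling.
function unbiasedRandomInt(int $min, int $max): int
{
    $range = $max - $min + 1; // number of possible results
    if ($range < 1 || $range > 0x100000000) {
        throw new InvalidArgumentException('Range must contain between 1 and 2^32 values');
    }

    $space = 0x100000000;                // 2^32 possible 4-byte values
    $limit = $space - ($space % $range); // largest multiple of $range within the space

    do {
        // 'N' unpacks 4 bytes as an unsigned 32-bit big-endian integer
        $value = unpack('N', random_bytes(4))[1];
    } while ($value >= $limit);          // reject the biased tail and draw again

    return $min + ($value % $range);
}

echo unbiasedRandomInt(0, 249), "\n";
```

Every accepted value falls in a complete block of $range consecutive numbers, so each result is equally likely. Fewer than half of the draws can be rejected, so the loop needs under two iterations on average.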
