Are there any binary serializers available? - php

Going for space savings here. I have an object that I need to serialize for each row.
In theory, it needs to consist of 168 16-bit unsigned integers (even though there's no explicit typing in PHP), which should take 336 bytes.
If I go for the serialize() function that creates strings though, the size is up to 2349 bytes.
Is there a binary kind of serializer for PHP?

Store the data in a plain and simple array. Then pack it:
$len = count($arr);
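$data = pack("S$len", ...$arr); // argument unpacking requires PHP 5.6+
To get the array back, unpack() it:
$arr = unpack("S$len", $data);
Mind that you need to keep the $len variable alongside the $data block, and that unpack() returns an array indexed from 1, not 0.
More about it in the PHP manual entries for pack() and unpack().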
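As a rough sketch of the round trip described above, assuming every value fits in 16 bits: the `S*` repeater is an alternative to `S$len` (it is not part of the answer above) that lets unpack() infer the count from the data length, and array_values() re-indexes the result from 0. The sample data is made up for illustration.

```php
<?php
// Hypothetical sample data: 168 values that each fit in 16 bits.
$arr = range(0, 167);

// "S" is an unsigned 16-bit integer in machine byte order;
// "v" (little-endian) or "n" (big-endian) are safer if the data moves between machines.
$data = pack('S*', ...$arr);   // argument unpacking needs PHP 5.6+

// unpack() returns a 1-indexed array, so re-index it from 0.
$restored = array_values(unpack('S*', $data));

echo strlen($data), PHP_EOL;   // 336
var_dump($restored === $arr);  // bool(true)
```

With `S*` the element count no longer has to be stored next to the data, since it follows from strlen($data) / 2.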

Related

php serialize huge float causes rounding and formatting issues

I have the following php snippet
$newData = serialize(array('ep' => 50733372961735.4));
echo "New data: " . print_r($newData, 1);
Output:
New data: a:1:{s:2:"ep";d:5.07333729617E+13;}
But I would like the float value as it is, not in E+13 notation.
What could I do without having to make drastic changes? This is just an example; in my actual code the 'ep' value could be inside a complex array hierarchy.
Firstly, a general note: serialize should never be used on data that could be manipulated in any way. It's useful for things like session data and caches, but should not be relied on for transporting data between applications or data storage. In many cases, you're better off using a standard serialization format like JSON.
You also certainly shouldn't care what the serialized string looks like - the only thing you should do with that string is pass it back to unserialize(). So the fact that there is E+13 is not a problem if the actual value it gives back when you unserialize is the one you wanted.
However, it's clear in your example that you have lost precision - the last digits are ...29617 rather than ...29617354 - so back to the point: there is a PHP setting serialize_precision, described in the manual. Its default value has varied over the years, but setting it to an explicit value other than -1 will serialize floats with that number of significant figures:
ini_set('serialize_precision', 2);
echo serialize(50733372961735.4), PHP_EOL;
// d:5.1E+13;
ini_set('serialize_precision', 20);
echo serialize(50733372961735.4), PHP_EOL;
// d:50733372961735.398438;
Note that the first example has clearly thrown away information, whereas the second has actually stored more precision than you realised you had - because of the inaccuracy of storing decimals in binary floating point format.
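To make the point about "the value you get back" concrete, here is a minimal sketch that checks the round trip rather than the look of the string. It assumes PHP 7.1 or later for the -1 setting; the variable names are only for illustration.

```php
<?php
$value = 50733372961735.4;

// 17 significant digits are enough to round-trip any IEEE 754 double.
ini_set('serialize_precision', '17');
$restored17 = unserialize(serialize($value));

// -1 (PHP 7.1+) asks for the shortest string that still round-trips.
ini_set('serialize_precision', '-1');
$shortest = serialize($value);
$restoredShortest = unserialize($shortest);

var_dump($restored17 === $value);        // bool(true)
var_dump($restoredShortest === $value);  // bool(true)
echo $shortest, PHP_EOL;
```

Either setting gives back the identical float; they differ only in how many digits end up in the serialized string.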
The problem is not serialize() at all! It's how floats are printed in general.
$num = 50733372961735.4;
print($num);
=> 50733372961735
To "solve" this problem you can use:
ini_set('serialize_precision', 15);
I just executed your code on www.writephponline.com
I got the value below when I did not put your value in a string: $newData = serialize(array('ep' => 50733372961735.4));
New data: a:1:{s:2:"ep";d:50733372961735.398;}
and this after putting it in a string: $newData = serialize(array('ep' => '50733372961735.4'));
New data: a:1:{s:2:"ep";s:16:"50733372961735.4";}

Need an array-like structure in PHP with minimal memory usage

In my PHP script I need to create an array of >600k integers. Unfortunately my web server's memory_limit is set to 32M, so when initializing the array the script aborts with the message
Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 71 bytes) in /home/www/myaccount/html/mem_test.php on line 8
I am aware that PHP does not store the array values as plain integers, but rather as zvals, which are much bigger than the plain integer value (8 bytes on my 64-bit system). I wrote a small script to estimate how much memory each array entry uses, and it turns out to be almost exactly 128 bytes. 128!!! I'd need >73M just to store the array. Unfortunately the web server is not under my control, so I cannot increase the memory_limit.
My question is: is there any way in PHP to create an array-like structure that uses less memory? I don't need this structure to be associative (plain index access is sufficient). It also does not need dynamic resizing - I know exactly how big the array will be. Also, all elements would be of the same type. Just like a good old C array.
Edit:
So deceze's solution works out-of-the-box with 32-bit integers. But even if you're on a 64-bit system, pack() does not seem to support 64-bit integers. In order to use 64-bit integers in my array I applied some bit-manipulation. Perhaps the below snippets will be of help for someone:
function push_back(&$storage, $value)
{
    // split the 64-bit value into two 32-bit chunks, then pass these to pack().
    // the low chunk is masked and packed unsigned so its top bit is not read back as a sign.
    $storage .= pack('lL', ($value >> 32), $value & 0xFFFFFFFF);
}
function get(&$storage, $idx)
{
    // read two 32-bit chunks from $storage and glue them back together.
    return (current(unpack('l', substr($storage, $idx * 8, 4))) << 32)
         | current(unpack('L', substr($storage, $idx * 8 + 4, 4)));
}
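Newer PHP versions (5.6.3 and later, on 64-bit builds) added 64-bit format codes to pack(), so the split into two halves may not be needed there. A minimal sketch under that assumption; the helper names push_back64 and get64 are hypothetical and only mirror the functions above.

```php
<?php
// Append one signed 64-bit integer ("q", machine byte order) to the packed string.
function push_back64(string &$storage, int $value): void
{
    $storage .= pack('q', $value);
}

// Read the entry at $idx back; unpack() returns a 1-indexed array.
function get64(string $storage, int $idx): int
{
    return unpack('q', substr($storage, $idx * 8, 8))[1];
}

$storage = '';
foreach ([42, -1, PHP_INT_MAX, PHP_INT_MIN] as $v) {
    push_back64($storage, $v);
}

var_dump(get64($storage, 2) === PHP_INT_MAX); // bool(true)
```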
The most memory-efficient approach is probably to store everything in a string, packed in binary, and index into it manually.
$storage = '';
$storage .= pack('l', 42);
// ...
// get 10th entry
$int = current(unpack('l', substr($storage, 9 * 4, 4)));
This can be feasible if the "array" initialisation can be done in one fell swoop and you're just reading from the structure. If you need a lot of appending to the string, this becomes extremely inefficient. Even this can be done using a resource handle though:
$storage = fopen('php://memory', 'r+');
fwrite($storage, pack('l', 42));
...
This is very efficient. You can then read this buffer back into a variable and use it as string, or you can continue to work with the resource and fseek.
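A short sketch of both options mentioned above, reading one entry with fseek() and reading the whole buffer back as a string. The entry count and values are made up for illustration.

```php
<?php
$storage = fopen('php://memory', 'r+');

// Append 1000 signed 32-bit integers (4 bytes each).
for ($i = 0; $i < 1000; $i++) {
    fwrite($storage, pack('l', $i * 3));
}

// Random access: seek to the 10th entry (index 9) and read 4 bytes.
fseek($storage, 9 * 4);
$tenth = unpack('l', fread($storage, 4))[1];
echo $tenth, PHP_EOL; // 27

// Or pull the whole buffer back into a string.
rewind($storage);
$asString = stream_get_contents($storage);
echo strlen($asString), PHP_EOL; // 4000

fclose($storage);
```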
A PHP Judy Array will use significantly less memory than either a standard PHP array or an SplFixedArray.
I quote "An array with 1 million entries using regular PHP array data structure takes 200MB. SplFixedArray uses around 90 megabytes. Judy uses 8 megs. Tradeoff is in performance, Judy takes about double the time of regular php array implementation."
You could use an object if possible. These often use less memory than arrays.
Also, SplFixedArray is a good option.
But it really depends on the implementation that you need. If you need a function to return an array and are using PHP 5.5 or later, you could use a generator (yield) to stream the values back.
You can try an SplFixedArray; it's faster and takes less memory (the doc comment says ~30% less).
Use a string - that's what I'd do. Store it in a string at fixed offsets (16 or 20 digits should do it, I guess?) and use substr to get the one needed. Blazing fast write / read, super easy, and 600,000 integers will only take ~12M to store.
base_convert() - if you need something more compact but with minimum effort, convert your integers to base-36 instead of base-10; in this case, a 14-digit number would be stored in 9 alphanumeric characters. You'll need to make 2 pieces of 64-bit ints, but I'm sure that's not a problem. (I'd split them to 9-digit chunks where conversion gives you a 6-char version.)
pack()/unpack() - binary packing is the same thing with a bit more efficiency. Use it if nothing else works; split your numbers to make them fit to two 32-bit pieces.
600K is a lot of elements. If you are open to alternative methods, I personally would use a database for that. Then use standard SQL/NoSQL select syntax to pull things out. Perhaps memcache or redis if you have an easy host for that, such as garantiadata.com. Maybe APC.
Depending on how you generate the integers, you could potentially use PHP's generators, assuming you are traversing the array and doing something with individual values.
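A minimal sketch of the generator idea (PHP 5.5+ for yield; the scalar type hints below need PHP 7). The squares() function is a hypothetical stand-in for however the integers are really produced. Only one value exists at a time, so memory use stays flat regardless of the count, but there is no random access.

```php
<?php
// Hypothetical source of the integers; replace with the real computation or file read.
function squares(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield $i * $i;
    }
}

// Traverse 600,000 values without ever holding them in an array.
$sum = 0;
foreach (squares(600000) as $value) {
    $sum += $value;
}
echo $sum, PHP_EOL;
```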
I took the answer by @deceze and wrapped it in a class that can handle 32-bit integers. It is append-only, but you can still use it as a simple, memory-optimized PHP Array, Queue, or Heap. AppendItem and ItemAt are both O(1), and it has no memory overhead. I added currentPosition/currentSize to avoid unnecessary fseek function calls. If you need to cap memory usage and switch to a temporary file automatically, use php://temp instead.
class MemoryOptimizedArray
{
    private $_storage;
    private $_currentPosition;
    private $_currentSize;
    const BYTES_PER_ENTRY = 4;
    function __construct()
    {
        $this->_storage = fopen('php://memory', 'r+');
        $this->_currentPosition = 0;
        $this->_currentSize = 0;
    }
    function __destruct()
    {
        fclose($this->_storage);
    }
    function AppendItem($value)
    {
        if($this->_currentPosition != $this->_currentSize)
        {
            // fseek() takes (handle, offset, whence): jump to the end of the stream.
            fseek($this->_storage, 0, SEEK_END);
        }
        fwrite($this->_storage, pack('l', $value));
        $this->_currentSize += self::BYTES_PER_ENTRY;
        $this->_currentPosition = $this->_currentSize;
    }
    function ItemAt($index)
    {
        $itemPosition = $index * self::BYTES_PER_ENTRY;
        if($this->_currentPosition != $itemPosition)
        {
            fseek($this->_storage, $itemPosition);
        }
        $binaryData = fread($this->_storage, self::BYTES_PER_ENTRY);
        $this->_currentPosition = $itemPosition + self::BYTES_PER_ENTRY;
        // unpack() returns a 1-indexed array.
        $unpackedElements = unpack('l', $binaryData);
        return $unpackedElements[1];
    }
}
$arr = new MemoryOptimizedArray();
for($i = 0; $i < 3; $i++)
{
    $v = rand(-2000000000,2000000000);
    $arr->AppendItem($v);
    print("added $v\n");
}
// read forwards, then backwards, to exercise the seek logic
for($i = 0; $i < 3; $i++)
{
    print($arr->ItemAt($i)."\n");
}
for($i = 2; $i >=0; $i--)
{
    print($arr->ItemAt($i)."\n");
}

Php. Number of zeros in float

I have to send the JSON:
{"val": 5000.00}
Neither {"val": "5000.00"} nor {"val": 5000} is correct format.
json_encode() converts 5000.00 to 5000
Is it possible to produce the correct JSON format (with the two zeros) using json_encode from
array("val" => (float) 5000.00) ?
No, it's not possible, because what you think are "correct" and "incorrect" values are actually the same value. It's purely a rendering decision to show/hide trailing zeros.
It's up to you at display time to render the value with the correct number of decimal places. You can't force a floating point number to be stored or transferred with a certain number of decimals.
If you really really need to do that you may consider two options:
Option nr. 1:
You build the JSON-encoded string on your own. If the data you have to encode has a simple structure, and that structure is not subject to change in the future, the task is easy. The script will be slower, so another requirement is that there is not too much data.
For a single object like the example you posted...
$json = sprintf('{"val": %.2f}', $floatValue);
Of course for structured data, like an array of objects or arrays of arrays, you'll have to write the necessary loops and place , : [ ] { } " accurately. Hope I gave you the idea...
Option nr. 2:
You build the data you have to encode just as you would if you didn't have the weird requirement you asked for, but store float values in strings that also contain some pattern characters.
For example you'll store 5000.00 as "##5000.00##"
Encode the data with json_encode.
Perform string replacements to eliminate "## and ##"
$floatValue = 5000.00;
$data = array("val" => sprintf("##%.2f##", $floatValue));
$json = json_encode($data);
$json = str_replace('"##', '', $json);
$json = str_replace('##"', '', $json);
Choose your pattern characters (## in the example) carefully to avoid conflicts with other strings that may contain the same pattern elsewhere in the data to be encoded.
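Just use preg_replace('/:"(\d+\.\d+)"([,}\]])/', ':$1$2', json_encode($a, JSON_FORCE_OBJECT)) with the floats stored in $a as formatted strings; the second group also matches a value at the end of an object.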
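For nested data, option nr. 2 can be wrapped in a small helper. This is only a sketch: the function name json_encode_fixed_floats and the ## markers are illustrative choices, and it assumes no ordinary string in the data begins or ends with ##.

```php
<?php
// Encode $data as JSON, printing every float with a fixed number of decimals.
function json_encode_fixed_floats(array $data, int $decimals = 2): string
{
    // Wrap each float in marker characters so it survives json_encode() as a string.
    array_walk_recursive($data, function (&$v) use ($decimals) {
        if (is_float($v)) {
            $v = '##' . number_format($v, $decimals, '.', '') . '##';
        }
    });

    // Strip the markers together with the surrounding quotes.
    return str_replace(['"##', '##"'], '', json_encode($data));
}

echo json_encode_fixed_floats(['val' => 5000.00]), PHP_EOL; // {"val":5000.00}
```

Integers and strings pass through untouched; only values that are_float in PHP get the fixed formatting, so cast where needed before calling it.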

How to iterate over a bit value?

I want to build a chessboard using a bitboard system.
Starting with 12 bitboards, I want to display a table (chessboard); during the loop/iteration a piece must be drawn on each occupied square.
How do I loop through all bit values?
I was thinking of something like:
for(i=0;i<64;i++)
draw table / build array / draw empty square
These are my values to start a game:
function init_game($whitePlayer,$blackPlayer)
{
$WhitePawns = '0000000000000000000000000000000000000000000000001111111100000000';
$WhiteKnights = '0000000000000000000000000000000000000000000000000000000001000010';
$WhiteBishops = '0000000000000000000000000000000000000000000000000000000000100100';
$WhiteRooks = '0000000000000000000000000000000000000000000000000000000010000001';
$WhiteQueens = '0000000000000000000000000000000000000000000000000000000000010000';
$WhiteKing = '0000000000000000000000000000000000000000000000000000000000001000';
$BlackPawns = '0000000011111111000000000000000000000000000000000000000000000000';
$BlackKnights = '0100001000000000000000000000000000000000000000000000000000000000';
$BlackBishops = '0010010000000000000000000000000000000000000000000000000000000000';
$BlackRooks = '1000000100000000000000000000000000000000000000000000000000000000';
$BlackQueens = '0000100000000000000000000000000000000000000000000000000000000000';
$BlackKing = '0001000000000000000000000000000000000000000000000000000000000000';
$WhitePieces = $WhitePawns|$WhiteKnights|$WhiteBishops|$WhiteRooks|$WhiteQueens|$WhiteKing;
$BlackPieces = $BlackPawns|$BlackKnights|$BlackBishops|$BlackRooks|$BlackQueens|$BlackKing;
}
Some people asked me: why the bitboard approach?
Answer:
About bitboard
A bitboard, often used for boardgames such as chess, checkers and othello, is a specialization of the bitset data structure, where each bit represents a game position or state, designed for optimization of speed and/or memory or disk use in mass calculations. Bits in the same bitboard relate to each other in the rules of the game often forming a game position when taken together. Other bitboards are commonly used as masks to transform or answer queries about positions. The "game" may be any game-like system where information is tightly packed in a structured form with "rules" affecting how the individual units or pieces relate.
First you have to check if your PHP version supports 64bit integers, otherwise you will have strange results.
Just run:
echo PHP_INT_MAX;
and if result is 9223372036854775807 then it should work.
You're using strings. When both operands of | are strings, PHP applies the operator byte by byte to the character codes and returns a string, so the result won't be the integer bitboard you want. Since PHP 5.4 you can use 0b000 notation; for lower PHP versions you'll need to keep the values in hexadecimal or base 10 format. If you're storing values in a DB or somewhere like that and you receive the value as a string, or you just want to keep it in the format presented above, then you have to use intval($value, 2) first to convert it properly.
To iterate over the value you can use just for loop (as you suggested):
$value = intval($WhitePieces, 2);
for ($i = 0; $i < 64; ++$i) {
    // shift bit $i down to position 0 and test it; unlike pow(2, $i) this stays in integer arithmetic
    if (($value >> $i) & 1) {
        // draw piece
    }
}
You do not have bit values, you have strings, and PHP's | on two strings works on the character codes rather than on integers.
How do you loop? Use an array and foreach.
How do you use 64-bit values? Use PHP 5.4 and the binary number format: 0b00001111 => 15 - alternatively express the integer value as hex or decimal, which should be completely OK for a game setup routine that will not change, because the rules have been known for centuries.
Remember that you have to use a 64-bit system to execute your code, otherwise PHP will be unable to support 64-bit integers, and will either treat them as float values or shorten them to 32-bit values, depending on what you actually do.
Because of all this, I'd suggest NOT using bit fields for the solution. They seem like a great idea for programming in a more assembler-like way, but you are not writing assembler, and will probably pay for this approach with non-optimal performance compared to anything else.
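For readers who still want to try integer bitboards, here is a minimal sketch of the iteration on a 64-bit PHP build. The function name bitboard_squares and the mapping of bit 0 to the first square are assumptions, not something fixed by the question. Boards that use bit 63 are built with a shift here, because a literal that large turns into a float, and intval($string, 2) appears to clamp at PHP_INT_MAX for such strings.

```php
<?php
// Returns the indexes (0 = least significant bit) of all set bits in a 64-bit bitboard.
function bitboard_squares(int $board): array
{
    $squares = [];
    for ($i = 0; $i < 64; $i++) {
        // Shift bit $i down to position 0 and test it.
        if (($board >> $i) & 1) {
            $squares[] = $i;
        }
    }
    return $squares;
}

$whitePawns = 0xFF << 8;   // bits 8..15
$blackRooks = 0x81 << 56;  // bits 56 and 63; the result is a negative int, which is fine

print_r(bitboard_squares($whitePawns));
print_r(bitboard_squares($blackRooks));
```

Combining boards then works with the ordinary integer operators, for example $whitePawns | $blackRooks.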

Memory optimization in PHP array

I'm working with a large array which is a height map, 1024x1024, and of course I'm stuck with the memory limit. On my test machine I can increase the memory limit to 1 GB if I want, but on my tiny VPS with only 256 MB of RAM, that's not an option.
I've been searching on Stack Overflow and Google and found several answers along the lines of "well, you are not using PHP for its memory efficiency, ditch it and rewrite in C++" and honestly, that's OK and I recognize PHP loves memory.
But when digging deeper into PHP memory management, I did not find how much memory each data type consumes, or whether casting to another data type reduces memory consumption.
The only "optimization" technique I found was to unset variables and arrays, that's it.
Would converting the code to C++ using some PHP parser solve the problem?
Thanks!
If you want a real indexed array, use SplFixedArray. It uses less memory. Also, PHP 5.3 has a much better garbage collector.
Other than that, well, PHP will use more memory than a more carefully written C/C++ equivalent.
Memory Usage for 1024x1024 integer array:
Standard array: 218,756,848
SplFixedArray: 92,914,208
as measured by memory_get_peak_usage()
$array = new SplFixedArray(1024 * 1024); // array();
for ($i = 0; $i < 1024 * 1024; ++$i)
$array[$i] = 0;
echo memory_get_peak_usage();
Note that the same array in C using 64-bit integers would be 8M.
As others have suggested, you could pack the data into a string. This is slower but much more memory efficient. If using 8 bit values it's super easy:
$x = str_repeat(chr(0), 1024*1024);
$x[$i] = chr($v & 0xff); // store value $v into $x[$i]
$v = ord($x[$i]); // get value $v from $x[$i]
Here the memory will only be about 1.5MB (that is, when considering the entire overhead of PHP with just this integer string array).
For the fun of it, I created a simple benchmark of creating 1024x1024 8-bit integers and then looping through them once. The packed versions all used ArrayAccess so that the user code looked the same.
                   mem     write    read
array              218M    0.589s   0.176s
packed array       32.7M   1.85s    1.13s
packed spl array   13.8M   1.91s    1.18s
packed string      1.72M   1.11s    1.08s
The packed arrays used native 64-bit integers (only packing 7 bytes to avoid dealing with signed data) and the packed string used ord and chr. Obviously implementation details and computer specs will affect things a bit, but I would expect you to get similar results.
So while the array was 6x faster it also used 125x the memory as the next best alternative: packed strings. Obviously the speed is irrelevant if you are running out of memory. (When I used packed strings directly without an ArrayAccess class they were only 3x slower than native arrays.)
In short, I would use something other than pure PHP to process this data if speed is of any concern.
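The benchmark code itself is not shown, so the following is only a sketch of what a packed-string wrapper behind ArrayAccess could look like. The class name PackedByteArray is hypothetical, and it stores unsigned 8-bit values only.

```php
<?php
// Hypothetical wrapper: one byte per element, stored in a single PHP string.
class PackedByteArray implements ArrayAccess
{
    private $data;
    private $size;

    public function __construct(int $size)
    {
        $this->size = $size;
        $this->data = str_repeat("\0", $size);
    }

    public function offsetExists($offset): bool
    {
        return is_int($offset) && $offset >= 0 && $offset < $this->size;
    }

    #[\ReturnTypeWillChange]
    public function offsetGet($offset)
    {
        $this->check($offset);
        return ord($this->data[$offset]);
    }

    public function offsetSet($offset, $value): void
    {
        $this->check($offset);
        $this->data[$offset] = chr($value & 0xff); // keep only the low 8 bits
    }

    public function offsetUnset($offset): void
    {
        $this->check($offset);
        $this->data[$offset] = "\0";
    }

    private function check($offset): void
    {
        if (!$this->offsetExists($offset)) {
            throw new OutOfRangeException("Invalid index");
        }
    }
}

$map = new PackedByteArray(1024 * 1024);
$map[5] = 200;
$map[6] = 300; // only the low 8 bits are kept: 300 & 0xff = 44
echo $map[5], ' ', $map[6], PHP_EOL; // 200 44
```

User code reads and writes $map[$i] as with a normal array, at the cost of a method call per access, which is consistent with the slowdown reported above.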
In addition to the accepted answer and suggestions in the comments, I'd like to suggest PHP Judy array implementation.
Quick tests showed interesting results. An array with 1 million entries using regular PHP array data structure takes ~200 MB. SplFixedArray uses around 90 megabytes. Judy uses 8 megs. Tradeoff is in performance, Judy takes about double the time of regular php array implementation.
A little bit late to the party, but if you have a multidimensional array you can save a lot of RAM when you store the complete array as json.
$array = [];
$data = [];
$data["a"] = "hello";
$data["b"] = "world";
To store this array just use:
$array[] = json_encode($data);
instead of
$array[] = $data;
If you want to get the array back, just use something like:
$myData = json_decode($array[0], true);
I had a big array with 275.000 sets and saved about 36% RAM consumption.
EDIT:
I found an even better way: gzip the JSON string:
$array[] = gzencode(json_encode($data));
and unzip it when you need it:
$myData = json_decode(gzdecode($array[0]), true);
This saved me nearly 75% of RAM peak usage.
