We would like to store database values in an array, but we do not know the maximum array size PHP allows. Is there one?
There is no fixed maximum array size. The limit is the amount of memory your script may use, which is set by the 'memory_limit' directive in your php.ini configuration.
Array size is limited only by the amount of memory available to PHP on your server. If your array gets too big, you will get an "out of memory" error.
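To see which limit applies to a given script, the configured value can be read at run time. The following is a minimal sketch; the helper name iniSizeToBytes is made up for illustration and only handles the common K/M/G shorthand suffixes.

```php
<?php
// Hypothetical helper: convert an ini shorthand such as "128M" to bytes.
function iniSizeToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '-1') {
        return -1; // -1 means "no limit"
    }
    $unit = strtolower(substr($value, -1));
    $number = (int) $value; // the cast keeps only the leading digits
    switch ($unit) {
        case 'g': return $number * 1024 * 1024 * 1024;
        case 'm': return $number * 1024 * 1024;
        case 'k': return $number * 1024;
        default:  return $number;
    }
}

$limit = iniSizeToBytes(ini_get('memory_limit'));
$used  = memory_get_usage();

echo 'memory_limit: ', $limit === -1 ? 'unlimited' : "$limit bytes", "\n";
echo "currently used: $used bytes\n";
```

Comparing the two numbers before building a large array gives a rough idea of how much headroom is left.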
It seems to me to be the 16-bit signed integer limit. (2^15)
$ar = [];
while (array_push($ar, null)) {
print 'a';
}
Length of output: 32768
If, like me, you need to use a huge array in a class in PHP 5.6.40 and have found that there is a limit to the size of class arrays so that they get overflowed and overwritten when surpassing 32768 elements, then here is the solution I found to work.
Create a public function with the huge array in it as a local variable. Then assign that local variable to the class variable. Call this function right in the constructor. You will see that it prints the correct size of the array instead of the overflow leftover size.
class Zipcode_search {
public $zipcodes;
public function __construct() {
$this->setHugeArray();
print "size is ".sizeof($this->zipcodes). "<br />";
}
public function setHugeArray(){
$zipcodes=[too much stuff];//actual array with +40,000 elements etc.
$this->zipcodes = $zipcodes;
}
}
2,147,483,647 items, even on 64-bit PHP. (PHP 7.2.24)
In PHP, typedef struct _hashtable is defined with uint values for nTableSize and nNumOfElements.
Because of this, the largest array you can create with array_fill() or range() appears to be 2^31-1 items. While keys can be anything, including numbers outside that range, if you start at zero with a step size of 1, your highest index can be 2147483646.
If you are asking this question, you have likely seen an error like:
# php -r 'array_fill(0, 2147483648, 0);'
PHP Warning: array_fill(): Too many elements in Command line code on line 1
or even:
# php -r 'array_fill(0, 2147483647, 0);'
Segmentation fault (core dumped)
...or, most likely, the error which explicitly refers to the "maximum array size":
php -r 'range(0,2147483647);'
PHP Warning: range(): The supplied range exceeds the maximum array size:
start=0 end=2147483647 in Command line code on line 1
A caution for those reading this question:
The most common place you'll run into this is through misuse of the range() function as if it were an iterator. It is one in other languages, but not in PHP: it is an array-filling function, just like array_fill().
So odds are good that you can avoid the array use entirely. It is unsafe to do things like:
foreach (range($min, $max, $step) as $i) { ... stuff ... }
Instead do:
for ($i = $min; $i <= $max; $i += $step) { ... stuff ... }
Equally, I've seen people doing:
// Takes 3 minutes with $min=0, $max=1e9, $step=1.
// Segfaults with slightly larger ranges.
$numItems = count(range($min, $max, $step));
Which can instead be rewritten in a more secure, idiomatic and performant way:
// Takes microseconds with $min=0, $max=1e9, $step=1.
// Can handle vastly larger numbers, too.
$numItems = (int) floor(($max - $min) / $step) + 1;
If you are running into errors about array size, odds are good that you are doing crazy stuff that you probably should avoid.
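If a range-like value is still convenient to pass around, a generator gives the iteration without the array. This is a minimal sketch assuming PHP 7 or later and a positive step; lazyRange and rangeCount are hypothetical names, not built-in functions.

```php
<?php
// Hypothetical helper: yields the same values as range() for a positive
// step, one at a time, without ever building the array.
function lazyRange(int $min, int $max, int $step = 1): Generator
{
    if ($step <= 0) {
        throw new InvalidArgumentException('step must be positive');
    }
    for ($i = $min; $i <= $max; $i += $step) {
        yield $i;
    }
}

// Hypothetical helper: how many values lazyRange() would produce,
// computed arithmetically instead of by counting an array.
function rangeCount(int $min, int $max, int $step = 1): int
{
    return $max < $min ? 0 : intdiv($max - $min, $step) + 1;
}

$sum = 0;
foreach (lazyRange(0, 10, 5) as $i) {
    $sum += $i; // visits 0, 5 and 10
}
echo $sum, "\n";
echo rangeCount(0, 1000000000, 1), "\n";
```

Memory use stays constant no matter how wide the range is, because only the current value exists at any time.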
In my PHP script I need to create an array of >600k integers. Unfortunately my webservers memory_limit is set to 32M so when initializing the array the script aborts with message
Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 71 bytes) in /home/www/myaccount/html/mem_test.php on line 8
I am aware that PHP does not store the array values as plain integers (8 bytes each on my 64-bit system), but as zvals, which are much bigger. I wrote a small script to estimate how much memory each array entry uses, and it turns out to be almost exactly 128 bytes. 128!!! I'd need >73M just to store the array. Unfortunately the webserver is not under my control, so I cannot increase the memory_limit.
My question is, is there any possibility in PHP to create an array-like structure that uses less memory. I don't need this structure to be associative (plain index-access is sufficient). It also does not need to have dynamic resizing - I know exactly how big the array will be. Also, all elements would be of the same type. Just like a good old C-array.
Edit:
So deceze's solution works out-of-the-box with 32-bit integers. But even if you're on a 64-bit system, pack() does not seem to support 64-bit integers. In order to use 64-bit integers in my array I applied some bit-manipulation. Perhaps the below snippets will be of help for someone:
function push_back(&$storage, $value)
{
// split the 64-bit value into two 32-bit chunks, then pass these to pack().
$storage .= pack('ll', ($value>>32), $value);
}
function get(&$storage, $idx)
{
// read two 32-bit chunks from $storage and glue them back together.
// the low chunk is read unsigned ('L') so its sign bit cannot clobber the high chunk.
return (current(unpack('l', substr($storage, $idx * 8, 4)))<<32 |
current(unpack('L', substr($storage, $idx * 8+4, 4))));
}
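On newer PHP versions the splitting may not be needed at all: pack() and unpack() accept the 'q' format code (signed 64-bit, machine byte order) since PHP 5.6.3 on 64-bit builds. A minimal sketch under that assumption; getInt64 is a hypothetical helper name.

```php
<?php
// Assumes PHP >= 5.6.3 on a 64-bit build, where pack() accepts 'q'.
$storage = '';
foreach ([42, -7, 5000000000] as $value) {
    $storage .= pack('q', $value); // 8 bytes per entry
}

// Hypothetical accessor: read the entry at zero-based index $idx.
function getInt64(string $storage, int $idx): int
{
    return unpack('q', substr($storage, $idx * 8, 8))[1];
}

echo getInt64($storage, 2), "\n";
```

On a 32-bit build, or on an older PHP, the two-chunk approach above remains the fallback.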
The most memory-efficient approach is probably to store everything in a string, packed in binary, and index into it manually.
$storage = '';
$storage .= pack('l', 42);
// ...
// get 10th entry
$int = current(unpack('l', substr($storage, 9 * 4, 4)));
This can be feasible if the "array" initialisation can be done in one fell swoop and you're just reading from the structure. If you need a lot of appending to the string, this becomes extremely inefficient. Even this can be done using a resource handle though:
$storage = fopen('php://memory', 'r+');
fwrite($storage, pack('l', 42));
...
This is very efficient. You can then read this buffer back into a variable and use it as string, or you can continue to work with the resource and fseek.
A PHP Judy array will use significantly less memory than a standard PHP array or an SplFixedArray.
I quote "An array with 1 million entries using regular PHP array data structure takes 200MB. SplFixedArray uses around 90 megabytes. Judy uses 8 megs. Tradeoff is in performance, Judy takes about double the time of regular php array implementation."
You could use an object if possible. Objects often use less memory than arrays.
SplFixedArray is also a good option.
But it really depends on what you need to implement. If you need a function that returns an array and are using PHP 5.5 or later, you could use a generator (yield) to stream the values back instead.
You can try an SplFixedArray: it's faster and takes less memory (the doc comment says ~30% less). Test here and here.
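For reference, this is roughly what the SplFixedArray suggestion looks like for the 600k-integer case from the question. It is only a sketch: how much memory it saves depends heavily on the PHP version, so the script prints the measured usage rather than assuming a figure.

```php
<?php
$size = 600000;

$fixed = new SplFixedArray($size); // the size is set once, up front
for ($i = 0; $i < $size; $i++) {
    $fixed[$i] = $i * 2;           // plain integer index access
}

echo count($fixed), " slots\n";
echo $fixed[$size - 1], "\n";
echo memory_get_usage(), " bytes in use\n";
```

Reading or writing an index outside 0..$size-1 throws a RuntimeException, so the structure behaves like a fixed-length C array rather than a hash.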
Use a string - that's what I'd do. Store the numbers in a string at fixed offsets (16 or 20 digits should do it, I guess?) and use substr to get the one needed. Blazing fast write / read, super easy, and 600,000 integers will only take ~12M to store.
base_convert() - if you need something more compact but with minimum effort, convert your integers to base-36 instead of base-10; in this case, a 14-digit number would be stored in 9 alphanumeric characters. You'll need to make 2 pieces of 64-bit ints, but I'm sure that's not a problem. (I'd split them to 9-digit chunks where conversion gives you a 6-char version.)
pack()/unpack() - binary packing is the same thing with a bit more efficiency. Use it if nothing else works; split your numbers to make them fit to two 32-bit pieces.
600K is a lot of elements. If you are open to alternative methods, I personally would use a database for that. Then use standard sql/nosql select syntax to pull things out. Perhaps memcache or redis if you have an easy host for that, such as garantiadata.com. Maybe APC.
Depending on how you generate the integers, you could potentially use PHP's generators, assuming you are traversing the array and doing something with individual values.
I took the answer by @deceze and wrapped it in a class that can handle 32-bit integers. It is append-only, but you can still use it as a simple, memory-optimized PHP Array, Queue, or Heap. AppendItem and ItemAt are both O(1), and it has no memory overhead. I added currentPosition/currentSize to avoid unnecessary fseek function calls. If you need to cap memory usage and switch to a temporary file automatically, use php://temp instead.
class MemoryOptimizedArray
{
private $_storage;
private $_currentPosition;
private $_currentSize;
const BYTES_PER_ENTRY = 4;
function __construct()
{
$this->_storage = fopen('php://memory', 'r+');
$this->_currentPosition = 0;
$this->_currentSize = 0;
}
function __destruct()
{
fclose($this->_storage);
}
function AppendItem($value)
{
if($this->_currentPosition != $this->_currentSize)
{
fseek($this->_storage, 0, SEEK_END);
}
fwrite($this->_storage, pack('l', $value));
$this->_currentSize += self::BYTES_PER_ENTRY;
$this->_currentPosition = $this->_currentSize;
}
function ItemAt($index)
{
$itemPosition = $index * self::BYTES_PER_ENTRY;
if($this->_currentPosition != $itemPosition)
{
fseek($this->_storage, $itemPosition);
}
$binaryData = fread($this->_storage, self::BYTES_PER_ENTRY);
$this->_currentPosition = $itemPosition + self::BYTES_PER_ENTRY;
$unpackedElements = unpack('l', $binaryData);
return $unpackedElements[1];
}
}
$arr = new MemoryOptimizedArray();
for($i = 0; $i < 3; $i++)
{
$v = rand(-2000000000,2000000000);
$arr->AppendItem($v);
print("added $v\n");
}
for($i = 0; $i < 3; $i++)
{
print($arr->ItemAt($i)."\n");
}
for($i = 2; $i >=0; $i--)
{
print($arr->ItemAt($i)."\n");
}
I am using the following code in an application based on ZF1:
$select = $db->select()->from('table', array('id', 'int', 'float'))->limit(10000, (($i - 1) * 10000));
$data = $select->query();
while ($row = $data->fetch()) {
# ...
}
This operation is happening in a foreach loop for some 800 times. I output the memory usage for each pass and can see it increasing by about 5MB per pass. I suppose that is because Zend apparently does not free the result from the query once the pass is complete. A simple unset didn't solve the issue. Using fetchAll also did not improve (or change) the situation.
Is there any way to free the result from a Zend_Db_Statement_PDO thus freeing the memory used by it? Or do you suspect another reason?
I believe you want to do this:
$sql = "SELECT something FROM random-table-with-an-obscene-large-amount-of-entries";
$res = $db->query($sql);
while ($row = $res->fetch(Zend_Db::FETCH_NUM)) {
// do something with the data returned in $row
}
Zend_Db::FETCH_NUM - return data in an array of arrays. The arrays are indexed by integers, corresponding to the position of the respective field in the select-list of the query.
Since you overwrite $row on each loop, the memory should be reclaimed. If you are paranoid you can unset($row) at the bottom of the loop I believe. I've not tested this myself recently, but I ran into a batch problem about a year ago that was similar, and I seem to recall using this solution.
Actually the problem was hidden somewhere else:
Inside the loop some integer results were stored in an array for modification at a later planned stage in the workflow.
While one might expect PHP arrays to be small, that is not the case: arrays grow big really fast, and in the measurement quoted below a PHP array of integers is about 18 times larger than one would 'expect'. Watch out when working with arrays, even if you only store integers in them!
In case the linked article disappears sometime:
In this post I want to investigate the memory usage of PHP arrays (and values in general) using the following script as an example, which creates 100000 unique integer array elements and measures the resulting memory usage:
$startMemory = memory_get_usage();
$array = range(1, 100000);
echo memory_get_usage() - $startMemory, ' bytes';
How much would you expect it to be? Simple, one integer is 8 bytes (on a 64 bit unix machine and using the long type) and you got 100000 integers, so you obviously will need 800000 bytes. That’s something like 0.76 MBs.
Now try and run the above code. This gives me 14649024 bytes. Yes, you heard right, that’s 13.97 MB - eighteen times more than we estimated.
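The same measurement can be repeated next to a packed binary string to see the gap on a particular installation. A minimal sketch, assuming a 64-bit PHP build where pack('q') writes 8 bytes per integer; the absolute numbers for the array vary a lot between PHP versions, so none are claimed here.

```php
<?php
// Assumes a 64-bit PHP build, where pack('q') writes 8 bytes per integer.
$n = 100000;

// Cost of a regular PHP array holding $n integers.
$before = memory_get_usage();
$array = range(1, $n);
$arrayBytes = memory_get_usage() - $before;

// The same integers packed back to back into one binary string.
$packed = '';
foreach ($array as $value) {
    $packed .= pack('q', $value);
}
$packedBytes = strlen($packed);

printf("array:  %d bytes (%.1f per element)\n", $arrayBytes, $arrayBytes / $n);
printf("packed: %d bytes (%.1f per element)\n", $packedBytes, $packedBytes / $n);
```

The packed string always costs exactly 8 bytes per element, which is the "naive" estimate from the quoted article.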
I'm looking for a way to measure the amount of data stored in a PHP array. I'm not talking about the number of elements in the array (which you can figure out with count($array, COUNT_RECURSIVE)), but the cumulative amount of data from all the keys and their corresponding values. For instance:
array('abc'=>123); // size = 6
array('a'=>1,'b'=>2); // size = 4
As what I'm interested in is order of magnitude rather than the exact amount (I want to compare the processing memory and time usage versus the size of the arrays) I thought about using the following trick:
strlen(print_r($array,true));
However the amount of overhead coming from print_r varies depending on the structure of the array which doesn't give me consistent results:
echo strlen(print_r(array('abc'=>123),true)); // 27
echo strlen(print_r(array('a'=>1,'b'=>2),true)); // 35
Is there a way (ideally in a one-liner and without impacting too much performance as I need to execute this at run-time on production) to measure the amount of data stored in an array in PHP?
Does this do the trick:
<?php
$arr = array('abc'=>123);
echo strlen(implode('',array_keys($arr)).implode('',$arr));
?>
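The one-liner above only handles flat arrays. A recursive variant could look like the sketch below; dataSize is a hypothetical name, and it assumes the array contains only scalars, nulls and nested arrays (objects without __toString would fail the string cast).

```php
<?php
// Hypothetical helper: total string length of all keys and scalar values,
// descending into nested arrays.
function dataSize(array $data): int
{
    $size = 0;
    foreach ($data as $key => $value) {
        $size += strlen((string) $key);
        $size += is_array($value) ? dataSize($value) : strlen((string) $value);
    }
    return $size;
}

echo dataSize(['abc' => 123]), "\n";
echo dataSize(['a' => 1, 'b' => 2]), "\n";
```

This measures the amount of data in the sense of the question (characters in keys and values), not the memory PHP actually allocates for the array.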
Short answer: mission impossible
You could try something like:
strlen(serialize($myArray)) // either this
strlen(json_encode($myArray)) // or this
But to approximate the true memory footprint of an array, you will have to do a lot more than that. If you're looking for a ballpark estimate, arrays take 3-8x more than their serialized version, depending on what you store in them and how many elements you have. It increases gradually, in bigger and bigger chunks as your array grows. To give you an idea of what's happening, here's an array estimation function I came up with, after many hours of trying, for one-level arrays only:
function estimateArrayFootprint($a) { // copied from one of my failed quests :(
$size = 0;
foreach($a as $k=>$v) {
foreach([$k,$v] as $x) {
$n = strlen($x);
do{
if($n>8192 ) {$n = (1+($n>>12)<<12);break;}
if($n>1024 ) {$n = (1+($n>> 9)<< 9);break;}
if($n>512 ) {$n = (1+($n>> 8)<< 8);break;}
if($n>0 ) {$n = (1+($n>> 5)<< 5);break;}
}while(0);
$size += $n + 96;
}
}
return $size;
}
So that's how easy it is, not. And again, it's not a reliable estimation, it probably depends on the PHP memory limit, the architecture, the PHP version and a lot more. The question is how accurately do you need this value.
Also let's not forget that these values came from a memory_get_usage(1) which is not very exact either. PHP allocates memory in big blocks in order to avoid a noticeable overhead as your string/array/whatever else grows, like in a for(...) $x.="yada" situation.
I wish I could say anything more useful.
Here is my code, which creates 2d array filled with zeros, array dimensions are (795,6942):
function zeros($rowCount, $colCount){
$matrix = array();
for ($rowIndx=0; $rowIndx<$rowCount; $rowIndx++){
$matrix[] = array();
for($colIndx=0; $colIndx<$colCount; $colIndx++){
$matrix[$rowIndx][$colIndx]=0;
}
}
return $matrix;
}
$matrix = zeros(795,6942);
And here is the error that I receive:
Allowed memory size of 134217728 bytes exhausted (tried to allocate 35 bytes)
Any ideas how to solve this?
As a quick calculation, you are trying to create an array that contains :
795*6942 = 5,518,890
integers.
If we consider that one integer is stored on 4 bytes (i.e. 32 bits; with PHP, it will not be less), it means :
5518890*4 = 22,075,560
bytes.
OK, quick calculation... result is "it should be OK".
But things are not that easy, unfortunately :-(
I suppose it's related to the fact that PHP stores data using an internal data structure that's much more complicated than a plain 32-bit integer.
Now, just to be curious, let's modify your function so it outputs how much memory is used at the end of each one of the outer for-loop :
function zeros($rowCount, $colCount){
$matrix = array();
for ($rowIndx=0; $rowIndx<$rowCount; $rowIndx++){
$matrix[] = array();
for($colIndx=0; $colIndx<$colCount; $colIndx++){
$matrix[$rowIndx][$colIndx]=0;
}
var_dump(memory_get_usage());
}
return $matrix;
}
With this, I'm getting this kind of output (PHP 5.3.2-dev on a 64bits system ; memory_limit is set to 128MB -- which is already a lot !) :
int 1631968
int 2641888
int 3651808
...
...
int 132924168
int 133934088
Fatal error: Allowed memory size of 134217728 bytes exhausted
Which means each iteration of the outer for-loop requires about 1 MB of memory -- and I only get to 131 iterations before the script runs out of memory, not the 795 you wanted.
Considering you set your memory_limit to 128M, you'd have to set it to something much higher -- like
128*(795/131) = 777 MB
And indeed, with
ini_set('memory_limit', '750M');
it's still not enough... with 800MB, it seems enough ;-)
But I would definitely not recommend setting memory_limit to such a high value !
(If you have 2GB of RAM, your server will not be able to handle more than 2 concurrent users ^^ ;; I wouldn't actually test this if my computer had 2GB of RAM, to be honest)
The only solution I see here is for you to re-think your design : there has to be something else you can do than use this portion of code :-)
(BTW : maybe "re-think your design" means using another language than PHP : PHP is great when it comes to developing web-sites, but is not suited to every kind of problem)
The default PHP array implementation is very memory-intensive. If you are just storing integers (and lots of them), you probably want to look at SplFixedArray. It uses a regular contiguous memory block to store the data, as opposed to the traditional hash structure.
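Another option for the all-zeros case is to let array_fill() build the matrix. As far as I understand PHP's copy-on-write behaviour, the rows then share one underlying row until a row is written to, so the initial footprint stays small; treat that as an assumption and measure it on your version. zerosFilled is a hypothetical replacement for the zeros() function from the question.

```php
<?php
// Hypothetical replacement for the zeros() function from the question.
function zerosFilled(int $rowCount, int $colCount): array
{
    $row = array_fill(0, $colCount, 0);    // one row of zeros
    return array_fill(0, $rowCount, $row); // every row starts as that same row
}

$before = memory_get_usage();
$matrix = zerosFilled(795, 6942);
echo memory_get_usage() - $before, " bytes\n";

$matrix[0][0] = 1; // writing to a row separates only that row from the others
```

If most rows end up being modified, memory use grows back towards that of the nested-loop version, so this helps mainly for sparse updates.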
You should try increasing the amount of memory available to PHP:
ini_set('memory_limit', '32M');
in your PHP file.
I'm starting out my expedition into Project Euler. And like many others, I've figured I need to make a prime number generator. Problem is: PHP doesn't like big numbers. If I use the standard Sieve of Eratosthenes function and set the limit to 2 million, it will crash. It doesn't like creating arrays of that size. Understandable.
So now I'm trying to optimize it. One way, I found, was that instead of creating an array with 2 million variables, I only need 1 million (only odd numbers can be prime numbers). But now it's crashing because it exceeds the maximum execution time...
function getPrimes($limit) {
$count = 0;
for ($i = 3; $i < $limit; $i += 2) {
$primes[$count++] = $i;
}
for ($n = 3; $n < $limit; $n += 2) {
//array will be half the size of $limit
for ($i = 1; $i < $limit/2; $i++) {
if ($primes[$i] % $n === 0 && $primes[$i] !== $n) {
$primes[$i] = 0;
}
}
}
return $primes;
}
The function works, but as I said, it's a bit slow...any suggestions?
One thing I've found to make it a bit faster is to switch the loop around.
foreach ($primes as $value) {
//$limitSq is the sqrt of the limit, as that is as high as I have to go
for ($n = 3; $n <= $limitSq; $n += 2) {
if ($value !== $n && $value % $n === 0) {
$primes[$count] = 0;
$n = $limitSq; //breaking the inner loop
}
}
$count++;
}
And after also raising the time and memory limits (thanks Greg), I've finally managed to get an answer. Phew.
Without knowing much about the algorithm:
You're recalculating $limit/2 each time around the $i loop
Your if statement will be evaluated in order, so think about (or test) whether it would be faster to test $primes[$i] !== $n first.
Side note, you can use set_time_limit() to give it longer to run and give it more memory using
ini_set('memory_limit', '128M');
Assuming your setup allows this, of course - on a shared host you may be restricted.
From Algorithmist's proposed solution
This is a modification of the standard Sieve of Eratosthenes. It would be highly inefficient, using up far too much memory and time, to run the standard sieve all the way up to n. However, no composite number less than or equal to n will have a factor greater than sqrt(n), so we only need to know all primes up to this limit, which is no greater than 31622 (square root of 10^9). This is accomplished with a sieve. Then, for each query, we sieve through only the range given, using our pre-computed table of primes to eliminate composite numbers.
This problem has also appeared on UVA's and Sphere's online judges. Here's how it's enunciated on Sphere.
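The approach quoted above could be sketched in PHP roughly as follows. This is an illustration under my own assumptions, not the judges' reference solution; smallPrimes and primesInRange are hypothetical names.

```php
<?php
// Plain sieve: all primes up to and including $limit.
function smallPrimes(int $limit): array
{
    $isComposite = array_fill(0, $limit + 1, false);
    $primes = [];
    for ($n = 2; $n <= $limit; $n++) {
        if ($isComposite[$n]) {
            continue;
        }
        $primes[] = $n;
        for ($m = $n * $n; $m <= $limit; $m += $n) {
            $isComposite[$m] = true;
        }
    }
    return $primes;
}

// Hypothetical helper: primes in [$lo, $hi], using a table that only
// covers the requested range plus the primes up to sqrt($hi).
function primesInRange(int $lo, int $hi): array
{
    $lo = max($lo, 2);
    if ($hi < $lo) {
        return [];
    }
    $isComposite = array_fill(0, $hi - $lo + 1, false);
    foreach (smallPrimes((int) floor(sqrt($hi))) as $p) {
        // first multiple of $p inside the range, but never $p itself
        $start = max($p * $p, intdiv($lo + $p - 1, $p) * $p);
        for ($m = $start; $m <= $hi; $m += $p) {
            $isComposite[$m - $lo] = true;
        }
    }
    $result = [];
    foreach ($isComposite as $offset => $composite) {
        if (!$composite) {
            $result[] = $lo + $offset;
        }
    }
    return $result;
}

print_r(primesInRange(999999990, 1000000010));
```

The table size depends on the width of the query range rather than on its upper bound, which is what keeps memory use low near 10^9.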
You can use a bit field to store your sieve. That is, it's roughly identical to an array of booleans, except you pack your booleans into a large integer. For instance if you had 8-bit integers you would store 8 bits (booleans) per integer which would further reduce your space requirements.
Additionally, using a bit field allows the possibility of using bit masks to perform your sieve operation. For example, if your sieve kept all numbers (not just odd ones), you could construct a bit mask of b01010101 which you could then AND against every element in your array. For threes you could use three integers as the mask: b00100100 b10010010 b01001001.
Finally, you do not need to check numbers that are lower than $n; in fact, you don't need to check numbers less than $n*$n.
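In PHP, a binary string can serve as the bit field. The sketch below stores one bit per odd number and starts marking at $n*$n; it is one possible layout, and bitSievePrimes is a hypothetical name.

```php
<?php
// Hypothetical helper: sieve of Eratosthenes storing one bit per odd number
// in a binary string (bit $k stands for the odd number 2*$k + 1).
function bitSievePrimes(int $limit): array
{
    if ($limit < 2) {
        return [];
    }
    $bitCount = intdiv($limit + 1, 2); // odd numbers 1, 3, ... up to $limit
    $bits = str_repeat("\0", intdiv($bitCount + 7, 8));
    $primes = [2];
    for ($k = 1; $k < $bitCount; $k++) {
        if (ord($bits[$k >> 3]) & (1 << ($k & 7))) {
            continue; // already marked composite
        }
        $n = 2 * $k + 1;
        $primes[] = $n;
        // mark odd multiples of $n, starting at $n * $n
        for ($m = $n * $n; $m <= $limit; $m += 2 * $n) {
            $j = $m >> 1;
            $bits[$j >> 3] = chr(ord($bits[$j >> 3]) | (1 << ($j & 7)));
        }
    }
    return $primes;
}

echo count(bitSievePrimes(2000000)), "\n";
```

For a limit of 2 million the bit field itself takes about 125 KB; the returned list of primes is still a regular PHP array, so that part costs what arrays usually cost.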
Once you know the number is not a prime, I would exit the inner loop. I don't know PHP, but you need a statement like a break in C or a last in Perl.
If that is not available, I would set a flag and use it as a condition for continuing the inner loop. This should speed up your execution, as you are not checking $limit/2 items if it is not a prime.
if you want speed, don’t use PHP on this one :P
no, seriously, i really like PHP and it’s a cool language, but it’s not suited at all for such algorithms