What is the maximum length of a String in PHP? - php

So how big can a $variable in PHP get? I've tried to test this, but I'm not sure that I have enough system memory (~2gb). I figure there has to be some kind of limit. What happens when a string gets too large? Is it concatenated, or does PHP throw an exception?

http://php.net/manual/en/language.types.string.php says:
Note: As of PHP 7.0.0, there are no particular restrictions regarding the length of a string on 64-bit builds. On 32-bit builds and in earlier versions, a string can be as large as up to 2GB (2147483647 bytes maximum)
In PHP 5.x, strings were limited to 2^31-1 bytes, because internal code recorded the length in a signed 32-bit integer.
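A quick way to see which case applies to a given installation is sketched below. The function name supports_strings_over_2gb() is made up for this example, and it assumes that PHP_INT_SIZE is a good enough proxy for a 64-bit build.

```php
<?php
// Hypothetical helper: true when the manual's "no particular restrictions"
// case applies, i.e. PHP 7.0.0 or later with 8-byte integers.
function supports_strings_over_2gb(): bool
{
    return PHP_VERSION_ID >= 70000 && PHP_INT_SIZE === 8;
}

printf(
    "PHP %s with %d-bit integers: strings over 2GB are %s\n",
    PHP_VERSION,
    PHP_INT_SIZE * 8,
    supports_strings_over_2gb() ? 'possible' : 'not possible'
);
```

Even when this reports "possible", memory_limit and available RAM still apply.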
You can slurp in the contents of an entire file, for instance using file_get_contents()
However, a PHP script has a limit on the total memory it can allocate for all variables in a given script execution, so this effectively places a limit on the length of a single string variable too.
This limit is the memory_limit directive in the php.ini configuration file. The memory limit defaults to 128MB in PHP 5.2, and 8MB in earlier releases.
If you don't specify a memory limit in your php.ini file, it uses the default, which is compiled into the PHP binary. In theory you can modify the source and rebuild PHP to change this default value.
If you specify -1 as the memory limit in your php.ini file, it stops checking and permits your script to use as much memory as the operating system will allocate. This is still a practical limit, but depends on system resources and architecture.
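As a rough illustration of how memory_limit bounds the size of a single string, the sketch below compares a file's size with the remaining headroom before slurping it. The helper names memory_limit_in_bytes() and can_probably_load() are made up for this example, and the conversion only handles plain K, M, and G suffixes.

```php
<?php
// Hypothetical helper: convert a memory_limit value such as "128M" to bytes.
// Returns null for "-1", which means no limit is enforced.
function memory_limit_in_bytes(?string $raw = null): ?int
{
    $raw = trim($raw ?? (string) ini_get('memory_limit'));
    if ($raw === '-1') {
        return null;
    }
    $bytes = (int) $raw;
    switch (strtolower(substr($raw, -1))) {
        case 'g': $bytes *= 1024; // deliberate fall-through
        case 'm': $bytes *= 1024; // deliberate fall-through
        case 'k': $bytes *= 1024;
    }
    return $bytes;
}

// Hypothetical helper: would the whole file fit into the memory still available?
function can_probably_load(string $path): bool
{
    $limit = memory_limit_in_bytes();
    if ($limit === null) {
        return true; // no PHP-side limit; the operating system decides
    }
    $size = filesize($path);
    if ($size === false) {
        return false;
    }
    // Memory already in use counts against the same limit.
    return $size < $limit - memory_get_usage();
}
```

This is only an estimate: the string needs some overhead on top of its raw length, and any copy made later counts against the limit as well.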
Re comment from #c2:
Here's a test:
<?php
// limit memory usage to 1MB
ini_set('memory_limit', 1024*1024);
// initially, PHP seems to allocate 768KB for basic operation
printf("memory: %d\n", memory_get_usage(true));
$str = str_repeat('a', 255*1024);
echo "Allocated string of 255KB\n";
// now we have allocated all of the 1MB of memory allowed
printf("memory: %d\n", memory_get_usage(true));
// going over the limit causes a fatal error, so no output follows
$str = str_repeat('a', 256*1024);
echo "Allocated string of 256KB\n";
printf("memory: %d\n", memory_get_usage(true));

String can be as large as 2GB.
Source

PHP's string length is limited by the way strings are represented in PHP; memory does not have anything to do with it.
According to phpinternalsbook.com, strings are stored in struct { char *val; int len; } and since an int in C is typically 4 bytes (a signed 32-bit value), this effectively limits the maximum string size to 2GB.

In the upcoming PHP 7, among many other features, support was added for strings larger than 2^31 bytes:
Support for strings with length >= 2^31 bytes in 64 bit builds.
Sadly, they did not specify how much bigger it can be.

The maximum length of a string variable is only 2GiB (2^31 - 1 bytes). Variables can be addressed on a character (8 bits/1 byte) basis, and the addressing is done with signed integers, which is why the limit is what it is. Arrays can contain multiple variables that each follow the previous restriction, but their total cumulative size is bounded by memory_limit, to which a single string variable is also subject.

To properly answer this question you need to consider PHP internals or the target that PHP is built for.
To answer this from a typical Linux perspective on x86...
Sizes of types in C:
https://usrmisc.wordpress.com/2012/12/27/integer-sizes-in-c-on-32-bit-and-64-bit-linux/
Types used in PHP for variables:
http://php.net/manual/en/internals2.variables.intro.php
Strings are always 2GB as the length is always 32bits and a bit is wasted because it uses int rather than uint. int is impractical for lengths over 2GB as it requires a cast to avoid breaking arithmetic or "than" comparisons. The extra bit is likely being used for overflow checks.
Strangely, hash keys might internally support 4GB as uint is used although I have never put this to the test. PHP hash keys have a +1 to the length for a trailing null byte which to my knowledge gets ignored so it may need to be unsigned for that edge case rather than to allow longer keys.
A 32bit system may impose more external limits.

Related

Does PHP do any parsing on the php.ini file?

Running PHP Version 7.1.30 under RHEL 7.7.
I'm wanting to bump memory_limit, but wasn't sure if I had the syntax right (i.e. 256M or 256MB). So to start with I put a bad value "Hugo" in as the memory_limit setting. The trouble with this is the result of phpinfo() (run under httpd) literally has the string "Hugo" in place, i.e.:
So this has me somewhat concerned that PHP doesn't actually do any sanity checking for the value(s). (If the value provided was bad I would expect it to revert to a default, e.g.)
Can anyone comment on this? In particular, how do you know whether PHP will be enforcing things if an arbitrary string can be provided?
The confusing thing here is that the setting looks like an integer with some special syntax, but is internally defined as a string. The string is then parsed into a separate global variable whenever the value is changed. Crucially, the result of parsing the string to an integer isn't saved back to the settings table, so when you call phpinfo(), you see the original input, not the parsed value.
You can see this in the source:
The setting is defined with a callback for when it is changed
The callback is fed the raw string value
The callback for that particular setting parses the string to an integer using a function called zend_atol, which handles the special suffixes
It then calls a function which sets a global variable ("AG" means "Allocation Manager Global Variable", the macro is used to manage thread-safety if that's compiled in)
The supported syntax is ultimately defined in zend_atol, which:
parses the string for a numeric value, ignoring any additional text
looks at the last character of the string, and multiplies the preceding value if it is g, G, m, M, k, or K
A value with no digits at the start will be parsed as zero. When setting the global variable, this will set the memory limit to the minimum allowed, based on the constant ZEND_MM_CHUNK_SIZE.
You can see the effect by setting the memory limit, then running a loop that quickly allocates a large amount of memory and seeing what comes out in the error message. For instance:
# Invalid string; sets to compiled minimum
php -r 'ini_set("memory_limit", "HUGO"); while(true) $a[]=$a;'
# -> PHP Fatal error: Allowed memory size of 2097152 bytes exhausted
# Number followed by a string; takes the number
php -r 'ini_set("memory_limit", "4000000 HUGO"); while(true) $a[]=$a;'
# -> PHP Fatal error: Allowed memory size of 4000000 bytes exhausted
# Number followed by a string, but ending in one of the recognised suffixes
# This finds both the number and the suffix, so is equivalent to "4M", i.e. 4MiB
php -r 'ini_set("memory_limit", "4 HUGO M"); while(true) $a[]=$a;'
# -> PHP Fatal error: Allowed memory size of 4194304 bytes exhausted
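The parsing rules described above can be imitated in a few lines of PHP. This is only a sketch of the behaviour as this answer describes it, not the actual C implementation of zend_atol, and the function name parse_size_like_zend_atol() is made up for the illustration.

```php
<?php
// Hypothetical re-implementation of the described rules:
// 1. take the leading integer and ignore any text after it;
// 2. look only at the last character for a g/m/k multiplier.
function parse_size_like_zend_atol(string $value): int
{
    $number = preg_match('/^\s*([+-]?\d+)/', $value, $m) ? (int) $m[1] : 0;
    $last = $value === '' ? '' : strtolower($value[strlen($value) - 1]);
    switch ($last) {
        case 'g': return $number * 1024 * 1024 * 1024;
        case 'm': return $number * 1024 * 1024;
        case 'k': return $number * 1024;
    }
    return $number;
}

var_dump(parse_size_like_zend_atol('HUGO'));         // int(0)
var_dump(parse_size_like_zend_atol('4000000 HUGO')); // int(4000000)
var_dump(parse_size_like_zend_atol('4 HUGO M'));     // int(4194304)
var_dump(parse_size_like_zend_atol('256MB'));        // int(256), because "B" is not a suffix
```

The last call shows why 256M and 256MB are not interchangeable under these rules: a trailing B hides the M from the suffix check.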
First things first:
We need to understand how PHP interprets php.ini.
memory_limit is a PHP configuration directive.
To set it from a script, call ini_set('memory_limit', '256M'). This sets the value temporarily, for the current script run only. If you look closely at the phpinfo() output, you will see two columns, Local Value and Master Value: the first is the value in effect for the current script, the second is the global value.
When you define the value globally, use one of the suffixes K, M, or G (so 256M, not 256MB). If you override the value through Apache's .htaccess, the same suffix syntax is required, and likewise for PHP-FPM.
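A small sketch of the local versus global distinction, using only ini_set() and ini_get_all(); the 256M value is just an example.

```php
<?php
// Change the limit for this script run only; ini_set() returns the old value.
$previous = ini_set('memory_limit', '256M');

// With details enabled (the default), each entry carries both values.
$entry = ini_get_all()['memory_limit'];

printf("previous: %s\n", var_export($previous, true));
printf("local (this script): %s\n", $entry['local_value']);
printf("global (configuration): %s\n", $entry['global_value']);
```

The local value is what phpinfo() shows in its Local Value column, and the global value corresponds to Master Value.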

PHP: Calculating File HASH for Files Larger than 2GB

Could you please advise how to calculate a file hash for files larger than 2GB in PHP?
The only PHP function known to me is:
string hash_file ( string $algo , string $filename [, bool $raw_output = false ] )
This function, however, has a limitation: it returns a hash only for files smaller than 2GB. For larger files, hash_file() throws an error.
Here are some constraints/requests:
should work on Linux Ubuntu 64bit server
compatible with PHP 5+
there should be no file size limit
should be as fast as possible
This is all the information I have now. Thank you very much.
UPDATE
I have a solution that is more practical and efficient than any hash calculation from data >2GB.
I have realized that I do not have to generate a hash from complete files that are over 2GB. To uniquely identify a file, calculating a hash from, say, the first 10KB of its data should be sufficient. Moreover, it will be faster than hashing more than 2GB. In other words, the ability to calculate a hash from a data string that is over 2GB is probably not necessary at all.
I will wait for your reactions. In a couple of days, I will close this question.
I would use exec() to run a local hashing command in the shell and return the value to the PHP script. Here's an example with md5, but any available algorithm can be used.
$results = array();
$filename = '/full/path/to/file';
// escapeshellarg() keeps spaces and shell metacharacters in the path from breaking the command
exec('md5sum ' . escapeshellarg($filename), $results);
Then parse the result array (the output of the shell command).
In general, I like to avoid doing anything directly in PHP that requires more than 1G of memory, especially when running in php-fpm or as an Apache module; call it a time-reinforced prejudice. This is definitely my advice when there is a native application that can accomplish the goal and you don't particularly need cross-platform portability (such as running on both Linux and Windows machines).
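If shelling out is not an option, the hash extension can also process a file as a stream, so the data never has to fit into one string. The sketch below is an alternative to the exec() approach, not part of it; the function name hash_file_streamed() and its $maxBytes parameter are made up for this example. Passing a byte count covers the "hash only the first 10KB" idea from the question's update.

```php
<?php
// Hypothetical helper: hash a file, or only its first $maxBytes bytes,
// without loading the contents into a PHP string.
function hash_file_streamed(string $path, string $algo = 'md5', int $maxBytes = -1): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $context = hash_init($algo);
    // hash_update_stream() reads from the stream itself; -1 means "until end of file".
    hash_update_stream($context, $handle, $maxBytes);
    fclose($handle);
    return hash_final($context);
}

// Usage (the path is a placeholder):
// echo hash_file_streamed('/full/path/to/file');                // whole file
// echo hash_file_streamed('/full/path/to/file', 'md5', 10240);  // first 10KB only
```

Note that a hash of only the first 10KB identifies a file only as long as no two files share the same leading bytes.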

How to change the limit of a string's max length?

The title says it all: can I change the limit of a string's length? I've checked this page on Stack Overflow, and it only says that there is a limit, but how can I change it?
I've tried changing the memory limit using PHP like this:
ini_set('memory_limit', '-1');
But it didn't work...
From the documentation:
Note: string can be as large as up to 2GB (2147483647 bytes maximum)
and also:
The string in PHP is implemented as an array of bytes and an integer indicating the length of the buffer. It has no information about how those bytes translate to characters, leaving that task to the programmer.
That integer is probably the limit.

Searching for hex string in a file in php?

I'm currently using the following two methods in my class to get the job done:
function xseek($h,$pos){
    rewind($h);
    if($pos>0)
        fread($h,$pos);
}
function find($str){
    return $this->startingindex($this->name,$str);
}
function startingindex($a,$b){
    $lim = 1 + filesize($a) - strlen($b)/2;
    $h = fopen($a,"rb");
    rewind($h);
    for($i=0;$i<$lim;$i++){
        $this->xseek($h,$i);
        if($b==strtoupper(bin2hex(fread($h,strlen($b)/2)))){
            fclose($h);
            return $i;
        }
    }
    fclose($h);
    return -1;
}
I realize this is quite inefficient, especially for PHP, but I'm not allowed any other language on my hosting plan.
I ran a couple tests, and when the hex string is towards the beginning of the file, it runs quickly and returns the offset. When the hex string isn't found, however, the page hangs for a while. This kills me inside because last time I tested with PHP and had hanging pages, my webhost shut my site down for 24 hours due to too much cpu time.
Is there a better way to accomplish this (finding a hex string's offset in a file)? Is there certain aspects of this that could be improved to speed up execution?
I would read the entire contents of the file into one hex string and use strrpos, but I was getting errors about maximum memory being exceeded. Would this be a better method if I chopped the file up and searched large pieces with strrpos?
edit:
To specify, I'm dealing with a settings file for a game. The settings and their values are in a block where there is a 32-bit int before the setting, then the setting, a 32-bit int before the value, and then the value. Both ints represent the lengths of the following strings. For example, if the setting was "test" and the value was "0", it would look like (in hex): 00000004746573740000000130. Now that you mention it, this does seem like a bad way to go about it. What would you recommend?
edit 2:
I tried a file that was below the maximum memory I'm allowed and tried strrpos, but it was very much slower than the way I've been trying.
edit 3: in reply to Charles:
What's unknown is the length of the settings block and where it starts. What I do know is what the first and last settings USUALLY are. I've been using these searching methods to find the location of the first and last setting and determine the length of the settings block. I also know where the parent block starts. The settings block is generally no more than 50 bytes into its parent, so I could start the search for the first setting there and limit how far it will search. The problem is that I also need to find the last setting. The length of the settings block is variable and could be any length. I could read the file the way I assume the game does, by reading the size of the setting, reading the setting, reading the size of the value, reading the value, etc. until I reached a byte with value -1, or FF in hex. Would a combination of limiting the search for the first setting and reading the settings properly make this much more efficient?
You have a lot of garbage code. For example, this code is doing nearly nothing:
function xseek($h,$pos){
    rewind($h);
    if($pos>0)
        fread($h,$pos);
}
because it reads from the beginning of the file every time. Furthermore, why do you need to read something if you are not returning it? Maybe you are looking for fseek()?
If you need to find a hex string in a binary file, it may be better to use something like this: http://pastebin.com/fpDBdsvV (tell me if there are any bugs or problems).
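For readers who cannot open the link, a chunked search along these lines might look like the sketch below. It is not the code behind the pastebin URL; the function name find_hex_in_file() and the 64KB chunk size are assumptions made for this example. It converts the needle to binary once instead of converting the file to hex, and keeps a small overlap between chunks so a match that straddles a chunk boundary is still found.

```php
<?php
// Hypothetical helper: return the byte offset of the first occurrence of
// the given hex string in the file, or -1 if it does not occur.
function find_hex_in_file(string $path, string $hex, int $chunkSize = 65536): int
{
    if ($hex === '' || strlen($hex) % 2 !== 0 || !ctype_xdigit($hex)) {
        throw new InvalidArgumentException('Expected a non-empty, even-length hex string');
    }
    $needle = hex2bin($hex);
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $overlap = strlen($needle) - 1; // bytes carried over between chunks
    $buffer = '';
    $bufferStart = 0;               // file offset of the first byte in $buffer
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $buffer .= $chunk;
        $pos = strpos($buffer, $needle);
        if ($pos !== false) {
            fclose($handle);
            return $bufferStart + $pos;
        }
        // Keep only the tail that could still be the start of a match.
        $keep = min($overlap, strlen($buffer));
        $bufferStart += strlen($buffer) - $keep;
        $buffer = $keep > 0 ? substr($buffer, -$keep) : '';
    }
    fclose($handle);
    return -1;
}
```

Each byte of the file is read once, so an unsuccessful search costs one pass over the file rather than one rewind and re-read per offset.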
But if you are parsing a game's settings file, I'd advise you to use fseek(), fread() and unpack() to seek to the place where a setting is, read a portion of bytes, and unpack it into PHP's variable types.
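A minimal sketch of that approach, assuming the layout described in the question: a 32-bit length, the setting name, a 32-bit length, the value, repeated until a byte with value FF. The lengths are assumed to be big-endian, which matches the 00000004 in the question's example, and the function names are made up for this illustration.

```php
<?php
// Hypothetical helper: read a 4-byte big-endian length, then that many bytes.
// Returns null if the data is truncated.
function read_prefixed_string($handle): ?string
{
    $header = fread($handle, 4);
    if ($header === false || strlen($header) < 4) {
        return null;
    }
    $length = unpack('N', $header)[1]; // 'N' = unsigned 32-bit, big-endian
    if ($length === 0) {
        return '';
    }
    $data = fread($handle, $length);
    return ($data !== false && strlen($data) === $length) ? $data : null;
}

// Hypothetical helper: read name/value pairs starting at $offset until
// a 0xFF byte, the end of the file, or truncated data.
function read_settings_block(string $path, int $offset): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    fseek($handle, $offset);
    $settings = [];
    while (true) {
        $peek = fread($handle, 1);
        if ($peek === false || $peek === '' || $peek === "\xFF") {
            break; // end of file or end-of-block marker
        }
        fseek($handle, -1, SEEK_CUR); // un-read the peeked byte
        $name = read_prefixed_string($handle);
        $value = $name === null ? null : read_prefixed_string($handle);
        if ($value === null) {
            break;
        }
        $settings[$name] = $value;
    }
    fclose($handle);
    return $settings;
}
```

With the question's example bytes 00000004746573740000000130 followed by FF, read_settings_block() returns array('test' => '0'). Real files would also need a sanity check on each length before reading that many bytes.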
