I have an API endpoint that can receive a POST JSON/XML body (or even raw binary data) inside the body content as payload that should be written immediately to a file on the filesystem.
For backwards compatibility reasons, it cannot be a multipart/form-data.
It works with no problems for body content up to a certain size (around 2.3GB with a 8GB script memory limit).
I've tried all of the followings:
both with and without setting the buffers' sizes
$filename = '/tmp/test_big_file.bin';
$input = fopen('php://input', 'rb');
$output = fopen($filename, 'wb');
stream_set_read_buffer($input, 4096);
stream_set_write_buffer($output, 4096);
stream_copy_to_stream($input, $output);
fclose($input);
fclose($output);
and
$filename = '/tmp/test_big_file.bin';
file_put_contents($filename, file_get_contents('php://input'));
and
$filename = '/tmp/test_big_file.bin';
$input = fopen('php://input', 'rb');
$output = fopen($filename, 'wb');
while (!feof($input)) {
fwrite($output, fread($input, 8192), 8192);
}
fclose($input);
fclose($output);
Unfortunately, none of them works. At one point, I get always the same error:
PHP Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 2475803056 bytes) in Unknown on line 0
Also unsetting enable_post_data_reading makes no difference and all the php.ini post/memory/whatever sizes are set to 8GB.
I'm using php-fpm.
Looking what's happening at the memory with free -mt, I can see that the memory used increases slowly at the beginning, going faster after a while, up to a point that no more free memory is left, so the error.
On the temp directory, the file is not directly stream-copied, but instead it is written on a temporary file named php7NARsX or other random strings which is not deleted after the script crashes, so that at the following free -mt check, the available memory is 2.3GB less.
Now my questions:
Why the stream is not copied directly from php://input to the output instead of loading it into memory? (also using php://temp as output stream leads to the same error)
Why is PHP using so much memory? I'm sending a 3GB payload, so why it needs more than 8GB?
Of course, any working solution will be much appreciated. Thank You!
Related
I'm using a stream in PHP by using the GuzzleHttp\Stream\Stream class. Whilst using it i'm getting PHP memory allocation problems. Is there a method i can use which doesn't use up much memory?
Problem
When i need to serve a Content-Range of 0-381855148 bytes (for example) this causes me a memory allocation issues. Is there a method how can i serve the content while not needing that much memory? Something that passes the data straight through, instead of "reserving" it in memory?
This is part of my code responsible for the error...
$stream = GuzzleHttp\Stream\Stream::factory(fopen($path, 'r'));
$stream->seek($offset);
while (!$stream->eof()) {
echo $stream->read($length);
}
$stream->close();
This is passed as a callback function for my stream.
Background
First i tried fixing the problem by providing a maximum chunk length for my stream. I did this by giving my stream a maximum offset. It's fixes the memory allocation problem, but new problems arise in Firefox when distributing my dynamic video content. Chrome doesn't have problems with it.
It's because Firefox asks for a "0-" Content-Range but i give a Content-Range "0-" back. Instead i need to give back the whole range (until maximum) but this causes the infamous "Allowed memory size of 262144 bytes exhausted (tried to allocate 576 bytes)" error.
Disclaimer: it's actually a little bit more technical. But i wanted to keep it simple.
Does someone knows a solution?
Thanks.
Found my answer on a different forum.
The reason for the memory exhausted problem was because of Guzzle and the way it is build (PSR-7).
A more in in depth article about the problem: https://evertpot.com/psr-7-issues/
I fixed it using this code:
if ($i = ob_get_level()) {
# Clear buffering:
while ($i-- && ob_end_clean());
if (!ob_get_level()) header('Content-Encoding: ');
}
ob_implicit_flush();
$fp = fopen($path, 'rb');
fseek($fp, $offset);
while ($length && !feof($fp)) {
$chunk = min(8192, $length);
echo fread($fp, $chunk);
$length -= $chunk;
}
fclose($fp);
Credits goes to djmaze
I want to send an external MP4 file in chunks of 1 MB each to a user. With each chunk I update a database entry to keep track of the download progress. I use fread() to read the file in chunks. Here is the stripped down code:
$filehandle = fopen($file, 'r');
while(!feof($filehandle)){
$buffer = fread($filehandle, 1024*1024);
//do some database stuff
echo $buffer;
ob_flush();
flush();
}
However, when I check the chunk size at some iteration inside the while loop, with
$chunk_length = strlen($buffer);
die("$chunk_length");
I do never get the desired chunk size. It fluctates somewhere around 7000 - 8000 bytes. Nowhere near 1024*1024 bytes.
When I decrease the chunk size to a smaller number, for example 1024 bytes, it works as expected.
According to the PHP fread() manual:
"When reading from anything that is not a regular local file, such as
streams returned when reading remote files or from popen() and
fsockopen(), reading will stop after a packet is available."
In this case I opened a remote file. Apparently, this makes fread() stop not at the specified length, but when the first package has arrived.
I wanted to keep track of a download of an external file.
If you to do this (or keep track of an upload), use CURL instead:
curl_setopt($curl_handle, CURLOPT_NOPROGRESS, false);
curl_setopt($curl_handle, CURLOPT_PROGRESSFUNCTION, 'callbackFunction');
function callbackFunction($download_size, $downloaded, $upload_size, $uploaded){
//do stuff with the parameters
}
Found this:
https://stackoverflow.com/a/11373078/530599 - great, but
how about stream_filter_append($fp, 'zlib.inflate', STREAM_FILTER_*
Looking for another way to uncompress data.
$fp = fopen($src, 'rb');
$to = fopen($output, 'wb');
// some filtering here?
stream_copy_to_stream($fp, $to);
fclose($fp);
fclose($to);
Where $src is some url to http://.../file.gz for example 200+ Mb :)
Added test-code that works, but in 2 steps:
<?php
$src = 'http://is.auto.ru/catalog/catalog.xml.gz';
$fp = fopen($src, 'rb');
$to = fopen(dirname(__FILE__) . '/output.txt.gz', 'wb');
stream_copy_to_stream($fp, $to);
fclose($fp);
fclose($to);
copy('compress.zlib://' . dirname(__FILE__) . '/output.txt.gz', dirname(__FILE__) . '/output.txt');
Try gzopen which opens a gzip (.gz) file for reading or writing. If the file is not compress, it transparently reads it so you can safely read a non-gzipped file.
$fp = gzopen($src, 'rb');
$to = fopen($output, 'w+b');
while (!feof($fp)) {
fwrite($to, gzread($fp, 2048)); // writes decompressed data from $fp to $to
}
fclose($fp);
fclose($to);
One of the annoying omissions in PHP's stream filter subsystem is the lack of a gzip filter. Gzip is essentially contents compressed using the deflate method. It adds a 2-byte header before the deflated data, however, and a Adler-32 checksum at the end. If you just add an zlib.inflate filter to a stream, it's not going to work. You have to skip the first two bytes before attaching the filter.
Note that there's a serious bug with stream filters in PHP version 5.2.X. It's due to stream buffering. Basically PHP would fail to pass data already in the stream's internal buffer through the filter. If you do a fread($handle, 2) to read the gzip header before attaching the inflate filter, there's a good chance that it's going to fail. A call to fread() would cause PHP to try to fill up the its buffer. Even if the call to fread() asks for only two bytes, PHP might actually read many more bytes (let say 1024) from the physical medium in an attempt to improve performance. Due to the aforementioned bug, the extra 1022 bytes would not get send to the decompression routine.
I write an php script that help with limit speed and connections in download files. I used fopen() and fseek() something like this:
$f = fopen($file, 'rb');
if($f){
fseek($f,$start);//$start extracted from $_SERVER['HTTP_RANGE']
while(!feof($f)){
echo fread($f,$speed);//$speed is bytes per second
flush();
ob_flush();
sleep(1);
}
fclose($f);
}
download process may take several hours to complete, is whole file be in memory until end of download? and how I can optimize this?
No, fread uses an internal buffer to stream the data (8KB by default), so only a very small part of the file actually resides in memory.
I try to export some documents from mongodb to .csv. For some large lists, the files would be something like 40M, I get errors about memory limit:
Fatal error: Allowed memory size of 134217728 bytes exhausted
(tried to allocate 44992513 bytes) in
/usr/share/php/Zend/Controller/Response/Abstract.php on line 586
I wonder why this error happens. What consumes such an amount of memory? How do I avoid such error without changing memory_limit which is set 128M now.
I use something like this:
public static function exportList($listId, $state = self::SUBSCRIBED)
{
$list = new Model_List();
$fieldsInfo = $list->getDescriptionsOfFields($listId);
$headers = array();
$params['list_id'] = $listId;
$mongodbCursor = self::getCursor($params, $fieldsInfo, $headers);
$mongodbCursor->timeout(0);
$fp = fopen('php://output', 'w');
foreach ($mongodbCursor as $subscriber) {
foreach ($fieldsInfo as $fieldInfo) {
$field = ($fieldInfo['constant']) ? $fieldInfo['field_tag'] : $fieldInfo['field_id'];
if (!isset($subscriber->$field)) {
$row[$field] = '';
} elseif (Model_CustomField::isMultivaluedType($fieldInfo['type'])) {
$row[$field] = array();
foreach ($subscriber->$field as $value) {
$row[$field][] = $value;
}
$row[$field] = implode(self::MULTIVALUED_DELEMITOR, $row[$field]);
} else {
$row[$field] = $subscriber->$field;
}
}
fputcsv($fp, $row);
}
}
Then in my controller I try to call it something like this:
public function exportAction()
{
set_time_limit(300);
$this->_helper->layout->disableLayout();
$this->_helper->viewRenderer->setNoRender();
$fileName = $list->list_name . '.csv';
$this->getResponse()->setHeader('Content-Type', 'text/csv; charset=utf-8')
->setHeader('Content-Disposition', 'attachment; filename="'. $fileName . '"');
Model_Subscriber1::exportList($listId);
echo 'Peak memory usage: ', memory_get_peak_usage()/1024, ' Memory usage: ', memory_get_usage()/1024;
}
So I'm at the end of the file where I export data. It's rather strange that for the list I export with something like 1M documents, it exports successfully and displays:
> Peak memory usage: 50034.921875 Kb Memory usage: 45902.546875 Kb
But when I try to export 1.3M documents, then after several minutes I only get in export file:
Fatal error: Allowed memory size of 134217728 bytes exhausted
(tried to allocate 44992513 bytes) in
/usr/share/php/Zend/Controller/Response/Abstract.php on line 586.
The size of documents I export are approximately the same.
I increased memory_limit to 256M and tried to export 1.3M list, this is what it showed:
Peak memory usage: 60330.4609375Kb Memory usage: 56894.421875 Kb.
It seems very confusing to me. Isn't this data so inaccurate? Otherwise, why it causes memory exhausted error with memory_limit set to 128M?
While the size of the documents may be about the same, the size allocated by PHP to process them isn't directly proportional to the document size or number of documents. This is because different types require different memory allocation in PHP. You may be able to free some memory as you go, but I don't see any place where you can in your code.
The best answer is to probably just increase the memory limit.
One thing you could do is offload the processing to an external script and call that from PHP. Many languages do this sort of processing in a more memory efficient way than PHP.
I've also noticed that the memory_get_peak_usage() isn't always accurate. I would try an experiment to increase the mem_limit to say 256 and run it on the larger data set (the 1.3 million). You are likely to find that it reports below the 128 limit as well.
I could reproduce this issue in a similar case of exporting a CSV file, where my system should have had enough memory, as shown by memory_get_usage(), but ended up with the same fatal error:
Fatal error: Allowed memory size.
I circumvented this issue by outputting the CSV contents into a physical temporary file, that I eventually zipped, before reading it out.
I wrote the file in a loop, so that each iteration wrote only a limited chunk of data, so that I never exceded the memory limit.
After zipping, the compression ratio was such, that I could handle raw files of over 10 times the size I initially hit the wall at. All up, it was a success.
Hint: when creating your archive, don't unlink the archive component(s) before invoking $zip->close(), as this call seems to be the one doing the business. Otherwise you'll end up with an empty archive!
Code sample:
<?php
$zip = new ZipArchive;
if ($zip->open($full_zip_path, ZipArchive::CREATE) === TRUE) {
$zip->addFile($full_csv_path, $csv_name);
$zip->close();
$Response->setHeader("Content-type", "application/zip; charset=utf-8");
$Response->setHeader("Content-disposition", "attachment; filename=" . $zip_name);
$Response->setBody(file_get_contents($full_zip_path));
}
else {
var_dump(error_get_last());
echo utf8_decode("Couldn't create zip archive '$full_zip_path'."), "\r\n";
}
unset($zip);
?>
Attention: when adding items to the zip archive, don't prepend a leading slash to the item's name if using Windows based OS.
Discussion over the original issue:
The Zend file at the line quoted is the
public function outputBody()
{
$body = implode('', $this->_body);
echo $body;
}
from the outputBody() method of the Zend_Controller_Response_Abstract class.
It looks like, however you do it, through echo, or print, or readfile, the output is always captured, and stuck into the response body, even if your turn the response return feature off before the dispatch.
I even tried to use the clearBody() class method, within the echo loop, with in mind that each $response->sendResponse() followed by $response->clearBody() would release memory, but it failed.
The way Zend handles the sending of the response is such that I always got the memory allocation of the full size of the raw CSV file.
Yet to be determined how it would be possible to tell Zend not to "capture" the output buffer.