PHP fread() chunk length not taken into account correctly

I want to send an external MP4 file in chunks of 1 MB each to a user. With each chunk I update a database entry to keep track of the download progress. I use fread() to read the file in chunks. Here is the stripped down code:
$filehandle = fopen($file, 'r');
while (!feof($filehandle)) {
    $buffer = fread($filehandle, 1024 * 1024); // request 1 MB per read
    // do some database stuff
    echo $buffer;
    ob_flush();
    flush();
}
However, when I check the chunk size at some iteration inside the while loop, with
$chunk_length = strlen($buffer);
die("$chunk_length");
I never get the desired chunk size. It fluctuates somewhere around 7000-8000 bytes, nowhere near 1024*1024 bytes.
When I decrease the chunk size to a smaller number, for example 1024 bytes, it works as expected.

According to the PHP fread() manual:
"When reading from anything that is not a regular local file, such as
streams returned when reading remote files or from popen() and
fsockopen(), reading will stop after a packet is available."
In this case I opened a remote file. Apparently, this makes fread() stop not at the specified length, but as soon as the first packet is available.
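If you still need fixed-size chunks from a network stream, you can keep calling fread() until the requested length is filled. A minimal sketch of that idea, reusing the $filehandle from above:
// Keep reading until we have $length bytes or the stream ends.
function fread_full($handle, $length) {
    $data = '';
    while (!feof($handle) && strlen($data) < $length) {
        $chunk = fread($handle, $length - strlen($data));
        if ($chunk === false || $chunk === '') {
            break; // read error or connection closed
        }
        $data .= $chunk;
    }
    return $data;
}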
I wanted to keep track of the download of an external file. If you want to do this (or keep track of an upload), use cURL instead:
curl_setopt($curl_handle, CURLOPT_NOPROGRESS, false);
curl_setopt($curl_handle, CURLOPT_PROGRESSFUNCTION, 'callbackFunction');

// Note: since PHP 5.5 the callback also receives the cURL handle as its first argument.
function callbackFunction($curl_handle, $download_size, $downloaded, $upload_size, $uploaded)
{
    // do stuff with the parameters
}
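A fuller sketch of how this might be wired together (the URL, the output path, and the updateProgress() database helper are placeholders; assumes PHP 5.5+):
$curl_handle = curl_init('https://example.com/video.mp4'); // placeholder URL
$out = fopen('/tmp/video.mp4', 'wb');
curl_setopt($curl_handle, CURLOPT_FILE, $out); // stream the body straight to disk
curl_setopt($curl_handle, CURLOPT_NOPROGRESS, false);
curl_setopt($curl_handle, CURLOPT_PROGRESSFUNCTION,
    function ($ch, $download_size, $downloaded, $upload_size, $uploaded) {
        if ($download_size > 0) {
            updateProgress($downloaded / $download_size); // hypothetical DB helper
        }
    });
curl_exec($curl_handle);
curl_close($curl_handle);
fclose($out);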

Related

Difference between readfile() and fopen()

These two snippets both read a file, so what's the main difference?
1. First code:
$handle = fopen($file, 'r');
$data = fread($handle, filesize($file));
2. Second code:
readfile($file);
There's a significant difference between fread() and readfile().
First, readfile() does a number of things that fread() does not. readfile() opens the file for reading, reads it, and then prints it to the output buffer all in one go. fread() only does one of those things: it reads bytes from a given file handle.
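In other words, readfile($file) behaves roughly like the following fopen()/fread() loop (a sketch of the equivalence, not the actual C implementation):
$handle = fopen($file, 'rb');
while (!feof($handle)) {
    echo fread($handle, 8192); // read and print in small buffered chunks
}
fclose($handle);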
Additionally, readfile() has some benefits that fread() does not: for example, it can take advantage of memory-mapped I/O where available rather than slower buffered disk reads. This can significantly increase read performance, since it delegates the work away from PHP userland and towards operating-system calls.
Errata
I previously noted that readfile() could run without PHP (this is corrected below).
For truly large files (think several gigabytes, like media files or large archive backups), you may want to delegate the reading of the file away from PHP entirely with an X-Sendfile header to your web server, so that you don't keep a PHP worker tied up for the length of a transfer that could potentially take hours.
So you could do something like this instead of readfile():
<?php
/* process some things in php here */
header("X-Sendfile: /path/to/file");
exit; // don't need to keep PHP busy for this
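Note that X-Sendfile needs support on the web-server side (for example the mod_xsendfile module on Apache); nginx uses its own X-Accel-Redirect header pointing at an internal location instead. A minimal sketch of the nginx variant, assuming a /protected/ internal location is configured:
<?php
/* authorize the request in PHP, then hand the transfer off to nginx */
header('X-Accel-Redirect: /protected/file.mp4'); // must map to an internal location
exit;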
Reading the docs: readfile() reads the whole file and writes it straight to the output, while
$data = fread($handle, filesize($file));
puts the content into the variable $data. Note that this also means the fread() version holds the entire file in memory at once, whereas readfile() streams it out in chunks.

Memory problems with php://input

I have an API endpoint that receives a POST body (JSON/XML or even raw binary data) as payload, which should be written immediately to a file on the filesystem.
For backwards compatibility reasons, it cannot be multipart/form-data.
It works with no problems for body content up to a certain size (around 2.3 GB with an 8 GB script memory limit).
I've tried all of the following, both with and without setting the buffer sizes:
$filename = '/tmp/test_big_file.bin';
$input = fopen('php://input', 'rb');
$output = fopen($filename, 'wb');
stream_set_read_buffer($input, 4096);
stream_set_write_buffer($output, 4096);
stream_copy_to_stream($input, $output);
fclose($input);
fclose($output);
and
$filename = '/tmp/test_big_file.bin';
file_put_contents($filename, file_get_contents('php://input'));
and
$filename = '/tmp/test_big_file.bin';
$input = fopen('php://input', 'rb');
$output = fopen($filename, 'wb');
while (!feof($input)) {
    fwrite($output, fread($input, 8192), 8192);
}
fclose($input);
fclose($output);
Unfortunately, none of them works. At some point, I always get the same error:
PHP Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 2475803056 bytes) in Unknown on line 0
Also, disabling enable_post_data_reading makes no difference, and all the php.ini post/memory/whatever sizes are set to 8 GB.
I'm using php-fpm.
Watching the memory with free -mt, I can see that the used memory increases slowly at first, then faster and faster, up to the point where no free memory is left, hence the error.
In the temp directory, the payload is not stream-copied directly to my output file; instead it is written to a temporary file named php7NARsX (or some other random string), which is not deleted after the script crashes, so at the following free -mt check the available memory is 2.3 GB less.
Now my questions:
Why is the stream not copied directly from php://input to the output instead of being loaded into memory? (Using php://temp as the output stream leads to the same error.)
Why is PHP using so much memory? I'm sending a 3 GB payload, so why does it need more than 8 GB?
Of course, any working solution will be much appreciated. Thank you!

Does the OCI-Lob::import function stream the data?

I am using OCI-LOB::import to store a file in the database.
What happens if the file is large, larger than the PHP memory_limit setting? Will OCI-LOB::import stream the file data to the database in smaller chunks, or not?
Are there any OCI functions that control LOB-related streaming, most importantly for setting the chunk size?
1) You don't have to worry about PHP's memory_limit when you write large data into a LOB.
2) You can write data to a LOB object in chunks using the OCI-Lob::write function:
$chunkSize = 1024;
$f = fopen($filename, 'r');
while ($buf = fread($f, $chunkSize)) {
    $lob->write($buf); // append each chunk to the LOB
}
fclose($f);
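For context, a fuller sketch of where $lob might come from (the table and column names are placeholders; this assumes the standard oci8 API):
$conn = oci_connect($user, $pass, $db);
$sql  = 'INSERT INTO files (id, data) VALUES (:id, EMPTY_BLOB()) RETURNING data INTO :data';
$stmt = oci_parse($conn, $sql);
$lob  = oci_new_descriptor($conn, OCI_D_LOB);
oci_bind_by_name($stmt, ':id', $id);
oci_bind_by_name($stmt, ':data', $lob, -1, OCI_B_BLOB);
oci_execute($stmt, OCI_NO_AUTO_COMMIT); // keep the transaction open while writing

$f = fopen($filename, 'rb');
while ($buf = fread($f, 1024)) {
    $lob->write($buf);
}
fclose($f);

oci_commit($conn);
$lob->free();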
After examining oci8_lob.c in the PHP 5.3.18 source, I found that OCI-LOB::import reads the file data and writes it into the LOB descriptor using a fixed-size buffer. The buffer length is hardcoded to 8192 bytes in the source. That means OCI-LOB::import sends data to the database in 8 KB chunks.
It is impossible to modify the chunk size used by OCI-LOB::import, since it is hardcoded in the source.

Does PHP load the whole file when we use fopen()?

I wrote a PHP script that helps limit download speed and connections. I used fopen() and fseek(), something like this:
$f = fopen($file, 'rb');
if ($f) {
    fseek($f, $start); // $start extracted from $_SERVER['HTTP_RANGE']
    while (!feof($f)) {
        echo fread($f, $speed); // $speed is bytes per second
        flush();
        ob_flush();
        sleep(1);
    }
    fclose($f);
}
The download process may take several hours to complete. Will the whole file stay in memory until the end of the download? And how can I optimize this?
No, fread uses an internal buffer to stream the data (8KB by default), so only a very small part of the file actually resides in memory.
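You can check this yourself with memory_get_usage(). A quick sketch (the file path is a placeholder):
$f = fopen('/path/to/large/file.mp4', 'rb'); // placeholder path
$before = memory_get_usage();
while (!feof($f)) {
    $chunk = fread($f, 8192); // only ever holds up to 8 KB at once
}
echo memory_get_usage() - $before, " bytes of extra memory used\n";
fclose($f);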

How to decompress gzip stream chunk by chunk using PHP?

I can't read from an active HTTP gzip stream chunk by chunk.
In short, my code can't decompress the stream chunk by chunk: it needs the first chunk in order to decompress the second one, and the first and second in order to decompress the third one; otherwise it returns strange characters (the raw gzip bytes, I guess).
I guess there is no existing way to do this, as I have googled it for 2 days. Anyway, I'd appreciate any suggestions.
Following is the function which I am using for decompressing:
function gzdecode1($data) {
    $g = tempnam('./', 'gz');
    file_put_contents($g, $data); // dump the gzip data to a temp file
    ob_start();
    readgzfile($g);               // decompress it into the output buffer
    $d = ob_get_clean();
    unlink($g);
    return $d;
}
Here are ten example chunks
http://2.youpiaoma.com/chunk_s.rar
Use gzopen() and gzread():
$h = gzopen($filename, 'r');
while ($chunk = gzread($h, $chunksize)) { // $chunksize: e.g. 8192
    // do magic
}
gzclose($h);
If it's a remote file, you might need to enable remote file opens (allow_url_fopen); I've never done it in that kind of environment, though.
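Separately, if you are on PHP 7.0+, the incremental zlib API (inflate_init()/inflate_add()) can decompress a gzip stream truly chunk by chunk; this is a different technique from the answer above, sketched here:
$ctx = inflate_init(ZLIB_ENCODING_GZIP);
foreach ($chunks as $chunk) {                // $chunks: the pieces as they arrive
    echo inflate_add($ctx, $chunk);          // emits whatever is decodable so far
}
echo inflate_add($ctx, '', ZLIB_FINISH);     // flush any remaining buffered output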
