How to decompress gzip stream chunk by chunk using PHP? - php

I can't read from an active http gzip stream chunk by chunk.
In short, it can't decompress the stream chunk by chunk, it requires the first chunk when it decompress the second one, it requires the first and second one when decompress the third one, or it will return strange characters(gzip string I guess).
I guess there are no existing ways for this as I have googled it for 2 days, anyway, I'll be appreciative if you have any suggestions.
Following is the function which I am using for decompressing:
function gzdecode1($data){
$g = tempnam('./','gz');
file_put_contents($g,$data);
ob_start();
readgzfile($g);
$d = ob_get_clean();
unlink($g);
return $d;
}
Here are ten example chunks
http://2.youpiaoma.com/chunk_s.rar

Use gzopen() and gzread()
$h = gzopen($filename, 'r');
while ($chunk = gzread($h, $chunksize)) {
// do magic
}
If it's a remote you might need to enable that remote file opens, I've never done it in that kind of environment though.

Related

xml_parse huge file PHP

I have a issue with PHP function xml_parse. It's not working with huge files - I have xml file with 10MB size.
Problem is, that I have old XML-RPC library from Zend and there are another functions (element handlers and case folding).
$parser_resource = xml_parser_create('utf-8');
xml_parser_set_option($parser_resource, XML_OPTION_CASE_FOLDING, true);
xml_set_element_handler($parser_resource, 'XML_RPC_se', 'XML_RPC_ee');
xml_set_character_data_handler($parser_resource, 'XML_RPC_cd');
if (!xml_parse($parser_resource, $data, 1)) {
// ends here with 10MB file
}
On another place, I just use siple_load_xml_file with option LIBXML_PARSEHUGE, but in this case I don't know what can I do.
Best way will be, if function xml_parse will have some parameter for huge files too.
Thank you for your advices
Error is:
XML error: No memory at line ...
The chunk length of file to parse could be to huge.
if you use fread
while ($data = fread($fp, 1024*1024)) {...}
use smaller length (at my case it has to be smaller than 10 MB) e.g. 1MB and put the xml_parse function in the while loop.

PHP fread() chunk length not taken into acount correctly

I want to send an external MP4 file in chunks of 1 MB each to a user. With each chunk I update a database entry to keep track of the download progress. I use fread() to read the file in chunks. Here is the stripped down code:
$filehandle = fopen($file, 'r');
while(!feof($filehandle)){
$buffer = fread($filehandle, 1024*1024);
//do some database stuff
echo $buffer;
ob_flush();
flush();
}
However, when I check the chunk size at some iteration inside the while loop, with
$chunk_length = strlen($buffer);
die("$chunk_length");
I do never get the desired chunk size. It fluctates somewhere around 7000 - 8000 bytes. Nowhere near 1024*1024 bytes.
When I decrease the chunk size to a smaller number, for example 1024 bytes, it works as expected.
According to the PHP fread() manual:
"When reading from anything that is not a regular local file, such as
streams returned when reading remote files or from popen() and
fsockopen(), reading will stop after a packet is available."
In this case I opened a remote file. Apparently, this makes fread() stop not at the specified length, but when the first package has arrived.
I wanted to keep track of a download of an external file.
If you to do this (or keep track of an upload), use CURL instead:
curl_setopt($curl_handle, CURLOPT_NOPROGRESS, false);
curl_setopt($curl_handle, CURLOPT_PROGRESSFUNCTION, 'callbackFunction');
function callbackFunction($download_size, $downloaded, $upload_size, $uploaded){
//do stuff with the parameters
}

How to seek through a stream with file_get_contents?

I'm using PHP to run a webservice. One of the url need to get couple images from the POST body. I currently using a fopen/fread solution :
$size1 = intval($_REQUEST['img1']);
$size2 = intval($_REQUEST['img2']);
$sizeToRead = 4096;
$datas1 = null;
$datas2 = null;
$h = fopen('php://input','r+');
while($size1 > 0) {
if($sizeToRead > $size1)
$sizeToRead = $size1;
$datas1 .= fread($h,$sizeToRead);
$size1 -= $sizeToRead;
}
fclose($h);
file_put_contents('received1.png',$datas1);
//Same thing for the second image
This solution works fine , but i'm trying to make it more readable by using file_get_contents() :
file_put_contents('received1.png',file_get_contents('php://input',false,null,0,$size1));
But it's give me an error :
file_get_contents() stream does not support seeking
file_get_contents() Failed to seek to position 21694 in the stream
Is that even possible to seek in a stream ? Maybe by changing the third parameter to something else ?
If not , is there way to make my code more elegant / efficient ?
Thanks.
Quoting from the manual:
Note: A stream opened with php://input can only be read once; the stream does not support seek operations. However, depending on the SAPI implementation, it may be possible to open another php://input stream and restart reading. This is only possible if the request body data has been saved. Typically, this is the case for POST requests, but not other request methods, such as PUT or PROPFIND.
(my emphasis)

PHP readfile on a file which is increasing in size

Is it possible to use PHP readfile function on a remote file whose size is unknown and is increasing in size? Here is the scenario:
I'm developing a script which downloads a video from a third party website and simultaneously trans-codes the video into MP3 format. This MP3 is then transferred to the user via readfile.
The query used for the above process is like this:
wget -q -O- "VideoURLHere" | ffmpeg -i - "Output.mp3" > /dev/null 2>&1 &
So the file is fetched and encoded at the same time.
Now when the above process is in progress I begin sending the output mp3 to the user via readfile. The problem is that the encoding process takes some time and therefore depending on the users download speed readfile reaches an assumed EoF before the whole file is encoded, resulting in the user receiving partial content/incomplete files.
My first attempt to fix this was to apply a speed limit on the users download, but this is not foolproof as the encoding time and speed vary with load and this still led to partial downloads.
So is there a way to implement this system in such a way that I can serve the downloads simultaneously along with the encoding and also guarantee sending the complete file to the end user?
Any help is appreciated.
EDIT:
In response to Peter, I'm actually using fread(read readfile_chunked):
<?php
function readfile_chunked($filename,$retbytes=true) {
$chunksize = 1*(1024*1024); // how many bytes per chunk
$totChunk = 0;
$buffer = '';
$cnt =0;
$handle = fopen($filename, 'rb');
if ($handle === false) {
return false;
}
while (!feof($handle)) {
//usleep(120000); //Used to impose an artificial speed limit
$buffer = fread($handle, $chunksize);
echo $buffer;
ob_flush();
flush();
if ($retbytes) {
$cnt += strlen($buffer);
}
}
$status = fclose($handle);
if ($retbytes && $status) {
return $cnt; // return num. bytes delivered like readfile() does.
}
return $status;
}
readfile_chunked($linkToMp3);
?>
This still does not guarantee complete downloads as depending on the users download speed and the encoding speed, the EOF() may be reached prematurely.
Also in response to theJeztah's comment, I'm trying to achieve this without having to make the user wait..so that's not an option.
Since you are dealing with streams, you probably should use stream handling functions :). passthru comes to mind, although this will only work if the download | transcode command is started in your script.
If it is started externally, take a look at stream_get_contents.
Libevent as mentioned by Evert seems like the general solution where you have to use a file as a buffer. However in your case, you could do it all inline in your script without using a file as a buffer:
<?php
header("Content-Type: audio/mpeg");
passthru("wget -q -O- http://localhost/test.avi | ffmpeg -i - -f mp3 -");
?>
I don't think there's any of being notified about there being new data, short of something like inotify.
I suggest that if you hit EOF, you start polling the modification time of the file (using clearstatcache() between calls) every 200 ms or so. When you find the file size has increased, you can reopen the file, seek to the last position and continue.
I can highly recommend using libevent for applications like this.
It works perfect for cases like this.
The PHP documentation is a bit sparse for this, but you should be able to find more solid examples around the web.

Cannot resume downloads bigger than 300M

I am working on a program with php to download files.
the script request is like: http://localhost/download.php?file=abc.zip
I use some script mentioned in Resumable downloads when using PHP to send the file?
it definitely works for files under 300M, either multithread or single-thread download, but, when i try to download a file >300M, I get a problem in single-thread downloading, I downloaded only about 250M data, then it seems like the http connection is broken. it doesnot break in the break-point ..Why?
debugging the script, I pinpointed where it broke:
$max_bf_size = 10240;
$pf = fopen("$file_path", "rb");
fseek($pf, $offset);
while(1)
{
$rd_length = $length < $max_bf_size? $length:$max_bf_size;
$data = fread($pf, $rd_length);
print $data;
$length = $length - $rd_length;
if( $length <= 0 )
{
//__break-point__
break;
}
}
this seems like every requested document can only get 250M data buffer to echo or print..But it works when i use a multi-thread to download a file
fread() will read up to the number of bytes you ask for, so you are doing some unnecessary work calculating the number of bytes to read. I don't know what you mean by single-thread and multi-thread downloading. Do you know about readfile() to just dump an entire file? I assume you need to read a portion of the file starting at $offset up to $length bytes, correct?
I'd also check my web server (Apache?) configuration and ISP limits if applicable; your maximum response size or time may be throttled.
Try this:
define(MAX_BUF_SIZE, 10240);
$pf = fopen($file_path, 'rb');
fseek($pf, $offset);
while (!feof($pf)) {
$data = fread($pf, MAX_BUF_SIZE);
if ($data === false)
break;
print $data;
}
fclose($pf);

Categories