How to detect stream_copy_to_stream errors?

I've got a bit of code which, simplified, looks something like:
$fout = fsockopen($host, 80);
stream_set_timeout($fout, 10*3600); // 10 hours
$fin = fopen($file, 'rb'); // not really a file stream, but good enough proxy here
$readbytes = stream_copy_to_stream($fin, $fout);
if(!$readbytes) die('copy failed');
However, I'm sometimes getting the following type of error:
Notice: stream_copy_to_stream(): send of 952 bytes failed with errno=104 Connection reset by peer in ...
Notice: stream_copy_to_stream(): send of 952 bytes failed with errno=32 Broken pipe in ...
And the check on $readbytes there won't pick up the error.
I'm aware that it may be possible to compare the total length of the file with the number of bytes copied, but this only works if the total length of the stream can be determined in advance.
As this happens randomly, I presume that the connection is just being dropped for some weird reason (but if anyone has any suggestions to reduce the likelihood of this happening, I'm all ears). But it'd be nice to be able to know whether the transfer fully succeeded or not.
Is there any way to detect a problem without:
having to know the length of the stream in advance
hooking into the PHP error handler to pick up the error
...or is a buffered fread/fwrite loop, checking the number of bytes written, the best solution here?
Thanks.

Okay, so here's my stream copy function, which tries to detect errors, but it seems to fail still (and I thought fwrite was meant to return the correct number of bytes)
function stream_copy($in, $out, $limit=null, $offset=null) {
$bufsize = 32*1024;
if(isset($offset)) fseek($in, $offset, SEEK_CUR);
while(!feof($in)) {
if(isset($limit)) {
if(!$limit) break;
$data = fread($in, min($limit,$bufsize));
} else
$data = fread($in, $bufsize);
$datalen = strlen($data);
$written = fwrite($out, $data);
if($written != $datalen) {
return false; // write failed
}
if(isset($limit)) $limit -= $datalen;
}
return true;
}
With the above function, I'm still getting 'true' returned even when an error is displayed.
So I'm just going to try hooking into PHP's error handler. I haven't tested the following, but my guess is that it should work:
function stream_copy_to_stream_testerr($source, $dest) {
$args = func_get_args();
global $__stream_copy_to_stream_fine;
$__stream_copy_to_stream_fine = true;
set_error_handler('__stream_copy_to_stream_testerr');
call_user_func_array('stream_copy_to_stream', $args);
restore_error_handler();
return $__stream_copy_to_stream_fine;
}
function __stream_copy_to_stream_testerr() {
$GLOBALS['__stream_copy_to_stream_fine'] = false;
return true;
}
Ugly, but the only solution I can see.
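For reference, the error-handler approach above can also be written without a global by capturing the flag in a closure. This is only a sketch: the name copy_stream_checked is made up for illustration, and it assumes the failure always surfaces either as a PHP notice/warning or as a false return value, which may not hold for every stream type or PHP version.

```php
<?php
// Hypothetical helper (not part of PHP): copies $source to $dest and reports
// false if stream_copy_to_stream() returned false OR raised any PHP error.
function copy_stream_checked($source, $dest): bool
{
    $failed = false;

    // Temporary handler: remember that something went wrong, suppress output.
    set_error_handler(function (int $errno, string $errstr) use (&$failed): bool {
        $failed = true;
        return true;
    });

    try {
        $copied = stream_copy_to_stream($source, $dest);
    } finally {
        // Always put the previous handler back, even if an exception escapes.
        restore_error_handler();
    }

    return $copied !== false && !$failed;
}
```

Usage would be along the lines of if (!copy_stream_checked($fin, $fout)) die('copy failed');. Note that the temporary handler swallows every notice raised during the copy, not only the send failures.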

Related

How can I read newly appended lines from a LARGE (4GB+) open file?

Using PHP 7.3, I'm trying to achieve "tail -f" functionality: open a file, waiting for some other process to write to it, then read those new lines.
Unfortunately, it seems that fgets() caches the EOF condition. Even when there's new data available (filemtime changes), fgets() returns a blank line.
The important part: I cannot simply close, reopen, then seek, because the file size is tens of gigs in size, well above the 32 bit limit. The file must stay open in order to be able to read new data from the correct position.
I've attached some code to demonstrate the problem. If you append data to the input file, filemtime() detects the change, but fgets() reads nothing new.
fread() does seem to work, picking up the new data but I'd rather not have to come up with a roll-your-own "read a line" solution.
Does anyone know how I might be able to poke fgets() into realising that it's not the EOF?
$fn = $argv[1];
$fp = fopen($fn, "r");
fseek($fp, -1000, SEEK_END);
$filemtime = 0;
while (1) {
if (feof($fp)) {
echo "got EOF\n";
sleep(1);
clearstatcache();
$tmp = filemtime($fn);
if ($tmp != $filemtime) {
echo "time $filemtime -> $tmp\n";
$filemtime = $tmp;
}
}
$l = trim(fgets($fp, 8192));
echo "l=$l\n";
}
Update: I tried excluding the call to feof (thinking that may be where the state becomes cached) but the behaviour doesn't change; once fgets reaches the original file pointer position, any further fgets reads will return false, even if more data is subsequently appended.
Update 2: I ended up rolling my own function that will continue returning new data after the first EOF is reached (in fact, it has no concept of EOF, just data available / data not available). Code not heavily tested, so use at your own risk. Hope this helps someone else.
*** NOTE this code was updated 20th June 2021 to fix an off-by-one error. The comment "includes line separator" was incorrect up to this point.
define('FGETS_TAIL_CHUNK_SIZE', 4096);
define('FGETS_TAIL_SANITY', 65536);
define('FGETS_TAIL_LINE_SEPARATOR', 10);
function fgets_tail($fp) {
    // Get complete line from open file which may have additional data written to it.
    // Returns string (including line separator) or FALSE if there is no line available (buffer does not have complete line, or is empty because of EOF)
    global $fgets_tail_buf;
    if (!isset($fgets_tail_buf)) $fgets_tail_buf = "";
    if (strlen($fgets_tail_buf) < FGETS_TAIL_CHUNK_SIZE) { // buffer not full, attempt to append data to it
        $t = fread($fp, FGETS_TAIL_CHUNK_SIZE);
        if ($t !== false) $fgets_tail_buf .= $t; // strict check: a chunk consisting of "0" is loosely equal to false
    }
    $ptr = strpos($fgets_tail_buf, chr(FGETS_TAIL_LINE_SEPARATOR));
    if ($ptr !== false) {
        $rv = substr($fgets_tail_buf, 0, $ptr + 1); // includes line separator
        $fgets_tail_buf = substr($fgets_tail_buf, $ptr + 1); // may reduce buffer to empty
        return($rv);
    } else {
        if (strlen($fgets_tail_buf) < FGETS_TAIL_SANITY) { // line separator not found, try to append some more data
            $t = fread($fp, FGETS_TAIL_CHUNK_SIZE);
            if ($t !== false) $fgets_tail_buf .= $t;
        }
    }
    return(false);
}
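The same idea can be sketched with the buffer kept per file handle instead of in a global, which matters if more than one file is being tailed. The class name TailLineReader and its method are hypothetical, invented for this illustration; it relies on the behaviour described above, namely that fread() keeps returning newly appended data after EOF was reached once.

```php
<?php
// Hypothetical helper: returns complete lines from a file that may still grow.
final class TailLineReader
{
    private $fp;
    private $buffer = '';

    public function __construct($fp)
    {
        $this->fp = $fp;
    }

    // Returns the next line including "\n", or false if no complete line
    // is available yet (partial data stays in the buffer for the next call).
    public function readLine()
    {
        $pos = strpos($this->buffer, "\n");
        if ($pos === false) {
            $chunk = fread($this->fp, 4096);
            if ($chunk !== false && $chunk !== '') {
                $this->buffer .= $chunk;
                $pos = strpos($this->buffer, "\n");
            }
        }
        if ($pos === false) {
            return false;
        }
        $line = substr($this->buffer, 0, $pos + 1);
        $this->buffer = (string) substr($this->buffer, $pos + 1);
        return $line;
    }
}
```

A polling loop would call readLine() until it returns false, sleep for a moment, and try again. Unlike the original there is no sanity limit here, so a file without line separators would grow the buffer indefinitely.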

The author found the solution himself: a PHP tail reader for giant log files (4 GB+ in size).
So that this question is marked as answered, here is a summary: the solution is the fgets_tail() function posted in Update 2 of the question, which buffers the output of fread() and returns one complete line at a time.

Copy file from url to my own server remains file to 0mb

I'm facing a problem and I'm not really sure if this is the right way of doing it. I need to copy a file from a remote server to my server with PHP.
I use the following script :
public function download($file_source, $file_target) {
$rh = fopen($file_source, 'rb');
$wh = fopen($file_target, 'w+b');
if (!$rh || !$wh) {
return false;
}
while (!feof($rh)) {
if (fwrite($wh, fread($rh, 4096)) === FALSE) {
return false;
}
echo ' ';
flush();
}
fclose($rh);
fclose($wh);
return true;
}
but in the end, the file size remains at 0.
EDIT: I've updated my question, because there are still some things I don't understand:
About fread, I used 2048mb. But it didn't work.
I found the script above, which uses 4096mb.
My question: how do I determine which quantity of memory (?) to use in order to get the file downloaded every time? This one works on a specific (dedicated) machine, but will it work on a shared host, where I cannot modify php.ini?
Thanks again

filesize() expects a filename/path. You're passing in a filehandle, which means filesize will FAIL and return a boolean false.
You then use that false as the size argument for your fread, which gets translated to an integer 0. So essentially you're sitting there telling php to read a file, 0 bytes at a time.
You cannot reliably get the size of a remote file anyways, so just have fread some fixed number of bytes, e.g. 2048, at a time.
while(!feof($handle)) {
$contents = fread($handle, 2048);
fwrite($f, $contents);
}
and if that file isn't too big and/or your PHP can handle it:
file_put_contents('local.mp4', file_get_contents('http://whatever/foo.mp4'));
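If you also want to know whether the copy actually succeeded, the loop can check the return values of fread() and fwrite(). The following is a rough sketch, not a definitive implementation; copy_in_chunks is a made-up name, and the size passed to fread() is a number of bytes per read, so it only affects how much memory one iteration uses.

```php
<?php
// Hypothetical helper: copies $in to $out in fixed-size chunks.
// Returns the number of bytes copied, or false on a read/write failure.
function copy_in_chunks($in, $out, int $chunkSize = 8192)
{
    $total = 0;
    while (!feof($in)) {
        $data = fread($in, $chunkSize);
        if ($data === false) {
            return false; // read failed
        }
        $length = strlen($data);
        // fwrite() may write fewer bytes than requested (notably on network
        // streams), so keep writing the remainder until the chunk is done.
        for ($offset = 0; $offset < $length; $offset += $written) {
            $written = fwrite($out, substr($data, $offset));
            if ($written === false || $written === 0) {
                return false; // write failed or made no progress
            }
        }
        $total += $length;
    }
    return $total;
}
```

With a remote source, $in would come from fopen($url, 'rb'), which requires allow_url_fopen to be enabled on the host.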

file_get_contents => PHP Fatal error: Allowed memory exhausted

I have no experience when dealing with large files so I am not sure what to do about this. I have attempted to read several large files using file_get_contents ; the task is to clean and munge them using preg_replace().
My code runs fine on small files; however, the large files (40 MB) trigger a memory exhausted error:
PHP Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 41390283 bytes)
I was thinking of using fread() instead but I am not sure that'll work either. Is there a workaround for this problem?
Thanks for your input.
This is my code:
<?php
error_reporting(E_ALL);
##get find() results and remove DOS carriage returns.
##The error is thrown on the next line for large files!
$myData = file_get_contents("tmp11");
$newData = str_replace("\r", "", $myData); // "\r" is the DOS carriage return (shown as ^M in some editors)
##cleanup Model-Manufacturer field.
$pattern = '/(Model-Manufacturer:)(\n)(\w+)/i';
$replacement = '$1$3';
$newData = preg_replace($pattern, $replacement, $newData);
##cleanup Test_Version field and create comma delimited layout.
$pattern = '/(Test_Version=)(\d).(\d).(\d)(\n+)/';
$replacement = '$1$2.$3.$4 ';
$newData = preg_replace($pattern, $replacement, $newData);
##cleanup occasional empty Model-Manufacturer field.
$pattern = '/(Test_Version=)(\d).(\d).(\d) (Test_Version=)/';
$replacement = '$1$2.$3.$4 Model-Manufacturer:N/A--$5';
$newData = preg_replace($pattern, $replacement, $newData);
##fix occasional Model-Manufacturer being incorrectly wrapped.
$newData = str_replace("--","\n",$newData);
##fix 'Binary file' message when find() utility cannot id file.
$pattern = '/(Binary file).*/';
$replacement = '';
$newData = preg_replace($pattern, $replacement, $newData);
$newData = removeEmptyLines($newData);
##replace colon with equal sign
$newData = str_replace("Model-Manufacturer:","Model-Manufacturer=",$newData);
##file stuff
$fh2 = fopen("tmp2","w");
fwrite($fh2, $newData);
fclose($fh2);
### Functions.
##Data cleanup
function removeEmptyLines($string)
{
return preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $string);
}
?>

Firstly, you should understand that when using file_get_contents you're fetching the entire string of data into a variable, and that variable is stored in the host's memory.
If that string is greater than the size dedicated to the PHP process, then PHP will halt and display the error message above.
The way around this is to open the file as a pointer, and then take a chunk at a time. This way, if you had a 500MB file, you can read the first 1MB of data, do what you will with it, delete that 1MB from the system's memory and replace it with the next MB. This allows you to manage how much data you're putting in memory.
An example of this can be seen below. I will create a function that acts like node.js:
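function file_get_contents_chunked($file, $chunk_size, $callback)
{
    // fopen() returns false (with a warning) instead of throwing, so check it explicitly
    $handle = fopen($file, "r");
    if ($handle === false)
    {
        trigger_error("file_get_contents_chunked::could not open " . $file, E_USER_NOTICE);
        return false;
    }
    try
    {
        $i = 0;
        while (!feof($handle))
        {
            call_user_func_array($callback, array(fread($handle, $chunk_size), &$handle, $i));
            $i++;
        }
    }
    catch (Exception $e)
    {
        // only exceptions thrown by the callback end up here
        trigger_error("file_get_contents_chunked::" . $e->getMessage(), E_USER_NOTICE);
        return false;
    }
    finally
    {
        fclose($handle);
    }
    return true;
}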
and then use like so:
$success = file_get_contents_chunked("my/large/file",4096,function($chunk,&$handle,$iteration){
/*
 * Do what you will with the {$chunk} here
 * {$handle} is passed in case you want to seek
 * to different parts of the file
 * {$iteration} is the index of the section of the file that has been read, so
 * ($iteration * 4096) is your current offset within the file.
 */
});
if(!$success)
{
//It Failed
}
One of the problems you will find is that you're trying to perform regex several times on an extremely large chunk of data. Not only that but your regex is built for matching the entire file.
With the above method your regex could become useless, as you may only be matching half a set of data at a chunk boundary. What you should do is revert to the native string functions, such as
strpos
substr
trim
explode
for matching the strings. I have added support in the callback so that the handle and current iteration are passed. This will allow you to work with the file directly within your callback, allowing you to use functions like fseek, ftruncate and fwrite, for instance.
The way you're building your string manipulation is not efficient whatsoever, and using the proposed method above is by far a much better way.

A pretty ugly solution to adjust your memory limit depending on file size:
$filename = "yourfile.txt";
ini_set ('memory_limit', filesize ($filename) + 4000000);
$contents = file_get_contents ($filename);
The right solution would be to consider whether you can process the file in smaller chunks, or to use command-line tools from PHP.
If your file is line-based you can also use fgets to process it line-by-line.
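As a rough sketch of the line-by-line approach, the carriage-return cleanup from the question could look like this. The function name strip_carriage_returns is invented for illustration, and only the single-line steps of the question translate this directly; patterns that span a line break (such as the Model-Manufacturer one) would need some state carried from one line to the next.

```php
<?php
// Hypothetical helper: copies $inPath to $outPath one line at a time,
// removing carriage returns. Memory use is bounded by the longest line.
// Returns the number of lines processed.
function strip_carriage_returns(string $inPath, string $outPath): int
{
    $in = fopen($inPath, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $inPath for reading");
    }
    $out = fopen($outPath, 'wb');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open $outPath for writing");
    }

    $lines = 0;
    while (($line = fgets($in)) !== false) {
        fwrite($out, str_replace("\r", '', $line));
        $lines++;
    }

    fclose($in);
    fclose($out);
    return $lines;
}
```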

To process just n rows at a time (say n = 1000), we can use generators in PHP.
This is how it works:
Read n lines and process them, then resume at line n+1, read the next n lines and process them, and so on until the file is exhausted.
Here's the code for doing so.
<?php
class readLargeCSV{
    // Declared explicitly; dynamic properties are deprecated as of PHP 8.2.
    private $file;
    private $delimiter;
    private $iterator;
    private $header;
    public function __construct($filename, $delimiter = "\t"){
        $this->file = fopen($filename, 'r');
        $this->delimiter = $delimiter;
        $this->iterator = 0;
        $this->header = null;
    }
    public function csvToArray()
    {
        $data = array();
        // A length of 0 means no limit on the line length.
        while (($row = fgetcsv($this->file, 0, $this->delimiter)) !== false)
        {
            if(!$this->header){
                $this->header = $row;
            }
            else{
                $this->iterator++;
                $data[] = array_combine($this->header, $row);
                if($this->iterator % 1000 == 0){
                    $chunk = $data;
                    $data = array();
                    yield $chunk;
                }
            }
        }
        fclose($this->file);
        // Yield the final partial batch, if any rows are left over.
        if($data){
            yield $data;
        }
    }
}
And for reading it, you can use this.
$file = database_path('path/to/csvfile/XYZ.csv');
$csv_reader = new readLargeCSV($file, ",");
foreach($csv_reader->csvToArray() as $data){
// you can do whatever you want with the $data.
}
Here $data contains 1000 entries from the CSV, or the remaining n % 1000 entries for the last batch.
A detailed explanation for this can be found here: https://medium.com/@aashish.gaba097/database-seeding-with-large-files-in-laravel-be5b2aceaa0b

My advice would be to use fread. It may be a little slower, but you won't have to use all your memory...
For instance:
// This uses about filesize($oldFile) bytes of memory
file_put_contents($newFile, file_get_contents($oldFile));
// And this uses about 8192 bytes
$pNew = fopen($newFile, 'w');
$pOld = fopen($oldFile, 'r');
while (!feof($pOld)) {
    fwrite($pNew, fread($pOld, 8192));
}
fclose($pOld);
fclose($pNew);

PHP Readfile() number of bytes when user aborted

I'm using a PHP script to stream a live video (i.e. a file which never ends) from a remote source. The output is viewed in VLC, not a web browser. I need to keep a count of the number of bytes transferred. Here is my code:
<?php
ignore_user_abort(true);
$stream = $_GET['stream'];
if($stream == "vid1")
{
$count = readfile('http://127.0.0.1:8080/');
logThis($count);
}
function logThis($c)
{
$myFile = "bytecount.txt";
$handle = fopen($myFile,'a');
fwrite($handle,"Count: " . $c . "\n");
fclose($handle);
}
?>
However it appears that when the user presses the stop button, logThis() is never called, even though I've put in ignore_user_abort(true);
Any ideas on what I'm doing wrong?
Thanks
Update 2: I've changed my code, as I shouldn't be using ignore_user_abort(true); that would continue to download the file forever, even after the client has gone. I've changed my code to this:
<?php
$count = 0;
function bye()
{
//Create Dummy File with the filename of equal to count
}
register_shutdown_function('bye');
set_time_limit(0);
ignore_user_abort(false);
$stream = $_GET['stream'];
if($stream == "vid1")
{
$GLOBALS['count'] = readfile('http://127.0.0.1:8080/');
exit();
}
?>
My problem now is that when the script is aborted (i.e. user presses stop), readfile won't return a value (i.e. count remains at 0). Any ideas on how I can fix this?
Thanks

When a PHP script is running normally, the NORMAL state is active. If the remote client disconnects, the ABORTED state flag is turned on. A remote client disconnect is usually caused by the user hitting his STOP button. If the PHP-imposed time limit (see set_time_limit()) is hit, the TIMEOUT state flag is turned on.
So setting the time limit to 0 with set_time_limit(0) should help.

OK folks, I managed to fix this. The trick was to not use readfile() but to read the video stream in small chunks (1024 bytes at a time). It may not be 100% accurate, but an inaccuracy of a few bytes here or there is OK.
<?php
$count = 0;
function logCount()
{
//Write out dummy file with a filename equal to count
}
register_shutdown_function('logCount');
set_time_limit(0);
ignore_user_abort(false);
$stream = $_GET['stream'];
if($stream == "vid1")
{
$filename = 'http://127.0.0.1:8080/';
$f = fopen($filename, "rb");
while($chunk = fread($f, 1024)) {
echo $chunk;
flush();
if(!connection_aborted()) {
$GLOBALS['count'] += strlen($chunk);
}
else {
exit();
}
}
}
?>
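A variation on the loop above, as a sketch only: it compares the result of fread() strictly, so a chunk consisting of the single character "0" does not end the loop early, and it returns the count instead of using a global. The name pass_through_counting is made up, and the sketch assumes ignore_user_abort(true) is in effect so that the loop itself decides when to stop; with ignore_user_abort(false) PHP may terminate the script during output, in which case the count still has to be kept somewhere a shutdown function can reach, as in the code above.

```php
<?php
// Hypothetical helper: echoes $source chunk by chunk and returns the number
// of bytes sent before the stream ended or the client went away.
function pass_through_counting($source, int $chunkSize = 8192): int
{
    $sent = 0;
    while (!feof($source)) {
        $chunk = fread($source, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // read error, timeout or end of stream
        }
        echo $chunk;
        flush();
        if (connection_aborted()) {
            break; // client disconnected; do not count this chunk
        }
        $sent += strlen($chunk);
    }
    return $sent;
}
```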

Stream FTP download to output

I am trying to stream/pipe a file to the user's browser through HTTP from FTP. That is, I am trying to print the contents of a file on an FTP server.
This is what I have so far:
public function echo_contents() {
$file = fopen('php://output', 'w+');
if(!$file) {
throw new Exception('Unable to open output');
}
try {
$this->ftp->get($this->path, $file);
} catch(Exception $e) {
fclose($file); // wtb finally
throw $e;
}
fclose($file);
}
$this->ftp->get looks like this:
public function get($path, $stream) {
ftp_fget($this->ftp, $stream, $path, FTP_BINARY); // Line 200
}
With this approach, I am only able to send small files to the user's browser. For larger files, nothing gets printed and I get a fatal error (readable from Apache logs):
PHP Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 15994881 bytes) in /xxx/ftpconnection.php on line 200
I tried replacing php://output with php://stdout without success (nothing seems to be sent to the browser).
How can I efficiently download from FTP while sending that data to the browser at the same time?
Note: I would not like to use file_get_contents('ftp://user:pass@host:port/path/to/file'); or similar.

Found a solution!
Create a socket pair (anonymous pipe?). Use the non-blocking ftp_nb_fget function to write to one end of the pipe, and echo the other end of the pipe.
Tested to be fast (easily 10MB/s on a 100Mbps connection) so there's not much I/O overhead.
Be sure to clear any output buffers. Frameworks commonly buffer your output.
public function echo_contents() {
/* FTP writes to [0]. Data passed through from [1]. */
$sockets = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
if($sockets === FALSE) {
throw new Exception('Unable to create socket pair');
}
stream_set_write_buffer($sockets[0], 0);
stream_set_timeout($sockets[1], 0);
try {
// $this->ftp is an FtpConnection
$get = $this->ftp->get_non_blocking($this->path, $sockets[0]);
while(!$get->is_finished()) {
$contents = stream_get_contents($sockets[1]);
if($contents !== false) {
echo $contents;
flush();
}
$get->resume();
}
$contents = stream_get_contents($sockets[1]);
if($contents !== false) {
echo $contents;
flush();
}
} catch(Exception $e) {
fclose($sockets[0]); // wtb finally
fclose($sockets[1]);
throw $e;
}
fclose($sockets[0]);
fclose($sockets[1]);
}
// class FtpConnection
public function get_non_blocking($path, $stream) {
// $this->ftp is the FTP resource returned by ftp_connect
return new FtpNonBlockingRequest($this->ftp, $path, $stream);
}
/* TODO Error handling. */
class FtpNonBlockingRequest {
protected $ftp = NULL;
protected $status = NULL;
public function __construct($ftp, $path, $stream) {
$this->ftp = $ftp;
$this->status = ftp_nb_fget($this->ftp, $stream, $path, FTP_BINARY);
}
public function is_finished() {
return $this->status !== FTP_MOREDATA;
}
public function resume() {
if($this->is_finished()) {
throw new BadMethodCallException('Cannot continue download; already finished');
}
$this->status = ftp_nb_continue($this->ftp);
}
}

Try:
@readfile('ftp://username:password@host/path/file');
I find with a lot of file operations it's worthwhile letting the underlying OS functionality take care of it for you.

Sounds like you need to turn off output buffering for that page, otherwise PHP will try to fit it all in memory.
An easy way to do this is something like:
while (ob_get_level() > 0 && ob_end_clean()) {
; # do nothing; the check on ob_get_level() avoids a notice once no buffer is left
}
Put that ahead of your call to ->get(), and I think that will resolve your issue.

I know this is old, but some may still think it's useful.
I've tried your solution on a Windows environment, and it worked almost perfectly:
$conn_id = ftp_connect($host);
ftp_login($conn_id, $user, $pass) or die();
$sockets = stream_socket_pair(STREAM_PF_INET, STREAM_SOCK_STREAM,
STREAM_IPPROTO_IP) or die();
stream_set_write_buffer($sockets[0], 0);
stream_set_timeout($sockets[1], 0);
set_time_limit(0);
$status = ftp_nb_fget($conn_id, $sockets[0], $filename, FTP_BINARY);
while ($status === FTP_MOREDATA) {
echo stream_get_contents($sockets[1]);
flush();
$status = ftp_nb_continue($conn_id);
}
echo stream_get_contents($sockets[1]);
flush();
fclose($sockets[0]);
fclose($sockets[1]);
I used STREAM_PF_INET instead of STREAM_PF_UNIX because of Windows, and it worked flawlessly... until the last chunk, which was false for no apparent reason, and I couldn't understand why. So the output was missing the last part.
So I decided to use another approach:
$ctx = stream_context_create();
stream_context_set_params($ctx, array('notification' =>
function($code, $sev, $message, $msgcode, $bytes, $length) {
switch ($code) {
case STREAM_NOTIFY_CONNECT:
// Connection established
break;
case STREAM_NOTIFY_FILE_SIZE_IS:
// Getting file size
break;
case STREAM_NOTIFY_PROGRESS:
// Some bytes were transferred
break;
default: break;
}
}));
@readfile("ftp://$user:$pass@$host/$filename", false, $ctx);
This worked like a charm with PHP 5.4.5. The bad part is that you can't catch the transferred data, only the chunk size.

A quick search brought up PHP's flush().
this article might also be of interest: http://www.net2ftp.org/forums/viewtopic.php?id=3774

(I've never met this problem myself, so that's just a wild guess ; but, maybe... )
Maybe changing the size of the output buffer for the "file" you are writing to could help?
For that, see stream_set_write_buffer.
For instance :
$fp = fopen('php://output', 'w+');
stream_set_write_buffer($fp, 0);
With this, your code should use a non-buffered stream -- this might help...
