What happens when a stream opened by file IO is blocked? - PHP

Consider the code snippet here:
$handle = popen("some command that generates an infinite stream of output to stdout", "r");
while ($line = fgets($handle)) {
    echo $line;
    sleep(3);
}
My question is: what is actually happening during that sleep(3) while the command passed to popen() is still spewing output? Is that output getting buffered in PHP's memory?
Is there a chance the output is lost?

It's OS-dependent. The data may be buffered (typically in the kernel's pipe buffer), the other program's write calls may block once that buffer is full, or some combination of the two.
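A minimal sketch of one way to observe this, assuming a Unix-like system where the yes command is available (everything here is illustrative): switch the popen() handle to non-blocking mode and poll it, so you can tell when data is waiting in the pipe versus when the producer is stalled or idle.
// Sketch only: `yes` floods its stdout; once the OS pipe buffer fills,
// its writes block until we read again, and nothing in the pipe is discarded.
$handle = popen("yes 'lots of output'", "r");
stream_set_blocking($handle, false);

$lines = 0;
while ($lines < 100) {
    $line = fgets($handle);
    if ($line === false) {
        // Nothing buffered right now; the child is either blocked or idle.
        usleep(100000);
        continue;
    }
    echo $line;
    $lines++;
}
pclose($handle); // closing the pipe lets the child terminate (SIGPIPE on its next write)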


Why doesn't fread() (PHP) read the entire file?

I have this PHP code that loads a local file:
$filename = "fille.txt";
$fp = fopen($filename, "rb");
$content = fread($fp, 25699);
fclose($fp);
print_r($content);
With this code I can see all the contents of the file. But when I change $filename to an external link, like:
$filename = "https:/.../texts/fille.txt";
I can't see all the contents of the file; it appears cut off. What's the problem?
The fread() function can be used for network operations, but network connections work differently from file system operations. A network read cannot pull a larger file in a single attempt; that is not how typical networks work. Instead they are packet-based, so data arrives in chunks.
And if you take a look at the documentation of the function you are using, you will see:
Reading stops as soon as one of the following conditions is met:
[...]
a packet becomes available or the socket timeout occurs (for network streams)
[...]
So what you observe is actually documented behavior. You need to keep reading packets in a loop, until you receive EOF, to get the whole file.
Take a look yourself: https://www.php.net/manual/en/function.fread.php
And further down in that documentation you will see this example:
Example #3 Remote fread() examples
<?php
$handle = fopen("http://www.example.com/", "rb");
if (FALSE === $handle) {
    exit("Failed to open stream to URL");
}

$contents = '';
while (!feof($handle)) {
    $contents .= fread($handle, 8192);
}

fclose($handle);
?>
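As an aside, stream_get_contents() performs that read-until-EOF loop internally, so a sketch along these lines (assuming allow_url_fopen is enabled) avoids the manual loop:
// Sketch: stream_get_contents() keeps reading packets until EOF for you.
$handle = fopen("https://www.example.com/", "rb");
if ($handle === false) {
    exit("Failed to open stream to URL");
}
$contents = stream_get_contents($handle);
fclose($handle);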

Pipe stdin into a shell script through php

We have a command-line PHP application that maintains special permissions, and we want to use it to relay piped data into a shell script.
I know that we can read STDIN with:
while (!feof(STDIN)) {
    $line = fgets(STDIN);
}
But how can I redirect that STDIN into a shell script?
The data on STDIN is far too large to load into memory, so I can't do something like:
shell_exec("echo ".STDIN." | script.sh");
Using xenon's answer with popen seems to do the trick.
// Open the process handle
$ph = popen("./script.sh", "w");

// This feeds it to the script line by line.
while (($line = fgets(STDIN)) !== false) {
    // Write the line from STDIN. (Note: you may have to append a newline with $line . "\n"; I'm not sure.)
    fputs($ph, $line);
}
pclose($ph);
As @Devon said, popen()/pclose() are very useful here.
$scriptHandle = popen("./script.sh", "w");
while (($line = fgets(STDIN)) !== false) {
    fputs($scriptHandle, $line);
}
pclose($scriptHandle);
Alternatively, for smaller input, something along the lines of fputs($scriptHandle, file_get_contents("php://stdin")); might work in place of the line-by-line approach.
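Another option worth sketching, assuming ./script.sh is the target script: stream_copy_to_stream() pumps STDIN into the process in chunks, so the whole input is never held in PHP's memory.
// Sketch: copy STDIN to the script's stdin chunk by chunk.
$scriptHandle = popen("./script.sh", "w");
if ($scriptHandle === false) {
    exit("Failed to start script.sh\n");
}
stream_copy_to_stream(STDIN, $scriptHandle);
pclose($scriptHandle);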

Reading a file line by line for large files efficiently in PHP?

My app reads a large file (5 MB - 10 MB) containing JSON entries, one per line.
Each line is fed to multiple parsers and handled separately. Once the file has been read, it is moved. The program is continuously fed with files to process.
The program currently works with file_get_contents($filename), and its structure works as is.
The problem is that file_get_contents loads the entire file into memory, and the whole run takes about a minute. I suspect I can gain speed if I read line by line rather than waiting for the file to load into memory (I might be wrong and am open to suggestions).
There are many file-handling functions that could do this. What is the most effective way to achieve what I need, and which file-reading method is best for it?
Off the top of my head I have fopen, fread, readfile, file, and fscanf to contend with. However, when I read the manual pages for them, it's all code for reading generic files, without a clear indication of what is best for larger files.
$file = fopen("file.json", "r");
if ($file)
{
    while (($line = fgets($file)) !== false)
    {
        echo $line;
    }
}
else
{
    echo "Unable to open the file";
}
fgets() reads until it reaches EOL or EOF. If you want, you can limit how much it reads using the second argument.
For more info about fgets: http://us3.php.net/fgets
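For the one-JSON-entry-per-line case described in the question, a hedged sketch along these lines (the function and file names are illustrative) keeps only one line in memory at a time:
// Sketch: a generator that yields one decoded JSON entry per line.
function readJsonLines(string $filename): Generator
{
    $fp = fopen($filename, "r");
    if ($fp === false) {
        throw new RuntimeException("Unable to open $filename");
    }
    while (($line = fgets($fp)) !== false) {
        $entry = json_decode(trim($line), true);
        if ($entry !== null) {
            yield $entry;
        }
    }
    fclose($fp);
}

foreach (readJsonLines("file.json") as $entry) {
    // hand each $entry to the parsers here
}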

How to redirect STDOUT to a file in PHP?

The code below almost works, but it's not what I really meant:
ob_start();
echo 'xxx';
$contents = ob_get_contents();
ob_end_clean();
file_put_contents($file,$contents);
Is there a more natural way?
It is possible to write STDOUT directly to a file in PHP, which is much easier and more straightforward than using output buffering.
Do this in the very beginning of your script:
fclose(STDIN);
fclose(STDOUT);
fclose(STDERR);
$STDIN = fopen('/dev/null', 'r');
$STDOUT = fopen('application.log', 'wb');
$STDERR = fopen('error.log', 'wb');
Why at the very beginning, you may ask? No other file descriptors should be open yet, because when you close the standard input, output and error file descriptors, the next three descriptors you open will become the NEW standard input, output and error file descriptors.
In my example here I redirected standard input to /dev/null and the output and error file descriptors to log files. This is common practice when making a daemon script in PHP.
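As a hedged illustration of that daemon pattern (requires the pcntl and posix extensions, CLI only; the log file names are the ones from the example above):
// Sketch: fork, detach from the terminal, then swap the standard descriptors.
if (pcntl_fork() > 0) {
    exit; // parent exits; the child keeps running in the background
}
posix_setsid(); // become session leader, detaching from the controlling terminal

fclose(STDIN);
fclose(STDOUT);
fclose(STDERR);
$STDIN = fopen('/dev/null', 'r');
$STDOUT = fopen('application.log', 'wb');
$STDERR = fopen('error.log', 'wb');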
To write to the application.log file, this would suffice:
echo "Hello world\n";
To write to the error.log, one would have to do:
fwrite($STDERR, "Something went wrong\n");
Please note that when you change the input, output and error descriptors, the built-in PHP constants STDIN, STDOUT and STDERR will be rendered unusable. PHP will not update these constants to the new descriptors, and it is not allowed to redefine these constants (they are called constants for a reason, after all).
Here's a way to divert output, which appears to be the original problem:
$ob_file = fopen('test.txt', 'w');

function ob_file_callback($buffer)
{
    global $ob_file;
    fwrite($ob_file, $buffer);
}
ob_start('ob_file_callback');
more info here:
http://my.opera.com/zomg/blog/2007/10/03/how-to-easily-redirect-php-output-to-a-file
None of the answers worked for my particular case, where I needed a cross-platform way of redirecting the output as soon as it was echoed, so that I could follow the logs with tail -f log.txt or another log-viewing app.
I came up with the following solution:
$logFp = fopen('log.txt', 'w');
ob_start(function ($buffer) use ($logFp) {
    fwrite($logFp, $buffer);
}, 1); // notice the use of chunk_size == 1

echo "first output\n";
sleep(10);
echo "second output\n";

ob_end_clean();
I haven't noticed any performance issues, but if you do, you can increase chunk_size.
Now just tail -f the log file:
tail -f log.txt
No, output buffering is as good as it gets. Though it's slightly nicer to just do
ob_start();
echo 'xxx';
$contents = ob_get_clean();
file_put_contents($file,$contents);
Using the eio PECL module is very easy, and you can also capture PHP internal errors, var_dump, echo, etc. In this code you can find examples of different situations.
$fdout = fopen('/tmp/stdout.log', 'wb');
$fderr = fopen('/tmp/stderr.log', 'wb');
eio_dup2($fdout, STDOUT);
eio_dup2($fderr, STDERR);
eio_event_loop();
fclose($fdout);
fclose($fderr);
// output examples
echo "message to stdout\n";
$v2dump = array(10, "graphinux");
var_dump($v2dump);
// php internal error/warning
$div0 = 10/0;
// user errors messages
fwrite(STDERR, "user controlled error\n");
The call to eio_event_loop() is used to be sure that previous eio requests have been processed. If you need to append to the log, use mode 'ab' instead of 'wb' in the fopen() call.
Installing the eio module is very easy (http://php.net/manual/es/eio.installation.php). I tested this example with version 1.2.6 of the eio module.
You can install the eio extension
pecl install eio
and duplicate a file descriptor
$temp = fopen('/tmp/my_stdout', 'a');
$my_data = 'my something';
$foo = eio_dup2($temp, STDOUT, EIO_PRI_MAX, function ($data, $result, $request) {
    var_dump($data, $result, $request);
    var_dump(eio_get_last_error($request));
}, $my_data);
eio_event_loop();
echo "something to stdout\n";
fclose($temp);
This creates a new file descriptor and rewrites the target stream of STDOUT. This can be done with STDERR as well, and the constants STD[OUT|ERR] are still usable.
I understand that this question is ancient, but people trying to do what this question asks will likely end up here... Both of you.
If you are running under a particular environment, namely:
Running under Linux (probably most other Unix-like operating systems, untested)
Running via CLI (untested on web servers)
then you can actually close all of your file descriptors (yes, all of them; it's probably best to do this at the very beginning of execution, for example just after a pcntl_fork() call used to background the process as a daemon, which seems like the most common need for something like this):
fclose( STDIN );  // fd 0
fclose( STDOUT ); // fd 1
fclose( STDERR ); // fd 2
And then re-open the file descriptors, assigning them to variables that will not fall out of scope and thus be garbage collected. Linux will predictably reuse the lowest free descriptor numbers, so the fopen() calls take over fds 0, 1 and 2 in order.
$kept_in_scope_variable_fd0 = fopen( '/dev/null', ... ); // becomes fd 0 (STDIN)
$kept_in_scope_variable_fd1 = fopen(...); // becomes fd 1 (STDOUT)
$kept_in_scope_variable_fd2 = fopen(...); // becomes fd 2 (STDERR)
You can use whatever files or devices you want for this. I gave /dev/null as the example for STDIN (fd 0) because that's probably the most common case for this kind of code.
Once this is done you should be able to do normal things like echo, print_r, var_dump, etc. without specifically needing to write to a file with a function, which is useful when you're trying to background code that you do not want to, or aren't able to, rewrite to be file-pointer-output-friendly.
YMMV for other environments and things like having other FD's open, etc. My advice is to start with a small test script to prove that it works, or doesn't, in your environment and then move on to integration from there.
Good luck.
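A minimal test script for that check might look like this (CLI on Linux; the log paths are illustrative):
// Sketch: verify that the descriptor-reuse trick works in your environment.
fclose(STDIN);
fclose(STDOUT);
fclose(STDERR);

$stdin = fopen('/dev/null', 'r');      // should reuse fd 0
$stdout = fopen('/tmp/out.log', 'ab'); // should reuse fd 1
$stderr = fopen('/tmp/err.log', 'ab'); // should reuse fd 2

echo "this line should land in /tmp/out.log\n";
fwrite($stderr, "this line should land in /tmp/err.log\n");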
Here is an ugly solution that was useful for a problem I had (I needed to debug).
if (file_get_contents("out.txt") != "in progress")
{
    file_put_contents("out.txt", "in progress");
    $content = file_get_contents('http://'.$_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI']);
    file_put_contents("out.txt", $content);
}
The main drawback of that is that you'd better not use the $_POST variables.
But you don't have to put it at the very beginning.

Streaming output to a file and the browser

So, I'm looking for something more efficient than this:
<?php
ob_start();
include 'test.php';
$content = ob_get_contents();
file_put_contents('test.html', $content);
echo $content;
?>
The problems with the above:
Client doesn't receive anything until the entire page is rendered
File might be enormous, so I'd rather not have the whole thing in memory
Any suggestions?
Interesting problem; don't think I've tried to solve this before.
I'm thinking you'll need to have a second request going from your front-facing PHP script to your server. This could be a simple call to http://localhost/test.php. If you use fopen-wrappers, you could use fread() to pull the output of test.php as it is rendered, and after each chunk is received, output it to the screen and append it to your test.html file.
Here's how that might look (untested!):
<?php
$remote_fp = fopen("http://localhost/test.php", "r");
$local_fp = fopen("test.html", "w");

while ($buf = fread($remote_fp, 1024)) {
    echo $buf;
    fwrite($local_fp, $buf);
}

fclose($remote_fp);
fclose($local_fp);
?>
A better way to do this is to use the first two parameters accepted by ob_start: output_callback and chunk_size. The former specifies a callback to handle output as it's buffered, and the latter specifies the size of the chunks of output to handle.
Here's an example:
$output_file = fopen('test.html', 'w');
if ($output_file === false) {
    // Handle error
}

$write_ob_to_file = function ($buffer) use ($output_file) {
    fwrite($output_file, $buffer);
    // Output string as-is
    return false;
};

ob_start($write_ob_to_file, 4096);
include 'test.php';
ob_end_flush();

fclose($output_file);
In this example, the output buffer will be flushed (sent) for every 4096 bytes of output (and once more at the end by the ob_end_flush call). Each time the buffer is flushed, the callback $write_ob_to_file will be called and passed the latest chunk. This gets written to test.html. The callback then returns false, meaning "output this chunk as is". If you wanted to only write the output to file and not to PHP's output stream, you could return an empty string instead.
Pix0r's answer is what you want, unless you actually need it "included" rather than just executed. For example, if you have login information before test.php, it will not get passed into the file if you call it with fopen().
If you need it genuinely included, then what you have is the simplest method; but if you want constant output, you'll need to write test.php in a manner that both outputs and stores the information as it goes. As far as I know, there's no way to both collect the buffer and output it as you go.
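A hedged sketch of what "outputs and stores the information as it goes" could look like inside test.php (the emit() helper and file name are illustrative, not an existing API):
// Sketch: every chunk is echoed to the browser and appended to the file immediately.
$log = fopen('test.html', 'w');

function emit(string $chunk)
{
    global $log;
    echo $chunk;          // send to the browser
    fwrite($log, $chunk); // persist to the file
    flush();              // push it to the client right away
}

emit("<p>first chunk</p>\n");
emit("<p>second chunk</p>\n");
fclose($log);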
Here you go: X-Sendfile. Use mod_xsendfile to send the file efficiently; it's really easy.
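A minimal sketch of that approach, assuming Apache with mod_xsendfile enabled (XSendFile On) and an illustrative file path:
// Sketch: PHP only sets the headers; Apache streams the file itself.
$path = '/var/www/files/test.html';
header('Content-Type: text/html');
header('X-Sendfile: ' . $path);
exit;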
