Issue with stream_select() in PHP

I am using stream_select(), but after a few seconds it returns 0 descriptors and my function exits while there is still data to be read.
An unusual thing, though, is that if I set the timeout to 0 then I always get zero descriptors back.
$num = stream_select($read, $w, $e, 0);

stream_select() must be used in a loop
The stream_select() function basically just polls the streams you provide in the first three arguments, which means it will wait until one of the following events occurs:
some data arrives,
or the timeout (set with $tv_sec and $tv_usec) is reached without any data arriving.
So receiving 0 as a return value is perfectly normal; it means there was no new data in the current polling cycle.
I'd suggest putting the function in a loop, something like this:
$EOF = false;
do {
    // stream_select() modifies the arrays in place, so rebuild them on
    // every cycle ($stream here is the resource you want to watch)
    $read  = array($stream);
    $write = null;
    $excl  = null;
    $tmp   = '';
    $ready = stream_select($read, $write, $excl, 0, 50000);
    if ($ready === false) {
        // something went wrong!!
        break;
    } elseif ($ready > 0) {
        foreach ($read as $r) {
            $tmp .= stream_get_contents($r);
            if (feof($r)) $EOF = true;
        }
        if (!empty($tmp)) {
            //
            // DO SOMETHING WITH DATA
            //
            continue;
        }
    } else {
        // No data in the current cycle
    }
} while (!$EOF);
Please note that in this example the script completely ignores everything apart from the input stream. Also, the third branch of the if statement is entirely optional.

Does it return the number 0 or boolean FALSE? FALSE means there was some error, but zero could be just because of the timeout, or because nothing interesting has happened with the streams, and you should simply do a new select, etc.
I would guess this could happen with a zero timeout, as it will check and return immediately. Also, if you read the PHP manual about stream_select you will see this warning about using a zero timeout:
Using a timeout value of 0 allows you to instantaneously poll the status of the streams, however, it is NOT a good idea to use a 0 timeout value in a loop as it will cause your script to consume too much CPU time.
If this is a TCP stream and you want to check for connection close, you should check the return value from fread etc. to determine whether the other peer has closed the connection. About the read streams array argument:
The streams listed in the read array will be watched to see if characters become available for reading (more precisely, to see if a read will not block - in particular, a stream resource is also ready on end-of-file, in which case an fread() will return a zero length string).
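As a rough sketch of that advice (my example, with $sock standing in for your connected TCP stream): once select() reports the stream readable, a zero-length read together with feof() is what distinguishes "peer closed" from "no data yet".
$read   = array($sock); // $sock: hypothetical connected TCP stream
$write  = null;
$except = null;
if (stream_select($read, $write, $except, 5) > 0) {
    $data = fread($sock, 8192);
    if ($data === '' && feof($sock)) {
        // fread() returned a zero-length string at EOF: the peer closed
        fclose($sock);
    }
}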

http://www.php.net/stream_select
Due to a limitation in the current Zend Engine it is not possible to
pass a constant modifier like NULL directly as a parameter to a
function which expects this parameter to be passed by reference.
Instead use a temporary variable or an expression with the leftmost
member being a temporary variable:
<?php $e = NULL; stream_select($r, $w, $e, 0); ?>

I have a similar issue, which is caused by the underlying socket timeout.
E.g. I create some streams:
$streams = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
Then fork, and use a block such as the following:
stream_set_blocking($pipes[1], 1);
stream_set_blocking($pipes[2], 1);
$pipesToRead = array($pipes[1], $pipes[2]);
while (!feof($pipesToRead[0]) || !feof($pipesToRead[1])) {
    $reads   = $pipesToRead;
    $writes  = null;
    $excepts = $pipesToRead;
    $tSec    = null;
    stream_select($reads, $writes, $excepts, $tSec);
    // while it's generating any kind of output, duplicate it wherever it
    // needs to go
    foreach ($reads as &$read) {
        $chunk = fread($read, 8192);
        foreach ($streams as &$stream) {
            fwrite($stream, $chunk);
        }
    }
}
Glossing over what other things might be wrong there, my $tSec argument to stream_select is ignored, and the "stream" will time out after 60 seconds of inactivity and produce an EOF.
If I add the following after creating the streams
stream_set_timeout($streams[0], 999);
stream_set_timeout($streams[1], 999);
Then I get the result I desire, even if there's no activity on the underlying stream for longer than 60 seconds.
I feel that this might be a bug, because I don't want that EOF after 60 seconds of inactivity on the underlying stream, and I don't want to plug in some arbitrarily large value to avoid hitting the timeout if my processes are idle for some time.
In addition, even if the 60-second timeout remains, I think it should just time out on my stream_select() call and my loop should be able to continue.

Related

usleep() in while loop causing issues - whole process freezes

I am using PHP, Laravel, Redis, and SQL on an Ubuntu localhost server. I have made a bunch of methods that return results from API searches after some processing. I am calling 5 of these methods, which would be very slow if done synchronously, so I've been experimenting with async approaches (which I know PHP isn't optimised for). After a few approaches I have found some success with pcntl_fork(), but I'm running into some nasty problems.
Edit: After some messing around, I have found that if I remove the while loop then the code afterwards executes properly; I have removed the while loop and placed it in the second 'search' method. However, it still causes the system to freeze. This makes no sense, as there shouldn't be an infinite loop: if I manually query the Redis db, all 5 results are there.
This is my code (I have a few custom classes for making and processing the API calls; FYI, these methods work flawlessly):
// this caches the individual api results to a Redis list
public static function cacheAsyncApiSearch(string $searchQuery, int $maxResults = 20)
{
    $key = "search:".$searchQuery; // for Redis
    if (!Redis::client()->exists($key)) {
        for ($i = 0; $i < 5; $i++) {
            // Create a child process
            $pid = pcntl_fork();
            if ($pid == -1) {
                // Fork failed
                exit(1);
            } elseif ($pid) {
                // This is the parent process
                // I have tried many versions of pcntl_wait, none work! They all
                // still don't allow code to be run afterwards (even within this
                // elseif block), and the best it does is cache the 1st api case (YouTube)
                // while (!pcntl_wait($status, WNOHANG)) {
                //     $exitStatus = pcntl_wexitstatus($status);
                //     // Do something with the exit status of the child process
                // }
                // dd($pid);
                // pcntl_waitpid($pid, $status, WUNTRACED);
            } else {
                // child processes
                switch ($i) {
                    case 0:
                        $results = YouTube::search($searchQuery, $maxResults)['results'];
                        Redis::client()->rPush($key, SearchResultDTO::jsonEncodeArray($results));
                        SearchResultDTO::convertResultDTOToModels($results);
                        break;
                    case 1:
                        $results = Dailymotion::search($searchQuery, $maxResults)['results'];
                        Redis::client()->rPush($key, SearchResultDTO::jsonEncodeArray($results));
                        SearchResultDTO::convertResultDTOToModels($results);
                        break;
                    case 2:
                        $results = Vimeo::search($searchQuery, $maxResults)['results'];
                        Redis::client()->rPush($key, SearchResultDTO::jsonEncodeArray($results));
                        SearchResultDTO::convertResultDTOToModels($results);
                        break;
                    case 3:
                        $results = Twitch::search($searchQuery, 2)['results'];
                        Redis::client()->rPush($key, SearchResultDTO::jsonEncodeArray($results));
                        SearchResultDTO::convertResultDTOToModels($results);
                        break;
                    case 4:
                        $results = Podcasts::getPodcastsFromItunesResults(Podcasts::search($searchQuery, 2)["response"]->results);
                        Redis::client()->rPush($key, SearchResultDTO::jsonEncodeArray($results));
                        SearchResultDTO::convertResultDTOToModels($results);
                        break;
                }
                $i = 10000;
                exit(0);
            }
        }
        // for noting the process id of the given process that gets to this point
        Redis::client()->lPush("search_pid:".$searchQuery, $pid);
        // sets a time out for the redis cache
        Redis::client()->expire($key, 60*60*4);
        while (is_numeric(Redis::client()->lLen($key)) && Redis::client()->lLen($key) < 5) {
            usleep(500000); // 0.5 seconds
            // pcntl_waitpid(-1, $status); // does this even do anything? not for me
        }
        return false; // not already cached
    }
    return true; // already cached
}
This code somewhat works: it performs the API calls and caches them to Redis perfectly. However, when the method is run, no code runs after it (unless Redis has found a cached version and the process is not forked).
This made me think that all processes are being exited (possibly true? If so, I don't know why), so I tried writing a version without the exit(0) line. This works, and I can then run code after the method call; however, I noticed (when getting SQL race conditions) that all 6 processes (5 children, 1 parent) continued to run their own version of the code after this method (e.g. some database writes).
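For what it's worth, here is a minimal fork/reap sketch (my example, not the code above; do_work() is a hypothetical stand-in) showing why the exit(0) in each child matters and how a parent can wait for all children with pcntl_waitpid():
$pids = [];
for ($i = 0; $i < 5; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit(1);                  // fork failed
    } elseif ($pid === 0) {
        do_work($i);              // hypothetical per-child task
        exit(0);                  // without this, the child keeps running the parent's code
    }
    $pids[] = $pid;               // parent: remember each child's pid
}
foreach ($pids as $pid) {
    pcntl_waitpid($pid, $status); // block until that child exits
}
// only the parent reaches this point, after all children have finished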
public static function search(string $searchQuery, int $maxResults = 20): array
{
    $key = "search:".$searchQuery;
    $results = [];
    // the quoted method above
    self::cacheAsyncApiSearch($searchQuery, $maxResults);
    foreach (Redis::client()->lRange($key, 0, -1) as $result) {
        $results = array_merge($results, SearchResultDTO::jsonDecodeArray($result));
    }
    $creatorDTOs  = [];
    $videoDTOs    = [];
    $streamDTOs   = [];
    $playlistDTOs = [];
    $podcastDTOs  = [];
    /** @var SearchResultDTO $result */
    foreach ($results as $result) {
        match ($result->kind) {
            Kind::Creator  => $creatorDTOs[]  = $result,
            Kind::Video    => $videoDTOs[]    = $result,
            Kind::Stream   => $streamDTOs[]   = $result,
            Kind::Playlist => $playlistDTOs[] = $result,
            Kind::Podcast  => $podcastDTOs[]  = $result,
        };
    }
    // did this to test how many times the code was being run (the list has 6 1's in it)
    Redis::client()->lPush("here", '1');
    // I know this code isn't completely efficient since I already called these conversion
    // methods before, however I am just trying to get the forking stuff to work right now.
    return [
        "creators"  => SearchResultDTO::convertResultDTOToModels($creatorDTOs),
        "videos"    => SearchResultDTO::convertResultDTOToModels($videoDTOs),
        "streams"   => SearchResultDTO::convertResultDTOToModels($streamDTOs),
        "playlists" => SearchResultDTO::convertResultDTOToModels($playlistDTOs),
        "podcasts"  => SearchResultDTO::convertResultDTOToModels($podcastDTOs)
    ];
}
These DTOs (Data Transfer Objects) are being used to populate a UI. So, for example, when I make a search (that isn't cached), the page is blank forever. But if I refresh the page (after the search is cached) then the results show just fine.
This is the most bizarre problem I have ever run into, and I really appreciate any help.
Edit, please read:
After some messing around, I have found that if I remove the while loop then the code afterwards executes properly; I have removed the while loop and placed it in the second 'search' method. However, it still causes the system to freeze. This makes no sense, as there shouldn't be an infinite loop: if I manually query the Redis db, all 5 results are there. And the dd("two") can never be executed unless the usleep() is removed. Hopefully this narrows the problem down.
Edit 2, please read:
I have figured out that I can get the dd("two") to work when usleep() is reduced from 0.5 seconds to 0.05 seconds, but it still doesn't seem to run long enough for it to work.
if (!self::cacheAsyncApiSearch($searchQuery, $maxResults))
{
    // make sure Redis is properly returning a number, not an object
    $len = Redis::client()->lLen($key);
    while (!is_numeric($len)) {
        usleep(500000); // 0.5 seconds
        $len = Redis::client()->lLen($key);
    }
    // dd($len); // this dd() works
    while ($len < 5) {
        dd("one");      // this dd() works
        usleep(500000); // 0.5 seconds
        dd("two");      // this does not work, why?
        $len = Redis::client()->lLen($key);
    }
}

False EOF from feof() with sockets/fgets

I have inherited a piece of code which uses the fetchURL() function below to grab data from a URL. I've just noticed that it is often getting feof() returning true before the full page of data is retrieved. I have run some tests, and using cURL or file_get_contents() both retrieve the full page every time.
The error is intermittent. Of 9 calls, sometimes 7 will complete successfully and sometimes only 4. A particular 4 of the 9 (they are GET requests with just a changing query string) always complete successfully. I have tried reversing the order of the requests, and the same 4 query strings are still always successful while the remainder sometimes work and sometimes don't.
So it "seems" that the data being returned may have something to do with the problem, but it's the intermittent nature that has got me foxed. The data returned in each case is always the same (as in, every time I make a call with a query string of ?SearchString=8502806 the page returned contains the same data), but sometimes the full page is delivered by fgets/feof and sometimes not.
Does anyone have a suggestion as to what may be causing this situation? Most other posts I have seen on this subject concern the opposite problem, whereby feof() is not returning true.
function fetchURL($url, $ret = 'body') {
    $url_parsed = parse_url($url);
    $host = $url_parsed["host"];
    $port = (isset($url_parsed["port"])) ? $url_parsed["port"] : '';
    if ($port == 0)
        $port = 80;
    $path = $url_parsed["path"];
    if ($url_parsed["query"] != "")
        $path .= "?".$url_parsed["query"];
    $out = "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
    $fp = fsockopen($host, $port, $errno, $errstr, 30);
    fwrite($fp, $out);
    $body = false;
    $h = '';
    $b = '';
    while (!feof($fp)) {
        $s = fgets($fp, 1024);
        if ($body)
            $b .= $s;
        else
            $h .= $s;
        if ($s == "\r\n")
            $body = true;
    }
    fclose($fp);
    return ($ret == 'body') ? $b : (($ret == 'head') ? $h : array($h, $b));
}
I see quite a few things wrong with that code.
Don't ever use feof on sockets. It'll hang until the server closes the socket, which does not necessarily happen immediately after the page has been received.
feof might return true (socket is closed) while PHP still has some data in its buffer.
Your code to distinguish the header from the body seems to rely on PHP doing its job properly, which is generally a bad idea. fgets doesn't necessarily read a line; it can also return just a single byte (\r, and then on the next call you might get the \n).
You're not properly encoding the path value.
Why don't you just convert your code to use cURL or file_get_contents?
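For reference, a cURL version with the same signature might look something like this (a sketch only; error handling elided):
function fetchURL($url, $ret = 'body') {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, true);         // keep the headers so we can split them off
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    $response = curl_exec($ch);
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    curl_close($ch);
    $h = substr($response, 0, $headerSize);
    $b = substr($response, $headerSize);
    return ($ret == 'body') ? $b : (($ret == 'head') ? $h : array($h, $b));
}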
It sounds like a timeout problem to me. See stream_set_timeout() in the PHP manual.

In PHP, how can I detect that input vars were truncated due to max_input_vars being exceeded?

I know that an E_WARNING is generated by PHP
PHP Warning: Unknown: Input variables exceeded 1000
But how can I detect this in my script?
A "close enough" method would be to check if( count($_POST, COUNT_RECURSIVE) == ini_get("max_input_vars"))
This will cause a false positive if the number of POST vars happens to be exactly on the limit, but considering the default limit is 1000 it's unlikely to ever be a concern.
count($_POST, COUNT_RECURSIVE) is not accurate because it counts all nodes in the array tree whereas input_vars are only the terminal nodes. For example, $_POST['a']['b'] = 'c' has 1 input_var but using COUNT_RECURSIVE will return 3.
php://input cannot be used with enctype="multipart/form-data". http://php.net/manual/en/wrappers.php.php
Since this issue only arises with PHP >= 5.3.9, we can use anonymous functions. The following recursively counts the terminals in an array.
function count_terminals($a) {
    return is_array($a)
        ? array_reduce($a, function ($carry, $item) { return $carry + count_terminals($item); }, 0)
        : 1;
}
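A possible way to use it (my assumption about the comparison; it has the same exactly-at-the-limit false positive mentioned above):
if (count_terminals($_POST) >= (int) ini_get('max_input_vars')) {
    // the terminal count hit the limit, so the input was probably truncated
}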
What works for me is this. Firstly, I put this at the top of my script/handler/front controller. This is where the error will be saved (or $e0 will be null, which is OK).
$e0 = error_get_last();
Then I run a bunch of other processing, bootstrapping my application, registering plugins, establishing sessions, checking database state (lots of things that I can accomplish regardless of exceeding this condition). Then I check the $e0 state. If it's not null, we have an error, so I bail out (assume that App is a big class with lots of your magic in it):
if (null != $e0) {
    ob_end_clean(); // Purge the outputted Warning
    App::bail($e0); // Spew the warning in a friendly way
}
Tweak and tune error handlers for your own state.
Registering an error handler won't catch this condition because it exists before your error handler is registered.
Checking input var count to equal the maximum is not reliable.
The above $e0 will be an array, with type => 8 and line => 0; the message will explicitly mention input_vars, so you could regex-match it to create a very narrow condition and ensure positive identification of this specific case.
Also note, according to the PHP specs this is a Warning not an Error.
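For example, a narrow match on the message (the pattern below is my assumption, based on the warning text quoted further down):
if ($e0 !== null && preg_match('/Input variables exceeded \d+/', $e0['message'])) {
    // positively identified the max_input_vars warning
}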
function checkMaxInputVars()
{
    $max_input_vars = ini_get('max_input_vars');
    # Value of the configuration option as a string, or an empty string for
    # null values, or FALSE if the configuration option doesn't exist
    if ($max_input_vars == FALSE)
        return FALSE;
    $php_input = substr_count(file_get_contents('php://input'), '&');
    $post = count($_POST, COUNT_RECURSIVE);
    echo $php_input, $post, $max_input_vars;
    return $php_input > $post;
}
echo checkMaxInputVars() ? 'POST has been truncated.' : 'POST is not truncated.';
Call error_get_last() as soon as possible in your script (before you have a chance to cause errors, as they will obscure this one.) In my testing, the max_input_vars warning will be there if applicable.
Here is my test script with max_input_vars set to 100:
<?php
if (($error = error_get_last()) !== null) {
    echo 'got error:';
    var_dump($error);
    return;
}
unset($error);
if (isset($_POST['0'])) {
    echo 'Got ', count($_POST), ' vars';
    return;
}
?>
<form method="post">
<?php
for ($i = 0; $i < 200; $i++) {
    echo '<input name="', $i, '" value="foo" type="hidden">';
}
?>
<input type="submit">
</form>
Output when var limit is hit:
got error:
array
  'type' => int 2
  'message' => string 'Unknown: Input variables exceeded 100. To increase the limit change max_input_vars in php.ini.' (length=94)
  'file' => string 'Unknown' (length=7)
  'line' => int 0
Tested on Ubuntu with PHP 5.3.10 and Apache 2.2.22.
I would be hesitant to check explicitly for this error string, for stability (they could change it) and general PHP good practice. I prefer to turn all PHP errors into exceptions, like this (separate subclasses may be overkill, but I like this approach because it allows @ error suppression). It would be a little different coming from error_get_last(), but should be pretty easy to adapt.
I don't know if there are other pre-execution errors that could get caught by this method.
What about something like this:
$num_vars = count( explode( '###', http_build_query($array, '', '###') ) );
You can repeat it for $_POST, $_GET, $_COOKIE, whatever.
It still can't be considered 100% accurate, but I guess it gets pretty close.
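For instance (my sketch; the '###' separator is arbitrary and only needs to be a string that cannot occur in the data):
$num_vars = count(explode('###', http_build_query($_POST, '', '###')));
if ($num_vars >= (int) ini_get('max_input_vars')) {
    // $_POST may have been truncated
}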

socket_read() returning '1'..?

I have recently started practicing with sockets in PHP and hit an issue for which I can find no documentation. I've seen similar cases in C++, but no clear answer to this. The code:
do {
    $input = socket_read($client, 12, PHP_BINARY_READ);
    echo $input;
} while (TRUE);
is supposed to block on the socket (code for creation, bind, etc. not included) and get either 12 bytes or whatever information is available from the other side.
Oddly, I just get a '1' in the variable $input if 12 bytes are read. If I send more than 12 bytes from the client side then I receive '1[REST_OF_DATA]' as the value of $input.
Any idea why this is happening?
If I change this to more data and to PHP_NORMAL_READ then I correctly receive the data.
The PHP manual online does not say anything about socket_read returning '1'.
EDIT: OK, thanks for your early answers :). I am saving to a file and reading (not echoing to the browser), expecting any character. I think I may have just discovered something that could be useful if someone with knowledge of C++ sockets can verify. Anyway, my read code was actually this (not what I posted above):
do {
    $input = ($seq_id == 0) ? socket_read($client, 12, PHP_BINARY_READ) : socket_read($client, 1024, PHP_BINARY_READ);
    echo $input;
} while (TRUE);
I was expecting 12 bytes on the first read, then chunks of 1024, hence the condition check. The weird '1' comes from this. If I replace that with the line I posted above, the data is read normally. In fact, even reading like this:
$input = ($seq_ID == 0) ? socket_read($client, 12, PHP_BINARY_READ) : socket_read($client, 12, PHP_BINARY_READ);
results in: 1st read = '1', 2nd read = correct data, 3rd read = correct data...
The 12 you specify is the maximum length to read, so 12 can mean a returned string of any size from 0-12 characters (binary string in PHP, 1 char = 1 byte).
Additionally, as the data can be binary, I suggest you use var_dump and a hex dump of the returned string value to find out how many bytes were actually returned; echo might hide some control characters, and your browser might hide whitespace.
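Something like this (a sketch) shows the exact bytes instead of whatever echo renders:
$input = socket_read($client, 12, PHP_BINARY_READ);
var_dump($input);              // shows the true type and byte length
echo bin2hex($input), PHP_EOL; // raw bytes as hex, nothing hidden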
For the comments above: yes, $seq_id should increment; I just wanted to shorten the code. So now the answer is not important to me anymore, but it remains an enigma: after upgrading my Ubuntu version this month I have been unable to replicate the error with the same script:
<?php
set_time_limit(0);
$address = "192.168.1.1";
$port = 3320;
$server_users = 3;
$mysock = socket_create(AF_INET, SOCK_STREAM, SOL_TCP) or die("Could not create socket\n");
$bind_result = socket_bind($mysock, $address, $port) or die("Could not bind to address\n");
$listen_result = socket_listen($mysock, $server_users) or die("Could not set up socket listener\n");
$client = socket_accept($mysock) or die("Could not accept the connection to socket\n");
$seq_id = 0;
do {
    $input = ($seq_id == 0) ? socket_read($client, 12, PHP_BINARY_READ) : socket_read($client, 1024, PHP_BINARY_READ);
    echo $input;
} while (TRUE);
?>
I execute it in a terminal:
$ php -q myscript.php
And test it using netcat:
$ netcat 192.168.1.1 3320
Note that the question is about socket_read returning '1' as its result, which is not documented anywhere.

PHP long poll fails

I have this loop to long poll:
$time = time(); // send a blank response after 15 sec
while ((time() - $time) < 15) {
    $last_modif = filemtime("./logs.txt");
    if ($_SESSION['lasttime'] != $last_modif) {
        $_SESSION['lasttime'] = $last_modif;
        $logs = file_get_contents("./logs.txt");
        print nl2br($logs);
        ob_flush();
        flush();
        die();
    }
    usleep(10000);
}
Problem is: the if condition is never entered in the middle of the while loop, even if logs.txt is modified. I have to wait 15 seconds until the next call to this file to obtain the updated content (so it becomes a regular, "setTimeout style" AJAX polling, not a long poll). Any idea why?
This is because of the filemtime() function: its results are cached, so each time your loop executes, the timestamp stays the same for the whole 15 seconds.
I have not tried it myself, but according to W3Schools:
The results of this function are cached. Use clearstatcache() to clear the cache.
Hope that helps!
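So the fix would presumably be to call clearstatcache() inside the loop, something like this (a sketch based on the loop above):
$time = time();
while ((time() - $time) < 15) {
    clearstatcache();                      // drop the cached stat results
    $last_modif = filemtime("./logs.txt"); // now reflects real file changes
    // ... same body as in the question ...
    usleep(10000);
}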
