I have read some articles (like this, or this), and all of them give me the same way to implements Long Polling in PHP (using usleep() and loop), like that:
$source; // some data source - db, etc
$data = null; // our return data
$timeout = 30; // timeout in seconds
$now = time(); // start time
// loop for $timeout seconds from $now until we get $data
while((time() - $now) < $timeout) {
// fetch $data
$data = $source->getData();
// if we got $data, break the loop
if (!empty($data)) break;
// wait 1 sec to check for new $data
usleep(10000);
}
// if there is no $data, tell the client to re-request (arbitrary status message)
if (empty($data)) $data = array('status'=>'no-data');
// send $data response to client
echo json_encode($data);
Is there another way? I know that PHP is a script language only, but i would like a way that base on event rather than checking and doing or waiting until timeout. It maybe be something like Continuations in Java that would be perfect.
You could try React: http://reactphp.org/
Is not very mature yet, but it may suit your needs. Instead of doing long pooling, you can do it async.
I would recommend: http://ape-project.org/
mature and scalable
Related
I need to write a processor that can potentially send out many HTTP requests to an external service. Since I want to maximize performance, I wish to minimize blocking. I'm using PHP 5.6 and GuzzleHTTP.
GuzzleHTTP does have an option for async requests. But since we do have only 1 thread available in PHP, I need to allocate some time for them to be processed. Unfortunately I only see one way to do it - calling wait which blocks until all the requests are processed. That's not what I want.
Instead I'd like to have some method that handles whatever has arrived, and then returns. So that I can do something along the lines of:
$allRequests = [];
while ( !checkIfNeedToEnd() ) {
$newItems = getItemsFromQueue();
$allRequests = $allRequests + spawnRequests($newItems);
GuzzleHttp::processWhatYouCan($allRequests);
removeProcessedRequests($allRequests);
}
Is this possible?
Alright... figured it out myself:
$handler = new \GuzzleHttp\Handler\CurlMultiHandler();
$client = new \GuzzleHttp\Client(['handler' => $handler]);
$promise1 = $client->getAsync("http://www.stackoverflow.com");
$promise2 = $client->getAsync("http://localhost/");
$doneCount = 0;
$promise1->then(function() use(&$doneCount) {
$doneCount++;
echo 'Promise 1 done!';
});
$promise2->then(function() use(&$doneCount) {
$doneCount++;
echo 'Promise 2 done!';
});
$last = microtime(true);
while ( $doneCount < 2 ) {
$now = microtime(true);
$delta = round(($now-$last)*1000);
echo "tick($delta) ";
$last = $now;
$handler->tick();
}
And the output I get is:
tick(0) tick(6) tick(1) tick(0) tick(1001) tick(10) tick(96) Promise 2 done!tick(97) Promise 1 done!
The magic ingredient is creating the CurlMultiHandler yoursef and then calling tick() on that when it's convenient. After that it's promises as usual. And if the queue is empty, tick() returns immediately.
Note that it can still block for up to 1 second (default) if there is no activity. This can be also changed if needed:
$handler = new \GuzzleHttp\Handler\CurlMultiHandler(['select_timeout' => 0.5]);
The value is in seconds, but with floating point.
I have been trying to implement multi-threading in php to achieve multi-upload using pthreads php.
From my understanding of multi-threading, this is how I envisioned it working.
I would upload a file,the file will start uploading in the background; even if the file is not completed to upload, another instance( thread ) will be created to upload another file. I would make multiple upload requests using AJAXand multiple files would start uploading, I would get the response of a single request individually and I can update the status of upload likewise in my site.
But this is not how it is working. This is the code that I got from one of the pthread question on SO, but I do not have the link( sorry!! ).
I tested this code to see of this really worked like I envisioned. This is the code I tested, I changed it a little.
<?php
error_reporting(E_ALL);
class AsyncWebRequest extends Thread {
public $url;
public $data;
public function __construct ($url) {
$this->url = $url;
}
public function run () {
if ( ($url = $this->url) ){
/*
* If a large amount of data is being requested, you might want to
* fsockopen and read using usleep in between reads
*/
$this->data = file_get_contents ($url);
echo $this->getThreadId ();
} else{
printf ("Thread #%lu was not provided a URL\n", $this->getThreadId ());
}
}
}
$t = microtime (true);
foreach( ["http://www.google.com/?q=". rand () * 10, 'http://localhost', 'https://facebook.com'] as $url ){
$g = new AsyncWebRequest( $url );
/* starting synchronized */
if ( $g->start () ){
printf ( $url ." took %f seconds to start ", microtime (true) - $t);
while ($g->isRunning ()) {
echo ".";
usleep (100);
}
if ( $g->join () ){
printf (" and %f seconds to finish receiving %d bytes\n", microtime (true) - $t, strlen ($g->data));
} else{
printf (" and %f seconds to finish, request failed\n", microtime (true) - $t);
}
}
echo "<hr/>";
}
So what I expected from this code was it would hit google.com, localhost and facebook.com simultaneously and run their individual threads. But every request is waiting for another request to complete.
For this it is clearly waiting for first response to complete before it is making another request because time the request are sent are after the request from the previous request is complete.
So, This is clearly not the way to achieve what I am trying to achieve. How do I do this?
You might want to look at multi curl for such multiple external requests. Pthreads is more about internal processes.
Just for further reference, you are starting threads 1 by 1 and waiting for them to finish.
This code: while ($g->isRunning ()) doesn't stop until the thread is finished. It's like having a while (true) in a for. The for executes 1 step at a time.
You need to start the threads, add them in an array, and in another while loop check each of the threads if it stopped and remove them from the array.
some time ago I used query to check the status of a minecraft server via. php, but I wasn´t so happy with the results. Sometimes it just took more than 10 seconds or didn´t even got the status although the mc server was up and the webserver within the same data center.
Which method do you think would work most stable and with a good performance: query, stream_stocket or something else ?
Should I run the test every 30 seconds via. a cronjob or just cache the results for 30 secs '?
You can create a PHP file with this code and run periodically to store the status.
<?php
/**
* #author Kristaps Karlsons <kristaps.karlsons#gmail.com>
* Licensed under MPL 1.1
*/
function mc_status($host,$port='25565') {
$timeInit = microtime();
// TODO: implement a way to store data (memcached or MySQL?) - please don't overload target server
$fp = fsockopen($host,$port,$errno,$errstr,$timeout=10);
if(!$fp) die($errstr.$errno);
else {
fputs($fp, "\xFE"); // xFE - get information about server
$response = '';
while(!feof($fp)) $response .= fgets($fp);
fclose($fp);
$timeEnd = microtime();
$response = str_replace("\x00", "", $response); // remove NULL
//$response = explode("\xFF", $response); // xFF - data start (old version, prior to 1.0?)
$response = explode("\xFF\x16", $response); // data start
$response = $response[1]; // chop off all before xFF (could be done with regex actually)
//echo(dechex(ord($response[0])));
$response = explode("\xA7", $response); // xA7 - delimiter
$timeDiff = $timeEnd-$timeInit;
$response[] = $timeDiff < 0 ? 0 : $timeDiff;
}
return $response;
}
$data = mc_status('mc.exs.lv','25592'); // even better - don't use hostname but provide IP instead (DNS lookup is a waste)
print_r($data); // [0] - motd, [1] - online, [2] - slots, [3] - time of request (in microseconds - use this to present latency information)
Credits: skakri (https://gist.github.com/skakri/2134554)
I have run into a rather strange issue with a particular part of a large PHP application. The portion of the application in question loads data from MySQL (mostly integer data) and builds a JSON string which gets output to the browser. These requests were taking sevar seconds (8 - 10 seconds each) in Chrome's developer tools as well as via curl. However the PHP shutdown handler I had reported that the requests were executing in less than 1 second.
In order to debug I added a call to fastcgi_finish_request(), and suddenly my shutdown handler reported the same time as Chrome / curl.
With some debugging, I narrowed it down to a particular function. I created the following simple test case:
<?php
$start_time = DFStdLib::exec_time();
$product = new ApparelQuoteProduct(19);
$pmatrix = $product->productMatrix();
// This function call is the problem:
$ranges = $pmatrix->ranges();
$end_time = DFStdLib::exec_time();
$duration = $end_time - $start_time;
echo "Output generation duration was: $duration sec";
fastcgi_finish_request();
$fastcgi_finish_request_duration = DFStdLib::exec_time() - $end_time;
DFSkel::log(DFSkel::LOG_INFO,"Output generation duration was: $duration sec; fastcgi_finish_request Duration was: $fastcgi_finish_request_duration sec");
If I call $pmatrix->ranges() (which is a function that executes a number of calls to mysql_query to fetch data and build an in-memory PHP object structure from that data) then I get the output:
Output generation duration was: 0.2563910484314 sec; fastcgi_finish_request Duration was: 7.3854329586029 sec
in my log file. Note that the call to $pmatrix->ranges() does not take long at all, yet somehow it causes the PHP FastCGI handler to take seven seconds to fihish the request. (This is true even if I don't call fastcgi_finish_request -- the browser takes 7-8 seconds to display the data either way)
If I comment out the call to $pmatrix->ranges() I get:
Output generation duration was: 0.0016419887542725 sec; fastcgi_finish_request Duration was: 0.00035214424133301 sec
I can post the entire source for the $pmatrix->ranges() function, but it's very long. I'd like some advice on where to even start looking.
What is it about the PHP FastCGI request process which would even cause such behavior? Does it call destructor functions / garbage collection? Does it close open resources? How can I troubleshoot this further?
EDIT: Here's a larger source sample:
<?php
class ApparelQuote_ProductPricingMatrix_TestCase
{
protected $myProductId;
protected $myQuantityRanges;
private $myProduct;
protected $myColors;
protected $mySizes;
protected $myQuantityPricing;
public function __construct($product)
{
$this->myProductId = intval($product);
}
/**
* Return an array of all ranges for this matrix.
*
* #return array
*/
public function ranges()
{
$this->myLoadPricing();
return $this->myQuantityRanges;
}
protected function myLoadPricing($force=false)
{
if($force || !$this->myQuantityPricing)
{
$this->myColors = array();
$this->mySizes = array();
$priceRec_finder = new ApparelQuote_ProductPricingRecord();
$priceRec_finder->_link = Module_ApparelQuote::dbLink();
$found_recs = $priceRec_finder->find(_ALL,"`product_id`={$this->myProductId}","`qtyrange_id`,`color_id`");
$qtyFinder = new ApparelQuote_ProductPricingQtyRange();
$qtyFinder->_link = Module_ApparelQuote::dbLink();
$this->myQuantityRanges = $qtyFinder->find(_ALL,"`product_id`=$this->myProductId");
$this->myQuantityPricing = array();
foreach ($found_recs as &$r)
{
if(false) $r = new ApparelQuote_ProductPricingRecord();
if(!isset($this->myColors[$r->color_id]))
$this->myColors[$r->color_id] = true;
if(!isset($this->mySizes[$r->size_id]))
$this->mySizes[$r->size_id] = true;
if(!is_array($this->myQuantityPricing[$r->qtyrange_id]))
$this->myQuantityPricing[$r->qtyrange_id] = array();
if(!is_array($this->myQuantityPricing[$r->qtyrange_id][$r->color_id]))
$this->myQuantityPricing[$r->qtyrange_id][$r->color_id] = array();
$this->myQuantityPricing[$r->qtyrange_id][$r->color_id][$r->size_id] = &$r;
}
$this->myColors = array_keys($this->myColors);
$this->mySizes = array_keys($this->mySizes);
}
}
}
$start_time = DFStdLib::exec_time();
$pmatrix = new ApparelQuote_ProductPricingMatrix_TestCase(19);
$ranges = $pmatrix->ranges();
$end_time = DFStdLib::exec_time();
$duration = $end_time - $start_time;
echo "Output generation duration was: $duration sec";
fastcgi_finish_request();
$fastcgi_finish_request_duration = DFStdLib::exec_time() - $end_time;
DFSkel::log(DFSkel::LOG_INFO,"Output generation duration was: $duration sec; fastcgi_finish_request Duration was: $fastcgi_finish_request_duration sec");
Upon continued debugging I have narrowed down to the following lines from the above:
if(!is_array($this->myQuantityPricing[$r->qtyrange_id][$r->color_id]))
$this->myQuantityPricing[$r->qtyrange_id][$r->color_id] = array();
These statements are building an in-memory array structure of all the data loaded from MySQL. If I comment these out, then fastcgi_finish_request takes roughly 0.0001 seconds to run. If I do not comment them out, then fastcgi_finish_request takes 7+ seconds to run.
It's actually the function call to is_array that's the issue here
Changing to:
if(!isset($this->myQuantityPricing[$r->qtyrange_id][$r->color_id]))
Resolves the problem. Why is this?
I am using stream_select() but it returns 0 number of descriptors after few seconds and my function while there is still data to be read.
An unusual thing though is that if you set the time out as 0 then I always get the number of descriptors as zero.
$num = stream_select($read, $w, $e, 0);
stream_select() must be used in a loop
The stream_select() function basically just polls the stream selectors you provided in the first three arguments, which means it will wait until one of the following events occur:
some data arrives
or reaches timeout (set with $tv_sec and $tv_usec) without getting any data.
So recieving 0 as a return value is perfectly normal, it means there was no new data in the current polling cycle.
I'd suggest to put the function in a loop something like this:
$EOF = false;
do {
$tmp = null;
$ready = stream_select($read, $write, $excl, 0, 50000);
if ($ready === false ) {
// something went wrong!!
break;
} elseif ($ready > 0) {
foreach($read as $r) {
$tmp .= stream_get_contents($r);
if (feof($r)) $EOF = true;
}
if (!empty($tmp)) {
//
// DO SOMETHING WITH DATA
//
continue;
}
} else {
// No data in the current cycle
}
} while(!$EOF);
Please note that in this example, the script totally ignores everything aside from the input stream. Also, the third section of the "if" statement is completely optional.
Does it return the number 0 or a FALSE boolean? FALSE means there was some error but zero could be just because of timeout or nothing interesting has happen with the streams and you should do a new select etc.
I would guess this could happen with a zero timeout as it will check and return immediately. Also if you read the PHP manual about stream-select you will see this warning about using zero timeout:
Using a timeout value of 0 allows you to instantaneously poll the status of the streams, however, it is NOT a good idea to use a 0 timeout value in a loop as it will cause your script to consume too much CPU time.
If this is a TCP stream and you want to check for connection close you should check the return value from fread etc to determine if the other peer has closed the conneciton. About the read streams array argument:
The streams listed in the read array will be watched to see if characters become available for reading (more precisely, to see if a read will not block - in particular, a stream resource is also ready on end-of-file, in which case an fread() will return a zero length string).
http://www.php.net/stream_select
Due to a limitation in the current Zend Engine it is not possible to
pass a constant modifier like NULL directly as a parameter to a
function which expects this parameter to be passed by reference.
Instead use a temporary variable or an expression with the leftmost
member being a temporary variable:
<?php $e = NULL; stream_select($r, $w, $e, 0); ?>
I have a similar issue which is caused by the underlying socket timeout.
Eg. I create some streams
$streams = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
Then fork, and use a block such as the following
stream_set_blocking($pipes[1], 1);
stream_set_blocking($pipes[2], 1);
$pipesToRead = array($pipes[1], $pipes[2]);
while (!feof($pipesToRead[0]) || !feof($pipesToRead[1])) {
$reads = $pipesToRead;
$writes = null;
$excepts = $pipesToRead;
$tSec = null;
stream_select($reads, $writes, $excepts, $tSec);
// while it's generating any kind of output, duplicate it wherever it
// needs to go
foreach ($reads as &$read) {
$chunk = fread($read, 8192);
foreach ($streams as &$stream)
fwrite($stream, $chunk);
}
}
Glossing over what other things might be wrong there, my $tSec argument to stream_select is ignored, and the "stream" will timeout after 60 seconds of inactivity and produce an EOF.
If I add the following after creating the streams
stream_set_timeout($streams[0], 999);
stream_set_timeout($streams[1], 999);
Then I get the result I desire, even if there's no activity on the underlying stream for longer than 60 seconds.
I feel that this might be a bug, because I don't want that EOF after 60 seconds of inactivity on the underlying stream, and I don't want to plug in some arbitrarily large value to avoid hitting the timeout if my processes are idle for some time.
In addition, even if the 60 second timeout remains, I think it should just timeout on my stream_select() call and my loop should be able to continue.