I have run into a rather strange issue with a particular part of a large PHP application. The portion of the application in question loads data from MySQL (mostly integer data) and builds a JSON string which gets output to the browser. These requests were taking sevar seconds (8 - 10 seconds each) in Chrome's developer tools as well as via curl. However the PHP shutdown handler I had reported that the requests were executing in less than 1 second.
In order to debug I added a call to fastcgi_finish_request(), and suddenly my shutdown handler reported the same time as Chrome / curl.
With some debugging, I narrowed it down to a particular function. I created the following simple test case:
<?php
$start_time = DFStdLib::exec_time();
$product = new ApparelQuoteProduct(19);
$pmatrix = $product->productMatrix();
// This function call is the problem:
$ranges = $pmatrix->ranges();
$end_time = DFStdLib::exec_time();
$duration = $end_time - $start_time;
echo "Output generation duration was: $duration sec";
fastcgi_finish_request();
$fastcgi_finish_request_duration = DFStdLib::exec_time() - $end_time;
DFSkel::log(DFSkel::LOG_INFO,"Output generation duration was: $duration sec; fastcgi_finish_request Duration was: $fastcgi_finish_request_duration sec");
If I call $pmatrix->ranges() (which is a function that executes a number of calls to mysql_query to fetch data and build an in-memory PHP object structure from that data) then I get the output:
Output generation duration was: 0.2563910484314 sec; fastcgi_finish_request Duration was: 7.3854329586029 sec
in my log file. Note that the call to $pmatrix->ranges() does not take long at all, yet somehow it causes the PHP FastCGI handler to take seven seconds to fihish the request. (This is true even if I don't call fastcgi_finish_request -- the browser takes 7-8 seconds to display the data either way)
If I comment out the call to $pmatrix->ranges() I get:
Output generation duration was: 0.0016419887542725 sec; fastcgi_finish_request Duration was: 0.00035214424133301 sec
I can post the entire source for the $pmatrix->ranges() function, but it's very long. I'd like some advice on where to even start looking.
What is it about the PHP FastCGI request process which would even cause such behavior? Does it call destructor functions / garbage collection? Does it close open resources? How can I troubleshoot this further?
EDIT: Here's a larger source sample:
<?php
class ApparelQuote_ProductPricingMatrix_TestCase
{
protected $myProductId;
protected $myQuantityRanges;
private $myProduct;
protected $myColors;
protected $mySizes;
protected $myQuantityPricing;
public function __construct($product)
{
$this->myProductId = intval($product);
}
/**
* Return an array of all ranges for this matrix.
*
* #return array
*/
public function ranges()
{
$this->myLoadPricing();
return $this->myQuantityRanges;
}
protected function myLoadPricing($force=false)
{
if($force || !$this->myQuantityPricing)
{
$this->myColors = array();
$this->mySizes = array();
$priceRec_finder = new ApparelQuote_ProductPricingRecord();
$priceRec_finder->_link = Module_ApparelQuote::dbLink();
$found_recs = $priceRec_finder->find(_ALL,"`product_id`={$this->myProductId}","`qtyrange_id`,`color_id`");
$qtyFinder = new ApparelQuote_ProductPricingQtyRange();
$qtyFinder->_link = Module_ApparelQuote::dbLink();
$this->myQuantityRanges = $qtyFinder->find(_ALL,"`product_id`=$this->myProductId");
$this->myQuantityPricing = array();
foreach ($found_recs as &$r)
{
if(false) $r = new ApparelQuote_ProductPricingRecord();
if(!isset($this->myColors[$r->color_id]))
$this->myColors[$r->color_id] = true;
if(!isset($this->mySizes[$r->size_id]))
$this->mySizes[$r->size_id] = true;
if(!is_array($this->myQuantityPricing[$r->qtyrange_id]))
$this->myQuantityPricing[$r->qtyrange_id] = array();
if(!is_array($this->myQuantityPricing[$r->qtyrange_id][$r->color_id]))
$this->myQuantityPricing[$r->qtyrange_id][$r->color_id] = array();
$this->myQuantityPricing[$r->qtyrange_id][$r->color_id][$r->size_id] = &$r;
}
$this->myColors = array_keys($this->myColors);
$this->mySizes = array_keys($this->mySizes);
}
}
}
$start_time = DFStdLib::exec_time();
$pmatrix = new ApparelQuote_ProductPricingMatrix_TestCase(19);
$ranges = $pmatrix->ranges();
$end_time = DFStdLib::exec_time();
$duration = $end_time - $start_time;
echo "Output generation duration was: $duration sec";
fastcgi_finish_request();
$fastcgi_finish_request_duration = DFStdLib::exec_time() - $end_time;
DFSkel::log(DFSkel::LOG_INFO,"Output generation duration was: $duration sec; fastcgi_finish_request Duration was: $fastcgi_finish_request_duration sec");
Upon continued debugging I have narrowed down to the following lines from the above:
if(!is_array($this->myQuantityPricing[$r->qtyrange_id][$r->color_id]))
$this->myQuantityPricing[$r->qtyrange_id][$r->color_id] = array();
These statements are building an in-memory array structure of all the data loaded from MySQL. If I comment these out, then fastcgi_finish_request takes roughly 0.0001 seconds to run. If I do not comment them out, then fastcgi_finish_request takes 7+ seconds to run.
It's actually the function call to is_array that's the issue here
Changing to:
if(!isset($this->myQuantityPricing[$r->qtyrange_id][$r->color_id]))
Resolves the problem. Why is this?
Related
I have an article on my website with the (markdown) content:
# PHP Proper Class Name
Class names in PHP are case insensitve. If you have a class declaration like:
```php
class MyWeirdClass {}
```
you can instantiate it with `new myWEIRDclaSS()` or any other variation on the case. In some instances, you may want to know, what is the correct, case-sensitive class name.
### Example Use case
For example, in one of my libraries under construction [API Doccer](https://github.com/ReedOverflow/PHP-API-Doccer), I can view documentation for a class at url `/doc/class/My-Namespace-Clazzy/` and if you enter the wrong case, like `/doc/class/my-NAMESPACE-CLAzzy`, it should automatically redirect to the proper-cased class. To do this, I use the reflection method below as it is FAR more performant than the `get_delcared_classes` method
## Reflection - get proper case
Credit goes to [l00k on StackOverflow](https://stackoverflow.com/a/35222911/802469)
```php
$className = 'My\caseINAccuRATE\CLassNamE';
$reflection = new ReflectionClass($className);
echo $reflection->getName();
```
results in `My\CaseInaccurate\ClassName`;
Running the benchmark (see below) on localhost on my laptop, getting the proper case class name of 500 classes took about 0.015 seconds, as opposed to ~0.050 seconds using the `get_declared_classes` method below.
## get_declared_classes - get proper case
This was my idea, as I hadn't even considered using reflection, until I saw [l00k's answer on StackOverflow](https://stackoverflow.com/a/35222911/802469). Guessing it would be less efficient, I wrote the code and figured it out anyway, because it's fun!
```php
$wrongCaseName = 'Some\classy\THIng';
class_exists($wrongCaseName); //so it gets autoloaded if not already done
$classes = get_declared_classes();
$map = array_combine(array_map('strtolower',$classes),$classes);
$proper = $map[strtolower($wrongCaseName)];
```
results in `$proper = 'Some\Classy\Thing'`;
Running the bencmark (see below) on localhost on my laptop, getting the proper case class name of 500 classes took about 0.050 seconds, as opposed to ~0.015 seconds with reflection (above).
## Benchmark:
I used the following code to do the benchmark, removing the `classes` directory between each run of the benchmark. It's not perfect. At all. But it gets the job done well enough, I think:
```php
<?php
$times = [];
$times['begin'] = microtime(TRUE);
spl_autoload_register(function($className){
if (file_exists($name=__DIR__.'/classes/'.strtolower($className).'.php')){
include($name);
}
});
if (is_dir(__DIR__.'/classes'))return;
mkdir(__DIR__.'/classes');
function generateRandomString($length = 10) {
$characters = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
$charactersLength = strlen($characters);
$randomString = '';
for ($i = 0; $i < $length; $i++) {
$randomString .= $characters[rand(0, $charactersLength - 1)];
}
return $randomString;
}
$times['start_file_write'] = microtime(TRUE);
$names = [];
for ($i=0;$i<500;$i++){
$className = generateRandomString(10);
$file = __DIR__.'/classes/'.strtolower($className).'.php';
if (file_exists($file)){
$i = $i-1;
continue;
}
$code = "<?php \n\n".'class '.$className.' {}'."\n\n ?>";
file_put_contents($file,$code);
$names[] = strtoupper($className);
}
$times['begin_get_declared_classes_benchmark'] = microtime(TRUE);
$propers = [];
// foreach($names as $index => $name){
// $wrongCaseName = strtoupper($name);
// class_exists($wrongCaseName); //so it gets autoloaded if not already done
// $classes = get_declared_classes();
// $map = array_combine(array_map('strtolower',$classes),$classes);
// $proper = $map[strtolower($wrongCaseName)];
// if ($index%20===0){
// $times['intermediate_bench_'.$index] = microtime(TRUE);
// }
// $propers[] = $proper;
// }
// the above commented lines are the get_declared_classes() method.
// the foreach below is for reflection.
foreach ($names as $index => $name){
$className = strtoupper($name);
$reflection = new ReflectionClass($className);
if ($index%20===0){
$times['intermediate_bench_'.$index] = microtime(TRUE);
}
$propers[] = $reflection->getName();
}
$times['end_get_declared_classes_benchmark'] = microtime(TRUE);
$start = $times['begin'];
$bench = $times['begin_get_declared_classes_benchmark'];
$lastTime = 0;
foreach($times as $key => $time){
echo "\nTime since begin:".($time-$start);
echo "\nTime since last: ".($time-$lastTime)." key was {$key}";
echo "\nTime since bench start: ".($time - $bench);
$lastTime = $time;
}
print_r($times);
print_r($propers);
exit;
```
### Results
```
// get_declared_classes method
//Time since bench start: 0.052499055862427 is total time for processing get_declared_classes w/ $i=500
//Time since bench start: 0.047168016433716
// last bench time Time since begin:0.062150955200195
// 100 intermediate bench: Time since bench start: 0.0063230991363525
// 200 : Time since bench start: 0.015070915222168
// 300 intermediate bench: Time since bench start: 0.02455997467041
// 400 intermediate bench: Time since bench start: 0.033944129943848
// 480 : Time since bench start: 0.044310092926025
//reflection method:
//Time since bench start: 0.01493501663208
//Time since bench start: 0.017416954040527
// 100 intermediate: Time since bench start: 0.0035450458526611
// 200 intermediate: Time since bench start: 0.0066778659820557
// 300 intermediate: Time since bench start: 0.010055065155029
// 400 intermediate: Time since bench start: 0.014182090759277
// 480 intermediate: Time since bench start: 0.01679801940918
```
#### Results' notes
- "Time since bench start" is the entire time it took to run all the iterations. I share this twice above.
- "100 Intermediate" (200, 300, etc) are actually the results at 120, 220, etc... I messed up in copy+pasting results & didn't want to do it again. Yes. I'm lazy :)
- The results would of course vary between runs of the code, but it's pretty clear that the reflection option is significantly faster.
- All was run on a localhost server on an Acer laptop.
- PHP Version 7.2.19-0ubuntu0.19.04.1 (from `php info()`)
As shown above, I'm able to submit the article & everything works as expected - saves to DB & everything. The very last line, if I change php info() to phpinfo() (removing the space), I get this error:
Not Acceptable!
An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security.
When I try to submit with phpinfo() (no space), my PHP does not execute at all and I only get this error. The network tab in firefox shows "406 Not Acceptable" for the status code. Nothing is being written to my error log in $_SERVER['DOCUMENT_ROOT'].'/error_log', which is where all the PHP errors log to, anyway. In my home folder, there is a logs folder, but it remains empty. No logs in /etc/ or /etc/my_website_name.com either.
What could be causing this problem? Is there something in the PHP.ini I could change? Could .htaccess affect this at all?
At the very least, how do I troubleshoot this problem?
Troubleshooting
I can submit an article which only contains - PHP Version 7.2.19-0ubuntu0.19.04.1 (from `phpinfo()`) in the body.
If I remove phpinfo() and add more content to the body of the post (more total data being submitted), it works
putting a space, like php info() makes it work, and is how the post currently exists.
I don't know what else to do
I am using Simple MDE now, but it happened on multiple other occasions before I started using Simple MDE. It has only been with relatively large posts that also contain code.
I am on Shared Hosting with HostGator, using HTTPS:// and PHP 7.2.19
I contacted HostGator. They added something to a white list, but didn't give me intimate details. It fixed the problem.
First agent took awhile, failed to resolve the issue, and disconnected prematurely.
The second agent was reasonably prompt & resolved the problem, saying I shouldn't have this issue with similar types of POST requests which contain code.
I have a script that makes multiple POST requests to an API. Rough outline of the script is as follows:
define("MAX_REQUESTS_PER_MINUTE", 100);
function apirequest ($data) {
// post data using cURL
}
while ($data = getdata ()) {
apirequest($data);
}
The API is throttled, it allows users to post up to 100 requests per minute. Additional requests return HTTP error + Retry-After response until the window resets. Note that the server can take anywhere between 100 milliseconds to 100 seconds to process the request.
I need to make sure that my function does not execute more than 100 times per minute. I have tried usleep function to introduce a constant delay of 0.66 seconds but this simply adds one extra minute per minute. An arbitrary value such as 0.1 second results in error one time or another. I log all requests inside a database table along with time, the other solution I used is to probe the table and count the number of requests made within last 60 seconds.
I need a solution that wastes as little time as possible.
I've put Derek's suggestion into code.
class Throttler {
private $maxRequestsPerMinute;
private $getdata;
private $apirequest;
private $firstRequestTime = null;
private $requestCount = 0;
public function __construct(
int $maxRequestsPerMinute,
$getdata,
$apirequest
) {
$this->maxRequestsPerMinute = $maxRequestsPerMinute;
$this->getdata = $getdata;
$this->apirequest = $apirequest;
}
public function run() {
while ($data = call_user_func($this->getdata)) {
if ($this->requestCount >= $this->maxRequestsPerMinute) {
sleep(ceil($this->firstRequestTime + 60 - microtime(true)));
$this->firstRequestTime = null;
$this->requestCount = 0;
}
if ($this->firstRequestTime === null) {
$this->firstRequestTime = microtime(true);
}
++$this->requestCount;
call_user_func($this->apirequest, $data);
}
}
}
$throttler = new Throttler(100, 'getdata', 'apirequest');
$throttler->run();
UPD. I've put its updated version on Packagist so you can use it with Composer: https://packagist.org/packages/ob-ivan/throttler
To install:
composer require ob-ivan/throttler
To use:
use Ob_Ivan\Throttler\JobInterface;
use Ob_Ivan\Throttler\Throttler;
class SalmanJob implements JobInterface {
private $data;
public function next(): bool {
$this->data = getdata();
return (bool)$this->data;
}
public function execute() {
apirequest($this->data);
}
}
$throttler = new Throttler(100, 60);
$throttler->run(new SalmanJob());
Please note there are other packages providing the same functionality (I haven't tested any of them):
https://packagist.org/packages/franzip/throttler
https://packagist.org/packages/andrey-mashukov/throttler
https://packagist.org/packages/queryyetsimple/throttler
I would start by recording initial time when first request is to be made and then count how many requests are being made. Once 60 requests have been made make sure the current time is at least 1 minute after initial time. If not usleep for however long is left until minute is reached. When minute is reached reset count and initial time value.
Here is my go at this:
define("MAX_REQUESTS_PER_MINUTE", 100);
function apirequest() {
static $startingTime;
static $requestCount;
if ($startingTime === null) {
$startingTime = time();
}
if ($requestCount === null) {
$requestCount = 0;
}
$consumedTime = time() - $startingTime;
if ($consumedTime >= 60) {
$startingTime = time();
$requestCount = 0;
} elseif ($requestCount === MAX_REQUESTS_PER_MINUTE) {
sleep(60 - $consumedTime);
$startingTime = time();
$requestCount = 0;
}
$requestCount++;
echo sprintf("Request %3d, Range [%d, %d)", $requestCount, $startingTime, $startingTime + 60) . PHP_EOL;
file_get_contents("http://localhost/apirequest.php");
// the above script sleeps for 200-400ms
}
for ($i = 0; $i < 1000; $i++) {
apirequest();
}
I've tried the naive solutions of static sleeps, counting requests, and doing simple math but they tended to be quite inaccurate, unreliable, and generally introduced far more sleeping that was necessary when they could have been doing work. What you want is something that only starts issuing consequential sleeps when you're approaching your rate-limit.
Lifting my solution from a previous problem for those sweet, sweet internet points:
I used some math to figure out a function that would sleep for the correct sum of time over the given request, and allow me to ramp it up exponentially towards the end.
If we express the sleep as:
y = e^( (x-A)/B )
where A and B are arbitrary values controlling the shape of the curve, then the sum of all sleeps, M, from 0 to N requests would be:
M = 0∫N e^( (x-A)/B ) dx
This is equivalent to:
M = B * e^(-A/B) * ( e^(N/B) - 1 )
and can be solved with respect to A as:
A = B * ln( -1 * (B - B * e^(N/B)) / M )
While solving for B would be far more useful, since specifying A lets you define a what point the graph ramps up aggressively, the solution to that is mathematically complex, and I've not been able to solve it myself or find anyone else that can.
/**
* #param int $period M, window size in seconds
* #param int $limit N, number of requests permitted in the window
* #param int $used x, current request number
* #param int $bias B, "bias" value
*/
protected static function ratelimit($period, $limit, $used, $bias=20) {
$period = $period * pow(10,6);
$sleep = pow(M_E, ($used - self::biasCoeff($period, $limit, $bias))/$bias);
usleep($sleep);
}
protected static function biasCoeff($period, $limit, $bias) {
$key = sprintf('%s-%s-%s', $period, $limit, $bias);
if( ! key_exists($key, self::$_bcache) ) {
self::$_bcache[$key] = $bias * log( -1 * ( ($bias - $bias * pow(M_E, $limit/$bias)) / $period ) );
}
return self::$_bcache[$key];
}
With a bit of tinkering I've found that B = 20 seems to be a decent default, though I have no mathematical basis for it. Something something slope mumble mumble exponential bs bs.
Also, if anyone wants to solve that equation for B for me I've got a bounty up on math.stackexchange.
Though I believe that our situations differ slightly in that my API provider's responses all included the number of available API calls, and the number still remaining within the window. You may need additional code to track this on your side instead.
Ok, so lets start slow...
I have a pthreads script running and working for me, tested and working 100% of the time when I run it manually from the command line via ssh. The script is as follows with the main thread process code adjusted to simulate random process' run time.
class ProcessingPool extends Worker {
public function run(){}
}
class LongRunningProcess extends Threaded implements Collectable {
public function __construct($id,$data) {
$this->id = $id;
$this->data = $data;
}
public function run() {
$data = $this->data;
$this->garbage = true;
$this->result = 'START TIME:'.time().PHP_EOL;
// Here is our actual logic which will be handled within a single thread (obviously simulated here instead of the real functionality)
sleep(rand(1,100));
$this->result .= 'ID:'.$this->id.' RESULT: '.print_r($this->data,true).PHP_EOL;
$this->result .= 'END TIME:'.time().PHP_EOL;
$this->finished = time();
}
public function __destruct () {
$Finished = 'EXITED WITHOUT FINISHING';
if($this->finished > 0) {
$Finished = 'FINISHED';
}
if ($this->id === null) {
print_r("nullified thread $Finished!");
} else {
print_r("Thread w/ ID {$this->id} $Finished!");
}
}
public function isGarbage() : bool { return $this->garbage; }
public function getData() {
return $this->data;
}
public function getResult() {
return $this->result;
}
protected $id;
protected $data;
protected $result;
private $garbage = false;
private $finished = 0;
}
$LoopDelay = 500000; // microseconds
$MinimumRunTime = 300; // seconds (5 minutes)
// So we setup our pthreads pool which will hold our collection of threads
$pool = new Pool(4, ProcessingPool::class, []);
$Count = 0;
$StillCollecting = true;
$CountCollection = 0;
do {
// Grab all items from the conversion_queue which have not been processed
$result = $DB->prepare("SELECT * FROM `processing_queue` WHERE `processed` = 0 ORDER BY `queue_id` ASC");
$result->execute();
$rows = $result->fetchAll(PDO::FETCH_ASSOC);
if(!empty($rows)) {
// for each of the rows returned from the queue, and allow the workers to run and return
foreach($rows as $id => $row) {
$update = $DB->prepare("UPDATE `processing_queue` SET `processed` = 1 WHERE `queue_id` = ?");
$update->execute([$row['queue_id']]);
$pool->submit(new LongRunningProcess($row['fqueue_id'],$row));
$Count++;
}
} else {
// 0 Rows To Add To Pool From The Queue, Do Nothing...
}
// Before we allow the loop to move on to the next part, lets try and collect anything that finished
$pool->collect(function ($Processed) use(&$CountCollection) {
global $DB;
$data = $Processed->getData();
$result = $Processed->getResult();
$update = $DB->prepare("UPDATE `processing_queue` SET `processed` = 2 WHERE `queue_id` = ?");
$update->execute([$data['queue_id']]);
$CountCollection++;
return $Processed->isGarbage();
});
print_r('Collecting Loop...'.$CountCollection.'/'.$Count);
// If we have collected the same total amount as we have processed then we can consider ourselves done collecting everything that has been added to the database during the time this script started and was running
if($CountCollection == $Count) {
$StillCollecting = false;
print_r('Done Collecting Everything...');
}
// If we have not reached the full MinimumRunTime that this cron should run for, then lets continue to loop
$EndTime = microtime(true);
$TimeElapsed = ($EndTime - $StartTime);
if(($TimeElapsed/($LoopDelay/1000000)) < ($MinimumRunTime/($LoopDelay/1000000))) {
$StillCollecting = true;
print_r('Ended To Early, Lets Force Another Loop...');
}
usleep($LoopDelay);
} while($StillCollecting);
$pool->shutdown();
So while the above script will run via a command line (which has been adjusted to the basic example, and detailed processing code has been simulated in the above example), the below command gives a different result when run from a cron setup for every 5 minutes...
/opt/php7zts/bin/php -q /home/account/cron-entry.php file=every-5-minutes/processing-queue.php
The above script, when using the above command line call, will loop over and over during the run time of the script and collect any new items from the DB queue, and insert them into the pool, which allows 4 processes at a time to run and finish, which is then collected and the queue is updated before another loop happens, pulling any new items from the DB. This script will run until we have processed and collected all processes in the queue during the execution of the script. If the script has not run for the full 5 minute expected period of time, the loop is forced to continue checking the queue, if the script has run over the 5 minute mark it allows any current threads to finish & be collected before closing. Note that the above code also includes a code based "flock" functionality which makes future crons of this idle loop and exit or start once the lock has lifted, ensuring that the queue and threads are not bumping into each other. Again, ALL OF THIS WORKS FROM THE COMMAND LINE VIA SSH.
Once I take the above command, and put it into a cron to run for every 5 minutes, essentially giving me a never ending loop, while maintaining memory, I get a different result...
That result is described as follows... The script starts, checks the flock, and continues if the lock is not there, it creates the lock, and runs the above script. The items are taken from the queue in the DB, and inserted into the pool, the pool fires off the 4 threads at a time as expected.. But the unexpected result is that the run() command does not seem to be executed, and instead the __destruct function runs, and a "Thread w/ ID 2 FINISHED!" type of message is returned to the output. This in turn means that the collection side of things does not collect anything, and the initiating script (the cron script itself /home/account/cron-entry.php file=every-5-minutes/processing-queue.php) finishes after everything has been put into the pool, and destructed. Which prematurely "finishes" the cron job, since there is nothing else to do but loop and pull nothing new from the queue, since they are considered "being processed" when processed == 1 in the queue.
The question then finally becomes... How do I make the cron's script aware of the threads that where spawned and run() them without closing the pool out before they can do anything?
(note... if you copy / paste the provided script, note that I did not test it after removing the detailed logic, so it may need some simple fixes... please do not nit-pick said code, as the key here is that pthreads works if the script is executed FROM the Command Line, but fails to properly run when the script is executed FROM a CRON. If you plan on commenting with non-constructive criticism, please go use your fingers to do something else!)
Joe Watkins! I Need Your Brilliance! Thanks In Advance!
After all of that, it seems that the issue was with regards to user permissions. I was setting this specific cron up inside of cpanel, and when running the command manually I was logged in as root.
After setting this command up in roots crontab, I was able to get it to successfully run the threads from the pool. Only issue I have now is some threads never finish, and sometimes I am unable to close the pool. But this is a different issue, so I will open another question elsewhere.
For those running into this issue, make sure you know who the owner of the cron is as it matters with php's pthreads.
I have been trying to implement multi-threading in php to achieve multi-upload using pthreads php.
From my understanding of multi-threading, this is how I envisioned it working.
I would upload a file,the file will start uploading in the background; even if the file is not completed to upload, another instance( thread ) will be created to upload another file. I would make multiple upload requests using AJAXand multiple files would start uploading, I would get the response of a single request individually and I can update the status of upload likewise in my site.
But this is not how it is working. This is the code that I got from one of the pthread question on SO, but I do not have the link( sorry!! ).
I tested this code to see of this really worked like I envisioned. This is the code I tested, I changed it a little.
<?php
error_reporting(E_ALL);
class AsyncWebRequest extends Thread {
public $url;
public $data;
public function __construct ($url) {
$this->url = $url;
}
public function run () {
if ( ($url = $this->url) ){
/*
* If a large amount of data is being requested, you might want to
* fsockopen and read using usleep in between reads
*/
$this->data = file_get_contents ($url);
echo $this->getThreadId ();
} else{
printf ("Thread #%lu was not provided a URL\n", $this->getThreadId ());
}
}
}
$t = microtime (true);
foreach( ["http://www.google.com/?q=". rand () * 10, 'http://localhost', 'https://facebook.com'] as $url ){
$g = new AsyncWebRequest( $url );
/* starting synchronized */
if ( $g->start () ){
printf ( $url ." took %f seconds to start ", microtime (true) - $t);
while ($g->isRunning ()) {
echo ".";
usleep (100);
}
if ( $g->join () ){
printf (" and %f seconds to finish receiving %d bytes\n", microtime (true) - $t, strlen ($g->data));
} else{
printf (" and %f seconds to finish, request failed\n", microtime (true) - $t);
}
}
echo "<hr/>";
}
So what I expected from this code was it would hit google.com, localhost and facebook.com simultaneously and run their individual threads. But every request is waiting for another request to complete.
For this it is clearly waiting for first response to complete before it is making another request because time the request are sent are after the request from the previous request is complete.
So, This is clearly not the way to achieve what I am trying to achieve. How do I do this?
You might want to look at multi curl for such multiple external requests. Pthreads is more about internal processes.
Just for further reference, you are starting threads 1 by 1 and waiting for them to finish.
This code: while ($g->isRunning ()) doesn't stop until the thread is finished. It's like having a while (true) in a for. The for executes 1 step at a time.
You need to start the threads, add them in an array, and in another while loop check each of the threads if it stopped and remove them from the array.
I have read some articles (like this, or this), and all of them give me the same way to implements Long Polling in PHP (using usleep() and loop), like that:
$source; // some data source - db, etc
$data = null; // our return data
$timeout = 30; // timeout in seconds
$now = time(); // start time
// loop for $timeout seconds from $now until we get $data
while((time() - $now) < $timeout) {
// fetch $data
$data = $source->getData();
// if we got $data, break the loop
if (!empty($data)) break;
// wait 1 sec to check for new $data
usleep(10000);
}
// if there is no $data, tell the client to re-request (arbitrary status message)
if (empty($data)) $data = array('status'=>'no-data');
// send $data response to client
echo json_encode($data);
Is there another way? I know that PHP is a script language only, but i would like a way that base on event rather than checking and doing or waiting until timeout. It maybe be something like Continuations in Java that would be perfect.
You could try React: http://reactphp.org/
Is not very mature yet, but it may suit your needs. Instead of doing long pooling, you can do it async.
I would recommend: http://ape-project.org/
mature and scalable