Posting with snoopy using proxy takes too long - php

Hi, I have a PHP script which tries to do a POST.
I use the Snoopy class and also a proxy.
I managed to post, but when I use the proxy the posting is extremely slow.
I mean it can take up to 30 minutes.
I don't want to block my script for 30 minutes waiting for a post.
Any idea how I could solve this?
The code looks like:
require('../includes/Snoopy.class.php');
$snoopy = new Snoopy();
$snoopy->proxy_host = "my.proxy.host";
$snoopy->proxy_port = "8080";
$p_data['color'] = 'Red';
$p_data['fruit'] = 'apple';
$snoopy->cookies['vegetable'] = 'carrot';
$snoopy->cookies['something'] = 'value';
$snoopy->submit('http://phpstarter.net/samples/118/data_dump.php', $p_data);
echo '<pre>' . htmlspecialchars($snoopy->results) . '</pre>';

Snoopy's defaults include:
var $read_timeout = 0; // timeout on read operations, in seconds
// set to 0 to disallow timeouts
So you could try setting $snoopy->read_timeout to a reasonable value.
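For example, a minimal sketch based on the code above (the 30-second value is arbitrary, and checking $snoopy->timed_out assumes your Snoopy version exposes that flag, which current releases do):
$snoopy->read_timeout = 30; // give up on reads after 30 seconds instead of waiting indefinitely
$snoopy->submit('http://phpstarter.net/samples/118/data_dump.php', $p_data);
if ($snoopy->timed_out) {
    // handle the timeout instead of blocking for half an hour
    echo 'Request timed out';
} else {
    echo '<pre>' . htmlspecialchars($snoopy->results) . '</pre>';
}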

Related

How to process GuzzleHTTP async requests without blocking?

I need to write a processor that can potentially send out many HTTP requests to an external service. Since I want to maximize performance, I wish to minimize blocking. I'm using PHP 5.6 and GuzzleHTTP.
GuzzleHTTP does have an option for async requests. But since we only have one thread available in PHP, I need to allocate some time for them to be processed. Unfortunately, I only see one way to do it: calling wait(), which blocks until all the requests are processed. That's not what I want.
Instead, I'd like to have some method that handles whatever has arrived and then returns, so that I can do something along the lines of:
$allRequests = [];
while ( !checkIfNeedToEnd() ) {
    $newItems = getItemsFromQueue();
    $allRequests = $allRequests + spawnRequests($newItems);
    GuzzleHttp::processWhatYouCan($allRequests);
    removeProcessedRequests($allRequests);
}
Is this possible?
Alright... figured it out myself:
$handler = new \GuzzleHttp\Handler\CurlMultiHandler();
$client = new \GuzzleHttp\Client(['handler' => $handler]);
$promise1 = $client->getAsync("http://www.stackoverflow.com");
$promise2 = $client->getAsync("http://localhost/");
$doneCount = 0;
$promise1->then(function() use(&$doneCount) {
    $doneCount++;
    echo 'Promise 1 done!';
});
$promise2->then(function() use(&$doneCount) {
    $doneCount++;
    echo 'Promise 2 done!';
});
$last = microtime(true);
while ( $doneCount < 2 ) {
    $now = microtime(true);
    $delta = round(($now-$last)*1000);
    echo "tick($delta) ";
    $last = $now;
    $handler->tick();
}
And the output I get is:
tick(0) tick(6) tick(1) tick(0) tick(1001) tick(10) tick(96) Promise 2 done!tick(97) Promise 1 done!
The magic ingredient is creating the CurlMultiHandler yourself and then calling tick() on it when it's convenient. After that, it's promises as usual. And if the queue is empty, tick() returns immediately.
Note that it can still block for up to 1 second (the default) if there is no activity. This can also be changed if needed:
$handler = new \GuzzleHttp\Handler\CurlMultiHandler(['select_timeout' => 0.5]);
The value is in seconds and accepts floating-point values.
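Applied to the loop from the original question, this might look roughly like the sketch below. Note that checkIfNeedToEnd(), getItemsFromQueue() and $item['url'] are the same hypothetical helpers/fields as in the question, not real Guzzle functions:
$handler = new \GuzzleHttp\Handler\CurlMultiHandler(['select_timeout' => 0.5]);
$client = new \GuzzleHttp\Client(['handler' => $handler]);

$pending = [];
while ( !checkIfNeedToEnd() ) {
    // turn new queue items into async requests and remember their promises
    foreach ( getItemsFromQueue() as $item ) {
        $promise = $client->getAsync($item['url']);
        $promise->then(function ($response) use ($item) {
            // handle the finished response for this item here
        });
        $pending[] = $promise;
    }
    // let curl_multi do whatever work is ready, then return
    $handler->tick();
    // drop promises that have already settled
    $pending = array_filter($pending, function ($p) {
        return $p->getState() === \GuzzleHttp\Promise\PromiseInterface::PENDING;
    });
}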

Is there another way to implement Long Polling in PHP

I have read some articles (like this, or this), and all of them show the same way to implement long polling in PHP (using usleep() and a loop), like this:
$source; // some data source - db, etc.
$data = null; // our return data
$timeout = 30; // timeout in seconds
$now = time(); // start time
// loop for $timeout seconds from $now until we get $data
while ((time() - $now) < $timeout) {
    // fetch $data
    $data = $source->getData();
    // if we got $data, break the loop
    if (!empty($data)) break;
    // wait 10 ms before checking for new $data again
    usleep(10000);
}
// if there is still no $data, tell the client to re-request (arbitrary status message)
if (empty($data)) $data = array('status' => 'no-data');
// send $data response to client
echo json_encode($data);
Is there another way? I know PHP is only a scripting language, but I would like an event-based approach rather than checking in a loop or waiting until the timeout. Something like continuations in Java would be perfect.
You could try React: http://reactphp.org/
It's not very mature yet, but it may suit your needs. Instead of doing long polling, you can do it asynchronously.
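For illustration, a rough event-loop sketch with ReactPHP (assuming react/event-loop is installed via Composer; $source is the same placeholder data source as in the question's code):
require 'vendor/autoload.php';

$loop = React\EventLoop\Factory::create();

// check the data source every 100 ms instead of blocking in usleep()
$timer = $loop->addPeriodicTimer(0.1, function () use ($loop, $source, &$timer) {
    $data = $source->getData();
    if (!empty($data)) {
        echo json_encode($data);
        $loop->cancelTimer($timer); // stop polling once there is something to send
    }
});

$loop->run();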
I would recommend: http://ape-project.org/
It is mature and scalable.

Webrequest faster downloading (Googled for days)

I'm working on a project that downloads up to 5000 individual pieces of data from a server. It is basically a PHP page that takes a POST variable, gets the data from the DB, and sends it back to the .NET client.
It is slow. It takes about 1 second per request. I've googled a lot and tried all sorts of tweaks to the code, like the famous proxy setting, etc., but nothing speeds it up.
Any ideas? All solutions that make this much faster are welcome, even C-written DLLs or anything else you can think of. This just needs to be a lot faster.
Public Function askServer(oCode As String) As String
    oBytesToSend = Encoding.ASCII.GetBytes("cmd=" & System.Web.HttpUtility.UrlEncode(oCode))
    Try
        oRequest = WebRequest.Create(webServiceUrl)
        oRequest.Timeout = 60000
        oRequest.Proxy = WebRequest.DefaultWebProxy
        CType(oRequest, HttpWebRequest).UserAgent = "XXXXX"
        oRequest.Method = "POST"
        oRequest.ContentLength = oBytesToSend.Length
        oRequest.ContentType = "application/x-www-form-urlencoded"
        oStream = oRequest.GetRequestStream()
        oStream.Write(oBytesToSend, 0, oBytesToSend.Length)
        oResponse = oRequest.GetResponse()
        If CType(oResponse, HttpWebResponse).StatusCode = Net.HttpStatusCode.OK Then
            oStream = oResponse.GetResponseStream()
            oReader = New StreamReader(oStream)
            oResponseFromServer = oReader.ReadToEnd()
            oResponseFromServer = System.Web.HttpUtility.UrlDecode(oResponseFromServer)
            Return oResponseFromServer
        Else
            MsgBox("Server error", CType(vbOKOnly + vbCritical, MsgBoxStyle), "")
            Return ""
        End If
    Catch e As Exception
        MsgBox("Oops" & vbCrLf & e.Message, CType(vbOKOnly + vbCritical, MsgBoxStyle), "")
        Return ""
    End Try
End Function
Some ideas:
Run the HTTP requests in parallel (client side).
If the response size allows it, get all the data you need in one request; you would need to change your server implementation, as in the sketch below.
Cache data (server side).
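For the batching idea, a rough server-side sketch in PHP (the 'cmds' field and get_data_for_code() are made-up names for illustration; the existing page presumably takes a single cmd POST variable today):
// accept many codes in one POST instead of one code per request
$codes = json_decode(isset($_POST['cmds']) ? $_POST['cmds'] : '[]', true);

$results = array();
foreach ($codes as $code) {
    // look up each code the same way the existing single-code script does
    $results[$code] = get_data_for_code($code); // hypothetical helper wrapping the existing DB query
}

header('Content-Type: application/json');
echo json_encode($results);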

determining proper gearman task function to retrieve real-time job status

Very simply, I have a program that needs to perform a large process (anywhere from 5 seconds to several minutes), and I don't want to make my page wait for the process to finish before it loads.
I understand that I need to run this Gearman job as a background process, but I'm struggling to identify the proper solution for getting real-time status updates on when the worker actually finishes the process. I've used the following code snippet from the PHP examples:
do {
    sleep(3);
    $stat = $gmclient->jobStatus($job_handle);
    if (!$stat[0]) // the job is no longer known to the server, so it is done
        $done = true;
    echo "Running: " . ($stat[1] ? "true" : "false") . ", numerator: " . $stat[2] . ", denominator: " . $stat[3] . "\n";
} while (!$done);
echo "done!\n";
and this works; however, it appears that it just returns data to the client when the worker has finished telling the job what to do. Instead, I want to know when the literal process of the job has finished.
My real-life example:
Pull several data feeds from an API (some feeds take longer than others)
Load a couple of the ones that always load fast, place a "Waiting/Loading" animation on the section that was sent off to a worker queue
When the work is done and the results have been completely retrieved, replace the animation with the results
This is a bit late, but I stumbled across this question looking for the same answer. I was able to get a solution together, so maybe it will help someone else.
For starters, refer to the documentation on GearmanClient::jobStatus. This will be called from the client, and the function accepts a single argument: $job_handle. You retrieve this handle when you dispatch the request:
$client = new GearmanClient( );
$client->addServer( '127.0.0.1', 4730 );
$handle = $client->doBackground( 'serviceRequest', $data );
Later on, you can retrieve the status by calling the jobStatus function on the same $client object:
$status = $client->jobStatus( $handle );
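As documented for GearmanClient::jobStatus, the returned array holds whether the job is known, whether it is running, and the numerator/denominator last reported by the worker. A small polling sketch using the $client and $handle from above:
// $status = array(known, running, numerator, denominator)
list($known, $running, $num, $den) = $client->jobStatus( $handle );

if ( !$known ) {
    echo "Job finished (or was never queued)\n";
} elseif ( $running && $den > 0 ) {
    printf( "Progress: %d%%\n", ($num / $den) * 100 );
} else {
    echo "Queued, not started yet\n";
}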
This is only meaningful, though, if you actually change the status from within your worker with the sendStatus method:
$worker = new GearmanWorker( );
$worker->addFunction( 'serviceRequest', function( $job ) {
    $max = 10;
    // Set initial status - numerator / denominator
    $job->sendStatus( 0, $max );
    for( $i = 1; $i <= $max; $i++ ) {
        sleep( 2 ); // Simulate a long running task
        $job->sendStatus( $i, $max );
    }
    return GEARMAN_SUCCESS;
} );

while( $worker->work( ) ) {
    $worker->wait( );
}
In versions of Gearman prior to 0.5, you would use the GearmanJob::status method to set the status of a job. Versions 0.6 to current (1.1) use the methods above.
See also this question: Problem With Gearman Job Status

Prevent timeout during large request in PHP

I'm making a large request to the Brightcove servers to make a batch change of metadata in my videos. It seems like it only made it through 1000 iterations and then stopped. Can anyone help in adjusting this code to prevent a timeout from happening? It needs to make about 7000-8000 iterations.
<?php
include 'echove.php';

$e = new Echove(
    'xxxxx',
    'xxxxx'
);

// Read Video IDs
# Define our parameters
$params = array(
    'fields' => 'id,referenceId'
);

# Make our API call
$videos = $e->findAll('video', $params);
//print_r($videos);

foreach ($videos as $video) {
    //print_r($video);
    $ref_id = $video->referenceId;
    $vid_id = $video->id;
    switch ($ref_id) {
        case "":
            $metaData = array(
                'id' => $vid_id,
                'referenceId' => $vid_id
            );
            # Update a video with the new meta data
            $e->update('video', $metaData);
            echo "$vid_id updated successfully!<br />";
            break;
        default:
            echo "$ref_id was not updated. <br />";
            break;
    }
}
?>
Thanks!
Try the set_time_limit() function. Calling set_time_limit(0) will remove any time limit on the execution of the script.
Also use ignore_user_abort() to bypass browser aborts; the script will keep running even if you close the browser (use with caution).
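For example, a minimal addition at the top of the existing script (note that set_time_limit() has no effect when PHP runs in safe mode):
<?php
set_time_limit(0);        // no execution time limit for this script
ignore_user_abort(true);  // keep running even if the browser disconnects

include 'echove.php';
// ... rest of the script as before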
Try sending a 'Status: 102 Processing' header every now and then to prevent the browser from timing out (your best bet is about 15 to 30 seconds in between). After the request has been processed, you may send the final response.
The browser shouldn't time out any more this way.
