I use CURL in several individual PHP scripts to download / upload files, is there a way to set a GLOBAL (not per-curl-handle) UL / DL Rate Speed Limit?
Unfortunately, cURL only lets you set a speed limit per handle (per session), and that limit is not adjusted dynamically.
As a server OS Ubuntu is used, is there an alternative way to limit CURL processes differently?
Thanks
curl/libcurl doesn't have any feature to share a bandwidth limit across curl_easy handles, much less across different processes. I propose a curl daemon to enforce bandwidth limits instead, with the client looking something like
class curl_daemon_response{
public $stdout;
public $stderr;
}
function curl_daemon(array $curl_options):curl_daemon_response{
$from_big_uint64_t=function(string $i): int {
$arr = unpack ( 'Juint64_t', $i );
return $arr ['uint64_t'];
};
$to_big_uint64_t=function(int $i): string {
return pack ( 'J', $i );
};
$conn = stream_socket_client("unix:///var/run/curl_daemon", $errno, $errstr, 3);
if (!$conn) {
throw new \RuntimeException("failed to connect to /var/run/curl_daemon! $errstr ($errno)");
}
stream_set_blocking($conn,true);
$curl_options=serialize($curl_options);
fwrite($conn,$to_big_uint64_t(strlen($curl_options)).$curl_options);
$stdoutLen=$from_big_uint64_t(fread($conn,8));
$stdout=fread($conn,$stdoutLen);
$stderrLen=$from_big_uint64_t(fread($conn,8));
$stderr=fread($conn,$stderrLen);
$ret=new curl_daemon_response();
$ret->stdout=$stdout;
$ret->stderr=$stderr;
fclose($conn);
return $ret;
}
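A hypothetical call against this client helper might look like the following (the URL and options are placeholders, and it assumes the daemon below is already listening on /var/run/curl_daemon):
$response = curl_daemon(array(
    CURLOPT_URL => 'https://example.com/file.bin', // placeholder URL
    CURLOPT_FOLLOWLOCATION => true,
));
echo strlen($response->stdout), " bytes downloaded\n"; // response body written by the daemon
echo $response->stderr; // verbose curl transfer log captured by the daemon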
and the daemon looking something like
<?php
declare(strict_types=1);
const MAX_DOWNLOAD_SPEED=1000*1024; // 1000 KiB per second, total across all handles
const MINIMUM_DOWNLOAD_SPEED=100; // 100 bytes per second, per-handle floor
class Client{
public $id;
public $socket;
public $curl;
public $arguments;
public $stdout;
public $stderr;
}
$clients=[];
$mh=curl_multi_init();
$srv = stream_socket_server("unix:///var/run/curl_daemon", $errno, $errstr);
if (!$srv) {
throw new \RuntimeException("failed to create unix socket /var/run/curl_daemon! $errstr ($errno)");
}
stream_set_blocking($srv,false);
while(true){
getNewClients();
$cc=count($clients);
if(!$cc){
sleep(1); // nothing to do.
continue;
}
curl_multi_exec($mh, $running);
if($running!==$cc){
// at least 1 of the curls finished!
while(false!==($info=curl_multi_info_read($mh))){
$key=curl_getinfo($info['handle'],CURLINFO_PRIVATE);
curl_multi_remove_handle($mh,$clients[$key]->curl);
curl_close($clients[$key]->curl);
$stdout=file_get_contents(stream_get_meta_data($clients[$key]->stdout)['uri']); // https://bugs.php.net/bug.php?id=76268
fclose($clients[$key]->stdout);
$stderr=file_get_contents(stream_get_meta_data($clients[$key]->stderr)['uri']); // https://bugs.php.net/bug.php?id=76268
fclose($clients[$key]->stderr);
$sock=$clients[$key]->socket;
fwrite($sock,to_big_uint64_t(strlen($stdout)).$stdout.to_big_uint64_t(strlen($stderr)).$stderr);
fclose($sock);
echo "finished request #{$key}!\n";
unset($clients[$key],$key,$stdout,$stderr,$sock);
}
updateSpeed();
}
curl_multi_select($mh);
}
function updateSpeed(){
global $clients;
static $old_speed=-1;
if(empty($clients)){
return;
}
$clientsn=count($clients);
$per_handle_speed=max(MINIMUM_DOWNLOAD_SPEED,(int)(MAX_DOWNLOAD_SPEED/$clientsn));
if($per_handle_speed===$old_speed){
return;
}
$old_speed=$per_handle_speed;
echo "new per handle speed: {$per_handle_speed} - clients: {$clientsn}\n";
foreach($clients as $client){
/** @var Client $client */
curl_setopt($client->curl,CURLOPT_MAX_RECV_SPEED_LARGE,$per_handle_speed);
}
}
function getNewClients(){
global $clients,$srv,$mh;
static $counter=-1;
$newClients=false;
while(false!==($new=@stream_socket_accept($srv,0))){
++$counter;
$newClients=true;
echo "new client! request #{$counter}\n";
stream_set_blocking($new,true);
$tmp=new Client();
$tmp->id=$counter;
$tmp->socket=$new;
$tmp->curl=curl_init();
$tmp->stdout=tmpfile();
$tmp->stderr=tmpfile();
$size=from_big_uint64_t(fread($new,8));
$arguments=fread($new,$size);
$arguments=unserialize($arguments);
assert(is_array($arguments));
$tmp->arguments=$arguments;
curl_setopt_array($tmp->curl,$arguments);
curl_setopt_array($tmp->curl,array(
CURLOPT_FILE=>$tmp->stdout,
CURLOPT_STDERR=>$tmp->stderr,
CURLOPT_VERBOSE=>1,
CURLOPT_PRIVATE=>$counter
));
curl_multi_add_handle($mh,$tmp->curl);
}
if($newClients){
updateSpeed();
}
}
function from_big_uint64_t(string $i): int {
$arr = unpack ( 'Juint64_t', $i );
return $arr ['uint64_t'];
}
function to_big_uint64_t(int $i): string {
return pack ( 'J', $i );
}
Note: this is completely untested code, because my development environment died literally a couple of hours ago and I wrote all this in Notepad++. (My dev environment is a VM that won't boot at all; I'm not sure what happened, and I haven't fixed it yet.)
Also, the code is not at all optimized for large file transfers. If you need to support big transfers this way (sizes you don't want held in RAM, like gigabytes+), modify the daemon to return file paths instead of writing all the data over a Unix socket, as sketched below.
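For illustration only, a rough sketch of that large-file variant (untested, same caveats as above): in getNewClients() write the body to a named temporary file, and when the transfer finishes send the path back instead of the contents. The stdout_path property is a hypothetical addition to the Client class.
// in getNewClients(): use a named temp file instead of tmpfile()
$path = tempnam(sys_get_temp_dir(), 'curl_daemon_');
$tmp->stdout = fopen($path, 'wb');
$tmp->stdout_path = $path; // hypothetical extra property on Client

// in the main loop, when the transfer finishes: reply with the path only
$path = $clients[$key]->stdout_path;
fwrite($sock, to_big_uint64_t(strlen($path)) . $path);
// the client then reads/streams the file at $path and unlinks it when done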
I'm trying to find a way to measure bytes transferred in or out of a web application built on php+apache. One problem is that all I/O is done by a native PHP extension, which is passed a handle to one of the built-in streams: php://input or php://output.
I have examined the following alternatives:
1.) ftell on stream wrapper
After encountering this question, my first intuition was to try using ftell on the stream wrapper handle after the I/O operation; roughly:
$hOutput = fopen('php://output', 'wb');
extensionDoOutput($hOutput);
$iBytesTransferred = ftell($hOutput);
This seems to work for the input wrapper, but not the output (which always returns zero from ftell).
2.) Attach stream filter
A non-modifying stream filter would seem like a reasonable way to count bytes passing through. However, the documentation seems a bit lacking and I haven't found a way to get at lengths without doing the iterate+copy pattern as in the example:
class test_filter extends php_user_filter {
public static $iTotalBytes = 0;
function filter(&$in, &$out, &$consumed, $closing) {
while ($bucket = stream_bucket_make_writeable($in)) {
$consumed += $bucket->datalen;
stream_bucket_append($out, $bucket);
}
test_filter::$iTotalBytes += $consumed;
return PSFS_PASS_ON;
}
}
stream_filter_register("test", "test_filter")
or die("Failed to register filter");
$f = fopen("php://output", "wb");
stream_filter_append($f, "test");
// do i/o
Unfortunately this seems to impose a significant reduction in throughput (>50%) as the data is copied in and out of the extension.
3.) Implement stream wrapper
A custom stream wrapper could be used to wrap the other stream and accumulate bytes read/written:
class wrapper {
var $position;
var $handle;
function stream_open($path, $mode, $options, &$opened_path)
{
$this->position = 0;
...
$this->handle = fopen($opened_path, $mode);
return $this->handle != false;
}
function stream_read($count)
{
$ret = fread($this->handle, $count);
$this->position += strlen($ret);
return $ret;
}
function stream_write($data)
{
$written = fwrite($this->handle, $data);
$this->position += $written;
return $written;
}
function stream_tell()
{
return $this->position;
}
function stream_eof()
{
return feof($this->handle);
}
...
}
stream_wrapper_register("test", "wrapper")
or die("Failed to register protocol");
$hOutput = fopen('test://output', 'wb');
extensionDoOutput($hOutput);
$iBytesTransferred = ftell($hOutput);
Again, this imposes a reduction in throughput (~20% on output, greater on input)
4.) Output buffering with callback
A callback can be provided with ob_start to be called as chunks of output are flushed.
$totalBytes = 0;
function cb($strBuffer) {
global $totalBytes;
$totalBytes += strlen($strBuffer);
return $strBuffer;
}
$f = fopen("php://output", "wb");
ob_start('cb', 16384);
// do output...
fclose($f);
ob_end_flush();
Again, this works but imposes a certain throughput performance penalty (~25%) due to buffering.
Option #1 was forgone because it does not appear to work for output. Of the remaining three, all work functionally but affect throughput negatively due to buffer/copy mechanisms.
Is there something intrinsic to PHP (or the Apache server extensions) that I can use to do this gracefully, or will I need to bite the bullet on performance? I welcome any ideas on how this might be accomplished.
(note: if possible I am interested in a PHP application-level solution... not an apache module)
I would stick with the output buffer callback; note that you can just return FALSE to pass the output through unchanged:
class OutputMetricBuffer
{
private $length;
public function __construct()
{
ob_start(array($this, 'callback'));
}
public function callback($str)
{
$this->length += strlen($str);
return FALSE;
}
public function getLength()
{
ob_flush();
return $this->length;
}
}
Usage:
$metric = new OutputMetricBuffer;
# ... output ...
$length = $metric->getLength();
The reason to use the output buffer callback is that it's more lightweight than a filter, which has to consume all the buckets and copy them over, so the filter does more work.
I implemented the callback inside a class so it has its own private length variable to count with.
You could just as well create a global function and use a global variable. Another tweak might be to access it via $GLOBALS instead of the global keyword, so PHP does not need to import the global variable into the local symbol table and back; I'm not sure whether that makes a difference, but it's another point that could play a role (see the sketch below).
I also don't know whether returning FALSE instead of $str makes it faster; just give it a try.
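For what it's worth, a plain-function variant of the same idea using $GLOBALS might look like this (the function and variable names are hypothetical):
$GLOBALS['output_length'] = 0;

function output_metric_callback($str)
{
    $GLOBALS['output_length'] += strlen($str);
    return FALSE; // pass the output through unchanged
}

ob_start('output_metric_callback');
// ... output ...
ob_flush();
$length = $GLOBALS['output_length'];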
As bizarre as this is, using the STDOUT constant instead of the result of fopen('php://output') makes ftell() work correctly.
$stream = fopen('php://output','w');
fwrite($stream, "This is some data\n");
fwrite($stream, ftell($stream));
// Output:
// This is some data
// 0
However:
fwrite(STDOUT, "This is some data\n");
fwrite(STDOUT, ftell(STDOUT));
// Output:
// This is some data
// 17
Tested PHP/5.2.17 (win32)
EDIT actually, is that working correctly, or should it be 18? I never use ftell() so I'm not 100% sure either way...
ANOTHER EDIT
See whether this suits you:
$bytesOutput = 0;
function output_counter ($str) {
$GLOBALS['bytesOutput'] += strlen($str);
return $str;
}
ob_start('output_counter');
$stream = fopen('php://output','w');
fwrite($stream, "This is some data\n");
var_dump($bytesOutput);
Is there a way in PHP to make HTTP calls and not wait for a response? I don't care about the response, I just want to do something like file_get_contents(), but not wait for the request to finish before executing the rest of my code. This would be super useful for setting off "events" of a sort in my application, or triggering long processes.
Any ideas?
The answer I'd previously accepted didn't work. It still waited for responses. This does work though, taken from How do I make an asynchronous GET request in PHP?
function post_without_wait($url, $params)
{
$post_params = array();
foreach ($params as $key => &$val) {
if (is_array($val)) $val = implode(',', $val);
$post_params[] = $key.'='.urlencode($val);
}
$post_string = implode('&', $post_params);
$parts=parse_url($url);
$fp = fsockopen($parts['host'],
isset($parts['port'])?$parts['port']:80,
$errno, $errstr, 30);
$out = "POST ".$parts['path']." HTTP/1.1\r\n";
$out.= "Host: ".$parts['host']."\r\n";
$out.= "Content-Type: application/x-www-form-urlencoded\r\n";
$out.= "Content-Length: ".strlen($post_string)."\r\n";
$out.= "Connection: Close\r\n\r\n";
if (isset($post_string)) $out.= $post_string;
fwrite($fp, $out);
fclose($fp);
}
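For example, a hypothetical fire-and-forget call (the URL and parameters are placeholders); the function returns as soon as the request has been written to the socket:
post_without_wait('http://example.com/longtask.php', array(
    'event' => 'user_registered',
    'id' => 42,
));
// the rest of the script continues immediately, without waiting for a response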
If you control the target that you want to call asynchronously (e.g. your own "longtask.php"), you can close the connection from that end, and both scripts will run in parallel. It works like this:
quick.php opens longtask.php via cURL (no magic here)
longtask.php closes the connection and continues (magic!)
cURL returns to quick.php when the connection is closed
Both tasks continue in parallel
I have tried this, and it works just fine. But quick.php won't know anything about how longtask.php is doing, unless you create some means of communication between the processes.
Try this code in longtask.php, before you do anything else. It will close the connection, but still continue to run (and suppress any output):
while(ob_get_level()) ob_end_clean();
header('Connection: close');
ignore_user_abort(true);
ob_start();
echo('Connection Closed');
$size = ob_get_length();
header("Content-Length: $size");
ob_end_flush();
flush();
The code is copied from the PHP manual's user contributed notes and somewhat improved.
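For completeness, the calling side (quick.php) can be a plain cURL request; this is only a sketch with a placeholder URL, and it relies on longtask.php sending the headers shown above:
// quick.php
$ch = curl_init('http://example.com/longtask.php'); // placeholder URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch); // returns as soon as longtask.php closes the connection
curl_close($ch);
// quick.php continues here while longtask.php keeps running on the server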
You can do trickery by using exec() to invoke something that can do HTTP requests, like wget, but you must direct all output from the program to somewhere, like a file or /dev/null, otherwise the PHP process will wait for that output.
If you want to separate the process from the apache thread entirely, try something like (I'm not sure about this, but I hope you get the idea):
exec('bash -c "wget -O (url goes here) > /dev/null 2>&1 &"');
It's not a nice business, and you'll probably want something like a cron job invoking a heartbeat script which polls an actual database event queue to do real asynchronous events.
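As a hedged illustration (placeholder URL), a variant that discards the response body and backgrounds the process so PHP does not wait for it could look like:
exec('wget -q -O /dev/null "http://example.com/longtask.php" > /dev/null 2>&1 &');
// exec() returns immediately; wget keeps running in the background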
You can use this library: https://github.com/stil/curl-easy
It's pretty straightforward then:
<?php
$request = new cURL\Request('http://yahoo.com/');
$request->getOptions()->set(CURLOPT_RETURNTRANSFER, true);
// Specify function to be called when your request is complete
$request->addListener('complete', function (cURL\Event $event) {
$response = $event->response;
$httpCode = $response->getInfo(CURLINFO_HTTP_CODE);
$html = $response->getContent();
echo "\nDone.\n";
});
// Loop below will run as long as request is processed
$timeStart = microtime(true);
while ($request->socketPerform()) {
printf("Running time: %dms \r", (microtime(true) - $timeStart)*1000);
// Here you can do anything else, while your request is in progress
}
Below you can see the console output of the above example. It displays a simple live clock indicating how long the request has been running.
As of 2018, Guzzle has become the de facto standard library for HTTP requests, used in several modern frameworks. It's written in pure PHP and does not require installing any custom extensions.
It can do asynchronous HTTP calls very nicely, and even pool them such as when you need to make 100 HTTP calls, but don't want to run more than 5 at a time.
Concurrent request example
use GuzzleHttp\Client;
use GuzzleHttp\Promise;
$client = new Client(['base_uri' => 'http://httpbin.org/']);
// Initiate each request but do not block
$promises = [
'image' => $client->getAsync('/image'),
'png' => $client->getAsync('/image/png'),
'jpeg' => $client->getAsync('/image/jpeg'),
'webp' => $client->getAsync('/image/webp')
];
// Wait on all of the requests to complete. Throws a ConnectException
// if any of the requests fail
$results = Promise\unwrap($promises);
// Wait for the requests to complete, even if some of them fail
$results = Promise\settle($promises)->wait();
// You can access each result using the key provided to the unwrap
// function.
echo $results['image']['value']->getHeader('Content-Length')[0];
echo $results['png']['value']->getHeader('Content-Length')[0];
See http://docs.guzzlephp.org/en/stable/quickstart.html#concurrent-requests
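For the "100 calls but no more than 5 at a time" case mentioned above, Guzzle's Pool can throttle concurrency. A sketch based on the documented Pool API (URLs are placeholders):
use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;

$client = new Client();

// generator that yields the requests lazily
$requests = function ($total) {
    for ($i = 0; $i < $total; $i++) {
        yield new Request('GET', 'http://httpbin.org/get?i=' . $i);
    }
};

$pool = new Pool($client, $requests(100), [
    'concurrency' => 5, // at most 5 requests in flight at once
    'fulfilled' => function ($response, $index) { /* handle success */ },
    'rejected' => function ($reason, $index) { /* handle failure */ },
]);

$pool->promise()->wait();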
/**
* Asynchronously execute/include a PHP file. Does not record the output of the file anywhere.
*
* @param string $filename file to execute, relative to calling script
* @param string $options (optional) arguments to pass to file via the command line
*/
function asyncInclude($filename, $options = '') {
exec("/path/to/php -f {$filename} {$options} >> /dev/null &");
}
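Hypothetical usage (the file name and option string are placeholders):
asyncInclude('worker.php', '--id=42');
// execution continues immediately; worker.php runs in its own PHP process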
Fake a request abort by using cURL with a very low CURLOPT_TIMEOUT_MS, and
set ignore_user_abort(true) in the background script so it keeps processing after the connection is closed.
With this method there is no need to implement connection handling via headers and output buffering, which is too dependent on the OS, browser and PHP version.
Master process
function async_curl($background_process=''){
//-------------get curl contents----------------
$ch = curl_init($background_process);
curl_setopt_array($ch, array(
CURLOPT_RETURNTRANSFER =>true,
CURLOPT_NOSIGNAL => 1, //to timeout immediately if the value is < 1000 ms
CURLOPT_TIMEOUT_MS => 50, //The maximum number of mseconds to allow cURL functions to execute
CURLOPT_VERBOSE => 1,
CURLOPT_HEADER => 1
));
$out = curl_exec($ch);
//-------------parse curl contents----------------
//$header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
//$header = substr($out, 0, $header_size);
//$body = substr($out, $header_size);
curl_close($ch);
return true;
}
async_curl('http://example.com/background_process_1.php');
Background process
ignore_user_abort(true);
//do something...
NB
If you want cURL to timeout in less than one second, you can use
CURLOPT_TIMEOUT_MS, although there is a bug/"feature" on "Unix-like
systems" that causes libcurl to timeout immediately if the value is <
1000 ms with the error "cURL Error (28): Timeout was reached". The
explanation for this behavior is:
[...]
The solution is to disable signals using CURLOPT_NOSIGNAL
Resources
curl timeout less than 1000ms always fails?
http://www.php.net/manual/en/function.curl-setopt.php#104597
http://php.net/manual/en/features.connection-handling.php
The swoole extension. https://github.com/matyhtf/swoole
Asynchronous & concurrent networking framework for PHP.
$client = new swoole_client(SWOOLE_SOCK_TCP, SWOOLE_SOCK_ASYNC);
$client->on("connect", function($cli) {
$cli->send("hello world\n");
});
$client->on("receive", function($cli, $data){
echo "Receive: $data\n";
});
$client->on("error", function($cli){
echo "connect fail\n";
});
$client->on("close", function($cli){
echo "close\n";
});
$client->connect('127.0.0.1', 9501, 0.5);
Let me show you my way :)
It needs Node.js installed on the server.
(My server can send 1000 HTTPS GET requests in only 2 seconds.)
url.php:
<?php
$urls = array_fill(0, 100, 'http://google.com/blank.html');
function execinbackground($cmd) {
if (substr(php_uname(), 0, 7) == "Windows"){
pclose(popen("start /B ". $cmd, "r"));
}
else {
exec($cmd . " > /dev/null &");
}
}
fwrite(fopen("urls.txt","w"),implode("\n",$urls));
execinbackground("nodejs urlscript.js urls.txt");
// { do your work while get requests being executed.. }
?>
urlscript.js:
var https = require('https');
var url = require('url');
var http = require('http');
var fs = require('fs');
var dosya = process.argv[2];
var logdosya = 'log.txt';
var count=0;
http.globalAgent.maxSockets = 300;
https.globalAgent.maxSockets = 300;
setTimeout(timeout,100000); // maximum execution time (in ms)
function trim(string) {
return string.replace(/^\s*|\s*$/g, '')
}
fs.readFile(process.argv[2], 'utf8', function (err, data) {
if (err) {
throw err;
}
parcala(data);
});
function parcala(data) {
var data = data.split("\n");
count=''+data.length+'-'+data[1];
data.forEach(function (d) {
req(trim(d));
});
/*
fs.unlink(dosya, function d() {
console.log('<%s> file deleted', dosya);
});
*/
}
function req(link) {
var linkinfo = url.parse(link);
if (linkinfo.protocol == 'https:') {
var options = {
host: linkinfo.host,
port: 443,
path: linkinfo.path,
method: 'GET'
};
https.get(options, function(res) {res.on('data', function(d) {});}).on('error', function(e) {console.error(e);});
} else {
var options = {
host: linkinfo.host,
port: 80,
path: linkinfo.path,
method: 'GET'
};
http.get(options, function(res) {res.on('data', function(d) {});}).on('error', function(e) {console.error(e);});
}
}
process.on('exit', onExit);
function onExit() {
log();
}
function timeout()
{
console.log("i am too far gone");process.exit();
}
function log()
{
var fd = fs.openSync(logdosya, 'a+');
fs.writeSync(fd, dosya + '-'+count+'\n');
fs.closeSync(fd);
}
You can use non-blocking sockets and one of pecl extensions for PHP:
http://php.net/event
http://php.net/libevent
http://php.net/ev
https://github.com/m4rw3r/php-libev
You can use a library that provides an abstraction layer between your code and a pecl extension: https://github.com/reactphp/event-loop
You can also use an async HTTP client based on the previous library: https://github.com/reactphp/http-client
See the other ReactPHP libraries: http://reactphp.org
Be careful with the asynchronous model.
I recommend watching this video on YouTube: http://www.youtube.com/watch?v=MWNcItWuKpI
class async_file_get_contents extends Thread{
public $ret;
public $url;
public $finished;
public function __construct($url) {
$this->finished=false;
$this->url=$url;
}
public function run() {
$this->ret=file_get_contents($this->url);
$this->finished=true;
}
}
$afgc=new async_file_get_contents("http://example.org/file.ext");
$afgc->start();
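Note that this relies on the pthreads extension (ZTS CLI builds of PHP only). A hedged sketch of how you might later wait for and read the result:
// ... do other work while the download runs in its own thread ...

$afgc->join(); // block until run() has finished
if ($afgc->finished) {
    echo strlen($afgc->ret), " bytes fetched from {$afgc->url}\n";
}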
Event Extension
Event extension is very appropriate. It is a port of Libevent library which is designed for event-driven I/O, mainly for networking.
I have written a sample HTTP client that allows to schedule a number of
HTTP requests and run them asynchronously.
This is a sample HTTP client class based on Event extension.
The class allows to schedule a number of HTTP requests, then run them asynchronously.
http-client.php
<?php
class MyHttpClient {
/// @var EventBase
protected $base;
/// @var array Instances of EventHttpConnection
protected $connections = [];
public function __construct() {
$this->base = new EventBase();
}
/**
* Dispatches all pending requests (events)
*
* @return void
*/
public function run() {
$this->base->dispatch();
}
public function __destruct() {
// Destroy connection objects explicitly, don't wait for GC.
// Otherwise, EventBase may be free'd earlier.
$this->connections = null;
}
/**
* @brief Adds a pending HTTP request
*
* @param string $address Hostname, or IP
* @param int $port Port number
* @param array $headers Extra HTTP headers
* @param int $cmd A EventHttpRequest::CMD_* constant
* @param string $resource HTTP request resource, e.g. '/page?a=b&c=d'
*
* @return EventHttpRequest|false
*/
public function addRequest($address, $port, array $headers,
$cmd = EventHttpRequest::CMD_GET, $resource = '/')
{
$conn = new EventHttpConnection($this->base, null, $address, $port);
$conn->setTimeout(5);
$req = new EventHttpRequest([$this, '_requestHandler'], $this->base);
foreach ($headers as $k => $v) {
$req->addHeader($k, $v, EventHttpRequest::OUTPUT_HEADER);
}
$req->addHeader('Host', $address, EventHttpRequest::OUTPUT_HEADER);
$req->addHeader('Connection', 'close', EventHttpRequest::OUTPUT_HEADER);
if ($conn->makeRequest($req, $cmd, $resource)) {
$this->connections []= $conn;
return $req;
}
return false;
}
/**
* @brief Handles an HTTP request
*
* @param EventHttpRequest $req
* @param mixed $unused
*
* @return void
*/
public function _requestHandler($req, $unused) {
if (is_null($req)) {
echo "Timed out\n";
} else {
$response_code = $req->getResponseCode();
if ($response_code == 0) {
echo "Connection refused\n";
} elseif ($response_code != 200) {
echo "Unexpected response: $response_code\n";
} else {
echo "Success: $response_code\n";
$buf = $req->getInputBuffer();
echo "Body:\n";
while ($s = $buf->readLine(EventBuffer::EOL_ANY)) {
echo $s, PHP_EOL;
}
}
}
}
}
$address = "my-host.local";
$port = 80;
$headers = [ 'User-Agent' => 'My-User-Agent/1.0', ];
$client = new MyHttpClient();
// Add pending requests
for ($i = 0; $i < 10; $i++) {
$client->addRequest($address, $port, $headers,
EventHttpRequest::CMD_GET, '/test.php?a=' . $i);
}
// Dispatch pending requests
$client->run();
test.php
This is a sample script on the server side.
<?php
echo 'GET: ', var_export($_GET, true), PHP_EOL;
echo 'User-Agent: ', $_SERVER['HTTP_USER_AGENT'] ?? '(none)', PHP_EOL;
Usage
php http-client.php
Sample Output
Success: 200
Body:
GET: array (
'a' => '1',
)
User-Agent: My-User-Agent/1.0
Success: 200
Body:
GET: array (
'a' => '0',
)
User-Agent: My-User-Agent/1.0
Success: 200
Body:
GET: array (
'a' => '3',
)
...
(Trimmed.)
Note, the code is designed for long-term processing in the CLI SAPI.
For custom protocols, consider using low-level API, i.e. buffer events, buffers. For SSL/TLS communications, I would recommend the low-level API in conjunction with Event's ssl context. Examples:
SSL echo server
SSL client
Although Libevent's HTTP API is simple, it is not as flexible as buffer events. For example, the HTTP API currently doesn't support custom HTTP methods. But it is possible to implement virtually any protocol using the low-level API.
Ev Extension
I have also written a sample of another HTTP client using Ev extension with sockets in non-blocking mode. The code is slightly more verbose than the sample based on Event, because Ev is a general purpose event loop. It doesn't provide network-specific functions, but its EvIo watcher is capable of listening to a file descriptor encapsulated into the socket resource, in particular.
This is a sample HTTP client based on Ev extension.
Ev extension implements a simple yet powerful general purpose event loop. It doesn't provide network-specific watchers, but its I/O watcher can be used for asynchronous processing of sockets.
The following code shows how HTTP requests can be scheduled for parallel processing.
http-client.php
<?php
class MyHttpRequest {
/// @var MyHttpClient
private $http_client;
/// @var string
private $address;
/// @var string HTTP resource such as /page?get=param
private $resource;
/// @var string HTTP method such as GET, POST etc.
private $method;
/// @var int
private $service_port;
/// @var resource Socket
private $socket;
/// @var double Connection timeout in seconds.
private $timeout = 10.;
/// @var int Chunk size in bytes for socket_recv()
private $chunk_size = 20;
/// @var EvTimer
private $timeout_watcher;
/// @var EvIo
private $write_watcher;
/// @var EvIo
private $read_watcher;
/// @var EvTimer
private $conn_watcher;
/// @var string buffer for incoming data
private $buffer;
/// @var array errors reported by sockets extension in non-blocking mode.
private static $e_nonblocking = [
11, // EAGAIN or EWOULDBLOCK
115, // EINPROGRESS
];
/**
* @param MyHttpClient $client
* @param string $host Hostname, e.g. google.co.uk
* @param string $resource HTTP resource, e.g. /page?a=b&c=d
* @param string $method HTTP method: GET, HEAD, POST, PUT etc.
* @throws RuntimeException
*/
public function __construct(MyHttpClient $client, $host, $resource, $method) {
$this->http_client = $client;
$this->host = $host;
$this->resource = $resource;
$this->method = $method;
// Get the port for the WWW service
$this->service_port = getservbyname('www', 'tcp');
// Get the IP address for the target host
$this->address = gethostbyname($this->host);
// Create a TCP/IP socket
$this->socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
if (!$this->socket) {
throw new RuntimeException("socket_create() failed: reason: " .
socket_strerror(socket_last_error()));
}
// Set O_NONBLOCK flag
socket_set_nonblock($this->socket);
$this->conn_watcher = $this->http_client->getLoop()
->timer(0, 0., [$this, 'connect']);
}
public function __destruct() {
$this->close();
}
private function freeWatcher(&$w) {
if ($w) {
$w->stop();
$w = null;
}
}
/**
* Deallocates all resources of the request
*/
private function close() {
if ($this->socket) {
socket_close($this->socket);
$this->socket = null;
}
$this->freeWatcher($this->timeout_watcher);
$this->freeWatcher($this->read_watcher);
$this->freeWatcher($this->write_watcher);
$this->freeWatcher($this->conn_watcher);
}
/**
* Initializes a connection on socket
* @return bool
*/
public function connect() {
$loop = $this->http_client->getLoop();
$this->timeout_watcher = $loop->timer($this->timeout, 0., [$this, '_onTimeout']);
$this->write_watcher = $loop->io($this->socket, Ev::WRITE, [$this, '_onWritable']);
return socket_connect($this->socket, $this->address, $this->service_port);
}
/**
* Callback for timeout (EvTimer) watcher
*/
public function _onTimeout(EvTimer $w) {
$w->stop();
$this->close();
}
/**
* Callback which is called when the socket becomes writable
*/
public function _onWritable(EvIo $w) {
$this->timeout_watcher->stop();
$w->stop();
$in = implode("\r\n", [
"{$this->method} {$this->resource} HTTP/1.1",
"Host: {$this->host}",
'Connection: Close',
]) . "\r\n\r\n";
if (!socket_write($this->socket, $in, strlen($in))) {
trigger_error("Failed writing $in to socket", E_USER_ERROR);
return;
}
$loop = $this->http_client->getLoop();
$this->read_watcher = $loop->io($this->socket,
Ev::READ, [$this, '_onReadable']);
// Continue running the loop
$loop->run();
}
/**
* Callback which is called when the socket becomes readable
*/
public function _onReadable(EvIo $w) {
// recv() 20 bytes in non-blocking mode
$ret = socket_recv($this->socket, $out, 20, MSG_DONTWAIT);
if ($ret) {
// Still have data to read. Append the read chunk to the buffer.
$this->buffer .= $out;
} elseif ($ret === 0) {
// All is read
printf("\n<<<<\n%s\n>>>>", rtrim($this->buffer));
fflush(STDOUT);
$w->stop();
$this->close();
return;
}
// Caught EINPROGRESS, EAGAIN, or EWOULDBLOCK
if (in_array(socket_last_error(), static::$e_nonblocking)) {
return;
}
$w->stop();
$this->close();
}
}
/////////////////////////////////////
class MyHttpClient {
/// @var array Instances of MyHttpRequest
private $requests = [];
/// @var EvLoop
private $loop;
public function __construct() {
// Each HTTP client runs its own event loop
$this->loop = new EvLoop();
}
public function __destruct() {
$this->loop->stop();
}
/**
* @return EvLoop
*/
public function getLoop() {
return $this->loop;
}
/**
* Adds a pending request
*/
public function addRequest(MyHttpRequest $r) {
$this->requests []= $r;
}
/**
* Dispatches all pending requests
*/
public function run() {
$this->loop->run();
}
}
/////////////////////////////////////
// Usage
$client = new MyHttpClient();
foreach (range(1, 10) as $i) {
$client->addRequest(new MyHttpRequest($client, 'my-host.local', '/test.php?a=' . $i, 'GET'));
}
$client->run();
Testing
Suppose http://my-host.local/test.php script is printing the dump of $_GET:
<?php
echo 'GET: ', var_export($_GET, true), PHP_EOL;
Then the output of php http-client.php command will be similar to the following:
<<<<
HTTP/1.1 200 OK
Server: nginx/1.10.1
Date: Fri, 02 Dec 2016 12:39:54 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: close
X-Powered-By: PHP/7.0.13-pl0-gentoo
1d
GET: array (
'a' => '3',
)
0
>>>>
<<<<
HTTP/1.1 200 OK
Server: nginx/1.10.1
Date: Fri, 02 Dec 2016 12:39:54 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: close
X-Powered-By: PHP/7.0.13-pl0-gentoo
1d
GET: array (
'a' => '2',
)
0
>>>>
...
(trimmed)
Note, in PHP 5 the sockets extension may log warnings for EINPROGRESS, EAGAIN, and EWOULDBLOCK errno values. It is possible to turn off the logs with
error_reporting(E_ERROR);
Concerning "the Rest" of the Code
I just want to do something like file_get_contents(), but not wait for the request to finish before executing the rest of my code.
The code that is supposed to run in parallel with the network requests can be executed within the callback of an Event timer, or Ev's idle watcher, for instance. You can easily figure it out by watching the samples mentioned above; otherwise, here is one more sketch below :)
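A minimal sketch, reusing the Ev-based client above: schedule a periodic timer on the same loop, so its callback runs repeatedly while the requests are still in flight (the interval and the work inside the callback are placeholders):
$client = new MyHttpClient();

foreach (range(1, 10) as $i) {
    $client->addRequest(new MyHttpRequest($client, 'my-host.local', '/test.php?a=' . $i, 'GET'));
}

// fires every 0.5 s while the loop is running, i.e. while requests are in progress
$work_watcher = $client->getLoop()->timer(0.5, 0.5, function () {
    echo "doing other work...\n";
});

$client->run();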
I find this package quite useful and very simple: https://github.com/amphp/parallel-functions
<?php
use function Amp\ParallelFunctions\parallelMap;
use function Amp\Promise\wait;
$responses = wait(parallelMap([
'https://google.com/',
'https://github.com/',
'https://stackoverflow.com/',
], function ($url) {
return file_get_contents($url);
}));
It will load all 3 URLs in parallel.
You can also use class instance methods in the closure.
For example, I use a Laravel extension based on this package: https://github.com/spatie/laravel-collection-macros#parallelmap
Here is my code:
/**
* Get domains with all needed data
*/
protected function getDomainsWithdata(): Collection
{
return $this->opensrs->getDomains()->parallelMap(function ($domain) {
$contact = $this->opensrs->getDomainContact($domain);
$contact['domain'] = $domain;
return $contact;
}, 10);
}
It loads all the needed data in 10 parallel threads; instead of the 50 seconds it took without async, it now finishes in just 8 seconds.
Here is a working example: just run it and open storage.txt afterwards to check the magical result.
<?php
function curlGet($target){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $target);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec ($ch);
curl_close ($ch);
return $result;
}
// It's the next 3 lines that do the magic
ignore_user_abort(true);
header("Connection: close"); header("Content-Length: 0");
echo str_repeat("s", 100000); flush();
$i = $_GET['i'];
if(!is_numeric($i)) $i = 1;
if($i > 4) exit;
if($i == 1) file_put_contents('storage.txt', '');
file_put_contents('storage.txt', file_get_contents('storage.txt') . time() . "\n");
sleep(5);
curlGet($_SERVER['HTTP_HOST'] . $_SERVER['SCRIPT_NAME'] . '?i=' . ($i + 1));
curlGet($_SERVER['HTTP_HOST'] . $_SERVER['SCRIPT_NAME'] . '?i=' . ($i + 1));
Here is my own PHP function that I use when I POST to a specific URL of any page....
Sample usage of my function:
<?php
parse_str("email=myemail@ehehehahaha.com&subject=this is just a test");
$_POST['email']=$email;
$_POST['subject']=$subject;
echo HTTP_POST("http://example.com/mail.php",$_POST);
exit;
?>
<?php
/*********HTTP POST using FSOCKOPEN **************/
// by ArbZ
function HTTP_Post($URL,$data, $referrer="") {
// parsing the given URL
$URL_Info=parse_url($URL);
// Building referrer
if($referrer=="") // if not given use this script as referrer
$referrer=$_SERVER["SCRIPT_URI"];
// making string from $data
foreach($data as $key=>$value)
$values[]="$key=".urlencode($value);
$data_string=implode("&",$values);
// Find out which port is needed - if not given use standard (=80)
if(!isset($URL_Info["port"]))
$URL_Info["port"]=80;
// building POST-request: HTTP_HEADERs
$request.="POST ".$URL_Info["path"]." HTTP/1.1\n";
$request.="Host: ".$URL_Info["host"]."\n";
$request.="Referer: $referer\n";
$request.="Content-type: application/x-www-form-urlencoded\n";
$request.="Content-length: ".strlen($data_string)."\n";
$request.="Connection: close\n";
$request.="\n";
$request.=$data_string."\n";
$fp = fsockopen($URL_Info["host"],$URL_Info["port"]);
fputs($fp, $request);
$result = "";
while(!feof($fp)) {
$result .= fgets($fp, 128);
}
fclose($fp); //$eco = nl2br();
function getTextBetweenTags($string, $tagname) {
$pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
preg_match($pattern, $string, $matches);
return $matches[1];
}
//STORE THE FETCHED CONTENTS to a VARIABLE, because its way better and fast...
$str = $result;
$txt = getTextBetweenTags($str, "span"); $eco = $txt; $result = explode("&",$result);
return $result[1];
}
ReactPHP async http client
https://github.com/shuchkin/react-http-client
Install via Composer
$ composer require shuchkin/react-http-client
Async HTTP GET
// get.php
$loop = \React\EventLoop\Factory::create();
$http = new \Shuchkin\ReactHTTP\Client( $loop );
$http->get( 'https://tools.ietf.org/rfc/rfc2068.txt' )->then(
function( $content ) {
echo $content;
},
function ( \Exception $ex ) {
echo 'HTTP error '.$ex->getCode().' '.$ex->getMessage();
}
);
$loop->run();
Run php in CLI-mode
$ php get.php
Symfony HttpClient is asynchronous https://symfony.com/doc/current/components/http_client.html.
For example you can
use Symfony\Component\HttpClient\HttpClient;
$client = HttpClient::create();
$response1 = $client->request('GET', 'https://website1');
$response2 = $client->request('GET', 'https://website1');
$response3 = $client->request('GET', 'https://website1');
//these 3 calls will return immediately
//but the requests will fire to the website1 webserver
$response1->getContent(); //this will block until content is fetched
$response2->getContent(); //same
$response3->getContent(); //same
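If you want to react to responses as they arrive rather than blocking on getContent(), the component also offers a stream() method; a brief sketch:
foreach ($client->stream([$response1, $response2, $response3]) as $response => $chunk) {
    if ($chunk->isLast()) {
        // this response is complete; the others may still be in flight
        $content = $response->getContent();
    }
}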
Well, the timeout can be set in milliseconds; see "CURLOPT_CONNECTTIMEOUT_MS" in http://www.php.net/manual/en/function.curl-setopt.