I've been reading up on multi-threading with PHP, but I'm having a tough time integrating it into my command line php script.
I read multithreading and multithread foreach, but I'm really not sure. Any thoughts on how to apply multi-threading here? The reason I need multi-threading is that telnet takes forever (see the shell script). But I can't write to my DB concurrently ($stmt2). I'm looping through my list of devices with $stmt->fetch.
Maybe I should run a task with just the telnet/shell script call inside it, like this example:
$task = new class extends Thread {
private $response;
public function run()
{
$content = file_get_contents("http://google.com");
preg_match("~<title>(.+)</title>~", $content, $matches);
$this->response = $matches[1];
}
};
$task->start() && $task->join();
var_dump($task->response); // string(6) "Google"
But I'm getting this error when I try to add it to my code below:
PHP Parse error: syntax error, unexpected T_CLASS in /opt/IBM/custom/NAC_Dslam/calix_swVerThreaded.php on line 100
This is the line:
$task = new class ...
My script looks like this:
$stmt =$mysqli->prepare("SELECT ip, model FROM TableD WHERE vendor = 'Calix' AND model in ('C7','E7') AND sw_ver IS NULL LIMIT 6000"); //AND ping_reply IS NULL AND software_version IS NULL
$stmt->bind_result($ip, $model); //list of ip's
if(!$stmt->execute())
{
//err
}
$stmt2 = $mysqli2->prepare("UPDATE TableD SET sw_ver = ?
WHERE vendor = 'Calix'
AND ip = ? ");
$stmt2->bind_param("ss", $software, $ip);
while($stmt->fetch()) {
//initializing var's
if(pingAddress($ip)=="alive") { //Ones that don't ping are dead to us.
///////this is the part that takes forever and should be multi-threaded/////
//Call shell script to telnet to calix dslam and get version for that ip
if($model == "C7"){
$task = new class extends Thread {
private $itsOutput;
public function run()
{
exec ("./calix_C7_swVer.sh $ip", $itsOutput);//takes forever/telnet
//in shell script. Can't
//be fixed. Each time I
//call this script it's a
//different ip
}
};
$task->start() && $task->join();
var_dump($task->itsOutput); //should be returned output above //takes forever to telnet
//$output = $task->itsOutput;
$output2=array_reverse($output,true);
if (!(preg_grep("/DENY/", $output2))){
$found = preg_grep("/COMPLD/", $output2);
$ind = key($found);
$version = explode(",", $output[$ind+1]);
if(strlen($version[3])>=1) { //if sw ver came back in an acceptable size
$software = $version[3];
$software = trim($software,'"'); //trim double quote (usually is there)
print "sw ver after trim: " . $software . "\n";
if(!$stmt2->execute()) { //write sw version to netcool db
$tempErr = "Failed to insert into dslam_elements_nac: " . $stmt2->error;
printf($tempErr . "\n"); //show mysql execute error if exists
$err->logThis($tempErr);
}
if(!$stmtX->execute()) { //commit it
$tempErr = "Failed to commit dslam_elements_nac: " . $stmtX->error;
printf($tempErr . "\n"); //show mysql execute error if exists
$err->logThis($tempErr);
}
} //we got a version back
else { //version not retrieved
//error processing
} //didn't get sw ver
} //not deny
} //c7
else if($model == "E7") {
exec ("./calix_E7_swVer.sh $ip", $output);
$output2=array_reverse($output,true);
if (!(preg_grep("/DENY/", $output2))){
$found = preg_grep("/yes/", $output2);
$ind = key($found);
$version = explode(" ", $output[$ind]);
if(strlen($version[5])>=1) { //if sw ver came back in an acceptable size
$software = $version[5];
print "sw ver after trim: " . $software . "\n";
if(!$stmt2->execute()) { //write sw version to netcool db
$tempErr = "Failed to insert into dslam_elements_nac: " . $stmt2->error;
printf($tempErr . "\n"); //show mysql execute error if exists
$err->logThis($tempErr);
}
if(!$stmtX->execute()) { //commit it
//err processing
}
} //we got a version back
else { //version not retrieved
//handle it
} //didn't get sw ver
} //not deny
}
} //while
Update:
I'm trying this (pcntl_fork), but it doesn't seem to be quite what I need, because when I sleep(30), which I think is comparable to my shell script call, the other processes don't continue on to the next item.
<?php
declare(ticks = 1);
$max=10;
$child=0;
$res = array("aabc", "bcd", "cde", "eft", "ggg", "hhh", "iii", "jjj", "kkk", "lll", "mmm", "nnn", "ooo", "ppp", "qqq", "aabc", "bcd", "cde", "eft", "ggg", "hhh", "iii", "jjj", "kkk", "lll", "mmm", "nnn", "ooo", "ppp", "qqq");
function sig_handler($signo) {
global $child;
switch ($signo) {
case SIGCHLD:
//echo "SIGCHLD receivedn";
// clean up zombies
$pid = pcntl_waitpid(-1, $status, WNOHANG);
$child -= 1;
//exit;
}
}
pcntl_signal(SIGCHLD, "sig_handler");
//$website_scraper = new scraper();
foreach($res as $r){
while ($child >= $max) {
sleep(5); //echo " - sleep $child\n";
//pcntl_waitpid(0,$status);
}
$child++;
$pid=pcntl_fork();
if ($pid==-1) {
die("Could not fork:n");
}
elseif ($pid) {
// we're in the parent fork, don't do anything
}
else {
//example of what a child process could do:
print "child process stuff \n";
sleep(30);
//$website_scraper -> scraper("http://foo.com");
exit;
}
while(pcntl_waitpid(0, $status) != -1) { //////???
$status = pcntl_wexitstatus($status);
echo "child $status completed \n";
}
print "did stuff \n";
}
?>
I've been reading up on multi-threading with PHP
Don't. PHP threading has very limited utility, as it cannot be used in a web server environment. It can only be used in command-line scripts.
The author of the PHP pthreads extension has written:
pthreads v3 is restricted to operating in CLI only: I have spent many years trying to explain that threads in a web server just don't make sense, after 1,111 commits to pthreads I have realised that, my advice is going unheeded.
So I'm promoting the advice to hard and fast fact: you can't use pthreads safely and sensibly anywhere but CLI.
If you need to communicate with multiple network devices in parallel, consider using stream_select to perform asynchronous I/O, or running multiple PHP processes as part of a worker queue to manage the connections.
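For the DSLAM problem in the question, one way to run the slow shell scripts in parallel without threads is to launch several of them at once with proc_open() and keep every database write in the parent, so $stmt2 is never used concurrently. The sketch below is untested and assumes the same ./calix_C7_swVer.sh script from the question; the output parsing and $stmt2->execute() calls would go where the comment indicates.
<?php
// Sketch: run the slow telnet scripts in parallel with proc_open() and
// keep all mysqli work in the parent process.
$maxParallel = 10;
$pending = $ips;        // assumed: the list of IPs fetched by the SELECT
$running = array();     // ip => array('proc' => resource, 'pipe' => stdout, 'out' => buffer)

while ($pending || $running) {
    // launch children until the limit is reached
    while ($pending && count($running) < $maxParallel) {
        $ip = array_shift($pending);
        $proc = proc_open('./calix_C7_swVer.sh ' . escapeshellarg($ip),
                          array(1 => array('pipe', 'w')), $pipes);
        if (is_resource($proc)) {
            stream_set_blocking($pipes[1], false);
            $running[$ip] = array('proc' => $proc, 'pipe' => $pipes[1], 'out' => '');
        }
    }
    // poll the children and harvest the ones that have finished
    foreach ($running as $ip => $child) {
        $running[$ip]['out'] .= stream_get_contents($child['pipe']);
        $status = proc_get_status($child['proc']);
        if (!$status['running']) {
            fclose($child['pipe']);
            proc_close($child['proc']);
            $output = explode("\n", $running[$ip]['out']);
            // parse $output and call $stmt2->execute() here, in the parent,
            // so the DB connection is never shared between processes
            unset($running[$ip]);
        }
    }
    usleep(100000); // 0.1 s between polls
}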
I'm creating a button on my web page. When someone presses this button, I want to execute a series of Linux commands on my first server (like "cd file", "./file_to_execute", ...). When those commands are done and finished, I want to connect to another server over SSH and execute some other commands there.
The problem is: how can I know that the first set of commands has already finished, so I can proceed to the second part, which is connecting to the other server?
To summarize:
First step: connect to the first server and execute some commands.
=> when these commands are done (the part I don't know how to do)
Second step: connect to another server and execute some other commands.
I'm also looking for a way to add a pop-up that informs the user of my web page that the first step has finished and the second has started.
<?php
$hostname = '192.177.0.252';
$username = 'pepe';
$password = '*****';
$commande = 'cd file && ./file_one.sh';
if (false === $connection_first = ssh2_connect($hostname, 22)) {
echo 'failed<br />';
exit();
}
else {
echo 'Connected<br />';
}
if (false === ssh2_auth_password($connection_first, $username, $password)) {
echo 'failed<br />';
exit();
}
else {
echo 'done !<br />';
}
if (false === $stream = ssh2_exec($connection_first, $commande)) {
echo "error<br />";
}
?>
Thanks
PS: sorry for my English, I'm from Barcelona
To handle cases where an exception occurs, I would recommend using a try/catch statement, like the one below:
try {
echo inverse(5) . "\n";
echo inverse(0) . "\n";
} catch (Exception $e) {
echo 'Caught exception: ', $e->getMessage(), "\n";
}
When you need to know when commands have finished, there are a few ways to achieve this. You can either set a boolean to true after the command has been executed (like what you are already doing), or you can capture the command's output by writing it to a text file and then echoing that file's contents. See the code below:
exec ('/usr/bin/flush-cache-mage > /tmp/.tmp-mxadmin');
$out = nl2br(file_get_contents('/tmp/.tmp-mxadmin'));
echo $out;
At this point you can create conditions based off of what is returned in the $out variable.
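Another option specific to ssh2_exec() is to make the command's stream blocking and read it to the end; stream_get_contents() will only return once the remote command has exited, so at that point it is safe to start the second connection. A rough sketch on top of the code from the question (the second host and command are placeholders):
$stream = ssh2_exec($connection_first, 'cd file && ./file_one.sh');
stream_set_blocking($stream, true);        // block until the remote command finishes
$output = stream_get_contents($stream);    // returns after ./file_one.sh has exited
fclose($stream);
echo 'First step finished<br />';          // here you could notify the user

// second step: connect to the other server and run the next commands
$connection_second = ssh2_connect('192.177.0.253', 22);               // placeholder host
ssh2_auth_password($connection_second, $username, $password);
$stream2 = ssh2_exec($connection_second, './other_command.sh');        // placeholder command
stream_set_blocking($stream2, true);
echo stream_get_contents($stream2);
fclose($stream2);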
I have an issue that's stumped me.
I'm trying to automate a CLI login to a router and run some commands obtained via a webpage. However I don't know if the router has telnet or SSH enabled (might be one,the other, or both) and I have a list of possible username/password combos that I need to try to gain access.
Oh, and I can't change either the protocol type or the credentials on the device, so that's not really an option.
I was able to figure out how to log in to a router with a known protocol and login credentials and run the necessary commands (included below), but I don't know if I should use an if/else block to work through the telnet/ssh decisions, or if a switch statement might be better. Would using Expect inside PHP be an easier way to go?
function tunnelRun($commands,$user,$pass, $yubi){
$cpeIP = "1.2.3.4";
$commands_explode = explode("\n", $commands);
$screenOut = "";
$ssh = new Net_SSH2('router_jumphost');
if (!$ssh->login($user, $pass . $yubi)) {
exit('Login Failed');
}
$ssh->setTimeout(2);
$ssh->write("ssh -l username $cpeIP\n");
$ssh->read("assword:");
$ssh->write("password\n");
$ssh->read("#");
$ssh->write("\n");
$cpePrompt = $ssh->read('/.*[#|>]/', NET_SSH2_READ_REGEX);
$cpePrompt = str_replace("\n", '', trim($cpePrompt));
$ssh->write("config t\n");
foreach ($commands_explode as $i) {
$ssh->write("$i\n"); // note the "\n"
$ssh->setTimeout(2);
$screenOut .= $ssh->read();
}
$ssh->write("end\n");
$ssh->read($cpePrompt);
$ssh->write("exit\n");
echo "Router Update completed! Results below:<br><br>";
echo "<div id=\"text_out\"><textarea style=\" border:none; width: 700px;\" rows=\"20\">".$screenOut."</textarea></div>";
Update:
The solution I went with was a while/switch loop. I would have gone the Expect route, but I kept running into issues getting the Expect module integrated into PHP on my server (a Windows box). If I had been using a Unix/Linux server, Expect would have been the simplest way to achieve this.
I just made it into a working demo for now, so there are a lot of variations missing from the case statements still, and error handling still needs to be figured out, but the basic idea is there. I still want to move the preg_match statements around a bit more to do the matching at the top of the while loop (so I don't spam the whole case section with different preg_match lines), but that may prove to be more work than I want for now. Hope this might help someone else trying to do the same!
<?php
include('Net/SSH2.php');
define('NET_SSH2_LOGGING', NET_SSH2_LOG_COMPLEX);
ini_set('display_errors', 1);
$conn = new Net_SSH2('somewhere.outthere.com');
if (!$conn->login($user, $pass . $yubi)) {
exit('Login Failed');
}
$prompt = "Testing#";
$conn->setTimeout(2);
$conn->write("PS1=\"$prompt\"");
$conn->read();
$conn->write("\n");
$screenOut = $conn->read();
//echo "$screenOut is set on the shell<br><br>";
echo $login_db[3][0]. " ". $login_db[3][1];
$logged_in = false;
$status = "SSH";
$status_prev = "";
$login_counter = 0;
while (!$logged_in && $login_counter <=3) {
switch ($status) {
case "Telnet":
break;
case "SSH":
$conn->write("\n");
$conn->write("ssh -l " . $login_db[$login_counter][0] . " $cpeIP\n");
$status_prev = $status;
$status = $conn->read('/\n([.*])$/', NET_SSH2_READ_REGEX);
break;
case (preg_match('/Permission denied.*/', $status) ? true : false):
$conn->write(chr(3)); //Sends Ctrl+C
$status = $conn->read();
if (strstr($status, "Testing#")) {
$status = "SSH";
$login_counter++;
break;
} else {
break 2;
}
case (preg_match('/[pP]assword:/', $status) ? true : false):
$conn->write($login_db[$login_counter][1] . "\n");
$status_prev = $status;
$status = $conn->read('/\n([.*])$/', NET_SSH2_READ_REGEX);
break;
case (preg_match('/yes\/no/', $status) ? true : false):
$conn->write("yes\n");
$status_prev = $status;
$status = $conn->read('/\n([.*])$/', NET_SSH2_READ_REGEX);
break;
case (preg_match('/(^[a-zA-Z0-9_]+[#]$)|(>)/', $status,$matches) ? true : false):
$conn->write("show version\n");
$status = $conn->read(">");
if(preg_match('/ADTRAN|Adtran|Cisco/', $status)? true:false){
$logged_in = true;
break;
}
default:
echo "<br>Something done messed up! Exiting";
break 2;
}
//echo "<pre>" . $conn->getLog() . "</pre>";
}
if ($logged_in === true) {
echo "<br> Made it out of the While loop cleanly";
} else {
echo "<br> Made it out of the While loop, but not cleanly";
}
echo "<pre>" . $conn->getLog() . "</pre>";
$conn->disconnect();
echo "disconnected cleanly";
?>
If statements might make your code unreadable.
In that case I would suggest using switch-case blocks,
since switch-case lets you write clearer code and handle exceptional values more cleanly.
Using Expect in php is simple:
<?php
ini_set("expect.loguser", "Off");
$stream = fopen("expect://ssh root#remotehost uptime", "r");
$cases = array (
array (0 => "password:", 1 => PASSWORD)
);
switch (expect_expectl ($stream, $cases)) {
case PASSWORD:
fwrite ($stream, "password\n");
break;
default:
die ("Error was occurred while connecting to the remote host!\n");
}
while ($line = fgets($stream)) {
print $line;
}
fclose ($stream);
?>
There are some complications using the expect:// wrapper. If it were me, I'd just go for a simple socket connection for telnet and poll for the prompts (with a timeout) if the ssh connection fails.
On a casual inspection, the telnet client here seems to be sensibly written - and with a bit of renaming could provide the same interface as the ssh2 client extension (apart from the connect bit).
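A bare-bones version of that idea, using fsockopen() and a per-read timeout, might look like this (the prompts, credentials and IP are only examples, and a real telnet client would also have to handle option negotiation, which this sketch ignores):
<?php
// Sketch: raw telnet fallback; poll for expected prompts with a timeout.
function read_until($fp, $pattern, $timeoutSec = 5) {
    $buffer = '';
    $deadline = time() + $timeoutSec;
    stream_set_timeout($fp, 1);                 // 1 s per read attempt
    while (time() < $deadline) {
        $chunk = fread($fp, 1024);
        if ($chunk !== false && $chunk !== '') {
            $buffer .= $chunk;
            if (preg_match($pattern, $buffer)) {
                return $buffer;                 // prompt found
            }
        }
    }
    return false;                               // prompt never appeared within the timeout
}

$fp = fsockopen('1.2.3.4', 23, $errno, $errstr, 5);
if (!$fp) {
    die("telnet connect failed: $errstr\n");
}
if (read_until($fp, '/sername:/') !== false) {
    fwrite($fp, "admin\r\n");                   // example credentials
}
if (read_until($fp, '/assword:/') !== false) {
    fwrite($fp, "secret\r\n");
}
if (read_until($fp, '/[#>]\s*$/') !== false) {
    fwrite($fp, "show version\r\n");
    echo read_until($fp, '/[#>]\s*$/');
}
fclose($fp);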
How can you mimic a command line run of a script with arguments inside a PHP script? Or is that not possible?
In other words, let's say you have the following script:
#!/usr/bin/php
<?php
require "../src/php/whatsprot.class.php";
function fgets_u($pStdn) {
$pArr = array($pStdn);
if (false === ($num_changed_streams = stream_select($pArr, $write = NULL, $except = NULL, 0))) {
print("\$ 001 Socket Error : UNABLE TO WATCH STDIN.\n");
return FALSE;
} elseif ($num_changed_streams > 0) {
return trim(fgets($pStdn, 1024));
}
}
$nickname = "WhatsAPI Test";
$sender = ""; // Mobile number with country code (but without + or 00)
$imei = ""; // MAC Address for iOS IMEI for other platform (Android/etc)
$countrycode = substr($sender, 0, 2);
$phonenumber=substr($sender, 2);
if ($argc < 2) {
echo "USAGE: ".$_SERVER['argv'][0]." [-l] [-s <phone> <message>] [-i <phone>]\n";
echo "\tphone: full number including country code, without '+' or '00'\n";
echo "\t-s: send message\n";
echo "\t-l: listen for new messages\n";
echo "\t-i: interactive conversation with <phone>\n";
exit(1);
}
$dst=$_SERVER['argv'][2];
$msg = "";
for ($i=3; $i<$argc; $i++) {
$msg .= $_SERVER['argv'][$i]." ";
}
echo "[] Logging in as '$nickname' ($sender)\n";
$wa = new WhatsProt($sender, $imei, $nickname, true);
$url = "https://r.whatsapp.net/v1/exist.php?cc=".$countrycode."&in=".$phonenumber."&udid=".$wa->encryptPassword();
$content = file_get_contents($url);
if(stristr($content,'status="ok"') === false){
echo "Wrong Password\n";
exit(0);
}
$wa->Connect();
$wa->Login();
if ($_SERVER['argv'][1] == "-i") {
echo "\n[] Interactive conversation with $dst:\n";
stream_set_timeout(STDIN,1);
while(TRUE) {
$wa->PollMessages();
$buff = $wa->GetMessages();
if(!empty($buff)){
print_r($buff);
}
$line = fgets_u(STDIN);
if ($line != "") {
if (strrchr($line, " ")) {
// needs PHP >= 5.3.0
$command = trim(strstr($line, ' ', TRUE));
} else {
$command = $line;
}
switch ($command) {
case "/query":
$dst = trim(strstr($line, ' ', FALSE));
echo "[] Interactive conversation with $dst:\n";
break;
case "/accountinfo":
echo "[] Account Info: ";
$wa->accountInfo();
break;
case "/lastseen":
echo "[] Request last seen $dst: ";
$wa->RequestLastSeen("$dst");
break;
default:
echo "[] Send message to $dst: $line\n";
$wa->Message(time()."-1", $dst , $line);
break;
}
}
}
exit(0);
}
if ($_SERVER['argv'][1] == "-l") {
echo "\n[] Listen mode:\n";
while (TRUE) {
$wa->PollMessages();
$data = $wa->GetMessages();
if(!empty($data)) print_r($data);
sleep(1);
}
exit(0);
}
echo "\n[] Request last seen $dst: ";
$wa->RequestLastSeen($dst);
echo "\n[] Send message to $dst: $msg\n";
$wa->Message(time()."-1", $dst , $msg);
echo "\n";
?>
To run this script, you are meant to go to the Command Line, down to the directory the file is in, and then type in something like php -s "whatsapp.php" "Number" "Message".
But what if I wanted to bypass the Command Line altogether and do that directly inside the script so that I can run it at any time from my Web Server, how would I do that?
First off, you should be using getopt.
In PHP it supports both short and long formats.
Usage demos are documented at the page I've linked to. In your case, I suspect you'll have difficulty detecting whether a <message> was included as your -s tag's second parameter. It will probably be easier to make the message a parameter for its own option.
$options = getopt("ls:m:i:");
if (isset($options["s"]) && !isset($options["m"])) {
die("-s needs -m");
}
As for running things from a web server ... well, you pass variables to a command line PHP script using getopt() and $argv, but you pass variables from a web server using $_GET and $_POST. If you can figure out a sensible way to map $_GET variables to your command line options, you should be good to go.
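For instance, if the web entry point only needs the "-s <phone> <message>" behaviour, a minimal mapping could look like this ('phone' and 'message' are just example parameter names):
<?php
// Sketch: feed the same logic from $_GET instead of getopt()/$argv.
$phone   = isset($_GET['phone'])   ? preg_replace('/\D/', '', $_GET['phone']) : '';
$message = isset($_GET['message']) ? trim($_GET['message'])                   : '';

if ($phone === '' || $message === '') {
    http_response_code(400);
    exit('phone and message are required');
}

// from here on, run the same code the CLI "-s" branch would run,
// using $phone where the script used $dst and $message where it used $msg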
Note that a variety of other considerations exist when taking a command line script and running it through a web server. Permission and security go hand in hand, usually as inverse functions of each other. That is, if you open up permissions so that the script is allowed to do what it needs, you may expose or even create vulnerabilities on your server. I don't recommend you do this unless you're more experienced, or you don't mind if things break or get attacked by script kiddies out to 0wn your server.
You're looking for backticks, see
http://php.net/manual/en/language.operators.execution.php
Or you can use shell_exec()
http://www.php.net/manual/en/function.shell-exec.php
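For example, something along these lines could kick off the script from web-facing code (the script path is a placeholder, and arguments must always be escaped):
// Sketch: run the CLI script from the web server and capture its output.
$phone   = escapeshellarg($_GET['phone']);
$message = escapeshellarg($_GET['message']);
$output  = shell_exec("php /path/to/whatsapp.php -s $phone $message 2>&1");
echo '<pre>' . htmlspecialchars($output) . '</pre>';

// or, equivalently, with backticks:
$output = `php /path/to/whatsapp.php -l 2>&1`;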
Hi everyone once again!
We need some help to develop and implement multi-curl functionality in our crawler. We have a huge array of "links to be scanned" and we loop through them with a foreach.
Let's use some pseudo code to understand the logic:
1) While ($links_to_be_scanned > 0).
2) Foreach ($links_to_be_scanned as $link_to_be_scanned).
3) Scan_the_link() and run some other functions.
4) Extract the new links from the xdom.
5) Push the new links into $links_to_be_scanned.
6) Push the current link into $links_already_scanned.
7) Remove the current link from $links_to_be_scanned.
Now, we need to define a maximum number of parallel connections and be able to run this process for each link in parallel.
I understand that we're gonna have to create a $links_being_scanned or some kind of queue.
I'm really not sure how to approach this problem to be honest, if anyone could provide some snippet or idea to solve it, it would be greatly appreciated.
Thanks in advance!
Chris;
Extended:
I just realized that the tricky part is not multi-curl itself, but the amount of operations done with each link after the request.
Even with multi-curl, I would eventually have to find a way to run all these operations in parallel. The whole algorithm described below would have to run in parallel.
So now rethinking, we would have to do something like this:
While (There's links to be scanned)
Foreach ($Link_to_scann as $link)
If (There's less than 10 scanners running)
Launch_a_new_scanner($link)
Remove the link from $links_to_be_scanned array
Push the link into $links_on_queue array
Endif;
And each scanner does (This should be run in parallel):
Create an object with the given link
Send a curl request to the given link
Create a dom and an Xdom with the response body
Perform other operations over the response body
Remove the link from the $links_on_queue array
Push the link into the $links_already_scanned array
I assume we could approach this by creating a new PHP file with the scanner algorithm, and using pcntl_fork() for each parallel process?
Since even using multi-curl, I would eventually end up waiting in a regular foreach loop for the other processes.
I assume I would have to approach this using fsockopen or pcntl_fork.
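Roughly, the forking part I have in mind looks like this (completely untested, just to illustrate the idea; I realize the children can't push newly found links back into the parent's array this way, which is part of what I'm asking about):
$maxScanners = 10;
$activeChildren = 0;

while (count($links_to_be_scanned) > 0) {
    // wait until a scanner slot is free
    while ($activeChildren >= $maxScanners) {
        pcntl_waitpid(-1, $status);     // blocks until one child exits
        $activeChildren--;
    }

    $link = array_shift($links_to_be_scanned);
    $links_on_queue[] = $link;

    $pid = pcntl_fork();
    if ($pid === -1) {
        die("could not fork\n");
    } elseif ($pid === 0) {
        // child: one scanner per link (curl request, dom/xdom, other operations)
        Launch_a_new_scanner($link);
        exit(0);
    }
    $activeChildren++;                  // parent just keeps looping
}

// reap the remaining children
while ($activeChildren-- > 0) {
    pcntl_waitpid(-1, $status);
}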
Suggestions, comments, partial solutions, and even a "good luck" will be more than appreciated!
Thanks a lot!
DISCLAIMER: This answer links an open-source project with which I'm involved. There. You've been warned.
The Artax HTTP client is a socket-based HTTP library that (among other things) offers custom control over the number of concurrent open socket connections to individual hosts while making multiple asynchronous HTTP requests.
Limiting the number of concurrent connections is easily accomplished. Consider:
<?php
use Artax\Client, Artax\Response;
require dirname(__DIR__) . '/autoload.php';
$client = new Client;
// Defaults to max of 8 concurrent connections per host
$client->setOption('maxConnectionsPerHost', 2);
$requests = array(
'so-home' => 'http://stackoverflow.com',
'so-php' => 'http://stackoverflow.com/questions/tagged/php',
'so-python' => 'http://stackoverflow.com/questions/tagged/python',
'so-http' => 'http://stackoverflow.com/questions/tagged/http',
'so-html' => 'http://stackoverflow.com/questions/tagged/html',
'so-css' => 'http://stackoverflow.com/questions/tagged/css',
'so-js' => 'http://stackoverflow.com/questions/tagged/javascript'
);
$onResponse = function($requestKey, Response $r) {
echo $requestKey, ' :: ', $r->getStatus();
};
$onError = function($requestKey, Exception $e) {
echo $requestKey, ' :: ', $e->getMessage();
};
$client->requestMulti($requests, $onResponse, $onError);
IMPORTANT: In the above example the Client::requestMulti method is making all the specified requests asynchronously. Because the per-host concurrency limit is set to 2, the client will open up new connections for the first two requests and subsequently reuse those same sockets for the other requests, queuing requests until one of the two sockets becomes available.
You could try something like this; I haven't checked it, but you should get the idea:
$request_pool = array();
function CreateHandle($url) {
$handle = curl_init($url);
// set curl options here
return $handle;
}
function Process($data) {
global $request_pool;
// do something with data
array_push($request_pool , CreateHandle($some_new_url));
}
function RunMulti() {
global $request_pool;
$multi_handle = curl_multi_init();
$active_request_pool = array();
$running = 0;
$active_request_count = 0;
$active_request_max = 10; // adjust as necessary
do {
$waiting_request_count = count($request_pool);
while(($active_request_count < $active_request_max) && ($waiting_request_count > 0)) {
$request = array_shift($request_pool);
curl_multi_add_handle($multi_handle , $request);
$active_request_pool[(int)$request] = $request;
$waiting_request_count--;
$active_request_count++;
}
curl_multi_exec($multi_handle , $running);
curl_multi_select($multi_handle);
while($info = curl_multi_info_read($multi_handle)) {
$curl_handle = $info['handle'];
call_user_func('Process' , curl_multi_getcontent($curl_handle));
curl_multi_remove_handle($multi_handle , $curl_handle);
curl_close($curl_handle);
$active_request_count--;
}
} while($active_request_count > 0 || $waiting_request_count > 0);
curl_multi_close($multi_handle);
}
You should look for a more robust solution to your problem, such as a proper job queue.
RabbitMQ is a very good solution that I have used. There is also Gearman, but which one you pick is your choice.
I prefer RabbitMQ.
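Whichever queue you choose, the shape is the same: the crawler submits each discovered link as a job and a pool of workers does the fetching and parsing. A rough Gearman-style sketch (the function name 'scan_link' and the server address are only examples):
// Client side: enqueue every link as a background job.
$client = new GearmanClient();
$client->addServer('127.0.0.1');
foreach ($links_to_be_scanned as $link) {
    $client->doBackground('scan_link', $link);
}

// Worker side: run as many of these processes as you want parallel scanners.
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1');
$worker->addFunction('scan_link', function (GearmanJob $job) {
    $link = $job->workload();
    // curl request, DOM/Xdom parsing, submitting newly found links as new jobs, etc.
});
while ($worker->work());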
I will share the code I used to collect email addresses from a certain website.
You can modify it to fit your needs.
There were some problems with relative URLs there,
and I do not use cURL here.
<?php
error_reporting(E_ALL);
$home = 'http://kharkov-reklama.com.ua/jborudovanie/';
$writer = new RWriter('C:\parser_13-09-2012_05.txt');
set_time_limit(0);
ini_set('memory_limit', '512M');
function scan_page($home, $full_url, &$writer) {
static $done = array();
$done[] = $full_url;
// Scan only internal links. Do not scan all the internet!))
if (strpos($full_url, $home) === false) {
return false;
}
$html = @file_get_contents($full_url);
if (empty($html) || (strpos($html, '<body') === false && strpos($html, '<BODY') === false)) {
return false;
}
echo $full_url . '<br />';
preg_match_all('/([A-Za-z0-9_\-]+\.)*[A-Za-z0-9_\-]+@([A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]\.)+[A-Za-z]{2,4}/', $html, $emails);
if (!empty($emails) && is_array($emails)) {
foreach ($emails as $email_group) {
if (is_array($email_group)) {
foreach ($email_group as $email) {
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
$writer->write($email);
}
}
}
}
}
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
preg_match_all("/$regexp/siU", $html, $matches, PREG_SET_ORDER);
if (is_array($matches)) {
foreach($matches as $match) {
if (!empty($match[2]) && is_scalar($match[2])) {
$url = $match[2];
if (!filter_var($url, FILTER_VALIDATE_URL)) {
$url = $home . $url;
}
if (!in_array($url, $done)) {
scan_page($home, $url, $writer);
}
}
}
}
}
class RWriter {
private $_fh = null;
private $_written = array();
public function __construct($fname) {
$this->_fh = fopen($fname, 'w+');
}
public function write($line) {
if (in_array($line, $this->_written)) {
return;
}
$this->_written[] = $line;
echo $line . '<br />';
fwrite($this->_fh, "{$line}\r\n");
}
public function __destruct() {
fclose($this->_fh);
}
}
scan_page($home, 'http://kharkov-reklama.com.ua/jborudovanie/', $writer);
I'm having a problem with gearman workers running on multiple servers which I can't seem to solve.
The problem occurs when a worker server is taken offline, rather than the worker process being cancelled, and causes all other worker processes to error and fail.
Example with just 1 client and 2 workers -
Client:
$client = new GearmanClient ();
$client->addServer ('192.168.1.200');
$client->addServer ('192.168.1.201');
$job = $client->do ('generate_tile', serialize ($arrData));
Worker:
$worker = new GearmanWorker ();
$worker->addServer ('192.168.1.200');
$worker->addServer ('192.168.1.201');
$worker->addFunction ('generate_tile', 'generate_tile');
while (1)
{
if (!$worker->work ())
{
switch ($worker->returnCode ())
{
default:
echo "Error: " . $worker->returnCode () . ': ' . $worker->error () . "\n";
break;
}
}
}
function generate_tile ($job) { ... }
The worker code is being run on 2 separate servers. When every server is up and running both workers execute jobs as expected. When one of the worker processes is cancelled, the other worker executes all jobs as expected.
However, when the server with the cancelled worker process is shutdown and taken completely offline, requests to the client script hang and the remaining worker process does not pick up any jobs.
I get the following set of errors from the remaining worker process:
Error: 46: gearman_con_wait:timeout reached
Error: 46: gearman_con_wait:timeout reached
Error: 4: gearman_con_flush:write:110
Error: 46: gearman_con_wait:timeout reached
Error: 4: gearman_con_flush:write:113
Error: 4: gearman_con_flush:write:113
Error: 4: gearman_con_flush:write:113
....
When i start-up the other server, not starting the worker process on it, the remaining worker process immediately jumps into life and executes any remaining jobs.
It seems clear to me that I need some code in the worker process to cope with any servers that may be offline, however I cannot see how to do this.
Many thanks,
Andy
Our tests with multiple gearman servers shows that if the last server in the list (192.168.1.201 in your case) is taken down, the workers stop executing the way you are describing. (Also, the workers grab jobs from the last server. They process jobs on .200 only if on .201 there are no jobs).
It seems that this is a bug with the linked list in the gearman server, which has been reported as fixed multiple times, but with all available versions of gearman the bug persists. Sorry, I know that's not a solution, but we had the same problem and didn't find one. (If someone can provide a working solution to this problem, I'd gladly give a large bounty.)
Further to @Darhazer's comment above: we found that as well and solved it like this:
// Gearman workers show a strong preference for servers at the end of a list so randomize the order
$worker = new GearmanWorker();
$s2 = explode(",", Configure::read('workers.servers'));
shuffle($s2);
$servers = implode(",", $s2);
$worker->addServers($servers);
We run 6 to 10 workers at any time, and expire them after they've completed x requests.
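Expiring a worker after x requests is just a counter around the work loop; a sketch (assuming something like supervisord restarts the process afterwards):
$maxJobs  = 100;    // recycle the worker after this many jobs
$jobsDone = 0;

while ($worker->work()) {
    if ($worker->returnCode() !== GEARMAN_SUCCESS) {
        echo 'Error: ' . $worker->error() . "\n";
        continue;
    }
    if (++$jobsDone >= $maxJobs) {
        exit(0);    // let the process supervisor start a fresh worker
    }
}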
I use this class, which keeps track of which jobs work on which servers. It hasn't been thoroughly tested; I just wrote it now. I've pasted an edited version, so there might be a typo or some such, but otherwise it appears to solve the issue.
<?php
class MyGearmanClient {
static $server = "server1,server2,server3";
static $server_array = false;
static $workingServers = false;
static $gmclient = false;
static $timeout = 5000;
static $defaultTimeout = 5000;
static function randomServer() {
return self::$server_array[rand(0, count(self::$server_array) -1)];
}
static function getServer($job = false) {
if (self::$server_array == false) {
self::$server_array = explode(",", self::$server);
self::$workingServers = array();
}
$serverList = array();
if ($job) {
if (array_key_exists($job, self::$workingServers)) {
foreach (self::$server_array as $server) {
if (array_key_exists($server, self::$workingServers[$job])) {
if (self::$workingServers[$job][$server]) {
$serverList[] = $server;
}
} else {
$serverList[] = $server;
}
}
if (count($serverList) == 0) {
# All servers have failed, need to insert all the servers again and retry.
$serverList = self::$workingServers[$job] = self::$server_array;
}
return $serverList[rand(0, count($serverList) - 1)];
} else {
return self::randomServer();
}
} else {
return self::randomServer();
}
}
static function serverWorked($server, $job) {
self::$workingServers[$job][$server] = $server;
}
static function serverFailed($server, $job) {
self::$workingServers[$job][$server] = false;
}
static function Connect($server = false, $job = false) {
if ($server) {
self::$server = self::getServer();
}
self::$gmclient= new GearmanClient();
self::$gmclient->setTimeout(self::$timeout);
# add the default job server
self::$gmclient->addServer($server = self::getServer($job));
return $server;
}
static function Destroy() {
self::$gmclient = false;
}
static function Client($name, $vars, $timeout = false) {
if (is_int($timeout)) {
self::$timeout = $timeout;
} else {
self::$timeout = self::$defaultTimeout;
}
do {
$server = self::Connect(false, $name);
$value = self::$gmclient->do($name, $vars);
$return_code = self::$gmclient->returnCode();
if (!$value) {
$error_message = self::$gmclient->error();
if ($return_code == 47) {
self::serverFailed($server, $name);
if (count(self::$server_array) > 1) {
// ADDED SINGLE SERVER LOOP AVOIDANCE // echo "Timeout on server $server, trying another server...\n";
continue;
} else {
return false;
}
}
echo "ERR: $error_message ($return_code)\n";
}
# printf("Worker has returned\n");
$short_value = substr($value, 0, 80);
switch ($return_code)
{
case GEARMAN_WORK_DATA:
echo "DATA: $short_value\n";
break;
case GEARMAN_SUCCESS:
self::serverWorked($server, $name);
break;
case GEARMAN_WORK_STATUS:
list($numerator, $denominator)= self::$gmclient->doStatus();
echo "Status: $numerator/$denominator\n";
break;
case GEARMAN_TIMEOUT:
// self::Connect();
// Fall through
default:
echo "ERR: $error_message " . self::$gmclient->error() . " ($return_code)\n";
break;
}
}
while($return_code != GEARMAN_SUCCESS);
$rv = unserialize($value);
return $rv["rv"];
}
}
# Example usage:
# $rv = MyGearmanClient::Client("Function", $args);
?>
Since addServer() on the Gearman client is not working properly, this code chooses a job server at random and, if it fails, tries the next one. This way you can also balance the load.
// job servers
$jobservers = array('192.168.1.1','192.168.1.2');
// prepare gearman client
$gmclient = new GearmanClient();
// shuffle job servers (deliver jobs equally by server)
shuffle($jobservers);
// add job servers
foreach($jobservers as $jobserver) {
// add random jobserver
$gmclient->addServer($jobserver);
// check server state if ok end foreach
if (@$gmclient->ping('ping')) break;
// if connections fails reset client
$gmclient = new GearmanClient();
}
Solution tested and working ok.
$client = new GearmanClient();
if(!$client->addServer("11.11.65.73",4730))
$client->addServer("11.11.65.79",4730);