PHP iterate multidimensional array - query web service every x seconds

I have an array with user information and a web service on a site I can query for the status of a user (online/offline). What I would like to do is query the site every x seconds for the status of each user.
There are about 10 users, and below is an example of the array. I can change the array if needed. The only things I need to enter manually are the username and full name. The "status" I can fetch from the server.
$users = array(
    "username" => array("Fullname", "Status"),
    "johndoe"  => array("John Doe", "Online"),
    "janedoe"  => array("Jane Doe", "Offline")
);
This is an example of the URL I can use to query the site (the query returns only the user's status, Online or Offline):
http://thesite.com:80/webservice/user/username/
This is the code I can use to get a specific user's status:
$url = 'http://thesite.com:80/webservice/user/johndoe/';
$status = '';
$get = fopen($url, "r");
if ($get) {
    while (!feof($get)) {
        $status = fgets($get, 4096);
    }
    fclose($get);
}
echo "User johndoe is: " . $status;
// Output: User johndoe is: Online
Now I only need help with iterating through the users, querying the site every x seconds, and updating each user's status in the last array field for that user.
Please note that I use PHP and fopen above because this is a cross-domain GET and I could not get ajax/jQuery to work. I do not have the option to modify the web service server.
Thanks :)

You need to create a cronjob script that runs every x seconds. That script should make an asynchronous request to this PHP function.
public function updateUsers() {
    $users = $_SESSION['users'];
    foreach ($users as $username => $data) {
        $url = 'http://thesite.com:80/webservice/user/' . $username . '/';
        $status = '';
        $get = fopen($url, "r");
        if ($get) {
            while (!feof($get)) {
                $status = fgets($get, 4096);
            }
            fclose($get);
        }
        // Overwrite the status field (index 1) instead of appending a new element on every run
        $users[$username][1] = $status;
    }
    $_SESSION['users'] = $users;
}
A guide for posting asynchronous requests: http://petewarden.typepad.com/searchbrowser/2008/06/how-to-post-an.html
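For illustration, a fire-and-forget request can be as simple as opening a socket, writing the request, and closing it without reading the response (a minimal sketch; the endpoint path is a hypothetical location for the function above):
function asyncRequest($host, $path) {
    // 5-second connect timeout; returns false if the host is unreachable
    $fp = fsockopen($host, 80, $errno, $errstr, 5);
    if (!$fp) {
        return false;
    }
    // Write the request headers, then close without waiting for the response
    fwrite($fp, "GET {$path} HTTP/1.1\r\nHost: {$host}\r\nConnection: Close\r\n\r\n");
    fclose($fp);
    return true;
}

asyncRequest('example.com', '/cron/update-users.php');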
Hope it helps :)

If your $users array doesn't change, you can do this:
foreach ($users as $username => $userdata) {
    $url = 'http://thesite.com:80/webservice/user/' . $username . '/';
    $state = '';
    $get = fopen($url, "r");
    if ($get) {
        while (!feof($get)) {
            $state = fgets($get, 4096);
        }
        fclose($get);
    }
    $users[$username][1] = $state;
}
If you can change your $users array to be associative, like this:
$users = array(
    "username" => array("fullname" => "Fullname", "status" => "Status"),
    "johndoe"  => array("fullname" => "John Doe", "status" => "Online"),
    "janedoe"  => array("fullname" => "Jane Doe", "status" => "Offline")
);
that would let you use more key/value pairs and is a bit safer.
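With the associative version, the polling loop above would write to a named key instead of a numeric index (a sketch based on the fopen loop above):
foreach ($users as $username => $userdata) {
    $url = 'http://thesite.com:80/webservice/user/' . $username . '/';
    $status = '';
    $get = fopen($url, "r");
    if ($get) {
        while (!feof($get)) {
            $status = fgets($get, 4096);
        }
        fclose($get);
    }
    // trim() strips the trailing newline fgets may leave on the last line
    $users[$username]['status'] = trim($status);
}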

Related

file_get_contents occasionally causes warning, but still fetches data

My script works most of the time, but every 8th try or so I get an error. I'll try to explain. This is the output I get (or similar):
{"gameName":"F1 2011","gameTrailer":"http://cdn.akamai.steamstatic.com/steam/apps/81163/movie_max.webm?t=1447354814","gameId":"44360","finalPrice":1499,"genres":"Racing"}
{"gameName":"Starscape","gameTrailer":"http://cdn.akamai.steamstatic.com/steam/apps/900679/movie_max.webm?t=1447351523","gameId":"20700","finalPrice":999,"genres":"Action"}
Warning: file_get_contents(http://store.steampowered.com/api/appdetails?appids=400160): failed to open stream: HTTP request failed! in C:\xampp\htdocs\GoStrap\game.php on line 19
{"gameName":"DRAGON: A Game About a Dragon","gameTrailer":"http://cdn.akamai.steamstatic.com/steam/apps/2038811/movie_max.webm?t=1447373449","gameId":"351150","finalPrice":599,"genres":"Adventure"}
{"gameName":"Monster Mash","gameTrailer":"http://cdn.akamai.steamstatic.com/steam/apps/900919/movie_max.webm?t=1447352342","gameId":"36210","finalPrice":449,"genres":"Casual"}
I'm making an application that fetches information on a random Steam game from the Steam store. It's quite simple.
The script takes a (somewhat) random ID from a text file (working for sure)
The ID is added to the end of the API URL, and file_get_contents fetches the file. It then decodes the JSON. (might be the problem somehow; see the sketch after this list)
Search for my specified data. Final price & movie webm are not always there, hence the if(!isset())
Decide the final price and ship it back to ajax on index.php
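(For reference, a minimal sketch of that fetch step with an explicit failure check — file_get_contents returns false on failure, so it is worth testing for that before decoding:)
function fetchAppDetails($appId) {
    $json = @file_get_contents('http://store.steampowered.com/api/appdetails?appids=' . $appId);
    if ($json === false) {
        return null; // HTTP request failed; the caller can pick another ID
    }
    return json_decode($json, true);
}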
The output above suggests that I get the data I need in four cases, and an error once. I only want to receive ONE JSON string and return it, and only in case $game['gameTrailer'] and $game['final_price'] are set.
This is the PHP (it's not great, be kind):
<?php
// Run the script on ajax call
if (isset($_POST)) {
    fetchGame();
}

function fetchGame() {
    $gameFound = false;
    while (!$gameFound) {
        ////////// ID-picker //////////
        $f_contents = file("steam.txt");
        $url = $f_contents[mt_rand(0, count($f_contents) - 1)];
        $answer = explode('/', $url);
        $gameID = $answer[4];
        $trimmed = trim($gameID);
        ////////// Fetch game //////////
        $json = file_get_contents('http://store.steampowered.com/api/appdetails?appids=' . $trimmed);
        $game_json = json_decode($json, true);
        if (!isset($game_json[$trimmed]['data']['movies'][0]['webm']['max']) || !isset($game_json[$trimmed]['data']['price_overview']['final'])) {
            continue;
        }
        $gameFound = true;
        ////////// Store variables //////////
        $game['gameName'] = $game_json[$trimmed]['data']['name'];
        $game['gameTrailer'] = $game_json[$trimmed]['data']['movies'][0]['webm']['max'];
        $game['gameId'] = $trimmed;
        $game['free'] = $game_json[$trimmed]['data']['is_free'];
        $game['price'] = $game_json[$trimmed]['data']['price_overview']['final'];
        $game['genres'] = $game_json[$trimmed]['data']['genres'][0]['description'];
        if ($game['free'] == TRUE) {
            $game['final_price'] = "Free";
        } elseif ($game['free'] == FALSE || $game['final_price'] != NULL) {
            $game['final_price'] = $game['price'];
        } else {
            $game['final_price'] = "-";
        }
    }
    ////////// Return to AJAX (index.php) //////////
    echo json_encode(array(
        'gameName' => $game['gameName'],
        'gameTrailer' => $game['gameTrailer'],
        'gameId' => $game['gameId'],
        'finalPrice' => $game['final_price'],
        'genres' => $game['genres'],
    ));
}
?>
Any help will be appreciated. Like, are there obvious reasons why this is happening? Is there a significantly better way? Why is it re-iterating itself at least 4 times when it seems to have fetched the data I need? Sorry if this post is long, just trying to be detailed with a lacking PHP/JSON vocabulary.
Kind regards, John
EDIT:
Sometimes it returns no error, just multiple objects:
{"gameName":"Prime World: Defenders","gameTrailer":"http://cdn.akamai.steamstatic.com/steam/apps/2028642/movie_max.webm?t=1447357836","gameId":"235360","finalPrice":899,"genres":"Casual"}
{"gameName":"Grand Ages: Rome","gameTrailer":"http://cdn.akamai.steamstatic.com/steam/apps/5190/movie_max.webm?t=1447351683","gameId":"23450","finalPrice":999,"genres":"Simulation"}

Facebook API, recursive function and 500 error

This is my first project using the Facebook API and the Facebook PHP SDK; basically I'm trying to get all of a user's statuses. I wrote a script that should work, but I get a 500 error (even if I change max_execution_time or call set_time_limit(0)), and only when I use a recursive function inside. Take a look at the code:
$request = new FacebookRequest(
    $session,
    'GET',
    '/me/statuses'
);
$response = $request->execute();
$graphObject = $response->getGraphObject();
$x = $graphObject->getProperty('data');
$y = $x->asArray(); // now I have an array
$paging = $graphObject->getProperty('paging'); // I pick paging with "next" and "previous"
$paged = $paging->asArray(); // as array
$counter = 0;
foreach ($y as $el) {
    echo ('<h3>' . $y[$counter]->message . '</h3>');
    echo "<br/>";
    echo "<br/>";
    $counter++;
}
$next = $paged['next']; // now I have the url to load 20 more statuses
$response = file_get_contents($next); // get content of url

// recursive function: every time, call looper with the content of next
function looper($response) {
    $array = json_decode($response, true);
    $secondarray = $array['data'];
    $paging = $array['paging']; // again pick the url to load the next statuses
    $next = $paging['next'];
    $nextResponse = file_get_contents($next);
    $counter2 = 0;
    foreach ($secondarray as $el) { // put the next 20 statuses on the page
        echo ('<h3>' . $secondarray[$counter2]['message'] . '</h3>');
        echo "<br/>";
        echo "<br/>";
        $counter2++;
    }
    if (is_null($nextResponse) == false) { // if the next call returned 20 more statuses (not empty), call this function again
        looper($nextResponse);
    } else { echo "last message"; die(); } // else stop
}
looper($response);
If I don't call the function (basically, if I comment out the if statement) the script works fine and prints 20+20 statuses; otherwise it gives me a 500 internal error.
As I said, I tried changing max_execution_time and set_time_limit(0), but nothing happens.
I'm not sure if the problem is my hosting (GoDaddy), or if my script is not good / not efficient. Any help?
Thanks, Nico
I think I found your issue. You are assigning $nextResponse the value returned by file_get_contents. See http://php.net/manual/es/function.file-get-contents.php — it returns false when the content couldn't be retrieved. Try checking for false instead of null:
..........
if (false !== $nextResponse) { // if the next call returned 20 more statuses (not empty), call this function again
    looper($nextResponse);
} else { echo "last message"; die(); } // else stop
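As a side note, the same traversal can be written iteratively instead of recursively (a sketch assuming the same JSON structure), which avoids growing the call stack on accounts with many pages:
function looperIterative($response) {
    while ($response !== false) {
        $array = json_decode($response, true);
        foreach ($array['data'] as $el) {
            echo '<h3>' . $el['message'] . '</h3><br/><br/>';
        }
        if (empty($array['paging']['next'])) {
            break; // no more pages
        }
        $response = file_get_contents($array['paging']['next']);
    }
    echo "last message";
}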

Good method to authenticate files to users

I'm developing an API to let my users access files stored on another server.
Let's call my two servers server 1 and server 2:
server 1 is the server where I'm hosting my web site, and
server 2 is the server where I'm storing my files.
My site is basically a Javascript-based one, so I will be using Javascript to post data to the API when a user needs to access files stored on server 2.
When a user requests access to files, the data will be posted to the API URL via Javascript. The API is made of PHP. Using that PHP script (API) on server 1, I will make another request to server 2 asking for the files, so there will be another PHP script (API) on server 2.
I need to know how I should do this authentication between the two servers, as server 2 has no access to user details on server 1.
I hope to do it like this, the method used by most payment gateways:
When the API on server 2 receives a request with some unique data of the user, it posts back that unique data through SSL to the server 1 API and matches it with user data in the database, then posts back the result through SSL to server 2, so that server 2 knows the file request is genuine.
In this case, what kind of user data/credentials should the server 1 API post to server 2, and what should the server 2 API post back to server 1? And which user data should be matched with the data in the database: user ID, session, cookies, IP, timestamp, etc.?
Any clear and described answer would be nice! Thanks.
I would go with this:
user initiates the action; Javascript asks server 1 (via ajax) for a file on server 2
server 1 creates a URL using hash_hmac with this data: file, user ID, user secret
when that URL is clicked (server2.com/?file=FILE&user_id=ID&hash=SHA_1_HASH), server 2 asks server 1 for validation (sends file, user_id and hash)
server 1 does the validation, sends a response to server 2
server 2 pushes the file or sends a 403 HTTP response
This way, server 2 only needs to consume server 1's API; server 1 has all the logic.
Pseudocode for hash and URL creation:
// getHash($userId, $file) method
$user = getUser($userId);
$hash = hash_hmac('sha1', $userId . $file, $user->getSecret());

// getUrl($userId, $file) method
return sprintf('http://server2.com/get-file?file=%1$s&user_id=%2$s&hash=%3$s',
    $file,
    $userId,
    $security->getHash($userId, $file)
);
Pseudocode for validation:
$hash = $security->getHash($_GET['user_id'], $_GET['file']);
if ($hash === $_GET['hash']) {
    // All is good
}
Edit: the getHash() method accepts a user ID and a file (ID or string, whatever suits your needs). With that data, it produces a hash using the hash_hmac function. For the secret parameter of hash_hmac, the user's "secret key" is used. That key would be stored together with the user's data in the db table. It would be generated with mt_rand or even something stronger, such as reading /dev/random or using something like https://stackoverflow.com/a/16478556/691850.
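For example (a sketch; random_bytes requires PHP 7+, older versions can use the linked alternatives):
// Generate a 32-byte (256-bit) per-user secret, hex-encoded for storage.
$secret = bin2hex(random_bytes(32));
// store $secret in the user's row when the account is created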
A word of advice: use mod_xsendfile on server 2 (if it is Apache) to push files.
Introduction
You can use 2 simple methods:
Authentication Token
Signed Request
You can also combine both of them by using the token for authentication and the signature to verify the integrity of the message sent.
Authentication Token
If you are going to consider matching any identification in the database, perhaps you can consider creating an authentication token rather than using user ID, session, cookies, IP, timestamp, etc. as suggested.
Create a random token and save it to the database:
$token = bin2hex(mcrypt_create_iv(64, MCRYPT_DEV_URANDOM));
It can be easily generated
It is guaranteed to be more difficult to guess than a password
It can easily be deleted if compromised and another key generated
Signed Request
The concept is simple: each file uploaded must match a specific signature created using a randomly generated key, just like the token, for each specific user.
This can easily be implemented with an HMAC via the hash_hmac_file function.
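For instance (a sketch; $key is the per-user secret from above, and hash_equals needs PHP 5.6+):
// Sign a file before upload, then verify it on the receiving side.
$signature = hash_hmac_file('sha256', '/path/to/a.png', $key);
// ... transmit the file together with $signature ...
$valid = hash_equals($signature, hash_hmac_file('sha256', $receivedTmpPath, $key));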
Combine Both Authentication & Signed Request
Here is a simple proof of concept:
Server 1
/**
 * This should be stored securely
 * Only known to the user
 * Unique to each user
 * Eg: mcrypt_create_iv(32, MCRYPT_DEV_URANDOM);
 */
$key = "d767d183315656d90cce5c8a316c596c971246fbc48d70f06f94177f6b5d7174";
$token = "3380cb5229d4737ebe8e92c1c2a90542e46ce288901da80fe8d8c456bace2a9e";
$url = "http://server2/run.php";

// Start File Upload Manager
$request = new FileManager($key, $token);

// Send multiple files
$responce = $request->send($url, [
    "file1" => __DIR__ . "/a.png",
    "file2" => __DIR__ . "/b.css"
]);

// Decode response
$json = json_decode($responce->data, true);

// Output information
foreach ($json as $file) {
    printf("%s - %s \n", $file['name'], $file['msg']);
}
Output
temp\14-a.png - OK
temp\14-b.css - OK
Server 2
// Where to store the files
$tmpDir = __DIR__ . "/temp";

try {
    $file = new FileManager($key, $token);
    echo json_encode($file->recive($tmpDir), 128); // 128 = JSON_PRETTY_PRINT
} catch (Exception $e) {
    echo json_encode([
        [
            "name" => "Exception",
            "msg" => $e->getMessage(),
            "status" => 0
        ]
    ], 128);
}
Class Used
class FileManager {
    private $key;
    private $token;

    function __construct($key, $token) {
        $this->key = $key;
        $this->token = $token;
    }

    function send($url, $files) {
        $post = [];
        // Convert to array format
        $files = is_array($files) ? $files : [$files];
        // Build the POST request
        foreach ($files as $name => $file) {
            $file = realpath($file);
            if (!(is_file($file) || is_readable($file))) {
                throw new InvalidArgumentException("Invalid File");
            }
            // Add file ("@" marks a file upload in classic cURL; CURLFile replaces this in PHP 5.5+)
            $post[$name] = "@" . $file;
            // Sign file
            $post[$name . "-sign"] = $this->sign($file);
        }
        // Start cURL
        $ch = curl_init($url);
        $options = [
            CURLOPT_HTTPHEADER => [
                "X-TOKEN:" . $this->token
            ],
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_POST => count($post),
            CURLOPT_POSTFIELDS => $post
        ];
        curl_setopt_array($ch, $options);
        // Get response
        $responce = [
            "data" => curl_exec($ch),
            "error" => curl_error($ch),
            "errno" => curl_errno($ch),
            "info" => curl_getinfo($ch)
        ];
        curl_close($ch);
        return (object) $responce;
    }

    function recive($dir) {
        if (!isset($_SERVER['HTTP_X_TOKEN'])) {
            throw new ErrorException("Missing Security Token");
        }
        if ($_SERVER['HTTP_X_TOKEN'] !== $this->token) {
            throw new ErrorException("Invalid Security Token");
        }
        if (!isset($_FILES)) {
            throw new ErrorException("File was not uploaded");
        }
        $responce = [];
        foreach ($_FILES as $name => $file) {
            $responce[$name]['status'] = 0;
            // Check if the file was uploaded
            if ($file['error'] == UPLOAD_ERR_OK) {
                // Check the signature
                if (isset($_POST[$name . '-sign']) && $_POST[$name . '-sign'] === $this->sign($file['tmp_name'])) {
                    $path = $dir . DIRECTORY_SEPARATOR . $file['name'];
                    $x = 0;
                    while (file_exists($path)) {
                        $x++;
                        $path = $dir . DIRECTORY_SEPARATOR . $x . "-" . $file['name'];
                    }
                    // Move the file to the temp folder
                    move_uploaded_file($file['tmp_name'], $path);
                    $responce[$name]['name'] = $path;
                    $responce[$name]['sign'] = $_POST[$name . '-sign'];
                    $responce[$name]['status'] = 1;
                    $responce[$name]['msg'] = "OK";
                } else {
                    $responce[$name]['msg'] = sprintf("Invalid File Signature");
                }
            } else {
                $responce[$name]['msg'] = sprintf("Upload Error : %s", $file['error']);
            }
        }
        return $responce;
    }

    private function sign($file) {
        return hash_hmac_file("sha256", $file, $this->key);
    }
}
Other things to consider
For better security you can consider the following:
IP Lock down
File Size Limit
File Type Validation
Public-Key Cryptography
Changing Date Based token generation
Conclusion
The sample class can be extended in many ways, and rather than using a URL you could consider a proper JSON-RPC solution.
A long enough, single-use, short-lived, randomly generated key should suffice in this case.
Client requests for a file to Server 1
Server 1 confirms the login information, generates a long single-use key, and sends it to the user. Server 1 keeps track of this key and matches it with an actual file on Server 2.
Client sends a request to Server 2 along with the key
Server 2 contacts Server 1 and submits the key
Server 1 returns a file path if the key is valid. The key is invalidated (destroyed).
Server 2 sends the file to the client
Server 1 invalidates the key after say 30 seconds, even if it didn't receive a confirmation request from Server 2. Your front-end should account for this case and retry the process a couple of times before returning an error.
I do not think there is a point in sending cookie/session information along; this information can be brute-forced just like the random key.
A 1024-bit long key sounds more than reasonable. This entropy can be obtained with a string of less than 200 alphanumeric characters.
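For instance (a sketch; random_bytes requires PHP 7+ — hex encoding is longer than the alphanumeric minimum mentioned above, but simpler to generate):
// 128 random bytes = 1024 bits of entropy, hex-encoded to 256 characters.
$key = bin2hex(random_bytes(128));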
For the absolute best security you would need some communication from server 2 to server 1 to double-check that the request is valid. Although this communication could be minimal, it is still communication and thus slows down the process.
If you can live with a marginally less secure solution, I would suggest the following.
Server 1 requestfile.php:
<?php
// check login
if (!$loggedon) {
    die('You need to be logged on');
}
$dataKey = array();
$uniqueKey = 'fgsdjk%^347JH$#^%&5ghjksc'; // choose whatever you want
// check file
$file = isset($_GET['file']) ? $_GET['file'] : '';
if (empty($file)) {
    die('Invalid request');
}
// Add user data to create a reasonably unique fingerprint.
// It will most likely be the same for people in the same office with the same browser; that's mainly where the security drop comes from.
// I double-check that all variables are set just to be sure. Most of these will never be missing.
if (isset($_SERVER['HTTP_USER_AGENT'])) {
    $dataKey[] = $_SERVER['HTTP_USER_AGENT'];
}
if (isset($_SERVER['REMOTE_ADDR'])) {
    $dataKey[] = $_SERVER['REMOTE_ADDR'];
}
if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
    $dataKey[] = $_SERVER['HTTP_ACCEPT_LANGUAGE'];
}
if (isset($_SERVER['HTTP_ACCEPT_ENCODING'])) {
    $dataKey[] = $_SERVER['HTTP_ACCEPT_ENCODING'];
}
if (isset($_SERVER['HTTP_ACCEPT'])) {
    $dataKey[] = $_SERVER['HTTP_ACCEPT'];
}
// also add the unique key
$dataKey[] = $uniqueKey;
// add the file
$dataKey[] = $file;
// Add a timestamp. Since the requests happen at different times, don't use the exact second.
// Make sure it's added last.
$dataKey[] = date('YmdHi');
// create a hash
$hash = md5(implode('-', $dataKey));
// send to server 2
header('Location: https://server2.com/download.php?file=' . urlencode($file) . '&key=' . $hash);
?>
On server 2 you will do almost the same.
<?php
$valid = false;
$dataKey = array();
$uniqueKey = 'fgsdjk%^347JH$#^%&5ghjksc'; // same as on server one
// check file
$file = isset($_GET['file']) ? $_GET['file'] : '';
if (empty($file)) {
    die('Invalid request');
}
// check key
$key = isset($_GET['key']) ? $_GET['key'] : '';
if (empty($key)) {
    die('Invalid request');
}
// add user data to create a reasonably unique fingerprint
if (isset($_SERVER['HTTP_USER_AGENT'])) {
    $dataKey[] = $_SERVER['HTTP_USER_AGENT'];
}
if (isset($_SERVER['REMOTE_ADDR'])) {
    $dataKey[] = $_SERVER['REMOTE_ADDR'];
}
if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
    $dataKey[] = $_SERVER['HTTP_ACCEPT_LANGUAGE'];
}
if (isset($_SERVER['HTTP_ACCEPT_ENCODING'])) {
    $dataKey[] = $_SERVER['HTTP_ACCEPT_ENCODING'];
}
if (isset($_SERVER['HTTP_ACCEPT'])) {
    $dataKey[] = $_SERVER['HTTP_ACCEPT'];
}
// also add the unique key
$dataKey[] = $uniqueKey;
// add the file
$dataKey[] = $file;
// Add a timestamp. Since the requests happen at different times, don't use the exact second.
// Keep the request time in a variable.
$time = time();
$dataKey[] = date('YmdHi', $time);
// create a hash
$hash = md5(implode('-', $dataKey));
if ($hash == $key) {
    $valid = true;
} else {
    // Perhaps the request to server one was made at 2013-06-26 14:59 and the request to server 2 came in at 2013-06-26 15:00.
    // It would still fail when the requests to servers 1 and 2 are more than one minute apart, but I think that's an acceptable margin. You could always allow for more margin, though.
    // drop the current time
    $requesttime = array_pop($dataKey);
    // go back one minute
    $time -= 60;
    // add the time again
    $dataKey[] = date('YmdHi', $time);
    // create a hash
    $hash = md5(implode('-', $dataKey));
    if ($hash == $key) {
        $valid = true;
    }
}
if ($valid !== true) {
    die('Invalid request');
}
// all is ok. Put the code to download the file here
?>
You can restrict access to server 2 so that only server 1 is able to send requests to it. You can do this by whitelisting server 1's IP on the server side or using an .htaccess file. In PHP you can do it by checking the IP the request came from and validating it against server 1's IP.
You can also write an algorithm which generates a unique number. Using that algorithm, generate a number on server 1 and send it to server 2 in the request. On server 2, check whether that number was generated by the algorithm; if yes, the request is valid.
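A minimal sketch of that IP check on server 2 (assuming server 1 has a known static address; 203.0.113.10 is a placeholder):
// Reject requests that do not originate from server 1.
$server1Ip = '203.0.113.10';
if ($_SERVER['REMOTE_ADDR'] !== $server1Ip) {
    header('HTTP/1.1 403 Forbidden');
    die('Invalid request origin');
}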
I'd go with simple symmetric encryption, where server 1 encodes the date and the authenticated user using a key known only by server 1 and server 2, sending it to the client, who can't read it but can send it to server 2 as a sort of ticket to authenticate himself. The date is important so that no client can reuse the same "ticket" over time. But at least one of the servers must know which users have access to which files, so unless you use dedicated folders or access groups you must keep the user and file info together.
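A sketch of that ticket scheme using openssl_encrypt (AES-256-CBC here; the shared key and payload fields are assumptions, and production use should also authenticate the ciphertext, e.g. with an HMAC):
// Server 1: issue an encrypted ticket the client cannot read or forge.
$sharedKey = hash('sha256', 'secret shared by servers 1 and 2', true); // 32-byte key
$iv = openssl_random_pseudo_bytes(16);
$payload = json_encode(array('user' => 42, 'file' => 'report.pdf', 'ts' => time()));
$ticket = base64_encode($iv . openssl_encrypt($payload, 'aes-256-cbc', $sharedKey, OPENSSL_RAW_DATA, $iv));

// Server 2: decrypt the ticket and check its age before serving the file.
$raw = base64_decode($ticket);
$iv = substr($raw, 0, 16);
$json = openssl_decrypt(substr($raw, 16), 'aes-256-cbc', $sharedKey, OPENSSL_RAW_DATA, $iv);
$payload = json_decode($json, true);
if ($payload === null || time() - $payload['ts'] > 300) {
    die('Invalid or expired ticket');
}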

multi-thread, multi-curl crawler in PHP

Hi everyone once again!
We need some help to develop and implement multi-curl functionality into our crawler. We have a huge array of "links to be scanned" and we loop through them with a foreach.
Let's use some pseudo code to understand the logic:
1) While ($links_to_be_scanned > 0).
2) Foreach ($links_to_be_scanned as $link_to_be_scanned).
3) Scan_the_link() and run some other functions.
4) Extract the new links from the xdom.
5) Push the new links into $links_to_be_scanned.
6) Push the current link into $links_already_scanned.
7) Remove the current link from $links_to_be_scanned.
Now, we need to define a maximum number of parallel connections and be able to run this process for each link in parallel.
I understand that we're gonna have to create a $links_being_scanned or some kind of queue.
I'm really not sure how to approach this problem, to be honest. If anyone could provide a snippet or an idea to solve it, it would be greatly appreciated.
Thanks in advance!
Chris;
Extended:
I just realized that it's not multi-curl itself that is the tricky part, but the amount of operations done with each link after the request.
Even with multi-curl, I would eventually have to find a way to run all these operations in parallel. The whole algorithm described below would have to run in parallel.
So now rethinking, we would have to do something like this:
While (there are links to be scanned)
    Foreach ($links_to_be_scanned as $link)
        If (there are less than 10 scanners running)
            Launch_a_new_scanner($link)
            Remove the link from the $links_to_be_scanned array
            Push the link into the $links_on_queue array
        Endif;
And each scanner does (This should be run in parallel):
Create an object with the given link
Send a curl request to the given link
Create a dom and an Xdom with the response body
Perform other operations over the response body
Remove the link from the $links_on_queue array
Push the link into the $links_already_scanned array
I assume we could approach this by creating a new PHP file with the scanner algorithm and using pcntl_fork() for each parallel process?
Since even using multi-curl, I would eventually end up waiting in a regular foreach loop for the other processes.
I assume I would have to approach this using fsockopen or pcntl_fork.
Suggestions, comments, partial solutions, and even a "good luck" will be more than appreciated!
Thanks a lot!
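A minimal sketch of the pcntl_fork idea from the question (assuming a scan_the_link() function; pcntl is only available in CLI builds with the extension enabled). Note that forked children cannot push newly discovered links back into the parent's array, since each child gets its own copy of memory; that is exactly the gap a shared queue such as the ones suggested below fills:
$maxWorkers = 10;
$running = 0;
while (count($links_to_be_scanned) > 0 || $running > 0) {
    while ($running < $maxWorkers && count($links_to_be_scanned) > 0) {
        $link = array_shift($links_to_be_scanned);
        $pid = pcntl_fork();
        if ($pid === 0) {
            scan_the_link($link); // child process: do the scan, then exit
            exit(0);
        }
        $running++;
    }
    pcntl_wait($status); // block until one child finishes
    $running--;
}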
DISCLAIMER: This answer links an open-source project with which I'm involved. There. You've been warned.
The Artax HTTP client is a socket-based HTTP library that (among other things) offers custom control over the number of concurrent open socket connections to individual hosts while making multiple asynchronous HTTP requests.
Limiting the number of concurrent connections is easily accomplished. Consider:
<?php
use Artax\Client, Artax\Response;

require dirname(__DIR__) . '/autoload.php';

$client = new Client;

// Defaults to max of 8 concurrent connections per host
$client->setOption('maxConnectionsPerHost', 2);

$requests = array(
    'so-home' => 'http://stackoverflow.com',
    'so-php' => 'http://stackoverflow.com/questions/tagged/php',
    'so-python' => 'http://stackoverflow.com/questions/tagged/python',
    'so-http' => 'http://stackoverflow.com/questions/tagged/http',
    'so-html' => 'http://stackoverflow.com/questions/tagged/html',
    'so-css' => 'http://stackoverflow.com/questions/tagged/css',
    'so-js' => 'http://stackoverflow.com/questions/tagged/javascript'
);

$onResponse = function($requestKey, Response $r) {
    echo $requestKey, ' :: ', $r->getStatus();
};
$onError = function($requestKey, Exception $e) {
    echo $requestKey, ' :: ', $e->getMessage();
};

$client->requestMulti($requests, $onResponse, $onError);
IMPORTANT: In the above example the Client::requestMulti method is making all the specified requests asynchronously. Because the per-host concurrency limit is set to 2, the client will open up new connections for the first two requests and subsequently reuse those same sockets for the other requests, queuing requests until one of the two sockets becomes available.
You could try something like this. I haven't checked it, but you should get the idea:
$request_pool = array();

function CreateHandle($url) {
    $handle = curl_init($url);
    // set curl options here
    return $handle;
}

function Process($data) {
    global $request_pool;
    // do something with data
    array_push($request_pool, CreateHandle($some_new_url));
}

function RunMulti() {
    global $request_pool;
    $multi_handle = curl_multi_init();
    $active_request_pool = array();
    $running = 0;
    $active_request_count = 0;
    $active_request_max = 10; // adjust as necessary
    do {
        $waiting_request_count = count($request_pool);
        while (($active_request_count < $active_request_max) && ($waiting_request_count > 0)) {
            $request = array_shift($request_pool);
            curl_multi_add_handle($multi_handle, $request);
            $active_request_pool[(int) $request] = $request;
            $waiting_request_count--;
            $active_request_count++;
        }
        curl_multi_exec($multi_handle, $running);
        curl_multi_select($multi_handle);
        while ($info = curl_multi_info_read($multi_handle)) {
            $curl_handle = $info['handle'];
            call_user_func('Process', curl_multi_getcontent($curl_handle));
            curl_multi_remove_handle($multi_handle, $curl_handle);
            curl_close($curl_handle);
            $active_request_count--;
        }
    } while ($active_request_count > 0 || $waiting_request_count > 0);
    curl_multi_close($multi_handle);
}
You should look for a more robust solution to your problem. RabbitMQ is a very good solution I have used. There is also Gearman, but the choice is yours.
I prefer RabbitMQ.
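For illustration, queueing links with RabbitMQ through the php-amqplib library might look roughly like this (a sketch; the queue name, connection details, and scan_the_link() are assumptions):
use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();
$channel->queue_declare('links_to_scan', false, true, false, false);

// Producer: push each discovered link onto the queue.
$channel->basic_publish(new AMQPMessage('http://example.com/page'), '', 'links_to_scan');

// Consumer: run N of these worker processes for N parallel scanners.
$channel->basic_consume('links_to_scan', '', false, true, false, false, function ($msg) {
    scan_the_link($msg->body); // hypothetical scan function
});
while (count($channel->callbacks)) {
    $channel->wait();
}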
I will share the code I used to collect email addresses from a certain website.
You can modify it to fit your needs.
There were some problems with relative URLs there.
And I do not use cURL here.
<?php
error_reporting(E_ALL);
$home = 'http://kharkov-reklama.com.ua/jborudovanie/';
$writer = new RWriter('C:\parser_13-09-2012_05.txt');
set_time_limit(0);
ini_set('memory_limit', '512M');

function scan_page($home, $full_url, &$writer) {
    static $done = array();
    $done[] = $full_url;
    // Scan only internal links. Do not scan all the internet!))
    if (strpos($full_url, $home) === false) {
        return false;
    }
    $html = @file_get_contents($full_url);
    if (empty($html) || (strpos($html, '<body') === false && strpos($html, '<BODY') === false)) {
        return false;
    }
    echo $full_url . '<br />';
    preg_match_all('/([A-Za-z0-9_\-]+\.)*[A-Za-z0-9_\-]+@([A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]\.)+[A-Za-z]{2,4}/', $html, $emails);
    if (!empty($emails) && is_array($emails)) {
        foreach ($emails as $email_group) {
            if (is_array($email_group)) {
                foreach ($email_group as $email) {
                    if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
                        $writer->write($email);
                    }
                }
            }
        }
    }
    $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
    preg_match_all("/$regexp/siU", $html, $matches, PREG_SET_ORDER);
    if (is_array($matches)) {
        foreach ($matches as $match) {
            if (!empty($match[2]) && is_scalar($match[2])) {
                $url = $match[2];
                if (!filter_var($url, FILTER_VALIDATE_URL)) {
                    $url = $home . $url;
                }
                if (!in_array($url, $done)) {
                    scan_page($home, $url, $writer);
                }
            }
        }
    }
}

class RWriter {
    private $_fh = null;
    private $_written = array();

    public function __construct($fname) {
        $this->_fh = fopen($fname, 'w+');
    }

    public function write($line) {
        if (in_array($line, $this->_written)) {
            return;
        }
        $this->_written[] = $line;
        echo $line . '<br />';
        fwrite($this->_fh, "{$line}\r\n");
    }

    public function __destruct() {
        fclose($this->_fh);
    }
}

scan_page($home, 'http://kharkov-reklama.com.ua/jborudovanie/', $writer);

PHP curl questions - running multiple times

I have this code:
<?php
foreach ($items as $item) {
    $site = $item['link'];
    $id = $item['id'];
    $newdata = $item['data_a'];
    $newdata2 = $item['data_b'];
    $ch = curl_init($site . 'updateme.php?id=' . $id . '&data1=' . $newdata . '&data2=' . $newdata2);
    curl_exec($ch);
    // do some checking here
    curl_close($ch);
}
?>
Sample input:
$site = 'http://www.mysite.com/folder1/folder2/';
$id = 512522;
$newdata = 'Short string here';
$newdata2 = 'Another short string here with numbers';
Here is the main process of updateme.php:
if (!$id = intval(Tools::getValue('id')))
    $this->_errors[] = Tools::displayError('Invalid ID!');
else {
    $history = new History();
    $history->id = $id;
    $history->changeState($newdata1, intval($id));
    $history->id_employee = intval($employee->id_employee);
    $carrier = new Carrier(intval($info->id_carrier), intval($info->id_lang));
    $templateVars = array('{delivery}' => ($history->id_data_state == _READY_TO_SEND AND $info->shipping_number) ? str_replace('#', $info->shipping_number, $carrier->url) : '');
    if (!$history->addWithemail(true, $templateVars))
        $this->_errors[] = Tools::displayError('an error occurred while changing status or was unable to send e-mail to the employee');
}
The site will always be changing, and each $items will have at least 20 entries inside it, so the foreach loop will run at least 20 times or more depending on the amount of data.
The target site will update its database with the passed variables; it will probably pass through at least 5 functions before it is saved to the DB, so it could take some time too.
My question is: will there be a problem with this approach? Will the script encounter a timeout error while going through the curl process? How about if the $items data is around 50 or in the hundreds?
Or is there a better way to do this?
UPDATES:
* Added updateme.php main process code. Additional info: updateme.php will also send an email depending on the variables passed.
Right now all of the other sites are hosted on the same server.
You can have a PHP execution time problem.
For your cURL timeout problem, you can "fix" it using the CURLOPT_TIMEOUT option.
Since the cURL script that calls updateme.php doesn't expect a response, you should make updateme.php return early.
http://gr.php.net/register_shutdown_function
function shutdown() {
    if (!$id = intval(Tools::getValue('id')))
        $this->_errors[] = Tools::displayError('Invalid ID!');
    else {
        $history = new History();
        $history->id = $id;
        $history->changeState($newdata1, intval($id));
        $history->id_employee = intval($employee->id_employee);
        $carrier = new Carrier(intval($info->id_carrier), intval($info->id_lang));
        $templateVars = array('{delivery}' => ($history->id_data_state == _READY_TO_SEND AND $info->shipping_number) ? str_replace('#', $info->shipping_number, $carrier->url) : '');
        if (!$history->addWithemail(true, $templateVars))
            $this->_errors[] = Tools::displayError('an error occurred while changing status or was unable to send e-mail to the employee');
    }
}
register_shutdown_function('shutdown');
exit();
You can use set_time_limit(0) (0 means no time limit) to change the timeout of the PHP script execution. CURLOPT_TIMEOUT is the cURL option for setting a timeout, but I think it's unlimited by default, so you don't need to set this option on your handle.
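If you do want a per-request cap anyway, a sketch of the loop body with both options applied (urlencode added since the data strings contain spaces):
set_time_limit(0); // no PHP execution time limit for the calling script
$ch = curl_init($site . 'updateme.php?id=' . $id . '&data1=' . urlencode($newdata) . '&data2=' . urlencode($newdata2));
curl_setopt($ch, CURLOPT_TIMEOUT, 30); // give up on any single request after 30 seconds
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // capture the response instead of printing it
$result = curl_exec($ch);
curl_close($ch);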
