I was wondering what the best way is to do concurrent HTTP requests in PHP? I have a lot of data to fetch and I'd rather do multiple requests at once to retrieve it all.
Does anybody know how I can do this? Preferably in an anonymous/callback-function manner...
Thanks,
Tom.
You can use curl_multi, which runs multiple separate cURL handles in parallel under a single multi handle.
But otherwise PHP itself is not in any way, shape, or form "multithreaded" and will not allow things to run in parallel, except via gross hacks (multiple parallel scripts, one script firing up multiple background tasks via exec(), etc.).
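For reference, here is a minimal curl_multi sketch; the URLs are placeholders:
$urls = array(
    'http://www.example.com/data1',
    'http://www.example.com/data2',
);

$mh = curl_multi_init();
$handles = array();
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // capture the body instead of printing it
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Drive all transfers until every request has finished.
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh); // wait for activity instead of busy-looping
} while ($running > 0);

foreach ($handles as $ch) {
    $body = curl_multi_getcontent($ch); // the response body for this handle
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
    // ...process $body...
}
curl_multi_close($mh);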
You can try either curl_multi() or the lower-level function socket_select().
You can use HttpRequestPool: http://www.php.net/manual/de/httprequestpool.construct.php
$multiRequests = array(
    new HttpRequest('http://www.google.com', HttpRequest::METH_GET),
    new HttpRequest('http://www.yahoo.com', HttpRequest::METH_GET),
    new HttpRequest('http://www.bing.com', HttpRequest::METH_GET)
);

$pool = new HttpRequestPool();
foreach ($multiRequests as $request) {
    $pool->attach($request); // queue each request in the pool
}
$pool->send(); // all attached requests are sent in parallel

foreach ($pool as $request) {
    echo $request->getResponseBody();
}
Or, if you want, you can send your data as JSON. In PHP you can then break it back apart into all the individual values, e.g.:
xhttp.open("GET", "gotoChatRoomorNot.php?q=[{" + str + "},{" + user1 + "},{" + user2 + "}]", true);
and in PHP you can use json_decode() to get your data back (see: How do I extract data from JSON with PHP?).
So build a string in JSON format and send the entire thing over HTTP.
I think you can achieve the same kind of behaviour with XML, but I am not as familiar with XML.
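On the PHP side, a minimal sketch might look like this (assuming the q parameter arrives as a valid JSON string; the script name comes from the example above):
// gotoChatRoomorNot.php
$json = isset($_GET['q']) ? $_GET['q'] : '[]';
$data = json_decode($json, true); // true = decode into associative arrays

if ($data === null) {
    die('Invalid JSON received'); // json_decode() returns null on a parse error
}

foreach ($data as $item) {
    // ...work with each decoded value...
}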
I want to use PHP to simultaneously download data from 2 URLs via simplexml_load_file, but the script must wait until all data is gathered before going ahead with the rest of the code.
$url1 = "http://www.example.com/api1";
$request1 = simplexml_load_file($url1);
$url2 = 'http://www.example.com/api2';
$request2 = simplexml_load_file("compress.zlib://$url2", NULL, TRUE);
echo 'finished';
I want all the data to be completely downloaded before printing the word finished.
How would you edit the script above to accomplish that?
Fetching URLs directly by opening them as "files" with functions such as simplexml_load_file is intended as a shortcut for simple cases where you don't need things like non-blocking/asynchronous I/O.
Your script as written will wait for everything to download before printing the word "finished", but it will also wait for the response from http://www.example.com/api1 to finish downloading before starting the request to http://www.example.com/api2.
You will need to break your problem down:
1. Download the contents of the two URLs in parallel (or more accurately, "asynchronously"). Your result will be two strings.
2. Parse each of those strings using simplexml_load_string.
The most popular HTTP library for PHP is Guzzle, but you should be able to find many alternatives, and guides to writing your own using the built-in cURL functions if you search for terms like "PHP asynchronous HTTP" or "PHP parallel HTTP requests".
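For illustration, here is a hedged sketch of those two steps using Guzzle's async API (class names are from Guzzle 7 and its promises library; treat them as assumptions if you're on an older version):
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Promise\Utils;

$client = new Client();

// Step 1: start both requests without waiting for either.
$promises = array(
    'api1' => $client->getAsync('http://www.example.com/api1'),
    'api2' => $client->getAsync('http://www.example.com/api2'),
);
$responses = Utils::unwrap($promises); // block until both responses have arrived

// Step 2: parse each downloaded string.
$request1 = simplexml_load_string((string) $responses['api1']->getBody());
$request2 = simplexml_load_string((string) $responses['api2']->getBody());

echo 'finished';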
I created a little script that imports WordPress posts from an XML file:
if (isset($_POST['wiki_import_posted'])) {
    // Get uploaded file
    $file = file_get_contents($_FILES['xml']['tmp_name']);
    $file = str_replace('&', '&amp;', $file);

    // Get and parse XML
    $data = new SimpleXMLElement($file, LIBXML_NOCDATA);

    foreach ($data->RECORD as $key => $item) {
        // Build post array
        $post = array(
            'post_title' => $item->title,
            ........
        );

        // Insert new post
        $id = wp_insert_post($post);
    }
}
The problem is that my XML file is really big, and when I submit the form, the browser just hangs for a couple of minutes.
Is it possible to display some messages during the import, like displaying a dot after every item is imported?
Unfortunately, no, not easily. Especially if you're building this on top of the WP framework, you'll find it not worth your while at all. When you're interacting with a PHP script, you are sending a request and awaiting a response. However long it takes that PHP script to finish processing and start sending output is how long it usually takes the client to start seeing a response.
There are a few things to consider if what you want is for output to start showing as soon as possible (i.e. as soon as the first echo or output statement is reached).
Turn off output buffering so that output begins sending immediately.
Output whatever you want inside the loop that would indicate the progress you wish to know about.
Note that if you're doing this with an AJAX request, the content may not be handed to the DOM via your XMLHttpRequest object immediately. Also note that some browsers buffer a chunk of output themselves before displaying anything to the user (IE, for example).
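As a minimal sketch of the first two points, using only built-in PHP functions (the loop mirrors the import loop from the question):
// Disable output buffering so each dot is sent as soon as it is echoed.
while (ob_get_level() > 0) {
    ob_end_flush(); // unwind any existing output buffers
}
ob_implicit_flush(true); // flush automatically after every output call

foreach ($data->RECORD as $item) {
    $post = array('post_title' => $item->title);
    wp_insert_post($post);
    echo '.'; // one dot per imported post
    flush();  // push the byte through to the client
}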
Some suggestions you may want to look into to speed up your script, however:
Why are you doing str_replace('&', '&amp;', $file) on a large file? You realize that has real cost, right? If you meant to replace bare ampersands with the HTML entity &amp;, you probably have some of your logic very wrong, since it will also mangle entities that are already escaped. Encoding is something you want to let the XML parser handle.
You can use curl_multi instead of file_get_contents to do multiple HTTP requests concurrently and save time if you are transferring a lot of files. It will be much faster since it uses non-blocking I/O.
You should use DOMDocument instead of SimpleXML; a DOMXPath query can build your array much faster than what you're currently doing. It's a much nicer interface than SimpleXML, and I always recommend it over SimpleXML since in most cases SimpleXML makes things incredibly difficult for no good reason. Don't let the name fool you.
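A hedged sketch of that approach; the RECORD and title element names are taken from the question's XML, everything else is illustrative:
$doc = new DOMDocument();
$doc->loadXML($file, LIBXML_NOCDATA);
$xpath = new DOMXPath($doc);

foreach ($xpath->query('//RECORD') as $record) {
    $post = array(
        // string(title) evaluates the child element's text content
        'post_title' => $xpath->evaluate('string(title)', $record),
    );
    wp_insert_post($post);
}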
I have a process defined in a batch file that runs 3 PHP scripts, one after another in sequence, and I want to create a web front-end for the process. The process is triggered when someone uploads a file using a web form.
I want the ability to notify the user after each script in the batch file is run with some useful messages, but I am not quite sure what the right way to go about it is.
I am thinking in terms of JavaScript that sends requests to the 3 PHP files in sequence, with the PHP scripts echoing status messages as they are executed. But I would rather have the three files executed with a single trigger from the client instead of having JavaScript call the three scripts separately.
Any ideas on the best way to go about it?
You could create a single PHP file that runs the 3 PHP scripts and echoes their outputs. By executing a server-side request through AJAX (I suggest the jQuery framework), the outputs can be collected and the client-side script can process and show the results to the user.
Store messages in the database or another centralized storage system. Use an AJAX call to pull the newest messages.
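A minimal sketch of that idea, assuming a hypothetical progress table with id and message columns and an existing PDO connection in $pdo:
// In the long-running script, record a row after each step finishes:
$pdo->prepare('INSERT INTO progress (message) VALUES (?)')
    ->execute(array('Step 1 finished'));

// progress.php - polled from the browser via AJAX:
$since = isset($_GET['since']) ? (int) $_GET['since'] : 0;
$stmt = $pdo->prepare('SELECT id, message FROM progress WHERE id > ? ORDER BY id');
$stmt->execute(array($since));
header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));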
Send an AJAX call to a PHP file on your server. You don't specify how your PHP scripts are run, but what I would do is include the other PHP files (which would contain functions to do whatever processing they're supposed to) and then call those functions from the main file. These functions should return some sort of success/error message that you can collect in an array. So something like this:
$response[] = firstFileProcessing(); //your function names should be better
$response[] = secondFileProcessing();
$response[] = thirdFileProcessing();
When all the scripts are done processing, do:
echo json_encode(array("responses" => $response));
Then back in your success function for the ajax call, you'd do:
var res = jQuery.parseJSON(response_from_server);
//or if you told ajax to expect a json response,
//response_from_server would automatically be an object based on the json sent
and step through each one to see what PHP said.
alert(res.responses[0]); //the first file said this
Or you could make your $response from PHP much more detailed - an array of its own, with $response['success'] = TRUE, $response['msg'] = "Worked!", etc. - any amount of data.
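Putting it all together, the main PHP file could look like this sketch (the file and function names are assumptions for illustration):
// process.php - the single endpoint the AJAX call targets.
require_once 'first.php';   // defines firstFileProcessing()
require_once 'second.php';  // defines secondFileProcessing()
require_once 'third.php';   // defines thirdFileProcessing()

$response = array();
$response[] = firstFileProcessing();
$response[] = secondFileProcessing();
$response[] = thirdFileProcessing();

header('Content-Type: application/json');
echo json_encode(array('responses' => $response));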
I am using a JSON object in my PHP file, but I don't want the JSON object to be displayed in the source code, as it increases my page size a lot.
This is what I'm doing in PHP:
$json = new Services_JSON();
$arr = array();
$qs = mysql_query("my own query");
while ($obj = mysql_fetch_object($qs)) {
    $arr[] = $obj;
}
$total = sizeof($arr);
$jsn_obj = '{"abc":' . $json->encode($arr) . ',"totalrow":"' . $total . '"}';
and this is the JavaScript:
echo '<script language=\'javascript\'>
    var dataref = new Object();
    dataref = eval(' . $jsn_obj . ');
</script>';
but I want to hide this $jsn_obj object's value from my page source. How can I do that? Please help!
I'm not sure there's a way around your problem, other than to change your mind about whether it's a problem at all (it's not, really).
You can't use the JSON object in your page if you don't output it. The only other way to get the object would be to make a separate AJAX request for it. If you did it that way, you're still transferring the exact same number of bytes that you would have originally, but now you've added the overhead of an extra HTTP request (which will be larger than it would have been originally, since there are now HTTP headers on the transfer). This way would also be slower on your page load, since you'd have to load the page, then send the AJAX request and run the result.
There are much better ways to manage the size of your pages. JSON is just text, so you should look into a server-side solution to zip your content, like mod_deflate. mod_deflate works beautifully on dynamic PHP output as well as static pages. If you don't have control over your web server, you could use PHP's built-in zlib compression.
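For example, a sketch of the zlib route when you can't enable mod_deflate (this must run before any output is sent):
ini_set('zlib.output_compression', '1'); // gzip the whole response
// Alternatively: ob_start('ob_gzhandler');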
Instead of writing the JSON data directly into the document, you can use an XMLHttpRequest or a library like jQuery to load the JSON data at runtime.
It depends largely on your JSON data. If the data you're printing inline in the HTML is huge, you might want to consider using AJAX to load the JSON data. That is assuming you want your page to load faster, even without the data.
If the data isn't that big, keep the data inline rather than making extra HTTP requests. To speed up your page, try using YSlow! to see what other areas you could optimize.
I'm looking for a PHP library that allows me to scrape webpages and takes care of all the cookies and of prefilling forms with their default values, which is what annoys me the most.
I'm tired of having to match every single input element with XPath, and I would love it if something better existed. I've come across phpQuery, but the manual isn't very clear and I can't find out how to make POST requests.
Can someone help me? Thanks.
@Jonathan Fingland:
In the example provided by the manual for browserGet() we have:
require_once('phpQuery/phpQuery.php');

phpQuery::browserGet('http://google.com/', 'success1');

function success1($browser)
{
    $browser->WebBrowser('success2')
        ->find('input[name=q]')->val('search phrase')
        ->parents('form')
        ->submit();
}

function success2($browser)
{
    echo $browser;
}
I suppose all the other fields are scraped and sent back in the GET request. I want to do the same with the phpQuery::browserPost() method, but I don't know how to do it. The form I'm trying to scrape has an input token, and I would love it if phpQuery could be smart enough to scrape the token and just let me change the other fields (in this case username and password), submitting everything via POST.
PS: Rest assured, this is not going to be used for spamming.
See http://code.google.com/p/phpquery/wiki/Ajax and in particular:
phpQuery::post($url, $data, $callback, $type)
and
# data Object, String which defines the data parameter as being either an Object or a String. POST requests should be possible using query string format, e.g.:
$data = "username=Jon&password=123456";
$url = "http://www.mysite.com/login.php";
phpQuery::post($url, $data, $callback, $type)
as phpQuery is a jQuery port, the method signature is the same (the docs link directly to the jQuery site -- http://docs.jquery.com/Ajax/jQuery.post)
Edit
Two things:
There is also a phpQuery::browserPost function which might meet your needs better.
However, also note that the success2 callback is only called on the submit() or click() methods, so you can fill in all of the form fields prior to that.
e.g.
require_once('phpQuery/phpQuery.php');

phpQuery::browserGet('http://www.mysite.com/login.php', 'success1');

function success1($browser) {
    $handle = $browser->WebBrowser('success2');
    $handle
        ->find('input[name=username]')
        ->val('Jon');
    $handle
        ->find('input[name=password]')
        ->val('123456')
        ->parents('form')
        ->submit();
}

function success2($browser) {
    print $browser;
}
(Note that this has not been tested, but should work)
I've used SimpleTest's ScriptableBrowser for such stuff in the past. It's part of the SimpleTest testing framework, but you can use it stand-alone.
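A hedged sketch of what that looks like; the URL and field names are assumptions, and SimpleBrowser tracks cookies across requests automatically:
require_once 'simpletest/browser.php';

$browser = new SimpleBrowser();
$browser->get('http://www.mysite.com/login.php');
$browser->setField('username', 'Jon');   // untouched fields keep their default values
$browser->setField('password', '123456');
echo $browser->clickSubmit('Login');     // submits the surrounding form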
I would use a dedicated library for parsing HTML files and a dedicated library for processing HTTP requests. Using the same library for both seems like a bad idea, IMO.
For processing HTTP requests, check out e.g. Httpful, Unirest, Requests or Guzzle. Guzzle is especially popular these days, but in the end, whichever library works best for you is still a matter of personal taste.
For parsing HTML files, I would recommend a library that I wrote myself: DOM-Query. It allows you to (1) load an HTML file and then (2) select or change parts of your HTML pretty much the same way you would if you were using jQuery in a frontend app.