Parallel downloads using simplexml_load_file in PHP

I want to use PHP to simultaneously download data from 2 URLs via simplexml_load_file, but the script must wait until all data is gathered before going ahead with the rest of the code.
$url1 = "http://www.example.com/api1";
$request1 = simplexml_load_file($url1);
$url2 = 'http://www.example.com/api2';
$request2 = simplexml_load_file("compress.zlib://$url2", NULL, TRUE);
echo 'finished';
I want all data to be completely downloaded before the word finished is printed.
How would you edit the script above to accomplish that?

Fetching URLs directly while opening "files" with functions such as simplexml_load_file is intended as a shortcut for simple cases where you don't need things like non-blocking / asynchronous I/O.
Your script as written will wait for everything to download before printing the word "finished", but it will also wait for the response from http://www.example.com/api1 to finish downloading before starting the request to http://www.example.com/api2.
You will need to break your problem down:
1. Download the contents of the two URLs in parallel (or, more accurately, "asynchronously"). Your result will be two strings.
2. Parse each of those strings using simplexml_load_string.
The most popular HTTP library for PHP is Guzzle, but you should be able to find many alternatives, and guides to writing your own using the built-in cURL functions if you search for terms like "PHP asynchronous HTTP" or "PHP parallel HTTP requests".
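For illustration, here is a minimal sketch of that approach using PHP's built-in curl_multi functions (Guzzle's async API would achieve the same result); the URLs are the example endpoints from the question:
<?php
// Start both transfers, wait until all of them have finished,
// then parse each body with simplexml_load_string.
$urls = array(
    'http://www.example.com/api1',
    'http://www.example.com/api2',
);
$mh = curl_multi_init();
$handles = array();
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
    curl_setopt($ch, CURLOPT_ENCODING, '');         // let curl handle gzip/deflate (covers the compress.zlib case)
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}
do {
    curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh); // wait for network activity instead of busy-looping
    }
} while ($running);
$results = array();
foreach ($handles as $ch) {
    $results[] = simplexml_load_string(curl_multi_getcontent($ch));
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
echo 'finished'; // both responses are fully downloaded and parsed at this point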

Related

How to make speed test with curl?

I want to use fast.com, but when I use
$curlHandle = curl_init('http://fast.com');
it returns 0, so the page is not loading.
How can I get a speed result with curl?
A library like https://github.com/aln-1/speedtest-php might be better suited - it uses speedtest.net.
Just curl-downloading fast.com would only download the webpage itself, not perform an actual speed test.
If you want to do it with curl you need to download an actual huge file, for example as found on https://speed.hetzner.de/.
Additionally curl_init does not do the actual transfer - you need to execute it. Read the curl docs for this :-)
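As a rough sketch of that approach (the exact test file name on that host is an assumption), you could run the transfer and read curl's own speed measurement afterwards:
<?php
// Download a known large file and ask curl how fast the transfer was.
$ch = curl_init('https://speed.hetzner.de/100MB.bin'); // example test file
$devNull = fopen('/dev/null', 'w');       // discard the payload (adjust the path on Windows)
curl_setopt($ch, CURLOPT_FILE, $devNull);
curl_exec($ch); // the actual transfer happens here, not in curl_init
$bytesPerSecond = curl_getinfo($ch, CURLINFO_SPEED_DOWNLOAD);
curl_close($ch);
fclose($devNull);
printf("Average download speed: %.2f Mbit/s\n", $bytesPerSecond * 8 / 1e6);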

Status report during form process

I created a little script that imports WordPress posts from an XML file:
if(isset($_POST['wiki_import_posted'])) {
    // Get uploaded file
    $file = file_get_contents($_FILES['xml']['tmp_name']);
    $file = str_replace('&', '&amp;', $file);
    // Get and parse XML
    $data = new SimpleXMLElement($file, LIBXML_NOCDATA);
    foreach($data->RECORD as $key => $item) {
        // Build post array
        $post = array(
            'post_title' => $item->title,
            ........
        );
        // Insert new post
        $id = wp_insert_post( $post );
    }
}
The problem is that my XML file is really big, and when I submit the form, the browser just hangs for a couple of minutes.
Is it possible to display some messages during the import, like displaying a dot after every item is imported?
Unfortunately, no, not easily. Especially if you're building this on top of the WP framework you'll find it not worth your while at all. When you're interacting with a PHP script you are sending a request and awaiting a response. However long it takes that PHP script to finish processing and start sending output is how long it usually takes the client to start seeing a response.
There are a few things to consider if what you want is for output to start showing as soon as possible (i.e. as soon as the first echo or output statement is reached).
Turn off output buffering so that output begins sending immediately.
Output whatever you want inside the loop that would indicate the progress you wish to know about.
Note that if you're doing this with an AJAX request, content may not be ready immediately to transport to the DOM via your XMLHttpRequest object. Also note that some browsers do their own buffering before content is displayed to the user (IE, for example).
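A minimal sketch of those two points, assuming $data and wp_insert_post() from the question's code:
<?php
// Disable any active output buffers so echoes leave PHP immediately.
while (ob_get_level() > 0) {
    ob_end_flush();
}
foreach ($data->RECORD as $item) {
    wp_insert_post(array('post_title' => (string) $item->title));
    echo '.'; // one dot per imported item
    flush();  // push the output to the client right away
}
Keep in mind that the web server itself may add another layer of buffering on top of this.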
Some suggestions you may want to look into to speed up your script, however:
Why are you doing str_replace('&', '&amp;', $file) on a large file? You realize that has cost with no benefit, right? You've accomplished nothing, and if you meant to replace the HTML entity &amp;, then you probably have some of your logic very wrong. Encoding is something you want to let the XML parser handle.
You can use curl_multi instead of file_get_contents to do multiple HTTP requests concurrently and save time if you are transferring a lot of files. It will be much faster since it's non-blocking I/O.
You should use DOMDocument instead of SimpleXML, and a DOMXPath query can get you your array much faster than what you're currently doing. It's a much nicer interface than SimpleXML, and I always recommend it over SimpleXML since in most cases SimpleXML makes things incredibly difficult for no good reason. Don't let the name fool you.
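A minimal sketch of that suggestion, applied to the question's RECORD/title structure:
<?php
// Parse the uploaded XML with DOMDocument and pull the titles out via XPath.
$doc = new DOMDocument();
$doc->loadXML($file, LIBXML_NOCDATA);
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//RECORD/title') as $titleNode) {
    $post = array('post_title' => $titleNode->nodeValue);
    // wp_insert_post($post);
}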

PHP Concurrent HTTP requests?

I was wondering what the best way is to do concurrent HTTP requests in PHP? I have a lot of data to get and I'd rather do multiple requests at once to retrieve it all.
Does anybody know how I can do this? Preferably in an anonymous/callback function manner...
Thanks,
Tom.
You can use curl_multi, which internally fires off multiple separate requests under a single curl handle.
But otherwise PHP itself is not in any way/shape/form "multithreaded" and will not allow things to run in parallel, except via gross hacks (multiple parallel scripts, one script firing up multiple background tasks via exec(), etc...).
You can try either curl_multi() or use a lower-level function, socket_select().
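As a sketch of the callback style the question asks for (the function name here is illustrative, not a library API), curl_multi can hand each response to an anonymous function as soon as that transfer completes:
<?php
function fetchAll(array $urls, callable $onDone)
{
    $mh = curl_multi_init();
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
    }
    do {
        curl_multi_exec($mh, $running);
        // Pick up any transfers that have just finished.
        while ($info = curl_multi_info_read($mh)) {
            $ch = $info['handle'];
            $onDone(curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), curl_multi_getcontent($ch));
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
        }
        if ($running) {
            curl_multi_select($mh); // wait for activity instead of busy-looping
        }
    } while ($running);
    curl_multi_close($mh);
}
fetchAll(
    array('http://www.google.com', 'http://www.yahoo.com'),
    function ($url, $body) { echo $url, ': ', strlen($body), " bytes\n"; }
);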
You can use HttpRequestPool (from the pecl_http extension): http://www.php.net/manual/de/httprequestpool.construct.php
$multiRequests = array(
    new HttpRequest('http://www.google.com', HttpRequest::METH_GET),
    new HttpRequest('http://www.yahoo.com', HttpRequest::METH_GET),
    new HttpRequest('http://www.bing.com', HttpRequest::METH_GET)
);
$pool = new HttpRequestPool();
foreach ($multiRequests as $request)
{
    $pool->attach($request);
}
$pool->send();
foreach ($pool as $request)
{
    echo $request->getResponseBody();
}
Or if you want, you can send your data as JSON. In PHP you can unpack it into all the individual values again. For example:
xhttp.open("GET", "gotoChatRoomorNot.php?q=[{"+str+"},{"+user1+"},{"+user2+"}]", true);
and in PHP you can follow this to get your data back: How do I extract data from JSON with PHP?
So make a string in JSON format and send the entire thing through HTTP.
I think you can perform the same kind of behaviour with XML, but I am not familiar with XML.

Loading Javascript through PHP

From a tutorial I read on Sitepoint, I learned that I could load JS files through PHP (it was a comment, anyway). The code for this was in this form:
<script src="js.php?script1=jquery.js&scipt2=main.js" />
The purpose of using PHP was to reduce the number of HTTP requests for JS files. But from the markup above, it seems to me that there are still going to be the same number of requests as if I had written two tags for the JS files (I could be wrong, that's why I'm asking).
The question is how is the PHP code supposed to be written and what is/are the advantage(s) of this approach over the 'normal' method?
The original poster was presumably meaning that
<script src="js.php?script1=jquery.js&scipt2=main.js" />
Will cause less http requests than
<script src="jquery.js" />
<script src="main.js" />
That is because js.php will read all script names from the GET parameters and then print them out as a single response. This means there's only one roundtrip to the server to get all scripts.
js.php would probably be implemented like this:
<?php
header('Content-Type: application/javascript'); // mark the response as a script
$script1 = $_GET['script1'];
$script2 = $_GET['script2'];
// Note: a real version must whitelist the allowed file names,
// otherwise anyone can read arbitrary files off the server.
echo file_get_contents($script1); // Load the content of jquery.js and print it to browser
echo file_get_contents($script2); // Load the content of main.js and print it to browser
Note that this may not be worthwhile if only a small number of scripts is required. The main benefit comes from the fact that a web browser will only load a limited number of scripts in parallel from the same domain.
You will need to implement caching to avoid loading and concatenating all your scripts on every request; combining them on every request wastes a lot of CPU.
IMO, the best way to do this is to combine and minify all script files into a big one before deploying your website, and then reference that file. This way, the client just makes one roundtrip to the server, and the server does not have any extra load upon each request.
Please note that the PHP solution provided is by no means a good approach, it's just a simple demonstration of the procedure.
The main advantage of this approach is that there is only a single request between the browser and server.
Once the server receives the request, the PHP script combines the javascript files and spits the results out.
Building a PHP script that simply combines JS files is not at all difficult. You simply include the JS files and send the appropriate content-type header.
When it gets more difficult is based on whether or not you want to worry about caching.
I recommend you check out minify.
<script src="js.php?script1=jquery.js&scipt2=main.js" />
That's:
invalid (ampersands have to be encoded)
hard to expand (using script[]= would make PHP treat it as an array you can loop over)
not HTML compatible (always use <script></script>, never <script />)
The purpose of using PHP was to reduce the number of HTTP requests for JS files. But from the markup above, it seems to me that there are still going to be the same number of requests as if I had written two tags for the JS files (I could be wrong, that's why I'm asking).
You're wrong. The browser makes a single request. The server makes a single response. It just digs around in multiple files to construct it.
The question is how is the PHP code supposed to be written
The steps are listed in this answer
and what is/are the advantage(s) of this approach over the 'normal' method?
You get a single request and response, so you avoid the overhead of making multiple HTTP requests.
You lose the benefits of the generally sane cache control headers that servers send for static files, so you have to set up suitable headers in your script.
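A minimal sketch of such headers, assuming $jsFiles holds the validated list of script paths (the values here are illustrative, not canonical):
<?php
// Mirror, roughly, what a web server sends for static files.
$lastModified = max(array_map('filemtime', $jsFiles));
header('Content-Type: application/javascript');
header('Cache-Control: public, max-age=86400'); // let clients cache for a day
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastModified) . ' GMT');
// Answer conditional requests so repeat visitors get a cheap 304.
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $lastModified) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}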
You can do this like this:
The concept is quite simple, but you can make it a bit more advanced.
Step 1: merging the files
<?php
$scripts = $_GET['script'];
$contents = "";
foreach ($scripts as $script)
{
    // validate $script here to prevent inclusion of arbitrary files
    $contents .= file_get_contents($pathto . "/" . $script); // $pathto: your JS directory
}
// post-processing here
// e.g. jsmin, Google Closure, etc.
echo $contents;
?>
usage:
<script src="js.php?script[]=jquery.js&script[]=otherfile.js" type="text/javascript"></script>
Step 2: caching
<?php
function cacheScripts($scriptsArray, $outputdir)
{
    $filename = sha1(join("-", $scriptsArray)) . ".js";
    $path = $outputdir . "/" . $filename;
    if (file_exists($path))
    {
        return $filename;
    }
    $contents = "";
    foreach ($scriptsArray as $script)
    {
        // validate $script here to prevent inclusion of arbitrary files
        $contents .= file_get_contents($pathto . "/" . $script); // $pathto: your JS directory
    }
    // post-processing here
    // e.g. jsmin, Google Closure, etc.
    file_put_contents($path, $contents);
    return $filename;
}
?>
<script src="/js/<?php echo cacheScripts(array('jquery.js', 'myscript.js'), "/path/to/js/dir"); ?>" type="text/javascript"></script>
This makes it a bit more advanced. Please note, this is semi-pseudo code to explain the concepts. In practice you will need to do more error checking and you need to do some cache invalidation.
To do this in a more managed and automated way, there's Assetic (if you can use PHP 5.3):
https://github.com/kriswallsmith/assetic
(Which more or less does this, but much better)
Assetic
Documentation
https://github.com/kriswallsmith/assetic/blob/master/README.md
The workflow will be something along the lines of this:
use Assetic\Asset\AssetCollection;
use Assetic\Asset\FileAsset;
use Assetic\Asset\GlobAsset;

$js = new AssetCollection(array(
    new GlobAsset('/path/to/js/*'),
    new FileAsset('/path/to/another.js'),
));
// the code is merged when the asset is dumped
echo $js->dump();
There is a lot of support for many formats:
js
css
lots of minifiers and optimizers (CSS, JS, PNG, etc.)
Support for sass, http://sass-lang.com/
Explaining everything is a bit outside the scope of this question. But feel free to open a new question!
PHP will simply concatenate the two script files and send only one script with the contents of both files, so you will only have one request to the server.
Using this method, there will still be the same number of disk IO requests as if you had not used the PHP method. However, in the case of a web application, disk IO on the server is rarely the bottleneck; the network is. What this allows you to do is reduce the overhead associated with requesting the files from the server over the network via HTTP (it reduces the number of messages sent over the network). The PHP script outputs the concatenation of all of the requested files, so you get all of your scripts in one HTTP request operation rather than several.
Looking at the parameters being passed to js.php, it can load two JavaScript files (or any number, for that matter) in one request. It just looks at its parameters (script1, script2, scriptN) and loads them all in one go, as opposed to loading them one by one with normal script tags.
The PHP file could also do other things, like minifying before outputting, although it's probably not a good idea to minify on the fly for every request.
The way the PHP code would be written is: look at the script parameters and load the corresponding files from a given directory. However, it's important to check the file type and/or location before loading; you don't want to give people a backdoor through which they can read every file on your server.

json to php (server) possible and how?

Basically I want this JSON output to be transferred to my server/PHP.
I tried this:
$url = "http://maps.google.com/maps/nav?q=from:9500 wil to:3000 bern";
$conte = file_get_contents($url);
echo $conte;
The JSON is not echoed. How can I save the output to my server?
You need to urlencode the GET parameters:
echo file_get_contents('http://maps.google.com/maps/nav?q=from:9500%20wil%20to:3000%20bern');
# Returns
# {"name":"from:9500 wil to:3000 bern","Status":{"code":200,"request":"directions"},"Placemark":[{"id":"","address":"Wil, Switzerland","AddressDetails":{"Country":{"CountryNameCode":"CH","CountryName":"Schweiz","AdministrativeArea":{"AdministrativeAreaName":"St. Gallen","SubAdministrativeArea":{"SubAdministrativeAreaName":"Wil","Locality":{"LocalityName":"Wil"}}}},"Accuracy": 4},"Point":{"coordinates":[9.048081,47.463817,0]}},{"id":"","address":"Frohbergweg 7, 3012 Bern District, Switzerland","AddressDetails":{"Country":{"CountryNameCode":"CH","AdministrativeArea":{"AdministrativeAreaName":"BE","SubAdministrativeArea":{"SubAdministrativeAreaName":"Bern","Locality":{"LocalityName":"Bern District","DependentLocality":{"DependentLocalityName":"Länggasse-Felsenau","Thoroughfare":{"ThoroughfareName":"Frohbergweg 7"},"PostalCode":{"PostalCodeNumber":"3012"}}}}}},"Accuracy": 0},"Point":{"coordinates":[7.436386,46.954897,0]}}],"Directions":{"copyrightsHtml":"Map data \u0026#169;2010 Google, Tele Atlas ","summaryHtml":"178\u0026nbsp;km (about 2 hours 2 mins)","Distance":{"meters":177791,"html":"178\u0026nbsp;km"},"Duration":{"seconds":7343,"html":"2 hours 2 mins"},"Routes":[{"Distance":{"meters":177791,"html":"178\u0026nbsp;km"},"Duration":{"seconds":7343,"html":"2 hours 2 mins"},"summaryHtml":"178\u0026nbsp;km (about 2 hours 2 mins)","Steps":[{"descriptionHtml":"Head \u003Cb\u003Esouth\u003C\/b\u003E on \u003Cb\u003EToggenburgerstrasse\u003C\/b\u003E toward \u003Cb\u003ELerchenfeldstrasse\/\u003Cwbr\/\u003ERoute 16\/\u003Cwbr\/\u003ERoute 7\u003C\/b\u003E","Distance":{"meters":29,"html":"29\u0026nbsp;m"},"Duration":{"seconds":2,"html":"2 secs"},"Point":{"coordinates":[9.048030,47.463830,0]}},{"descriptionHtml":"Take the 1st left onto \u003Cb\u003ERoute 7\u003C\/b\u003E","Distance":{"meters":625,"html":"650\u0026nbsp;m"},"Duration":{"seconds":109,"html":"2 mins"},"Point":{"coordinates":[9.047930,47.463570,0]}},{"descriptionHtml":"At the traffic circle, take the \u003Cb\u003E1st\u003C\/b\u003E exit onto \u003Cb\u003EGeorg Rennerstrasse\u003C\/b\u003E","Distance":{"meters":871,"html":"850\u0026nbsp;m"},"Duration":{"seconds":77,"html":"1 min"},"Point":{"coordinates":[9.056170,47.463110,0]}},{"descriptionHtml":"Take the ramp to \u003Cb\u003EZürich\/\u003Cwbr\/\u003EFrauenfeld\u003C\/b\u003E","Distance":{"meters":330,"html":"350\u0026nbsp;m"},"Duration":{"seconds":22,"html":"22 secs"},"Point":{"coordinates":[9.053350,47.455800,0]}},{"descriptionHtml":"Merge onto \u003Cb\u003EA1\u003C\/b\u003E\u003Cdiv class=\"google_impnote\"\u003EToll road\u003C\/div\u003E","Distance":{"meters":173696,"html":"174\u0026nbsp;km"},"Duration":{"seconds":6790,"html":"1 hour 53 mins"},"Point":{"coordinates":[9.050270,47.453900,0]}},{"descriptionHtml":"Take exit \u003Cb\u003E36-Bern-Neufeld\u003C\/b\u003E toward \u003Cb\u003EBremgarten\u003C\/b\u003E","Distance":{"meters":579,"html":"600\u0026nbsp;m"},"Duration":{"seconds":33,"html":"33 secs"},"Point":{"coordinates":[7.436980,46.966570,0]}},{"descriptionHtml":"At the traffic circle, take the \u003Cb\u003E2nd\u003C\/b\u003E exit onto \u003Cb\u003ENeubrückstrasse\u003C\/b\u003E","Distance":{"meters":1357,"html":"1.4\u0026nbsp;km"},"Duration":{"seconds":243,"html":"4 mins"},"Point":{"coordinates":[7.429580,46.966790,0]}},{"descriptionHtml":"Turn right at \u003Cb\u003EMittelstrasse\u003C\/b\u003E","Distance":{"meters":146,"html":"150\u0026nbsp;m"},"Duration":{"seconds":24,"html":"24 secs"},"Point":{"coordinates":[7.437750,46.956720,0]}},{"descriptionHtml":"Take the 1st left onto 
\u003Cb\u003EBrückfeldstrasse\u003C\/b\u003E","Distance":{"meters":104,"html":"100\u0026nbsp;m"},"Duration":{"seconds":33,"html":"33 secs"},"Point":{"coordinates":[7.436060,46.956100,0]}},{"descriptionHtml":"Take the 1st right onto \u003Cb\u003EFrohbergweg\u003C\/b\u003E\u003Cdiv class=\"google_note\"\u003EDestination will be on the left\u003C\/div\u003E","Distance":{"meters":54,"html":"54\u0026nbsp;m"},"Duration":{"seconds":10,"html":"10 secs"},"Point":{"coordinates":[7.436830,46.955320,0]}}],"End":{"coordinates":[7.436234,46.955057,0]}}]}}
How do you want to save it? As a file?
If you can open it via file_get_contents(), then URL fopen() wrappers are enabled.
Then you can just do...
$url = 'http://maps.google.com/maps/nav?q=from:9500 wil to:3000 bern';
$content = file_get_contents($url);
file_put_contents('google-map-json.txt', $content);
You can get that into a usable object in PHP with json_decode().
You may want to do that if you want to save it to your database.
If you don't want to overwrite the file each time, you could generate a random hash of the response for the filename, or something similar.
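A small sketch of that, using field names visible in the response above:
<?php
// Fetch (with the GET parameters urlencoded), decode, and save a copy.
$url  = 'http://maps.google.com/maps/nav?q=' . urlencode('from:9500 wil to:3000 bern');
$json = file_get_contents($url);
$data = json_decode($json, true); // true = associative arrays
echo $data['Directions']['summaryHtml']; // "178&nbsp;km (about 2 hours 2 mins)"
// A hash in the file name avoids overwriting earlier responses.
file_put_contents('map-' . sha1($json) . '.json', $json);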
Update
Sorry, my bad. I know how to save a file, but the JSON is not even echoed through file_get_contents.
You may not have URL fopen() wrappers enabled.
You can find out by running this...
var_dump(ini_get('allow_url_fopen'));
If it is disabled, and you can't or don't want to turn it on, you can use cURL (if you have that library installed).
Update
When I tried to access the page via file_get_contents(), I got...
HTTP/1.0 400 Bad Request
You may need to use cURL, and mimic a browser (user agent etc).
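For example (a sketch; the User-Agent string is illustrative):
<?php
// Fetch with cURL while presenting a browser-like User-Agent.
$ch = curl_init($url); // $url as in the question
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0'); // pretend to be a browser
$json = curl_exec($ch);
curl_close($ch);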
Or you can set
ini_set('user_agent', 'Mozilla or something');
And then use file_get_contents().
Update
I tried with cURL too, and it didn't work :(
I think the next step is to examine all the headers your browser sends (when it works), and then send the equivalent via cURL.
Update
I noticed the Markdown editor wasn't liking the URL (see my OP's edit), and it dawned on me - urlencode() the GET params!
You can write the data to a file using PHP's IO functions:
$fp = fopen('data.txt', 'a'); // 'a' appends instead of overwriting
fwrite($fp, $conte);
fclose($fp);
You can use json_decode to parse the data from the string returned by file_get_contents(). (I would personally use cURL instead of file_get_contents().)
