Are there any other options for rest clients besides CURL? - php

Are there alternatives to CURL in PHP that will allow for a client to connect to a REST architecture server?
PUT, DELETE, file upload are some of the things that need to work.

You can write your own library. It's even possible to do it completely in PHP, using fsockopen and friends. For example:
function httpget($host, $uri) {
    // Build a minimal HTTP/1.1 GET request by hand
    $msg = 'GET '.$uri." HTTP/1.1\r\n".
           'Host: '.$host."\r\n".
           "Connection: close\r\n\r\n";
    // Open a plain TCP connection on port 80 and send the request
    $fh = fsockopen($host, 80);
    fwrite($fh, $msg);
    // Read the raw response (headers and body) until the server closes
    $result = '';
    while (!feof($fh)) {
        $result .= fgets($fh);
    }
    fclose($fh);
    return $result;
}
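The same pattern extends to the other verbs the question asks about; here is a hedged sketch of a raw PUT along the same lines (the function name, Content-Type, and plain port 80 are my assumptions, and error handling is omitted):
// Sketch: a hand-rolled PUT with a request body (assumptions noted above).
function httpput($host, $uri, $body) {
    $msg = 'PUT '.$uri." HTTP/1.1\r\n".
           'Host: '.$host."\r\n".
           "Content-Type: application/octet-stream\r\n".
           'Content-Length: '.strlen($body)."\r\n".
           "Connection: close\r\n\r\n".
           $body;
    $fh = fsockopen($host, 80);
    fwrite($fh, $msg);
    $result = '';
    while (!feof($fh)) {
        $result .= fgets($fh);
    }
    fclose($fh);
    return $result;
}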

I recommend Zend_Http_Client (from Zend Framework) or HTTP_Request2 (from PEAR). They both provide a well-designed object model for making HTTP requests.
In my personal experience, I've found the Zend version to be a little more mature (mostly in dealing with edge cases).


wikipedia doesn't like file_get_contents

I use the PHP function file_get_contents as a proxy to fetch websites on two different web hosts.
It works for all websites except Wikipedia.
It gives me this output every time:
WIKIMEDIA FOUNDATION
Error
Our servers are currently experiencing a technical problem. This is probably temporary and
should be fixed soon. Please try again in a few minutes.
Anyone know what the problem is?
You're probably not passing the correct User-Agent. See here.
You should pass a context to file_get_contents:
PHP: file_get_contents - Manual
PHP: stream_context_create - Manual
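For instance, something along these lines should work (the User-Agent string is a placeholder you should replace with your own tool name and contact details, and the article URL is just an example):
// Sketch: pass a descriptive User-Agent via a stream context.
$context = stream_context_create(array(
    'http' => array(
        'header' => "User-Agent: MyCoolTool/1.1 (http://example.com/MyCoolTool/; tool@example.com)\r\n"
    )
));
$html = file_get_contents('http://en.wikipedia.org/wiki/Main_Page', false, $context);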
Wikimedia Foundation policy is to block requests with non-descriptive or missing User-Agent headers because these tend to originate from misbehaving scripts. "PHP" is one of the blacklisted values for this header.
You should change the default User-Agent header to one that identifies your script and how the system administrators can contact you if necessary:
ini_set('user_agent', 'MyCoolTool/1.1 (http://example.com/MyCoolTool/; MyCoolTool@example.com)');
Of course, be sure to change the name, URL, and e-mail address rather than copying the code verbatim.
Wikipedia requires a User-Agent HTTP header be sent with the request. By default, file_get_contents does not send this.
You should use fsockopen, fputs, feof and fgets to send a full HTTP request, or you may be able to do it with cURL. My personal experience is with the f* functions, so here's an example:
$attempts = 0;
do {
    $fp = @fsockopen("en.wikipedia.org", 80, $errno, $errstr, 5);
    $attempts++;
} while (!$fp && $attempts < 5);
if (!$fp) die("Failed to connect");
fputs($fp, "GET /wiki/Page_name_here HTTP/1.0\r\n"
    ."Host: en.wikipedia.org\r\n"
    ."User-Agent: PHP-scraper (your-email@yourwebsite.com)\r\n\r\n");
$out = "";
while (!feof($fp)) {
    $out .= fgets($fp);
}
fclose($fp);
// Split headers from body (limit 2 so blank lines in the body are preserved)
list($head, $body) = explode("\r\n\r\n", $out, 2);
$head = explode("\r\n", $head);
list($http, $status, $statustext) = explode(" ", array_shift($head), 3);
if ($status != 200) die("HTTP status ".$status." ".$statustext);
echo $body;
Use cURL for this:
$ch = curl_init('http://wikipedia.org');
curl_setopt_array($ch, array(
    CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 5.1; rv:18.0) Gecko/20100101 Firefox/18.0',
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_RETURNTRANSFER => true
));
$data = curl_exec($ch);
echo $data;
I assume you have already "tried again in a few minutes".
Next thing you could try is using cURL instead of file_get_contents, and setting the user-agent to that of a common browser.
If it still doesn't work, it should at least give you some more info.

Connect through HTTPS instead of HTTP

I want to use a simple API, and I want to do it in a secure way.
It is currently using sockets and port 80. As far as I know, traffic on port 80 is sent in the clear, so it doesn't seem like a secure connection.
As the data to send contains a username and password, I want to use HTTPS instead of HTTP to make it secure.
I was wondering if it is as simple as just changing this line:
$headers = "POST /api/api.php HTTP/1.0\r\n";
For this other line
$headers = "POST /api/api.php HTTPS/1.0\r\n";
And changing the port to 443
Here is the connect function:
// api connect function
function api_connect($Username, $Password, $ParameterArray)
{
    // Create the URL to send the message.
    // The variables are set using the input from an HTML form
    $err = array();
    $url = "api.text-connect.co.uk";
    $headers = "POST /api/api.php HTTP/1.0\r\n";
    $headers .= "Host: ".$url."\r\n";

    // Create post string
    // Username and Password
    $poststring = "Username=".$Username."&";
    $poststring .= "Password=".$Password;

    // Turn the parameter array into the variables
    foreach ($ParameterArray as $Key => $Value)
    {
        $poststring .= "&".$Key."=".urlencode($Value);
    }

    // Finish off the headers
    $headers .= "Content-Length: ".strlen($poststring)."\r\n";
    $headers .= "Content-Type: application/x-www-form-urlencoded\r\n";

    // Open a socket
    $http = fsockopen($url, 80, $err[0], $err[1]);
    if (!$http)
    {
        echo "Connection to ".$url.":80 failed: ".$err[0]." (".$err[1].")";
        exit();
    }

    // Socket was open successfully, post the data.
    fwrite($http, $headers."\r\n".$poststring."\r\n");

    // Read the results from the post
    $result = "";
    while (!feof($http))
    {
        $result .= fread($http, 8192);
    }

    // Close the connection
    fclose($http);

    // Strip the headers from the result
    list($resultheaders, $resultcode) = explode("\r\n\r\n", $result, 2);
    return $resultcode;
}
?>
Your code has a huge number of issues regardless of whether it's using HTTP or HTTPS - implementing an HTTP client (or server) is MUCH more complicated than simply throwing some headers across a socket and then sinking the response.
What's particularly bad about this approach is that it will work some of the time - then it will fail and you won't understand why.
Start again using curl.
Doing it this way you only need to change the URL (it also implements a cookie jar, support for header injection, automatic following of redirects, routing via proxies, verification or non-verification of SSL certificates amongst other things).
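As a rough illustration, here is a hedged sketch of the question's POST redone with curl over HTTPS; the host, path, and field names are taken from the question's code, and whether that endpoint actually speaks HTTPS is an assumption:
// Sketch: the same POST via curl over HTTPS (assuming the API supports it).
// Only the scheme changes; curl performs the SSL handshake for you.
$ch = curl_init('https://api.text-connect.co.uk/api/api.php');
curl_setopt_array($ch, array(
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => http_build_query(array(
        'Username' => $Username,
        'Password' => $Password,
    ) + $ParameterArray),
    CURLOPT_RETURNTRANSFER => true,
));
$result = curl_exec($ch);
if ($result === false) {
    echo 'curl error: ' . curl_error($ch);
}
curl_close($ch);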
I was wondering if it is so simple as
No, it isn't. It really, really isn't.
HTTPS is HTTP tunnelled over SSL. So you don't change the content of the HTTP request at all.
You do need to perform all the SSL handshaking before you do the HTTP stuff though.
SSL is crypto, it is therefore hard. Don't try reinventing this wheel. Use a library such as cURL.
Use curl and set CURLOPT_SSL_VERIFYPEER to false.

How can I run an fql request in the background? (asynchronous api calls via php)

I'd like to run a particularly expensive fql query in the background, log results to the database, and retrieve it later without the user having to wait for each step.
Can you share an example of how to run a facebook request asynchronously?
main.php
$uid = $facebook->getUser();
if ($uid) {
    try {
        echo $uid;
        ////////////////////////////////////////////////////////
        // Run lengthy query here, asynchronously (async.php) //
        //                                                    //
        // For example: $profile = $facebook->api('/me');     //
        // (I know this request doesn't take long, but        //
        // if I can run it in the background, it'll do.)      //
        //                                                    //
        ////////////////////////////////////////////////////////
    } catch (FacebookApiException $e) {
        echo $e;
    }
}
async.php
$profile = $facebook->api('/me');
$run = mysql_query("INSERT INTO table (id) VALUES (" . $profile['id'] . ")");
complete.php
echo getProfileId(); // assume function grabs id from db, as stored via async.php
My solution to running PHP jobs in the background is just to have the script itself make a new request which executes the actual job.
The code I've used before is the following; I'm not sure if there are more elegant solutions, but it has worked more than well enough for me in the past. The reason I use fsockopen and not file_get_contents() etc. is, of course, so that I won't have to wait for the job to finish (which would defeat the purpose).
$sock = fsockopen("127.0.0.1", 80);
fwrite($sock, "GET /yourjoburl/ HTTP/1.1\r\n");
fwrite($sock, "Host: yourdomain.com\r\n");
fwrite($sock, "Connection: close\r\n");
fwrite($sock, "\r\n");
fflush($sock);
fclose($sock);
So, then you just have the other script write the results and progress to a database or whatever... also remember that MySQL supports mutexes, which means you can easily prevent multiple jobs from running at the same time... or to allow other scripts to wait for the job to finish.
PS. The reason I personally avoid exec and all that stuff is that it just seems like a hassle to work with, different servers, different setups, different OSes, etc. This works the same on all hosts that allow you to open sockets. Although you might want to add a private key to the request that you verify in the job, or check the IP of the caller, to prevent others from being able to start jobs.
EDIT: This is untested but should work if you want to forward the cookies as well, including the session cookie.
$sock = fsockopen("127.0.0.1", 80);
fwrite($sock, "GET /yourjoburl/ HTTP/1.1\r\n");
fwrite($sock, "Host: yourdomain.com\r\n");
fwrite($sock, "Cookie: " . $_SERVER['HTTP_COOKIE'] . "\r\n");
fwrite($sock, "Connection: close\r\n");
fwrite($sock, "\r\n");
fflush($sock);
fclose($sock);
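The mutex mentioned above is MySQL's named-lock feature (GET_LOCK / RELEASE_LOCK); here is a hedged sketch of guarding the background job with it, using mysql_* only to match the question's code and a placeholder lock name:
// Sketch: prevent overlapping runs of the job with a MySQL named lock.
// 'fql_job' is a placeholder name; timeout 0 means fail immediately if held.
$got = mysql_fetch_row(mysql_query("SELECT GET_LOCK('fql_job', 0)"));
if (!$got[0]) {
    exit("Job already running");
}
// ... run the expensive query and store the results in the database ...
mysql_query("SELECT RELEASE_LOCK('fql_job')");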

How fast is simplexml_load_file()?

I'm fetching lots of user data via last.fm's API for my mashup. I do this every week as I have to collect listening data.
I fetch the data through their REST API and XML: more specifically simplexml_load_file().
The script is taking ridiculously long. For about 2,300 users, the script takes 30 minutes just to fetch the names of artists. I have to fix it now, otherwise my hosting company will shut me down. I've ruled out all other options; it is the XML handling that is slowing the script.
I now have to figure out whether last.fm has a slow API (or is rate-limiting calls without telling us), or whether PHP's SimpleXML is actually rather slow.
One thing I realised is that the XML request fetches a lot more than I need, but I can't limit it through the API (i.e. give me info on only 3 bands, not 70). But even the "big" XML files only come to about 20 kB. Could that be what is slowing down the script: having to load 20 kB into an object for each of the 2,300 users?
It doesn't seem like that could be the cause... I just need confirmation that it is probably last.fm's slow API. Or is it?
Any other help you can provide?
I don't think SimpleXML is that slow; it takes some time because it is a parser, but I think the 2,300 curl/file_get_contents calls are taking far more time. Also, why not fetch the data and just use simplexml_load_string? Do you really need to write those files to the server's disk?
At least loading from memory should speed things up a bit. Also, what kind of processing are you doing on the loaded XML? Are you sure your processing is as efficient as it could be?
20kb * 2300 users is ~45MB. If you're downloading at ~25kB/sec, it will take 30 minutes just to download the data, let alone parse it.
Make sure the XML that you download from last.fm is gzipped. You'd probably have to include the correct HTTP header to tell the server you support gzip. It would speed up the download but eat more server resources with the ungzipping part.
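With curl that is a one-option change; here is a hedged sketch (the endpoint and parameters are placeholders modelled on the last.fm API, not taken from the question):
// Sketch: ask for, and transparently decode, a gzipped response with curl.
$ch = curl_init('http://ws.audioscrobbler.com/2.0/?method=user.gettopartists&user=someuser&api_key=KEY');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip');   // send Accept-Encoding: gzip
$xmlString = curl_exec($ch);
curl_close($ch);
$xml = simplexml_load_string($xmlString);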
Also consider using asynchronous downloads to free server resources. It won't necessarily speed the process up, but it should make the server administrators happy.
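One way to do that from PHP is curl's multi interface, which runs the transfers in parallel; a hedged sketch, assuming $urls holds one API URL per user:
// Sketch: fetch several users' XML in parallel with curl_multi.
$mh = curl_multi_init();
$handles = array();
foreach ($urls as $key => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip');
    curl_multi_add_handle($mh, $ch);
    $handles[$key] = $ch;
}
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);
$responses = array();
foreach ($handles as $key => $ch) {
    $responses[$key] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);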
If the XML itself is big, use a SAX parser, instead of a DOM parser.
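In PHP that could mean the streaming XMLReader (or the expat-based xml_parser_* functions) instead of loading everything into SimpleXML or DOM at once; here is a hedged sketch with XMLReader, where the <artist>/<name> element names are assumptions about the last.fm response, not confirmed by the question:
// Sketch: stream-parse a large XML file, expanding only one node at a time.
$reader = new XMLReader();
$reader->open('lastfm-response.xml');   // placeholder; could also be a URL
$doc = new DOMDocument();
$artists = array();
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'artist') {
        // Import just this <artist> element into SimpleXML, never the whole file
        $node = simplexml_import_dom($doc->importNode($reader->expand(), true));
        $artists[] = (string) $node->name;
    }
}
$reader->close();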
I think there's a limit of 1 API call per second. I'm not sure this policy is being enforced through code, but it might have something to do with it. You can ask the Last.fm staff on IRC at irc.last.fm #audioscrobbler if you believe this to be the case.
As suggested, fetch the data and parse using simplexml_load_string rather than relying on simplexml_load_file - it works out about twice as fast. Here's some code:
function simplexml_load_file2($url, $timeout = 30) {
    // parse domain etc from url
    $url_parts = parse_url($url);
    if (!$url_parts || !array_key_exists('host', $url_parts)) return false;
    $fp = fsockopen($url_parts['host'], 80, $errno, $errstr, $timeout);
    if ($fp)
    {
        $path = array_key_exists('path', $url_parts) ? $url_parts['path'] : '/';
        if (array_key_exists('query', $url_parts))
        {
            $path .= '?' . $url_parts['query'];
        }
        // make request
        $out = "GET $path HTTP/1.1\r\n";
        $out .= "Host: " . $url_parts['host'] . "\r\n";
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);
        // get response
        $resp = "";
        while (!feof($fp))
        {
            $resp .= fgets($fp, 128);
        }
        fclose($fp);
        $parts = explode("\r\n\r\n", $resp);
        $headers = array_shift($parts);
        $status_regex = "/HTTP\/1\.\d\s(\d+)/";
        if (preg_match($status_regex, $headers, $matches) && $matches[1] == 200)
        {
            $xml = join("\r\n\r\n", $parts);
            return @simplexml_load_string($xml);
        }
    }
    return false;
}

php libcurl alternative

Are there any alternatives to using curl on hosts that have curl disabled?
To fetch content via HTTP, first you can try file_get_contents; your host might not have disabled the http:// stream wrapper:
$str = file_get_contents('http://www.google.fr');
But this might be disabled (see allow_url_fopen), and sometimes it is...
If it's disabled, you can try using fsockopen; the example given in the manual says this (quoting):
$fp = fsockopen("www.example.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.example.com\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}
Considering it's quite low-level, though (you are working directly with a socket, and the HTTP protocol is not that simple), using a library that uses it will make life easier for you.
For instance, you can take a look at Snoopy; here is an example.
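If allow_url_fopen is available, the http:// wrapper mentioned above can also send POST requests without curl; a hedged sketch with a placeholder URL and placeholder form fields:
// Sketch: POST through the http:// stream wrapper (no curl needed).
$context = stream_context_create(array(
    'http' => array(
        'method'  => 'POST',
        'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
        'content' => http_build_query(array('foo' => 'bar')),
    ),
));
$response = file_get_contents('http://www.example.com/handler.php', false, $context);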
http://www.phpclasses.org is full of these "alternatives", you can try this one: http://www.phpclasses.org/browse/package/3588.html
You can write a plain curl proxy script with PHP and place it on curl-enabled hosting; then, when you need curl, you make client calls to it from the curl-less machine, and it returns the data you need. It may be a strange solution, but it was helpful once.
All the answers in this thread present valid workarounds, but there is one thing you should keep in mind. Your host has, for whatever reason, deemed that making HTTP requests from your web server via PHP code is a "bad thing", and has therefore disabled (or not enabled) the curl extension. There's a really good chance that if you find a workaround and they notice it, they'll block your requests some other way. Unless there are political reasons forcing you onto this particular host, seriously consider moving your app/page elsewhere if it needs to make HTTP requests.
