Why is cURL faster than file_get_contents? - php

There are actually two sub-questions:
What's the difference between the PHP curl library and libcurl? Is the PHP curl library just a bridge to connect to and use libcurl, or is it libcurl rewritten in PHP?
Why is curl much faster than the file_get_contents function in PHP?

The difference between the PHP curl library and libcurl
PHP/CURL is a "binding" for the underlying libcurl library. It means that there's a bunch of glue code in the PHP curl extension that ultimately calls libcurl to actually perform the transfer operations.
The PHP code doesn't do very much more than converting from PHP conventions to libcurl conventions (and back again). It allows PHP users to take advantage of libcurl's raw native speed and latest developments without anyone having to change anything.
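One way to see the binding at work is to ask the extension which libcurl it is linked against. This is a minimal sketch; describe_libcurl() is a helper name made up for illustration, while curl_version() and the array keys it reads are part of the PHP curl extension.

```php
<?php
// describe_libcurl() is a hypothetical helper, not part of PHP.
// curl_version() returns details about the underlying libcurl build.
function describe_libcurl(): string
{
    if (!extension_loaded('curl')) {
        return 'curl extension not loaded';
    }
    $info = curl_version();
    return sprintf(
        'libcurl %s, SSL: %s, protocols: %s',
        $info['version'],
        $info['ssl_version'] ?: 'none',
        implode(', ', $info['protocols'])
    );
}

echo describe_libcurl(), PHP_EOL;
```

The version printed is that of the C library on the machine, not of PHP itself, which is why upgrading libcurl can change behaviour without any PHP code changing.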
Why is curl faster than file_get_contents function in PHP?
Both are implemented in C and offer file transfer capabilities for PHP programs. The explanation probably lies in their respective software architectures and feature sets, which make one faster than the other for certain use cases.
More work and effort have possibly also been spent on optimizing transfer performance in libcurl.
As in most cases, it might be worth benchmarking exactly your case so that you know that you're not relying on speed tests done for cases that have other characteristics than yours.
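A benchmark along these lines could look like the sketch below. It assumes PHP 7.3 or later (for hrtime()), the curl extension, and allow_url_fopen enabled; the function names median_ms, fetch_with_stream and fetch_with_curl are made up for illustration, and the URL is whatever you pass on the command line.

```php
<?php
// Time a callable several times and return the median run in milliseconds
// (the upper middle value when the number of runs is even).
function median_ms(callable $fetch, int $runs = 5): float
{
    $samples = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);                 // monotonic clock, nanoseconds
        $fetch();
        $samples[] = (hrtime(true) - $start) / 1e6;
    }
    sort($samples);
    return $samples[intdiv(count($samples), 2)];
}

// Fetch through PHP's http stream wrapper.
function fetch_with_stream(string $url): string
{
    $body = file_get_contents($url);
    return $body === false ? '' : $body;
}

// Fetch through the curl extension.
function fetch_with_curl(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    return is_string($body) ? $body : '';
}

// Only touch the network when a URL is given: php bench.php https://...
$url = $argv[1] ?? null;
if ($url !== null) {
    printf("file_get_contents: %.1f ms\n", median_ms(function () use ($url) {
        fetch_with_stream($url);
    }));
    printf("curl:              %.1f ms\n", median_ms(function () use ($url) {
        fetch_with_curl($url);
    }));
}
```

Each call here opens a fresh connection, so the numbers include DNS and handshake time. If your real code reuses one curl handle for many requests, benchmark that variant instead, since the results can differ considerably.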

Related

How does PHP work and what is its architecture?

Guys, recently I decided to go back to PHP and do some more complex stuff than a simple login page. For 3 years I've been programming with Java/JavaEE and have a good understanding of the architecture of Java applications. Basically, a virtual machine (a simple OS process) runs compiled code called bytecode. A simple Java web server is basically a Java application that listens on a provided TCP port for HTTP requests and responds accordingly. Of course it is more complicated than that, but this is its basic job.
Now, what about PHP? How does it work? What, in a nutshell, is its architecture?
I googled about this question but in 90% the articles explain how to implement and construct a web application with PHP which is not what I am looking for.
The biggest difference between a Java web server and PHP is that PHP doesn't have its own built-in web server. (Well, newer versions do, but it's supposed to be for testing only, it's not a production ready web server.) PHP itself is basically one executable which reads in a source code file of PHP code and interprets/executes the commands written in that file. That's it. That's PHP's architecture in a nutshell.
That executable supports a default API which the userland PHP code can call, and it's possible to add extensions to provide more APIs. Those extensions are typically written in C and compiled together with the PHP executable at install time. Some extensions can only be added by recompiling PHP with additional flags, others can be compiled against a PHP install and activated via a configuration file after the fact. PHP offers the PEAR and PECL side projects as an effort to standardise and ease such after-the-fact installs. Userland PHP code will often also include additional third party libraries simply written in PHP code. The advantage of C extensions is their execution speed and low-level system access, the advantage of userland code libraries is their trivial inclusion. If you're administering your own PHP install, it's often simple enough to add new PHP extensions; however on the very popular shared-host model there's often a tension between what the host wants to install and what the developer needs.
In practice a web service written in PHP runs on a third party web server, very often Apache, which handles any incoming requests and invokes the PHP interpreter with the given requested PHP source code file as argument, then delivers any output of that process back to the HTTP client. This also means there's no persistent PHP process running at all times with a persistent state, like Java typically does, but each request is handled by starting up and then tearing down a new PHP instance.
While Java simply saves persistent data in memory, data persistence between requests in PHP is handled via a number of methods like memcache, sessions, databases, files etc.; depending on the specific needs of the situation. PHP does have opcode cache addons, which kind of work like Java byte code, simply so PHP doesn't have to repeat the same parse and compile process every single time it's executing the same file.
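To make the point about persistence concrete, here is a minimal sketch of a hit counter that survives between requests by living in a file. The function name bump_counter and the file location are assumptions for illustration; sessions, memcache or a database follow the same idea of keeping state outside the PHP process.

```php
<?php
// Increment a counter stored in a file so the value outlives the request.
// bump_counter() is a hypothetical helper, not a PHP built-in.
function bump_counter(string $path): int
{
    $fh = fopen($path, 'c+');            // create if missing, do not truncate
    if ($fh === false) {
        throw new RuntimeException("cannot open $path");
    }
    flock($fh, LOCK_EX);                 // serialize concurrent requests
    $count = (int) stream_get_contents($fh) + 1;
    ftruncate($fh, 0);
    rewind($fh);
    fwrite($fh, (string) $count);
    fflush($fh);
    flock($fh, LOCK_UN);
    fclose($fh);
    return $count;
}

$path = sys_get_temp_dir() . '/hits_example.txt';   // assumed location
echo 'This page has been requested ', bump_counter($path), " times\n";
```

A plain PHP variable would start at zero on every request, because the interpreter state is torn down once the response has been sent.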
Do keep in mind that it's entirely feasible to write a persistent PHP program which keeps running just like Java, it's simply not PHP's default modus operandi. Personally I'm quite a fan of writing workers for specific tasks on Gearman or ZMQ which run persistently, and have some ephemeral scripts running on the web server as "frontend" which delegate work to those workers as needed.
If this sounds like a typical PHP app is much more of a glued-together accumulation of several disparate components, you'd be correct. Java is pretty self-contained, except for external products like RDBMS servers. PHP on the other hand often tends to rely on a bunch of third party products; which can work to its advantage in the sense that you can use best-of-breed products for specific tasks, but also requires more overhead of dealing with different systems.
This is how PHP works:
In general terms, the PHP engine parses the content of PHP files (typically *.php, although alternative extensions are used occasionally) into an abstract syntax tree. The engine then processes the AST and returns the result given whatever inputs and processing are required.
[Image giving more detail; not reproduced here]
Source: freecodecamp.org

Which is faster? Using PHP's cURL library or invoking the curl utility from shell_exec()

For a PHP project I have to access a RESTful API. I was using curl to get familiar with the API. I can access the said API using both PHP's cURL library and by invoking the curl utility using PHP's shell_exec() function. Performance-wise, which option would be better, and why?
PS: I have my own server with root privilege.
My cautious guess, backed only by some not too useful test snippets, would be that the curl library is more performant.
Edit: A little test shows that the library is faster, but not by much. Also, if you fetch millions of URLs, network latency will more likely be a bigger problem.
Performance is pretty much exactly the same, because the same stuff is being executed internally. But you should use the API because it is cleaner.
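The two options being compared can be sketched side by side as follows. The function names are made up for illustration, and the sketch assumes a curl binary on the PATH and the curl extension being loaded. Note that the shell variant starts a new process per request and needs escapeshellarg() to be safe with untrusted URLs.

```php
<?php
// Build the shell command; -s silences curl's progress meter and
// escapeshellarg() guards against shell injection through the URL.
function curl_command(string $url): string
{
    return 'curl -s ' . escapeshellarg($url);
}

// Option 1: spawn the curl utility and capture its output.
function fetch_via_shell(string $url): ?string
{
    $out = shell_exec(curl_command($url));
    return is_string($out) ? $out : null;
}

// Option 2: call libcurl in-process through the PHP extension.
function fetch_via_extension(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $out = curl_exec($ch);
    return is_string($out) ? $out : null;
}
```

With the extension you also get the status code, headers and error details through curl_getinfo() and curl_error(), which you would otherwise have to parse out of the command's output.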

What are nice use cases for cURL in PHP?

It's evident that the cURL functions are very widely used. But why is that? Is it really only because the extension is mostly enabled per default?
While I can certainly relate to not introducing 3rd party libraries over builtins (DOMDocument vs phpQuery), using curl appears somewhat odd to me. There are heaps of HTTP libraries like Zend_Http or PEAR Http_Request. And despite my disdain for needless object-oriented interfaces, the pull-parameter-procedural API of curl strikes me as less legible in comparison.
There is of course a reason for that. But I'm wondering if most PHP developers realize what else libcurl can actually be used for, and that it's not just an HTTP library?
Do you have examples or actual code which utilizes cURL for <any other things> it was made for?
Or if you just use it for HTTP, what are the reasons? Why are real PHP HTTP libraries seemingly avoided nowadays?
I think this would be related to why do people use the mysql functions instead of mysqli (more object oriented interface) or take a step further and use a data abstraction layer or PDOs.
HTTP_Request2 says that there is a cURL adapter available to wrap around PHP's cURL functions.
Personally, I haven't been that impressed with a lot of the PEAR extensions I have tried out (and I feel less confident with PEAR libraries that are sitting in alpha and haven't been updated in a long time). The HTTP_Request2 library, on the other hand, does look quite nice.
I for one would have used cURL without thinking of looking at a possible PEAR library to use. So thanks for raising my awareness.
The libraries you mentioned aren't default, and from my experience in PHP, I prefer to use fewer of such libraries; they create a broader attack surface, decrease reliability, and are more open to future modification or deprecation than PHP itself.
Then, there's the sockets functionality which, although I've used it a few times, I prefer to avoid in favour of a higher level approach whenever possible.
What have I used CURL for?
As some may know, I'm currently working on a PHP framework. The communication core extension (appropriately called "connect") makes use of CURL as its base.
I've used it widely, from extracting favicons from websites (together with parser utilities and stuff) to standard API calls over HTTP as well as the FTP layer when PHP's FTP is disabled (through stream wrappers) - and we all know native PHP FTP ain't that reliable.
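As an illustration of using curl beyond HTTP, the sketch below downloads any URL that the installed libcurl understands (http, https, ftp, file and so on) straight to disk. The function name download and the FTP host in the usage note are made up; CURLOPT_FILE and CURLOPT_USERPWD are real options of the PHP curl extension.

```php
<?php
// Download a URL over whatever protocol libcurl supports and write the
// body to $dest. download() is a hypothetical helper name.
function download(string $url, string $dest, ?string $userpwd = null): bool
{
    $out = fopen($dest, 'wb');
    if ($out === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $out);             // stream body into the file
    if ($userpwd !== null) {
        curl_setopt($ch, CURLOPT_USERPWD, $userpwd);  // "user:password"
    }
    $ok = curl_exec($ch);                             // true on success with CURLOPT_FILE
    fclose($out);
    return $ok === true;
}
```

A call such as download('ftp://ftp.example.com/pub/file.txt', '/tmp/file.txt', 'user:secret') would use the same code path for FTP; the host and credentials here are placeholders. Which protocols are available depends on how libcurl was built.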
Functional reasons as mentioned in the comments:
It's very old, [widely used and] well tested code, works reliably
is usually enabled by default
allows very fine grained control over the details of the request.
This might need expanding. By nature of the common-denominator protocol API cURL might provide features that plain HTTP libraries in PHP can't...
Historic reasons:
curl used to be the only thing that could handle cookies, POST, file uploads...
A lot of curl use probably comes from tutorials that pre-date PHP 5.

PHP http_get vs fsockopen to HTTPS server?

In PHP, what are the biggest considerations when choosing between using http_get("https://...") and a sockets loop with fsockopen("ssl://..."), fputs() and fread()?
I’ve seen a couple of implementations lately that use the latter. Is that just old legacy code or is there some good reason for it?
Thanks.
http_get requires a PECL extension, which is not bundled with PHP.
fsockopen is more complicated to use (it requires looping, sending the headers manually, reading the headers manually, and, in general, more code), but it is part of PHP itself (it's always present).
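To show what that extra work looks like, here is a rough sketch of a GET request over fsockopen. The helper names are made up, the ssl:// transport assumes the openssl extension is loaded, and the sketch deliberately speaks HTTP/1.0 so the server should not reply with chunked encoding, which it does not decode.

```php
<?php
// Write the request line and headers by hand.
function build_get_request(string $host, string $path): string
{
    return "GET $path HTTP/1.0\r\n"
         . "Host: $host\r\n"
         . "Connection: close\r\n\r\n";
}

// Split a raw response into [header block, body] at the first blank line.
function split_response(string $raw): array
{
    $parts = explode("\r\n\r\n", $raw, 2);
    return [$parts[0], $parts[1] ?? ''];
}

// Open a TLS socket, send the request, and read until the server closes.
function fetch_over_socket(string $host, string $path = '/'): array
{
    $fp = fsockopen("ssl://$host", 443, $errno, $errstr, 10);
    if ($fp === false) {
        throw new RuntimeException("$errstr ($errno)");
    }
    fwrite($fp, build_get_request($host, $path));
    $raw = '';
    while (!feof($fp)) {
        $raw .= fread($fp, 8192);
    }
    fclose($fp);
    return split_response($raw);
}
```

Redirects, compression, timeouts while reading, and status code handling are all left to the caller here, which is the kind of code the higher-level options take care of.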
In my opinion, the best fail-safe option is to use the http wrapper, as in:
file_get_contents('https://...')
The http wrapper, however, has its own set of limitations: no digest authentication, no automatic handling of encoded content, etc. So if either the PECL http extension or the curl extension is available, it would probably be a better option.
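When the http wrapper is the only option, a stream context lets you set headers and a timeout. This is a minimal sketch; http_context is a made-up helper name, while the option keys are those of PHP's http context. Fetching https:// URLs this way assumes the openssl extension and allow_url_fopen are enabled.

```php
<?php
// Build a stream context for the http/https wrapper.
// http_context() is a hypothetical helper, not a PHP built-in.
function http_context(array $headers = [], float $timeout = 10.0)
{
    return stream_context_create([
        'http' => [
            'method'        => 'GET',
            'header'        => implode("\r\n", $headers),
            'timeout'       => $timeout,   // read timeout in seconds
            'ignore_errors' => true,       // return the body on 4xx/5xx too
        ],
    ]);
}

$ctx = http_context(['Accept: application/json']);
// $body = file_get_contents('https://...', false, $ctx);
```

After a call, the response headers are available in the local variable $http_response_header, so the status line can be checked without any extension.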

PHP Difference between Curl and HttpRequest

I have a need to do RAW POST (PUT a $var) requests to a server, and accept the results from that page as a string. Also need to add custom HTTP header information (like x-example-info: 2342342)
I have two ways of doing it
Curl (http://us.php.net/manual/en/book.curl.php)
PHP HTTP using the HTTPRequest (http://us.php.net/manual/en/book.http.php)
What are the differences between the two? Which is leaner? Faster? Both seem pretty much the same to me...
Curl is bundled with PHP, HTTPRequest is a separate PECL extension.
As such, it's much more likely that CURL will be installed on your target platform, which is pretty much the deciding factor for most projects. I'd only consider using HTTPRequest if you plan to only ever install your software on servers you personally have the ability to install PECL extensions on; if your clients will be doing their own installations, installing PECL extensions is usually out of the question.
This page seems to suggest that HTTPRequest uses CURL under the hood anyway. Sounds like it might offer a slightly more elegant interface to curl_multi_*(), though.
HTTPRequest (and the PECL extension) is built on libcurl.
http://us.php.net/manual/en/http.requirements.php
The HTTPRequest is really just an easier/more syntactically friendly way to perform the same task.
As Frank Farmer mentioned, you're more likely to have a target platform with curl already installed and could have difficulty getting the PECL library installed by the hosting provider.
The HTTPRequest is "kind of" a wrapper for curl. These two quotes from the manual should give you a clue:
It provides powerful request functionality, if built with CURL support. Parallel requests are available for PHP 5 and greater.
The extension must be built with libcurl support to enable request functionality (--with-http-curl-requests). A library version equal or greater to v7.12.3 is required.
That said (and bearing in mind that I've never used this extension myself), it looks like you can go for this one if you want your code to look more object oriented. It might be a bit slower, though that is nothing compared with the external call you are going to make, so I wouldn't let performance drive my choice. I would give preference to the fact that curl is built in, whereas you have to add this other one yourself, which is inconvenient and reduces portability in case you want to host your app in a shared environment that you don't control.
For the needs that you explained in your question, I would definitely go for curl.
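For the request described in the question (a raw PUT body plus a custom header, with the response returned as a string), a curl version could look roughly like this. The function names and the example URL are made up; the CURLOPT_* constants are real options of the curl extension.

```php
<?php
// Options for sending $body as-is in a PUT request.
// raw_put_options() and raw_put() are hypothetical helper names.
function raw_put_options(string $body, array $headers): array
{
    return [
        CURLOPT_CUSTOMREQUEST  => 'PUT',
        CURLOPT_POSTFIELDS     => $body,      // a string is sent unmodified
        CURLOPT_HTTPHEADER     => $headers,   // e.g. ['x-example-info: 2342342']
        CURLOPT_RETURNTRANSFER => true,       // return the response as a string
    ];
}

function raw_put(string $url, string $body, array $headers = []): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, raw_put_options($body, $headers));
    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException(curl_error($ch));
    }
    return $response;
}

// Hypothetical usage:
// echo raw_put('https://api.example.com/items/1', $var, ['x-example-info: 2342342']);
```

With a string body, curl sends Content-Type: application/x-www-form-urlencoded unless you override it, so add an explicit Content-Type header to the list if the server expects something else.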
