I'm a PHP developer as well as a C++ developer. I was wondering: if I build a C++ binary and run it from PHP, will that make my process run faster?
For example:
I have to compare 1,000 array elements and execute a process for each of them, and in some cases I have to run it over and over again (recursively). Yes, it's messy, but it works!
Yes, this might be faster. It's also very hard to do right (lots of corner cases in IPC).
Don't try this unless it's absolutely necessary for performance. First try to improve the algorithm in PHP.
Don't use the C++ code in production until you've measured the difference and confirmed that the C++ solution is significantly faster.
Don't run a binary; write a library and link it into the PHP interpreter. PHP is implemented in C, so export your C++ functions to C using extern "C".
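As a minimal sketch of the extern "C" advice: the C++ logic stays in ordinary C++, and a thin wrapper exposes it with C linkage and C-only types. The names `count_matches_c` and `detail::count_matches` are made up for illustration, and the PHP-side glue that would call the wrapper is not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Ordinary C++ code: free to use the STL, templates, and so on.
namespace detail {
// Counts how many elements of `values` are equal to `needle`.
std::size_t count_matches(const std::vector<int>& values, int needle) {
    return static_cast<std::size_t>(
        std::count(values.begin(), values.end(), needle));
}
}  // namespace detail

// C-compatible entry point. extern "C" turns off C++ name mangling, so the
// symbol is exported as plain `count_matches_c` and C code can link to it.
// Only C types appear in the signature, and no exception may escape.
extern "C" std::size_t count_matches_c(const int* data, std::size_t len,
                                       int needle) {
    if (data == nullptr) return 0;
    try {
        std::vector<int> values(data, data + len);
        return detail::count_matches(values, needle);
    } catch (...) {
        return 0;  // hypothetical error convention chosen for this sketch
    }
}
```

Keeping the wrapper this thin means the C++ side can change freely as long as the C signature stays stable.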
I've never done that in PHP, but in Python I can tell you that it's a hell of a good way to squeeze out performance. But don't overdo it: just implement in C what you know is a bottleneck, otherwise you will just create a monster.
Be sure to profile your code first and make sure you've actually identified the bottleneck. If it's working now, it should be easy to use Xdebug with your code so that you can measure its performance and profile your function calls. Maybe your function call isn't the bottleneck, in which case all your work would be wasted.
After that, see if there are any architectural issues before you switch languages. If there is a scalability problem, switching over to a faster language will just delay the issue.
Related
I have a PHP script I'd like to compile into a standalone command-line executable for running on Linux.
Is this realistic? Are there compilers for this?
I know that there are PHP compilers around, but my question is oriented more to whether it is of advantage, and which is the best compiler to use.
Will it be faster or slower than running it through PHP? If it would speed up my script then that would be great, since it does a lot of processing (lots of loops and math) and takes an hour to run.
My understanding is that HipHop compiles to C++, not C, and that the process is quite easy. Facebook has applied it to vast amounts of code.
I'm not sure you get a .exe, but OP only wants "faster". He might get quite a good speedup; HipHop-compiled code averages only a factor of 2 or so faster, but that's because much of PHP execution is really library calls. He might get a pretty good speedup on his computation part if the HipHop compiler can figure out what the data types are in the computation code. He might have to modify his code somewhat to make this clear to the HipHop compiler, by not using his computation variables for anything but computation, but that should be only a minor source code change. I'd expect the HipHop site to contain hints about what to do to speed up HipHop-compiled code along these lines.
All of this is educated guess based on what I understand and not actual experience. YMMV.
Well, compiling itself will add very little if your script runs for hours.
The best solution would be to use another language; C would seem to be the best choice.
You can try to translate your PHP to C++ using the HipHop compiler, but it is not a task I'd call easy.
I am writing an application for image processing. I am wondering which programming language would be the best fit: Python or PHP. This process is system-based, not web-based, so I am thinking Python could be of more help.
Let me know your thoughts!
Python has stuff like SciPy and PIL so probably Python.
Here is how people use it:
Peak detection in a 2D array
How can I improve my paw detection?
One cannot suggest much without knowing the kind of image processing you have in mind.
If you just want to do some generic rotate/resize/etc then I guess there isn't much difference.
If you want to do something more complex, then study the libraries and decide which fits best for your particular task.
If you want to do something really custom, then C or similar language might be a better fit for the task.
Possibly neither; it depends on what you want to do.
Both PHP and Python are scripting languages, and are therefore not suited for high-performance numerical routines (which most image processing requires). However, they both have a number of image-processing libraries available for them, the innards of which are probably written in C. These will be fast.
If these libraries do what you want, then fine. If you need to do something custom, then you're probably better off with C or C++ (or Pascal, or whatever) if speed of execution is of concern.
Python is more clean and readable.
For image processing, the ImageMagick library is available for both Python and PHP.
If you want to do some other complex image processing which cannot be done using a library, and you still want to do it in Python or PHP, then Python is definitely the answer, as Python can be extended with C. But wait, you didn't mention programming in C. Well, there is Cython! It allows you to write Python modules which are afterwards compiled to C.
It really depends on what you want to do with the images. You probably should just use a batch or similar script to run a command that does the processing you're looking for.
Between the two languages, I would go with Python. The command-line interface for PHP is only a recent addition, while Python was designed primarily as a scripting language, not for serving pages. For a console application, Python is a better fit.
I know you can minify PHP, but I'm wondering if there is any point. PHP is an interpreted language so will run a little slower than a compiled language. My question is: would clients see a visible speed improvement in page loads and such if I were to minify my PHP?
Also, is there a way to compile PHP or something similar?
PHP is compiled into bytecode, which is then interpreted on top of something resembling a VM. Many other scripting languages follow the same general process, including Perl and Ruby. It's not really a traditional interpreted language like, say, BASIC.
There would be no effective speed increase if you attempted to "minify" the source. You would get a major increase by using a bytecode cache like APC.
Facebook introduced a compiler named HipHop that transforms PHP source into C++ code. Rasmus Lerdorf, one of the big PHP guys, did a presentation for Digg earlier this year that covers the performance improvements given by HipHop. In short, it's not much faster than optimizing code and using a bytecode cache. HipHop is overkill for the majority of users.
Facebook also recently unveiled HHVM, a new virtual machine based on their work making HipHop. It's still rather new and it's not clear if it will provide a major performance boost to the general public.
Just to make sure it's stated expressly, please read that presentation in full. It points out numerous ways to benchmark and profile code and identify bottlenecks using tools like xdebug and xhprof, also from Facebook.
2021 Update
HHVM diverged away from vanilla PHP a couple versions ago. PHP 7 and 8 bring a whole bunch of amazing performance improvements that have pretty much closed the gap. You now no longer need to do weird things to get better performance out of PHP!
Minifying PHP source code continues to be useless for performance reasons.
Forgo the idea of minifying PHP in favor of using an opcode cache, like PHP Accelerator, or APC.
Or something else like memcached
Yes there is one (non-technical) point.
Your hosting provider can spy on your code on their server. If you minify and uglify it, it is more difficult for spies to steal your ideas.
So one reason for minifying and uglifying PHP may be protection against spying. I think uglifying code should be one step in an automated deployment.
With some rewriting (shorter variable names) you could save a few bytes of memory, but that's also seldom significant.
However, I do design some of my applications in a way that allows include scripts to be concatenated together. With php -w the result can be compacted significantly, adding a little speed gain at script startup. On a server with an opcode cache, however, this only saves a few file mtime checks.
This is less an answer than an advertisement. I've been working on a PHP extension that translates Zend opcodes to run on a VM with static typing. It doesn't accelerate arbitrary PHP code, but it does allow you to write code that runs far faster than regular PHP allows. The key here is static typing. On a modern CPU, a dynamic language pays branch misprediction penalties left and right. The fact that PHP arrays are hash tables also imposes a high cost: many branch mispredictions, inefficient use of cache, poor memory prefetching, and no SIMD optimization whatsoever. Branch mispredictions and cache misses in particular are the Achilles' heel of today's processors. My little VM sidesteps those problems by using static types and C arrays instead of hash tables. The result ends up running roughly ten times faster. This is using bytecode interpretation. The extension can optionally compile a function through gcc. In that case, you get two to five times more speed.
Here's the link for anyone interested:
https://github.com/chung-leong/qb/wiki
Again, the extension is not a general PHP accelerator. You have to write code specific for it.
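To make the hash-table-versus-array point concrete, here is a rough C++ sketch, not code from the extension. `std::unordered_map` stands in for a PHP-style hash table (an assumption; PHP's own array implementation differs in detail) and `std::vector<double>` stands in for a statically typed C array. Both functions compute the same sum; the second one walks contiguous memory with no hashing or per-element lookup.

```cpp
#include <unordered_map>
#include <vector>

// Dynamic-style storage: every access hashes the key and follows a pointer
// to wherever that element happens to live in memory.
double sum_hashed(const std::unordered_map<int, double>& table, int n) {
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        auto it = table.find(i);
        if (it != table.end()) total += it->second;
    }
    return total;
}

// Static-style storage: the elements are contiguous and all of one type,
// so the loop is a plain linear scan that is friendly to caches and
// prefetching.
double sum_contiguous(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) total += v;
    return total;
}
```

How large the difference is in practice depends on data size, compiler and hardware, so it is worth timing both on the real workload.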
There are PHP compilers... see this previous question for a list; but (unless you're the size of Facebook or are targeting your application to run client-side) they're generally a lot more trouble than they're worth.
Simple opcode caching will give you more benefit for the effort involved. Or profile your code to identify the bottlenecks, and then optimise it.
You don't need to minify PHP.
To get better performance, install an opcode cache. The ideal solution is to upgrade to PHP 5.5 or above, because newer versions bundle an opcode cache, Zend OPcache (formerly Zend Optimizer+), that performs better than the alternatives: http://massivescale.blogspot.com/2013/06/php-55-zend-optimiser-opcache-vs-xcache.html
The "point" is to make the file smaller, because smaller files load faster than bigger files. Also, removing whitespace will make parsing a tiny bit faster since those characters don't need to be parsed out.
Will it be noticeable? Almost never, unless the file is huge and there's a big difference in size.
If I write a hello world app using a PHP web framework such as CodeIgniter and then compile and run it using HipHop, will it run faster than if I write the same hello world app in Django or Rails?
HipHop converts PHP code into C++ code, which needs to be compiled to run. Since pre-compiled code runs faster and uses less memory than scripting languages like Python/PHP, it will probably run faster in the example you have given.
However, HipHop does not convert all code. A lot of code in PHP is dynamic and cannot be translated to C++, which means you will have to write your code with this in mind. Whether CodeIgniter can even be compiled using HipHop is another question.
Terry Chay wrote a big article about HipHop, covering when to use it, its limitations and its future. I would recommend reading it, as it will most likely answer most of your questions and give you some insight into how it works :)
http://terrychay.com/article/hiphop-for-faster-php.shtml
At that point the run time is inconsequential. HipHop was designed for scaling... meaning billions of requests. There's absolutely no need to use something like HipHop for even a medium size website.
But more to the point of your question... I don't think there have been comparison charts available for us to see, but I doubt the run time would be faster at that level.
i don't know about django or rails, so this is a bit off-topic.
with plain php, the request goes to apache, then to mod_php. mod_php loads the helloworld.php script from disk, parses & tokenizes it, compiles it to bytecode, then interprets the bytecode, passes the output back to apache, apache serves it to the user.
with php and an optimizer the first run is about the same as with plain php, but the compiled source code is stored in ram. then, for the second request: goes to apache, apache to mod_php, apc loads bytecode from ram, interprets it, passes it back to apache, back to the user.
with hiphop there is no apache, but hiphop itself and there's no interpreter, so request goes directly to hiphop and back to the user. so yes, it's faster, because of several reasons:
faster startup because there's no bytecode compilation needed - the program is already in machine-readable code. so no per-request compilation and no source file reading.
no interpreter. machine code is not necessarily faster - that depends on the quality of source translation (hiphop) and the quality of the static compiler (g++). hiphop translated code is not fast compared to hand-written c code, because there's a bit of overhead because of type handling and such.
with node.js, there's also no apache. the script is started and directly compiled to machine code (because the V8 compiler does that), so it's kind of AOT (ahead of time) compiling (or is it still called JIT? i don't really know). every request is then directly handled by the already compiled machine code; so node.js is actually very comparable to hiphop. i assume hiphop to be multithreaded or something like this, while node does evented IO.
facebook claims a 50% speed gain, which is not really that much; if you compare the results of the language shootout, you'll see for the execution speed of assorted algorithms, php is 5 to 250 times slower.
so why only 50%? because ...
web apps depend on much more than just execution speed, e.g. IO
php's type system prevents hiphop from making the best use of c++'s static types
in practice, a lot of php is already C, because most of the functionality is either built in or comes from extensions. extensions are programmed in C and statically compiled.
i'm not sure if there was a huge performance gain for hello world, because hello world, even with a good framework, is still so small execution speed could be negligible in comparison to all the other overhead (network latency and stuff).
imo: if you want speed and ease of use, go for node.js :)
Running a simple application is always fast in any language. When it becomes as complex as Facebook, you will face numerous problems, and PHP's slowness will show its face. At the same time, converting existing code to another language is not an option, since the logic and code are not easy to translate into another language's syntax. That's why Facebook's developers decided to keep the old code and make PHP faster, and that's the reason they created their own PHP compiler, called HipHop.
Read this story, written from the perspective of one of Facebook's developers, to learn the history of HipHop.
That is not really an apples-to-apples comparison. On the most level playing field you might have something like:
Django running behind apache
Django rendering an HTML template to say hello world (no caching)
AND
HPHP running behind apache
HPHP rendering an HTML template to say hello world (again, no caching)
There is no database, almost no file I/O, and no caching. If you hit the page 10,000 times with a load generator at varying concurrency levels you will probably find that HPHP will outperform Django or Rails; that is to say, it can render and serve more pages per second and keep up with your traffic a bit better.
The question is, will you ever have this many concurrent users? If you will, will they likely be hitting a database or a cached page?
HPHP sounds cool, but IMHO there is no reason to jump ship just yet (unless you are getting lots of traffic, in which case it might make sense to check it out).
Will it run faster than if I write the same hello world app in django or rails?
It probably will, but don't fret. If we're talking prospective speed improvements from yet unreleased projects, Pythonistas have pypy-jit and unladen-swallow to look forward to ;)
Most of my application is written in PHP (front and back ends).
There is a part that works too slowly and I will need to rewrite it, probably not in PHP.
What will give me the following:
1. Most speed
2. Fastest development
3. Easily maintained.
I have in mind to rewrite this piece of code in C++ as a PHP extension, but maybe I am locked onto this solution and missing some simpler/better solutions?
The algorithm is PorterStemmerAlgorithm on several MB of data each time it is run.
The answer really depends on what kind of process it is.
If it is a long-running process (at least seconds) then perhaps an external program written in C++ would be super easy. It would not have the complexities of a PHP extension, and its stability would not affect PHP/Apache. You could communicate over pipes, shared memory, or the like...
If it is a short running process (measured in ms) then you will most likely need to write a PHP extension. That would allow it to be invoked VERY fast with almost no per-call overhead.
Another possibility is a custom server which listens on a Unix Domain Socket and will quickly respond to PHP when PHP asks for information. Then your per-call overhead is basically creating a socket (not bad). The server could be in any language (c, c++, python, erlang, etc...), and the client could be a 50 line PHP class that uses the socket_*() functions.
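A rough POSIX sketch of the server side of that idea, under stated assumptions: the function names (`listen_unix`, `serve_one`, `serve_forever`) are made up, the "work" is a stand-in that just upper-cases the request, each request is assumed to fit in one read, and error handling is minimal. The PHP client is not shown.

```cpp
#include <cctype>
#include <cstring>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

// Creates a listening Unix domain socket at `path`.
// Returns the listening descriptor, or -1 on failure.
int listen_unix(const char* path) {
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_un addr{};  // zero-initialised, so sun_path stays terminated
    addr.sun_family = AF_UNIX;
    std::strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    unlink(path);  // remove a stale socket file from a previous run
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Handles one request on an already connected socket: reads up to 256
// bytes and replies with the same bytes upper-cased (placeholder work).
bool serve_one(int conn) {
    char buf[256];
    ssize_t n = read(conn, buf, sizeof(buf));
    if (n <= 0) return false;
    for (ssize_t i = 0; i < n; ++i) {
        buf[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(buf[i])));
    }
    return write(conn, buf, static_cast<size_t>(n)) == n;
}

// Accept loop: one connection at a time, until accept() fails.
void serve_forever(int listener) {
    for (;;) {
        int conn = accept(listener, nullptr, nullptr);
        if (conn < 0) break;
        serve_one(conn);
        close(conn);
    }
}
```

A real server would replace the upper-casing with the actual computation and would need a framing rule (for example, a length prefix) for requests larger than one read.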
A lot of information needs to be evaluated before making this decision. PHP does not typically show slowdowns until you get into really tight loops or thousands of repeated function calls. In other words, the overhead of the HTTP request and network delays usually make PHP delays insignificant (unless the above applies).
Perhaps there is a better way to write it in PHP?
Are you database bound?
Is it CPU bound, Network bound, or IO bound?
Can the result be cached?
Does a library already exist which will do the heavy lifting?
By committing to a custom PHP extension, you add significantly to the base of knowledge required to maintain it (even above C++). But it is a great option when necessary.
Feel free to update your question with more details, and I'm sure Stack Overflow will be happy to help out.
Suggestion
The PorterStemmerAlgorithm has a C implementation available at http://tartarus.org/~martin/PorterStemmer/c.txt
It should be an easy matter to tie this C program into your data sources and make it a stand-alone executable. Then you could simply invoke it from PHP with one of the proc functions, such as proc_open().
Unless you need to invoke this program many times PER php request, then this approach should save you the effort of building and integrating a PHP extension, not to mention that the hard work (in c) is already done.
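The stand-alone executable could be shaped as a simple filter that reads words on standard input and writes results on standard output, which is what a pipe opened from PHP would talk to. The sketch below shows only that shape: `placeholder_stem` is NOT the Porter algorithm (it merely strips one trailing "s" so the example is self-contained), and all function names are made up for illustration.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Placeholder only: this is NOT the Porter algorithm. It strips a single
// trailing 's' from words longer than three characters. A real build
// would call the existing C implementation here instead.
std::string placeholder_stem(const std::string& word) {
    if (word.size() > 3 && word.back() == 's') {
        return word.substr(0, word.size() - 1);
    }
    return word;
}

// Reads whitespace-separated words from `in` and writes one processed
// word per line to `out`.
void stem_stream(std::istream& in, std::ostream& out) {
    std::string word;
    while (in >> word) {
        out << placeholder_stem(word) << '\n';
    }
}

// Convenience wrapper for in-memory text, handy for quick checks.
std::string stem_text(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    stem_stream(in, out);
    return out.str();
}
```

In the executable itself, `main` would simply call `stem_stream(std::cin, std::cout)`, so the program can sit at the other end of a pipe.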
I'm not sure what the PorterStemmerAlgorithm is. However, if you could make your process run in parallel and collect the results together, you could look at parallel processes, which are easily implemented in Java. I'm not sure how you would call it from PHP, but it would definitely be maintainable.
You can have a look at this framework. Looks simple to implement
https://computefarm.dev.java.net/
Regards,
Franklin.
If you absolutely need to rewrite in a different language for speed reasons then I think gahooa's answer covers the options nicely. However, before you do, are you absolutely sure you've done everything you can to improve the performance of the PHP implementation?
Is caching the output viable in your situation? Could you get away with running the algorithm once and caching the output rather than on every page load?
Have you tried profiling the code to ensure there's no unnecessary work being done (db queries in an inner loop and the like)? Xdebug can help here.
Are there other stemming algorithms available which might perform better on your dataset?
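The caching question above can be illustrated with a small memoizing wrapper, written here in C++ as a sketch only. `MemoizedFn` is a made-up name, the cache is an unbounded in-memory map, and it assumes the wrapped function is deterministic for a given input. The same idea applies whichever language the expensive step ends up in.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Wraps an expensive string -> string function and remembers its results,
// so repeated inputs are answered from the cache instead of recomputed.
class MemoizedFn {
public:
    explicit MemoizedFn(std::function<std::string(const std::string&)> fn)
        : fn_(std::move(fn)) {}

    std::string operator()(const std::string& input) {
        auto it = cache_.find(input);
        if (it == cache_.end()) {
            ++misses_;  // first time this input is seen: do the real work
            it = cache_.emplace(input, fn_(input)).first;
        }
        return it->second;
    }

    // Number of times the wrapped function actually had to run.
    std::size_t misses() const { return misses_; }

private:
    std::function<std::string(const std::string&)> fn_;
    std::unordered_map<std::string, std::string> cache_;
    std::size_t misses_ = 0;
};
```

If the same words recur often in the input, the miss count stays far below the number of calls, which is a quick way to judge whether caching is worth it.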