I am building a small web interface to a database that will run on a Pogoplug Pro (128MB RAM). The app is unlikely to ever have more than four or five simultaneous users, and will run with SQLite as the database backend. Is it feasible to use a Lighttpd and PHP combination (with FastCGI) on this system? For other reasons, enabling swap is not an option. Or should I try to use a more lightweight language such as Python?
PHP is indeed a memory hog as it allocates memory for all the different types of C variable (int, float, string, boolean etc) for every single variable you declare (Source). I'm not sure about the memory footprints of other languages. But I would suggest looking into HipHop for PHP.
HipHop is an open source project released by Facebook a couple of years ago that compiles PHP code into highly optimised C++, which is then built into a native binary. Once you hit compile, you get a full web stack with your PHP application bundled into it that runs fast and uses less memory. You can find HipHop on GitHub. I'm not sure how mature it is, but it's certainly a possibility for your situation :)
Just so you know, I don't work for Facebook or HipHop, I just think it's a really clever system :)
What is the difference in how they are handled?
Specifically, why is it common to find Python used in production-level long lived applications like web-servers while PHP isn't given their similar efficiency levels?
PHP was designed as a hypertext scripting language. Every process was designed to end after a very short time. So memory management and GC basically didn't matter.
However, the ease and popularity of PHP have led to its use in long-lived programs such as daemons, extensive calculations, socket servers etc.
PHP 5.3 introduced a lot of features and fixes that made it suitable for such applications, however in my opinion memory management was of lower significance on that matter.
PHP's error management is quite good now, but as in every programming language that I know of, you can produce memory leaks.
You still cannot code in the same style that you can code Java or Python applications. A lot of PHP programs will probably show severe problems where Java/Python do not.
You can characterize this as "worse", but I would not. PHP is just a different set of tools that you have to handle differently.
The company I work at has a lot of system programs and daemons written in PHP that run like a charm.
I think the biggest caveat for PHP when it comes to, as you describe, "production-level long lived applications" is its multi-processing and threading ability (the latter is basically nonexistent).
Of course there is the possibility to fork processes, access shared memory, do inter process communications and have message queues and stuff. But Python is far ahead on that matter, because it was designed bottom up for jobs like that.
There are many scenarios where I've questioned PHP's performance with some of its functions, and whether I should build a complex class to handle specific things using its seemingly slow tools.
For example, complex regular expressions with sed and processing with awk seem to perform far better than pushing the same work through PHP's regular expression functions and its seemingly excessive function calls, then waiting for them to finish. If I were to do a lot of network tasks such as MX lookups/DIGging/retrieving simultaneously, I would rather pass them via system() and let the OS handle them itself. There are simply too many functions in PHP that are inefficient and result in slow pages, or that can be handled more easily by the OS.
What are your opinions?
Do you think I should do the hard work with the OS in its own/custom functions?
System calls can very often be faster than using a solution built in PHP, although that doesn't always hold true: PHP's functions are themselves built and compiled in C, and many PHP core functions and extensions are pretty fast.
Apart from speed, a second factor is the memory limit. Externally called processes don't eat away at PHP's per-script limitation, which can be great when working with large files for example.
Also, some functions simply are not available in PHP itself. There is no way to imitate the feature set of ImageMagick entirely within PHP, for example. The GD library doesn't come close to what ImageMagick has to offer.
The big, big minus is that by using system commands, you effectively eliminate portability, which is part of PHP's beauty. Moving the application to a different server becomes a huge burden because the feature set of the external commands needs to be identical, and that isn't always the case even across different Linux distros, not to speak of crossing the OS border into Windows or Unix-based Mac OS. I have experienced issues with wget and ImageMagick in this respect myself, and I'm sure there are many more.
If you are working on a custom application for which you entirely control the server environment (and the decision what kind of servers will be bought in the next five years), that may not be a problem. It will be one, though, if you build software that needs to be portable.
I personally tend to rather cut away a feature (that would need an external dependency) than lose portability, but then, I am in the trade of building portable applications very much. It really depends on your focus.
Even if system processes are faster and hog less memory (extensive testing is a must here), there's something to keep in mind:
I'd be cautious with using system() calls and only use it if you control the hardware your script will run on. Using those calls may result in the need to install further software / packages and may not work (the same way) on all OS, so if you can't control the server, I'd stick with PHP-functions.
I think that would be slower actually, because each time you call such a function the OS would start a new process, and that is time consuming.
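To make the process-startup cost concrete, here is a minimal sketch in Python rather than PHP, so treat it as an illustration of the idea and not a measurement of PHP itself. It runs the same substitution once inside the running interpreter and once by spawning a fresh process per call. The function names are made up for this example, and a second Python interpreter stands in for sed/awk so the sketch doesn't depend on which tools are installed.

```python
import re
import subprocess
import sys
import time


def in_process(text):
    # Do the substitution inside the already-running interpreter.
    return re.sub(r"\d+", "N", text)


def via_subprocess(text):
    # Stand-in for shelling out to sed/awk: a new process is started per call.
    script = "import re, sys; sys.stdout.write(re.sub(r'\\d+', 'N', sys.stdin.read()))"
    result = subprocess.run(
        [sys.executable, "-c", script],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def time_calls(func, text, repeat=5):
    # Average wall-clock seconds per call over `repeat` calls.
    start = time.perf_counter()
    for _ in range(repeat):
        func(text)
    return (time.perf_counter() - start) / repeat


if __name__ == "__main__":
    sample = "order 66 shipped 2 items\n" * 100
    print("in-process :", time_calls(in_process, sample))
    print("subprocess :", time_calls(via_subprocess, sample))
```

On small inputs I'd expect the per-call startup to dominate the subprocess timing, but the balance can shift for very large inputs, so measure with your own data before deciding.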
I'd say if your program is intended to be executed on the shell, using other external programs like sed/awk is fine, as shell scripts also make extensive use of external programs, and a PHP script run on the shell is just like a shell script, only in another language.
However, if it's a web application, better do it in PHP: most shared hosting environments don't allow you to execute external programs from PHP scripts.
(My experience is that "system calls" usually refers to calling kernel operations - not invoking other programs - "pass it via system() and let the OS handle it" - you seem to think the same - but none of the programs you mention are OS services - they are just other programs).
PHP is essentially a scripting language - which conventionally are just a glue for moving data between other programs, but some things to consider:
1) performance - forking a new process can be computationally expensive
2) security - giving your webserver unlimited access to all the programs on the system (even constrained by permissions) is potentially very dangerous
3) bearing in mind (2) most configurations will prevent or limit what you can do
4) for large scale development this is rather dangerous - letting programmers write their code in any language of their choice then putting a skim of PHP over the top means you will end up with an application written in lots of different languages
5) You can write your own native code PHP extensions quite easily
If I were to do a lot of network tasks such as MX lookups/DIGging/retrieving
While I could believe that mashing up large data sets using awk/sed might be faster/more efficient than native PHP code, I find it a bit surprising that DNS lookups are faster using a different client. How did you measure this?
The news in the PHP world today is Facebook's HipHop, which:
HipHop for PHP isn't technically a compiler itself. Rather it is a source code transformer. HipHop programmatically transforms your PHP source code into highly optimized C++ and then uses g++ to compile it. HipHop executes the source code in a semantically equivalent manner and sacrifices some rarely used features — such as eval() — in exchange for improved performance. HipHop includes a code transformer, a reimplementation of PHP's runtime system, and a rewrite of many common PHP Extensions to take advantage of these performance optimizations.
My question is, what type of web applications is this actually useful for?
Seems like typical database-bound web apps may not be greatly served by this, but rarer CPU-bound apps would.
Web applications that do a lot of processing and/or use a lot of memory. Apparently HipHop will reduce CPU usage by around 50% and also reduce memory usage (I didn't see it mentioned anywhere by how much). This means that you should be able to serve the same number of requests with fewer servers.
An added benefit may be that there will be some basic type checking to ensure that the code is consistent before it is compiled. This should help to locate the type of bugs that PHP currently tends to ignore as a result of its weak type system.
The downside appears to be that it might not support some of PHP's more dynamic features such as eval (though arguably that's a positive too).
Well, it "transforms" PHP into C++ to help the performance of a large-scale website.
So, HipHop is for when you have a website that you started at Harvard that you quickly grow into a billion dollar company and that people are making a movie about starring Justin Timberlake. When you have such a website and want to save CPU cycles, but don't want to rewrite your codebase, you use HipHop.
If you are just starting out, unless you are trapped on a desert island with only PHP programmers that refuse to learn a more scalable language, you don't use HipHop.
Running machine code is faster than running interpreted code. That is useful in itself, and it also reduces the number of machines you require, as each processor has less work to do.
This is good for a company like Facebook, in that they can cut the number of machines they need.
In terms of why it's useful for them, they probably run a lot of sorting and indexing, on the large amounts of data they have.
This article:
http://terrychay.com/article/hiphop-for-faster-php.shtml
answers this question perfectly with its series of "if" statements.
You can think of it as some sort of compiler that takes in a bunch of .php files and generates a bunch of C++ files, which you can then compile using g++ (not sure if other compilers are supported). The resulting executable is your web application with a web server included. That means you can run the executable and you are good to go. The web server is based on libevent and supposedly pretty efficient.
Hip Hop is essentially pointless to everyone except Facebook and other gigantic PHP-based sites. I'm sure many people will jump on the bandwagon due to "it's fast" but how many PHP based apps use whole server farms?
Just because you are working on a social network site, doesn't mean you should consider using HH.
If I write a hello world app using a PHP web framework such as CodeIgniter and then compile it and run it using HipHop, will it run faster than if I write the same hello world app in Django or Rails?
HipHop converts PHP code into C++ code, which needs to be compiled to run. Since pre-compiled code runs faster and uses less memory than scripting languages like Python/PHP, it will probably run faster in the example you have given.
However, HipHop does not convert all code. A lot of code in PHP is dynamic and cannot be translated to C++, which means you will have to write your code with this in mind. Whether CodeIgniter can even be compiled using HipHop is another question.
Terry Chay wrote a big article about HipHop, covering when to use it, its limitations and its future. I would recommend reading it, as it will most likely answer most of your questions and give you some insight into how it works :)
http://terrychay.com/article/hiphop-for-faster-php.shtml
At that point the run time is inconsequential. HipHop was designed for scaling... meaning billions of requests. There's absolutely no need to use something like HipHop for even a medium size website.
But more to the point of your question... I don't think there have been comparison charts available for us to see, but I doubt the run time would be faster at that level.
i don't know about django or rails, so this is a bit off-topic.
with plain php, the request goes to apache, then to mod_php. mod_php loads the helloworld.php script from disk, parses & tokenizes it, compiles it to bytecode, then interprets the bytecode, passes the output back to apache, apache serves it to the user.
with php and an optimizer the first run is about the same as with plain php, but the compiled source code is stored in ram. then, for the second request: goes to apache, apache to mod_php, apc loads bytecode from ram, interprets it, passes it back to apache, back to the user.
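The difference between the two paths above (recompile on every request vs. reuse bytecode kept in RAM) can be sketched in a few lines. This is a toy in Python, not how mod_php or APC are actually implemented. The names `load_compiled` and `handle_request` are invented for the example, and the "script" is assumed to leave its result in a variable called `output`.

```python
import os

# path -> (mtime in ns, compiled code object); plays the role of the opcode cache
_code_cache = {}


def load_compiled(path):
    """Return a code object for `path`, recompiling only when the file changes."""
    mtime = os.stat(path).st_mtime_ns
    cached = _code_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # cache hit: no file read, no parse, no compile
    with open(path, "r", encoding="utf-8") as handle:
        source = handle.read()
    code = compile(source, path, "exec")  # parse + compile to bytecode
    _code_cache[path] = (mtime, code)
    return code


def handle_request(path):
    """Interpret the (possibly cached) bytecode in a fresh namespace."""
    namespace = {}
    exec(load_compiled(path), namespace)
    return namespace.get("output")
```

The first call for a given file pays for reading and compiling; later calls only pay for interpretation, which is the saving the optimizer case describes. The interpreter step remains in both cases.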
with hiphop there is no apache, but hiphop itself and there's no interpreter, so request goes directly to hiphop and back to the user. so yes, it's faster, because of several reasons:
faster startup because there's no bytecode compilation needed - the program is already in machine-readable code. so no per-request compilation and no source file reading.
no interpreter. machine code is not necessarily faster - that depends on the quality of source translation (hiphop) and the quality of the static compiler (g++). hiphop translated code is not fast compared to hand-written c code, because there's a bit of overhead because of type handling and such.
with node.js, there's also no apache. the script is started and directly compiled to machine code (because the V8 compiler does that), so it's kind of AOT (ahead of time) compiling (or is it still called JIT? i don't really know). every request is then directly handled by the already compiled machine code; so node.js is actually very comparable to hiphop. i assume hiphop to be multithreaded or something like this, while node does evented IO.
facebook claims a 50% speed gain, which is not really that much; if you compare the results of the language shootout, you'll see for the execution speed of assorted algorithms, php is 5 to 250 times slower.
so why only 50%? because ...
web apps depend on much more than just execution speed, e.g. IO
php's type system prevents hiphop from making the best use of c++'s static types
in practice, a lot of php is already C, because most of the functionality is either built in or comes from extensions. extensions are programmed in C and statically compiled.
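A rough way to see why the overall gain stays modest is Amdahl's law: only the share of a request that is actually spent executing php code gets faster. The sketch below is just that arithmetic, and the 40% share and 5x factor in the usage note are made-up numbers for illustration, not figures from facebook.

```python
def overall_speedup(cpu_fraction, cpu_speedup):
    """Amdahl's law: only `cpu_fraction` of the request becomes `cpu_speedup` times faster."""
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)
```

For example, if 40% of a request were php execution and the compiled code ran 5 times faster, overall_speedup(0.4, 5) gives 1 / (0.6 + 0.08), roughly 1.47. The remaining 60% (IO, extensions already written in C) is untouched however good the compiler is.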
i'm not sure if there was a huge performance gain for hello world, because hello world, even with a good framework, is still so small execution speed could be negligible in comparison to all the other overhead (network latency and stuff).
imo: if you want speed and ease of use, go for node.js :)
A simple application runs fast in any language. Once it becomes as complex as Facebook, you will face numerous problems, and PHP's slowness will show its face. At the same time, converting existing code to another language is not an option, since the logic and code are not easy to translate into another language's syntax. That's why Facebook's developers decided to keep the old code and make PHP faster, and why they created their own PHP compiler, called HipHop.
Read this story, written from the perspective of one of Facebook's developers, to learn the history of HipHop.
That is not really an apples-to-apples comparison. On the most level playing field you might have something like:
Django running behind apache
Django rendering an HTML template to say hello world (no caching)
AND
HPHP running behind apache
HPHP rendering an HTML template to say hello world (again, no caching)
There is no database, almost no file I/O, and no caching. If you hit the page 10,000 times with a load generator at varying concurrency levels you will probably find that HPHP will outperform Django or Rails. That is to say, it can render more pages per second and keep up with your traffic a bit better.
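If you want to try the "hit it N times at varying concurrency" test without a dedicated tool, here is a minimal sketch of a load generator. It is not a replacement for a real benchmarking tool. `run_load` is a name I made up, and `fetch` is any callable you supply that performs one request (for instance a small wrapper around urllib.request.urlopen pointed at your page).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(fetch, total_requests, concurrency):
    """Call fetch() `total_requests` times using `concurrency` worker threads.

    Returns (completed_requests, requests_per_second); a request counts as
    completed when fetch() returns without raising.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(fetch) for _ in range(total_requests)]
        # future.exception() waits for the call to finish before answering.
        completed = sum(1 for future in futures if future.exception() is None)
    elapsed = time.perf_counter() - start
    rate = completed / elapsed if elapsed > 0 else float("inf")
    return completed, rate
```

Running it a few times with different `concurrency` values against each stack gives a rough pages-per-second figure to compare. Keep in mind that the client itself can become the bottleneck at high concurrency.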
The question is, will you ever have this many concurrent users? If you will, will they likely be hitting a database or a cached page?
HPHP sounds cool, but IMHO there is no reason to jump ship just yet (unless you are getting lots of traffic, in which case it might make sense to check it out).
Will it run faster than if I write the same hello world app in django or rails?
It probably will, but don't fret. If we're talking prospective speed improvements from yet unreleased projects, Pythonistas have pypy-jit and unladen-swallow to look forward to ;)
I want to benchmark PHP vs Pylons. I want my comparison of both to be as even as possible, so here is what I came up with:
PHP 5.1.6 with APC, using a smarty template connecting to a MySQL database
Python 2.6.1, using Pylons with a mako template connecting to the same MySQL database
Is there anything that I should change in that setup to make it a more fair comparison?
I'm going to run it on a spare server that has almost no activity, 2 GB of RAM and 4 cores.
Any suggestions of how I should or shouldn't benchmark them? I plan on using ab to do the actual benchmarking.
Related
Which is faster, python webpages or php webpages?
If you're not using an ORM in PHP you should not use the SQLAlchemy ORM or SQL-Expression language either but use raw SQL commands. If you're using APC you should make sure that Python has write privileges to the folder your application is in, or that the .py files are precompiled.
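For the precompiling part, a minimal sketch using the standard library's compileall module is below. It assumes a current Python 3, where the bytecode lands in `__pycache__` directories; on the Python 2.6 mentioned in the question the .pyc files sit next to the sources instead. The two function names are just for this example.

```python
import compileall
import pathlib


def precompile(app_dir):
    """Byte-compile every .py file under app_dir; True if all files compiled."""
    return bool(compileall.compile_dir(str(app_dir), quiet=1))


def cached_bytecode_files(app_dir):
    """List the .pyc files the interpreter can load without recompiling."""
    return sorted(pathlib.Path(app_dir).rglob("*.pyc"))
```

Run `precompile` once, as a user who can write to the application folder, before starting the benchmark. That way the first requests don't include compilation time that the APC-cached PHP side no longer pays.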
Also if you're using the smarty cache consider enabling the Mako cache as well for fairness sake.
However, there is a catch: the Python MySQL adapter is incredibly bad. For the database connections you will probably notice either slow performance (if SQLAlchemy performs the unicode decoding itself) or memory leaks (if the MySQL adapter does that).
You have neither issue with PHP because there is no unicode support. So for total fairness you would have to disable unicode in the database connection (which, however, is an incredibly bad idea).
So: there doesn't seem to be a fair way to compare PHP and Pylons :)
Your PHP version is out of date. PHP has been in the 5.2.x range for a while, and while the improvements are not massive, there are enough changes that testing anything older makes for an unfair comparison.
PHP 5.3 is on the verge of becoming final, and you should include it in your benchmarks: it brings massive improvements to PHP 5.x and is the last version of the 5.x line. If you really want to split hairs, PHP 6 is also in alpha/beta, and that's a heavy overhaul too.
Comparing totally different languages can be interesting, but don't forget you are comparing apples to oranges, and the biggest bottleneck in any 2/3/N-tier app is waiting on I/O. So the biggest factor is your database speed. Comparing PHP vs Python vs ASP.NET purely on speed is pointless: all three will execute in less than 1 second, yet you can easily wait 2-3 seconds on your database query, depending on your hardware and what you are doing.
If you are worried about which is faster, you're taking absolutely the wrong approach to choosing a platform. There are more important issues, such as (not in order):
a. How easily can I find skilled devs in that platform?
b. How much do those skilled devs cost?
c. How much ROI does the language offer?
d. How feature rich is the language?