PHP based Web Applications vs CGI [closed]

PHP based Web Applications vs CGI [closed] - php

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
I've been working with Java webapps for quite sometime now. I was wondering how PHP based webapps are different from Java based ones. I could find some differences about security, availability of libraries etc.,
Basically, Java (servlet) based webapps are multi-threaded environments, where by a single process caters to the needs of different requests. How does this work in PHP?
a) Is every request a single process? In such a case how can we ensure memory utilization and other shared resources are under control?
b) Is there a concept called application scope/singleton in PHP based webapps?
c) Can we have connection pools?
Basically, how is PHP different from CGI?
The question might sound silly for PHP developers. I'll be happy to know there is already a place where the differences are documented. Thanks.

Java web applications are hosted (i.e. run within) a stateful, long-running Java process; because of this you can take advantage of in-memory object caching and the ability to manipulate threads.
The standard CGI model (ignoring FastCGI for now) is considerably simpler: a process is started and is passed the incoming HTTP request. The CGI process handles the request itself (including the creation of its own threads, if necessary) and then returns the HTTP response to the process that spawned the child CGI process. The CGI process is then terminated - so anything kept in-memory is lost and has to be serialized to some kind of persistance medium such as a database or file on disk.
(Speculation: CGI's design probably has to do with limited resources available on servers in the early 1990s, and how websites weren't visited all that often so it didn't make sense to use memory like that; finally if you're working on a huge scalable project then you probably won't be interested in in-memory caching because you'll have a dedicated state server).
PHP is a CGI system, so it inherits the limitations of the "one server process per request" model - as for not supporting threads: it seems to be a conscious decision of PHP's developers because it vastly simplifies the system (no need to worry about synchronisation, for example, and because PHP is probably the number one "beginner's language" nowadays it makes sense not to give them enough rope to hang themselves) - besides if you need to use multi-threading in a webserving scenario then PHP is probably the wrong tool to use in the first place.
PHP is not different to CGI - PHP implements CGI. Java webservers do not use CGI (at least not to serve Java applications, and note that there do exist CGI implementations of Java servlet hosts, but let's not complicate things).
Because PHP is not stateful it means it can't pool connections - but in practice this isn't a problem. When you pair PHP with MySQL you'll find that operations are surprisingly cheap. You can connect to a database, grab some data from a SELECT, and return a nicely formatted HTML table all in under 5ms even on a decade-old machine. It doesn't matter what platform you use so long as page generation times are kept under 30ms (my personal objective time limit for a good user experience).

Related

Working with Threads using Apache and PHP

Is there anyway to work with threads in PHP via Apache (using a browser) on Linux/Windows?

The mere fact that it is possible to do something, says nothing whatever about whether it is appropriate.
The facts are, that the threading model used by pthreads+PHP is 1:1, that is to say one user thread to one kernel thread.
To deploy this model at the frontend of a web application inside of apache doesn't really make sense; if a front end controller instructs the hardware to create even a small number of threads, for example 8, and 100 clients request the controller at the same time, you will be asking your hardware to execute 800 threads.
pthreads can be deployed inside of apache, but it shouldn't be. What you should do is attempt to isolate those parts of your application that require what threading provides and communicate with the isolated multi-threading subsystem via some sane form of RPC.
I wrote pthreads, please listen.

Highly discouraged.
The pcntl_fork function, if allowed at all in your setup, will fork the Apache worker itself, rather than the script, and most likely you won't be able to claim the child process after it's finished.
This leads to many zombie Apache processes.
I recommend using a background worker pool, properly running as a daemon/service or at least properly detached from a launching console (using screen for example), and your synchronous PHP/Apache script would push job requests to this pool, using a socket.
Does it help?
[Edit] I would have offered the above as a commment, but I did not have enough reputation to do so (I find it weird btw, to not be able to comment because you're too junior).
[Edit2] pthread seems a valid solution! (which I have not tried so I can't advise)

The idea of "thread safe" can be very broad. However PHP is on the very, furthest end of the spectrum. Yes, PHP is threadsafe, but the driving motivation and design goals focus on keeping the PHP VM safe to operate in threaded server environments, not providing thread safe features to PHP userspace. A huge amount of sites use PHP, one request at a time. Many of these sites change very slowly - for example statistics say more sites serve pages without Javascript still than sites that use Node.js on the server. A lot of us geeks like the idea of threads in PHP, but the consumers don't really care. Even with multiple packages that have proved it's entirely possible, it will likely be a long time before anything but very experimental threads exist in PHP.
Each of these examples (pthreads, pht, and now parallel) worked awesome and actually did what they were designed to do - as long as you use very vanilla PHP. Once you start using more dynamic features, and practically any other extensions, you find that PHP has a long way to go.

How apache runs PHP for pages [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 9 years ago.
Improve this question
Imagine there is a PHP page on http://server/page.php.
Client(s) send 100 requests from browser to the server for that page simultaneously.
Does the server run 100 separate processes of php.exe simultaneously?
Does it re-interpret the page.php 100 times?

The answer is highly variable, according to server config.
Let's answer question 1 first:
Does the server run 100 separate processes of php.exe simultaneously?
This depends on the way PHP is installed. If PHP is being run via CGI, then the answer is "Yes, each request calls a separate instance of PHP". If it's being run via an Apache module, then the answer is "No, each request starts a new PHP thread within the Apache executable".
Similar variations will exist for other web servers. Please note that for a Unix/Linux based operating system, running separate copies of the executable for each request is not necessarily a bad thing for performance; the core of the OS is designed such that in many cases, tasks are better done by a number of separate executables rather than one monolithic one.
However, no matter what you do about it, having large numbers of simultaneous requests will drain your server resources and lead to timeouts and errors for your users. This is why it is important for your PHP programs to finish running as quickly as possible. Do not write PHP programs for web consumption that are slow to run; if you're likely to have a lot of traffic, you need to test for performance as much as you do for functionality. Having your programs exit quickly will dramatically reduce the likelihood of having a significant number of simultaneous requests, which will obviously have a big impact on your site's performance.
Now your second question:
Does it re-interpret the page.php 100 times?
For a standard PHP installation, the answer here is "Yes it does, and yes it does have a performance impact."
However, PHP provides several Caching solutions that are designed specifically to mitigate this. The main options are APC and the Zend Cache, either of which can be installed as standard modules. Using these modules will mean that PHP caches the interpreted code, so it can be run much faster for subsequent calls.
The Zend Cache will be included as part of the standard PHP installation as of the forthcoming PHP 5.5 release.

Apache2 has multiple different mode to work.
In "prefork" (the most commonly used) mode, Apache will create process for every request, each process will run a own php.exe. Config file will assign a maximum number of connections (MaxClients in httpd.conf), Apache will only create MaxClients. This is to prevent memory exhaustion. More requests are queued, waiting for the previous request to complete.
If you do not install opcode cache extensions like APC, XCache, eAccelerator, php.exe will re-interpret the page.php 100 times.

It depends.
There are different ways of setting things up, and things can get quite complex.
The short answer is 'more or less'. A number of apache processes will be spawned, the PHP code will be parsed and run.
If you want to avoid the parsing overhead use an opcode cache. APC (Alternative PHP Cache) is a very popular one. This has a number of neat features which are worth digging into, but without any config other than installing it it will ensure that each php page is only parsed into opcode once.
To change how many apache services are spawned, most likely you'll be using MPM Prefork. This lets you decide if how you want Apache to deal with multiple users.
For general advice, in my experience (small sites, not a huge amount of traffic), installing APC is worth doing, for everything else the defaults are not too bad.

There are a number of answers to this. In general, Apache will create a process for an incoming request, so it is possible that 100 process are created. However, a process takes time to create, so it might be that by the time a process has finished and died, one of those 100 connections comes in a fraction of a second later (since 100 connections at exactly the same time is very rare indeed, unless you're Google).
However, let us imagine that 100 processes do really need to be held in memory simultaneously, but that there is only room for 50 in available server RAM. In that case, 50 connections will be served, and 50 will have to wait for processes to die and be re-spawned. Thus, a random half of those requests will be delayed, though if a process create-process-die sequence only takes a fraction of a second, they won't have to wait very long. This is why, when improving server capacity, reducing your page load time is as important as adding more RAM - the sooner a process finishes, the sooner a new one can take its place.
One way, incidentally, to reduce load time is to spawn a number of PHP processes and hold them in memory. This is the basis of FastCGI (or fcgid, which is compatible). Rather than creating and killing a process for every request, a process is spawned in memory immediately and is re-used for several requests. For PHP, these are usually configure to die after a certain number of page requests (e.g. 1000) as historically PHP has had quite a lot of memory leaks (the more a process is reused, the worse the memory leaks get).
You ask if a page is re-interpreted for every request. Normally yes, but if you also run a PHP Accelerator, then no - the byte-code that PHP compiles to is cached and reused. Thus, mixing the FastCGI approach with an accelerator can make for a very speedy server indeed. Standard PHP does not come with an accelerator, but Zend Cache is scheduled for inclusion into the PHP core.

How to make PHP write asynchronously? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
Is there a PHP command that write asynchronously (as in I don't care when the data is written and this has very low priority.)
I am hosting 1,300 domains on a server. I know it's a lot. But each takes very little memory, very little CPU, very little bandwidth. The bottleneck is random writes. Too many random writes (and random read). The server is Linux.
I read here:
Hence, SYNC-reads and ASYNC-writes are generally a good way to go as
they allow the kernel to optimise the order and timing of the
underlying I/O requests.
There you go. I bet you now have quite a lot of things to say in your
presentation about improving disk-IO. ;-)
Most of the writes are just cache files. I lost nothing if it's wrong once in a while. Basically I want to set up my system so that most read and write goes to the memory and then when I want my server to write data to my disks in sequential block.
I have a huge amount of memory that can be used as cache. In Windows I use supercache to do this and everything loads faster. People there tell that Linux already has block caching. Well, the block caching doesn't work and I am still researching how to make most writes goes to memory first so that data is flushed only ocassionally.
For now, I just want to modify my PHP program so when it writes data it tells the kernel "just write it some times in the future, I can wait and I don't mind."
Is there an option for that? I could use SSD but they are small.

If most of the write activity is due to caching then I would use either apc or memcached as a a backend to the cache instead of just the filesystem. Any writes that must be persisted to the file system can then be handled separately without the competing cache io hitting the disk. The only consideration to make is that your cache is now in memory it will be emptied after a reboot, if that would cause a performance issue you may want to implement a cache warming process.
Most of the major PHP frameworks have memcached and apc (i.e. Zend Framework )as an available storage adaptor for their caching modules so that would be my first choice. Failing that you can use the memcached and apc extensions directly.
Another option is to look at implementing a queue that you can send the pending writes to. That queue can then be processed periodically by a scheduled task that performs the writes at an acceptable rate. I have used Zend_Queue in the past for a similar purpose.

Is it possible not to load the bootstrapping mechanism at every call?

This is not a PHP question, but my expertise is with PHP frameworks.
A lot of frameworks have a bootstrapping (loading of classes and files) mechanism. (Drupal, Zend Framework to name a few)
Everytime that you make a request, the complete bootloading process needs to be repeated. And it can be optimized using APC by automatically caching some intermediate code
The general question is:
For any language, is there any way to not load the complete bootstrapping process? Is there any way of "caching" the state (or starting at) at the end of the bootstraping process to not load everything again? (maybe the answer is in some other language/framework/pattern)
It looks to me as extremely inefficient.

In general, it's quite possible to perform bootstrap / init code once per process, instead of having to reload it for every request. In your specific case, I don't think this is possible with PHP (but my knowledge of PHP is limited). I know I have seen this as a frequently criticism of PHP's architecture... but to be fair to PHP, it's not the only language or framework that does things this way. To go into some detail...
The style of "run everything for every request" came about with "CGI" scripts (c.f. Common Gateway Interface), which were essentially just programs that got executed as a separate process by the webserver whenever a request came in matching the file, and predefined environmental variables would be set providing meta information. The file could be basically any executable, written in any language. Since this was basically the first way anyone came up with of doing server-side scripting, a number of the first languages to integrate into a webserver used the cgi interface, Perl and PHP among them.
To eliminate the inefficiency you identified, the a second method was devised, which used plugins into the webserver itself... for Apache, this includes mod_perl for Perl, and mod_python for Python (the latter now replaced by mod_wsgi for Python). Using these plugins, you could configure the server to identify a program to load once per process, which then does the requisite initialization, loads it's persistent state into memory, and offers up a single function for the server to call whenever there is a request. This can lead to some extremely fast frameworks, as well as things such as easy database connection pooling.
The other solution that was devised was to write a web server (usually stripped down) in the language required, and then use the real webserver to act as a proxy for the complicated requests, while still serving static files directly. This route is also used frequently by Python (quite often via the server provided by the 'Paste' project). It's also used by Java, through the Tomcat webserver. These servers, in turn, offer approximately the same interface as I mentioned in the last paragraph.

The short answer is: in PHP there's no good way to skip the bootstrapping. (Technically you could run a PHP service 24/7 that ran forked children to handle requests, but that's not going to make your life any better.)
A good framework shouldn't do much in bootstrapping. In my personal one that I use, it simply registers an autoload function for classes, loads the config settings from MemCache, and connects to a database.
At that point, it parses the request and sends it to the proper controller / action. While creating the new router object every time is a "waste," the actual process of handling the request needs to be done regardless if the bootstrapping process is magically "cached" between requests.
So I would measure the time it takes between starting the page and getting to the action method to see if it's even a problem. If the framework is doing expensive things related to configuration and class loading, you should be able to minimize that via storing the end results in memcache.
Note that you should always be using an opcode cache (e.g. APC) and a persistent SAPI (e.g., php-fpm) in production. Otherwise, there is a lot of overhead with starting up and shutting down.

I would suggest you to look into FastCGI and C/C++ interface if you want to handle multiple requests. Usually it brings many problems (such as data caching / flushing, memory leaks etc), but can raise performance 10-100 times.
PHP is more suitable for web interface, and if you need fast-processing then you can write a persistent handler.
Also take a look at Java / Tomcat, Python and mod_perl. Some people have also suggested xcache.
For the PHP frameworks they do need to support a multi-request structure in the core, and I'm not aware of any framework doing that.
However said that, I'd love to have a project which would let PHP script to respond to multiple requests inside a loop. Not simultaneously, but bypassing the initialization.
Also you can take a look at https://github.com/kvz/system_daemon, and http://gearman.org/.

Which is faster, python webpages or php webpages? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
Which is faster, python webpages or php webpages?
Does anyone know how the speed of pylons(or any of the other frameworks) compares to a similar website made with php?
I know that serving a python base webpage via cgi is slower than php because of its long start up every time.
I enjoy using pylons and I would still use it if it was slower than php. But if pylons was faster than php, I could maybe, hopefully, eventually convince my employer to allow me to convert the site over to pylons.

It sounds like you don't want to compare the two languages, but that you want to compare two web systems.
This is tricky, because there are many variables involved.
For example, Python web applications can take advantage of mod_wsgi to talk to web servers, which is faster than any of the typical ways that PHP talks to web servers (even mod_php ends up being slower if you're using Apache, because Apache can only use the Prefork MPM with mod_php rather than multi-threaded MPM like Worker).
There is also the issue of code compilation. As you know, Python is compiled just-in-time to byte code (.pyc files) when a file is run each time the file changes. Therefore, after the first run of a Python file, the compilation step is skipped and the Python interpreter simply fetches the precompiled .pyc file. Because of this, one could argue that Python has a native advantage over PHP. However, optimizers and caching systems can be installed for PHP websites (my favorite is eAccelerator) to much the same effect.
In general, enough tools exist such that one can pretty much do everything that the other can do. Of course, as others have mentioned, there's more than just speed involved in the business case to switch languages. We have an app written in oCaml at my current employer, which turned out to be a mistake because the original author left the company and nobody else wants to touch it. Similarly, the PHP-web community is much larger than the Python-web community; Website hosting services are more likely to offer PHP support than Python support; etc.
But back to speed. You must recognize that the question of speed here involves many moving parts. Fortunately, many of these parts can be independently optimized, affording you various avenues to seek performance gains.

There's no point in attempting to convince your employer to port from PHP to Python, especially not for an existing system, which is what I think you implied in your question.
The reason for this is that you already have a (presumably) working system, with an existing investment of time and effort (and experience). To discard this in favour of a trivial performance gain (not that I'm claiming there would be one) would be foolish, and no manager worth his salt ought to endorse it.
It may also create a problem with maintainability, depending on who else has to work with the system, and their experience with Python.

I would assume that PHP (>5.5) is faster and more reliable for complex web applications because it is optimized for website scripting.
Many of the benchmarks you will find at the net are only made to prove that the favoured language is better. But you can not compare 2 languages with a mathematical task running X-times. For a real benchmark you need two comparable frameworks with hundreds of classes/files an a web application running 100 clients at once.

PHP and Python are similiar enough to not warrent any kind of switching.
Any performance improvement you might get from switching from one language to another would be vastly outgunned by simply not spending the money on converting the code (you don't code for free right?) and just buy more hardware.

It's about the same. The difference shouldn't be large enough to be the reason to pick one or the other. Don't try to compare them by writing your own tiny benchmarks ("hello world") because you will probably not have results that are representative of a real web site generating a more complex page.

If it ain't broke don't fix it.
Just write a quick test, but bear in mind that each language will be faster with certain functions then the other.

You need to be able to make a business case for switching, not just that "it's faster". If a site built on technology B costs 20% more in developer time for maintenance over a set period (say, 3 years), it would likely be cheaper to add another webserver to the system running technology A to bridge the performance gap.
Just saying "we should switch to technology B because technology B is faster!" doesn't really work.
Since Python is far less ubiquitous than PHP, I wouldn't be surprised if hosting, developer, and other maintenance costs for it (long term) would have it fit this scenario.

an IS organization would not ponder this unless availability was becoming an issue.
if so the case, look into replication, load balancing and lots of ram.

The only right answer is "It depends". There's a lot of variables that can affect the performance, and you can optimize many things in either situation.

I had to come back to web development at my new job, and, if not Pylons/Python, maybe I would have chosen to live in jungle instead :) In my subjective opinion, PHP is for kindergarten, I did it in my 3rd year of uni and, I believe, many self-respecting (or over-estimating) software engineers will not want to be bothered with PHP code.
Why my employers agreed? We (the team) just switched to Python, and they did not have much to say. The website still is and will be PHP, but we are developing other applications, including web, in Python. Advantages of Pylons? You can integrate your python libraries into the web app, and that is, imho, a huge advantage.
As for performance, we are still having troubles.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.