What's more resource-intensive: PHP or Python?

My current web host allows up to 25 processes running at once. From what I can figure, each Python script takes up one of those process slots, but PHP doesn't?
I get a 500 error if more than 25 processes are running at once (unlikely, but still a hassle), so I was wondering if it would be easier on the server if I were to port my site over to PHP?
Thanks!

You are using HostGator. Switch hosts. Their shared server offerings should only be used by very low-traffic brochure sites, as they cram hundreds of vhosts onto each server.
If you can't switch, ensure you're set up to use mod_php (not suPHP or CGI) or the Python equivalent. Otherwise, new processes will be spawned on each request and you'll be serving up blank pages in no time.

It depends on how you have PHP/Python set up. If you have, say, Apache loading PHP via mod_php, then it doesn't actually spawn a new process. Likewise, if you were using, say, Tornado to handle web requests, then the web server itself is already running the Python process, and thus there are no additional Python processes required.
Basically... don't change languages just to alter the number of processes you have running. Instead, figure out what methods your current language has to reduce the process count.
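One quick way to check which SAPI is actually serving your pages (php_sapi_name() is a standard PHP function; this is just a throwaway test script):

    <?php
    // Prints the SAPI in use: "apache2handler" means mod_php (no per-request
    // process spawn), while "cgi-fcgi" means CGI/FastCGI is in the path.
    echo php_sapi_name();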

Related

How are concurrent requests handled in PHP (using threads, a thread pool, or child processes)?

I understand that PHP supports handling multiple concurrent connections, and that depending on the server it can be configured as mentioned in this answer.
How does the server manage multiple connections? Does it fork a child process for each request, handle requests with threads, or use a thread pool?
The linked answer says a process is forked, but then the author says "threads or processes" in a comment, which makes it confusing: are requests served by child processes, threads, or a thread pool?
As far as I know, every web server has its own way of handling multiple simultaneous requests.
Usually Apache forks a child process for each new request, but you can configure this behaviour, as mentioned in your linked Stack Overflow answer.
Nginx, for example, handles every request in a single thread (processing new connections asynchronously, as Node.js does) and can also cache responses, act as a load balancer, or serve as an HTTP proxy, depending on configuration. It's a matter of choosing the right web server for your application.
Apache can be a very good web server, but you need more load balancing when you want to use it in production. It also does well with many short-lived connections, or with documents that don't change at all (especially with caching).
Nginx is very good if you expect many long-lived connections with relatively long processing times; you don't need as much load balancing then.
I hope I was able to help you out with this ;)
Sources:
https://httpd.apache.org/docs/2.4/mod/worker.html
https://anturis.com/blog/nginx-vs-apache/
I also recommend looking at: What is thread safe or non-thread safe in PHP?
I think the answer depends on how the web server and the CGI layer are deployed.
In my company we use Nginx as the web server and php-fpm as the CGI layer, so concurrent requests are handled by php-fpm as processes, not threads.
We configure the maximum number of processes, and each request is handled by a single PHP process; if more requests come in than the maximum number of processes, they wait, as sketched in the pool configuration below.
So I believe PHP itself can support all of these models, but which one you get depends on how you deploy it.
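For reference, a minimal sketch of such a pool configuration (pm, pm.max_children, listen, and listen.backlog are real php-fpm directives; the values and path are illustrative, not our production numbers):

    ; e.g. /etc/php-fpm.d/www.conf (the path varies by distribution)
    pm = static            ; fixed-size pool: exactly pm.max_children workers
    pm.max_children = 8    ; hard cap on simultaneous PHP requests
    listen = 127.0.0.1:9000
    listen.backlog = 511   ; excess requests queue here until a worker frees up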
After doing some research I ended up with the conclusions below.
It is important to consider how the PHP server is set up in order to get insights into this. When setting up the server and PHP on your own, there are three possibilities:
1) Using PHP as a module (many servers have a direct module interface for PHP, also called a SAPI)
2) CGI
3) FastCGI
Considering case #1, PHP as a module: here the module is integrated into the web server itself, which puts the ball entirely in the web server's court as to how requests are handled, whether by forking processes, using threads, thread pools, etc.
For the module approach, Apache's mod_php appears to be very commonly used, and Apache itself handles requests using processes and threads under two models, as mentioned in this answer:
Prefork MPM uses multiple child processes with one thread each, and each process handles one connection at a time.
Worker MPM uses multiple child processes with many threads each. Each thread handles one connection at a time.
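To make the two models concrete, here is a minimal sketch of the corresponding Apache 2.4 configuration (real directive names; the values are purely illustrative):

    # Prefork MPM: one single-threaded process per connection
    <IfModule mpm_prefork_module>
        StartServers          5
        MaxRequestWorkers   150   # cap on concurrent connections = processes
    </IfModule>

    # Worker MPM: fewer processes, many threads each
    <IfModule mpm_worker_module>
        StartServers          3
        ThreadsPerChild      25
        MaxRequestWorkers   150   # cap on concurrent connections = total threads
    </IfModule>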
Obviously, other servers may take other approaches, but I am not aware of them.
For #2 and #3, the web server and the PHP part run in different processes, and how the web server handles the request and how it is further processed by the application (the PHP part) vary independently. For example, Nginx may handle the request using asynchronous non-blocking I/O while Apache may handle it using threads, but how the request is then processed by the FastCGI or CGI application is a separate aspect, as described below. Both aspects, i.e. how the web server handles requests and how the PHP part is processed, matter for a PHP server's performance.
Considering #2: the CGI protocol makes the web server and the application (PHP) independent of each other. It requires the application and the web server to run in different processes, and it does not provide for reuse of the same process, which in turn means a new process is required to handle each request.
Considering #3: the FastCGI protocol overcomes this limitation of CGI by allowing process reuse. As the IIS FastCGI link puts it, FastCGI addresses the performance issues inherent in CGI by providing a mechanism to reuse a single process over and over again for many requests:
FastCGI maintains compatibility with non-thread-safe libraries by providing a pool of reusable processes and ensuring that each process handles only one request at a time.
That said, in the case of FastCGI it appears that the server maintains a process pool and uses it to handle incoming client requests; since each process handles only one request at a time, no thread-safety checks are required, which gives good performance.
PHP does not handle requests. The web server does.
For Apache HTTP Server, the most popular is "mod_php". This module is actually PHP itself, but compiled as a module for the web server, and so it gets loaded right inside it.
Since with mod_php PHP gets loaded right into Apache, if Apache handles concurrency using its Worker MPM (that is, using threads), then PHP runs inside those threads and must be thread safe.
With nginx, PHP runs entirely outside the web server, in multiple separate PHP processes.
This is why you are sometimes given the choice between thread-safe and non-thread-safe PHP builds.
But the setlocale() function (when supported) actually modifies the operating system's process state, and it is not thread safe.
You should remember this when you are not sure how legacy code works.
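A minimal sketch of the hazard (assuming the de_DE.UTF-8 locale is installed on the system):

    <?php
    // setlocale() mutates state shared by the entire OS process, not just the
    // current request -- under a threaded server, two requests running in the
    // same process can change each other's locale mid-flight.
    setlocale(LC_NUMERIC, 'de_DE.UTF-8');      // assumed to be installed
    echo localeconv()['decimal_point'], "\n";  // "," -- now in effect process-wide
    setlocale(LC_NUMERIC, 'C');
    echo localeconv()['decimal_point'], "\n";  // "." again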

Debugging potential network bottleneck on AJAX calls

I've written some JS scripts on my school's VLE.
It uses the UWA Widget Format and, to communicate with a locally-hosted PHP script, it uses a proxy and AJAX requests.
Recently we moved the aforementioned locally-hosted server from a horrible XP-based WAMP box to a virtual Server 2008 machine running IIS and FastCGI PHP.
Since then - or maybe it was before and I just didn't notice - my AJAX calls are starting to take in excess of 1 second to run.
I've run the associated PHP script's queries in phpMyAdmin and, for example, the associated getCategories SQL takes 0.00023s to run, so I don't think the problem lies there.
I've pinged the server and it consistently returns <1ms as it should for a local network server on a relatively small scale network. The VLE is on this same network.
My question is this: what steps can I take to determine where the "bottleneck" might be?
First of all, test how long your script is actually running:
Simplest way to profile a PHP script
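If following that link isn't an option, a minimal sketch of the idea (microtime() is standard PHP; getCategories is the name taken from the question):

    <?php
    // Wrap the suspect section in wall-clock timestamps and log the result.
    $start = microtime(true);

    // ... run the getCategories query / the rest of the script here ...

    error_log(sprintf('getCategories took %.4f s', microtime(true) - $start));

If the logged time is small but the AJAX call is still slow, the time is being lost outside PHP: in FastCGI process startup, the proxy, or the network.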
Secondly, you should check the disk activity on the server. If it is running too many FastCGI processes for the amount of available RAM, it will swap and it will be very slow. If the disk activity is very high, then you know you've found your culprit. Solve it by reducing the maximum number of FastCGI processes or by increasing the amount of server RAM.

php5-fpm children and requests

I have a question.
I own a 128 MB VPS with a simple blog that gets just a hundred hits per day.
I have nginx + php5-fpm installed. Considering the low traffic and the RAM, I decided to set fpm to static with 1 server running. While I was doing random tests, like running PHP scripts over HTTP that last over 30 minutes, I tried to open the blog from the same machine and noticed that the site was basically unreachable. So I went to the configuration and read this:
    ; The number of child processes to be created when pm is set to 'static' and the
    ; maximum number of child processes to be created when pm is set to 'dynamic'.
    ; This value sets the limit on the number of simultaneous requests that will be served.
What shocked me most was that I didn't know this: I always assumed a PHP child would handle hundreds of requests at the same time, like an HTTP server does!
Did I get it right?
If, for example, I launch 2 php-fpm children and start 2 "long scripts" at the same time, will all the sites using the same PHP backend be unreachable? How is this usable?
You may think: "duh! a PHP script (web page) is usually processed in 100 ms". No doubt about that, but what happens if you have pages that run for about 10 seconds each and I have 10 visitors, with php-fpm set to 5 servers, so only 5 requests are accepted at a time? Will they all be queued, or will they experience timeouts?
I'm honestly used to running sites on Windows with Apache and mod_php, and I never experienced these issues; apparently those limits don't apply there, it being a different way of using PHP.
This also raises another question. If I have file_1.php with sleep(20) and file_2.php with just an echo, and I run file_1 and then file_2 on the FastCGI machine, the second file will request the creation of another server to handle the PHP request, using 4 MB more RAM. If I do the same with apache/mod_php, the second file will only use 30 KB more RAM (inside the Apache server). Considering this, why is mod_php considered the "bad guy" if the RAM used is actually less? I know I'm missing the big picture here.
You've basically got it right. You configured a static number of workers (and that number was "one") -- so that's exactly what you got.
But you don't quite understand how things typically work, since you say:
I always assumed a PHP child would handle hundreds of requests at the same time, like an HTTP server does!
I'm not really familiar with nginx, but consider the typical mod_php setup in Apache. If you're using mod_php, then you're using the prefork MPM for Apache, so every concurrent HTTP request is handled by a distinct httpd process (no threads). If you're tuning your apache/mod_php server for low memory, you're going to have to tweak Apache settings to limit the number of processes it will spawn (in particular, MaxClients), as sketched below.
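A minimal sketch of that tuning (these are real Apache 2.2-era prefork directives; the values are illustrative for a small box, not recommendations):

    <IfModule mpm_prefork_module>
        StartServers          2
        MinSpareServers       2
        MaxSpareServers       4
        MaxClients           10    # hard cap on concurrent httpd processes
        MaxRequestsPerChild 500    # recycle children so leaks can't accumulate
    </IfModule>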
Failing to tune this stuff means that when you get a large traffic spike, apache starts spawning a huge number of heavy processes (remember, it's mod_php, so you have the whole PHP interpreter embedded in each httpd process), and you run out of memory, and then everything starts swapping, and your server starts emitting smoke.
Tuned properly (meaning: tuned so that you ignore requests instead of allocating memory you don't have for more processes), clients will time out, but when traffic subsides, things go back to normal.
Compare that with fpm and a smarter web server architecture like apache-worker or nginx. Now you have some much larger pool of threads (still configurable!) to handle HTTP requests, and a separate pool of php-fpm processes to handle just the requests that require PHP. It's basically the same thing: if you don't set limits on how many processes/threads can be created, you are asking for trouble. But if you do tune, you come out ahead, since only a fraction of your requests use PHP. So essentially, the average amount of memory needed per HTTP request is lower, and thus you can handle more requests with the same amount of memory.
But setting the number to "1" is too extreme. At "1", it doesn't even matter if you choose static or dynamic, since either way you'll just have one php-fpm process.
So, to try to give explicit answers to particular questions:
You may think: "duh! a PHP script (web page) is usually processed in 100 ms". No doubt about that, but what happens if you have pages that run for about 10 seconds each and I have 10 visitors, with php-fpm set to 5 servers, so only 5 requests are accepted at a time? Will they all be queued, or will they experience timeouts?
Yes, they'll all queue, and eventually timeout. The fact that you regularly have scripts that take 10 seconds to run is the real culprit here, though. There are lots of ways to architect around that (caching, work queues, etc), but the right solution depends entirely on what you're trying to do.
I'm honestly used to running sites on Windows with Apache and mod_php, and I never experienced these issues; apparently those limits don't apply there, it being a different way of using PHP.
They do apply. You can set up an apache/mod_php server the same way as you have with nginx/php-fpm -- just set apache's MaxClients to 1!
This also raises another question. If I have file_1.php with sleep(20) and file_2.php with just an echo, and I run file_1 and then file_2 on the FastCGI machine, the second file will request the creation of another server to handle the PHP request, using 4 MB more RAM. If I do the same with apache/mod_php, the second file will only use 30 KB more RAM (inside the Apache server). Considering this, why is mod_php considered the "bad guy" if the RAM used is actually less? I know I'm missing the big picture here.
Especially on Linux, lots of things that report memory usage can be very misleading. But think about it this way: that 30 KB is negligible, because most of PHP's memory was already allocated when the httpd process started.
A 128 MB VPS is pretty tight, but it should be able to handle more than one PHP process.
If you want to optimize, do something like this.
For php-fpm:

    pm = static
    pm.max_children = 4

For nginx, figure out how to control the process and thread count (whatever the equivalents of Apache's MaxClients, StartServers, MinSpareServers, and MaxSpareServers are), as sketched below.
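A minimal sketch of those nginx knobs (worker_processes and worker_connections are real nginx directives; the values are illustrative for a 128 MB VPS):

    # nginx.conf
    worker_processes  1;           # nginx workers are event-driven; one is often enough
    events {
        worker_connections  256;   # max simultaneous connections per worker
    }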
Then figure out how to generate some realistic load (apachebench, siege, jmeter, etc.). Use vmstat, free, and top to watch your memory usage. Adjust pm.max_children and the nginx settings to be as high as possible without causing any significant swapping (according to vmstat).
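For example, with ApacheBench (the hostname is a placeholder):

    # 1000 requests, 100 concurrent, while watching memory in another terminal
    ab -n 1000 -c 100 http://your-blog.example/
    vmstat 1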

Is PHP under Apache reentrant?

Just a theoretical question really.
Say my website consists of a form that uses the PHP mail function to send e-mails, and I have 500 users clicking submit at the same time. Now 500 e-mails in 500 different sessions have to be sent out by PHP.
Will it be done concurrently? How many threads are involved? Will each send block the others, so they go one by one?
There are two things you need to think about.
The first is how you have your web server configured. If you're using Apache, there are a few multi-processing modules to pick from. The most popular is prefork, in which there is a single parent process and multiple child processes, and each child handles one request at a time. This avoids threading entirely, because not all Apache modules are thread safe. You might also find the worker module in production; it uses a combination of forking and threading to serve multiple requests per child. It can only be used when every single Apache module and all of its dependencies are thread safe.
The second thing to think about is PHP itself. While the core PHP language and some extensions are thread safe, many extensions aren't. For this reason, when you're using Apache and mod_php, the prefork processing module is your best choice. (PHP itself has no internal concept of threads.)
tl;dr: Apache + PHP = one request per Apache child. You'll usually only have 20-30 Apache children, meaning 20-30 possible concurrent requests. This depends on configuration.
On a Linux-based server, emails are sent using the local sendmail command, which immediately accepts the message and returns. The rest of the work is done asynchronously by your MTA, which has been hardened and optimized over decades for exactly this kind of job.
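A minimal sketch of that handoff (the addresses are placeholders):

    <?php
    // mail() hands the message to the local sendmail binary and returns as
    // soon as it is queued; actual delivery happens later, inside the MTA.
    $queued = mail(
        'user@example.com',                // placeholder recipient
        'Form submission',
        'Message body goes here',
        'From: webmaster@example.com'
    );
    echo $queued ? "queued for delivery\n" : "handoff to sendmail failed\n";

So even 500 near-simultaneous submits tie up each Apache child only for the few milliseconds the queue handoff takes, not for the full delivery time.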

CGI is inefficient, but what is being used that is so different nowadays?

My understanding is that writing CGI scripts is a thing of the past, deemed inefficient because of the way a new process is forked every time a script is called. However, I don't see the difference: when you request a web page with PHP scripts embedded, doesn't it still in some way fork another process? So why is CGI deemed inefficient?
There are two "mainstream" ways around forking for every request:
You can load the interpreter directly into your server's process space, and prefork a set number of instances during startup. mod_php and mod_python take roughly this tactic.
You can create a persistent process for the interpreter, and then either prefork or spawn threads for each request, communicating with the server over sockets. FastCGI is used this way.
Event-driven servers, while not exactly mainstream, are becoming more common for good reason. They rely on the knowledge that most websites spend most of their time blocking on I/O, just spinning their metaphorical gears. Whenever a request needs to do any I/O, the server is free to start handling another request without starting another thread/process by using select() and friends. This is really the only way to solve the C10k problem.
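As a toy sketch of the event-driven idea (stream_select() is PHP's wrapper around select(); the port and the echo behaviour are arbitrary choices for illustration):

    <?php
    // One process watches many sockets and services only the ones that are
    // ready, instead of dedicating a process or thread to each connection.
    $server  = stream_socket_server('tcp://127.0.0.1:8080', $errno, $errstr);
    $clients = [];

    while (true) {
        $read  = array_merge([$server], $clients);
        $write = $except = null;
        stream_select($read, $write, $except, null);  // block until something is ready
        foreach ($read as $stream) {
            if ($stream === $server) {
                $clients[] = stream_socket_accept($server);  // new connection
            } elseif (($data = fread($stream, 8192)) === '' || $data === false) {
                $clients = array_filter($clients, function ($c) use ($stream) {
                    return $c !== $stream;                   // client went away
                });
                fclose($stream);
            } else {
                fwrite($stream, $data);                      // trivial echo "handler"
            }
        }
    }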
CGI forks every request. This means the "weight" of forking/initializing is done every time.
FastCGI and mod_php fork only on server start or when load increases. This means setup is done only once per process (though the processes don't share memory).
Facebook's HipHop doesn't fork (it transforms the PHP into thread-safe C++ code). This allows all the PHP "processes" to live in a single multithreaded C++ binary, resulting in a substantial speed-up and a decrease in memory usage.
There is FastCGI, where you start a number of instances of the PHP binary and they handle queries one after another; this way the processes are not started over and over again for each request.
PHP is also typically run as an Apache module, whose lifecycle is tied to the web server's, so there is no extra overhead between requests: when Apache identifies a file as a PHP script, it calls mod_php, which was started together with the web server itself.
CGI's overhead came mainly from having to fork() the web server process handling the request, fire up a shell, and then run an external program.
Since the request was handled by an external program, almost everything to do with the request had to be copied into the new shell's environment variables: query string, remote address, authentication data, etc. POST data was passed to the script via its stdin. All of this took time to copy and configure, adding to the overhead, as the sketch below illustrates.
This is one reason why query strings had to be length-limited. Some operating systems had (and still have) length limits on the names of environment variables, and definitely had limits on how large an individual variable's contents could be. The HTTP spec itself has no limits, but because of the CGI mechanism and the underlying OS limits, length-limited query strings became the norm.
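For illustration, a minimal sketch of what a CGI script actually sees (REQUEST_METHOD, QUERY_STRING, and REMOTE_ADDR are standard CGI variable names; written here as a PHP script run under the CGI SAPI):

    <?php
    // Under CGI, the entire request arrives via environment variables + stdin.
    $method = getenv('REQUEST_METHOD');   // e.g. "GET" or "POST"
    $query  = getenv('QUERY_STRING');     // the (length-limited) query string
    $addr   = getenv('REMOTE_ADDR');      // client address, copied in by the server
    $body   = $method === 'POST' ? file_get_contents('php://stdin') : '';

    // The response is just headers, a blank line, then the body on stdout.
    header('Content-Type: text/plain');
    echo "method=$method query=$query from=$addr\n";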
Microsoft IIS allows the script to be interpreted in-process by loading a library into its address space. (Newer versions of IIS work more like FastCGI in that the actual script execution is performed in a separate web worker process, although this is subject to the configuration settings for the virtual directory.)
