I have a script which runs 1000 cURL requests using curl_multi_* functions in PHP.
What is the bottleneck behind them timing out?
Would it be the CPU usage? Is there some more efficient way, in terms of how that number of outbound connections is handled by the server, to do this?
I cannot change the functionality and the requests themselves are simple calls to a remote API. I am just wondering what the limit is - would I need to increase memory on the server, or Apache connections, or CPU? (Or something else I have missed)
Your requests are made in a single thread of execution. The bottleneck is almost certainly CPU; have you ever actually watched curl_multi code run? It is incredibly CPU hungry, because you don't really have enough control over how the requests are handled. curl_multi makes it possible to orchestrate 1000 requests at once, but that doesn't make it a good idea. You have almost no chance of using curl_multi efficiently, because you cannot control the flow of execution finely enough; just servicing the sockets and select()'ing on them accounts for a lot of the high CPU usage you would see watching your code run on the command line.
The reason the CPU usage is high during such tasks is this: PHP is designed to run for a fraction of a second and do everything as fast as it can. It usually does not matter how the CPU is utilized, because the run is over in such a short space of time. When you prolong a task like this, the problem becomes more apparent: the overhead incurred with every opcode becomes visible to the programmer.
I'm aware you have said you cannot change the implementation, but still, for the sake of a complete answer: such a task is far more suited to threading than to curl_multi, and you should start reading http://php.net/pthreads, beginning with http://php.net/Thread.
Left to their own devices on an idle CPU, even 1000 threads would consume as much CPU as curl_multi. The point is that you can control precisely the code responsible for downloading every byte of the response and uploading every byte of the request, and if CPU usage is a concern you can implement a "nice" process by explicitly calling usleep, or by limiting connection usage in a meaningful way; additionally, your requests can be serviced in separate threads.
I do not suggest that 1000 threads is the thing to do; it is more than likely not. The thing to do would be to design a Stackable (see the documentation) whose job is to make and service a request in a "nice", efficient way, and to design pools (see the examples on GitHub and in the pecl extension sources) of workers to execute your newly designed requests.
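For completeness, here is a rough sketch of that shape. It is written against the legacy pthreads API the links above describe (Stackable was later folded into Threaded in newer pthreads versions), and the URL list, pool size and timeouts are placeholders, so treat it as an outline rather than a drop-in implementation:

<?php
// Illustrative only: legacy pthreads API (Stackable/Worker); class names,
// pool size and timeouts are assumptions to adapt.

class ApiRequest extends Stackable
{
    public $url;
    public $response;

    public function __construct($url)
    {
        $this->url = $url;
    }

    public function run()
    {
        // One curl handle per job; the worker thread blocks here, not the
        // main thread, and usleep() keeps the request loop "nice".
        $ch = curl_init($this->url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);
        $this->response = curl_exec($ch);
        curl_close($ch);
        usleep(10000); // small pause to limit CPU pressure
    }
}

$urls = [/* the 1000 API endpoints */];

// A small, fixed pool of workers instead of 1000 concurrent handles.
$poolSize = 8;
$workers  = [];
for ($i = 0; $i < $poolSize; $i++) {
    $workers[$i] = new Worker();
    $workers[$i]->start();
}

$jobs = [];
foreach ($urls as $n => $url) {
    $jobs[$n] = new ApiRequest($url);
    $workers[$n % $poolSize]->stack($jobs[$n]); // round-robin across the pool
}

foreach ($workers as $worker) {
    $worker->shutdown(); // waits for all stacked jobs to finish
}

The point of the fixed pool is exactly the control described above: each worker services one request at a time and can be throttled, instead of one thread select()'ing over 1000 sockets.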
Related
I am running an HTTP API which should be able to handle more than 30,000 calls per minute arriving simultaneously.
Currently I can call it 1,200 times per minute; at that rate, all the requests complete and get a response immediately.
But if I call it 12,000 times per minute simultaneously, it takes 10 minutes to complete all the requests, and during those 10 minutes I cannot browse any web page on the server. It is very slow.
I am running CentOS 7
Server Specification
CPU: Intel® Xeon® E5-1650 v3 hexa-core (Haswell)
RAM: 256 GB DDR4 ECC
Hard drive: 2 x 480 GB SSD (software RAID 1)
Connection: 1 Gbit/s
API: a simple PHP script that echoes the timestamp
echo time();
I checked with the top command and there is no load on the server.
Please help me with this.
Thanks
Sounds like a congestion problem.
It doesn't matter how quick your script/page handling is: if the next request comes in before the previous one has finished executing, it is going to use resources (CPU, RAM, disk, network traffic and connections) and make everything running in parallel with it slower.
There are multiple things you could do, but you need to figure out what exactly the problem is for your setup and decide whether a given measure produces the desired result.
If the core problem is that resources get hogged by parallel processes, you could lower the connection limits so that more connections go into wait mode, which keeps more resources available for actually handing out pages instead of congesting everything even more.
Take a look at this:
http://oxpedia.org/wiki/index.php?title=Tune_apache2_for_more_concurrent_connections
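As an illustration only: the numbers below are placeholders rather than recommendations, and they assume Apache 2.4 with the prefork MPM (where the directive is MaxRequestWorkers; older 2.2 setups call it MaxClients). Capping the number of simultaneously served requests looks roughly like this:

# Illustrative prefork MPM limits; tune them against your own measurements.
<IfModule mpm_prefork_module>
    StartServers             10
    MinSpareServers          10
    MaxSpareServers          20
    MaxRequestWorkers       150
    MaxConnectionsPerChild 1000
</IfModule>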
If the server accepts connections quicker than it can handle them, you are going to have a problem whatever you change; it should start dropping connections at some point. If you cram French baguettes down its throat quicker than it can open its mouth, it is going to suffocate either way.
If the system gets overwhelmed on the network side of things (transfer speed limit, maximum number of concurrent connections the OS allows, etc.), then you should consider using a load balancer. Only after the load balancer confirms that the server has the capacity to actually take care of the page request does it send the user on.
This usually works well when you do any kind of processing which slows down page loading (server-side code execution, large volumes of data, etc.).
Optimise performance
There are many ways to execute PHP code on a web server, and I assume you use Apache. I am no expert, but there are modes like CGI and FastCGI, for example, which can greatly change execution speed, and tweaking the settings connected to these can also show you what is happening. It could, for example, be that you have too few PHP worker processes to handle that number of concurrent connections.
Have a look at something like this, for example:
http://blog.layershift.com/which-php-mode-apache-vs-cgi-vs-fastcgi/
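If you end up on PHP-FPM (a common FastCGI setup), the number of PHP worker processes is set in the pool configuration. The values below are purely illustrative and assume a default pool file such as /etc/php-fpm.d/www.conf; size them to your RAM and traffic:

; Illustrative PHP-FPM pool sizing, not a recommendation.
pm = dynamic
pm.max_children = 100
pm.start_servers = 20
pm.min_spare_servers = 10
pm.max_spare_servers = 30
pm.max_requests = 1000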
There is no 'best fit for all' solution here. To fix it, you need to figure out what the bottleneck for the server is and act accordingly.
12,000 calls per minute == 200 calls a second.
You could limit your test case to a multiple of those 200 and increase/decrease it while changing settings. Your goal is to dish out that number of requests in as short a time as possible, thus ensuring the congestion never occurs.
That said: consequences.
When you implement changes to maximise the number of page loads you can handle, you inadvertently introduce other conditions. For example, if maximum RAM usage by Apache is the problem, then upping that limit will give better performance, but it increases the chance that the OS runs out of memory when other processes also want to claim more of it.
Adding a load balancer adds another possible point of failure and another possible source of slowdowns. Yes, you prevent congestion, but is it worth the slowdown caused by the rerouting?
Upping performance will increase the load on the system, making it possible to accept more concurrent connections, so somewhere along the line a different bottleneck will pop up. High traffic through any process can always end with that process crashing. Apache is a very well built web server, so in theory it should protect you against that, but tweaking settings wrongly can still cause crashes.
So experiment with care and test before you use it live.
I have a Laravel app (on Forge) that's posting messages to SQS. I then have another box on Forge which is running Supervisor with queue workers that are consuming the messages from SQS.
Right now, I just have one daemon worker processing a particular tube of data from SQS. When messages come up, they do take some time to process - anywhere from 30 to 60 seconds. The memory usage on the box is fine, but the CPU spikes almost instantly and then everything seems to get slower.
Is there any way to handle this? Should I instead dispatch many smaller jobs (which can be consumed by multiple workers) rather than one large job which can't be split amongst workers?
Also, I noted that Supervisor is only using one of my two cores. Any way to have it use both?
Memory-intensive applications are manageable as long as scaling is provided, but CPU spikes are hard to manage, since a spike happens within one core, and when that happens your servers might sometimes even get sandboxed.
To answer your question, I see two possible ways to handle your problem.
Concurrent programming. Keep the job as it is and see whether the larger task can be parallelized (see this). If it can, parallelize the code so that each core handles a specific part of your large task, then gather the results in one coordinating process and assemble the final result. (Additionally, this can be done efficiently if GPU programming is considered.)
Dispatch smaller jobs (as given in the question): this is a good approach if you can manage multiple workers working on smaller tasks and have a mechanism to coordinate everything at the end. This could be arranged as a master-slave setup. It makes each piece easy (because parallelizing one big problem is hard), but you do need to bring the results back together, as sketched below.
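A minimal sketch of the second option, assuming a reasonably recent Laravel where queued jobs expose a static dispatch() helper; ProcessChunk is a hypothetical job class standing in for your real work:

<?php
// Hypothetical: split one large payload into many small queued jobs so that
// several workers (for example one Supervisor-managed queue worker per core)
// can consume them in parallel instead of one worker grinding for 30-60 seconds.

$items = $payload['items']; // the big unit of work you currently process in one job

foreach (array_chunk($items, 250) as $chunk) {
    ProcessChunk::dispatch($chunk); // each chunk becomes its own SQS message
}

Running more than one queue worker process (for example via Supervisor's numprocs setting) is then what actually spreads the chunks across both cores.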
I need to move a lot of MySQL rows to a Couchbase server. The catch is that I need to use a PHP class to do the job (the class has business logic).
I've created a PHP CLI script and ran 6 of them at once. That is faster than running a single CLI script, but not fast enough: it took me 2 hours to transfer everything.
Is there any better way?
Updated:
What the PHP code does with MySQL:
select * from table limit $limit
That's about it. Nothing fancy.
Is there any better way?
Yes. There most likely is.
You need to identify the bottleneck. From what you describe it seems the bottleneck is the number of jobs run in parallel. So you should increase that until you find the maximum performance. GNU Parallel can often help you do that.
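For example (the script name and the --offset/--limit flags below are assumptions about how your CLI script takes its work; adjust them to whatever it actually accepts), GNU Parallel can sweep the row ranges over a configurable number of simultaneous jobs:

# Hypothetical invocation: try different -j values (8, 12, 16, ...) and keep
# the one that gives the best total throughput.
seq 0 10000 990000 | parallel -j 12 php migrate.php --offset {} --limit 10000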
When you have done that, the bottleneck is somewhere else. But since your description has very little detail, it is impossible to tell where.
You will therefore have to find the new bottleneck. The bottleneck is typically disk I/O, network I/O, or CPU, but it can also be a shared lock or some other resource.
To look for a CPU bottleneck run top. If you see a process running at 100% and you cannot parallelize this process, then you have found your bottleneck.
To look for a disk I/O bottleneck run iostat -dkx 1. If the last column hits 100% more than 50% of the time, you have found your bottleneck.
To look for a network I/O bottleneck run a tool such as iftop. If the bandwidth used is > 70% of the available network bandwidth, you have found your bottleneck.
A special network problem is DNS: This can often be seen as a process that is stuck for 5-10 seconds for no good reason, but which otherwise runs just fine. Use tcpdump -s1000 -n port 53 and see if the questions are being answered quickly. Do this on all machines involved in running the job.
Looking for a shared lock is harder. You will first have to find the suspect processes, and then you have to strace them.
I have a beanstalkapp worker written in Node.js. There is a PHP application which does all the site work, and whenever there are errors, issues, notifications or whatever, it adds them to the beanstalkapp. The Node.js worker then needs to run pretty much constantly, checking the beanstalkapp for any messages and doing something with them (emailing someone, adding them to a log, posting somewhere).
My question is: is this bad performance-wise, or is there a better way to do this? I would assume that setInterval doesn't let the process end and would therefore be bad?
It depends on your setInterval time. The JavaScript interpreter sleeps between events, so in between setInterval callbacks your Node.js app consumes zero (or almost zero) CPU time.
How much load the app puts on the system as a whole therefore depends on how often setInterval fires. Once every second would hardly consume any CPU at all. On the other hand, once every 1 ms (a setInterval time of 1 or 0) can bog down your system, especially if you're running on a resource-constrained machine or VM.
In my experience, a good compromise is around 50 or 100 ms (20 or 10 times per second). It's responsive enough for even real-time applications (because human perception is a lot slower than 20Hz) and long enough to have little effect on the rest of the system.
I need to compare the performance of 3 different computers, each running a web server. My idea is that, given the same PHP script to process on each server, the one able to serve the largest number of clients at a given load limit will be the most powerful one.
To implement this, I have a single PHP script which basically does some heavy maths calculations. I am maintaining the client count as a static value. The script will run indefinitely until, say, the CPU load is 95%; when the load reaches 95% the script should stop for all clients.
At that limit, the system with the largest client count will be the best performer.
The general structure of this PHP script is:
static $clients_count = 0;
static $sys_load = 0;
// increment clients_count
$clients_count++;
while ($sys_load <= 95)
{
    do_heavy_maths();
    // calculate current cpu load
    $sys_load = get_cpu_load();
}
echo "No. of max. clients this server handled: $clients_count";
So now I have a few questions:
Is my approach to comparing CPU performance correct? (PS: I have to use web-based benchmarking only.)
How do I determine the number of clients connected to my server?
Please suggest a better way to find the CPU load. (It's hard to draw a maximum CPU limit using the load averages one can get by reading /proc/loadavg.)
Thanks.
Trying to do this from the inside is less accurate than running an external benchmarking tool.
Remember that the PHP core has several mechanisms that can reduce or limit CPU consumption.
With an opcode cache, for example, the same script can run much faster, and several other aspects of the configuration matter as well.
Check some tools at: http://www.opensourcetesting.org/performance.php
I strongly believe PHP is not as well suited to this as a language with closer control over memory and the CPU, such as C.
I do think it is nice to try, though. Sounds like a fun project.
Unfortunately, this is not realistic load testing. Configuration settings heavily influence the capacity a single script can use. You are only running single-threaded, while web servers rely strongly on multi-threaded capacity. Other factors play a role as well: the existing load, other processes, memory allocations and so on, not to mention network speeds.
Load is normally reported as a number like 2.0. On its own this says very little, because its meaning depends on the number of CPU cores, which is your real capacity.
For real benchmarking use real tools like ab (http://httpd.apache.org/docs/2.0/programs/ab.html) and the many other professional solutions.
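For example, something along these lines (URL and numbers are placeholders) drives a fixed number of requests at a chosen concurrency and reports requests per second and latency percentiles:

# 10,000 requests in total, 100 at a time, against the page under test
ab -n 10000 -c 100 http://server-under-test/script.php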
Is my approach to comparing CPU performance correct? (PS: I have to use web-based benchmarking only.)
No, as explained above.
How do I determine the number of clients connected to my server?
That can only be determined if you know what the clients actually do. A server delivering only static files should be set up totally differently from a server running a complex script on every request.
Please suggest a better way to find the CPU load. (It's hard to draw a maximum CPU limit using the load averages one can get by reading /proc/loadavg.)
In a separate process, log the overall system load over time and put that log next to your testing log so you can see the influences. To enforce a limit like this from PHP you will have to read the system load yourself (for example by exec'ing a command and parsing its output), and mind the note above about the number of CPUs and cores you have to take into account: the load is not just a percentage.
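As a small sketch of that idea (the use of sys_getloadavg() and nproc, and the 0.95 ceiling, are assumptions to adjust to your setup):

<?php
// Read the 1-minute load average and normalize it by the core count,
// since a raw load figure like 2.0 means little without knowing the cores.

$load  = sys_getloadavg()[0];              // built-in; 1-minute average
$cores = max(1, (int) shell_exec('nproc')); // core count on Linux (assumed available)

$loadPerCore = $load / $cores;
printf("load %.2f over %d cores = %.2f per core\n", $load, $cores, $loadPerCore);

// The asker's loop could stop handing out work once this crosses a ceiling:
if ($loadPerCore >= 0.95) {
    // stop accepting more clients / break out of the heavy-maths loop
}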