How to run 50k jobs per second with gearman - php

According to the Gearman website:
"A 16 core Intel machine is able to process upwards of 50k jobs per second."
I have a load balancer that distributes traffic to 4 different machines. Each machine has 8 cores. I want the ability to run 13K jobs per machine per second (that's definitely more than 50K jobs in total).
Each job takes between 0.02 and 0.8 ms.
How many workers do I need to run for this kind of performance?
What steps do I need to take to start that many workers?

Depending on what kind of processing you're doing, this will require a little experimentation and load testing. Before you start, make sure you have a way to reboot the server without SSH, as you can easily peg the CPU. Follow these steps to find the optimum number of workers:
1. Begin by adding a number of workers equal to the number of cores minus one. If you have 8 cores, start with 7 workers (hopefully leaving a core free for things like SSH).
2. Run top and observe the load average. The load average should not be higher than the number of cores. For 8 cores, a load average of 7 or above would indicate you have too many workers; a lower load average means you can try adding another worker.
3. If you added another worker in step 2, observe the load average again. Also observe the increase in RAM usage.
4. If you repeat the above steps, eventually you will run out of either CPU or RAM.
When doing parallel processing, keep in mind that you could run into a point of diminishing returns. Read about Amdahl's law for more information.
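If you haven't written the workers yet, here is a minimal worker sketch assuming the pecl gearman extension; the server address, the do_something job name, and the placeholder job body are all assumptions:
<?php
// Minimal Gearman worker: connect to the job server and process jobs forever.
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);   // adjust to your gearmand host/port

// 'do_something' is a hypothetical job name registered by your clients.
$worker->addFunction('do_something', function (GearmanJob $job) {
    $payload = $job->workload();
    // ... the 0.02-0.8 ms of real work goes here ...
    return strrev($payload);             // placeholder result
});

while ($worker->work()) {
    if ($worker->returnCode() !== GEARMAN_SUCCESS) {
        break;
    }
}
Start 7 copies of this script per machine (under supervisord, systemd, or a simple shell loop) and then tune the count with the load-average procedure above.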

Related

IIS - PHP 8.1.4 Performance Degrades After 2 Hours

I am running IIS 10 on Windows Server 2016. I am using PHP 8.1.4 NTS with FastCGI. PHP is optimized following the usual recommendations.
I noticed that the server's response times start to increase after about 2 hours. For example, the TTFB is roughly 150-200ms right after the IIS/worker processes are started, and sites load very quickly. Then, after about 2 hours or so, performance starts to decline: TTFB climbs and eventually plateaus at around 500ms. Sometimes it even goes as high as 800ms.
If I recycle the IIS application pool, we're back to ~200ms, where it stays for another 2 hours.
I'm trying to keep our server response times fast, and I'm curious what could be causing the performance to degrade after a few hours. Should we set the pool to recycle more often? That can work, but it seems like something else is going on, and you shouldn't have to do that.
The server does not have high CPU, disk, or RAM usage. The w3wp and php-cgi processes have very little memory usage (10-20MB each). CPU is almost always under 10%, and RAM is only 50% in use.
I have optimized the IIS FastCGI parameters and application pool parameters to the recommended settings (10k requests, etc.).
I have reviewed the MySQL 8.0 server logs for slow queries, but none were found.

Max number of PHP CLI processes

I have a PHP script that is called by an external process when certain events happen, like when an email arrives. So if during a period of time, the triggering event happens multiple times, the script is invoked multiple times as well.
What's the limit on max number of instances of the script running concurrently? How would I go about loosening the limit?
I have read various pieces of info about the max number of concurrent connections in the context of Apache/PHP, but I think the CLI context works differently.
My environment is Ubuntu 16.04LTS/PHP 7.0.
Thanks!
On Linux, the max number of processes is determined by several factors:
cat /proc/sys/kernel/pid_max
This will show the maximum PID value + 1 (i.e. the highest value PID is one less than this value). When you hit this value, the kernel wraps around. Note that the first 300 are reserved. The default value is 32768, so the maximum number of PIDs is 32767 - 300 = 32467.
This will show you the maximum number of processes that your user account can run:
ulimit -u
So if you're running all of your processes as a single user, and this number is less than the pid_max value, then this may be your upper limit.
These values can be adjusted (particularly on 64-bit systems), but they're unlikely to be your real-world upper limit. In most cases, the maximum number of processes is going to be based on your hardware resources. And I suspect that if you try to run anywhere near 32,000 PHP CLI instances, you'll run out of RAM far earlier than you'll run out of available process space.
The maximum number of processes is going to depend on what your script does, particularly how much CPU and RAM it uses. Your best bet is to kick off a script that runs X number of your CLI processes for an extended period of time, then watch the load on your system. Start with 10; if the load is negligible, bump it to 100, and continue that process until you see a noticeable load. Find out what your max load is, then do the math to figure out what your max number of processes is.
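A rough load-test launcher along those lines, assuming a placeholder worker.php and plain exec():
<?php
// Start N copies of a CLI script in the background, then watch the load average.
$count = isset($argv[1]) ? (int) $argv[1] : 10;   // start with 10, then 100, ...

for ($i = 0; $i < $count; $i++) {
    // Redirect output and background the process so exec() returns immediately.
    exec('php worker.php > /dev/null 2>&1 &');
}

// Sample the load average every 10 seconds for about 5 minutes.
for ($t = 0; $t < 30; $t++) {
    list($load1, $load5, $load15) = sys_getloadavg();
    printf("1m: %.2f  5m: %.2f  15m: %.2f\n", $load1, $load5, $load15);
    sleep(10);
}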

Is PHP long sleep() inefficient?

I have a script that needs to run in 5 parts, each 10 minutes apart. I know I can run 5 different cron jobs, but the script lends itself to being one script with 10-minute sleep()s at different points.
So I have:
set_time_limit(3600);
//code
sleep(600);
//continues
sleep(600);
//etc
Is doing this highly inefficient, or should I find a way to have it split into 5 different cron jobs run 10 minutes apart?
sleep() doesn't consume CPU time, but the sleeping process will still consume RAM because the PHP engine has to keep running. It shouldn't be a problem if you have a lot of free RAM, but I would still suggest splitting it into separate crons.
Personally, I've used long sleep (10-20 minutes) in previous web crawlers that I've written in PHP and that ran from my local 4 GB RAM machine with no problem.
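If you do go the cron route, a minimal sketch of a stage-driven version might look like this; job.php, the paths, and the exact schedule are assumptions:
<?php
// job.php: one script, five stages; cron runs it every 10 minutes with a stage number.
// Hypothetical crontab entries:
//   0  * * * * /usr/bin/php /path/to/job.php 1
//   10 * * * * /usr/bin/php /path/to/job.php 2
//   20 * * * * /usr/bin/php /path/to/job.php 3
//   30 * * * * /usr/bin/php /path/to/job.php 4
//   40 * * * * /usr/bin/php /path/to/job.php 5
$stage = isset($argv[1]) ? (int) $argv[1] : 1;

switch ($stage) {
    case 1: /* part 1 of the work */ break;
    case 2: /* part 2 */ break;
    case 3: /* part 3 */ break;
    case 4: /* part 4 */ break;
    case 5: /* part 5 */ break;
}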
It depends on the task, but generally speaking it is bad, because it ties up resources for a long time and has a high risk of being interrupted (by a system crash/reboot, or by external changes to the resources the script operates on).
I'd recommend using a job queue daemon like RabbitMQ with delayed-message features, so that after each block you can enqueue the next one to run in 10 minutes. That will save resources and increase stability.
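A rough sketch of that idea, assuming php-amqplib and RabbitMQ with the rabbitmq_delayed_message_exchange plugin enabled; the exchange/queue names, credentials, and payload are all made up:
<?php
require __DIR__ . '/vendor/autoload.php';

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;
use PhpAmqpLib\Wire\AMQPTable;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();

// Delayed exchange that holds messages until their x-delay expires.
$channel->exchange_declare('jobs.delayed', 'x-delayed-message', false, true, false, false, false,
    new AMQPTable(['x-delayed-type' => 'direct']));
$channel->queue_declare('jobs', false, true, false, false);
$channel->queue_bind('jobs', 'jobs.delayed', 'jobs');

// After finishing block N, enqueue block N+1 to run in 10 minutes (600000 ms).
$msg = new AMQPMessage(json_encode(['block' => 2]), [
    'application_headers' => new AMQPTable(['x-delay' => 600000]),
]);
$channel->basic_publish($msg, 'jobs.delayed', 'jobs');

$channel->close();
$connection->close();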

How to use RLimitCPU

How can I limit the CPU usage of Apache2 PHP scripts using
RLimitCPU seconds|max [seconds|max]
Please show me an example.
e.g. RLimitCPU 2 2 - what does that mean?
I know it's CPU seconds, but the question is how to convert GHz to seconds.
One PHP video-streaming script sometimes takes 100% CPU on 2 cores.
http://httpd.apache.org/docs/2.2/mod/core.html#rlimitcpu
1 GHz is 1,000,000,000 CPU cycles per second - so a 2.6 GHz CPU is going to go through 2,600,000,000 cycles in one second. How many instructions actually get executed in a cycle is going to vary with the CPU - they'll all take a certain number of cycles to actually complete an instruction.
2 CPU seconds is "the CPU is completely maxed out for two full seconds or the equivalent". So if your program uses the CPU at half capacity for 4 full seconds that's 2 CPU seconds.
For your app, if you have a 2.6 GHz CPU and you run at 2 CPU seconds, you'll have executed 5,200,000,000 CPU cycles. How many instructions that is, is harder to work out, and how many instructions you actually need for your "video streaming script" is going to be incredibly hard to work out (and is going to vary with the length of the video).
I'd advise just running the script for the biggest video you'd ever send, seeing how many CPU seconds you use (top -u apache-usr will let you see the PHP process running, "TIME+" column is CPU time) and then tripling that as your RLimitCPU.
Bear in mind that RLimitCPU is just going to kill your PHP script when it takes more CPU time than the limit. It's not some magical tool that means your script will take less CPU time, it's just a limit on the maximum time the script can take.
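If you want the script to report its own CPU consumption (rather than eyeballing top), a small sketch using PHP's getrusage() can help; run it at the end of a pass over your biggest video:
<?php
// Report user + system CPU seconds consumed by this PHP process so far.
// Useful for picking an RLimitCPU value (e.g. triple the worst case you measure).
$u = getrusage();
$cpuSeconds = $u['ru_utime.tv_sec'] + $u['ru_utime.tv_usec'] / 1e6
            + $u['ru_stime.tv_sec'] + $u['ru_stime.tv_usec'] / 1e6;
printf("CPU seconds used: %.2f\n", $cpuSeconds);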
Apache Reference: http_core, RLimitCPU
RLimitCPU
Resource Limit on CPU Usage
Syntax: RLimitCPU soft-seconds [hard-seconds]
Example: RLimitCPU 60 120
Since: Apache 1.2
This directive sets the soft and hard limits for maximum CPU usage of a process in seconds. It takes one or two parameters. The first parameter, soft-seconds, sets the soft resource limit for all processes. The second parameter, hard-seconds, sets the maximum resource limit. Either parameter can be a number, or "max", which indicates to the server that the limit should match the maximum allowed by the operating system configuration. Raising the maximum resource limit requires the server to be running as the user "root", or in the initial start-up phase.
http://www.apacheref.com/ref/http_core/RLimitCPU.html

How much CPU usage is considered high on a Linux server?

I'm running a few PHP jobs which fetch hundreds of thousands of records from a webservice and insert them into a database. These jobs take up most of the server's CPU.
My question is: how much is considered high?
When I run the top command on the Linux server, it shows about 77%. It goes above 100% if I run more jobs simultaneously. That seems high to me (does more than 100% mean it is running on the second CPU?):
28908 mysql 15 0 152m 43m 5556 S 77.6 4.3 2099:25 mysqld
7227 apache 15 0 104m 79m 5964 S 2.3 7.8 4:54.81 httpd
This server also has web pages/projects hosted on it. The hourly job seems to be affecting the server as well as the other web projects' loading times.
If this is high, is there any way of making the jobs more efficient on the CPU?
Can anyone enlighten me?
A better indicator is the load average; to simplify, it is the number of tasks waiting because of insufficient resources.
You can see it with the uptime command, for example: 13:05:31 up 6 days, 22:54, 5 users, load average: 0.01, 0.04, 0.06. The three numbers at the end are the load averages for the last minute, the last 5 minutes, and the last 15 minutes. If it reaches 1.00 (regardless of the number of cores), something is waiting.
I'd say 77% is definitely high.
There are probably many ways to make the job more efficient (recursive import), but not much info is given.
A quick fix would be to invoke the script with the nice command
and add a few sleeps to stretch the load out over time.
I guess you also saturate the network during the import, so if you can split up the job, that would prevent your site from stalling.
regards,
/t
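A rough sketch of the nice-plus-sleep idea from inside PHP, checking the load average between batches (the 2.0 threshold and the function name are just examples):
<?php
// Between batches, back off while the 1-minute load average is above a threshold.
function wait_for_quiet_load(float $threshold = 2.0): void
{
    list($load1) = sys_getloadavg();
    while ($load1 > $threshold) {
        sleep(30);
        list($load1) = sys_getloadavg();
    }
}

// ... fetch a chunk from the webservice ...
wait_for_quiet_load();
// ... insert the chunk into the database, then repeat ...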
You can always nice your tasks
http://unixhelp.ed.ac.uk/CGI/man-cgi?nice
With the nice command you can give processes more or less priority.
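The same effect is available from inside the script itself via PHP's proc_nice(); a tiny sketch (positive values lower the priority, Unix only):
<?php
// Lower this process's priority, roughly equivalent to launching it under nice.
if (function_exists('proc_nice')) {
    proc_nice(10);
}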
These jobs take up most of the server's CPU.
My question is: how much is considered high?
That is entirely subjective. On computing nodes, the CPU usage is pretty much 100% per core all the time. Is that high? No, not at all, it is proper use of hardware that has been bought for money.
Nice won't help much, since it's MySQL that's occupying your CPU;
putting nice on the PHP client, as in
nice -10 php /home/me/myjob.php
won't make any significant difference.
Better to split up the job into smaller parts, call your PHP script
from cron, and build it like:
<?php
ini_set("max_execution_time", "600");

// 1. Get the file from the remote server in chunks, to avoid saturating the network
$fp  = fopen('http://example.org/list.txt', 'r');
$fp2 = fopen('local.txt', 'w');
while (!feof($fp)) {
    fwrite($fp2, fread($fp, 10000));
    sleep(5);
}
fclose($fp);
fclose($fp2);

// 2. Insert the data in small batches, pausing between batches
$fp = fopen('local.txt', 'r');
while (!feof($fp)) {
    // read 1000 lines
    // do insert..
    sleep(10);
}
fclose($fp);
// finished, now rename to .bak, log success or whatever...
