I'm using Redis (2.6.8) with php-fpm and the phpredis driver, and I'm having trouble with Redis latency. Under certain load, the first request to Redis from our application takes about 1-1.5s, and redis-cli --latency shows the same latency.
I've already checked the latency guide:
- We use Redis on the same host, over Unix sockets.
- slowlog has no entries longer than 5ms.
- We don't use AOF.
- Redis takes about 3.5GB of the 16GB of memory available (I suppose that's not too much).
- Our system is not swapping.
- There is no other process doing disk I/O.
- I'm using persistent connections, and the number of connected clients varies from 5 to 25 (sometimes spiking to 60-80).
Here is the graph.
It looks like the problems start when there are 20 or more simultaneously connected clients.
Can you help me figure out where the problem is?
Update
I investigated the problem, and it looks like Redis did not get enough processor time, for some reason, to operate properly.
I thoroughly checked the communication between php-fpm and Redis with a network sniffer. Redis received the request over TCP but sent the answer back only after one and a half seconds. That clearly indicated the problem was inside Redis: it could not process that many requests under the given conditions (possibly processor starvation, as the CPU was only 50% loaded for the whole system).
The problem was resolved by moving Redis to another server that was nearly idle. I suppose we could have played with the Linux scheduler to make it work on the same server, but we have not done that yet.
Bear in mind that Redis is single-threaded. If the operations that you're doing err on the processor-intensive side, your requests could be blocking on each other. For instance, if you're doing HVALS against hashes with very large values, you're going to make all of your clients wait while you pull out all that data and copy it to the output buffer.
Part of what you need to do here (regardless of whether this is the issue) is to look at all of the commands you're using and determine the complexity of each one. If you're running a bunch of O(N) commands against very large amounts of data, it's not impossible that you're simply doing too much at a time. A quick way to time suspicious calls from PHP is sketched below.
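For example, a minimal sketch that times a single phpredis call so you can spot the expensive ones (the socket path and key name are placeholders):

    <?php
    // Time one phpredis call to see how long it blocks the server.
    $redis = new Redis();
    $redis->connect('/var/run/redis/redis.sock'); // Unix socket, as in the question

    $start = microtime(true);
    $values = $redis->hVals('some:big:hash');     // O(N) in the size of the hash
    $elapsed = (microtime(true) - $start) * 1000;

    printf("HVALS returned %d values in %.1f ms\n", count($values), $elapsed);

Anything that consistently takes tens of milliseconds is a candidate for making every other client wait while it runs.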
TL;DR: Nobody here can debug this issue with real certainty without knowing which commands you're using and what your data looks like. But you can look up the time complexity of each command you're using and make sure it's reasonable.
I ran across this while researching an issue I'm working on, but thought it might help here:
https://groups.google.com/forum/#!topic/redis-db/uZaXHZUl0NA
If you read through the thread there is some interesting info.
ReactPHP HTTP server for each user: is this a good idea?
In my application:
Each logged-in user sends data to and receives data from the server, on average one request per second.
After the server responds, it has some extra work to do that is specific to that user.
I could simply build a new ReactPHP HTTP server for each user who logs in, and release the server after the user logs out.
Will this work? Am I missing something?
No, it's not a good idea. You need a separate port per user in that case to route the user to the right server. That'd quickly exhaust your ports.
If you have blocking tasks within the event loop and want to use multiple processes because of that, just stick to traditional PHP with mod_php or php-fpm and start a new event loop for each process, do your thing and then exit.
If you don't have any blocking operations and everything is non-blocking, you can just use a single server and it handles all the things.
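A minimal single-process sketch along those lines, assuming a current react/http and react/socket (the port and response body are arbitrary):

    <?php
    require 'vendor/autoload.php';

    use Psr\Http\Message\ServerRequestInterface;
    use React\Http\HttpServer;
    use React\Http\Message\Response;
    use React\Socket\SocketServer;

    // One event loop, one process: every connected user is served here.
    $http = new HttpServer(function (ServerRequestInterface $request) {
        // Only non-blocking work belongs in this handler.
        return new Response(200, ['Content-Type' => 'text/plain'], "ok\n");
    });

    $http->listen(new SocketServer('0.0.0.0:8080'));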
I'm not sure that exhausting ports would be the issue; other services, such as WebRTC SFUs, do just this. With 65,535 ports available, you're talking about 30,000+ concurrent TCP connections.
However, with that many users, the first obvious problem would be memory. At 10MB just to start up PHP, that would be 300+ GB of memory without including a single line of code or actually doing anything. If you're working with a seriously trimmed PHP binary, you can get down to 4 or 5MB, so at 5,000 concurrent users you would still be at around 25GB.
But the real problem is that it would result in thousands of processes, which there is no way around. This would be entirely wasteful, considering ReactPHP's event loop can handle 10k users within a single process. I'm not saying a single PHP process can do the work for that many users (except maybe for the most basic chat), but ReactPHP can handle the I/O. Throwing each user into their own process, though, would be a nightmare.
The basic idea has been tried in other languages by giving each user their own thread, but even in C/C++ that quickly proved to be a bad design.
I have been using elasticsearch for an e-commerce site for quite some time - not only for search, but also to retrieve product data (/index/type/{id}) to avoid SQL queries.
Generally this works really well and most requests are answered between 1ms and 3ms. But there are some requests which take 100ms-250ms - just for a GET request like /index/type/{id}, where no actual searching is done and which normally takes 1-2ms. It seems to me that something must be wrong if such a response takes more than 100ms, because the server has a lot of RAM & a fast 6-core-CPU, the data is stored on very fast SSDs, there are only 150'000 entries (about 300MB in Elasticsearch) and there is almost no load. Elasticsearch has 5GB of RAM, and there is enough spare RAM for Lucene to cache all entries all the time. Requests are made through a local network with a dedicated switch. The index has only one shard and I am running Elasticsearch 2.3.
I am making the requests from PHP. I have already tried using Nginx as a reverse proxy for Elasticsearch, but that did not solve anything: it happens with and without Nginx in between.
Edit: Slow requests happen about 1% of the time (relative to the total number of requests). I can also reproduce it by just making 1000 requests from PHP to /index/type/{id} in Elasticsearch (see the sketch below): 1% will always be really slow, even when using the same ID, like /index/type/55 (as long as the ID exists). This also means there is no "cache effect": after the first request Elasticsearch should have the data "ready", yet the number of slow requests is the same no matter which IDs I request, or whether I request the same ID over and over.
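The reproduction loop looks roughly like this (a sketch; the host, index, type and ID are placeholders):

    <?php
    // Time 1000 GETs for the same document and report the slowest ones.
    $ch = curl_init('http://elasticsearch:9200/index/type/55');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $timings = [];
    for ($i = 0; $i < 1000; $i++) {
        $start = microtime(true);
        curl_exec($ch);
        $timings[] = (microtime(true) - $start) * 1000; // milliseconds
    }
    curl_close($ch);

    rsort($timings);
    printf("slowest: %.1f ms / 99th percentile: %.1f ms / median: %.1f ms\n",
        $timings[0], $timings[9], $timings[499]);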
Edit2: I have looked at the stats of my nodes with Marvel & Kibana, and nothing indicates slowdowns there: between 20-40% of JVM heap memory is used, and almost no latency (between 0.1ms and 0.5ms). It confirms that there are more than enough resources and I see no correlations or hints for the cause of any slow requests.
After a lot of testing, these are my definitive results:
The larger the response from Elasticsearch, the more likely a slow request becomes. Many small responses are MUCH less likely to be exceptionally slow than one large response.
Bombarding Elasticsearch with simple GET requests in parallel reduces the likelihood of slow responses.
When running a simple search for one keyword over and over again, Elasticsearch reports in the response that it "took" 2-3ms, even when the response takes 200ms to reach my application. But here too: the larger the response, the higher the chance of a slow response. A 1KB response is never slow when I run loops of requests, a 2.5KB response is slightly slow (30ms) in very few instances, and a 10KB response always yields up to 1% of slow requests, taking up to 200ms.
I have considered that it might be a network "problem", especially since Elasticsearch thinks it is fast even when it is slow. But that would be a strange root cause, because my setup is so standard (Debian Jessie). Also, keep-alive connections and TCP_NODELAY did nothing to improve the problem.
Does anybody know how to find the root cause, and what could possibly be happening?
I finally found the reason for the measurable slow responses: it was the network driver, or maybe even the hardware implementation on the network card.
When running the tests from the node itself, the slow responses disappeared. I also noticed that the older servers (8 years old, compared to the only-2-year-old newer ones) had no slow responses when running the tests on them. This indicated the requesting server was at fault, not the responding Elasticsearch server; it also indicated the network itself was fine, because only the "new" servers had this problem.
I went down the rabbit hole of TCP/network settings and found ethtool, which shows the network configuration (ethtool -k eth1 lists the current settings) and also allows changing it. I learned there is something called "offloading", where a lot of network operations are offloaded to the network card (especially splitting up requests and responses into segments), and tried the following command to disable all offloading:
    ethtool -K eth1 tx off rx off sg off tso off ufo off gso off gro off lro off rxvlan off txvlan off rxhash off
Afterwards my 1000-identical-requests tests against Elasticsearch were as fast as expected: no slow requests anymore. My network card (Intel® 82574L dual-port GbE LAN on a SuperMicro X9SRL-F, running the e1000e driver) seems to do something in hardware that slows down or holds back responses. The older servers run the tg3 driver; offloading is enabled on them too (according to ethtool), but there it does not cause these delayed responses. Disabling offloading had no noticeable effect on CPU load, which is probably to be expected with any modern CPU.
With the new settings, I was able to lower the number of slow pages caused by slow Elasticsearch responses to 0.07%, where before it was about 1%. I also noticed that using Nginx as a reverse proxy for Elasticsearch caused some slow responses of its own, even though not many: usually about 3-5 responses out of every 150'000 were above 50ms. Without Nginx, querying Elasticsearch directly, I am now unable to reproduce any slow requests at all, even at a grand scale.
UPDATE 11/2017
After updating to Debian Stretch and running the server on kernel 4.9, all remaining "slow requests" disappeared. So this problem seems to be at least partly rooted in older Linux kernels.
I'm building a PHP application with an API that has to be able to respond very rapidly (within 100ms) to all requests, and must be able to handle up to 200 queries per second (requests are in JSON, and responses require a DB lookup plus a save every time). My code runs easily fast enough (very consistently around 30ms) for single requests, but as soon as it has to respond to multiple requests per second, the response times start jumping all over the place.
I don't think it's a memory problem (PHP's memory limit is set to 128MB and the code's memory usage is only around 3.5MB) or a MySQL problem (the code before any DB request is as likely to bottleneck as the part that interacts with the DB).
Because the timing is so important, I need to get the response times as consistent as possible. So my question is: are there any simple tweaks I can make (to php.ini or Apache) to stabilise PHP's response times when handling multiple simultaneous requests?
In my experience, one of the slowest things in a server in terms of bottlenecks (and one of the easiest to fix) is the filesystem and hard drives. I think speeding this up will help out in all other areas.
So you could, for example, upgrade the hard drive where your httpdocs and database reside: put them on an SSD, or even create a RAM disk and place all the files on it.
Alternatively, you can set up your database so that it operates off the MEMORY storage engine, as in the sketch below.
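For instance, a small lookup table can be pinned in RAM like this (a hypothetical table; note that the MEMORY engine does not support BLOB/TEXT columns):

    <?php
    // Hypothetical example: keep a small, hot table entirely in RAM.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $pdo->exec('CREATE TABLE session_cache (
        id CHAR(32) NOT NULL PRIMARY KEY,
        payload VARBINARY(2048)
    ) ENGINE=MEMORY');

Bear in mind the data is gone after a restart, so this only suits cache-like tables.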
Of course, for all of that you'll need a lot of physical memory. It is also important to note that if your web/app hosting is shared, you're going to have problems with shared memory.
Tune MySQL
Tune Apache
Performance tune PHP
Get Zend Optimizer enabled, or look at APC or eAccelerator (a quick check for these is sketched after this list)
Here are some basic LAMP tuning tips from IBM
Here's a slideshare with some good advice as well
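As a quick, illustrative way to verify which of those caching extensions is actually loaded:

    <?php
    // Report which opcode/user cache extensions are loaded.
    foreach (['apc', 'eaccelerator'] as $ext) {
        printf("%s: %s\n", $ext, extension_loaded($ext) ? 'loaded' : 'not loaded');
    }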
I am trying to write a client-server app.
Basically, there is a Master program that needs to maintain a MySQL database keeping track of the processing done on the server side,
and a Slave program that queries the database to see what to do to stay in sync with the Master. There can be many slaves running at the same time.
All the programs must be able to run from anywhere in the world.
For now, I have tried setting up a MySQL database on a shared hosting server to host the DB,
and made C++ programs for the master and the slave that use the cURL library to make requests to a PHP file (e.g. www.myserver.com/check.php) located on my hosting server.
The master program calls the URL every second, and some PHP code executes to keep the database up to date. I did a test with a single slave program that also calls the URL every second and executes PHP code that queries the database.
With that setup, however, my web host suspended my account and told me that I was 'using too much CPU resources' and that I would need a dedicated server ($200 per month rather than $10), based on their analysis of the CPU resources needed. And that was with one master and only one slave, so no more than 5-6 MySQL queries per second. What would it be with 10 slaves?
Am I missing something?
Would there be a better setup than what I was planning in order to achieve the syncing mechanism I need between two or more far-apart programs?
I would use Google App Engine for storing the data. You can read about free quotas and pricing here.
I think the syncing approach you are taking is probably fine.
The more significant question you need to ask yourself is: what is the maximum acceptable time between syncs? If you truly need virtually realtime syncing between two databases on opposite sides of the world, then you will be using significant bandwidth, and you will unfortunately have to pay for it, as your host pointed out.
Figure out what is acceptable to you in terms of time. Is it okay for the databases to sync only once a minute? Once every 5 minutes?
Also, when running syncs in rapid succession like this, it is important to make sure they do not overlap: before a sync starts, check whether one is already in progress and has not finished yet. If a sync is still running, don't start another; if not, go ahead. This prevents a lot of unnecessary overhead and syncs piling up on top of each other. One simple way to do the check is sketched below.
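A minimal sketch of that guard using an exclusive file lock (the lock file path and the run_sync() routine are made up for illustration):

    <?php
    // Skip this run if another sync is still holding the lock.
    $lock = fopen('/tmp/sync.lock', 'c');
    if (flock($lock, LOCK_EX | LOCK_NB)) {
        run_sync();               // hypothetical sync routine
        flock($lock, LOCK_UN);
    } else {
        // a sync is already in progress; do nothing this round
    }
    fclose($lock);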
Are you using a shared web host? What you are doing sounds like excessive use for a shared (cPanel-type) host; use a VPS instead. You can get an unmanaged VPS with 512MB for 10-20 USD per month, depending on spec.
Edit: if your bottleneck is CPU rather than bandwidth, have you tried bundling updates inside a transaction? Let's say you are getting 10 updates per second, and you decide you are happy with a propagation delay of 2 seconds. Rather than opening a connection and a transaction for each of 20 statements, bundle them together in a single transaction that executes every two seconds, roughly as sketched below. That would substantially reduce your CPU usage.
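A sketch of that batching, assuming a PDO connection and a hypothetical fetch_pending_updates() that drains the queued statements:

    <?php
    // Flush all queued updates in one transaction every two seconds
    // instead of opening a transaction per statement.
    $pdo = new PDO('mysql:host=localhost;dbname=sync', 'user', 'pass');
    while (true) {
        $batch = fetch_pending_updates();   // hypothetical queue reader
        if ($batch) {
            $pdo->beginTransaction();
            foreach ($batch as $sql) {
                $pdo->exec($sql);
            }
            $pdo->commit();
        }
        sleep(2);                           // the accepted propagation delay
    }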
We are using JMeter to test our PHP application running on the Apache 2 web server. I can load up JMeter with 25 or 50 threads and the load on the server does not increase, yet the response time from the server does: the more threads, the slower the response time. It seems like JMeter or Apache is queuing the requests. I have changed the MaxClients value in the Apache configuration file, but this does not change the problem. While JMeter is running, I can use the application myself and get respectable response times. What gives? I would expect to be able to tax my server down to 0% idle by increasing the number of threads. Can anyone point me in the right direction?
Update: I found that if I remove sessions from my application, I am able to simulate a full load on the server. I have tried re-enabling sessions and using an HTTP Cookie Manager for each thread, but it does not seem to make an impact.
You need to identify where the bottleneck is occurring, and then attempt to remediate the problem.
The JMeter client should be running on a well-equipped machine. I prefer a Solaris/Unix server running the JVM, but for <200 threads a modern Windows machine will do just fine. JMeter itself can become a bottleneck, and you won't get any meaningful results once it does. Additionally, it should run on a separate machine from the one you're testing, preferably on the same network: WAN latency can become a problem if your test rig and server are far apart.
The second thing to check is your Apache workers. Apache has a module, mod_status, which will show you the state of every worker. It's possible to have your pool size set too low; from mod_status you'll be able to see how many workers are in use. Too few, and Apache won't have any workers free to process requests, so the requests will queue up. Too many, and Apache may exhaust the memory on the box it's running on.
Next, you should check your database. If it's on a separate machine, the database could have an IO or CPU shortage.
If you're hitting a bottleneck, and the server and DB are on the same machine, you'll generally hit a CPU, RAM, or IO limit. I listed those in the order in which they are easiest to identify. If your app is CPU-bound, you can easily see CPU usage go to 100%. If you run out of RAM, your machine will start swapping, and on both Windows and Unix it's fairly easy to see the available free RAM. Lastly, you may be IO-bound. This too can be monitored with various tools or stats, but it's not as obvious as CPU.
Lastly, and specifically relevant to your question: it's possible to end up with a huge number of session files stored in a single directory, since PHP often stores session information in files. If this directory gets large, it takes PHP an increasingly long time to find a session. If you ran your test with cookies turned off, the PHP app may have created thousands of session files, one per user request. On a Windows server it will slow down faster than on a Unix server, due to differences in the way the two operating systems store directories. A quick way to check is sketched below.
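A quick diagnostic sketch (the fallback path is an assumption; adjust to your setup):

    <?php
    // Count the session files PHP has created so far.
    $dir = session_save_path() ?: sys_get_temp_dir();
    $files = glob($dir . '/sess_*');
    printf("%s contains %d session files\n", $dir, count($files));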
Are you using a Constant Throughput Timer? If JMeter can't service the throughput with the threads allocated to it, you'll see this queueing and blowouts in the response time. To figure out whether this is the problem, try adding more threads.
I have also found a report of this happening when there are JavaScript calls inside the script. In that instance, try moving the JavaScript calls to the test plan element at the top of the script, or look for ways to pre-calculate the value.
Try requesting a static file served by Apache rather than by PHP, to see whether the problem is in the Apache config or the PHP config.
Also check your network connections and configuration. Our JMeter testing was progressing nicely until it hit a wall; we eventually realized we only had a 100Mb connection and it was saturated, and going to gigabit fixed it. Your network cards or switch may be running at a lower speed than you think, especially if their speed setting is "auto".