PHP Websockets server stops accepting connections after 256 users

I am running a WebSocket server using https://github.com/ghedipunk/PHP-Websockets/blob/master/websockets.php on an Ubuntu 16 box with PHP 7.
After 256 users connect to the websocket, it stops accepting connections and I can't figure out why. In the client, I get a 1006 error code (connection was closed abnormally (locally) by the browser implementation) and no further information. The websocket request doesn't appear to make it to the websockets server (which normally echoes "Client Connected" right after a socket connection is made).
In the connect() function, one of the things I do is log the count of users, sockets, and overall memory usage. This problem occurs whenever the user count hits 256 (at which point the socket count is 257 and memory usage is around 4 MB). The fact that it happens at exactly 256 makes me think a limit is being hit somewhere, but I can't find that limit. If I restart the websockets server, everything works fine again.
From my investigation so far, I have tried and checked:
ulimit (says it's unlimited)
MySQL connection limit (was set to default, now is 1000, but that didn't help)
Increased the PHP memory limit (because why not, just to see)
SOMAXCONN is set to 128, so I don't think this is the problem, but I would have to recompile PHP to test changing it, which I haven't tried yet.
Apache: the message I get when the problem occurs is [proxy:error] [pid 16785] (111)Connection refused: AH00957: WS: attempt to connect to 10.0.0.240:9000 (websockets.mydomain.com) failed, which doesn't tell me much. Apache is running MPM prefork, and I have increased the spare servers and MaxRequestWorkers.
I am open to any suggestions as to where to look next, or how to get more detail out of the "Connection refused" error in the Apache log!
Thanks

It's Apache that's holding you up. Try setting the following in your conf file:
MaxClients 512
ServerLimit 512
(you must set both)
Of course, you can use whatever numbers work for you. With mpm_prefork you should be able to go up to 20,000, but that really shouldn't be necessary.
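For reference, a minimal mpm_prefork sketch, assuming Apache 2.4 (where MaxClients was renamed MaxRequestWorkers; the old name still works as an alias). Note that prefork's compiled-in default for ServerLimit and MaxRequestWorkers is 256, which matches the cutoff you are seeing. The numbers below are illustrative:
<IfModule mpm_prefork_module>
    ServerLimit          512
    MaxRequestWorkers    512    # "MaxClients" on Apache 2.2 and earlier
    StartServers         10
    MinSpareServers      10
    MaxSpareServers      20
</IfModule>
ServerLimit is only read at startup, so do a full restart of Apache rather than a graceful reload after changing it.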

Related

How to change URL timeout settings on linux webserver

I have some cron jobs set up on Linux through wget; they run once every 24 hours. All the jobs basically call APIs, pull data, and store it in the database. The issue is that some API calls are very slow and take a long time to respond, which eventually ends with the error below.
--2017-07-24 06:00:02--  http://wwwin-cam-stage.cisco.com/cron/mh.php
Resolving wwwin-cam-stage.cisco.com (wwwin-cam-stage.cisco.com)... 171.70.100.25
Connecting to wwwin-cam-stage.cisco.com (wwwin-cam-stage.cisco.com)|171.70.100.25|:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

--2017-07-24 06:05:03--  (try: 2)  http://wwwin-cam-stage.cisco.com/cron/mh.php
Connecting to wwwin-cam-stage.cisco.com (wwwin-cam-stage.cisco.com)|171.70.100.25|:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

--2017-07-24 06:10:05--  (try: 3)  http://wwwin-cam-stage.cisco.com/cron/mh.php
Connecting to wwwin-cam-stage.cisco.com (wwwin-cam-stage.cisco.com)|171.70.100.25|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 0 [text/html]
Saving to: ‘mh.php.6’

     0K                                                        0.00 =0s

2017-07-24 06:14:58 (0.00 B/s) - ‘mh.php.6’ saved [0/0]
Though the third try returned 200 OK, the actual data is messed up because the first and second tries timed out.
How can I change the URL timeout settings to an unlimited (or very high) value so the job completes in one try, without getting errors like
(Connection reset by peer)?
wget --timeout 10 http://url
This option can be used to control wget's network timeout.
EDIT
Or, if you are asking about the keep-alive behaviour of the Linux machine itself, this might help.
On Red Hat Linux, modify the following kernel parameter by editing the /etc/sysctl.conf file, then restart the network daemon (/etc/rc.d/init.d/network restart) or reload the settings with sysctl -p.
"Connection reset by peer" is the TCP/IP equivalent of slamming the
phone back on the hook. It's more polite than merely not replying,
leaving one hanging. But it's not the FIN-ACK expected of the truly
polite TCP/IP converseur.
Code:
# Decrease the default value of tcp_keepalive_time (7200 seconds)
net.ipv4.tcp_keepalive_time = 1800
EDIT
-T seconds
‘--timeout=seconds’
Set the network timeout to seconds seconds. This is equivalent to specifying ‘--dns-timeout’, ‘--connect-timeout’, and ‘--read-timeout’, all at the same time.
When interacting with the network, Wget can check for timeout and abort the operation if it takes too long. This prevents anomalies like hanging reads and infinite connects. The only timeout enabled by default is a 900-second read timeout. Setting a timeout to 0 disables it altogether. Unless you know what you are doing, it is best not to change the default timeout settings.
All timeout-related options accept decimal values, as well as subsecond values. For example, ‘0.1’ seconds is a legal (though unwise) choice of timeout. Subsecond timeouts are useful for checking server response times or for testing network latency.
‘--dns-timeout=seconds’
Set the DNS lookup timeout to seconds seconds. DNS lookups that don’t complete within the specified time will fail. By default, there is no timeout on DNS lookups, other than that implemented by system libraries.
‘--connect-timeout=seconds’
Set the connect timeout to seconds seconds. TCP connections that take longer to establish will be aborted. By default, there is no connect timeout, other than that implemented by system libraries.
‘--read-timeout=seconds’
Set the read (and write) timeout to seconds seconds. The “time” of this timeout refers to idle time: if, at any point in the download, no data is received for more than the specified number of seconds, reading fails and the download is restarted. This option does not directly affect the duration of the entire download.
Of course, the remote server may choose to terminate the connection sooner than this option requires. The default read timeout is 900 seconds.
Source: the wget manual; see that manual page for more information.
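Putting those options together, a sketch of what the cron invocation might look like (values are illustrative, not a recommendation; -O /dev/null simply discards the empty response body):
# Raise the individual timeouts instead of disabling them, and allow a retry or two.
# Equivalently, --timeout=1800 would set all three timeouts at once.
wget --dns-timeout=30 --connect-timeout=60 --read-timeout=1800 \
     --tries=3 -O /dev/null http://wwwin-cam-stage.cisco.com/cron/mh.php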

Apache Connection Time Out

I have a report that generates an array of data from a MySQL server by looping in PHP code (Laravel framework). However, the maximum the server can handle is an array with 400 rows, each containing 61 child values.
[
[1, ...,61], // row 1
.
.
.
[1,....,61] // row 400
]
Each value is calculated by running a loop that retrieves data from the MySQL server.
There is no load balancer.
I tried increasing max_execution_time = 600 (10 minutes), but it still shows the connection timed out problem. Any thoughts? Thanks.
Connection Timed Out
Description: Connection Timed Out
Server version: Apache/2.4.7 (Ubuntu) - PHP 5.6
Would need more info for a definitive answer...
What is the Apache/httpd version (there have been some bugs that relate to this)?
Is there a firewall or load balancer in the mix?
If you are sure it is still a timeout error and not, say, memory, then it is probably httpd's TimeOut directive, which defaults to 300 seconds.
If still stuck paste the exact error you are seeing.
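If it does turn out to be a timeout, a minimal sketch of the two settings usually involved (values are illustrative; speeding up the queries is the better long-term fix):
# Apache (apache2.conf / httpd.conf): allow the response to take up to 10 minutes
TimeOut 600

# PHP (php.ini): let the script itself run that long as well
max_execution_time = 600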
My PHP version was 5.6. After upgrading to PHP 7, my application's speed increased significantly. Everything works fine now.

PHP / MYSQL connection failures under heavy load through mysql.sock

I've done quite a bit of reading before asking this, so let me preface by saying I am not running out of connections, memory, or CPU, and from what I can tell, I am not running out of file descriptors either.
Here's what PHP throws at me when MySQL is under heavy load:
Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (11 "Resource temporarily unavailable")
This happens randomly under load, but the more I push, the more frequently PHP throws this at me. While this is happening I can always connect locally through the console, and from PHP through 127.0.0.1 instead of "localhost" (which uses the faster Unix socket).
Here's a few system variables to weed out the usual problems:
cat /proc/sys/fs/file-max = 4895952
lsof | wc -l = 215778 (during "outages")
Highest usage of available connections: 26% (261/1000)
InnoDB buffer pool / data size: 10.0G / 3.7G (plenty of room)
soft nofile 999999
hard nofile 999999
I am actually running MariaDB (Server version: 10.0.17-MariaDB MariaDB Server)
These results were generated both under normal load and by running mysqlslap during off hours, so slow queries are not an issue; the issue is just high connection counts.
Any advice? I can report additional settings/data if necessary - mysqltuner.pl says everything is a-ok
And again, the revealing thing here is that connecting via IP works just fine and is fast during these outages; I just can't figure out why.
Edit: here is my my.ini (some values may seem a bit high from my recent troubleshooting changes, and please keep in mind that there are no errors in the MySQL logs, system logs, or dmesg)
socket=/var/lib/mysql/mysql.sock
skip-external-locking
skip-name-resolve
table_open_cache=8092
thread_cache_size=16
back_log=3000
max_connect_errors=10000
interactive_timeout=3600
wait_timeout=600
max_connections=1000
max_allowed_packet=16M
tmp_table_size=64M
max_heap_table_size=64M
sort_buffer_size=1M
read_buffer_size=1M
read_rnd_buffer_size=8M
join_buffer_size=1M
innodb_log_file_size=256M
innodb_log_buffer_size=8M
innodb_buffer_pool_size=10G
[mysql.server]
user=mysql
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
open-files-limit=65535
Most likely it is due to net.core.somaxconn. What is the value of /proc/sys/net/core/somaxconn?
net.core.somaxconn
# The maximum number of "backlogged sockets". Default is 128.
These are connections sitting in the listen queue that have not yet been accepted; anything beyond that queue length is rejected. I suspect this in your case. Try increasing it according to your load.
As the root user, run:
echo 1024 > /proc/sys/net/core/somaxconn
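The echo above only lasts until the next reboot; a sketch of making it persistent, assuming a sysctl.conf-managed box:
# /etc/sysctl.conf
net.core.somaxconn = 1024

# apply the file without rebooting
sysctl -p
Note that the back_log=3000 in the my.ini above is effectively capped by the kernel's somaxconn value, so raising only the MariaDB side does not help.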
This is something that can and should be solved by analysis. Learning how to do this is a great skill to have.
Analysis of just what is happening under heavy load (number of queries, execution times) should be your first step. Determine the load and then make the proper DB config settings. You might find you need to optimize the SQL queries instead!
Then make sure the PHP DB driver settings are aligned as well, to fully utilize the database connections.
Here is a link to the MariaDB threadpool documentation. I know it says version 5.5, but it's still relevant and the page does reference version 10. There are settings listed there that may not be in your .cnf file yet that you can use.
https://mariadb.com/kb/en/mariadb/threadpool-in-55/
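For reference, a minimal sketch of the thread-pool settings that page covers (the variable names are MariaDB server options; the values are illustrative and should be tuned to your workload):
[mysqld]
thread_handling         = pool-of-threads
thread_pool_size        = 16     # roughly the number of CPU cores
thread_pool_max_threads = 500    # upper bound on pool worker threads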
Off the top of my head, I can think of max_connections as a possible source of the problem. I'd increase the limit, if only to eliminate the possibility.
Hope it helps.

Memcached concurrency w/ lighttpd php

I'm having an issue with memcached. I'm not sure if it's memcached, PHP, or TCP sockets, but every time I benchmark a page that uses memcached with a concurrency of 50 or more, some of those requests fail under Apache ab. I get the (99) Cannot assign requested address error.
When I do a concurrency test of 5000 against a regular phpinfo() page, everything is fine; no failed requests.
It seems like memcached cannot support high concurrency, or am I missing something? I'm running memcached with the -c 5000 flag.
Server: (2) quad-core Xeon 2.5 GHz, 64 GB RAM, 4 TB RAID 10, 64-bit OpenSUSE 11.1
Ok, I've figured it out. Maybe this will help others who have the same problem.
It seems like the issue can be a combination of things.
Set server.max-worker in lighttpd.conf to a higher number.
Original: 16 Now: 32
Turn off keep-alive in lighttpd.conf; it was keeping connections open for too long.
server.max-keep-alive-requests = 0
Change ulimit -n open files to a higher number.
ulimit -n 65535
If you're on Linux, use:
server.event-handler = "linux-sysepoll"
server.network-backend = "linux-sendfile"
Increase max-fds
server.max-fds = 2048
Lower the TCP TIME_WAIT before recycling; this closes connections faster.
In /etc/sysctl.conf add:
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 3
Make sure you force it to reload with: /sbin/sysctl -p
After making these changes, my server now handles 30,000 concurrent connections and 1,000,000 total requests in Apache ab without any issues, failed requests, or write errors.
Command used to benchmark: ab -n 1000000 -c 30000 http://localhost/test.php
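For convenience, the lighttpd.conf side of the changes above collected in one place (same values as listed; adjust to your hardware):
# lighttpd.conf
server.max-worker              = 32
server.max-keep-alive-requests = 0                  # effectively disables keep-alive
server.event-handler           = "linux-sysepoll"
server.network-backend         = "linux-sendfile"
server.max-fds                 = 2048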
My Apache can't get even close to this benchmark. Lighttpd makes me laugh at Apache now; Apache crawls at around 200 concurrency.
I'm using just a 4-byte integer as a page counter for testing purposes. Other PHP pages work fine even with 5,000 concurrent connections and 100,000 requests. This server has a lot of horsepower and RAM, so I know that's not the issue.
The page that seems to die has nothing but 5 lines of code to test the page counter using memcached. Making the connection gives me this error: (99) Cannot assign requested address.
The problem starts to arise at 50 concurrent connections.
I'm running memcached with -c 5000 for 5000 concurrency.
Everything is on one machine (localhost)
The only processes running are SSH, lighttpd, PHP, and memcached
There are no users connected to this box (test machine)
Linux -nofile is set to 32000
That's all I have for now; I'll post more information as I find more. It seems like there are a lot of people with this problem.
I just tested something similar with a file:
$mc = memcache_connect('localhost', 11211);       // connect to the local memcached instance
$visitors = memcache_get($mc, 'visitors') + 1;    // read the counter (false, treated as 0, if the key is missing)
memcache_set($mc, 'visitors', $visitors, 0, 30);  // write it back: no flags, 30-second expiry
echo $visitors;
running on a tiny virtual machine with nginx, php-fastcgi, and memcached.
I ran ab -c 250 -t 60 http://testserver/memcache.php from my laptop in the same network without seeing any errors.
Where are you seeing the error? In your php error log?
This is what I used for Nginx/php-fpm, adding these lines to /etc/sysctl.conf (Rackspace dedicated servers with Memcached/Couchbase/Puppet):
# Memcached fix
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 3
I hope it helps.

Apache/PHP closes connection after short time (12 secs)

I am running into a peculiar problem: Apache closes the connection after 12 seconds or so, which leads to a "connection reset by peer" message in the browser.
I am on CentOS 5 Linux, using Apache 2 / PHP 5.x / mod_gzip (PHP with eAccelerator).
I tested some variations:
Usually, I print all the HTML output as the last step. The connection is always closed when the processing time goes above 12 seconds.
If the output happens sooner (< 12 secs), the connection is not closed and I get the page in the browser.
If I print something regularly (every second or so), the connection is not closed even if the processing time goes above 12 seconds.
What could be the possible issue here? Any suggestions on fixing this issue?
Edit - More details:
The Apache access log shows status code 200.
The Timeout directive is set; its value is 60.
php.ini: max_execution_time is set to 30 seconds.
Client and server are on different machines. It is a direct connection with no proxies in between (Edit 2: the ISP routes all requests through its proxy).
Apache is standalone.
On the software side,
What status code is logged in access.log?
Do you (perchance) have a Timeout directive in your httpd.conf (or inside any other files that may be included from httpd.conf)?
What is max_execution_time configured to be in php.ini?
Is your Apache being used as a reverse-proxy, or is it stand-alone?
On the network side,
Are the server and your client (browser PC) on the same machine, or is there a proxy, firewall or router in-between?
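Given the detail that printing something every second keeps the connection alive (which points at an idle timeout somewhere on the path, e.g. the ISP proxy mentioned in Edit 2), a common workaround is to flush a little padding while the long processing runs. A minimal sketch; process_chunk() and the chunk count are placeholders for the real work:
<?php
// Turn off output buffering so each echo reaches the client immediately.
while (ob_get_level() > 0) {
    ob_end_flush();
}
ob_implicit_flush(true);

$html = '';
for ($i = 0; $i < 30; $i++) {
    $html .= process_chunk($i);  // placeholder for one step of the real processing
    echo ' ';                    // harmless padding keeps bytes flowing
    flush();
}

echo $html;                      // send the real output once processing is done
Note that mod_gzip or other output compression can buffer the padding, so it may need to be disabled for this page for the trick to work.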
