error connections would be exceeded: 300 - php

When I try to connect to Aerospike (PHP client), I get this error:
object(Aerospike)#4 (2) {
  ["errorno":"Aerospike":private] => int(-7)
  ["error":"Aerospike":private] => string(59) "Max node BB93615E8270008 connections would be exceeded: 300"
}

The Aerospike client for PHP has a constructor config option, max_threads, which defaults to 300. The PHP client is built around the C client and passes that configuration down to the C client instance. Error status code -7 is AEROSPIKE_ERR_NO_MORE_CONNECTIONS. You could increase max_threads.
However, I'm not sure how you're getting this error. The non-ZTS PHP client is a single execution thread, and those connections should be reused. It's really only an issue in multi-threaded environments (HHVM, Java, C, etc.) where multiple commands execute in parallel. Please give more information about your code and environment.
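If raising the limit is appropriate, here is a minimal sketch of passing a larger value through the constructor config. The max_threads key follows the answer above; depending on your client version it may instead be exposed as the aerospike.max_threads ini directive, so treat the key name as an assumption and check your version's docs:

<?php
$config = [
    'hosts'       => [['addr' => '127.0.0.1', 'port' => 3000]],
    'max_threads' => 600, // assumed key name; may be the aerospike.max_threads ini setting instead
];
$client = new Aerospike($config, true); // second argument enables a persistent connection
if (!$client->isConnected()) {
    echo $client->errorno() . ' ' . $client->error() . PHP_EOL;
}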

Related

Queued Laravel Notifications get stuck on AWS SQS

I have a worker on AWS that handles queued Laravel notifications. Some of the notifications get sent out, but others get stuck in the queue and I don't know why.
I've looked at the logs in Beanstalk and see three different types of error:
2020/11/03 09:22:34 [emerg] 10932#0: *30 malloc(4096) failed (12: Cannot allocate memory) while reading upstream, client: 127.0.0.1, server: , request: "POST /worker/queue HTTP/1.1", upstream: "fastcgi://unix:/run/php-fpm/www.sock:", host: "localhost"
I see an Out of Memory issue on Bugsnag too, but without any stacktrace.
Another error is this one:
2020/11/02 14:50:07 [error] 10241#0: *2623 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 127.0.0.1, server: , request: "POST /worker/queue HTTP/1.1", upstream: "fastcgi://unix:/run/php-fpm/www.sock", host: "localhost"
And this is the last one:
2020/11/02 15:00:24 [error] 10241#0: *2698 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 127.0.0.1, server: , request: "POST /worker/queue HTTP/1.1", upstream: "fastcgi://unix:/run/php-fpm/www.sock:", host: "localhost"
I don't really understand what I can do to resolve these errors. It's just a basic Laravel / EBS / SQS setup, and the only thing the queue has to do is handle notifications, sometimes a couple of dozen at a time. I'm running a t2.micro and would assume that's enough to send a few e-mails? I've upped the environment to a t2.large, but to no avail.
I notice that messages end up in the queue, then get the status 'Messages in flight', but then run into all sorts of trouble on the Laravel side. And I don't get any useful errors to work with.
All implementation code seems to be fine, because the first few notifications go out as expected and if I don't queue at all, all notifications get dispatched right away.
The queued notifications eventually generate two different exceptions: MaxAttemptsExceededException and an Out of Memory FatalError, but neither leads me to the actual underlying problem.
Where do I look further to debug?
UPDATE
See my answer for the problem and the solution. The database transaction hadn't finished before the worker tried to send a notification for the object that still had to be created.
What is the current memory_limit assigned to PHP? You can determine this by running this command:
php -i | grep memory_limit
You can increase this by running something like:
sed -i -e 's/memory_limit = [current-limit]/memory_limit = [new-limit]/g' [full-path-to-php-ini]
Just replace the [current-limit] with the value displayed in the first command, and [new-limit] with a new reasonable value. This might require trial and error. Replace [full-path-to-php-ini] with the full path to the php.ini that's used by the process that's failing. To find this, run:
php -i | grep php.ini
First, make sure that you've increased max_execution_time and also memory_limit.
Also make sure that you set the --timeout option on the queue worker.
Then make sure you follow the instructions for Amazon SQS, as the Laravel docs say:
The only queue connection which does not contain a retry_after value is Amazon SQS. SQS will retry the job based on the Default Visibility Timeout which is managed within the AWS console.
Job Expirations & Timeouts
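If you run the worker process yourself, an illustrative invocation with those options set looks like this (the connection name and values are examples only, not recommendations):

php artisan queue:work sqs --timeout=120 --tries=3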
If you are sure that some of the queued events are correctly received and processed by the Laravel worker, then, as others have said, it's most likely a PHP memory issue.
On Beanstalk, here's what I added to my ebextensions to give PHP more memory (it was originally for Composer memory issues):
Note that this is on a t3.medium EC2 instance with 4 GB of RAM, dedicated to the Laravel API only.
02-environment.config

commands:
  ...

option_settings:
  ...
  - namespace: aws:elasticbeanstalk:container:php:phpini
    option_name: memory_limit
    value: 4096M
  - namespace: aws:ec2:instances
    option_name: InstanceTypes
    value: t3.medium
So you can try to increase the limit to use more of your instance's available RAM, then deploy again so Beanstalk rebuilds the instance and applies the new PHP memory_limit.
Note: the real config contains other configuration files and more truncated contents of course.
As you said, you are just sending an email, so it should be OK. Is it happening when there's a burst of queued emails? Are there, in the end, many events in the SQS dead-letter queue? If so, it may be because of a queued email burst, in which case SQS will "flood" the /worker route to execute your jobs. You could check server usage from the AWS console, or with CLI tools like htop, and also check the SQS interface to see whether many failed jobs arrive at the same moments (bursts).
Edit: for Elastic Beanstalk, I use dusterio/laravel-aws-worker; maybe you do too, as your log mentions the /worker/queue route.
Memory
The default amount of memory allocated to PHP can often be quite small. When using EBS, you want to use config files as much as possible - any time you have to SSH in and change things on the server, you're going to have more issues when you need to redeploy. I have this added to my EBS config /.ebextensions/01-php-settings.config:
option_settings:
  aws:elasticbeanstalk:container:php:phpini:
    memory_limit: 256M
That's been enough when running a t3.micro to do all my notification and import processing. For simple processing it doesn't usually need much more memory than the default, but it depends a fair bit on your use-case and how you've programmed your notifications.
Timeout
As pointed out in this answer already, the SQS queue operates a little differently when it comes to timeouts. This is a small trait that I wrote to help work around this issue:
<?php

namespace App\Jobs\Traits;

trait CanExtendSqsVisibilityTimeout
{
    /** NOTE: this needs to map to setting in AWS console */
    protected $defaultBackoff = 30;

    protected $backoff = 30;

    /**
     * Extend the time that the job is locked for processing
     *
     * SQS messages are managed via the default visibility timeout console setting; noted absence of retry_after config
     * @see https://laravel.com/docs/7.x/queues#job-expirations-and-timeouts
     * AWS recommends to create a "heartbeat" in the consumer process in order to extend processing time:
     * @see https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-visibility-timeout.html#configuring-visibility-timeout
     *
     * @param int $delay Number of seconds to extend the processing time by
     *
     * @return void
     */
    public function extendBackoff($delay = 60)
    {
        if ($this->job) {
            // VisibilityTimeout has a 12 hour (43200s) maximum and will error above that; no extensions if close to it
            if ($this->backoff + $delay > 42300) {
                return;
            }
            // add the delay
            $this->backoff += $delay;

            $sqs = $this->job->getSqs();
            $sqsJob = $this->job->getSqsJob();

            $sqs->changeMessageVisibility([
                'QueueUrl' => $this->job->getQueue(),
                'ReceiptHandle' => $sqsJob['ReceiptHandle'],
                'VisibilityTimeout' => $this->backoff,
            ]);
        }
    }
}
Then for a queued job that was taking a long time, I changed the code a bit to work out where I could insert a sensible "heartbeat". In my case, I had a loop:
class LongRunningJob implements ShouldQueue
{
    use CanExtendSqsVisibilityTimeout;

    //...

    public function handle()
    {
        // some other processing, no loops involved

        // now the code that loops!
        $last_extend_at = time();

        foreach ($tasks as $task) {
            $task->doingSomething();

            // make sure the processing doesn't time out, but don't extend time too often
            if (time() > $last_extend_at + $this->defaultBackoff - 10) {
                // "heartbeat" to extend visibility timeout
                $this->extendBackoff();
                $last_extend_at = time();
            }
        }
    }
}
Supervisor
It sounds like you might need to look at how you're running your worker(s) in a bit more detail.
Having Supervisor running to help restart your workers is a must, I think. Otherwise, if the worker(s) stop working, queued messages will end up getting deleted as they expire. It's a bit fiddly to get working nicely with Laravel + EBS - there isn't much good documentation around it, which is potentially why not having to manage it is one of the selling points of Vapor!
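For reference, a minimal Supervisor program sketch along the lines of the Laravel docs (the program name, artisan path, log path, and option values here are assumptions to adapt, not a drop-in config):

[program:laravel-worker]
command=php /var/www/html/artisan queue:work sqs --sleep=3 --tries=3 --timeout=120
autostart=true
autorestart=true
numprocs=1
redirect_stderr=true
stdout_logfile=/var/log/laravel-worker.log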
We finally found out what the problem was, and it wasn't memory or execution time.
From the beginning I thought it was strange that neither the default memory limit nor the default execution time was sufficient to send an e-mail or two.
Our use case is: a new Article is created and users receive a notification.
A few clues that led to the solution:
We noticed that we usually have problems with the first notification.
If we create 10 articles at the same time, we miss the first notification on every article.
We set the HTTP Max Connections in the Worker to 1. When creating 10 articles simultaneously, we noticed that only the first article missed the first notification.
We didn't get any useful error messages from the Worker, so we decided to set up our own EC2 instance and run the artisan queue worker manually.
What we then saw explained everything:
Illuminate\Database\Eloquent\ModelNotFoundException: No query results for model [App\Article]
This is an error that we never got from the EBS Worker / SQS and swiftly led to the solution:
The notification is handled before the article has made it to the database.
We added a delay to the worker and haven't had a problem since. We recently added a database transaction to the process of creating an article, and creating the notification happens within that transaction (but at the very end). I think that's why we didn't have this problem before. We decided to leave the notification creation inside the transaction and just handle the notifications with a delay, which means we don't have to hotfix this.
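For illustration, a minimal sketch of deferring the notification from the application side, which has the same effect as the worker-side delay described above (the ArticleCreated notification class and the 30-second value are hypothetical, chosen only to show the idea):

// Inside the transaction that creates the article
$user->notify(
    (new ArticleCreated($article))->delay(now()->addSeconds(30))
);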
Thanks to everyone who joined in to help!

PHP mysql persistent connection not reused ( opens more than one connection per fpm-worker )

I'm facing a really weird behaviour while testing persistent connections from php to mysql. I have a small script that looks like this:
<?php
$db = new mysqli('p:db-host','user','pass','schema');
$res = $db->query('select * from users limit 1');
print_r($res->fetch_assoc());
My setup is :
OS: CentOS 7.3
PHP/7.1.18
php-fpm
nginx/1.10.2
MySQL-5.6.30
I tried to do some requests with ab:
$ ab -c 100 -n 500 http://mysite/my_test_script.php
PHP-FPM was configured to have 150 workers ready, and I saw what I was expecting: 150 established connections to MySQL, which stayed open after ab finished. I launched ab once again, and the behaviour was still the same: 150 connections, no new connections were opened. All fine. Then I created a script which made the exact same requests, same IP, same HTTP headers, but used curl to make the request, and BOOM, I had 300 connections on MySQL instead of 150. I launched the script again; I still had 300 connections. Subsequent runs of the same script didn't increase the number of connections. Has anyone ever faced anything like this? Does anyone know what could make PHP open more connections than needed? Am I missing something obvious?
If it's not clear what I'm asking, please comment below and I will try to explain my problem better.
P.S. I tried this with PDO too, same behaviour.
EDIT: My tests were not accurate
After further testing I noticed that my first tests were not accurate. I was in a multi-tenant environment, and different connections (different schemas) were initialized when I launched ab. In my case the PHP documentation was a bit misleading; it says:
PHP checks if there's already an identical persistent connection (that remained open from earlier) - and if it exists, it uses it. If it does not exist, it creates the link. An 'identical' connection is a connection that was opened to the same host, with the same username and the same password (where applicable).
http://php.net/manual/en/features.persistent-connections.php
Maybe it's obvious to everyone else, I don't know; it was not obvious to me. Passing the 4th parameter to mysqli made PHP consider the connections not identical. Once I changed my code to something like this:
<?php
$db = new mysqli('p:db-host','user','pass');
$db->select_db('schema');
$res = $db->query('select * from users limit 1');
print_r($res->fetch_assoc());
The application started to behave as I expected: one connection per worker.

PHP Websockets server stops accepting connections after 256 users

I am running a websockets server using https://github.com/ghedipunk/PHP-Websockets/blob/master/websockets.php on an Ubuntu 16 box with PHP7
After 256 users connect to the websocket, it stops accepting connections and I can't figure out why. In the client, I get a 1006 error code (connection was closed abnormally (locally) by the browser implementation) and no further information. The websockets request doesn't appear to make it to the websockets server (which normally echoes "Client Connected" right after a socket connection is made).
In the connect() function, one of the things I do is echo the count of users, sockets and overall memory usage to the log. This problem occurs whenever the user count hits 256 (at which point the socket count is 257 and memory usage is around 4 MB). The fact that it happens at 256 makes me think that a limit is being hit somewhere, but I can't find that limit. If I restart the websockets server, everything works fine again.
From my investigation so far, I have tried and checked:
ulimit (says it's unlimited)
MySQL connection limit (was set to default, now is 1000, but that didn't help)
Increased the PHP memory limit (because why not, just to see)
SOMAXCONN is set to 128, so I don't think this is the problem, but I would have to recompile PHP to test it. I haven't tried this yet.
Apache: The message I get when the problem occurs is: [proxy:error] [pid 16785] (111)Connection refused: AH00957: WS: attempt to connect to 10.0.0.240:9000 (websockets.mydomain.com) failed, which doesn't tell me much about anything. Apache is running MPM prefork and I have increased the spare servers and MaxRequestWorkers
I am open to any suggestions as to where to look next or how to get more detail out of the "Connection Refused" error log from apache!
Thanks
It's Apache that's holding you up. Try setting the following in your conf file...
MaxClients 512
ServerLimit 512
(you must set both)
Of course, you can use whatever numbers work for you. In mpm-prefork, you should be able to go to 20,000 but that really shouldn't be necessary.
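For context, a sketch of where those directives might live in an Apache 2.4 prefork config (MaxClients was renamed MaxRequestWorkers in 2.4, so use whichever name your version expects; the 512 values simply mirror the answer above):

<IfModule mpm_prefork_module>
    ServerLimit        512
    MaxRequestWorkers  512
</IfModule>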

Mongos replicaset

I switched my mongodb environment from replication-sets to sharding with replication-sets through mongos.
I had 3 rep-sets (A,B,C) which I switched to S1(A,B); S2(C,D) with mongoS running on A,B,C,D.
When I was connecting to my old system, I connected as followed
new Mongo("mongodb://A,B,C", array("replicaSet" => "repset-name"));
Now I tried to do the same with mongos, which throws an internal server error:
new Mongo("mongodb://A,B,C,D", array("replicaSet" => "repset-name"));
If I get rid of the "replicaSet" option, it works again.
new Mongo("mongodb://A,B,C,D")
I was wondering: does mongos now balance reads between the replica set members within each shard (e.g. S1 balancing between A and B) without the "replicaSet" option set?
By the way, pymongo reacts the same way, with a pymongo.errors.AutoReconnect: "No address associated with hostname".
Thx
Correct, once you've sharded, you should connect your driver to the mongos as if it were a single server. Mongos is now responsible for distributing reads and writes among the primaries and secondaries around your cluster. Set slaveOk to True for reads if you want mongos to distribute reads to secondaries.
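To make that concrete, a minimal sketch using the legacy Mongo driver from the question (the database name mydb is a placeholder; newer drivers express the same intent with a readPreference option in the connection string):

<?php
// Once sharded, connect only to the mongos routers - no replicaSet option
$m = new Mongo("mongodb://A,B,C,D");
$db = $m->selectDB('mydb'); // 'mydb' is a placeholder database name
// Legacy equivalent of slaveOk: let mongos route reads to secondaries
$db->setSlaveOkay(true);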

Difference between PHP SQL Server Driver and SQLCMD when running queries

Why is it that the SQL Server PHP driver has problems with long-running queries?
Every time I have a query that takes a while to run, I get the following errors from sqlsrv_errors(), in the order below:
Shared Memory failure, Communication Link Failure, Timeout failure
But if I try the same query with SQLCMD.exe, it comes back fine. Does the PHP SQL Server driver have somewhere that a "no timeout" can be set?
What's the difference between running queries via SQLCMD and via the PHP driver?
Thanks all for any help
Typical usage of the PHP driver to run a query:
function already_exists() {
    $model_name = trim($_GET['name']);
    include('../includes/db-connect.php');
    $connectionInfo = array('Database' => $monitor_name);
    $conn = sqlsrv_connect($serverName, $connectionInfo);
    // Note: in real code this should be a parameterized query; concatenating
    // $_GET input into the SQL string is open to injection.
    $tsql = "SELECT model_name FROM slr WHERE model_name = '".$model_name."'";
    $queryResult = sqlsrv_query($conn, $tsql);
    $exists = false;
    if ($queryResult !== false) {
        $exists = (sqlsrv_has_rows($queryResult) === true);
    }
    // close the connection before returning
    sqlsrv_close($conn);
    return $exists;
}
SQLCMD has no query execution timeout by default. PHP does. I assume you're using mssql_query? If so, the default timeout for queries through this API is 60 seconds. You can override it by modifying the configuration property mssql.timeout.
See more on the configuration of the MSSQL driver in the PHP manual.
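If it is the mssql extension, a hedged sketch of raising that timeout at runtime (the 300-second value is an arbitrary example; mssql.timeout can also be set in php.ini):

// Raise the mssql extension's query timeout for this script only
ini_set('mssql.timeout', '300');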
If you're not using mssql_query, can you give more details on exactly how you're querying SQL Server?
Edit [based on comment]
Are you using sqlsrv_query then? Looking at the documentation, this should wait indefinitely; however, you can override it. How long is it waiting before it seems to time out? You might want to time it and see if it's consistent. If not, can you provide a code snippet (edit your question) to show how you're using the driver?
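For completeness, a sketch of setting an explicit per-statement timeout via the options array that sqlsrv_query accepts (the 300-second value is arbitrary):

// Ask the SQLSRV driver to wait up to 300 seconds for this statement
$options = array('QueryTimeout' => 300);
$stmt = sqlsrv_query($conn, $tsql, array(), $options);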
If MSDTC is getting involved (and I don't know how you can ascertain this), then there's a 60-second timeout on that by default. This is configured in the Component Services administration tool and lives in a different place dependent on version of Windows.
SQL Server 2005 limits the maximum number of TDS packets to 65,536 per connection (a limit that was removed in SQL Server 2008). As the default PacketSize for the SQL Server Native Client (ODBC layer) is 4K, the PHP driver has a de facto transfer limit of 256 MB per connection. When attempting to transfer more than 65,536 packets, the connection is reset at the TDS protocol level. Therefore, you should make sure that the BULK INSERT is not going to push through more than 256 MB of data; otherwise the only alternative is to migrate your application to SQL Server 2008.
From MSDN Forums
http://social.msdn.microsoft.com/Forums/en-US/sqldriverforphp/thread/4a8d822f-83b5-4eac-a38c-6c963b386343
PHP itself has several different timeout settings that you can control via php.ini. The one that often causes problems like you're seeing is max_execution_time (see also set_time_limit()). If these limits are exceeded, php will simply kill the process without regard for ongoing activities (like a running db query).
There is also a setting, memory_limit, that does as its name suggests. If the memory limit is exceeded, php just kills the process without warning.
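A minimal sketch of lifting those limits at runtime rather than in php.ini (values are illustrative; whether raising them is sensible depends on your workload):

// Relax the PHP-side limits for this long-running script only
set_time_limit(0);               // 0 = no execution time limit
ini_set('memory_limit', '512M'); // raise the memory ceiling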
Good luck.
