I have a "legacy" php application that we just migrated to run on Google Cloud (Kubernetes Engine). Along with it I also have a ElasticSearch installation (Elastic Cloud on Kubernetes) running. After a few incidents with Kubernetes killing my Elastic Search when we're trying to deploy other services we have come to the conclusion that we should probably not run ES on Kubernetes, at least if are to manage it ourselves. This due to a apparent lack of knowledge for doing it in a robust way.
So our idea is now to move to managed Elastic Cloud instead which was really simple to deploy and start using. However... now that I try to load ES with the data needed for our php application if fails mid-process with the error message no alive nodes found in cluster. Sometimes it happens after less than 1000 "documents" and other times I manage to get 5000+ of them indexed before failure.
This is how I initialize the es client:
$clientBuilder = ClientBuilder::create();
$clientBuilder->setElasticCloudId(ELASTIC_CLOUD_ID);
$clientBuilder->setBasicAuthentication('elastic',ELASTICSEARCH_PW);
$clientBuilder->setRetries(10);
$this->esClient = $clientBuilder->build();
ELASTIC_CLOUD_ID & ELASTICSEARCH_PW are set via environment vars.
The request looks something like:
$params = [
    'index' => $index,
    'type' => '_doc',
    'body' => $body,
    'client' => [
        'timeout' => 15,
        'connect_timeout' => 30,
        'curl' => [CURLOPT_HTTPHEADER => ['Content-type: application/json']],
    ],
];
The body and which index depends on how far we get with the "ingestion", but generally pretty standard stuff.
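For completeness, the indexing call itself is essentially this; a minimal sketch, assuming single index() calls rather than the bulk API, with $documents and buildBody() as placeholders for our real data:

// Hypothetical ingestion loop; $documents, $index and buildBody() stand in for the real data.
foreach ($documents as $doc) {
    $params['index'] = $index;
    $params['body']  = buildBody($doc);

    // Somewhere mid-run this throws
    // Elasticsearch\Common\Exceptions\NoNodesAvailableException ("No alive nodes found in your cluster").
    $this->esClient->index($params);
}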
All this works without any real problems when running against our own installation of Elasticsearch in our own GKE cluster.
What I've tried so far is to add the retries and timeouts shown above, but none of that seems to make much of a difference.
We're running:
PHP 7.4
Elasticsearch 7.11
Elasticsearch PHP client 7.12 (installed via Composer)
If you use WAMP64, this error will occur; you have to use XAMPP instead.
Try the following command in a command prompt. If it succeeds, the cluster is reachable and the problem is in your configuration.
curl -u elastic:<password> https://<endpoint>:<port>
(Ex for Elastic Cloud)
curl -u elastic:<password> example.es.us-central1.gcp.cloud.es.io:9234
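The same check can also be done from PHP on the machine (or pod) the application runs on, to rule out network issues specific to that environment. A minimal sketch, assuming a placeholder endpoint and the same elastic user:

<?php
// Hypothetical connectivity check against an Elastic Cloud endpoint (placeholder URL and credentials).
$endpoint = 'https://example.es.us-central1.gcp.cloud.es.io:9243';

$ch = curl_init($endpoint);
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_USERPWD        => 'elastic:' . getenv('ELASTICSEARCH_PW'),
    CURLOPT_CONNECTTIMEOUT => 5,
    CURLOPT_TIMEOUT        => 10,
]);

$body = curl_exec($ch);
if ($body === false) {
    echo 'cURL error: ' . curl_error($ch) . PHP_EOL;
} else {
    echo 'HTTP ' . curl_getinfo($ch, CURLINFO_HTTP_CODE) . ': ' . $body . PHP_EOL;
}
curl_close($ch);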
Related
I'm setting up a Vue.js app and multiple Laravel applications that must communicate with each other. Communication between the Laravel applications is done through cURL requests. For now, everything runs on my dev machine (latest MacBook Pro, macOS Mojave) with the help of MAMP Pro (PHP 7.3).
The problem is that when I do simultaneous requests, I get a:
CURL error 28 communication timeout ... with 0 bytes received.
It is of course not a timeout problem (I tried 2 minutes on all applications and all timeouts, same result). Since I'm working with APIs, there is no PHP session (so no session file lock).
It seems the cURL connection is closed but I don't know why (I don't close it myself, and it doesn't hit the timeouts (connect/read/global)).
More visually:
vuejs --ajax1--> Laravel A --cUrl--> Laravel B --cUrl--> Laravel C
vuejs --ajax2--> Laravel A --cUrl--> Laravel B --cUrl--> Laravel C
vuejs <--500-- Laravel A --X-- Laravel B <---- Laravel C
vuejs <--500-- Laravel A --X-- Laravel B <---- Laravel C
ajax1 and ajax2 are sent at the same time.
It works if ajax1 and ajax2 are NOT sent at the same time.
What I know:
Communication is cut between Laravel A and Laravel B, but Laravel B
executes the code and returns a response (that never arrives because, I think, the connection is closed?). Both requests hit Laravel C and Laravel C also runs.
What I tried:
Apache and nginx
Disable firewall
Increase all timeouts (PHP - cUrl)
Increase memory limit (PHP)
cURL CURLOPT_FORBID_REUSE and CURLOPT_FRESH_CONNECT options (see the sketch after this list)
Change local domain name and tld of Laravel applications
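For reference, this is roughly how those cURL options were applied; a sketch only, with a placeholder URL rather than the real application code:

// Hypothetical inter-application call with connection reuse disabled.
$ch = curl_init('https://laravel-b.test/api/endpoint'); // placeholder URL

curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_CONNECTTIMEOUT => 120,   // generous timeouts, as mentioned above
    CURLOPT_TIMEOUT        => 120,
    CURLOPT_FORBID_REUSE   => true,  // do not keep the connection open for reuse
    CURLOPT_FRESH_CONNECT  => true,  // do not reuse a cached connection
]);

$result = curl_exec($ch);
if ($result === false) {
    // This is where "cURL error 28" (timeout, 0 bytes received) shows up.
    error_log('cURL error: ' . curl_error($ch));
}
curl_close($ch);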
What I'm wondering:
Is there a maximum number of requests that I can make at the same time with cURL on the same machine? (I did, of course, set CURLOPT_MAXCONNECTS to 20, without success.)
Is there a php.ini setting that I missed?
Is it possible that the problem comes from the fact that all these applications run on the same machine? If yes, why?
Do both servers (nginx and Apache) limit connections from the same IP? (Since all applications are on the same machine, they all have the same IP.)
Just bumped into the same problem, where the main backend acts as a proxy to get data from another backend through cURL (Guzzle).
But I only get a timeout when doing more than 5 simultaneous requests.
All applications are live, and I've tried setting up a new VPS server to avoid having all backends on the same machine.
Setting CURLOPT_MAXCONNECTS also didn't work.
This is my makeshift proxy code (using Laravel):
use GuzzleHttp\Client;

// Target URI comes from the incoming request; base URL and token come from the environment.
$uri = $request->get('url');
$api_url = env('URL');
$token = env('TOKEN');
$key = 'Bearer ' . $token;

$client = new Client([
    'base_uri' => $api_url,
    'headers' => [
        'Accept'        => 'application/json',
        'Content-Type'  => 'application/json',
        'Authorization' => $key,
    ],
]);

// Forward the request and return the raw body to the caller.
$response = $client->request('GET', $uri, [
    'http_errors' => false,
]);

return $response->getBody()->getContents();
I haven't found a solution yet. Hopefully someone could help us solving this problem.
I am trying to do a scan and scroll operation on an index as shown in the example :
$client = ClientBuilder::create()->setHosts([MYESHOST])->build();
$params = [
    "search_type" => "scan",    // use search_type=scan
    "scroll" => "30s",          // how long between scroll requests. should be small!
    "size" => 50,               // how many results *per shard* you want back
    "index" => "my_index",
    "body" => [
        "query" => [
            "match_all" => []
        ]
    ]
];

$docs = $client->search($params);   // Execute the search
$scroll_id = $docs['_scroll_id'];   // The response will contain no results, just a _scroll_id

// Now we loop until the scroll "cursors" are exhausted
while (\true) {

    // Execute a Scroll request
    $response = $client->scroll([
        "scroll_id" => $scroll_id,  //...using our previously obtained _scroll_id
        "scroll" => "30s"           // and the same timeout window
    ]);

    // Check to see if we got any search hits from the scroll
    if (count($response['hits']['hits']) > 0) {
        // If yes, Do Work Here

        // Get new scroll_id
        // Must always refresh your _scroll_id! It can change sometimes
        $scroll_id = $response['_scroll_id'];
    } else {
        // No results, scroll cursor is empty. You've exported all the data
        break;
    }
}
The first $client->search($params) API call executes fine and I am able to get back the scroll ID, but the $client->scroll() API call fails and I get the exception: "Elasticsearch\Common\Exceptions\NoNodesAvailableException No alive nodes found in your cluster"
I am using Elasticsearch 1.7.1 and PHP 5.6.11
Please help
I found the PHP driver for Elasticsearch to be riddled with issues. The solution I went with was to just call the RESTful API with cURL from PHP directly. Everything worked much quicker and debugging was much easier.
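As an illustration of that approach, here is a minimal sketch of a match_all search done with plain cURL; the host and index names are placeholders, not taken from the question:

<?php
// Hypothetical raw-cURL search against the Elasticsearch REST API.
$host  = 'http://localhost:9200'; // placeholder host
$index = 'my_index';

$query = json_encode(['query' => ['match_all' => new stdClass()]]);

$ch = curl_init($host . '/' . $index . '/_search?size=50');
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POSTFIELDS     => $query,  // sending a body makes this a POST, which _search accepts
    CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
]);

$raw = curl_exec($ch);
curl_close($ch);

$result = json_decode($raw, true);
$hits = isset($result['hits']['hits']) ? count($result['hits']['hits']) : 0;
echo 'Hits returned: ' . $hits . PHP_EOL;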
I would guess the example is not up to date with the version you're using (the link you've provided is for 2.0, and you are saying you use 1.7.1). Just add this inside the loop:
try {
    $response = $client->scroll([
        "scroll_id" => $scroll_id,  //...using our previously obtained _scroll_id
        "scroll" => "30s"           // and the same timeout window
    ]);
} catch (Elasticsearch\Common\Exceptions\NoNodesAvailableException $e) {
    break;
}
Check if your server is running with the following command:
service elasticsearch status
I had the same problem and solved it.
I had added script.disable_dynamic: true to elasticsearch.yml as explained in the DigitalOcean tutorial https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-elasticsearch-on-ubuntu-14-04
Because of that, the Elasticsearch server was not starting.
I removed the following line from elasticsearch.yml:
script.disable_dynamic: true
Then I restarted the Elasticsearch service and set the network host to local ("127.0.0.1").
I would recommend using the PHP cURL library directly for Elasticsearch queries.
I find it easier to use than any other Elasticsearch client library; you can simulate any query using CLI curl, and you can find many examples, documentation and discussions on the internet.
Maybe you should try to telnet from your machine:
telnet [your_es_host] [your_es_port]
to check whether you can access it.
If not, please try to open that port or disable your machine's firewall.
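If telnet is not available where the application runs, roughly the same reachability check can be done from PHP; a sketch with placeholder host and port:

<?php
// Hypothetical TCP reachability check for the Elasticsearch host and port.
$host = '127.0.0.1'; // placeholder
$port = 9200;        // placeholder

$errno = 0;
$errstr = '';
$socket = @fsockopen($host, $port, $errno, $errstr, 5); // 5 second timeout

if ($socket === false) {
    echo "Cannot reach {$host}:{$port}: {$errstr} ({$errno})" . PHP_EOL;
} else {
    echo "Port {$port} on {$host} is reachable." . PHP_EOL;
    fclose($socket);
}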
That error basically means it can't find your cluster, likely due to misconfiguration on either the client's side or the server's side.
I have had the same problem with scroll; it was working with certain indexes but not with others. It must have been a bug in the driver, as it went away after I updated the elasticsearch/elasticsearch package from 2.1.3 to 2.2.0.
Uncomment in elasticsearch.yml:
network.host:198....
And set to:
127.0.0.1
Like this:
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 127.0.0.1
#
# Set a custom port for HTTP:
#
# http.port: 9200
#
I use Elasticsearch 2.2 with Magento 2 under an LXC container.
I set up the Elasticsearch server in Docker as described in the docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html
But it used a different network (networks: - esnet) and could not talk to the application network. After removing the networks setting, it works well.
Try:
Stop your elasticsearch service if it's already running
Go to your elasticsearch directory via terminal, run:
> ./bin/elasticsearch
This worked for me.
I'm very new to Drush. We have a Git repo of a Drupal site that I would like to push to the remote server using Drush. I could easily scp the Drupal files or set up a cron job on the remote that runs git pull, but I would still like to learn how to push code and sync a remote Drupal site with my local Drupal.
Currently, I have Drupal running locally and I use Git to update the repo. SSH is already configured and I can ssh to the remote Drupal server using keys. I have also created the .drush/aliases.drushrc.php file and tested it by running drush @dev status. It worked well.
<?php
$aliases['dev'] = array(
  'root' => '/var/www/html',
  'uri' => 'dev.example.com',
  'remote-host' => '192.168.1.50',
);
?>
Now, I would like my local Drupal site to be synchronized with our server at 192.168.1.50. The local Drupal files are in /home/ubuntu/drupal_site.
I have a few questions:
What are the Drush command/parameters to update the remote Drupal server?
What would the Drush command/parameters be if the remote server doesn't have the Drupal files yet?
Back up before synchronizing, with drush ard or drush @dev ard or with the suitable alias. You can set the backup path in the alias settings.
I think you named your remote server dev, so I keep that name in the following and use the alias local for the local Drupal site.
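For the local site, the alias entry might look roughly like this; the root path is taken from the question, the URI is only an assumption:

<?php
$aliases['local'] = array(
  'root' => '/home/ubuntu/drupal_site',
  'uri'  => 'http://local.example.com',
);
?>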
Add that alias for your local Drupal site. Then you can use the following command to synchronize the files:
drush rsync @local @dev
Here @local is the source and @dev the target. More details on how to use the rsync command can be displayed with:
drush help rsync
You also need to synchronize the database to get the remote site running. For this, add the database account data to the alias data for @local and @dev. It will look something like this:
'databases' => array(
  'default' => array(
    'default' => array(
      'driver' => 'mysql',
      'username' => 'USERNAME',
      'password' => 'PASSWORD',
      'port' => '',
      'host' => 'localhost',
      'database' => 'DATABASE',
    ),
  ),
),
Replace the placeholders with your data. Then the databases can be synchronized with:
drush sql-sync @local @dev
Again, @local is the source and @dev the target.
Initially the synchronization will happen in one direction. After that, it is good practice to synchronize files from the development or test site to the production site, and to synchronize the database the other way around, from the production site to the development or test site.
Drush and Git workflows differ somewhat, as Drush can pull packages separately; you could probably use Git to push to the server. Be sure to check the /files directory, which is usually in the .gitignore file; a possible approach would be to mirror the files directory directly from the live site.
A common approach to updating and checking two or more sites at the same time (local and remote) is to use Drush aliases for your sites in a script on your machine.
Articles like this one are a good starting point.
We're using DynamoDB in order to synchronize sessions between more than one EC2 machine behind ELBs.
We noticed that this method slows down the scripts a lot.
Specifically, I made a JS test that calls 3 different PHP scripts on the server 10 times each.
1) The first one is just an echo timestamp(); and takes about 50 ms round trip.
2) The second one is a PHP script that connects through mysqli to RDS MySQL and takes about the same time (about 50-60 ms).
3) The third script uses the DynamoDB session handling method described in the official AWS documentation and takes about 150 ms (3 times slower!!).
I'm running garbage collection every night (as the documentation says) and the DynamoDB metrics seem OK.
The code I use is this:
use Aws\DynamoDb\DynamoDbClient;
use Aws\DynamoDb\Session\SessionHandler;

ini_set("session.entropy_file", "/dev/urandom");
ini_set("session.entropy_length", "512");
ini_set('session.gc_probability', 0);

require 'aws.phar';

$dynamoDb = DynamoDbClient::factory(array(
    'key'    => 'XXXXXX',
    'secret' => 'YYYYYY',
    'region' => 'eu-west-1'
));

$sessionHandler = SessionHandler::factory(array(
    'dynamodb_client'          => $dynamoDb,
    'table_name'               => 'sessions',
    'session_lifetime'         => 259200,
    'consistent_read'          => true,
    'locking_strategy'         => null,
    'automatic_gc'             => 0,
    'gc_batch_size'            => 25,
    'max_lock_wait_time'       => 15,
    'min_lock_retry_microtime' => 5000,
    'max_lock_retry_microtime' => 50000,
));

$sessionHandler->register();

session_start();
Am I doing something wrong, or is it normal for retrieving the session to take that long?
Thanks.
Copying correspondence from an AWS engineer in AWS forums: https://forums.aws.amazon.com/thread.jspa?messageID=597493
Here are a couple of things to check:
Are you running your application on EC2 in the same region as your DynamoDB table?
Have you enabled OPcode caching to ensure that the classes used by the SDK do not need to be loaded from disk and parsed each time your
script is run?
Using a web server like Apache and connecting to a DynamoDB session
will require a new SSL connection to be established on each request.
This is because PHP doesn't (currently) allow you to reuse cURL
connection handles between requests. Some database drivers do allow
for a persistent connections between requests, which could account for
the performance difference.
If you follow up on the AWS forums thread, an AWS engineer should be able to help you with your issue. This thread is also monitored if you want to keep it open.
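Regarding the OPcode caching point above, a quick way to check it on the affected machine is a small script like this; a sketch assuming the standard Zend OPcache extension:

<?php
// Hypothetical check for whether Zend OPcache is loaded and enabled on this host.
if (function_exists('opcache_get_status')) {
    $status = opcache_get_status(false); // false = skip per-script details
    if ($status !== false && !empty($status['opcache_enabled'])) {
        echo 'OPcache is enabled; cached scripts: '
            . $status['opcache_statistics']['num_cached_scripts'] . PHP_EOL;
    } else {
        echo 'OPcache extension is loaded but not enabled.' . PHP_EOL;
    }
} else {
    echo 'OPcache extension does not appear to be loaded.' . PHP_EOL;
}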
I'm currently making use of Gearman with PHP using the standard bindings (docs here). Everything is functioning fine, but I have one small issue: I'm not able to detect when a call to GearmanClient::addServer (docs here) is "successful", by which I mean...
The issue is that adding the server attempts no socket I/O, meaning the server may not actually exist or be operational. As a result, subsequent calls (in the scenario where the server does not in fact exist) fail and result in PHP warnings.
Is there any way, or what is the best way, to confirm that the Gearman daemon is operational on the server before or after adding it?
I would like to achieve this so that I can reliably handle scenarios in which Gearman may have died, or the server is perhaps uncontactable.
Many thanks.
We first tried this by manually calling fsockopen on the host and port passed to addServer, but it turns out that this can leave a lot of hanging connections as the Gearman server expects something to happen over that socket.
We use a monitor script to check the status of the daemon and its workers — something similar to this perl script on Google Groups. We modified the script to restart the daemon if it was not running.
If this does not appeal, have a look at the Gearman Protocol (specifically the “Administrative Protocol” section, referenced in the above thread) and use the status command. This will give you information on the status of the jobs and workers, but also means you can perform a socket connection to the daemon and not leave it hanging.
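For illustration, a minimal sketch of that approach: open a socket to the daemon, send the status command from the administrative protocol, and read until the terminating "." line (host and port below are placeholders):

<?php
// Hypothetical health check using the Gearman administrative protocol's "status" command.
$host = '127.0.0.1'; // placeholder
$port = 4730;        // default gearmand port

$errno = 0;
$errstr = '';
$socket = @fsockopen($host, $port, $errno, $errstr, 2);
if ($socket === false) {
    echo "gearmand not reachable: {$errstr} ({$errno})" . PHP_EOL;
    exit(1);
}

fwrite($socket, "status\n");

// Each line is "FUNCTION\tTOTAL\tRUNNING\tAVAILABLE_WORKERS"; a lone "." ends the response.
while (($line = fgets($socket)) !== false) {
    $line = rtrim($line, "\r\n");
    if ($line === '.') {
        break;
    }
    echo $line . PHP_EOL;
}

fclose($socket);
echo 'gearmand is up.' . PHP_EOL;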
You can use this library: https://github.com/necromant2005/gearman-stats
It has no external dependencies.
$adapter = new \TweeGearmanStat\Queue\Gearman(array(
'h1' => array('host' => '10.0.0.1', 'port' => 4730, 'timeout' => 1),
'h2' => array('host' => '10.0.0.2', 'port' => 4730, 'timeout' => 1),
));
$status = $adapter->status();
var_dump($status);