Monday, February 1, 2016

Elasticsearch Version Upgrade using Rolling Restart

This upgrade method updates one node at a time: stop the node, upgrade it, and restart it before moving on to the next. The same method can also be used to restart an Elasticsearch cluster safely and efficiently.

Master-eligible nodes


In an Elasticsearch production cluster, start with one master-eligible node: stop the Elasticsearch service, perform the upgrade, and then restart the service. Repeat this until all master-eligible nodes have been upgraded. Since dedicated master-eligible nodes hold no shards (neither primaries nor replicas), no data is put at risk up to this point.
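Before moving on to the next node, it may help to confirm that the restarted node has rejoined the cluster. The following is a minimal sketch, not part of the original procedure: it uses the cluster health API's wait_for_nodes parameter, and the function name, the ES_URL variable, the 60s timeout, and the expected node count are all assumptions to adapt to your own cluster.

```shell
#!/bin/sh
# Base URL of the cluster; the host is the one used in this post.
ES_URL="${ES_URL:-http://elastic-cluster:9200}"

# wait_for_nodes N
# Ask the cluster health API to block until N nodes are present or the
# timeout (an assumed 60s) expires. Succeeds only if the call did not time out.
wait_for_nodes() {
  curl -s "$ES_URL/_cluster/health?wait_for_nodes=$1&timeout=60s" \
    | grep -F -q '"timed_out":false'
}
```

For example, in a cluster that should have five nodes in total, running `wait_for_nodes 5 && echo "node is back"` after each restart gives a quick go/no-go signal.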

Client nodes


Next, stop, upgrade, and restart the client nodes one at a time, just as with the master-eligible nodes.

Data nodes


Next, before upgrading any data node, temporarily turn off shard allocation through a REST API call to the cluster; otherwise, when a data node is stopped, the cluster will start reallocating that node's shards to the remaining nodes. The setting to disable temporarily is cluster.routing.allocation.enable, which can be changed dynamically by issuing the following call to the Elasticsearch cluster:

curl -X PUT -H "Content-Type: application/json" http://elastic-cluster:9200/_cluster/settings/ -d '
{ "transient": { "cluster.routing.allocation.enable": "none" } }
'
(Note that "transient" makes the setting non-permanent: it does not survive a full cluster restart.)
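To double-check that the setting took effect before stopping a data node, the current cluster settings can be read back. This is a small sketch under assumptions: the function name and ES_URL variable are made up for illustration, and it relies on the flat_settings=true query parameter so that the setting appears as a single dotted key in the response.

```shell
#!/bin/sh
# Base URL of the cluster; the host is the one used in this post.
ES_URL="${ES_URL:-http://elastic-cluster:9200}"

# Succeeds if the cluster settings currently contain
# cluster.routing.allocation.enable = "none".
# Note: this simple text match does not distinguish between the
# "transient" and "persistent" sections of the response.
allocation_disabled() {
  curl -s "$ES_URL/_cluster/settings?flat_settings=true" \
    | grep -F -q '"cluster.routing.allocation.enable":"none"'
}
```

A typical use would be `allocation_disabled || echo "allocation is still enabled, do not stop the node yet"`.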

Now stop, upgrade, and restart a data node. Once the node has rejoined the cluster, re-enable allocation by setting cluster.routing.allocation.enable back to "all" with the curl call below:

curl -X PUT -H "Content-Type: application/json" http://elastic-cluster:9200/_cluster/settings/ -d '
{ "transient": { "cluster.routing.allocation.enable": "all" } }
'
Once this is done, the shards originally held by that data node will be allocated and come back up.
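One way to watch the shards come back is to count the shard copies that are still unassigned or initializing. The sketch below is an illustration only: the function name and ES_URL variable are assumptions, and it reads the state column of the _cat/shards API.

```shell
#!/bin/sh
# Base URL of the cluster; the host is the one used in this post.
ES_URL="${ES_URL:-http://elastic-cluster:9200}"

# Print how many shard copies are still UNASSIGNED or INITIALIZING.
# Columns requested from _cat/shards: index, shard, prirep, state.
count_pending_shards() {
  curl -s "$ES_URL/_cat/shards?h=index,shard,prirep,state" \
    | awk '$4 == "UNASSIGNED" || $4 == "INITIALIZING" { n++ } END { print n + 0 }'
}
```

When `count_pending_shards` prints 0, every shard copy has been assigned and started, and it should be reasonable to continue with the next node.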

Next, proceed to the second data node and repeat the process above (disable allocation, upgrade the node, re-enable allocation) until all data nodes are upgraded.
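The data node procedure can be sketched as a small script. This is a rough outline under stated assumptions, not a finished tool: upgrade_node is a hypothetical placeholder for whatever stops, upgrades, and restarts the service on a host (and waits for it to rejoin), the node names are made up, and the extra wait for green status with a 60s timeout is an added precaution rather than part of the steps above.

```shell
#!/bin/sh
# Base URL of the cluster; the host is the one used in this post.
ES_URL="${ES_URL:-http://elastic-cluster:9200}"

# Set cluster.routing.allocation.enable to "none" or "all" (transient).
# curl -s only fails on connection errors, not on HTTP error responses.
set_allocation() {
  curl -s -X PUT -H "Content-Type: application/json" "$ES_URL/_cluster/settings" \
    -d "{ \"transient\": { \"cluster.routing.allocation.enable\": \"$1\" } }" > /dev/null
}

# Block until the cluster is green or the assumed 60s timeout expires.
wait_for_green() {
  curl -s "$ES_URL/_cluster/health?wait_for_status=green&timeout=60s" \
    | grep -F -q '"timed_out":false'
}

# HYPOTHETICAL placeholder: replace with the real stop / upgrade / start
# commands for your environment (ssh, package manager, service manager).
upgrade_node() {
  echo "upgrade $1 here: stop service, install new version, start service"
}

# Upgrade the given data nodes one at a time, stopping at the first failure.
rolling_upgrade() {
  for node in "$@"; do
    set_allocation none  || return 1
    upgrade_node "$node" || return 1
    set_allocation all   || return 1
    wait_for_green || { echo "cluster not green after $node" >&2; return 1; }
  done
}
```

With the placeholder filled in, a run could look like `rolling_upgrade data-node-1 data-node-2 data-node-3`, where the node names are examples only.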