High availability in Aiven for OpenSearch®#
Aiven for OpenSearch® is available on a variety of plans, offering different levels of high availability. The selected plan defines the features available, and a summary is provided in the table below:
| Plan | High Availability Features | Backup History |
|---|---|---|
| Hobbyist | Single-node with limited availability | Single backup for disaster recovery |
| Startup | Single-node with limited availability | 2 days, with hourly backup for 24 hours |
| Business | Three-node cluster configured for high availability | 14 days, with hourly backup for 24 hours |
| Premium | Six-node (or more) cluster configured for high availability | 30 days, with hourly backup for 24 hours |
Failure handling#
Minor failures, such as service process crashes or temporary loss of network access, are handled by Aiven automatically on all plans without any major changes to the service deployment. The service returns to normal operation once the crashed process has been restarted or network access is restored.
Severe failures, such as the loss of a cluster node, require more drastic recovery measures. The Aiven platform continuously monitors the health of every node in a cluster. When a node reports failures from its own self-diagnostics, or when it does not respond to a health check, the platform starts a replacement node. While the node is being replaced, client requests are rerouted to the other nodes that hold replica shards. When the new node joins the cluster, it restores data from the existing nodes, or from a backup if none of the old nodes can be reached, and starts serving requests once data restoration is complete.
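The replacement decision described above can be illustrated with a small Python model. This is only a sketch of the logic as stated in the text, not Aiven's implementation: the `HealthReport` type and the `needs_replacement` and `restore_source` functions are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class HealthReport:
    """Hypothetical result of one health check against a node."""
    node: str
    self_diagnostics_ok: bool


def needs_replacement(report: Optional[HealthReport]) -> bool:
    """Return True when a replacement node should be started.

    `None` models a health check that received no response.
    """
    return report is None or not report.self_diagnostics_ok


def restore_source(reachable_old_nodes: Sequence[str]) -> str:
    """Pick where a new node restores its data from."""
    # Existing nodes are preferred; a backup is used only when
    # none of the old nodes can be reached.
    return "existing nodes" if reachable_old_nodes else "backup"
```

For example, `needs_replacement(None)` is true because a missing response counts as a failure, and `restore_source([])` falls back to the backup because no old node is reachable.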
Single-node Hobbyist and Startup service plans#
Losing the only node in the service triggers the automatic creation of a replacement node. The new node starts up, restores its state from the latest available backup, and resumes serving clients. Because a single node provided the service, the service is unavailable for the duration of the restore operation, and all write operations made since the last backup are lost.
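Because a single-node service is unreachable while the replacement node restores, client applications may want to retry failed requests rather than fail immediately. The following is a minimal sketch of retrying with exponential backoff, using only the Python standard library. The `call_with_backoff` helper, its parameters, and the choice of `ConnectionError` as the retried exception are assumptions made for illustration; they are not part of any Aiven or OpenSearch API, and a real client library may raise different exception types.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on ConnectionError with exponential backoff.

    The delay doubles after each failed attempt and is capped at `max_delay`.
    The last failure is re-raised once `attempts` calls have been made.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(min(max_delay, base_delay * 2 ** attempt))
    # Reached only when the loop body never ran.
    raise ValueError("attempts must be at least 1")
```

Retrying only helps with availability: writes made after the last backup and before the node was lost are not recovered by the restore, so an application that cannot tolerate that loss needs to be able to replay them from its own source of truth.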