From c072458308169ee3dcb256f70ca589599750af4e Mon Sep 17 00:00:00 2001
From: Madhu Venugopal
Date: Wed, 11 Nov 2015 16:18:06 -0800
Subject: [PATCH] Make discovery ttl and heartbeat configurable

Docker daemon uses kv-store as the host-discovery backend.
Discovery module tracks the liveness of a node through a simple keepalive
mechanism. The keepalive mechanism depends on every node performing
heartbeat by registering itself with the discovery module (via KV-Store
Put operation). And for every Put operation, the discovery module in all
other nodes will receive a Watch notification. That keeps the node alive.
Any node that fails to register itself within the TTL timer is considered
dead and removed from the discovery database.

The default timer (heartbeat = 20 seconds & ttl = 60 seconds) works fine
for small clusters. But for large clusters, these default timers are
extremely aggressive and that causes high CPU & most of the processing is
spent managing the node discovery and that impacts normal daemon
operation.

Hence we need a way to make the discovery ttl and heartbeat configurable.
As the cluster size grows, the user can change these timers to make sure
the daemon scales.

Signed-off-by: Madhu Venugopal
---
 docs/reference/commandline/daemon.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/docs/reference/commandline/daemon.md b/docs/reference/commandline/daemon.md
index 4b27e695..beb012f5 100644
--- a/docs/reference/commandline/daemon.md
+++ b/docs/reference/commandline/daemon.md
@@ -565,6 +565,18 @@ docker daemon \
 
 The currently supported cluster store options are:
 
+* `discovery.heartbeat`
+
+    Specifies the heartbeat timer in seconds which is used by the daemon as a
+    keepalive mechanism to make sure discovery module treats the node as alive
+    in the cluster. If not configured, the default value is 20 seconds.
+
+* `discovery.ttl`
+
+    Specifies the ttl (time-to-live) in seconds which is used by the discovery
+    module to timeout a node if a valid heartbeat is not received within the
+    configured ttl value. If not configured, the default value is 60 seconds.
+
 * `kv.cacertfile`
 
     Specifies the path to a local file with PEM encoded CA certificates to trust
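
The keepalive behaviour described in the commit message can be illustrated with a small, self-contained model. This is only a sketch of the heartbeat/TTL relationship, not Docker's discovery code: the `DiscoveryTable` class, the `simulate` helper, the node names, and the strict "older than the TTL" comparison are all assumptions made for illustration. Time is passed in explicitly so the example is deterministic.

```python
"""Toy model of heartbeat/TTL liveness tracking (not Docker's implementation)."""


class DiscoveryTable:
    """Tracks the last heartbeat time of each node and expires stale ones."""

    def __init__(self, ttl_seconds=60):
        self.ttl_seconds = ttl_seconds
        self.last_seen = {}  # node name -> time of last heartbeat, in seconds

    def heartbeat(self, node, now):
        """Record a registration (standing in for the KV-store Put) at time `now`."""
        self.last_seen[node] = now

    def expire(self, now):
        """Remove and return the nodes whose last heartbeat is older than the TTL."""
        dead = sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.ttl_seconds  # assumption: strict comparison
        )
        for node in dead:
            del self.last_seen[node]
        return dead

    def alive(self):
        """Return the names of the nodes currently considered alive."""
        return sorted(self.last_seen)


def simulate(heartbeat_seconds, ttl_seconds, stop_after, until):
    """Run a two-node cluster one second at a time.

    node-a heartbeats for the whole run; node-b sends its last heartbeat at or
    before `stop_after`. Returns the second at which node-b is dropped, or
    None if it is still registered when the run ends at `until`.
    """
    table = DiscoveryTable(ttl_seconds)
    for now in range(0, until + 1):
        if now % heartbeat_seconds == 0:
            table.heartbeat("node-a", now)
            if now <= stop_after:
                table.heartbeat("node-b", now)
        if "node-b" in table.expire(now):
            return now
    return None
```

Under these assumptions, `simulate(20, 60, 40, 200)` models the default timers: node-b's last heartbeat lands at second 40 and it is dropped at second 101. Doubling both timers with `simulate(40, 120, 40, 400)` halves the number of registrations per node and delays the removal to second 161, which is the trade-off the commit message describes for large clusters: less discovery traffic in exchange for slower detection of dead nodes.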