Apply etcd defragmentation

The etcd distributed key-value store retains a history of its keyspace. That history is compacted after a specified number of revisions; however, the space it occupied is released back to the host filesystem only after defragmentation. For more information, refer to the etcd documentation.

With MKE you can defragment the etcd cluster while avoiding cluster outages. To do this, you apply defragmentation to etcd members one at a time. MKE will defragment the current etcd leader last, to prevent the triggering of multiple leader elections.
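
To make the ordering concrete, the sketch below previews the order that follows from the rule described above (followers first, current leader last). It only illustrates the rule and is not MKE's internal implementation. It assumes that `jq` is installed and that the input has the `MemberInfo`, `Endpoint`, and `IsLeader` fields shown in the `/api/ucp/etcd/info` example output later in this section. The helper name `defrag_order` is hypothetical.

```shell
# Hypothetical helper: read etcd info JSON on stdin and print the member
# endpoints with the current leader last. jq orders false before true,
# so sorting by IsLeader moves the leader to the end of the list.
defrag_order() {
    jq --raw-output '.MemberInfo | sort_by(.IsLeader) | .[].Endpoint'
}
```

To use it, pipe the response of the info endpoint into `defrag_order`.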

Important

In a High Availability (HA) cluster, defragmentation affects cluster dynamics, because a node temporarily leaves the pool of active nodes while it is being defragmented. The resulting reduction in the active node count proportionally increases the load on the remaining nodes, which can degrade performance if those nodes lack the capacity to handle the additional load. In addition, at the end of the process, when the leader node is defragmented, there is a brief period during which cluster write operations do not take place. This pause lasts while the system initiates and completes the leader election process; although it is automated and brief, it does momentarily block writes to the cluster.

Given these factors, Mirantis recommends a cautious approach to scheduling the defragmentation of HA clusters. Ideally, defragment during planned maintenance windows rather than relying on a recurring cron job, because during such windows you can closely monitor the impact on performance and availability and mitigate it as necessary.


To defragment the etcd cluster:

  1. Trigger the etcd cluster defragmentation by issuing a POST request to the https://MKE_HOST/api/ucp/etcd/defrag endpoint.

    You can specify two parameters:

    timeoutSeconds

    Sets how long MKE waits for each member to finish defragmentation. Default: 60 seconds. MKE will cancel the defragmentation if the timeout occurs before the member defragmentation completes.

    pauseSeconds

    Sets how long MKE waits between each member defragmentation. Default: 60 seconds.

    Mirantis recommends that you adjust these parameters based on the size of the etcd database and the amount of time that has elapsed since the last defragmentation.

    Example command:

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/defrag --data '{"timeoutSeconds": 60, "pauseSeconds": 60}'
    

    Example response:

    "Cluster Defragmentation Initiated"
    
  2. Review the state of individual etcd cluster members and the state of the cluster defragmentation by running the following command:

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/info
    

    Example output:

    {
        "DefragInProgress": true,
        "DefragResult": "Cluster Defrag Initiated",
        "MemberInfo": [
            {
                "MemberID": 5051939019959384922,
                "Endpoint": "https://172.31.21.33:12379",
                "EtcdVersion": "3.4.16",
                "DbSize": "2 MB",
                "IsLeader": true,
                "Alarms": null
            },
            {
                "MemberID": 10749614093923491478,
                "Endpoint": "https://172.31.30.179:12379",
                "EtcdVersion": "3.4.16",
                "DbSize": "2 MB",
                "IsLeader": false,
                "Alarms": null
            },
            {
                "MemberID": 7837950661722744517,
                "Endpoint": "https://172.31.30.44:12379",
                "EtcdVersion": "3.4.16",
                "DbSize": "2 MB",
                "IsLeader": false,
                "Alarms": null
            }
        ]
    }
    

    You can monitor this endpoint until the defragmentation is complete. The information is also available in the ucp-controller logs.
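
The two steps above can be combined into a small script. The following is a minimal sketch under stated assumptions, not an official MKE tool. It assumes that `curl` and `jq` are installed, that the hypothetical environment variables `MKE_HOST`, `MKE_USER`, and `MKE_PASS` hold the host name and credentials, and that the auth token stays valid for the duration of the defragmentation. All function names and the 10-second default polling interval are illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: MKE_HOST, MKE_USER, MKE_PASS and all function names are
# illustrative assumptions, not part of MKE.

# Build the login request body with jq so that special characters in the
# credentials are escaped correctly.
login_payload() {
    jq --null-input --compact-output \
        --arg u "$MKE_USER" --arg p "$MKE_PASS" \
        '{username: $u, password: $p}'
}

# Build the defragmentation request body from two integer arguments
# (timeoutSeconds, pauseSeconds); both default to 60.
defrag_payload() {
    jq --null-input --compact-output \
        --argjson t "${1:-60}" --argjson p "${2:-60}" \
        '{timeoutSeconds: $t, pauseSeconds: $p}'
}

# Read /api/ucp/etcd/info JSON on stdin; succeed only while
# DefragInProgress is true.
defrag_in_progress() {
    jq --exit-status '.DefragInProgress == true' > /dev/null
}

# Trigger a defragmentation, then poll until it is no longer in progress.
# Usage: run_defrag [timeoutSeconds] [pauseSeconds] [pollIntervalSeconds]
run_defrag() {
    local token info
    token=$(curl --silent --insecure --data "$(login_payload)" \
        "https://${MKE_HOST}/auth/login" | jq --raw-output .auth_token)
    # Stop if the login did not return a usable token.
    [ -n "$token" ] && [ "$token" != "null" ] || return 1
    curl --silent --fail --insecure -H "Authorization: Bearer ${token}" \
        --data "$(defrag_payload "$1" "$2")" \
        "https://${MKE_HOST}/api/ucp/etcd/defrag" || return 1
    while true; do
        info=$(curl --silent --fail --insecure \
            -H "Authorization: Bearer ${token}" \
            "https://${MKE_HOST}/api/ucp/etcd/info") || return 1
        printf '%s' "$info" | defrag_in_progress || break
        sleep "${3:-10}"
    done
    # Print the final result string reported by the info endpoint.
    printf '%s' "$info" | jq --raw-output .DefragResult
}
```

For example, `run_defrag 120 60 15` would request a 120-second timeout per member and a 60-second pause between members, and would poll the info endpoint every 15 seconds. The script returns a non-zero status if any request fails, so a failed request is not mistaken for a completed defragmentation.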


To manually remove the etcd defragmentation lock file:

To maintain etcd cluster availability, MKE uses a lock file that prevents multiple defragmentations from running simultaneously. MKE removes the lock file when the defragmentation concludes; however, you can remove it manually if necessary.

Manually remove the lock file by running the following command:

docker exec ucp-controller rm /var/lock/etcd-defrag