Apply etcd defragmentation

Available since MKE 3.5.4

The etcd distributed key-value store retains a history of its keyspace. That history is compacted after a specified number of revisions; however, the space that compaction frees is released back to the host filesystem only after defragmentation. For more information, refer to the etcd documentation.

With MKE, you can defragment the etcd cluster without causing a cluster outage. MKE applies defragmentation to the etcd members one at a time and defragments the current etcd leader last, to avoid triggering multiple leader elections.
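
The ordering described above can be illustrated with a short shell sketch. It is not MKE's implementation, which this topic does not show. It only reorders the member list returned by the /api/ucp/etcd/info endpoint (described later in this topic) so that the leader comes last. The function name defrag_order is hypothetical, and the sketch assumes that jq is installed, as in the other examples in this topic.

```shell
# Hypothetical helper: read the JSON returned by /api/ucp/etcd/info on stdin
# and print the member endpoints with followers first and the leader last.
defrag_order() {
    # jq orders false before true, so sorting on IsLeader moves the leader
    # to the end of the list.
    jq --raw-output '.MemberInfo | sort_by(.IsLeader) | .[].Endpoint'
}
```

Piping the response of the info endpoint into defrag_order lists the follower endpoints first and the endpoint of the current leader last.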


To defragment the etcd cluster:

  1. Trigger the etcd cluster defragmentation by issuing a POST to the https://MKE_HOST/api/ucp/etcd/defrag endpoint.

    You can specify two parameters:

    timeoutSeconds

    Sets how long MKE waits for each member to finish defragmentation. Default: 60 seconds. MKE cancels the defragmentation if the timeout elapses before the member defragmentation completes.

    pauseSeconds

    Sets how long MKE waits between member defragmentations. Default: 60 seconds.

    Mirantis recommends that you adjust these parameters based on the size of the etcd database and the amount of time that has elapsed since the last defragmentation.

    Example command:

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/defrag --data '{"timeoutSeconds": 60, "pauseSeconds": 60}'
    

    Example response:

    "Cluster Defragmentation Initiated"
    
  2. Review the state of individual etcd cluster members and the state of the cluster defragmentation by running the following command:

    AUTHTOKEN=$(curl --silent --insecure --data '{"username":"<username>","password":"<password>"}' https://MKE_HOST/auth/login | jq --raw-output .auth_token)
    curl --insecure -H "Authorization: Bearer $AUTHTOKEN" https://MKE_HOST/api/ucp/etcd/info
    

    Example output:

    {
        "DefragInProgress": true,
        "DefragResult": "Cluster Defrag Initiated",
        "MemberInfo": [
            {
                "MemberID": 5051939019959384922,
                "Endpoint": "https://172.31.21.33:12379",
                "EtcdVersion": "3.4.16",
                "DbSize": "2 MB",
                "IsLeader": true,
                "Alarms": null
            },
            {
                "MemberID": 10749614093923491478,
                "Endpoint": "https://172.31.30.179:12379",
                "EtcdVersion": "3.4.16",
                "DbSize": "2 MB",
                "IsLeader": false,
                "Alarms": null
            },
            {
                "MemberID": 7837950661722744517,
                "Endpoint": "https://172.31.30.44:12379",
                "EtcdVersion": "3.4.16",
                "DbSize": "2 MB",
                "IsLeader": false,
                "Alarms": null
            }
        ]
    }
    

    You can monitor this endpoint until the defragmentation is complete. The information is also available in the ucp-controller logs.
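
The two steps above can be combined into one script that triggers the defragmentation with tuned parameters and then polls the info endpoint until DefragInProgress is no longer true. The following is a minimal sketch, not an official MKE tool. The variables MKE_HOST, MKE_USER, MKE_PASSWORD, and POLL_INTERVAL and all of the function names are placeholders introduced for this example. The run-time estimate assumes one timeout per member and one pause between consecutive members, which this topic does not state explicitly. The polling loop also assumes that DefragInProgress is already true by the time the POST request returns.

```shell
# Sketch only: MKE_HOST, MKE_USER, MKE_PASSWORD, POLL_INTERVAL, and the
# function names below are placeholders introduced for this example.

# Build the JSON request body for the defrag endpoint.
defrag_payload() {
    local timeout_seconds=$1 pause_seconds=$2
    printf '{"timeoutSeconds": %d, "pauseSeconds": %d}' "$timeout_seconds" "$pause_seconds"
}

# Rough upper bound on the total run time, assuming one timeout per member
# and one pause between consecutive members (an assumption).
worst_case_seconds() {
    local members=$1 timeout_seconds=$2 pause_seconds=$3
    echo $(( members * timeout_seconds + (members - 1) * pause_seconds ))
}

# Succeed while the info JSON on stdin reports a defragmentation in progress.
defrag_in_progress() {
    jq --exit-status '.DefragInProgress == true' > /dev/null
}

# Log in and print a bearer token. jq builds the body so that special
# characters in the credentials are escaped correctly.
mke_token() {
    jq --null-input --compact-output \
        --arg u "$MKE_USER" --arg p "$MKE_PASSWORD" \
        '{username: $u, password: $p}' |
        curl --silent --insecure --data @- "https://${MKE_HOST}/auth/login" |
        jq --raw-output .auth_token
}

# Trigger the defragmentation, then poll the info endpoint until it finishes.
run_defrag() {
    local timeout_seconds=${1:-60} pause_seconds=${2:-60} token info
    token=$(mke_token)
    curl --silent --insecure -H "Authorization: Bearer ${token}" \
        --data "$(defrag_payload "$timeout_seconds" "$pause_seconds")" \
        "https://${MKE_HOST}/api/ucp/etcd/defrag"
    echo
    while true; do
        # --fail makes curl return non-zero on HTTP errors, so a failed
        # request is not mistaken for a finished defragmentation.
        info=$(curl --silent --fail --insecure \
            -H "Authorization: Bearer ${token}" \
            "https://${MKE_HOST}/api/ucp/etcd/info") || return 1
        if ! printf '%s\n' "$info" | defrag_in_progress; then
            printf '%s\n' "$info" | jq --raw-output .DefragResult
            return 0
        fi
        sleep "${POLL_INTERVAL:-10}"
    done
}
```

With the placeholder variables exported, run_defrag 120 30 would request a 120-second timeout and a 30-second pause. Under the stated assumption, worst_case_seconds 3 60 60 gives 300 seconds for a three-member cluster with the default values, which can help when choosing how long to keep polling.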


To manually remove the etcd defragmentation lock file:

To maintain etcd cluster availability, MKE uses a lock file to prevent multiple defragmentations from running at the same time. MKE removes the lock file when defragmentation concludes; however, you can remove it manually if necessary.

Manually remove the lock file by running the following command:

docker exec ucp-controller rm /var/lock/etcd-defrag
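
The removal can also be wrapped in a small guard that reports whether a lock file was actually present. This is a sketch, not part of MKE. The function name remove_defrag_lock is hypothetical, and the sketch assumes that the test utility is available inside the ucp-controller container.

```shell
# Hypothetical helper: remove the defragmentation lock file only if it exists.
LOCK_FILE=/var/lock/etcd-defrag

remove_defrag_lock() {
    # Assumes the "test" utility exists in the ucp-controller container.
    if docker exec ucp-controller test -e "$LOCK_FILE"; then
        docker exec ucp-controller rm "$LOCK_FILE" && echo "Removed $LOCK_FILE"
    else
        echo "No lock file found at $LOCK_FILE"
    fi
}
```

Calling remove_defrag_lock prints which of the two cases applied, which makes it clearer whether a leftover lock file was present.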