etcd maintenance service
To facilitate etcd maintenance, MKE 4k deploys the etcd maintenance host service on all controller nodes. This service is currently exposed only on localhost.
To configure the etcd maintenance service, edit the etcd section of the
mke4.yaml configuration file:
etcd:
  maintenanceService:
    httpPort: <desired_http_port>
    grpcPort: <desired_grpc_port>
Users communicate with the etcd maintenance service through the HTTP port, which defaults to 18088. The service uses the gRPC port, which defaults to 5557, for internal communication.
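As a sketch, the following fragment sets both ports explicitly to the default values stated above. It assumes the etcd section appears exactly as in the template; any enclosing keys in your mke4.yaml file are omitted here.

```yaml
etcd:
  maintenanceService:
    httpPort: 18088  # default HTTP port, used for user communication
    grpcPort: 5557   # default gRPC port, used for internal communication
```

Setting the ports explicitly is optional if the defaults suit your environment.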
The etcd maintenance service provides automated maintenance operations for the underlying etcd database in your cluster. Detailed in the table below, these operations serve to prevent etcd from running out of storage space and help maintain optimal cluster performance.
| Operation | Detail |
|---|---|
| Kubernetes event cleanup and etcd compaction | |
| etcd defragmentation | |
| Maintenance operation scheduling | You can schedule the cleanup and compaction operation and the etcd defragmentation operation to run automatically at specific times. The etcd maintenance service runs on all control plane nodes and ensures only one maintenance operation is run at a time. |
Best Practices
- Schedule etcd maintenance for periods when cluster usage is minimal, such as weekends or early morning hours.
- Run etcd maintenance at weekly or monthly intervals. The minimum allowed interval is 72 hours (3 days), and daily schedules are not permitted.
- Enable both the cleanup and the defragmentation operations to ensure optimal etcd health.
- Monitor the first few maintenance runs to verify successful completion.
- Set appropriate timeout intervals. Configure the defragTimeoutSeconds parameter based on your cluster size and etcd database size, taking into account that larger clusters may need longer timeout intervals.
- Retain recent events with minTTLToKeepSeconds, as needed for troubleshooting. For instance, 86400 for 24 hours.
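The two parameters named in the last two recommendations might be set as in the following fragment. This is only a sketch: this section does not show where these keys belong in the configuration, so they appear here without any enclosing structure, and the timeout is left as a placeholder rather than a recommended value.

```yaml
# Key placement is an assumption; only the parameter names and the
# 86400 example value come from the best practices above.
defragTimeoutSeconds: <timeout_in_seconds>  # allow more time for larger clusters and databases
minTTLToKeepSeconds: 86400                  # 24 h x 60 min x 60 s = 86400 seconds
```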