mirantis/dtr emergency-repair
Recover MSR from loss of quorum.
Usage
docker run -it --rm mirantis/dtr \
emergency-repair [command options]
Description
The emergency-repair command repairs an MSR cluster that has lost quorum by reverting your cluster to a single MSR replica.
There are three actions you can take to recover an unhealthy MSR cluster:
If the majority of replicas are healthy, remove the unhealthy nodes from the cluster, and join new ones for high availability.
If the majority of replicas are unhealthy, use the emergency-repair command to revert your cluster to a single MSR replica.
If you cannot repair your cluster to a single replica, you must restore from an existing backup, using the restore command.
When you run this command, an MSR replica of your choice is repaired and turned into the only replica in the whole MSR cluster. The containers for all the other MSR replicas are stopped and removed. When using the force option, the volumes for these replicas are also deleted.
After repairing the cluster, you should use the join command to add more MSR replicas for high availability.
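As a rough illustration of the repair step described above, the following sketch assembles an emergency-repair invocation and prints it for review before anything is run. The URL, username, and replica ID are placeholder values, and the flag spellings (--ucp-url, --ucp-username, --existing-replica-id) are assumed from the DTR command line rather than taken from this page, so check them against the command's help output for your MSR version.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration); replace with your own.
MKE_URL="https://mke.example.com:443"   # hypothetical MKE URL with port
MKE_USER="admin"                        # hypothetical MKE administrator name
REPLICA_ID="5eb9459a7832"               # hypothetical ID of the replica to keep

# Build the emergency-repair command line.
# By default the command is only printed; set RUN=1 to execute it.
emergency_repair() {
  # Flag names are assumed, not confirmed by this page.
  set -- docker run -it --rm mirantis/dtr emergency-repair \
    --ucp-url "$MKE_URL" \
    --ucp-username "$MKE_USER" \
    --existing-replica-id "$REPLICA_ID"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

emergency_repair
```

Running the script with RUN=1 executes the command instead of printing it. The password is left out on purpose, on the assumption that the interactive session prompts for any required value that is missing.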
Options
| Option | Environment variable | Description |
|---|---|---|
| | $DEBUG | Enable debug mode for additional logs. |
| | $MSR_REPLICA_ID | The ID of an existing MSR replica. To add, remove or modify MSR, you must connect to the database of an existing healthy replica. |
| | $MSR_EXTENDED_HELP | Display extended help text for a given command. |
| | $NOCOLOR | Disable output coloring in logs. |
| | $MSR_OVERLAY_SUBNET | The subnet used by the dtr-ol overlay network. Example: |
| | $PRUNE | Delete the data volumes of all unhealthy replicas. With this option, the volume of the MSR replica you’re restoring is preserved but the volumes for all other replicas are deleted. This has the same result as completely uninstalling MSR from those replicas. |
| | $UCP_CA | Use a PEM-encoded TLS CA certificate for MKE. Download the MKE TLS CA certificate from |
| | $UCP_INSECURE_TLS | Disable TLS verification for MKE. The installation uses TLS but always trusts the TLS certificate used by MKE, which can lead to man-in-the-middle attacks. For production deployments, use |
| | $UCP_PASSWORD | The MKE administrator password. |
| | $UCP_URL | The MKE URL including domain and port. |
| | $UCP_USERNAME | The MKE administrator username. |
| | $YES | Answer yes to any prompts. |
| | $MAX_WAIT | The maximum amount of time MSR allows an operation to complete within. This is frequently used to allocate more startup time to very large MSR databases. The value is a Golang duration string. For example, |