Virtual machine node stops responding

If one of the control plane VM nodes stops responding, you may need to redeploy it.

Workaround:

  1. From the physical node where the target VM is located, get a list of the VM domain IDs and VM names:

    virsh list
    
  2. Destroy the target VM (forces an ungraceful power-off of the VM):

    virsh destroy DOMAIN_ID
    
  3. Undefine the VM (removes the VM configuration from KVM):

    virsh undefine VM_NAME
    
  4. Verify that your physical KVM node has the correct salt-common and salt-minion versions:

    apt-cache policy salt-common
    apt-cache policy salt-minion
    

    Note

    If the salt-common and salt-minion versions are not 2015.8, follow the procedure "Install the correct versions of salt-common and salt-minion" before you continue.

  5. Redeploy the VM from the physical node meant to host the VM:

    salt-call state.sls salt.control
    
  6. Verify that the newly deployed VM is listed in the Salt keys:

    salt-key
    
  7. Deploy the Salt states to the node:

    salt 'HOST_NAME*' state.sls linux,ntp,openssh,salt
    
  8. Deploy service states to the node:

    salt 'HOST_NAME*' state.sls keepalived,haproxy,SPECIFIC_SERVICES
    

    Note

    If the states fail when applied remotely, you may need to log in to the node itself and run them locally, which tends to succeed more often.
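
The procedure above can be sketched as a small set of shell helpers. This is a minimal illustration, not part of the product tooling: the function names (installed_version, is_expected_salt_version, print_redeploy_commands) are hypothetical, and the version check assumes the usual "Installed:" line in apt-cache policy output. The last helper only prints the commands from steps 2 through 8 so that you can review them; it does not run them.

```shell
#!/bin/sh
# Sketch only: all function names below are hypothetical helpers.

# Read `apt-cache policy PACKAGE` output on stdin and print the
# version found on its "Installed:" line (assumed output format).
installed_version() {
    awk '$1 == "Installed:" { print $2; exit }'
}

# Return success if the given version string belongs to the 2015.8 series.
is_expected_salt_version() {
    case "$1" in
        2015.8|2015.8[.+~-]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print, without executing, the commands from steps 2-8 in order.
# Arguments: DOMAIN_ID VM_NAME HOST_NAME SPECIFIC_SERVICES
print_redeploy_commands() {
    domain_id=$1
    vm_name=$2
    host_name=$3
    services=$4
    printf '%s\n' \
        "virsh destroy ${domain_id}" \
        "virsh undefine ${vm_name}" \
        "salt-call state.sls salt.control" \
        "salt-key" \
        "salt '${host_name}*' state.sls linux,ntp,openssh,salt" \
        "salt '${host_name}*' state.sls keepalived,haproxy,${services}"
}
```

For example, `apt-cache policy salt-minion | installed_version` prints the installed version, which you can pass to is_expected_salt_version before redeploying. Replace the arguments of print_redeploy_commands with the values reported by virsh list for your VM.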