Reschedule stateful applications

Note

The procedure applies to MOSK clusters running the MOSK 23.3 series or earlier versions. Starting from MOSK 24.1, MOSK performs the rescheduling of stateful applications automatically.

The rescheduling of stateful applications may be required when replacing a permanently failed node, decommissioning a node, migrating applications to nodes with a more suitable set of hardware, and in several other use cases.

MOSK deployment profiles include the following stateful applications:

  • OpenStack database (MariaDB)

  • OpenStack coordination (etcd)

  • OpenStack Time Series Database backend (Redis)

Each stateful application from the list above has a persistent volume claim (PVC) based on a local persistent volume per pod. Each control plane node has a set of local volumes available. To migrate an application pod to another node, recreate its PVC with a persistent volume from the target node.

Caution

A stateful application pod can only be migrated to a node that does not contain other pods of this application.

Caution

When a PVC is removed, all data present in the related persistent volume is removed from the node as well.
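
To review the current PVCs of the stateful applications and the local persistent volumes available in the cluster, you can use, for example, the following commands; the output depends on your deployment:

  kubectl -n openstack get pvc
  kubectl -n openstack-redis get pvc
  kubectl get pv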

Reschedule pods to another control plane node

This section describes how to reschedule pods for MariaDB, etcd, and Redis to another control plane node.
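
The procedures below refer to the application pods as <STATEFULSET-NAME>-<NUMBER>. To identify the StatefulSet names used in your deployment, you can list them, for example:

  kubectl -n openstack get sts
  kubectl -n openstack-redis get sts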

Reschedule pods for MariaDB

Important

Perform the pod rescheduling if you have to move a PVC to another node and the current node is still present in the cluster. If the current node has already been removed, MOSK reschedules the pods automatically once a node with the required labels is present in the cluster.

  1. Recreate PVCs as described in Recreate a PVC on another control plane node.

  2. Remove the pod:

    Note

    To remove a pod from a node in the NotReady state, add --grace-period=0 --force to the following command.

    kubectl -n openstack delete pod <STATEFULSET-NAME>-<NUMBER>
    
  3. Wait until the pod appears in the Ready state.

    When the rescheduling is finalized, the <STATEFULSET-NAME>-<NUMBER> pod rejoins the Galera cluster with a clean MySQL data directory and requests the Galera state transfer from the available nodes.
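
    For example, to wait for the pod to become Ready, you can use the kubectl wait command; the timeout value below is illustrative:

    kubectl -n openstack wait --for=condition=Ready pod <STATEFULSET-NAME>-<NUMBER> --timeout=600s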

Reschedule pods for Redis

Important

Perform the pod rescheduling if you have to move a PVC to another node and the current node is still present in the cluster. If the current node has already been removed, MOSK reschedules the pods automatically once a node with the required labels is present in the cluster.

  1. Recreate PVCs as described in Recreate a PVC on another control plane node.

  2. Remove the pod:

    Note

    To remove a pod from a node in the NotReady state, add --grace-period=0 --force to the following command.

    kubectl -n openstack-redis delete pod <STATEFULSET-NAME>-<NUMBER>
    
  3. Wait until the pod is in the Ready state.
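
    For example, to verify that the pod is Ready and has been scheduled to the intended node:

    kubectl -n openstack-redis get pod <STATEFULSET-NAME>-<NUMBER> -o wide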

Reschedule pods for etcd

Warning

During the etcd rescheduling procedure, a short downtime of the etcd cluster is expected.

  1. Before MOSK 23.1:

    1. Identify the etcd replica ID, which is the numeric suffix in the pod name. For example, the ID of the etcd-etcd-0 pod is 0. This ID is required later in the procedure.

      kubectl -n openstack get pods | grep etcd
      

      Example of a system response:

      etcd-etcd-0                    0/1     Pending                 0          3m52s
      etcd-etcd-1                    1/1     Running                 0          39m
      etcd-etcd-2                    1/1     Running                 0          39m
      
    2. If the replica ID is 1 or higher:

      1. Add the coordination section to the spec.services section of the OsDpl object:

        spec:
          services:
            coordination:
              etcd:
                values:
                  conf:
                    etcd:
                      ETCD_INITIAL_CLUSTER_STATE: existing
        
      2. Wait until the etcd StatefulSet is updated with the new state parameter:

        kubectl -n openstack get sts etcd-etcd -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="ETCD_INITIAL_CLUSTER_STATE")].value}'
        
  2. Scale down the etcd StatefulSet to 0 replicas. Verify that no replicas are running on the failed node.

    kubectl -n openstack scale sts etcd-etcd --replicas=0
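
    To verify that no etcd pods remain on the node, you can run, for example:

    kubectl -n openstack get pods -o wide | grep etcd-etcd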
    
  3. Select from the following options:

    • If the current node is still present in the cluster and the PVC should be moved to another node, recreate the PVC as described in Recreate a PVC on another control plane node.

    • If the current node has been removed, remove the PVC related to the etcd replica of the failed node:

      kubectl -n <NAMESPACE> delete pvc <PVC-NAME>
      

      The PVC will be recreated automatically after the etcd StatefulSet is scaled to the initial number of replicas.
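
      If you are not sure which PVC belongs to the failed replica, list the PVCs in the namespace. The name of a PVC created from a StatefulSet volume claim template typically ends with the pod name, that is, with <STATEFULSET-NAME>-<NUMBER>:

      kubectl -n openstack get pvc | grep etcd-etcd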

  4. Scale the etcd StatefulSet to the initial number of replicas:

    kubectl -n openstack scale sts etcd-etcd --replicas=<NUMBER-OF-REPLICAS>
    
  5. Wait until all etcd pods are in the Ready state.
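
    For example, assuming three replicas as in the output above, you can wait for each pod explicitly; adjust the replica IDs and the timeout to your environment:

    for i in 0 1 2; do kubectl -n openstack wait --for=condition=Ready pod etcd-etcd-$i --timeout=600s; done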

  6. Verify that the etcd cluster is healthy:

    kubectl -n openstack exec -t etcd-etcd-1 -- etcdctl -w table endpoint --cluster status
    
  7. Before MOSK 23.1, if the replica ID is 1 or higher:

    1. Remove the coordination section from the spec.services section of the OsDpl object.

    2. Wait until all etcd pods appear in the Ready state.

    3. Verify that the etcd cluster is healthy:

      kubectl -n openstack exec -t etcd-etcd-1 -- etcdctl -w table endpoint --cluster status
      

Recreate a PVC on another control plane node

This section describes how to recreate a PVC of a stateful application on another control plane node.

To recreate a PVC on another control plane node:

  1. Select one of the persistent volumes available on the target node. The following command prints the names of the Available local volumes on that node; use one of them as <PV-NAME> in step 4:

    Caution

    A stateful application pod can only be migrated to a node that does not contain other pods of this application.

    NODE_NAME=<NODE-NAME>
    STORAGE_CLASS=$(kubectl -n openstack get osdpl <OSDPL_OBJECT_NAME> -o jsonpath='{.spec.local_volume_storage_class}')
    kubectl -n openstack get pv -o json | \
      jq -r --arg NODE_NAME "$NODE_NAME" --arg STORAGE_CLASS "$STORAGE_CLASS" \
      '.items[]
       | select(.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0] == $NODE_NAME
                and .spec.storageClassName == $STORAGE_CLASS
                and .status.phase == "Available")
       | .metadata.name'
    
  2. As the new PVC should contain the same parameters as the deleted one except for volumeName, save the old PVC configuration in YAML:

    kubectl -n <NAMESPACE> get pvc <PVC-NAME> -o yaml > <OLD-PVC>.yaml
    

    Note

    <NAMESPACE> is the Kubernetes namespace where the PVC is created. For Redis, specify openstack-redis; for other applications, specify openstack.
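
    The storage, storageClassName, and namespace values from this file are required in step 4. For example, to display them with a simple grep:

    grep -E 'storage:|storageClassName:|namespace:' <OLD-PVC>.yaml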

  3. Delete the old PVC:

    kubectl -n <NAMESPACE> delete pvc <PVC-NAME>
    

    Note

    If a PVC is stuck in the Terminating state, run kubectl -n <NAMESPACE> edit pvc <PVC-NAME> and remove the finalizers section from the PVC metadata.
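
    Alternatively, the finalizers can be removed with a single patch command, for example:

    kubectl -n <NAMESPACE> patch pvc <PVC-NAME> -p '{"metadata":{"finalizers":null}}'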

  4. Create a PVC with a new persistent volume:

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: <PVC-NAME>
      namespace: <NAMESPACE>
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: <STORAGE-SIZE>
      storageClassName: <STORAGE-CLASS>
      volumeMode: Filesystem
      volumeName: <PV-NAME>
    EOF
    

    Caution

    <STORAGE-SIZE>, <STORAGE-CLASS>, and <NAMESPACE> should correspond to the storage, storageClassName, and namespace values from the <OLD-PVC>.yaml file with the old PVC configuration.
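
    After the new PVC is created, you can verify that it references the selected persistent volume, for example:

    kubectl -n <NAMESPACE> get pvc <PVC-NAME>

    Depending on the volume binding mode of the storage class, the claim may show as Bound immediately or remain Pending until a pod starts consuming it.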