Tungsten Fabric Controller maintenance API

The Tungsten Fabric (TF) Controller creates and uses both types of workload locks: ClusterWorkloadLock and NodeWorkloadLock.

Maintenance workflow

When the ClusterMaintenanceRequest object is created, the TF Controller verifies the TF cluster health status and proceeds as follows:

  • If the cluster is Ready, the TF Controller moves the ClusterWorkloadLock object to the inactive state.

  • Otherwise, the TF Controller keeps the ClusterWorkloadLock object in the active state.

When the NodeMaintenanceRequest object is created, the TF Controller verifies the vRouter pod state on the corresponding node and proceeds as follows:

  • If all containers are Ready, the TF Controller moves the NodeWorkloadLock object to the inactive state.

  • Otherwise, the TF Controller keeps the NodeWorkloadLock object in the active state.

Note

If the cluster already contains a NodeWorkloadLock object in the inactive state, the TF Controller does not process NodeMaintenanceRequest objects for other nodes until that inactive NodeWorkloadLock object becomes active again.
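
The node-level decision and the restriction from the note can be sketched in Python as follows. This is an illustrative model only, not the TF Controller implementation: the names LockState, NodeLock, and handle_node_maintenance_request are hypothetical, and the vRouter readiness check is reduced to a boolean argument.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class LockState(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


@dataclass
class NodeLock:
    """Hypothetical in-memory stand-in for a NodeWorkloadLock object."""
    node: str
    state: LockState = LockState.ACTIVE


def handle_node_maintenance_request(
    node: str,
    locks: Dict[str, NodeLock],
    vrouter_containers_ready: bool,
) -> LockState:
    """Return the lock state of `node` after a NodeMaintenanceRequest is seen."""
    # Rule from the note: an inactive lock on any other node blocks processing.
    blocked = any(
        lock.state is LockState.INACTIVE
        for name, lock in locks.items()
        if name != node
    )
    if blocked:
        return locks[node].state
    # All vRouter containers Ready: release the node for maintenance.
    if vrouter_containers_ready:
        locks[node].state = LockState.INACTIVE
    # Otherwise the lock stays active.
    return locks[node].state
```

In this model, a second node is released only after the first node's lock has returned to the active state, which matches the one-node-at-a-time behavior described in the note.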

When the cluster LCM removes the MaintenanceRequest object, the TF Controller waits for the vRouter pods to become ready and proceeds as follows:

  • If all containers are in the Ready state, the TF Controller moves the NodeWorkloadLock object to the active state.

  • Otherwise, the TF Controller keeps the NodeWorkloadLock object in the inactive state.
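
The wait-then-decide behavior above can be pictured as a simple polling loop. The sketch below is an assumption-laden illustration: the function name release_node_lock, the attempts limit, and the delay_s interval are hypothetical and do not come from the TF Controller, and the readiness probe is passed in as a plain callable.

```python
import time
from typing import Callable


def release_node_lock(
    containers_ready: Callable[[], bool],
    attempts: int = 5,      # hypothetical retry budget
    delay_s: float = 0.0,   # hypothetical pause between checks
) -> str:
    """Poll vRouter readiness and return the resulting lock state."""
    for _ in range(attempts):
        if containers_ready():
            # All containers are Ready: the lock moves back to active.
            return "active"
        time.sleep(delay_s)
    # Containers never became Ready: the lock stays inactive.
    return "inactive"
```

For example, a probe that reports not ready twice and then ready yields "active", while a probe that never reports ready leaves the lock "inactive".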

Maintenance progress tracking logic

The Tungsten Fabric (TF) Operator monitors maintenance progress for both cluster-wide and node-level operations. The operator automatically calculates progress and displays it as a percentage based on the number of nodes that have successfully completed the maintenance lifecycle.

Maintenance progress is reported through two custom resources: ClusterWorkloadLock and NodeWorkloadLock.

The ClusterWorkloadLock provides the high-level update status across the entire cluster:

  • Formula:

    Percentage inactive = (inactive_nodes / total_enabled_nodes) * 100

    If new nodes are added during maintenance, the TF Operator detects the additional NodeWorkloadLock resources during its next iteration.

  • Status format: X out of Y nodes updated.

  • Update trigger: The value updates automatically as node states transition during the maintenance window.
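
As a worked example of the formula and status format above, the following Python sketch computes both values from a list of node lock states. It assumes that every NodeWorkloadLock corresponds to one enabled node and that an empty cluster reports 0%; both are assumptions made for illustration, and cluster_progress is a hypothetical helper name.

```python
from typing import List, Tuple


def cluster_progress(lock_states: List[str]) -> Tuple[float, str]:
    """Return (percentage inactive, status string) for the given lock states."""
    total = len(lock_states)  # assumed equal to total_enabled_nodes
    inactive = sum(1 for state in lock_states if state == "inactive")
    # Guard against division by zero (assumption: no nodes means 0%).
    percentage = (inactive / total) * 100 if total else 0.0
    return percentage, f"{inactive} out of {total} nodes updated"


# Two of four nodes are inactive.
print(cluster_progress(["inactive", "active", "inactive", "active"]))
# Four new nodes join mid-maintenance: the total grows, so the percentage drops.
print(cluster_progress(["inactive", "active", "inactive", "active"] + ["active"] * 4))
```

The second call shows why the percentage can decrease during a maintenance window: newly detected locks enlarge the denominator before any of them becomes inactive.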

The NodeWorkloadLock provides granular statuses for individual nodes:

NodeWorkloadLock progress statuses

Percentage   Status     Meaning
0%           Active     Update is currently in progress
100%         Inactive   Update completed successfully
0%           Failed     Update encountered an error and stopped
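
The status table can be encoded as a small lookup, sketched here in Python. The names NODE_PROGRESS and describe_node_progress are hypothetical and used only for illustration.

```python
# Status -> (percentage, meaning), copied from the table above.
NODE_PROGRESS = {
    "Active": (0, "Update is currently in progress"),
    "Inactive": (100, "Update completed successfully"),
    "Failed": (0, "Update encountered an error and stopped"),
}


def describe_node_progress(status: str) -> str:
    """Format one table row as a single line of text."""
    percentage, meaning = NODE_PROGRESS[status]
    return f"{percentage}% ({status}): {meaning}"
```

Because both Active and Failed report 0%, the percentage alone cannot distinguish an update that is in progress from one that has stopped with an error, so the lookup is keyed by status.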