Update notes

This section describes the specific actions that you, as a cloud operator, need to complete to accurately plan and successfully perform the update of your Mirantis OpenStack for Kubernetes (MOSK) cluster to version 23.1. Consider this information a supplement to the generic update procedure published in Operations Guide: Update a MOSK cluster.

Features

The MOSK cluster will obtain the newly implemented capabilities automatically with no significant impact on the update procedure.

Major component versions update

As part of the update to MOSK 23.1, Tungsten Fabric will automatically get updated from version 2011 to version 21.4.

Note

For the compatibility matrix of the most recent MOSK releases and their major components in conjunction with Container Cloud and Cluster releases, refer to Release Compatibility Matrix.

Update impact and maintenance windows planning

The update to MOSK 23.1 does not include any version-specific impact on the cluster. To start planning a maintenance window, use the Operations Guide: Update a MOSK cluster standard procedure.

Known issues during the update

Before updating the cluster, be sure to review the potential issues that may arise during the process and the recommended solutions to address them, as outlined in Cluster update known issues.

Pre-update actions

Update the baremetal-provider image to 1.37.18

If your Container Cloud management cluster has been updated to 2.24.1, replace the 1.37.15 tag of the baremetal-provider image with 1.37.18 to avoid the issue where the cluster update to MOSK 23.1 gets stuck waiting for the lcm-agent to update the currentDistribution field:

  1. Open the kaasrelease object for editing:

    kubectl edit kaasrelease kaas-2-24-1
    
  2. Replace the 1.37.15 tag with 1.37.18 for the baremetal-provider image:

    - chartURL: core/helm/baremetal-provider-1.37.15.tgz
      helmV3: true
      name: baremetal-provider
      namespace: kaas
      values:
        cluster_api_provider_baremetal:
          image:
            tag: 1.37.18
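
After saving the change, you can verify that the provider picked up the new tag. The command below is a minimal sketch that assumes the provider runs as a Deployment named baremetal-provider in the kaas namespace of the management cluster; adjust the names if your environment differs:

# Verify the image tag used by the baremetal-provider Deployment
kubectl -n kaas get deployment baremetal-provider \
  -o jsonpath='{.spec.template.spec.containers[*].image}{"\n"}'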
    

Explicitly define the OIDCClaimDelimiter parameter

MOSK 23.1 introduces a new default value for the OIDCClaimDelimiter parameter, which defines the delimiter to use when setting multi-valued claims in the HTTP headers. See the MOSK 23.1 OpenStack API Reference for details.

Previously, the value of the OIDCClaimDelimiter parameter defaulted to ",". This value did not align with the behavior expected by Keystone and, as a result, forced the cloud operator to write more complex rules when creating federation mappings for Keystone. Therefore, in MOSK 22.4, Mirantis announced the change of the default value of the OIDCClaimDelimiter parameter.

If your deployment is affected and you have not yet explicitly defined the OIDCClaimDelimiter parameter as Mirantis advised after the update to MOSK 22.4 or 22.5, do so now. Otherwise, you may encounter unforeseen consequences after the update to MOSK 23.1.

Affected deployments

Proceed with the instruction below only if the following conditions are true:

  • Keystone is set to use federation through the OpenID Connect protocol, with Mirantis Container Cloud Keycloak in particular. The following configuration is present in your OpenStackDeployment custom resource:

    kind: OpenStackDeployment
    spec:
      features:
        keystone:
          keycloak:
            enabled: true
    
  • No value has already been specified for the OIDCClaimDelimiter parameter in your OpenStackDeployment custom resource.
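
To quickly check whether the parameter is already set, you can search the OpenStackDeployment object for it. A minimal sketch, assuming the object resides in the openstack namespace and is accessible through the osdpl resource name:

# No output means OIDCClaimDelimiter has never been set explicitly
kubectl -n openstack get osdpl -o yaml | grep -i OIDCClaimDelimiter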

To facilitate smooth transition of the existing deployments to the new default value, explicitly define the OIDCClaimDelimiter parameter as follows:

kind: OpenStackDeployment
spec:
  features:
    keystone:
      keycloak:
        oidc:
          OIDCClaimDelimiter: ","
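
You can add this setting either by editing the object or with a single patch command. The following is a sketch that assumes your OpenStackDeployment object is named osh-dev; substitute your own object name:

# Pin the current default explicitly before the update
kubectl -n openstack patch osdpl osh-dev --type merge \
  -p '{"spec":{"features":{"keystone":{"keycloak":{"oidc":{"OIDCClaimDelimiter":","}}}}}}'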

Note

The new default value for the OIDCClaimDelimiter parameter is ";". To find out whether your Keystone mappings will need adjustment after changing the default value, set the parameter to ";" on your staging environment and verify the rules.
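
As an illustration only, consider a hypothetical multi-valued groups claim; the actual claim and attribute names depend on your identity provider and OIDC configuration:

# A groups claim of ["admins", "members"] reaches Keystone as a single attribute value:
#   OIDCClaimDelimiter: ","  ->  "admins,members"   (one opaque value, complex mapping rules required)
#   OIDCClaimDelimiter: ";"  ->  "admins;members"   (Keystone splits on ";" into two separate values)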

Verify Ceph configuration

Verify that the KaaSCephCluster custom resource does not contain the following entries. If they exist, remove them.

  • In the spec.cephClusterSpec section, the external section.

    Caution

    If the external section exists in the KaaSCephCluster spec during the upgrade to MOSK 23.1, it causes a Ceph outage that corrupts the file systems of the Cinder volumes and requires a lot of routine work to repair the affected volumes one by one after the Ceph outage is fixed.

    Therefore, make sure that the external section is removed from the KaaSCephCluster spec right before starting the cluster update.

  • In the spec.cephClusterSpec.rookConfig section, the ms_crc_data or ms crc data configuration key. After you remove the key, wait for rook-ceph-mon pods to restart on the MOSK cluster.

    Caution

    If the ms_crc_data key exists in the rookConfig section of KaaSCephCluster during the upgrade to MOSK 23.1, the connection between the Rook Operator and Ceph Monitors is lost during the Ceph version upgrade. This leads to a stuck upgrade and requires that you manually disable the ms_crc_data key for all Ceph Monitors.

    Therefore, make sure that the ms_crc_data key is removed from the KaaSCephCluster spec right before starting the cluster update.
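
To check for both entries at once before the update, you can inspect the KaaSCephCluster object from the management cluster. A minimal sketch, assuming the object resides in the project (child) namespace of the MOSK cluster, referred to below as <project-namespace>:

# No output means neither the external section nor the ms_crc_data key is present
kubectl -n <project-namespace> get kaascephcluster -o yaml | grep -E 'external:|ms_crc_data|ms crc data'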

Disable Tempest

To prevent issues during the graceful reboot of the OpenStack controller nodes, temporarily remove Tempest from the OpenStackDeployment object by deleting the tempest entry from the list of services:

spec:
  features:
    services:
    - tempest
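
For example, you can remove the entry by editing the object directly; the sketch below assumes the OpenStackDeployment object is named osh-dev. You can add the entry back after the update completes.

# Open the object and delete the "- tempest" line from spec.features.services
kubectl -n openstack edit osdpl osh-dev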

Post-update actions

Remove sensitive information from cluster configuration

The OpenStackDeploymentSecret custom resource has been deprecated in MOSK 23.1. The fields that store confidential settings in the OpenStackDeploymentSecret and OpenStackDeployment custom resources need to be migrated to Kubernetes secrets.

Note

For the functionality deprecation and deletion schedule, refer to OpenStackDeploymentSecret custom resource.

The full list of the affected fields includes:

spec:
  features:
    ssl:
      public_endpoints:
        - ca_cert
        - api_cert
        - api_key
    barbican:
      backends:
        vault:
          - approle_role_id
          - approle_secret_id
          - ssl_ca_crt_file
    baremetal:
      ngs:
        hardware:
          *:
            - username
            - password
            - ssh_private_key
            - secret
    keystone:
      domain_specific_configuration:
        ...
        ks_domains:
          *:
            config:
              ...
              ldap:
                ...
                password: <password>
                user: <user-name>

After the update, migrate the fields mentioned above from OpenStackDeployment and OpenStackDeploymentSecret custom resources:

  1. Create a Kubernetes secret <osdpl-name>-hidden in the openstack namespace, either using the helper script or manually:

    Using the helper script: run osctl-move-sensitive-data from the openstack-controller pod:

    osctl-move-sensitive-data osh-dev --secret-name osh-dev-hidden

    Manually: create a Kubernetes secret with the content of the required fields from the OpenStackDeployment custom resource. For example (a kubectl-based alternative is also sketched after this procedure):

    apiVersion: v1
    kind: Secret
    metadata:
      name: osh-dev-hidden
      namespace: openstack
      labels:
        openstack.lcm.mirantis.com/osdpl_secret: 'true'
    type: Opaque
    data:
      ca_cert: ...
      api_cert: ...
      api_key: ...
    
  2. Add a reference to the secret in the appropriate fields of the OpenStackDeployment object. For example:

    spec:
      features:
       ssl:
         public_endpoints:
           api_cert:
             value_from:
               secret_key_ref:
                 key: api_cert
                 name: osh-dev-hidden
           api_key:
             value_from:
               secret_key_ref:
                 key: api_key
                 name: osh-dev-hidden
           ca_cert:
             value_from:
               secret_key_ref:
                 key: ca_cert
                 name: osh-dev-hidden
    
  3. If you used to store your sensitive information in the OpenStackDeploymentSecret object, remove it from your cluster configuration.
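
As an alternative to writing the Secret manifest by hand, you can build the same secret from local files with kubectl and then apply the label shown in the example above. The file names below are placeholders:

# Create the secret from local files (file names are examples only)
kubectl -n openstack create secret generic osh-dev-hidden \
  --from-file=ca_cert=./ca.crt \
  --from-file=api_cert=./api.crt \
  --from-file=api_key=./api.key

# Apply the same label as in the manifest example above
kubectl -n openstack label secret osh-dev-hidden \
  openstack.lcm.mirantis.com/osdpl_secret=true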

Change to default RAM oversubscription

To ensure stability for production workloads, MOSK 23.1 changes the default value of RAM oversubscription on compute nodes to 1.0, that is, no oversubscription. In MOSK 22.5 and earlier, the effective default RAM allocation ratio was 1.1.

This change applies only to the compute nodes added to the cloud after the update to MOSK 23.1. The effective RAM oversubscription value for the existing compute nodes will not change automatically after the update.

Therefore, Mirantis strongly recommends adjusting the oversubscription values to new defaults for existing compute nodes as well. For the procedure, refer to Change oversubscription settings for existing compute nodes.
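
To see the RAM oversubscription currently in effect for a given compute node, you can query the placement service. A sketch using the osc-placement CLI plugin, with a hypothetical hypervisor host name:

# Find the resource provider that represents the compute node
openstack resource provider list --name comp-01.example.org

# Show its inventory; the MEMORY_MB allocation_ratio is the effective RAM oversubscription
openstack resource provider inventory list <provider-uuid>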

Use dynamic configuration for resource oversubscription

Since MOSK 23.1, the Compute service (OpenStack Nova) enables you to control the resource oversubscription dynamically through the placement API.

However, if your cloud already makes use of custom allocation ratios, the new functionality will not become immediately available after the update. Any compute node configured with explicit values for the cpu_allocation_ratio, disk_allocation_ratio, and ram_allocation_ratio configuration options will continue to enforce those values in the placement service, so any changes made through the placement API will be overridden by the values set in those configuration options in the Compute service. To modify oversubscription on such nodes, adjust the values of these configuration options in the OpenStackDeployment custom resource. Perform this procedure with caution because modifying these values may result in Compute service restarts and potential disruptions to instance builds.

To enable the use of the new functionality, Mirantis recommends removing explicit values for the cpu_allocation_ratio, disk_allocation_ratio, and ram_allocation_ratio options from the OpenStackDeployment custom resource. Instead, use the new configuration options as described in Configuring initial resource oversubscription. Also, keep in mind that the changes will only impact newly added compute nodes and will not be applied to the existing ones.