Enable Ceph tolerations and resources prior to Container Cloud 2.11.0

Caution

This feature is available as Technology Preview. Use such a configuration for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview support scope.

Note

This document does not provide any specific recommendations on requests and limits for Ceph resources. It describes the native Helm release-based configuration of Ceph resources for any cluster with Mirantis Container Cloud or Mirantis OpenStack for Kubernetes (MOS).

Warning

This document applies only to Container Cloud versions earlier than 2.11.0. To enable Ceph tolerations and resources for Container Cloud 2.11.0 and later, refer to Enable Ceph tolerations and resources management.

You can configure Ceph Controller to manage Ceph resources by specifying their requirements and constraints. To configure resource consumption for the Ceph nodes, consider the following options, both based on Helm release configuration values; a sketch of where these values live follows the list:

  • Configuring tolerations for tainted nodes for the Ceph Monitor, Ceph Manager, and Ceph OSD daemons.

  • Configuring node resource requests or limits for the Ceph daemons and for each Ceph OSD device class, such as HDD, SSD, or NVMe. For details, see Managing Resources for Containers.
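Both options are set under the ceph-controller Helm release values of the Cluster object. The following sketch illustrates the assumed nesting only: the spec.providerSpec.value.helmReleases path and the release name come from the procedure below, and the empty hyperconverge keys are placeholders for the values described further in this document.

  spec:
    providerSpec:
      value:
        helmReleases:
        - name: ceph-controller
          values:
            hyperconverge:
              tolerations: {}  # placeholder, see the tolerations examples below
              resources: {}    # placeholder, see the resources examples below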

Warning

Mirantis recommends enabling Ceph resources management when bootstrapping a new management or managed cluster. Enabling Ceph resources management on an existing Ceph cluster may cause downtime.

To enable Ceph tolerations and resources management:

  1. Select from the following options:

    • For baremetal-based Container Cloud, open templates/bm/cluster.yaml.template for editing.

    • For Equinix Metal-based Container Cloud, open templates/equinix/cluster.yaml.template for editing.

  2. In the ceph-controller section of spec.providerSpec.value.helmReleases, specify the hyperconverge.tolerations or hyperconverge.resources parameters as required:

    Warning

    To avoid Ceph cluster health issues while the daemons configuration is being changed, enable the spec.cephClusterSpec.maintenance flag before editing hyperconverge.resources. For details, see General parameters.
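    For instance, the flag can be enabled on the KaaSCephCluster object before changing the resources and disabled again once the change is applied. A minimal sketch, assuming the kaas.mirantis.com/v1alpha1 API version and a placeholder object name and namespace:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: KaaSCephCluster
    metadata:
      name: ceph-cluster      # placeholder name
      namespace: managed-ns   # placeholder namespace
    spec:
      cephClusterSpec:
        maintenance: true     # pause reconciliation while editing resources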

    Ceph resource management parameters

    tolerations

    Specifies tolerations for tainted nodes. The parameter has the following structure:

    hyperconverge:
      tolerations:
        # Array of correct k8s
        # toleration rules for
        # mon/mgr/osd daemon pods
        mon:
        mgr:
        osd:
    

    Note

    Use vertical bars (|) after the tolerations keys: the mon, mgr, and osd values are strings that contain YAML-formatted arrays of Kubernetes toleration rules. For example:

    hyperconverge:
      tolerations:
        mon: |
          - effect: NoSchedule
            key: node-role.kubernetes.io/controlplane
            operator: Exists
        mgr: |
          - effect: NoSchedule
            key: node-role.kubernetes.io/controlplane
            operator: Exists
        osd: |
          - effect: NoSchedule
            key: node-role.kubernetes.io/controlplane
            operator: Exists
    

    resources

    Specifies resource requests or limits. The hdd, ssd, and nvme resource requirements apply only to the Ceph OSDs with the corresponding device class defined.

    Note

    Use vertical bars (|) after the resources requirements keys: the mon, mgr, osd, hdd, ssd, and nvme values are strings that contain YAML-formatted maps of requests and limits for each component type. The parameter has the following structure:

    hyperconverge:
      resources:
        # resources requirements
        # for Ceph daemons
        mon:
        mgr:
        osd:
        # resources requirements
        # for Ceph OSD device
        # classes
        hdd:
        ssd:
        nvme:
    
    For example:

    hyperconverge:
      resources:
        mon: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        mgr: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        osd: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        hdd: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        ssd: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        nvme: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
    
  3. Save the reconfigured Cluster resource and wait for the ceph-controller Helm release upgrade. Ceph Controller will recreate the Ceph Monitors, Ceph Managers, or Ceph OSDs according to the specified hyperconverge configuration. The Ceph cluster may experience a short downtime.
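    To watch the daemons being recreated, one option is to follow the Ceph daemon pods, assuming the standard Rook namespace rook-ceph and the standard Rook pod labels used by Ceph deployments:

    # Watch Ceph Monitor and Ceph OSD pods being recreated
    # (assumes the rook-ceph namespace and standard Rook labels)
    kubectl -n rook-ceph get pods -l app=rook-ceph-mon -w
    kubectl -n rook-ceph get pods -l app=rook-ceph-osd -w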

Once done, proceed to Verify Ceph tolerations and resources management.