Enable Ceph tolerations and resources management

Caution

This feature is available starting from the Container Cloud release 2.6.0.

Caution

This feature is available as Technology Preview. Use this configuration for testing and evaluation purposes only. For details about the Mirantis Technology Preview support scope, see the Preface section of this guide.

Note

This document does not provide any specific recommendations on requests and limits for Ceph resources. It describes the native Helm release-based configuration of Ceph resources for any cluster with Mirantis Container Cloud or Mirantis OpenStack for Kubernetes (MOS).

You can configure Ceph Controller to manage Ceph resources by specifying their requirements and constraints. To configure resource consumption for the Ceph nodes, consider the following options, which are based on different Helm release configuration values (a placement sketch follows the list):

  • Configuring tolerations for tainted nodes for the Ceph Monitor, Ceph Manager, and Ceph OSD daemons.

  • Configuring node resource requests or limits for the Ceph daemons and for each Ceph OSD device class, such as HDD, SSD, or NVMe. For details, see Managing Resources for Containers.
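
The hyperconverge values described in this procedure are defined in the ceph-controller Helm release of the cluster template. The following placement sketch is a minimal example that assumes the default helmReleases layout of a Container Cloud cluster template (name and values fields); the toleration shown is an illustrative value only:

    spec:
      providerSpec:
        value:
          helmReleases:
          - name: ceph-controller
            values:
              hyperconverge:
                tolerations:
                  # YAML-formatted string with toleration
                  # rules for the Ceph OSD daemon pods
                  osd: |
                    - effect: NoSchedule
                      key: node-role.kubernetes.io/controlplane
                      operator: Exists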

Warning

Mirantis recommends enabling Ceph resources management when bootstrapping a new Ceph management or managed cluster. Enabling Ceph resources management on an existing Ceph cluster may cause downtime.

To enable Ceph tolerations and resources management:

  1. Open templates/bm/kaascephcluster.yaml.template for editing.

  2. In the ceph-controller section of spec.providerSpec.value.helmReleases, specify the hyperconverge.tolerations or hyperconverge.resources parameters as required; both parameters can be combined under one hyperconverge section, as shown in the sketch after this procedure:

    Ceph resource management parameters

    tolerations

    Specifies tolerations for tainted nodes. The parameter structure is as follows:

    hyperconverge:
      tolerations:
        # Array of correct k8s
        # toleration rules for
        # mon/mgr/osd daemon pods
        mon:
        mgr:
        osd:
    

    Note

    Use vertical bars (|) after the tolerations keys: the mon, mgr, and osd values are strings that contain YAML-formatted arrays of Kubernetes toleration rules. For example:

    hyperconverge:
      tolerations:
        mon: |
          - effect: NoSchedule
            key: node-role.kubernetes.io/controlplane
            operator: Exists
        mgr: |
          - effect: NoSchedule
            key: node-role.kubernetes.io/controlplane
            operator: Exists
        osd: |
          - effect: NoSchedule
            key: node-role.kubernetes.io/controlplane
            operator: Exists
    

    resources

    Specifies resource requests or limits. The hdd, ssd, and nvme resource requirements apply only to the Ceph OSDs with a defined device class.

    Note

    Use vertical bars (|) after the resources keys: the mon, mgr, and osd values, as well as the hdd, ssd, and nvme values, are strings that contain YAML-formatted maps of requests and limits for each component type. The first snippet below shows the parameter structure; the second one provides example values.

    hyperconverge:
      resources:
        # resources requirements
        # for Ceph daemons
        mon:
        mgr:
        osd:
        # resources requirements
        # for Ceph OSD device
        # classes
        hdd:
        ssd:
        nvme:
    
    hyperconverge:
      resources:
        mon: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        mgr: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        osd: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        hdd: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        ssd: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
        nvme: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3
    
  3. Save the reconfigured cluster resource and wait for the ceph-controller Helm release upgrade. The upgrade recreates Ceph Monitors, Ceph Managers, or Ceph OSDs according to the specified hyperconverge configuration. The Ceph cluster may experience a short downtime during this process.
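
As noted in step 2, tolerations and resources can be combined under the same hyperconverge section. The following minimal sketch combines a toleration for the Ceph OSD pods with resource requirements for the Ceph OSDs; the values are illustrative only and are taken from the examples above:

    hyperconverge:
      tolerations:
        osd: |
          - effect: NoSchedule
            key: node-role.kubernetes.io/controlplane
            operator: Exists
      resources:
        osd: |
          requests:
            memory: 1Gi
            cpu: 2
          limits:
            memory: 2Gi
            cpu: 3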

Once done, proceed to Verify Ceph tolerations and resources management.