Ceph advanced configuration

This section describes how to configure a Ceph cluster through the KaaSCephCluster (kaascephclusters.kaas.mirantis.com) CR during or after the deployment of a managed cluster.

The KaaSCephCluster CR spec has two sections, cephClusterSpec and k8sCluster, and specifies the nodes to deploy as Ceph components. Based on the role definitions in the KaaSCephCluster CR, Ceph controller automatically labels nodes for Ceph Monitors and Managers. Ceph OSDs are deployed based on the storageDevices parameter defined for each Ceph node.
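
As a rough orientation, the overall shape of the CR can be sketched as follows. This is only an outline under assumptions: the apiVersion value is assumed from the CRD group name and is not confirmed by this section, and all names in angle brackets are hypothetical placeholders:

    apiVersion: kaas.mirantis.com/v1alpha1   # assumed API version
    kind: KaaSCephCluster
    metadata:
      name: <kaasCephClusterName>            # hypothetical placeholder
      namespace: <managedClusterProjectName>
    spec:
      k8sCluster:                            # the cluster that the CR depends on
        name: <managedClusterName>           # hypothetical placeholder
        namespace: <managedClusterProjectName>
      cephClusterSpec:                       # the Ceph cluster description
        nodes:
          <machineName>:
            <node spec>
        pools:
        - <pool spec>

The parameters that can appear under cephClusterSpec are described in the tables of this section.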

For a default KaaSCephCluster CR, see templates/bm/kaascephcluster.yaml.template. For details on how to configure the default template for a baremetal-based cluster bootstrap, see Bootstrap a management cluster.

To configure a Ceph cluster:

  1. Select from the following options:

    • If you do not have a Container Cloud cluster yet, open kaascephcluster.yaml.template for editing.

    • If the Container Cloud cluster is already deployed, open the KaaSCephCluster CR of a managed cluster for editing:

      kubectl edit kaascephcluster -n <managedClusterProjectName>
      

      Substitute <managedClusterProjectName> with a corresponding value.

  2. Using the tables below, configure the Ceph cluster as required.

    High-level parameters

    Parameter

    Description

    cephClusterSpec

    Describes a Ceph cluster in the Container Cloud cluster. For details on cephClusterSpec parameters, see the tables below.

    k8sCluster

    Defines the cluster on which the KaaSCephCluster depends. Use the k8sCluster parameter if the name or namespace of the corresponding Container Cloud cluster differs from the default one:

    spec:
      k8sCluster:
        name: kaas-mgmt
        namespace: default
    
    General parameters

    Parameter

    Description

    manageOsds

    Caution

    Starting from Container Cloud 2.11.0, Ceph LCM automated Ceph OSD operations such as Ceph OSD or Ceph node removal are temporarily disabled due to the Storage known issue. To remove Ceph OSDs manually, see Remove Ceph OSD manually.

    Enables automated management of Ceph OSDs. For details, see Enable automated Ceph LCM.

    Caution

    The manageOsds parameter enables irreversible operations such as Ceph OSD removal. Therefore, use this feature with caution.

    clusterNet

    Specifies the CIDR for Ceph OSD replication network.

    Note

    The clusterNet and publicNet parameters support multiple IP networks. For details, see ref:enable-l3-for-ceph.

    publicNet

    Specifies the CIDR for communication between the service and operator.

    Note

    The clusterNet and publicNet parameters support multiple IP networks. For details, see ref:enable-l3-for-ceph.

    nodes

    Specifies the list of Ceph nodes. For details, see Node parameters. The nodes parameter is a map with machine names as keys and Ceph node specifications as values, for example:

    nodes:
      master-0:
        <node spec>
      master-1:
        <node spec>
      ...
      worker-0:
        <node spec>
    

    nodeGroups

    Specifies the list of Ceph nodes grouped by node lists or node labels. For details, see NodeGroups parameters. The nodeGroups parameter is a map with group names as keys and Ceph node specifications for defined nodes or node labels as values. For example:

    nodeGroups:
      group-1:
        spec: <node spec>
        nodes: ["master-0", "master-1"]
      group-2:
        spec: <node spec>
        label: <nodeLabelExpression>
      ...
      group-3:
        spec: <node spec>
        nodes: ["worker-2", "worker-3"]
    

    The <nodeLabelExpression> must be a valid Kubernetes label selector expression.

    pools

    Specifies the list of Ceph pools. For details, see Pool parameters.

    objectStorage

    Specifies the parameters for Object Storage, such as RADOS Gateway, the Ceph Object Storage. Also specifies the RADOS Gateway Multisite configuration. For details, see RADOS Gateway parameters and Multisite parameters.

    maintenance Deprecated since 2.12.0

    Enables or disables the noout, norebalance, and nobackfill flags on the entire Ceph cluster. Set to false by default. Mirantis strongly recommends not using it on production deployments except during an update or during Ceph resource management.

    rookConfig

    Optional. String key-value parameter that allows overriding Ceph configuration options.

    extraLogging

    Enables or disables verbose logging in the ceph-controller service, which will expose resource changes in logs. Set to false by default.

    ingress

    Enables a custom ingress rule for public access on Ceph services, for example, Ceph RADOS Gateway. For details, see Enable TLS for Ceph public endpoints.

    rbdMirror

    Enables pools mirroring between two interconnected clusters. For details, see Enable Ceph RBD mirroring.

    clients

    List of Ceph clients. For details, see Clients parameters.

    disableOsSharedKeys

    Disables autogeneration of shared Ceph values for OpenStack deployments. Set to false by default.

    mgr Available since 2.11.0

    Contains the modules parameter that should include a list of Ceph Manager modules to enable on the Ceph cluster. For example:

    mgr:
      modules:
      - balancer
      - pg_autoscaler
    

    The balancer and pg_autoscaler Ceph Manager modules are enabled by default and cannot be disabled.

    Note

    Most Ceph Manager modules require additional configuration that you can perform through the ceph-tools pod on a managed cluster.

    Example configuration:

    spec:
      cephClusterSpec:
        manageOsds: true
        network:
          clusterNet: 10.10.10.0/24
          publicNet: 10.10.11.0/24
        nodes:
          master-0:
            <node spec>
          ...
        pools:
        - <pool spec>
        ...
        rookConfig:
          "mon max pg per osd": "600"
          ...
    
    Node parameters

    Parameter

    Description

    roles

    Specifies the mon, mgr, or rgw daemon to be installed on a Ceph node. You can place the daemons on any nodes at your discretion. Consider the following recommendations:

    • The recommended number of Ceph Monitors in a Ceph cluster is 3. Therefore, at least 3 Ceph nodes must contain the mon item in the roles parameter.

    • The number of Ceph Monitors must be odd. For example, if the KaaSCephCluster spec contains 3 Ceph Monitors and you need to add more, the number of Ceph Monitors must equal 5, 7, 9, and so on.

    • Do not add more than 2 Ceph Monitors at a time, and wait until the Ceph cluster is Ready before adding more.

    • For better HA, the number of mgr roles must equal the number of mon roles.

    • If rgw roles are not specified, all rgw daemons will spawn on the same nodes as the mon daemons.

      If a Ceph node contains a mon role, the Ceph Monitor Pod will be deployed on it. If the Ceph node contains a mgr role, it informs the Ceph controller that a Ceph Manager can be deployed on that node. However, only one Ceph Manager must be deployed on a node.
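
    As an illustration of the recommendations above, the following sketch assigns the mon and mgr roles to three nodes, so that the number of Ceph Monitors is odd and matches the number of Ceph Managers. The machine names are hypothetical, and the storageDevices entries are omitted for brevity:

    nodes:
      master-0:
        roles:        # one Ceph Monitor and one Ceph Manager candidate
        - mon
        - mgr
      master-1:
        roles:
        - mon
        - mgr
      master-2:
        roles:
        - mon
        - mgr

    Because no rgw role is listed in this sketch, the rgw daemons would be placed on the same three nodes as the mon daemons.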

    storageDevices

    Specifies the list of devices to use for Ceph OSD deployment. Includes the following parameters:

    • name - the device name placed in the /dev folder. For example, vda.

    • config - a map of device configurations that must contain a device class and can optionally contain a metadata device. The device class must be defined in a pool:

      storageDevices:
      - name: sdc
        config:
          deviceClass: hdd
          metadataDevice: nvme01
      

      The underlying storage format to use for Ceph OSDs is BlueStore.

      Note

      The deviceClass parameter is mandatory and must be set to hdd, ssd, or nvme.

    • fullPath Available since 2.11.0 - full path to device on the node. Specify only a valid by-path device path, for example, /dev/disk/by-path/....
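
    A minimal sketch of a device that is addressed by its by-path link rather than by its /dev name is shown below. It assumes that fullPath is specified in place of name, which this section does not state explicitly, and <deviceLink> is a hypothetical placeholder for the actual by-path link on the node:

    storageDevices:
    - fullPath: /dev/disk/by-path/<deviceLink>   # placeholder, use the real by-path link
      config:
        deviceClass: ssd                         # mandatory: hdd, ssd, or nvme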

    crush

    Specifies the explicit key-value CRUSH topology for a node. For details, see Ceph official documentation: CRUSH maps. Includes the following parameters:

    • datacenter - a physical data center that consists of rooms and handles data.

    • room - a room that accommodates one or more racks with hosts.

    • pdu - a power distribution unit (PDU) device that has multiple outputs and distributes electric power to racks located within a data center.

    • row - a row of computing racks inside a room.

    • rack - a computing rack that accommodates one or more hosts.

    • chassis - a bare metal structure that houses or physically assembles hosts.

    • region - the geographic location of one or more Ceph Object instances within one or more zones.

    • zone - a logical group that consists of one or more Ceph Object instances.

    Example configuration:

    crush:
      datacenter: dc1
      room: room1
      pdu: pdu1
      row: row1
      rack: rack1
      chassis: ch1
      region: region1
      zone: zone1
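
    Putting the node parameters together, a complete node entry might look like the sketch below. It assumes that roles, storageDevices, and crush are sibling keys of the node spec, as the table above suggests. The machine name and the CRUSH values are hypothetical:

    nodes:
      worker-0:
        roles:
        - mon
        - mgr
        storageDevices:
        - name: sdc
          config:
            deviceClass: hdd
        crush:
          datacenter: dc1
          rack: rack1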
    
    NodeGroups parameters

    Parameter

    Description

    spec

    Specifies a Ceph node specification. For the entire spec, see Node parameters.

    nodes

    Specifies a list of names of machines to which the Ceph node spec must be applied. Mutually exclusive with the label parameter. For example:

    nodeGroups:
      group-1:
        spec: <node spec>
        nodes:
        - master-0
        - master-1
        - worker-0
    

    label

    Specifies a string with a valid label selector expression to select machines to which the node spec must be applied. Mutually exclusive with the nodes parameter. For example:

    nodeGroups:
      group-2:
        spec: <node spec>
        label: "ceph-storage-node=true,!ceph-control-node"
    
    Pool parameters

    Parameter

    Description

    name

    Specifies the pool name as a prefix for each Ceph block pool. The resulting Ceph block pool name will be <name>-<deviceClass>.

    useAsFullName

    Enables Ceph block pool to use only the name value as a name. The resulting Ceph block pool name will be <name> without the deviceClass suffix.

    role

    Specifies the pool role and is used mostly for Mirantis OpenStack for Kubernetes (MOS) pools.

    default

    Defines if the pool and dependent StorageClass should be set as default. Must be enabled only for one pool.

    deviceClass

    Specifies the device class for the defined pool. Possible values are hdd, ssd, and nvme.

    replicated

    The replicated parameter is mutually exclusive with erasureCoded and includes the following parameters:

    • size - the number of pool replicas.

    • targetSizeRatio - Optional. A float percentage from 0.0 to 1.0, which specifies the expected consumption of the total Ceph cluster capacity. The default values are as follows:

    erasureCoded

    Enables the erasure-coded pool. For details, see Rook documentation: Erasure coded and Ceph documentation: Erasure coded pool. The erasureCoded parameter is mutually exclusive with replicated.

    failureDomain

    The failure domain across which the replicas or chunks of data will be spread. Set to host by default. The possible values are osd or host.

    mirroring

    Optional. Enables the mirroring feature for the defined pool. Includes the mode parameter that can be set to pool or image. For details, see Enable Ceph RBD mirroring.

    Example configuration:

    pools:
    - name: kubernetes
      role: kubernetes
      deviceClass: hdd
      replicated:
        size: 3
        targetSizeRatio: 10.0
      default: true
    

    To configure additional required pools for MOS, see MOS Deployment Guide: Deploy a Ceph cluster.
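
    For comparison with the replicated pool above, an erasure-coded pool could be sketched as follows. The pool name is hypothetical, the codingChunks and dataChunks keys are borrowed from the RADOS Gateway dataPool example in this section, and the role parameter is left out on the assumption that it is not needed for a pool outside MOS:

    pools:
    - name: archive
      useAsFullName: true     # resulting pool name is archive, without the -hdd suffix
      deviceClass: hdd
      erasureCoded:           # mutually exclusive with replicated
        codingChunks: 1
        dataChunks: 2
      failureDomain: host     # osd or host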

    Clients parameters

    Parameter

    Description

    name

    Ceph client name.

    caps

    Key-value parameter with Ceph client capabilities.

    Example configuration:

    clients:
    - name: glance
      caps:
        mon: allow r, allow command "osd blacklist"
        osd: profile rbd pool=images
    
    RADOS Gateway parameters

    Parameter

    Description

    name

    Ceph Object Storage instance name.

    dataPool

    Mutually exclusive with the zone parameter. Object storage data pool spec that should only contain replicated or erasureCoded and failureDomain parameters. The failureDomain parameter may be set to osd or host, defining the failure domain across which the data will be spread. For dataPool, Mirantis recommends using an erasureCoded pool. For details, see Rook documentation: Erasure coding. For example:

    cephClusterSpec:
      objectStorage:
        rgw:
          dataPool:
            erasureCoded:
              codingChunks: 1
              dataChunks: 2
    

    metadataPool

    Mutually exclusive with the zone parameter. Object storage metadata pool spec that should only contain the replicated and failureDomain parameters because the metadata pool supports only replication. The failureDomain parameter may be set to osd or host, defining the failure domain across which the data will be spread. For example:

    cephClusterSpec:
      objectStorage:
        rgw:
          metadataPool:
            replicated:
              size: 3
            failureDomain: host
    

    where replicated.size is the number of full copies of data on multiple nodes.

    gateway

    The gateway settings corresponding to the rgw daemon settings. Includes the following parameters:

    • port - the port on which the Ceph RGW service listens for HTTP connections.

    • securePort - the port on which the Ceph RGW service listens for HTTPS connections.

    • instances - the number of pods in the Ceph RGW ReplicaSet. If allNodes is set to true, a DaemonSet is created instead.

      Note

      Mirantis recommends using 2 instances for Ceph Object Storage.

    • allNodes - defines whether to start the Ceph RGW pods as a DaemonSet on all nodes. The instances parameter is ignored if allNodes is set to true.

    For example:

    cephClusterSpec:
      objectStorage:
        rgw:
          gateway:
            allNodes: false
            instances: 1
            port: 80
            securePort: 8443
    

    preservePoolsOnDelete

    Defines whether to preserve the data and metadata pools in the rgw section if the object storage is deleted. Set this parameter to true if you need to keep the data even if the object storage is deleted. However, Mirantis recommends setting this parameter to false.

    users and buckets

    Optional. To create new Ceph RGW resources, such as buckets or users, specify the following keys. Ceph controller will automatically create the specified object storage users and buckets in the Ceph cluster.

    • users - a list of strings that contain user names to create for object storage.

    • buckets - a list of strings that contain bucket names to create for object storage.
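
    A minimal sketch of these two keys, assuming they are placed in the rgw section next to the other RADOS Gateway parameters. The user and bucket names are hypothetical:

    cephClusterSpec:
      objectStorage:
        rgw:
          users:          # object storage users to create
          - user-a
          - user-b
          buckets:        # buckets to create
          - bucket-1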

    zone

    Optional. Mutually exclusive with metadataPool and dataPool. Defines the Ceph Multisite zone where the object storage must be placed. Includes the name parameter that must be set to one of the zones items. For details, see Enable Multisite for Ceph RGW Object Storage.

    For example:

    cephClusterSpec:
      objectStorage:
        multisite:
          zones:
          - name: master-zone
          ...
        rgw:
          zone:
            name: master-zone
    

    SSLCert

    Optional. Custom TLS certificate parameters used to access the Ceph RGW endpoint. If not specified, a self-signed certificate will be generated.

    For example:

    cephClusterSpec:
      objectStorage:
        rgw:
          SSLCert:
            cacert: |
              -----BEGIN CERTIFICATE-----
              ca-certificate here
              -----END CERTIFICATE-----
            tlsCert: |
              -----BEGIN CERTIFICATE-----
              private TLS certificate here
              -----END CERTIFICATE-----
            tlsKey: |
              -----BEGIN RSA PRIVATE KEY-----
              private TLS key here
              -----END RSA PRIVATE KEY-----
    

    For a configuration example, see Enable Ceph RGW Object Storage.

    Multisite parameters

    Parameter

    Description

    realms Technical Preview

    The list of realms to use, which represent the realm namespaces. Includes the following parameters:

    • name - the realm name.

    • pullEndpoint - optional, required only when the master zone is in a different storage cluster. The endpoint, access key, and system key of the system user from the realm to pull from. Includes the following parameters:

      • endpoint - the endpoint of the master zone in the master zone group.

      • accessKey - the access key of the system user from the realm to pull from.

      • secretKey - the system key of the system user from the realm to pull from.

    zoneGroups Technical Preview

    The list of zone groups for realms. Includes the following parameters:

    • name - the zone group name.

    • realmName - the realm namespace name to which the zone group belongs.

    zones Technical Preview

    The list of zones used within one zone group. Includes the following parameters:

    • name - the zone name.

    • metadataPool - the settings used to create the Object Storage metadata pools. Must use replication. For details, see Pool parameters.

    • dataPool - the settings to create the Object Storage data pool. Can use replication or erasure coding. For details, see Pool parameters.

    • zoneGroupName - the zone group name.
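
    The relationship between realms, zone groups, and zones can be sketched as follows. The sketch assumes that realms and zoneGroups are placed in the multisite section next to zones, and that the master zone is in the same storage cluster, so pullEndpoint is omitted. All names are hypothetical:

    cephClusterSpec:
      objectStorage:
        multisite:
          realms:
          - name: realm-1
          zoneGroups:
          - name: zonegroup-1
            realmName: realm-1            # must match a realms item
          zones:
          - name: master-zone
            zoneGroupName: zonegroup-1    # must match a zoneGroups item
            metadataPool:                 # replication only
              replicated:
                size: 3
              failureDomain: host
            dataPool:                     # replication or erasure coding
              erasureCoded:
                codingChunks: 1
                dataChunks: 2
              failureDomain: host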

    For a configuration example, see Enable Multisite for Ceph RGW Object Storage.

  3. Select from the following options:

    • If you are creating a managed cluster, save the updated KaaSCephCluster template to the corresponding file and proceed with the managed cluster creation.

    • If you are configuring KaaSCephCluster of an existing managed cluster, run the following command:

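      kubectl apply -n <managedClusterProjectName> -f <kaasCephClusterFile>
      

      Substitute <managedClusterProjectName> with the corresponding value and <kaasCephClusterFile> with the path to the file that contains the updated KaaSCephCluster CR.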