Migrate Ceph pools from one failure domain to another¶
This document describes how to change the failure domain of an already deployed Ceph cluster.
Note
This document focuses on changing the failure domain from a smaller to a wider one, for example, from host to rack. Using the same instruction, you can also move the failure domain from a wider to a smaller scale.
Caution
Data movement implies Ceph cluster rebalancing, which may impact cluster performance depending on the cluster size.
The high-level overview of the procedure includes the following steps:
Set correct labels on the nodes.
Create the new bucket hierarchy.
Move nodes to new buckets.
Modify the CRUSH rules.
Add the manual changes to the KaaSCephCluster spec.
Scale the Ceph controllers.
Prerequisites¶
Verify that the Ceph cluster has enough space for multiple copies of data to migrate. Mirantis highly recommends that the Ceph cluster has a minimum of 25% of free space for the procedure to succeed.
Note
The migration procedure implies data movement and optional modification of CRUSH rules, which causes a large amount of data (depending on the cluster size) to first be copied to a new location in the Ceph cluster before the old data is removed.
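For reference, you can check the current utilization from the rook-ceph-tools pod before starting, for example:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph df
# The global AVAIL/USED figures and the per-pool MAX AVAIL values indicate whether at least 25% of capacity is free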
Create a backup of the current KaaSCephCluster object from the managed namespace of the management cluster:
kubectl -n <managedClusterProject> get kaascephcluster -o yaml > kcc-backup.yaml
Substitute <managedClusterProject> with the corresponding managed cluster namespace of the management cluster.
In the rook-ceph-tools pod on a managed cluster, obtain a backup of the CRUSH map:
ceph osd getcrushmap -o /tmp/crush-map-orig
crushtool -d /tmp/crush-map-orig -o /tmp/crush-map-orig.txt
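As a recovery hint that is not part of the standard procedure: if you later need to roll back manual CRUSH changes, the saved binary map can be reapplied from the same pod. Use it with care because it reverts all CRUSH changes made after the backup:
ceph osd setcrushmap -i /tmp/crush-map-orig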
Migrate Ceph pools¶
This procedure contains an example of moving the failure domain of all pools from host to rack. Using the same instruction, you can migrate pools between other types of failure domains, migrate pools separately, and so on.
To migrate Ceph pools from one failure domain to another:
Set the required CRUSH topology in the KaaSCephCluster object for each defined node. For details on the crush parameter, see Node parameters. Setting the CRUSH topology for each node causes the Ceph Controller to set the proper Kubernetes labels on the nodes.
Example of adding the rack CRUSH topology key for each node in the nodes section:
spec:
  cephClusterSpec:
    nodes:
      machine1:
        crush:
          rack: rack-1
      machine2:
        crush:
          rack: rack-1
      machine3:
        crush:
          rack: rack-2
      machine4:
        crush:
          rack: rack-2
      machine5:
        crush:
          rack: rack-3
      machine6:
        crush:
          rack: rack-3
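To confirm that the Ceph Controller propagated the topology, you can check the node labels. The exact label key depends on the product version; Rook commonly consumes topology.rook.io/rack or topology.kubernetes.io/* keys, so a broad filter such as the following is usually enough:
kubectl get nodes --show-labels | grep -i rack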
On the managed cluster, verify that the required buckets and bucket types are present in the Ceph hierarchy:
Enter the ceph-tools pod:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
Verify that the required bucket type is present by default:
ceph osd getcrushmap -o /tmp/crush-map
crushtool -d /tmp/crush-map -o /tmp/crush-map.txt
cat /tmp/crush-map.txt
# Look for the section named "# types"
Example of system response:
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 zone
type 10 region
type 11 root
Verify that the buckets with the required bucket type are present:
cat /tmp/crush-map.txt
# Look for the section named "# buckets"
Example of system response of an existing rack bucket:
# buckets
rack rack-1 {
    id -15
    id -16 class hdd
    # weight 0.00000
    alg straw2
    hash 0
}
If the required buckets are not created, create new ones with the required bucket type:
ceph osd crush add-bucket <bucketName> <bucketType> root=default
For example:
ceph osd crush add-bucket rack-1 rack root=default
ceph osd crush add-bucket rack-2 rack root=default
ceph osd crush add-bucket rack-3 rack root=default
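To verify the result, you can list the CRUSH hierarchy and check that the new rack buckets appear under the default root, for example:
ceph osd tree
# The new rack buckets are expected to show a zero weight until hosts are moved into them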
Exit the ceph-tools pod.
Optional. Order buckets as required:
On the managed cluster, add the first Ceph CRUSH smaller bucket to its respective wider bucket:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph osd crush move <smallerBucketName> <bucketType>=<widerBucketName>
Substitute the following parameters:
<smallerBucketName> with the name of the smaller bucket, for example, the host name
<bucketType> with the required bucket type, for example, rack
<widerBucketName> with the name of the wider bucket, for example, the rack name
For example:
ceph osd crush move kaas-node-1 rack=rack-1 root=default
Warning
Mirantis highly recommends moving one bucket at a time.
For more details, refer to the official Ceph documentation: CRUSH Maps: Moving a bucket.
After the bucket is moved to the new location in the CRUSH hierarchy, verify that no data rebalancing occurs:
ceph -s
Caution
Wait for rebalancing to complete before proceeding to the next step.
Add the remaining Ceph CRUSH smaller buckets to their respective wider buckets one by one.
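For example, assuming the hypothetical node names kaas-node-2 through kaas-node-6 and the rack layout from the nodes example above, the remaining moves could look as follows, executed one at a time with a rebalancing check between the commands:
ceph osd crush move kaas-node-2 rack=rack-1 root=default
ceph osd crush move kaas-node-3 rack=rack-2 root=default
ceph osd crush move kaas-node-4 rack=rack-2 root=default
ceph osd crush move kaas-node-5 rack=rack-3 root=default
ceph osd crush move kaas-node-6 rack=rack-3 root=default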
Scale the Ceph Controller and Rook Operator deployments to 0 replicas:
kubectl -n ceph-lcm-mirantis scale deploy --all --replicas 0
kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 0
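To make sure that no controller reverts the manual changes, you can verify that the corresponding pods have terminated, for example:
kubectl -n ceph-lcm-mirantis get pods
kubectl -n rook-ceph get pods | grep rook-ceph-operator
# No Ceph Controller pods and no rook-ceph-operator pod are expected to be running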
On the managed cluster, manually modify the CRUSH rules for Ceph pools to enable data placement on a new failure domain:
Enter the ceph-tools pod:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
List the CRUSH rules and erasure code profiles for the pools:
ceph osd pool ls detail
Example output
pool 1 'mirablock-k8s-block-hdd' replicated size 2 min_size 1 crush_rule 9 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 1193 lfor 0/0/85 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 1.31
pool 2 '.mgr' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 70 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 6.06
pool 3 'openstack-store.rgw.otp' replicated size 2 min_size 1 crush_rule 11 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 1197 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw read_balance_score 2.27
pool 4 'openstack-store.rgw.meta' replicated size 2 min_size 1 crush_rule 12 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 1197 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw read_balance_score 1.50
pool 5 'openstack-store.rgw.log' replicated size 2 min_size 1 crush_rule 10 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 1197 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw read_balance_score 3.00
pool 6 'openstack-store.rgw.buckets.non-ec' replicated size 2 min_size 1 crush_rule 13 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 1197 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw read_balance_score 1.50
pool 7 'openstack-store.rgw.buckets.index' replicated size 2 min_size 1 crush_rule 15 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 1197 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw read_balance_score 2.25
pool 8 '.rgw.root' replicated size 2 min_size 1 crush_rule 14 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 1197 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw read_balance_score 3.75
pool 9 'openstack-store.rgw.control' replicated size 2 min_size 1 crush_rule 16 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 1197 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw read_balance_score 3.00
pool 10 'other-hdd' replicated size 2 min_size 1 crush_rule 19 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 1179 lfor 0/0/85 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 1.69
pool 11 'openstack-store.rgw.buckets.data' erasure profile openstack-store.rgw.buckets.data_ecprofile size 3 min_size 2 crush_rule 18 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 1198 lfor 0/0/86 flags hashpspool,ec_overwrites stripe_width 8192 application rook-ceph-rgw
pool 12 'vms-hdd' replicated size 2 min_size 1 crush_rule 21 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode on last_change 1182 lfor 0/0/95 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_ratio 0.4 application rbd read_balance_score 1.24
pool 13 'volumes-hdd' replicated size 2 min_size 1 crush_rule 23 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode on last_change 1185 lfor 0/0/89 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_ratio 0.2 application rbd read_balance_score 1.31
pool 14 'backup-hdd' replicated size 2 min_size 1 crush_rule 25 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 1188 lfor 0/0/90 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_ratio 0.1 application rbd read_balance_score 2.06
pool 15 'images-hdd' replicated size 2 min_size 1 crush_rule 27 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 1191 lfor 0/0/90 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_ratio 0.1 application rbd read_balance_score 1.50
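The crush_rule field in this output is a numeric rule ID. To map it to a rule name before modifying anything, you can dump all rules and match rule_id with rule_name, for example:
ceph osd crush rule ls
ceph osd crush rule dump | grep -E 'rule_id|rule_name'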
For each replicated Ceph pool:
Obtain the current CRUSH rule name:
ceph osd crush rule dump <oldCrushRuleName>
Create a new CRUSH rule that uses the same root and device class but the new bucket type as the failure domain:
ceph osd crush rule create-replicated <newCrushRuleName> <root> <bucketType> <deviceClass>
For example:
ceph osd crush rule create-replicated images-hdd-rack default rack hdd
For more details, refer to official Ceph documentation: CRUSH Maps: Creating a rule for a replicated pool.
Apply the new CRUSH rule to the Ceph pool:
ceph osd pool set <poolName> crush_rule <newCrushRuleName>
For example:
ceph osd pool set images-hdd crush_rule images-hdd-rack
Wait for data to be rebalanced after moving the Ceph pool under the new failure domain (bucket type) by monitoring Ceph health:
ceph -s
Caution
Update the next Ceph pool only after data rebalancing completes for the current Ceph pool.
Verify that the old CRUSH rule is not used anymore:
ceph osd pool ls detail
The rule IDs are defined in the CRUSH map; verify that the old rule ID no longer appears as crush_rule in the output of ceph osd dump or ceph osd pool ls detail.
Remove the old unused CRUSH rule and rename the new one to the original name:
ceph osd crush rule rm <oldCrushRuleName>
ceph osd crush rule rename <newCrushRuleName> <oldCrushRuleName>
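Continuing the images-hdd example above, and assuming the original rule carries the pool name (the usual Rook naming convention, so verify it in your cluster), the cleanup could look as follows:
ceph osd crush rule rm images-hdd
ceph osd crush rule rename images-hdd-rack images-hdd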
For each erasure-coded Ceph pool:
Note
Erasure-coded pools require a different number of buckets to store data. Instead of the number of replicas used by replicated pools, erasure-coded pools require as many buckets existing in the Ceph cluster as the sum of coding chunks + data chunks. For example, if an erasure-coded pool has 2 coding chunks and 2 data chunks configured, the pool requires 4 different buckets, for example, 4 racks, to store data.
Obtain the current parameters of the erasure-coded profile:
ceph osd erasure-code-profile get <ecProfile>
In the profile, add the new bucket type as the failure domain using the crush-failure-domain parameter:
ceph osd erasure-code-profile set <ecProfile> k=<int> m=<int> crush-failure-domain=<bucketType> crush-device-class=<deviceClass>
Create a new CRUSH rule based on the profile:
ceph osd crush rule create-erasure <newEcCrushRuleName> <ecProfile>
Apply the new CRUSH rule to the pool:
ceph osd pool set <poolName> crush_rule <newEcCrushRuleName>
Wait for data to be rebalanced after moving the Ceph pool under the new failure domain (bucket type) by monitoring Ceph health:
ceph -s
Caution
Update the next Ceph pool only after data rebalancing completes for the current Ceph pool.
Verify that the old CRUSH rule is not used anymore:
ceph osd pool ls detail
The rule IDs are defined in the CRUSH map; verify that the old rule ID no longer appears as crush_rule in the output of ceph osd dump or ceph osd pool ls detail.
Remove the old unused CRUSH rule and rename the new one to the original name:
ceph osd crush rule rm <oldCrushRuleName>
ceph osd crush rule rename <newCrushRuleName> <oldCrushRuleName>
Note
New erasure-coded profiles cannot be renamed, so they will not be removed automatically during pool cleanup. Remove them manually, if needed.
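As an illustrative sketch only, the sequence for the openstack-store.rgw.buckets.data pool from the example output above could look as follows. The profile name with the _rack suffix, the k and m values, and the device class are placeholders and must match the values returned by the get command for the existing profile:
ceph osd erasure-code-profile get openstack-store.rgw.buckets.data_ecprofile
ceph osd erasure-code-profile set openstack-store.rgw.buckets.data_ecprofile_rack k=2 m=1 crush-failure-domain=rack crush-device-class=hdd
# If you overwrite the existing profile instead of creating a new one, Ceph requires the --force flag
ceph osd crush rule create-erasure openstack-store.rgw.buckets.data-rack openstack-store.rgw.buckets.data_ecprofile_rack
ceph osd pool set openstack-store.rgw.buckets.data crush_rule openstack-store.rgw.buckets.data-rack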
Exit the ceph-tools pod.
In the management cluster, update the KaaSCephCluster object by setting the failureDomain: rack parameter for each pool. The configuration from the Rook perspective must match the manually created configuration. For example:
spec:
  cephClusterSpec:
    pools:
    - name: images
      ...
      failureDomain: rack
    - name: volumes
      ...
      failureDomain: rack
    ...
    objectStorage:
      rgw:
        dataPool:
          failureDomain: rack
          ...
        metadataPool:
          failureDomain: rack
          ...
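One way to apply this change is to edit the object directly in the managed cluster namespace of the management cluster; <clusterName> below is a placeholder for the name of your KaaSCephCluster object:
kubectl -n <managedClusterProject> edit kaascephcluster <clusterName>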
Monitor the Ceph cluster health and wait until rebalancing is completed:
ceph -s
Example of a successful system response:
HEALTH_OK
Scale back the Ceph Controller and Rook Operator deployments:
kubectl -n ceph-lcm-mirantis scale deploy --all --replicas 3
kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 1
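As a final check, you can verify that the controllers are running again and that the Ceph cluster stays healthy, for example:
kubectl -n ceph-lcm-mirantis get pods
kubectl -n rook-ceph get pods | grep rook-ceph-operator
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph -s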