Note
This feature is available starting from the MCP 2019.2.5 maintenance update. Before enabling the feature, follow the steps described in Apply maintenance updates.
Before deploying a Ceph cluster with nodes in different L3 compartments, complete the prerequisite steps described below. If your deployment does not span multiple L3 compartments, proceed to Deploy a Ceph cluster right away.
This document uses the terms failure domain and L3 compartment. A failure domain is a logical representation of the physical cluster structure. For example, if one L3 segment spans two racks and another spans a single rack, the failure domains lie along the rack boundaries rather than along the L3 segmentation.
Verify your networking configuration:
Note
Networking verification may vary depending on the hardware used for the deployment. Use the following steps as a reference only.
For the best level of high availability, verify that the Ceph Monitor and RADOS Gateway nodes are distributed as evenly as possible across the failure domains.
Verify that each L3 compartment defines the same number of Ceph OSD nodes and OSDs with the same weights to ensure the best data distribution:
In classes/cluster/cluster_name/ceph/osd.yml, verify the weight of the Ceph OSDs. For example:
backend:
  bluestore:
    disks:
    - dev: /dev/vdc
      block_db: /dev/vdd
      class: hdd
      weight: 1.5
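In a typical cluster model, the backend fragment above sits under the ceph:osd parameter tree of osd.yml. The following is a minimal sketch of the surrounding context, assuming BlueStore OSDs with placeholder device paths:

parameters:
  ceph:
    osd:
      backend:
        bluestore:
          disks:
          - dev: /dev/vdc          # data device, placeholder
            block_db: /dev/vdd     # RocksDB/WAL device, placeholder
            class: hdd
            weight: 1.5            # keep the weights equal across compartments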
In classes/cluster/cluster_name/infra/config/nodes.yml, verify the number of Ceph OSD nodes.
Verify the connectivity between the nodes in different compartments through the public or cluster VLANs. To use different subnets for the Ceph nodes in different compartments, specify all subnets in classes/cluster/cluster_name/ceph/common.yml. For example:
parameters:
  ceph:
    common:
      public_network: 10.10.0.0/24, 10.10.1.0/24
      cluster_network: 10.11.0.0/24, 10.11.1.0/24
Prepare the CRUSHMAP:
To ensure at least one data replica in every failure domain, group the Ceph OSD nodes of each compartment by defining the ceph_crush_parent parameter for each Ceph OSD node in classes/cluster/cluster_name/infra/config/nodes.yml. For example, for three Ceph OSD nodes in rack01:
ceph_osd_rack01:
  name: ${_param:ceph_osd_rack01_hostname}<<count>>
  domain: ${_param:cluster_domain}
  classes:
  - cluster.${_param:cluster_name}.ceph.osd
  repeat:
    count: 3
    ip_ranges:
      single_address: 10.11.11.1-10.11.20.255
      backend_address: 10.12.11.1-10.12.20.255
      ceph_public_address: 10.13.11.1-10.13.20.255
    start: 1
    digits: 0
    params:
      single_address:
        value: <<single_address>>
      backend_address:
        value: <<backend_address>>
      ceph_public_address:
        value: <<ceph_public_address>>
  params:
    salt_master_host: ${_param:reclass_config_master}
    ceph_crush_parent: rack01
    linux_system_codename: xenial
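Group the Ceph OSD nodes of the remaining compartments in the same way by assigning them their own ceph_crush_parent value. The following is a condensed sketch for a hypothetical rack02 compartment; the hostname parameter, IP ranges, and node count are placeholders:

ceph_osd_rack02:
  name: ${_param:ceph_osd_rack02_hostname}<<count>>
  domain: ${_param:cluster_domain}
  classes:
  - cluster.${_param:cluster_name}.ceph.osd
  repeat:
    count: 3                                      # keep the node count equal across compartments
    ip_ranges:
      single_address: 10.11.21.1-10.11.30.255     # placeholder ranges for the second compartment
      backend_address: 10.12.21.1-10.12.30.255
      ceph_public_address: 10.13.21.1-10.13.30.255
    start: 1
    digits: 0
    params:
      single_address:
        value: <<single_address>>
      backend_address:
        value: <<backend_address>>
      ceph_public_address:
        value: <<ceph_public_address>>
  params:
    salt_master_host: ${_param:reclass_config_master}
    ceph_crush_parent: rack02                     # maps these nodes to the rack02 bucket
    linux_system_codename: xenial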
In classes/cluster/cluster_name/ceph/setup.yml, create a new CRUSHMAP and define the failure domains. For example, to distribute three copies of each object over rack01, rack02, and rack03:
parameters:
  ceph:
    setup:
      crush:
        enforce: false  # set to true only if you want to enforce the CRUSHMAP with the ceph.setup state
        type:  # list the bucket types to use, including any non-standard ones
        - root
        - room
        - rack
        - host
        - osd
        root:
        - name: default
        room:
        - name: room1
          parent: default
        - name: room2
          parent: default
        - name: room3
          parent: default
        rack:
        - name: rack01  # the OSD nodes defined in the previous step are added to this rack
          parent: room1
        - name: rack02
          parent: room2
        - name: rack03
          parent: room3
        rule:
          default:
            ruleset: 0
            type: replicated
            min_size: 2
            max_size: 10
            steps:
            - take default
            - chooseleaf firstn 0 type rack
            - emit
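With this rule, each of the three replicas lands in a different rack, so losing a whole compartment still leaves two copies available. If you also manage pools through setup.yml, a pool can reference the rule defined above by name. The following is a minimal sketch assuming a hypothetical replicated RBD pool named volumes with placeholder placement group numbers:

parameters:
  ceph:
    setup:
      pool:
        volumes:
          pg_num: 256             # placeholder, size according to the OSD count
          pgp_num: 256
          type: replicated
          crush_rule: default     # the rule defined in the CRUSHMAP above
          application: rbd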
Once done, proceed to Deploy a Ceph cluster.