Restarting the RADOS Gateway service using systemctl may fail. The workaround is to restart the service manually.


  1. Log in to an rgw node.

  2. Obtain the process ID of the RADOS Gateway service:

    ps aux | grep radosgw

    Example of system response:

    root     17526  0.0  0.0  13232   976 pts/0    S+   10:30   \
    0:00 grep --color=auto radosgw
    ceph     20728  0.1  1.4 1306844 58204 ?       Ssl  Jan28   \
    2:51 /usr/bin/radosgw -f --cluster ceph --name client.rgw.rgw01 --setuser ceph --setgroup ceph

    In this example, the RADOS Gateway process ID is 20728. The other line in the output is the grep process itself.

  3. Stop the process using the obtained process ID. For example:

    kill -9 20728
  4. Start the RADOS Gateway service specifying the client name, for example, client.rgw.rgw01:

    /usr/bin/radosgw --cluster ceph --name client.rgw.rgw01 --setuser ceph --setgroup ceph
  5. Repeat steps 1 - 4 on the remaining rgw nodes one by one. A verification sketch follows this procedure.
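
Optionally, verify that each restarted gateway accepts requests again. A minimal check, assuming the default civetweb port 7480 (your deployment may use a different port or HTTPS):

    # Expect an HTTP 200 response with an anonymous ListAllMyBucketsResult body
    curl -i http://localhost:7480/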


Fixed in 2019.2.3

The upgrade of a Ceph cluster from Jewel to Luminous using the Ceph - upgrade Jenkins pipeline job does not automatically verify that the other components were upgraded before the rgw nodes. As a result, uploading a file to the object storage may fail. The workaround is to upgrade the rgw nodes only after you have successfully upgraded the mon, mgr, and osd nodes.
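
Before upgrading the rgw nodes, you can confirm that the rest of the cluster already runs Luminous. Once the Ceph Monitors are upgraded, the following command (available starting with Luminous) lists the version that each daemon runs:

    ceph versions

All mon, mgr, and osd daemons must report a 12.x (Luminous) version before you proceed with the rgw nodes.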


The tempest.api.object_storage.test_account_quotas.AccountQuotasTest.test_admin_modify_quota Tempest test fails because modifying the account quota is not possible even if the OpenStack user has the ResellerAdmin role. Setting a quota using the Swift CLI and API served by RADOS Gateway is not possible either. As a workaround, set the quotas using the radosgw-admin utility (requires SSH access to an OpenStack environment) as described in Quota management, or using the RADOS Gateway Admin Operations API as described in Quotas.
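
For example, a minimal radosgw-admin sketch that sets and enables a quota for a user (the user ID and limits below are placeholders):

    radosgw-admin quota set --quota-scope=user --uid=<USER_ID> --max-objects=1024 --max-size=10G
    radosgw-admin quota enable --quota-scope=user --uid=<USER_ID>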


Creating Swift containers with custom headers using a Heat stack or the tempest.api.orchestration.stacks.test_swift_resources.SwiftResourcesTestJSON.test_acl Tempest test fails. As a workaround, first create the container without additional parameters and then set the metadata variables as required.
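
For example, using the Swift CLI (the container name and metadata header are placeholders):

    swift post <CONTAINER_NAME>
    swift post <CONTAINER_NAME> -H "X-Container-Meta-Color: blue"

The first command creates the container if it does not yet exist; the second sets the custom metadata on it.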


Fixed in 2019.2.4

The mon_max_pg_per_osd variable is set in the wrong section and does not apply to the Ceph OSDs. The workaround is to manually apply the necessary changes to the cluster model.


  1. In classes/cluster/<cluster_name>/ceph/common.yml, define the additional parameters in the ceph:common pillar as follows:

    ceph:
      common:
        config:
          global:
            mon_max_pg_per_osd: 600
  2. In /classes/service/ceph/mon/cluster.yml and /classes/service/ceph/mon/single.yml, remove the configuration for mon_max_pg_per_osd:

     #  config:
     #    mon:
     #      mon_max_pg_per_osd: 600
  3. Apply the ceph.common state on the Ceph nodes:

    salt -C "I@ceph:common" state.sls ceph.common
  4. Set the noout and norebalance flags:

    ceph osd set noout
    ceph osd set norebalance
  5. Restart the Ceph Monitor services on the cmn nodes one by one. Verify that the cluster is in the HEALTH_OK status after each Ceph Monitor restart:

    salt -C <HOST_NAME> cmd.run 'systemctl restart ceph-mon.target'
    salt -C <HOST_NAME> cmd.run 'ceph -s'
  6. Restart the Ceph OSD services on the osd nodes one by one:

    1. On each Ceph OSD node, list the OSDs running on that node:

      ceph001# ceph osd status 2>&1 | grep $(hostname)
    2. For each Ceph OSD number from the list, restart the service and verify that it is running again (a loop sketch follows this procedure):

      ceph001# service ceph-osd@OSD_NR_FROM_LIST status
      ceph001# service ceph-osd@OSD_NR_FROM_LIST restart
      ceph001# service ceph-osd@OSD_NR_FROM_LIST status
    3. Verify that the cluster is in the HEALTH_OK status before restarting the next Ceph OSD.

  7. After the last Ceph OSD has been restarted, unset the noout and norebalance flags:

    ceph osd unset noout
    ceph osd unset norebalance
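
For reference, step 6 can be scripted. The following is a minimal sketch to run on each Ceph OSD node, assuming the ceph-osd@<N> service naming used above; the awk field index is an assumption based on the tabular output of ceph osd status:

    # Restart every OSD on this node one by one, waiting for HEALTH_OK in between
    for OSD_NR in $(ceph osd status 2>&1 | grep $(hostname) | awk '{print $2}'); do
        service ceph-osd@${OSD_NR} restart
        service ceph-osd@${OSD_NR} status
        # Do not continue until the cluster has recovered
        while ! ceph health | grep -q HEALTH_OK; do sleep 10; done
    done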