Mirantis Container Cloud (MCC) becomes part of Mirantis OpenStack for Kubernetes (MOSK)!

Starting with MOSK 25.2, the MOSK documentation set will cover all product layers, including MOSK management (formerly MCC). This means everything you need will be in one place. The separate MCC documentation site will be retired, so please update your bookmarks for continued easy access to the latest content.

Ceph health reports PG_DAMAGED after a failed disk or node replacement

After adding a new OSD node on a compact cluster, Ceph health may report HEALTH_ERR with the ceph health detail command output containing PG_DAMAGED and OSD_SCRUB_ERRORS messages. For example:

$ ceph -s

  cluster:
    id:     8bca9dfb-df99-4920-bba0-e5bca59876b4
    health: HEALTH_ERR
            1 scrub errors
            Possible data damage: 1 pg inconsistent

  services:
    mon: 3 daemons, quorum a,b,c (age 3h)
    mgr: a(active, since 3h), standbys: b
    osd: 4 osds: 4 up (since 109m), 4 in (since 110m)
    rgw: 2 daemons active (2 hosts, 1 zones)
$ ceph health detail

HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
[ERR] OSD_SCRUB_ERRORS: 1 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
    pg 11.2a is active+clean+inconsistent, acting [3,1]

To fix the PG_DAMAGED health error:

  1. Obtain the damaged placement group (PG) ID:

    ceph health detail
    

    Example of system response:

    HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
    [ERR] OSD_SCRUB_ERRORS: 1 scrub errors
    [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
        pg 11.2a is active+clean+inconsistent, acting [3,1]
    

    In the example above, 11.2a is the required PG ID.

  2. Repair the damaged PG:

    ceph pg repair <pgid>
    

    Substitute <pgid> with a damaged PG ID. For example:

    ceph pg repair 11.2a