Unresponsive OpenSDN API due to excessive Cassandra tombstone records

During HA operations such as a hard node reboot or a forced node shutdown, if tf-api is under load, the number of Cassandra tombstone records may exceed the limit, causing the tf-config pods to become unresponsive.

To resolve the issue, manually trigger garbage collection and compaction as follows:

  1. Reduce the garbage collection grace period:

    ALTER TABLE config_db_uuid.obj_uuid_table WITH gc_grace_seconds = 10;
    
  2. Wait at least 10 seconds for the reduced grace period to elapse, then run the following commands to force the deletion of tombstones and compact the database:

    nodetool garbagecollect -g CELL
    nodetool compact -s
    
  3. Restore the default value of gc_grace_seconds (864000 seconds, that is, 10 days) to avoid potential performance issues.
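
The three steps above can be chained in a small shell script so that the default grace period is restored even if garbage collection or compaction fails. The following is a minimal sketch, not a supported tool. It assumes that cqlsh and nodetool are available on PATH in the place where you run it (for example, inside a Cassandra container) and need no extra connection or authentication options. The DRY_RUN and WAIT_SECONDS variables and the function names are hypothetical and introduced only for this example.

```shell
#!/bin/sh
# Sketch of the tombstone cleanup sequence described above.
# Assumptions: cqlsh and nodetool are on PATH and need no extra options.
# DRY_RUN and WAIT_SECONDS are hypothetical variables for this example.

TABLE="config_db_uuid.obj_uuid_table"
DEFAULT_GC_GRACE=864000               # default grace period, 10 days
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the commands
WAIT_SECONDS="${WAIT_SECONDS:-15}"    # must exceed the reduced grace period

# Print the command in dry-run mode (quotes are not shown), else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set gc_grace_seconds on the table to the value passed as $1.
set_gc_grace() {
    run cqlsh -e "ALTER TABLE $TABLE WITH gc_grace_seconds = $1;"
}

cleanup_tombstones() {
    # Step 1: reduce the grace period.
    set_gc_grace 10 || return 1

    # Step 2: wait for the reduced grace period to elapse, then purge.
    [ "$DRY_RUN" = "1" ] || sleep "$WAIT_SECONDS"
    status=0
    run nodetool garbagecollect -g CELL && run nodetool compact -s || status=$?

    # Step 3: always restore the default, even if step 2 failed.
    set_gc_grace "$DEFAULT_GC_GRACE" || status=$?
    return "$status"
}

cleanup_tombstones
```

By default, the script only prints the commands it would run. Setting DRY_RUN=0 executes them, so review the printed sequence first.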