Unresponsive OpenSDN API due to excessive Cassandra tombstone records
During HA operations such as a hard node reboot or a forced node shutdown
while tf-api is under load, the number of Cassandra tombstone records may exceed
the limit, causing the tf-config pods to become unresponsive.
To resolve the issue, manually trigger garbage collection and compaction as follows:
Reduce the garbage collection grace period:
ALTER TABLE config_db_uuid.obj_uuid_table WITH gc_grace_seconds = 10;
After some time (10+ seconds), run the following commands to force deletion of tombstones and compact the database:
nodetool garbagecollect -g CELL
nodetool compact -s
Restore the default gc_grace_seconds = 864000 to avoid potential performance issues.
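The steps above can be sketched as a single shell script. This is a minimal sketch, not an official procedure: it assumes the script runs in an environment where cqlsh and nodetool can already reach the Cassandra config database without extra connection or authentication options. The run helper, the DRY_RUN switch, and the 15-second wait are illustrative choices that do not come from the procedure itself. By default the script only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch of the tombstone cleanup sequence described above.
# Assumption: cqlsh and nodetool are on PATH and need no extra
# connection or authentication options in this environment.

TABLE="config_db_uuid.obj_uuid_table"
TEMP_GC_GRACE=10          # temporary value from the procedure
DEFAULT_GC_GRACE=864000   # default value to restore afterwards
WAIT_SECONDS=15           # assumption: any wait longer than 10 seconds

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

cleanup_tombstones() {
  # 1. Reduce the garbage collection grace period.
  run cqlsh -e "ALTER TABLE ${TABLE} WITH gc_grace_seconds = ${TEMP_GC_GRACE};"
  # 2. Wait until the shortened grace period has elapsed.
  run sleep "${WAIT_SECONDS}"
  # 3. Force deletion of tombstones and compact the database.
  run nodetool garbagecollect -g CELL
  run nodetool compact -s
  # 4. Restore the default grace period.
  run cqlsh -e "ALTER TABLE ${TABLE} WITH gc_grace_seconds = ${DEFAULT_GC_GRACE};"
}

cleanup_tombstones
```

Running the script as is lists the five commands for review. Setting DRY_RUN=0 executes them. Because the script does not stop on errors, the final statement that restores gc_grace_seconds is still attempted if one of the nodetool commands fails.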