rpm-ostree troubleshooting

When working with rpm-ostree, it is crucial to be aware of common issues that can occur and to know how to troubleshoot them effectively.

Common problems with rpm-ostree include conflicts during upgrades, insufficient disk space, and failed deployments. Basic troubleshooting techniques include reviewing logs with journalctl, monitoring disk space, inspecting deployment health with rpm-ostree status, and examining the error messages of specific operations.
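
The disk space check mentioned above can be sketched as a small shell helper. This is a minimal sketch and not part of rpm-ostree: the function names disk_use_percent and check_disk and the 90 percent threshold are assumptions made for illustration. The example checks / so that it runs on any system; on an ostree-based host, the filesystems that hold /sysroot and /var are likely the ones worth checking.

```shell
# Print the used percentage (digits only) of the filesystem that holds a path.
disk_use_percent() {
    # df -P prints a header line, then one POSIX-format line for the path;
    # field 5 of that line is the capacity, for example "45%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report whether a filesystem is at or above a usage limit.
# The default limit of 90 is an arbitrary example value.
check_disk() {
    check_path=$1
    check_limit=${2:-90}
    check_used=$(disk_use_percent "$check_path")
    if [ "$check_used" -ge "$check_limit" ]; then
        echo "WARNING: $check_path is ${check_used}% full"
        return 1
    fi
    echo "OK: $check_path is ${check_used}% full"
}

check_disk / || echo "Free some disk space before retrying the operation."
```

If the disk space looks fine, running rpm-ostree status is a reasonable next step to inspect the state of the deployments.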

Checking the system logs with journalctl

With the -u option, the journalctl command shows only the logs of the rpm-ostreed service. These logs record events, errors, and warnings related to rpm-ostree.

When examining the logs, search for error messages, warnings, and indications of failed operations. These can point to the root cause of the issue and suggest further troubleshooting steps.

To view and follow system logs for rpm-ostreed.service:

journalctl -f -u rpm-ostreed.service

Example system response:

-- Logs begin at Sun 2023-05-28 22:52:52 EDT. --
Jun 01 19:34:30 localhost.localdomain rpm-ostree[1323]: libostree HTTP error from remote MKEx for .....

To view system logs with a priority of ERR or higher for rpm-ostreed.service:

journalctl -p err -f -u rpm-ostreed.service
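
When the log is long, it can help to condense it before reading. The following is a minimal sketch: summarize_errors is a hypothetical helper name, the search terms are assumptions about what the relevant messages contain, and the prefix that the sed expression strips is based on the example log line shown earlier. The -b (current boot only) and --no-pager options are standard journalctl options.

```shell
# Read log lines on stdin, keep those that look like problems, strip the
# "timestamp host rpm-ostree[PID]: " prefix, and count identical messages.
summarize_errors() {
    grep -i -E 'error|fail|warn' |
        sed -E 's/^.*rpm-ostree\[[0-9]+\]: //' |
        sort | uniq -c | sort -rn
}

# Summarize the current boot, but only where journalctl is available.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -b --no-pager -u rpm-ostreed.service | summarize_errors
fi
```

Messages that repeat many times, such as recurring HTTP errors from a remote, then appear at the top of the summary.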

Examining error messages for specific operations

When performing operations such as upgrades, rollbacks, and installations with rpm-ostree, pay attention to any error messages that the system displays.

Error messages often provide specific details about the issue encountered, such as package conflicts, missing dependencies, or connectivity problems with repositories.

To examine error messages, run the desired rpm-ostree command and carefully read the output. Look for any error indications or specific error codes mentioned. These can help narrow down the issue and guide the troubleshooting process.

Example error message during an upgrade:

Error: Can't upgrade to commit abcdefg: This package conflicts with package-xyz-1.0.0-1.x86_64.

This error message indicates that a package conflict prevents the upgrade. To investigate further, check the package versions and dependencies, and then resolve the conflict.
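
One way to keep the output of a failed operation for later review is to wrap the command in a small helper. The sketch below is an illustration only: run_logged is a hypothetical function name, and the search terms error and conflict are assumptions about the wording of the messages.

```shell
# Run a command, save its combined output to a log file, and on failure
# print the exit status and any lines that look like errors.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    run_log=$1
    shift
    "$@" >"$run_log" 2>&1
    run_status=$?
    if [ "$run_status" -ne 0 ]; then
        echo "Command failed with exit status $run_status. Matching lines in $run_log:"
        grep -i -E 'error|conflict' "$run_log"
    fi
    return "$run_status"
}
```

For example, run_logged /tmp/upgrade.log rpm-ostree upgrade stores the full output of the upgrade in /tmp/upgrade.log, where the log path is an arbitrary choice.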

Cancel an active transaction

The rpm-ostree tool provides a cancel command that cancels an active transaction. This is useful if, for example, you accidentally start an upgrade or rebase of a large deployment and want to cancel the operation:

rpm-ostree cancel
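
Before canceling, you may want to confirm that a transaction is actually running. The sketch below assumes that the output of rpm-ostree status contains a "State:" line whose value is idle when no transaction is active; verify this against the output on your system. The function names transaction_state and cancel_if_busy are hypothetical.

```shell
# Print the value of the "State:" line from `rpm-ostree status` output on stdin.
transaction_state() {
    awk '$1 == "State:" { print $2; exit }'
}

# Cancel the active transaction only when the daemon does not report idle.
cancel_if_busy() {
    txn_state=$(rpm-ostree status | transaction_state)
    case "$txn_state" in
        idle) echo "No active transaction; nothing to cancel." ;;
        "")   echo "Could not determine the daemon state." >&2; return 1 ;;
        *)    rpm-ostree cancel ;;
    esac
}

# Run the check only where rpm-ostree is installed.
if command -v rpm-ostree >/dev/null 2>&1; then
    cancel_if_busy || echo "Cancel check failed." >&2
fi
```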

Updating and managing repositories

To ensure a smooth and reliable experience with rpm-ostree, keep the repository metadata up to date by refreshing it regularly.

To update the metadata for a repository:

rpm-ostree refresh-md

This command fetches the latest information about available packages, dependencies, and updates. Much like fetching remote branches in Git, refreshing the metadata ensures that you have the latest information from the repositories.

Debug the cluster without SSH access

The MKEx ISO images include the mirantis/mkexdebug debug image, which you can use to debug your cluster.

  1. Following MKEx setup, upload the mirantis/mkexdebug image to a private, secure repository.

  2. Run the kubectl debug command to attach a dedicated troubleshooting container to the node you want to check:

    kubectl debug node/[node name] -it --image=mirantis/mkexdebug

    Once the troubleshooting container is in place, you can run commands against it to check the node.
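
Because step 1 places the image in a private repository, the image reference in step 2 usually needs the registry host as a prefix. The following sketch only assembles and prints the command so that you can review it before running it. The registry host registry.example.com, the node name worker-1, and the function name debug_command are hypothetical placeholders.

```shell
# Hypothetical values: replace them with your own registry host and node name.
REGISTRY="registry.example.com"
NODE="worker-1"
IMAGE="$REGISTRY/mirantis/mkexdebug"

# Print the kubectl debug command for a given node and image.
debug_command() {
    printf 'kubectl debug node/%s -it --image=%s\n' "$1" "$2"
}

# Node names can be listed with: kubectl get nodes
debug_command "$NODE" "$IMAGE"
```

According to the kubectl documentation, a node debugging container mounts the root filesystem of the node at /host, so the files of the node are available under that path.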