Global recommendations for implementation of custom modules

The following recommendations are intended to help module creators and cloud operators use the day-2 operations API to implement and execute modules while keeping the cluster and machines healthy and ensuring safe and reliable cluster operability.

Functionality limitations

Module functionality is limited only by Ansible itself and by the playbook rules of the particular Ansible version. However, Mirantis highly recommends paying special attention to the critical components of Container Cloud, some of which are listed below, and not managing them by means of day-2 modules.

Important

The cloud operator takes all risks and responsibility for module execution on cluster machines. For any questions, contact Mirantis support.

  1. Do not restart Docker, containerd, and Kubernetes-related services.

  2. Do not configure Docker and Kubernetes node labels.

  3. Do not reconfigure or upgrade MKE.

  4. Do not change the MKE bundle.

  5. Do not reboot nodes using a day-2 module.

  6. Do not change network configuration, especially on critical LCM and external networks, so that they remain consistent with kaas-ipam objects.

  7. Do not change iptables, especially for Docker, Kubernetes, and Calico rules.

  8. Do not change partitions on the fly, especially the / and /var/lib/docker ones.

Ansible version

Since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0), the following Ansible versions are supported for Ubuntu 20.04 and 22.04: Ansible 2.12.10 and Ansible 5.10.0-collection. Therefore, your custom modules must be compatible with the Ansible versions provided for the Cluster release that your cluster is based on.

To verify the Ansible version in a specific Cluster release, refer to the Cluster releases section in Release Notes and use the Artifacts > System and MCR artifacts section of the corresponding Cluster release, for example, 17.2.0.
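As a complement to checking the Release Notes, a module author may want to compare the Ansible version in a development environment with the supported one. The following is a minimal sketch, not part of Container Cloud: the helper names are hypothetical, and it assumes that the first line of the `ansible --version` output has the form `ansible [core X.Y.Z]`, as printed by recent ansible-core releases.

```python
import re

# Supported core version taken from the text above; adjust per Cluster release.
SUPPORTED_CORE = "2.12.10"


def parse_core_version(version_output: str) -> str:
    """Extract the core version from the first line of `ansible --version` output."""
    first_line = version_output.splitlines()[0] if version_output else ""
    match = re.search(r"\[core (\d+(?:\.\d+)*)\]", first_line)
    if match is None:
        raise ValueError(f"unrecognized version line: {first_line!r}")
    return match.group(1)


def same_minor_series(found: str, expected: str = SUPPORTED_CORE) -> bool:
    """Return True if both versions share the same major.minor series."""
    return found.split(".")[:2] == expected.split(".")[:2]
```

The text passed to `parse_core_version` can be captured, for example, with `subprocess.run(["ansible", "--version"], capture_output=True, text=True).stdout`. Comparing only the major.minor series is a design choice of this sketch; a stricter check may compare the full version string.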

Module implementation principles

Treat a day-2 module as an Ansible module that controls a limited set of system resources related to one component, for example, a service or driver, so that the module contains only a small number of tasks required to set up that component.

For example, if you need to configure a service on a host, the module must manage only package installation, related configuration files, and service enablement. Do not implement a module so that it manages all tasks required for the day-2 configuration of a host. Instead, split such functionality into separate modules, each responsible for managing a single component. This allows you to re-apply (re-run) each module separately when anything changes.

Mirantis highly recommends using the following key principles during module implementation:

Idempotency

Any module re-run with the same configuration values must lead to the same result.

Granularity

The module must manage only one specific component on a host.

Reset action

The module must be able to revert the changes that it introduces, or at least the module must be able to disable the component controller. The Container Cloud LCM does not provide a way to revert a day-2 change because the potential functionality of any module is unpredictable. Therefore, the reset action must be implemented on the module level. For example, the state of a package or file can be present or absent, and a service can be enabled or disabled. These states must be controlled by the configuration values.
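The idempotency and reset principles above can be illustrated with a minimal Python sketch. It is not Container Cloud or Ansible code: the function name `ensure_file` and its parameters are hypothetical, and the `state` argument stands in for a configuration value passed to the module. The function changes the file only when needed and reports whether a change was made, so a re-run with the same values is a no-op.

```python
import os


def ensure_file(path: str, content: str, state: str = "present") -> bool:
    """Bring `path` to the requested state; return True only if something changed."""
    if state not in ("present", "absent"):
        raise ValueError(f"unsupported state: {state!r}")

    if state == "absent":
        # Reset action: remove the file that an earlier run created.
        if os.path.exists(path):
            os.remove(path)
            return True
        return False

    # state == "present": rewrite the file only when the content differs.
    if os.path.isfile(path):
        with open(path, encoding="utf-8") as f:
            if f.read() == content:
                return False
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return True
```

In an actual day-2 module, the same behavior is usually obtained from Ansible built-in modules that accept a `state` parameter, with the value supplied through the module configuration values.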

Module testing

Mirantis highly recommends verifying any Container Cloud or custom module on one machine before applying it to all target machines. For the testing procedure, see Test a custom or Container Cloud module after creation.

Reboot required

A custom module may require a node reboot after execution. Implement such a module using one of the following options, so that it can notify lcm-agent and Container Cloud controllers about the required reboot:

  • If a module installs a package that requires a host reboot, then the /run/reboot-required and /var/run/reboot-required.pkgs files are created automatically by the package manager. LCM Agent detects these files and places information about the reboot reason in the LCMMachine status.

  • A module can create the /run/reboot-required file on the node. You can add the reason for reboot in the /run/lcm/reboot-required file as plain text. This text is passed to the reboot reason in the LCMMachine status.

Once done, you can handle a machine reboot using GracefulRebootRequest.
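The second option above, where the module itself requests a reboot, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Container Cloud implementation: the function name `request_reboot` is hypothetical, and the `root` parameter exists only so that the sketch can be tried against a scratch directory instead of the real /run.

```python
from pathlib import Path


def request_reboot(reason: str, root: str = "/") -> None:
    """Create the reboot marker file and record a plain-text reboot reason."""
    base = Path(root)
    marker = base / "run" / "reboot-required"
    reason_file = base / "run" / "lcm" / "reboot-required"

    # Write the reason first so that it is already in place
    # when the marker file appears (a choice made in this sketch).
    reason_file.parent.mkdir(parents=True, exist_ok=True)
    reason_file.write_text(reason, encoding="utf-8")

    # Creating the marker is safe to repeat on a module re-run.
    marker.touch(exist_ok=True)
```

Calling `request_reboot("kernel parameters changed")` on a node requires root privileges because both files reside under /run. In a playbook-based module, the same result is typically achieved with file-creation and content-writing tasks.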