High availability in DriveTrain

DriveTrain is the integration framework for the MCP product. Therefore, its continuous availability is essential for the MCP solution to function properly. Although you can deploy DriveTrain in single-node Docker Swarm mode for testing purposes, most production environments require a highly available DriveTrain installation.

All DriveTrain components run as containers in a Docker Swarm mode cluster, which ensures that services are provided continuously, without interruptions, and are not susceptible to the failure of a single node.

The following components ensure high availability of DriveTrain:

  • Docker Swarm mode is a special Docker mode that provides Docker cluster management. A Docker Swarm cluster ensures:
    • High availability of the DriveTrain services. In case of failure on any infrastructure node, Docker Swarm reschedules all services to other available nodes. GlusterFS ensures the integrity of persistent data.
    • Internal network connectivity between the Docker Swarm services through the Docker native networking.
  • Keepalived is a routing utility for Linux that provides a single point of entry for all DriveTrain services through a virtual IP address (VIP). If the node on which the VIP is active fails, Keepalived fails over the VIP to another available node.
  • nginx is web-server software that exposes the APIs of the DriveTrain services, which run in a private network, to a public network space.
  • GlusterFS is a distributed file system that ensures the integrity of the MCP Registry and Gerrit data by storing the data in shared storage on separate volumes. This ensures that persistent data is preserved during a failover.
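
The failover behavior described in the list above can be illustrated with a toy model. The following Python sketch is not how Docker Swarm or Keepalived are implemented. The node names, service placement, and selection rules (least-loaded node for services, first surviving node in list order for the VIP) are assumptions made only for illustration.

```python
# Toy model of DriveTrain failover. All names and rules are hypothetical:
# the real Docker Swarm scheduler and Keepalived (VRRP) election are more involved.

def reschedule(placement, failed_node, nodes):
    """Return a new service-to-node mapping after failed_node goes down.

    Each service from the failed node moves to the surviving node that
    currently runs the fewest services (ties broken by node name).
    """
    survivors = [n for n in nodes if n != failed_node]
    if not survivors:
        raise RuntimeError("no available nodes left")
    new_placement = dict(placement)
    for service, node in sorted(placement.items()):
        if node != failed_node:
            continue
        # Count services currently assigned to each surviving node.
        load = {n: 0 for n in survivors}
        for assigned in new_placement.values():
            if assigned in load:
                load[assigned] += 1
        new_placement[service] = min(survivors, key=lambda n: (load[n], n))
    return new_placement


def fail_over_vip(vip_holder, failed_node, nodes):
    """Return the node that holds the VIP after failed_node goes down.

    List order stands in for a priority order: the first surviving
    node takes over the VIP.
    """
    if vip_holder != failed_node:
        return vip_holder
    survivors = [n for n in nodes if n != failed_node]
    if not survivors:
        raise RuntimeError("no available nodes left")
    return survivors[0]


# Hypothetical three-node cluster with the VIP active on node1.
nodes = ["node1", "node2", "node3"]
placement = {"gerrit": "node1", "jenkins": "node2", "registry": "node1"}

after = reschedule(placement, "node1", nodes)
vip = fail_over_vip("node1", "node1", nodes)
print(after)
print(vip)
```

In this model, the services keep running on the surviving nodes and clients keep using the same VIP. The persistent data is not modeled here: in the deployment described above, it is preserved because it resides on GlusterFS volumes rather than on the failed node alone.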

The following diagram describes high availability in DriveTrain:

../../_images/d_drive_train_ha.png