DriveTrain

  • Fixed the issue with the Linux kernel headers failing to install automatically during the upgrade of an MCP cluster.
  • Fixed the issue with the Nova and Cinder tests failing when running sanity tests with the CVP - Sanity checks Jenkins pipeline job on OpenStack Queens environments.
  • Fixed the issue with the Deploy - upgrade MCP Drivetrain Jenkins pipeline job failing on the Update Drivetrain stage with the Failed to load ext_pillar reclass: ext_pillar.reclass error message. The issue affected the Kubernetes Calico-based deployments.
  • Fixed the issue with the OpenStack endpoints being unreachable when the HAProxy service stopped on the ctl, ntw, dbs, rgw, and prx VIP nodes.
  • Fixed the issue that caused MySQL to become unavailable when the HAProxy service went down on a node. Added the Keepalived VRRP check on the dbs and other VCP nodes.
  • Fixed the issue with the requests hanging when connecting to the database due to the default HAProxy connection limit being too low for large clusters. Increased the maximum number of connections handled by the HAProxy process to 25000 by default and added the capability to modify this value.
  • Implemented the cleanup commissioning script to fix the issue with MAAS failing to reprovision hardware nodes with old software RAID. For details, see: MCP Deployment Guide: Add custom commissioning scripts.
  • Fixed the issue with OpenStack Nova missing the Memcached configuration for large clusters.
  • Fixed the issue with the CVP - Simplified performance tests (SPT) Jenkins pipeline job freezing if HW_NODES was set to an odd number of ctl and cmp nodes. In this case, the iperf processes kept running, which could cause subsequent pipeline job failures.
  • Fixed the issue with MAAS importing unnecessarily large images during the Salt Master node bootstrapping and causing timeout errors, for example, TimeoutError: Node ‘cfg01.cookied-cicd-k8s-calico.local’ didn’t open SSH in 1800 sec.
  • Fixed the issue with the MCP cluster deployment pipeline jobs failing with the Can’t contact LDAP server error message.
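
The HAProxy connection limit fix above names a new default of 25000 but does not say which setting carries it. As an illustrative sketch only, assuming the limit corresponds to the HAProxy maxconn directive in the global section of haproxy.cfg, the following Python helper reads that value from a configuration text. The function name, the sample configuration, and the mysql_cluster listener are hypothetical and are not taken from MCP.

```python
# Hypothetical helper: read the process-wide connection limit from an
# HAProxy configuration. Assumes the limit is the `maxconn` directive
# in the `global` section, which these release notes do not confirm.

# Keywords that start a new section in haproxy.cfg (common subset).
SECTION_KEYWORDS = {"global", "defaults", "frontend", "backend", "listen"}


def global_maxconn(cfg_text):
    """Return the `maxconn` value of the `global` section, or None."""
    in_global = False
    for raw in cfg_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if parts[0] in SECTION_KEYWORDS:
            # Track whether the current section is `global`.
            in_global = parts[0] == "global"
            continue
        if in_global and parts[0] == "maxconn":
            if len(parts) >= 2 and parts[1].isdigit():
                return int(parts[1])
    return None


# Made-up sample configuration for illustration.
SAMPLE_CFG = """
global
    maxconn 25000  # process-wide limit
defaults
    maxconn 8000
listen mysql_cluster
    bind 0.0.0.0:3306
"""

if __name__ == "__main__":
    limit = global_maxconn(SAMPLE_CFG)
    print("global maxconn:", limit)
```

A check like this could be run against the rendered configuration on a node after the upgrade to confirm that the expected limit is in place. The way the value is overridden in the cluster model is not described here, so that part is deliberately left out.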