Upgrade and update an MCP cluster
A typical MCP cluster includes multiple components, such as DriveTrain,
StackLight, OpenStack, OpenContrail, and Ceph. Most MCP components have
their own versioning schema. For the majority of the components, MCP supports
multiple versions at once.
The upgrade of an MCP deployment to a new version is a multi-step process that
needs to take into account the cross-dependencies between the components of
the platform and the compatibility matrix of the supported component versions.
The MCP components that do not have their own versioning schema within MCP and
are versioned by the MCP release include:
- The DriveTrain components: Aptly, Gerrit, Jenkins, Reclass, Salt formulas
and their subcomponents
- StackLight LMA
Caution
Before proceeding with the upgrade procedure, verify that you
have updated DriveTrain, including Aptly, Gerrit, Jenkins, Reclass,
Salt formulas, and their subcomponents, to the current MCP release
version. Otherwise, the current MCP product documentation is not
applicable to your MCP deployment.
Note
Starting from the MCP 2019.2.16 maintenance update, before proceeding
with any update or upgrade procedure, first verify that Nova cell mapping
is enabled. For details, see Disable Nova cell mapping.
Note
Starting from the MCP 2019.2.17 maintenance update, before proceeding
with the next update procedure, you can verify that the model contains
information about the necessary fixes and workarounds. For details, see
Verify DriveTrain.
For the MCP components with support for multiple versions, such as OpenStack
or OpenContrail, you can usually choose between two operations:
- Minor version update (maintenance update)
New minor versions of the component artifacts are installed. Services are
restarted as necessary. This kind of update delivers the
latest bug and security fixes for the components, but it typically
does not change the component capabilities.
- Major version update (upgrade)
New major versions of the component artifacts are installed. Additional
orchestration tasks are executed to change the component configuration,
if necessary. This kind of update typically changes and improves
the component capabilities.
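The distinction between the two operations can be illustrated with a minimal sketch. It assumes a dotted numeric version string in which the first field is the major version. That scheme is an assumption for illustration only, since some MCP components identify releases by name rather than by number. The function names parse_version and classify_change are hypothetical and are not part of MCP.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted numeric version such as '4.0.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def classify_change(current: str, target: str) -> str:
    """Return 'upgrade', 'update', or 'none' for a version change.

    Assumption: the first field is the major version, and a change in any
    later field is a minor (maintenance) change.
    """
    cur, new = parse_version(current), parse_version(target)
    # Pad the shorter tuple with zeros so that '4.1' equals '4.1.0'.
    width = max(len(cur), len(new))
    cur += (0,) * (width - len(cur))
    new += (0,) * (width - len(new))
    if new == cur:
        return "none"
    if new < cur:
        raise ValueError(f"downgrade from {current} to {target} is not handled")
    return "upgrade" if new[0] > cur[0] else "update"
```

Under these assumptions, classify_change("4.0.1", "4.0.3") is classified as an update, while a change in the first field is classified as an upgrade and would call for the additional orchestration tasks described above.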
The following table outlines a general upgrade and update procedure workflow
of an MCP cluster. For the detailed upgrade and update workflow of MCP
components, refer to the corresponding sections below.
General upgrade and update procedure workflow
# |
Stage |
Description |
1 |
Upgrade or update DriveTrain |
Perform the basic LCM update or upgrade:
- Update the Reclass system.
- Fetch the corresponding Git repositories.
- Update all binary repository definitions on the Salt Master node.
- Update and sync all Salt formulas.
- Apply the linux.repo, linux.user, and openssh states
on all nodes.
- Upgrade or update the DriveTrain services.
- Optional. Upgrade system packages on the Salt Master node.
- Upgrade or update GlusterFS:
- Upgrade or update packages for the GlusterFS server
on each target host one by one.
- Upgrade or update packages for the GlusterFS clients
and re-mount volumes on each target GlusterFS client host
one by one.
- Obtain the cluster.max-op-version option value
from GlusterFS and compare it with cluster.op-version
to identify whether a version upgrade is required.
- Update cluster.op-version.
- Optional. Configure allowed and rejected IP addresses for the GlusterFS volumes.
|
2 |
Upgrade or
update OpenContrail (if applicable) |
- Verify the OpenContrail service statuses.
- Back up the Cassandra and ZooKeeper data.
- Stop the Neutron server services.
- Upgrade or update the OpenContrail analytics nodes simultaneously.
During upgrade, new Docker containers for the OpenContrail analytics
nodes are spawned. During update, the corresponding Docker images
are updated.
- Upgrade or update the OpenContrail controller nodes.
During upgrade, new Docker containers for the OpenContrail controller
nodes are spawned. During update, the corresponding Docker images
are updated.
All nodes are upgraded or updated simultaneously
except the one that is currently running the
contrail-control service,
which is upgraded or updated after the other nodes.
- Upgrade or update the OpenContrail packages on the OpenStack
controller nodes simultaneously.
- Start the Neutron server services.
- Upgrade or update the OpenContrail data plane nodes one by one,
migrating the workloads if needed, because this step implies
downtime of the Networking service.
|
3 |
Upgrade or update OpenStack
or Kubernetes |
For OpenStack:
- On every OpenStack controller node one by one:
- Stop the OpenStack API services.
- Upgrade or update the OpenStack packages.
- Start the OpenStack services.
- Apply the OpenStack states.
- Verify that the OpenStack services are up and healthy.
- Upgrade the OpenStack data plane.
Caution
We recommend that you do not upgrade or update OpenStack
and RabbitMQ simultaneously. Upgrade or update the RabbitMQ component only
once OpenStack is running on the new version.
|
4 |
Upgrade or update Galera |
- Prepare the Galera cluster for the upgrade.
- Upgrade or update the MySQL and Galera packages on the Galera nodes
one by one.
- Verify the cluster status after upgrade.
|
5 |
Upgrade or
update RabbitMQ |
- Prepare the Neutron service for the RabbitMQ upgrade or update.
- Verify that the RabbitMQ upgrade pipeline job is present in Jenkins.
- Upgrade or update the RabbitMQ component.
Caution
We recommend that you do not upgrade or update OpenStack
and RabbitMQ simultaneously. Upgrade or update the RabbitMQ component only
once OpenStack is running on the new version.
|
3 (continued) |
Upgrade or update OpenStack
or Kubernetes |
For Kubernetes:
- Upgrade or update essential Kubernetes binaries, for example,
hyperkube, etcd, cni.
- Restart essential Kubernetes services.
- Upgrade or update the addon definitions with the latest images.
- Perform the Kubernetes control plane changes, if any, on every
Kubernetes Master node one by one.
- Upgrade or update the Kubernetes Nodes one by one.
|
6 |
Upgrade or
update StackLight |
- During upgrade, enable the Ceph Prometheus plugin (if applicable).
- Upgrade or update system components including Telegraf, Fluentd,
Prometheus Relay, libvirt-exporter, and jmx-exporter.
- Upgrade or update Elasticsearch and Kibana one by one:
- Stop the corresponding service on all
log nodes.
- Upgrade or update the packages to the newest version.
- For Elasticsearch, reload the
systemd configuration.
- Start the corresponding service on all
log nodes.
- Verify that the Elasticsearch cluster status is green.
- In case of a major version upgrade, transform the indices for the
new version of Elasticsearch and migrate Kibana to the new index.
- Upgrade or update components running in Docker Swarm:
- Disable and remove the previous versions of monitoring
services.
- Rebuild the Prometheus configuration by applying the
prometheus state on the mon nodes.
- Disable and remove the previous version of Grafana.
- Start the monitoring services by applying the
docker state
on the mon nodes.
- Apply the
saltutil.sync_all state and the grafana.client
state to refresh the Grafana dashboards.
|
7 |
Upgrade or update Ceph |
For upgrade:
- Prepare the Ceph cluster for upgrade.
- Perform the backup.
- Upgrade the Ceph repository on each node one by one.
- Upgrade the Ceph packages on each node one by one.
- Restart the Ceph services on each node one by one.
- Verify the upgrade on each node one by one and wait for user
input to proceed.
- Perform the post-upgrade procedures.
For update:
- Update and install new Ceph packages on the
cmn nodes.
- Restart Ceph Monitor services on all
cmn nodes one by one.
- Starting from the 2019.2.8 update, restart Ceph Manager on all
mgr nodes one by one.
- Update and install new Ceph packages on the
osd nodes.
- Restart Ceph OSD services on all
osd nodes one by one.
- Update and install new Ceph packages on the
rgw nodes.
- Restart Ceph RADOS Gateway services on all
rgw nodes one by one.
After the restart of every service, wait for the system to become
healthy.
|
8 |
Update the base operating system |
Install security updates on all nodes.
This step comes last in the procedure to reduce the volume of new packages
installed on the cluster during an update or upgrade.
However, you can perform it at any stage to fetch only security patches.
|
9 |
Apply the issue resolutions that require manual application, as described in the
Addressed issues sections of all
maintenance updates. |
Apply fixes that require manual application for all maintenance updates
one by one. |
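Many stages in the table above follow the same pattern: process the nodes one by one and wait for the system to become healthy before moving on to the next node, as in the Ceph update stage. The following is a minimal sketch of that pattern, not the logic of the actual MCP pipelines. The function rolling_apply and its parameters are hypothetical. The caller supplies the action and is_healthy callables, which stand in for whatever restart command and health check a deployment actually uses.

```python
import time
from typing import Callable, Iterable, List


def rolling_apply(
    nodes: Iterable[str],
    action: Callable[[str], None],
    is_healthy: Callable[[], bool],
    timeout: float = 600.0,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run action on each node in turn, waiting for health after each one.

    Returns the list of nodes that were processed and confirmed healthy.
    Raises TimeoutError, without touching the remaining nodes, if the
    health check does not pass within the timeout after a node.
    """
    done: List[str] = []
    for node in nodes:
        # Hypothetical per-node step, for example a service restart.
        action(node)
        deadline = time.monotonic() + timeout
        # Poll the caller-supplied health check until it passes.
        while not is_healthy():
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"system not healthy after processing {node}"
                )
            sleep(poll_interval)
        done.append(node)
    return done
```

Stopping at the first node that fails to recover keeps the remaining nodes on the previous version, which leaves room to investigate before continuing. The sleep parameter exists only so that the waiting can be replaced in a dry run.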