OpenStack cluster

OpenStack and auxiliary services run as containers in kind: Pod Kubernetes resources. All long-running services are governed by one of the replication-capable Kubernetes controllers: kind: Deployment, kind: StatefulSet, or kind: DaemonSet.

The placement of the services is mostly governed by the Kubernetes node labels. The labels affecting the OpenStack services include:

  • openstack-control-plane=enabled - the node hosting most of the OpenStack control plane services.

  • openstack-compute-node=enabled - the node serving as a hypervisor for Nova. The virtual machines with tenant workloads are created there.

  • openvswitch=enabled - the node hosting the Neutron L2 agents and Open vSwitch pods that manage the L2 connectivity of the OpenStack networks.

  • openstack-gateway=enabled - the node hosting the Neutron L3, Metadata, and DHCP agents, as well as the Octavia Health Manager, Worker, and Housekeeping components.
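
The mapping between these labels and the cluster nodes can be inspected through the Kubernetes API. The snippet below is a minimal sketch that assumes kubeconfig access to the underlying Kubernetes cluster and the kubernetes Python client installed; it only lists the nodes carrying each of the labels above.

    # A minimal sketch, assuming kubeconfig access to the underlying Kubernetes
    # cluster and the kubernetes Python client installed.
    from kubernetes import client, config

    config.load_kube_config()
    core = client.CoreV1Api()

    selectors = (
        "openstack-control-plane=enabled",
        "openstack-compute-node=enabled",
        "openvswitch=enabled",
        "openstack-gateway=enabled",
    )

    for selector in selectors:
        nodes = core.list_node(label_selector=selector)
        names = [node.metadata.name for node in nodes.items]
        print(selector, "->", names or "no nodes")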

[Image: os-k8s-pods-layout.png]

Note

OpenStack is an infrastructure management platform. Mirantis OpenStack for Kubernetes (MOSK) uses Kubernetes mostly for orchestration and dependency isolation. As a result, multiple OpenStack services run as privileged containers with the host PID namespace and host networking enabled. You must ensure that at least the user with the credentials used by Helm/Tiller (administrator) is capable of creating such Pods.
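
To see which pods actually rely on these elevated privileges, you can audit the pod specifications. The following is a minimal sketch using the kubernetes Python client; the openstack namespace is an assumption and may differ in your deployment.

    # A minimal sketch that lists the pods relying on host PID, host networking,
    # or privileged containers. The "openstack" namespace is an assumption;
    # adjust it to your deployment.
    from kubernetes import client, config

    config.load_kube_config()
    core = client.CoreV1Api()

    for pod in core.list_namespaced_pod("openstack").items:
        spec = pod.spec
        privileged = any(
            container.security_context is not None
            and container.security_context.privileged
            for container in spec.containers
        )
        if spec.host_pid or spec.host_network or privileged:
            print(
                pod.metadata.name,
                "hostPID:", bool(spec.host_pid),
                "hostNetwork:", bool(spec.host_network),
                "privileged:", privileged,
            )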

Infrastructure services

Storage

While the underlying Kubernetes cluster is configured to use Ceph CSI for providing persistent storage to container workloads, such networked storage is suboptimal for some types of workloads because of its latency.

This is why the separate local-volume-provisioner CSI is deployed and configured as an additional storage class. Local Volume Provisioner is deployed as kind: DaemonSet.
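
Both storage classes can be inspected through the Kubernetes API. The following is a minimal sketch using the kubernetes Python client; the actual class names are deployment-specific.

    # A minimal sketch that prints the storage classes available in the cluster
    # and the provisioner behind each of them. The actual class names (the
    # Ceph-backed class and the Local Volume Provisioner class) depend on the
    # deployment.
    from kubernetes import client, config

    config.load_kube_config()
    storage = client.StorageV1Api()

    for sc in storage.list_storage_class().items:
        default = (sc.metadata.annotations or {}).get(
            "storageclass.kubernetes.io/is-default-class"
        )
        print(sc.metadata.name, "->", sc.provisioner,
              "(default)" if default == "true" else "")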

Database

A single WSREP (Galera) cluster of MariaDB is deployed as the SQL database used by all OpenStack services. It uses the storage class provided by the Local Volume Provisioner to store the actual database files. The service is deployed as kind: StatefulSet of a configurable size, which is no less than 3, on the openstack-control-plane nodes. For details, see OpenStack database architecture.

Messaging

RabbitMQ is used as a messaging bus between the components of the OpenStack services.

A separate instance of RabbitMQ is deployed for each OpenStack service that needs a messaging bus for intercommunication between its components.

An additional, separate RabbitMQ instance is deployed to serve as a notifications message bus, where OpenStack services post their own notifications and listen to notifications from other services. StackLight also uses this message bus to collect notifications for monitoring purposes.

Each RabbitMQ instance is a single node and is deployed as kind: StatefulSet.
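
The per-service layout can be observed by listing the RabbitMQ StatefulSets. The sketch below uses the kubernetes Python client; the namespace and the naming pattern are assumptions.

    # A minimal sketch that lists the per-service RabbitMQ instances by looking
    # for StatefulSets whose names contain "rabbitmq". The "openstack" namespace
    # and the naming pattern are assumptions; verify them in your deployment.
    from kubernetes import client, config

    config.load_kube_config()
    apps = client.AppsV1Api()

    for sts in apps.list_namespaced_stateful_set("openstack").items:
        if "rabbitmq" in sts.metadata.name:
            print(sts.metadata.name, "replicas:", sts.spec.replicas)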

Caching

A single multi-instance Memcached service is deployed to be used by all OpenStack services that need caching, which are mostly the HTTP API services.

Coordination

A separate instance of etcd is deployed to be used by Cinder, which requires Distributed Lock Management for coordination between its components.

Ingress

The Ingress controller is deployed as kind: DaemonSet.

Image pre-caching

A special kind: DaemonSet is deployed and updated each time the kind: OpenStackDeployment resource is created or updated. Its purpose is to pre-cache container images on Kubernetes nodes and thus minimize possible downtime when updating container images.

This is especially useful for containers used in kind: DaemonSet resources because, during an image update, Kubernetes starts pulling the new image only after the container with the old image has been shut down.

OpenStack services

Identity (Keystone)

Uses the MySQL back end by default.

keystoneclient - a separate kind: Deployment with a pod that has the OpenStack CLI client and relevant plugins installed, as well as OpenStack admin credentials mounted. Administrators can use it to manually interact with the OpenStack APIs from within the cluster.
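
For example, an administrator can run OpenStack CLI commands through this pod without leaving the cluster. The following is a minimal sketch using the kubernetes Python client; the namespace and the label selector are assumptions.

    # A minimal sketch that runs an OpenStack CLI command inside the
    # keystoneclient pod. The "openstack" namespace and the label selector are
    # assumptions; find the actual pod with `kubectl get pods` if they differ.
    from kubernetes import client, config
    from kubernetes.stream import stream

    config.load_kube_config()
    core = client.CoreV1Api()

    pods = core.list_namespaced_pod(
        "openstack", label_selector="application=keystone,component=client"
    )
    pod = next(p for p in pods.items if p.status.phase == "Running")

    # Admin credentials are already mounted in the pod, so the CLI works as is.
    output = stream(
        core.connect_get_namespaced_pod_exec,
        pod.metadata.name,
        "openstack",
        command=["openstack", "service", "list"],
        stdout=True, stderr=True, stdin=False, tty=False,
    )
    print(output)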

Image (Glance)

Supported back end is RBD (Ceph is required).

Volume (Cinder)

Supported back end is RBD (Ceph is required).

Network (Neutron)

Supported back ends are Open vSwitch and Tungsten Fabric.

Placement

Compute (Nova)

The supported hypervisor is QEMU/KVM through the libvirt library.

Dashboard (Horizon)

DNS (Designate)

Supported back end is PowerDNS.

Load Balancer (Octavia)

Ceph Object Gateway (Swift)

Provides object storage through the Ceph Object Gateway Swift API, which is compatible with the OpenStack Swift API. You can manually enable the service in the OpenStackDeployment CR as described in Deploy an OpenStack cluster.

Instance HA (Masakari)

An OpenStack service that ensures high availability of instances running on a host. You can manually enable Masakari in the OpenStackDeployment CR as described in Deploy an OpenStack cluster.

Orchestration (Heat)

Key Manager (Barbican)

The supported back ends include:

  • The built-in Simple Crypto, which is used by default

  • Vault

    Vault by HashiCorp is a third-party system and is not installed by MOSK. Hence, the Vault storage back end must be available in the user environment and accessible from the MOSK deployment.

    If the Vault back end is used, you can configure Vault in the OpenStackDeployment CR as described in Deploy an OpenStack cluster.

Tempest

Runs tests against a deployed OpenStack cloud. You can manually enable Tempest in the OpenStackDeployment CR as described in Deploy an OpenStack cluster.

Telemetry

Telemetry services include alarming (aodh), metering (Ceilometer), and metric (Gnocchi). All services should be enabled together through the list of services to be deployed in the OpenStackDeployment CR as described in Deploy an OpenStack cluster.
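
Several of the optional services above (Ceph Object Gateway, Masakari, Tempest, Telemetry, and the Vault back end for Barbican) are enabled through the OpenStackDeployment CR, with the exact fields documented in Deploy an OpenStack cluster. The sketch below only shows how to read the current specification with the kubernetes Python client; all resource coordinates in it are assumptions.

    # A minimal sketch that reads the OpenStackDeployment custom resource to
    # check which optional services are enabled. The CRD group, version,
    # namespace, and resource name below are assumptions; verify them with
    # `kubectl api-resources` and `kubectl get openstackdeployment --all-namespaces`,
    # and refer to Deploy an OpenStack cluster for the exact specification layout.
    from kubernetes import client, config

    config.load_kube_config()
    crds = client.CustomObjectsApi()

    osdpl = crds.get_namespaced_custom_object(
        group="lcm.mirantis.com",       # assumption
        version="v1alpha1",             # assumption
        namespace="openstack",          # assumption
        plural="openstackdeployments",
        name="osh-dev",                 # assumption: substitute your resource name
    )
    print(osdpl["spec"])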

OpenStack database architecture

A complete setup of a MariaDB Galera cluster for OpenStack is illustrated in the following image:

[Image: os-k8s-mariadb-galera.png]

MariaDB server pods run a Galera multi-master cluster. Client requests are forwarded by the Kubernetes mariadb service to the mariadb-server pod that has the primary label. The other pods from the mariadb-server StatefulSet have the backup label. The labels are managed by the mariadb-controller pod.

The MariaDB Controller periodically checks the readiness of the mariadb-server pods and sets the primary label on a pod if the following requirements are met:

  • The primary label has not already been set on the pod.

  • The pod is in the ready state.

  • The pod is not being terminated.

  • The pod name has the lowest integer suffix among other ready pods in the StatefulSet. For example, between mariadb-server-1 and mariadb-server-2, the pod with the mariadb-server-1 name is preferred.

Otherwise, the MariaDB Controller sets the backup label. This means that all SQL requests are passed to only one node, while the other two nodes are in the backup state and replicate the state from the primary node. MariaDB clients connect to the mariadb service.
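
The selection logic can be summarized with the following minimal sketch, which is not the actual mariadb-controller implementation; the namespace, the label selector, and the label key used below are assumptions.

    # A minimal sketch of the selection rules described above, not the actual
    # mariadb-controller code. The namespace, the label selector, and the label
    # key are assumptions.
    from kubernetes import client, config

    config.load_kube_config()
    core = client.CoreV1Api()

    pods = core.list_namespaced_pod(
        "openstack", label_selector="application=mariadb,component=server"
    ).items

    def is_eligible(pod):
        # The pod must be ready and must not be terminating.
        if pod.metadata.deletion_timestamp is not None:
            return False
        conditions = pod.status.conditions or []
        return any(c.type == "Ready" and c.status == "True" for c in conditions)

    # Among the eligible pods, prefer the lowest ordinal suffix, for example
    # mariadb-server-1 over mariadb-server-2.
    eligible = [p for p in pods if is_eligible(p)]
    primary = min(
        eligible,
        key=lambda p: int(p.metadata.name.rsplit("-", 1)[-1]),
        default=None,
    )

    for pod in pods:
        role = "primary" if primary is not None and pod is primary else "backup"
        core.patch_namespaced_pod(
            pod.metadata.name,
            "openstack",
            {"metadata": {"labels": {"mariadb-role": role}}},  # label key is an assumption
        )
        print(pod.metadata.name, "->", role)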