Calculate a maintenance window duration for update

This section provides background information on the approximate time that update operations take for pods of different purposes, the possible data plane impact during these operations, and whether pods can be updated in parallel. This data helps cloud administrators estimate maintenance windows and the impact on workloads for their OpenStack deployments.

Caution

This section applies to the OpenStack deployments that use Open vSwitch as a networking back end and does not provide details on the Tungsten Fabric services update time.

Note

During the MOSK cluster update, numerous StackLight alerts may fire. This is expected behavior. Ignore or temporarily mute the alerts as described in Mirantis Container Cloud Operations Guide: Silence alerts.

Calculating the maintenance window

The maintenance window for a MOSK cluster update is equal to the sum of time intervals required for the phases taking place during the cluster update.

Maintenance windows for cluster update phases

MOSK cluster update stage

Approx. stage time in minutes

Life-cycle management modules update

45

OpenStack and Tungsten Fabric components update [4]

(0.78D + 359) * (NC + NG) / 60 [5], [6], [7]

Ceph cluster update and upgrade

35 + 15 for minor Ceph version update
50 for major Ceph version update

Host operating system update on Kubernetes master nodes

15NKC [8]

Kubernetes cluster update on Kubernetes master nodes

40

Host operating system and Kubernetes cluster update

Sum of 20NC [6] to migrate instances from a compute node
and 40NKM [9] to update and restart a Kubernetes worker node,
multiplied by RE [10]

[4] Maximum update time from the OpenStack services restart time table below. In most cases, it is the update time for the openvswitch-openvswitch-vswitchd-default service.

[5] D – Density, that is, the estimated number of virtual machines per compute node

[6] NC – Number of OpenStack compute nodes

[7] NG – Number of OpenStack gateway nodes

[8] NKC – Number of Kubernetes control plane nodes

[9] NKM – Number of Kubernetes worker nodes including all roles (Ceph, StackLight, gateway, OpenStack control plane nodes, OpenStack compute nodes), except for Kubernetes control plane nodes that are under the Kubernetes worker role.

[10] RE – Boolean value. Equals 1 if you want to include the manual node restart stage in the maintenance window, and 0 if you plan to restart the nodes later.

The formula to calculate the maintenance window for a complete cluster update procedure is as follows:

\[t=((0.78D+359)*(NC+NG)/60)+(15NKC)+(RE*(20NC+40NKM))+(45+50+40)\]

Note

The time calculated by this formula is approximate. For a more accurate estimate, also consider factors specific to your environment, such as the number of routers and the CPU frequency, which can significantly affect the update time in some edge cases.
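The formula can be wrapped in a small helper for quick what-if estimates. The following Python sketch is illustrative only: the function name and parameter names are assumptions introduced here, and the fixed 45, 50, and 40 minute terms are taken directly from the formula above (the 50 corresponds to the Ceph stage).

```python
def maintenance_window_minutes(d, nc, ng, nkc, nkm, restart_nodes):
    """Approximate duration of a complete cluster update, in minutes.

    d             -- D, estimated number of VMs per compute node
    nc            -- NC, number of OpenStack compute nodes
    ng            -- NG, number of OpenStack gateway nodes
    nkc           -- NKC, number of Kubernetes control plane nodes
    nkm           -- NKM, number of Kubernetes worker nodes
    restart_nodes -- RE, 1 to include the manual node restart stage, 0 to skip it
    """
    if restart_nodes not in (0, 1):
        raise ValueError("restart_nodes (RE) must be 0 or 1")

    # OpenStack components update: per-node time in seconds, converted to minutes.
    openstack = (0.78 * d + 359) * (nc + ng) / 60
    # Host operating system update on Kubernetes master nodes.
    master_os = 15 * nkc
    # Instance migration plus worker node update and restart (optional stage).
    workers = restart_nodes * (20 * nc + 40 * nkm)
    # Fixed terms from the formula: 45 + 50 + 40 minutes.
    fixed = 45 + 50 + 40

    return openstack + master_os + workers + fixed
```

Calling the helper twice with `restart_nodes=1` and `restart_nodes=0` shows how much of the window is attributable to the node restart stage alone.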

Formula usage

Let’s calculate the update maintenance window duration for a MOSK deployment with the following characteristics:

Node type

# of nodes

Kubernetes control plane nodes

3

Ceph nodes

3

OpenStack control plane nodes

3

OpenStack compute nodes

30

OpenStack gateway nodes

3

StackLight nodes

3

VMs per compute node

25

The maintenance window for a cluster update that includes the restart of all Kubernetes worker nodes is calculated as follows and totals approximately 2668 minutes, or about 44.5 hours. Here NKM = 42 is the sum of the Ceph, OpenStack control plane, OpenStack compute, OpenStack gateway, and StackLight nodes (3 + 3 + 30 + 3 + 3):

\[(0.78*25+359)*(30+3)/60+15*3+1*(20*30+40*42)+(45+50+40) \approx 2668\]

The update time by phase for the given cluster is as follows:

MOSK cluster update stage

Approx. stage time in minutes

Life-cycle management modules update

45

OpenStack and Tungsten Fabric components update

208

Ceph cluster update and upgrade

50 (major Ceph version update)

Host operating system update on Kubernetes master nodes

45

Kubernetes cluster update on Kubernetes master nodes

40

Host operating system and Kubernetes cluster update

  • 600 to migrate instances from compute nodes

  • 1680 to update and restart Kubernetes worker nodes
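The per-phase figures for this example can be reproduced with a short script. This is a sketch under stated assumptions: the dictionary keys and variable names are hypothetical, and each node is assumed to carry exactly one role, so NKM is simply the sum of all node counts except the Kubernetes control plane nodes.

```python
# Node counts from the example deployment table.
nodes = {
    "k8s_control_plane": 3,
    "ceph": 3,
    "openstack_control_plane": 3,
    "openstack_compute": 30,
    "openstack_gateway": 3,
    "stacklight": 3,
}
density = 25  # D, VMs per compute node

nc = nodes["openstack_compute"]
ng = nodes["openstack_gateway"]
nkc = nodes["k8s_control_plane"]
# NKM: all Kubernetes workers regardless of role, excluding control plane nodes.
nkm = sum(count for role, count in nodes.items() if role != "k8s_control_plane")

# Stage times in minutes, following the stage formulas of this section.
phases = {
    "lcm_modules": 45,
    "openstack_components": (0.78 * density + 359) * (nc + ng) / 60,
    "ceph_major_update": 50,
    "master_os_update": 15 * nkc,
    "master_k8s_update": 40,
    "instance_migration": 20 * nc,
    "worker_update_restart": 40 * nkm,
}

total_minutes = sum(phases.values())
total_hours = total_minutes / 60

for name, minutes in phases.items():
    print(f"{name:24s}{minutes:8.0f} min")
print(f"total: {total_minutes:.0f} min (~{total_hours:.1f} h)")
```

Keeping the phases in a dictionary makes it easy to drop the last two entries when the node restart stage is postponed (RE = 0).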

Services update details

OpenStack services restart time

Pod name

Kubernetes kind

Data plane impact

Dependence on VMs density per compute node

Dependence on # of compute nodes

Parallel update

Readiness time, sec. [0]

Replicas count (for medium cluster size) [0]

Whole replicas restart time, sec.

barbican-api

Deployment

No

No

No

Yes, batches 10% of overall count

39

4

155

cinder-api

Deployment

No

No

No

Yes, batches 10% of overall count

67

4

269

cinder-backup

StatefulSet

No

No

No

No, sequentially

61

3

184

cinder-scheduler

StatefulSet

No

No

No

No, sequentially

157

3

472

cinder-volume

StatefulSet

No

No

No

No, sequentially

185

6

1110

designate-api

StatefulSet

No

No

No

No, sequentially

29

3

88

designate-central

StatefulSet

No

No

No

No, sequentially

67

3

200

designate-mdns

StatefulSet

No

No

No

No, sequentially

68

3

203

designate-producer

StatefulSet

No

No

No

No, sequentially

89

3

266

designate-worker

StatefulSet

No

No

No

No, sequentially

94

3

282

etcd-etcd

StatefulSet

No

No

No

No, sequentially

29

3

87

glance-api

Deployment

No

No

No

Yes, batches 10% of overall count

36

3

108

heat-api

Deployment

No

No

No

Yes, batches 10% of overall count

95

3

285

heat-cfn

Deployment

No

No

No

Yes, batches 10% of overall count

95

3

286

heat-engine

Deployment

No

No

No

Yes, batches 10% of overall count

46

15

464

horizon

Deployment

No

No

No

Yes, batches 10% of overall count

60

5

302

image-precaching-0

DaemonSet

No

No

No

Yes, batches 100% of overall count

1629

# of all nodes running OpenStack services

1629

ingress

Deployment

No

No

No

No, sequentially

12

5

60

ingress-error-pages

Deployment

No

No

No

No, sequentially

27

1

27

keystone-api

Deployment

No

No

No

Yes, batches 10% of overall count

20

15

204

keystone-client

Deployment

No

No

No

Yes, batches 10% of overall count

27

1

27

libvirt-libvirt-default

DaemonSet

No

Yes

Yes

Yes, batches 10% of overall count

0.475 * D [1] + 97

NC [2]

(0.475 * D [1] + 97) * NC [2] * 0.1

mariadb-controller

Deployment

No

No

No

No, sequentially

26

1

26

mariadb-server

StatefulSet

No

No

No

No, sequentially

461

3

1383

neutron-dhcp-agent-default

DaemonSet

No

Yes

No

No, sequentially

3.6 * D [1] * 510

NG [3]

(3.6 * D [1] * 510) * NG [3]

neutron-l3-agent-default

DaemonSet

No

No

No

No, sequentially

115

NG [3]

115 * NG [3]

neutron-l3-agent-<ID>

DaemonSet

No

Yes

Yes

No, sequentially

0.1975 * D [1] + 87

NC [2]

(0.1975 * D [1] + 87) * NC [2]

neutron-metadata-agent-default

DaemonSet

No

No

No

No, sequentially

60

NG [3]

NG [3] * 60

neutron-netns-cleanup-cron-default

DaemonSet

No

No

No

Yes, batches 10% of overall count

363

NC [2] + NG [3]

363

neutron-ovs-agent-default

DaemonSet

Yes

Yes

No

Yes, batches 10% of overall count

3.6 * D [1] * 510

NC [2] + NG [3]

(3.6 * D [1] * 510) * (NC [2] + NG [3]) * 0.1

neutron-server

Deployment

No

No

No

Yes, batches 10% of overall count

14

12

140

nova-api-metadata

Deployment

No

No

No

Yes, batches 10% of overall count

41

3

122

nova-api-osapi

Deployment

No

No

No

Yes, batches 10% of overall count

79

15

794

nova-compute-default

DaemonSet

No

Yes

Yes

Yes, batches 10% of overall count

0.585 * D [1] + 43

NC [2]

(0.585 * D [1] + 43) * NC [2] * 0.1

nova-conductor

StatefulSet

No

No

No

No, sequentially

56

3

168

nova-novncproxy

Deployment

No

No

No

Yes, batches 10% of overall count

31

3

92

nova-scheduler

StatefulSet

No

No

No

No, sequentially

72

3

217

octavia-api

Deployment

No

No

No

Yes, batches 10% of overall count

36

3

107

octavia-health-manager-default

DaemonSet

No

No

No

No, sequentially

108

NG [3]

108 * NG [3]

octavia-housekeeping

Deployment

No

No

No

Yes, batches 10% of overall count

20

3

59

octavia-worker

Deployment

No

No

No

Yes, batches 10% of overall count

89

3

268

openstack-barbican-rabbitmq-rabbitmq

StatefulSet

No

No

No

No, sequentially

121

1

121

openstack-barbican-rabbitmq-rabbitmq-exporter

Deployment

No

No

No

No, sequentially

75

1

75

openstack-cinder-rabbitmq-rabbitmq

StatefulSet

No

No

No

No, sequentially

119

1

119

openstack-cinder-rabbitmq-rabbitmq-exporter

Deployment

No

No

No

No, sequentially

59

1

59

openstack-designate-rabbitmq-rabbitmq

StatefulSet

No

No

No

No, sequentially

121

1

121

openstack-designate-rabbitmq-rabbitmq-exporter

Deployment

No

No

No

No, sequentially

91

1

91

openstack-glance-rabbitmq-rabbitmq

StatefulSet

No

No

No

No, sequentially

121

1

121

openstack-glance-rabbitmq-rabbitmq-exporter

Deployment

No

No

No

No, sequentially

76

1

76

openstack-heat-rabbitmq-rabbitmq

StatefulSet

No

No

No

No, sequentially

118

1

118

openstack-heat-rabbitmq-rabbitmq-exporter

Deployment

No

No

No

No, sequentially

90

1

90

openstack-memcached-memcached

StatefulSet

No

No

No

No, sequentially

56

3

167

openstack-neutron-rabbitmq-rabbitmq

StatefulSet

No

No

No

No, sequentially

166

1

166

openstack-neutron-rabbitmq-rabbitmq-exporter

Deployment

No

No

No

No, sequentially

75

1

75

openstack-nova-rabbitmq-rabbitmq

StatefulSet

No

No

No

No, sequentially

149

1

149

openstack-nova-rabbitmq-rabbitmq-exporter

Deployment

No

No

No

No, sequentially

90

1

90

openstack-octavia-rabbitmq-rabbitmq

StatefulSet

No

No

No

No, sequentially

120

1

120

openstack-octavia-rabbitmq-rabbitmq-exporter

Deployment

No

No

No

No, sequentially

59

1

59

openvswitch-openvswitch-db-default

DaemonSet

No

No

Yes

No, sequentially

22

NC [2] + NG [3]

(NC [2] + NG [3]) * 22

openvswitch-openvswitch-vswitchd-default

DaemonSet

No

Yes

Yes

No, sequentially

0.78 * D [1] + 359

NC [2] + NG [3]

(0.78 * D [1] + 359) * (NC [2] + NG [3])

openstack-rabbitmq-rabbitmq

StatefulSet

No

No

No

No, sequentially

151

1

151

openstack-rabbitmq-rabbitmq-exporter

Deployment

No

No

No

No, sequentially

75

1

75

placement-api

Deployment

No

No

No

Yes, batches 10% of overall count

35

4

140

[0] The time in this column was calculated based on data obtained from the reference environment. The actual update time in production environments may vary.

[1] D stands for density, that is, the estimated number of virtual machines per compute node

[2] NC stands for the number of OpenStack compute nodes

[3] NG stands for the number of OpenStack gateway nodes
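To see which density-dependent service dominates the OpenStack components update stage, the "Whole replicas restart time" formulas can be evaluated for a given cluster. The following Python sketch covers only a subset of the DaemonSet rows from the table. The function name is hypothetical, and the input values (D = 25, NC = 30, NG = 3) are those of the example deployment used earlier in this section.

```python
def daemonset_restart_seconds(d, nc, ng):
    """Whole-replicas restart time in seconds for selected DaemonSets.

    d  -- D, estimated number of VMs per compute node
    nc -- NC, number of OpenStack compute nodes
    ng -- NG, number of OpenStack gateway nodes
    """
    return {
        # Updated in parallel batches of 10% of the overall count.
        "libvirt-libvirt-default": (0.475 * d + 97) * nc * 0.1,
        "nova-compute-default": (0.585 * d + 43) * nc * 0.1,
        # Updated sequentially, one node at a time.
        "neutron-l3-agent-<ID>": (0.1975 * d + 87) * nc,
        "openvswitch-openvswitch-db-default": (nc + ng) * 22,
        "openvswitch-openvswitch-vswitchd-default": (0.78 * d + 359) * (nc + ng),
    }


times = daemonset_restart_seconds(d=25, nc=30, ng=3)
slowest = max(times, key=times.get)
print(slowest, round(times[slowest] / 60), "min")
```

For these inputs, the slowest of the listed services is openvswitch-openvswitch-vswitchd-default at roughly 208 minutes, which matches the OpenStack components stage time in the worked example.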