Increase memory limits for cluster components

When a Container Cloud component reaches its memory limit, the OOM killer may kill the affected pod to contain memory leaks and prevent further destabilization of resource distribution.

It is normal for a pod to be killed by the OOM killer and recreated about once a day or once a week. However, if the alert frequency increases, or if pods cannot start and enter the CrashLoopBackOff state, adjust the default memory limits to fit your cluster needs and prevent interruption of critical workloads.

When a Container Cloud component reaches its CPU limit, StackLight raises the CPUThrottlingHigh alert. CPU limits for Container Cloud components (except the StackLight ones) were removed in Cluster release 14.0.0. For earlier Cluster releases, use the resources:limits:cpu parameter, which is located in the same section as the resources:limits:memory parameter of the corresponding component.
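For Cluster releases earlier than 14.0.0, a CPU limit can sit next to the memory limit. The following fragment is a minimal sketch for the metallb controller. The 200m CPU and 200Mi memory values are the ones this document lists for Container Cloud 2.23.0, and placing the cpu key directly beside the memory key is an assumption based on the description above:

```yaml
spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          controller:
            resources:
              limits:
                cpu: 200m      # for Cluster releases earlier than 14.0.0
                memory: 200Mi
```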

Note

For StackLight resources limits, refer to Resource limits.

To increase memory limits on a Container Cloud cluster:

In the spec:providerSpec:value: section of cluster.yaml, add the resources:limits parameters with the required values for the Container Cloud components that you need to adjust:

kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> edit cluster <clusterName>

The location of the limits key in the Cluster object depends on the component. Different cluster types have different sets of components for which you can adjust limits.

The following sections list the components that relate to each cluster type. The configuration examples show where the limits key is located for each component. Limit values in the examples correspond to the default values used since Container Cloud 2.24.0 (Cluster releases 15.0.1, 14.0.1, and 14.0.0).
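As an illustration of how the key location differs, the following sketch combines two components in one Cluster object: client-certificate-controller under the helmReleases key and release-controller under the kaas:management:helmReleases key. Both placements and values are taken from the examples in the sections below. Combining them in a single object is an assumption made for illustration:

```yaml
spec:
  providerSpec:
    value:
      # Common component: limits under value.helmReleases
      helmReleases:
      - name: client-certificate-controller
        values:
          resources:
            limits:
              memory: 500Mi
      # Management cluster component: limits under value.kaas.management.helmReleases
      kaas:
        management:
          helmReleases:
          - name: release-controller
            values:
              resources:
                limits:
                  memory: 200Mi
```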

Limits for common components of any cluster type

No limits are set for the following components:

  • storage-discovery

The memory limits for the following components can be increased on the management and managed clusters:

  • client-certificate-controller

  • metrics-server

  • metallb

Note

  • For helm-controller, limits configuration is not supported.

  • For metallb, which applies to the bare metal and vSphere providers, the location of the limits key in cluster.yaml differs from that of other common components.

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

Common components for any cluster type

Component name

Configuration example

<common-component-name>

spec:
  providerSpec:
    value:
      helmReleases:
      - name: client-certificate-controller
        values:
          resources:
            limits:
              memory: 500Mi

metallb

spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          controller:
            resources:
              limits:
                memory: 200Mi
                # no CPU limit and 200Mi of memory limit since Container Cloud 2.24.0
                # 200m CPU and 200Mi of memory limit since Container Cloud 2.23.0
          speaker:
            resources:
              limits:
                memory: 500Mi
                # no CPU limit and 500Mi of memory limit since Container Cloud 2.24.0
                # 500m CPU and 500Mi of memory limit since Container Cloud 2.23.0

Limits for management cluster components

No limits are set for the following components:

  • baremetal-operator

  • baremetal-provider

  • cert-manager

The memory limits for the following components can be increased on a management cluster in the spec:providerSpec:value:kaas:management:helmReleases: section:

  • admission-controller

  • event-controller

  • iam

  • iam-controller

  • kaas-exporter

  • kaas-ui

  • license-controller

  • proxy-controller 0

  • release-controller

  • rhellicense-controller 0

  • scope-controller

  • secret-controller (since Container Cloud 2.27.0)

  • user-controller

0

The proxy-controller and rhellicense-controller are replaced with secret-controller in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0).

On a management cluster, you can also increase memory limits in the following sections:

  • spec:providerSpec:value:kaas:regional:[(provider:<provider-name>): helmReleases]:

  • spec:providerSpec:value:kaas:regionalHelmReleases:

These sections apply to the following components:

  • agent-controller

  • byo-credentials-controller 1

  • byo-provider

  • lcm-controller

  • mcc-cache

  • openstack-provider

  • os-credentials-controller

  • rbac-controller

  • vsphere-credentials-controller

  • vsphere-provider

  • vsphere-vm-template-controller 2

  • squid-proxy

1

The byo-credentials-controller is replaced with secret-controller in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0).

2

The memory limits for vsphere-vm-template-controller can be increased for the controller itself and for the Packer job.

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

Limits for management cluster components

Component name

Configuration example

<mgmt-cluster-component-name>

spec:
  providerSpec:
    value:
      kaas:
        management:
          helmReleases:
          - name: release-controller
            values:
              resources:
                limits:
                  memory: 200Mi
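The same pattern is assumed to apply to the other components listed for the kaas:management:helmReleases section. For example, a sketch for secret-controller (available since Container Cloud 2.27.0) might look as follows. The 300Mi value is a hypothetical placeholder, not a documented default:

```yaml
spec:
  providerSpec:
    value:
      kaas:
        management:
          helmReleases:
          - name: secret-controller
            values:
              resources:
                limits:
                  memory: 300Mi  # hypothetical value, adjust to your cluster needs
```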

openstack-provider

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: openstack
          helmReleases:
          - name: openstack-provider
            values:
              resources:
                openstackMachineController:
                  limits:
                    memory: 1Gi

os-credentials-controller

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: openstack
          helmReleases:
          - name: os-credentials-controller
            values:
              resources:
                limits:
                  memory: 1Gi

  • byo-provider

  • vsphere-provider

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: vsphere # <provider-name>
          helmReleases:
          - name: vsphere-provider # <provider-name>
            values:
              vsphereController: # <provider-name>Controller:
                resources:
                  limits:
                    memory: 1Gi

  • byo-credentials-controller

  • vsphere-credentials-controller

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: vsphere # <provider-name>
          helmReleases:
          - name: vsphere-credentials-controller
            # <provider-credentials-controller-name>
            values:
              resources:
                limits:
                  memory: 1Gi

vsphere-vm-template-controller

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: vsphere
          helmReleases:
          - name: vsphere-vm-template-controller
            values:
              resources:
                limits:
                  memory: 150Mi
              packer:
                packer_job:
                  resources:
                    limits:
                      memory: 500Mi

  • agent-controller

  • lcm-controller

  • rbac-controller

spec:
  providerSpec:
    value:
      kaas:
        regionalHelmReleases:
        - name: lcm-controller
          values:
            resources:
              limits:
                memory: 1Gi
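Because regionalHelmReleases is a list, limits for several of these components can presumably be set in one edit. The following sketch raises the limit for lcm-controller and adds an entry for agent-controller. The 2Gi values are hypothetical and only illustrate an increase over the 1Gi value shown above:

```yaml
spec:
  providerSpec:
    value:
      kaas:
        regionalHelmReleases:
        - name: lcm-controller
          values:
            resources:
              limits:
                memory: 2Gi  # hypothetical increased value
        - name: agent-controller
          values:
            resources:
              limits:
                memory: 2Gi  # hypothetical increased value
```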

mcc-cache

spec:
  providerSpec:
    value:
      kaas:
        regionalHelmReleases:
        - name: mcc-cache
          values:
            nginx:
              resources:
                limits:
                  memory: 500Mi
            registry:
              resources:
                limits:
                  memory: 500Mi
            kproxy:
              resources:
                limits:
                  memory: 300Mi

squid-proxy

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: vsphere
          helmReleases:
          - name: squid-proxy
            values:
              resources:
                limits:
                  memory: 1Gi