Increase memory limits for cluster components

When any MOSK component reaches its memory limit, the OOM killer may kill the affected pod to contain memory leaks and prevent further destabilization of resource distribution.

An occasional recreation of a pod killed by the OOM killer, about once a day or once a week, is normal. But if the alert frequency increases, or pods cannot start and move to the CrashLoopBackOff state, adjust the default memory limits to fit your cluster needs and prevent interruption of critical workloads.
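To tell an OOM kill apart from other restarts, you can inspect the pod status, for example, in the output of kubectl get pod <podName> -o yaml. The excerpt below is an illustrative sketch of the standard Kubernetes status fields to look for. The container name and restart count are hypothetical.

```yaml
# Illustrative excerpt of a pod status; the container name and counts are hypothetical.
status:
  containerStatuses:
  - name: example-container
    restartCount: 7
    state:
      waiting:
        reason: CrashLoopBackOff   # the container keeps failing to start
    lastState:
      terminated:
        reason: OOMKilled          # the previous container instance exceeded its memory limit
        exitCode: 137
```

A lastState reason of OOMKilled together with a growing restartCount suggests that the memory limit of the component is too low for the current load.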

When any MOSK component reaches its CPU limit, StackLight raises the CPUThrottlingHigh alert. CPU limits for MOSK components (except the StackLight ones) were removed in Container Cloud 2.24.0 (Cluster releases 14.0.0, 14.0.1, and 15.0.1). For earlier versions, use the resources:limits:cpu parameter located in the same section as the resources:limits:memory parameter of the corresponding component.
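As a sketch of how the cpu key sits next to the memory key on versions earlier than Container Cloud 2.24.0, the fragment below uses the metallb controller with the 200m CPU and 200Mi memory values that this page lists for MCC 2.23.0. Treat it as an illustration of the key placement, not as a recommended setting.

```yaml
# Applies only to versions earlier than Container Cloud 2.24.0,
# where CPU limits are still set for MOSK components.
spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          controller:
            resources:
              limits:
                cpu: 200m        # same section as the memory limit
                memory: 200Mi
```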

To increase memory limits on a MOSK cluster:

In the spec:providerSpec:value: section of cluster.yaml, add the resources:limits parameters with the required values for necessary MOSK components:

kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> edit cluster <clusterName>

The location of the limits key in the Cluster object can differ depending on the component. Different cluster types have different sets of components that you can adjust limits for.

The following sections describe components that relate to a specific cluster type with corresponding limits key location provided in configuration examples. Limit values in the examples correspond to default values used since Container Cloud 2.24.0 (Cluster releases 15.0.1, 14.0.1, and 14.0.0).

Note

For StackLight resources limits, refer to resource-limits.

Limits for common components of any cluster type

No limits are set for the following components:

  • storage-discovery

The memory limits for the following components can be increased on the management and managed clusters:

  • client-certificate-controller

  • metrics-server

  • metallb

Note

  • For helm-controller, limits configuration is not supported.

  • For metallb, the limits key in cluster.yaml differs from other common components.

Common components for any cluster type

Component name

Configuration example

<common-component-name>

spec:
  providerSpec:
    value:
      helmReleases:
      - name: client-certificate-controller
        values:
          resources:
            limits:
              memory: 500Mi

metallb

spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          controller:
            resources:
              limits:
                memory: 200Mi
                # no CPU limit and 200Mi of memory limit since MCC 2.24.0 (15.0.1, 14.0.0)
                # 200m CPU and 200Mi of memory limit since MCC 2.23.0 (11.7.0)
          speaker:
            resources:
              limits:
                memory: 500Mi
                # no CPU limit and 500Mi of memory limit since MCC 2.24.0 (15.0.1, 14.0.0)
                # 500m CPU and 500Mi of memory limit since MCC 2.23.0 (11.7.0)
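Because helmReleases is a list, limits for several common components can be set in one edit by adding one entry per component. The fragment below is a minimal sketch of that pattern. The 1Gi value for metrics-server is illustrative only and is not a documented default.

```yaml
spec:
  providerSpec:
    value:
      helmReleases:
      - name: client-certificate-controller
        values:
          resources:
            limits:
              memory: 500Mi
      - name: metrics-server
        values:
          resources:
            limits:
              memory: 1Gi   # illustrative value, adjust to your cluster needs
```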

Limits for management cluster components

No limits are set for the following components:

  • baremetal-operator

  • baremetal-provider

  • cert-manager

The memory limits for the following components can be increased on a management cluster in the spec:providerSpec:value:kaas:management:helmReleases: section:

  • admission-controller

  • credentials-controller (since MCC 2.28, Cluster releases 17.3.0 and 16.3.0)

  • event-controller

  • iam

  • iam-controller

  • kaas-exporter

  • kaas-ui

  • license-controller

  • proxy-controller [0]

  • release-controller

  • scope-controller

  • secret-controller (since MCC 2.27, Cluster releases 17.2.0 and 16.2.0)

  • user-controller

[0] The proxy-controller component is replaced with secret-controller in MCC 2.27.0 (Cluster releases 17.2.0 and 16.2.0).
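As a sketch of the key location for the management components listed above, the fragment below raises the memory limits of two of them in the spec:providerSpec:value:kaas:management:helmReleases: section. It assumes that these components take the resources:limits keys directly under values. The memory values are illustrative and are not documented defaults.

```yaml
spec:
  providerSpec:
    value:
      kaas:
        management:
          helmReleases:
          - name: admission-controller
            values:
              resources:
                limits:
                  memory: 300Mi   # illustrative value
          - name: secret-controller   # available since MCC 2.27
            values:
              resources:
                limits:
                  memory: 300Mi   # illustrative value
```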

The memory limits for the following components can be increased on a management cluster in one of the following sections, depending on the component:

  • spec:providerSpec:value:kaas:regional:provider:baremetal:helmReleases:

  • spec:providerSpec:value:kaas:regionalHelmReleases:

The components are as follows:

  • agent-controller

  • lcm-controller

  • mcc-cache

  • rbac-controller

  • squid-proxy

Limits for management cluster components

Component name

Configuration example

<mgmt-cluster-component-name>

spec:
  providerSpec:
    value:
      kaas:
        management:
          helmReleases:
          - name: release-controller
            values:
              resources:
                limits:
                  memory: 200Mi

baremetal-provider

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: baremetal
          helmReleases:
          - name: baremetal-provider
            values:
              cluster_api_provider_baremetal:
                resources:
                  requests:
                    cpu: 500m
                    memory: 500Mi

  • agent-controller

  • lcm-controller

  • rbac-controller

spec:
  providerSpec:
    value:
      kaas:
        regionalHelmReleases:
        - name: lcm-controller
          values:
            resources:
              limits:
                memory: 1Gi

mcc-cache

spec:
  providerSpec:
    value:
      kaas:
        regionalHelmReleases:
        - name: mcc-cache
          values:
            nginx:
              resources:
                limits:
                  memory: 500Mi
            registry:
              resources:
                limits:
                  memory: 500Mi
            kproxy:
              resources:
                limits:
                  memory: 300Mi

squid-proxy

spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: baremetal
          helmReleases:
          - name: squid-proxy
            values:
              resources:
                limits:
                  memory: 1Gi