StackLight configuration parameters

This section describes the StackLight configuration keys that you can specify in the values section to change StackLight settings as required. Prior to making any changes to StackLight configuration, perform the steps described in StackLight configuration procedure. After changing StackLight configuration, verify the changes as described in Verify StackLight after configuration.


Some parameters are marked as mandatory. Failure to specify values for such parameters causes the Admission Controller to reject cluster creation.

OpenStack cluster configuration parameters

This section describes the OpenStack-related StackLight configuration keys. For MOSK cluster configuration keys, see MOSK cluster configuration parameters.




Example values

openstack.gnocchi.enabled (bool)

Enables Gnocchi monitoring. Set to false by default.

true or false




Example values

openstack.ironic.enabled (bool)

Enables Ironic monitoring. Set to false by default.

true or false




Example values

openstack.enabled (bool)

Enables OpenStack monitoring. Set to true by default.

true or false

openstack.namespace (string)

Defines the namespace within which the OpenStack virtualized control plane is installed. Set to openstack by default.





Example values

openstack.rabbitmq.credentialsConfig (map)

Defines the RabbitMQ credentials to use if credentials discovery is disabled or some required parameters were not found during the discovery.

  username: "stacklight"
  password: "stacklight"
  host: "rabbitmq.openstack.svc"
  queue: "notifications"
  vhost: "openstack"

openstack.rabbitmq.credentialsDiscovery (map)

Enables the credentials discovery to obtain the username and password from the secret object.

  enabled: true
  namespace: openstack
  secretName: os-rabbitmq-user-credentials
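
To illustrate how the two RabbitMQ keys relate, the following sketch disables credentials discovery so that the statically defined credentials are used instead. The values are copied from the examples above. Nesting the keys under a top-level openstack key in the StackLight values section is an assumption based on the dotted key names.

```yaml
openstack:
  rabbitmq:
    credentialsDiscovery:
      enabled: false            # discovery is off, so credentialsConfig applies
    credentialsConfig:
      username: "stacklight"
      password: "stacklight"
      host: "rabbitmq.openstack.svc"
      queue: "notifications"
      vhost: "openstack"
```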

SSL certificates



Example values

openstack.externalFQDN (string) Deprecated

External FQDN used to communicate with OpenStack services for certificates monitoring. The option is deprecated, use openstack.externalFQDNs.enabled instead.

openstack.externalFQDNs.enabled (bool)

Enables monitoring of the external FQDNs used to communicate with OpenStack services. Used for certificates monitoring. Set to false by default.

true or false

openstack.insecure (map)

Defines whether to verify the trust chain of the OpenStack endpoint SSL certificates during monitoring.

  internal: true
  external: false




Example values

openstack.telegraf.credentialsConfig (map)

Specifies the OpenStack credentials to use if the credentials discovery is disabled or some required parameters were not found during the discovery.

  identityEndpoint: "" # "http://keystone-api.openstack.svc:5000/v3"
  domain: "" # "default"
  password: "" # "workshop"
  project: "" # "admin"
  region: "" # "RegionOne"
  username: "" # "admin"

openstack.telegraf.credentialsDiscovery (map)

Enables the credentials discovery to obtain all required parameters from the secret object.

  enabled: true
  namespace: openstack
  secretName: keystone-keystone-admin

openstack.telegraf.interval (string)

Specifies the interval of metrics gathering from the OpenStack API. Set to 1m by default.

1m, 3m

openstack.telegraf.insecure (bool)

Enables or disables the server certificate chain and host name verification. Set to true by default.

true or false

openstack.telegraf.skipPublicEndpoints (bool)

Enables or disables HTTP probes for public endpoints from the OpenStack service catalog. Set to false by default, meaning that Telegraf verifies all endpoints from the OpenStack service catalog, including the public, admin, and internal endpoints.

true or false
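
As a rough sketch, the Telegraf-related keys described above could be combined as follows. The 3m interval is one of the example values listed for openstack.telegraf.interval. The nesting under openstack.telegraf is an assumption derived from the dotted key names.

```yaml
openstack:
  telegraf:
    interval: 3m                # gather metrics every 3 minutes instead of the default 1m
    insecure: true              # default value
    skipPublicEndpoints: true   # do not probe public endpoints from the service catalog
```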

Tungsten Fabric



Example values

tungstenFabricMonitoring.enabled (bool)

Enables Tungsten Fabric monitoring.

Since MOSK 23.1, the parameter is set to true by default if Tungsten Fabric is deployed.

Before MOSK 23.1, the parameter is set to false by default. Set it to true only if Tungsten Fabric is deployed.

true or false

tungstenFabricMonitoring.exportersTimeout (string)

Available since MOSK 23.3. Defines the timeout of the tungstenfabric-exporter client requests. Set to 5s by default.

  exportersTimeout: "5s"

tungstenFabricMonitoring.analyticsEnabled (bool)

Available since MOSK 24.1. Enables or disables monitoring of the Tungsten Fabric analytics services.

In MOSK 24.1, defaults to true.

Since MOSK 24.2, the default value is set automatically based on the real state of the Tungsten Fabric analytics services (enabled or disabled) in the Tungsten Fabric cluster.

true or false
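
A minimal sketch of the Tungsten Fabric keys used together, assuming a MOSK 24.1 or later cluster with Tungsten Fabric deployed. The 10s timeout is an illustrative value, not a recommendation.

```yaml
tungstenFabricMonitoring:
  enabled: true
  exportersTimeout: "10s"     # illustrative; the default is 5s
  analyticsEnabled: false     # explicitly skip monitoring of analytics services
```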

MOSK cluster configuration parameters

This section describes the MOSK cluster StackLight configuration keys. For OpenStack cluster configuration keys, see OpenStack cluster configuration parameters.

Alert configuration



Example values

prometheusServer.customAlerts (slice)

Defines custom alerts. Also, modifies or disables existing alert configurations. For the list of predefined alerts, see StackLight alerts. While adding or modifying alerts, follow the Alerting rules.

# To add a new alert:
- alert: ExampleAlert
  annotations:
    description: Alert description
    summary: Alert summary
  expr: example_metric > 0
  for: 5m
  labels:
    severity: warning
# To modify an existing alert expression:
- alert: AlertmanagerFailedReload
  expr: alertmanager_config_last_reload_successful == 5
# To disable an existing alert:
- alert: TargetDown
  enabled: false

To disable an existing alert, set the optional enabled field in the alert body to false. All fields specified in the customAlerts definition override the default predefined definitions in the charts’ values.




Example values

alerta.enabled (bool)

Enables or disables Alerta. Using the Alerta web UI, you can view the most recent or watched alerts, group, and filter alerts. Set to true by default.

true or false

Alertmanager integrations

On managed clusters with limited Internet access, a proxy is required for StackLight components that use HTTP and HTTPS, are disabled by default, and need external access when enabled, for example, the Salesforce integration and Alertmanager notifications external rules.



Example values

alertmanagerSimpleConfig.genericReceivers (slice)

Provides a generic template for notifications receiver configurations. For a list of supported receivers, see Prometheus Alertmanager documentation: Receiver.

For example, to enable notifications to OpsGenie:

  - name: HTTP-opsgenie
    enabled: true # optional
    opsgenie_configs:
    - api_url: ""
      api_key: "secret-key"
      send_resolved: true

alertmanagerSimpleConfig.genericRoutes (slice)

Provides a template for notifications route configuration. For details, see Prometheus Alertmanager documentation: Route.

- receiver: HTTP-opsgenie
  enabled: true # optional
  continue: true

alertmanagerSimpleConfig.inhibitRules.enabled (bool)

Disables or enables alert inhibition rules. If enabled, Alertmanager decreases alert noise by suppressing dependent alerts notifications to provide a clearer view on the cloud status and simplify troubleshooting. Enabled by default. For details, see Alert dependencies. For details on inhibition rules, see Prometheus documentation.

true or false

Alertmanager: notifications to email



Example values

(bool)

Enables or disables Alertmanager integration with email. Set to false by default.

true or false

(map)

Defines the notification parameters for Alertmanager integration with email. For details, see Prometheus Alertmanager documentation: Email configuration.

  enabled: false
  send_resolved: true
  to: ""
  from: ""
  auth_username: ""
  auth_password: password
  auth_identity: ""
  require_tls: true

(map)

Defines the route for Alertmanager integration with email. For details, see Prometheus Alertmanager documentation: Route.

  matchers: []
  routes: []

Alertmanager: notifications to Microsoft Teams

On managed clusters with limited Internet access, a proxy is required for StackLight components that use HTTP and HTTPS, are disabled by default, and need external access when enabled. The Microsoft Teams integration depends on Internet access through HTTPS.



Example values

alertmanagerSimpleConfig.msteams.enabled (bool)

Enables or disables Alertmanager integration with Microsoft Teams. Requires a configured Microsoft Teams channel and a channel connector. Set to false by default.

true or false

alertmanagerSimpleConfig.msteams.url (string)

Defines the URL of an Incoming Webhook connector of a Microsoft Teams channel. For details about channel connectors, see Microsoft documentation.

alertmanagerSimpleConfig.msteams.route (map)

Defines the notifications route for Alertmanager integration with MS Teams. For details, see Prometheus Alertmanager documentation: Route.

  matchers: []
  routes: []
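
The three Microsoft Teams keys above can be sketched together as follows. The URL is a placeholder to replace with the Incoming Webhook URL of your channel, and the matcher that limits notifications to critical alerts is an illustrative choice.

```yaml
alertmanagerSimpleConfig:
  msteams:
    enabled: true
    url: "<Incoming Webhook URL>"   # placeholder, not a real endpoint
    route:
      matchers:
      - severity="critical"         # illustrative: forward only critical alerts
      routes: []
```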

Alertmanager: notifications to Salesforce

On managed clusters with limited Internet access, a proxy is required for StackLight components that use HTTP and HTTPS, are disabled by default, and need external access when enabled. The Salesforce integration depends on Internet access through HTTPS.



Example values

clusterId (string)

Unique cluster identifier clusterId="<Cluster Project>/<Cluster Name>/<UID>", generated for each cluster using Cluster Project, Cluster Name, and cluster UID, separated by a slash. Used for both sf-notifier and sf-reporter services.

The clusterId is automatically defined for each cluster. Do not set or modify it manually.


alertmanagerSimpleConfig.salesForce.enabled (bool)

Enables or disables Alertmanager integration with Salesforce using the sf-notifier service. Disabled by default.

true or false

alertmanagerSimpleConfig.salesForce.auth (map)

Defines the Salesforce parameters and credentials for integration with Alertmanager.

  url: "<SF instance URL>"
  username: "<SF account email address>"
  password: "<SF password>"
  environment_id: "<Cloud identifier>"
  organization_id: "<Organization identifier>"
  sandbox_enabled: "<Set to true or false>"

alertmanagerSimpleConfig.salesForce.route (map)

Defines the notifications route for Alertmanager integration with Salesforce. For details, see Prometheus Alertmanager documentation: Route.

  matchers:
  - severity="critical"
  routes: []


By default, only Critical alerts will be sent to Salesforce.

alertmanagerSimpleConfig.salesForce.feed_enabled (bool)

Enables or disables feed update in Salesforce. To save API calls, this parameter is set to false by default.

true or false

alertmanagerSimpleConfig.salesForce.link_prometheus (bool)

Enables or disables links to the Prometheus web UI in alerts sent to Salesforce. To simplify troubleshooting, set to true by default.

true or false

Alertmanager: notifications to ServiceNow


Prior to configuring the integration with ServiceNow, perform the following prerequisite steps using the ServiceNow documentation of the required version.

  1. In a new or existing Incident table, add the Alert ID field as described in Add fields to a table. To avoid alerts duplication, select Unique.

  2. Create an Access Control List (ACL) with read/write permissions for the Incident table as described in Securing table records.

  3. Set up a service account.



Example values

alertmanagerSimpleConfig.serviceNow.enabled (bool)

Enables or disables Alertmanager integration with ServiceNow. Set to false by default. Requires a configured ServiceNow account and compliance with the Incident table requirements above.

true or false

alertmanagerSimpleConfig.serviceNow (map)

Defines the ServiceNow parameters and credentials for integration with Alertmanager:

  • incident_table - name of the table created in ServiceNow. Do not confuse with the table label.

  • api_version - version of the ServiceNow HTTP API. By default, v1.

  • alert_id_field - name of the unique string field configured in ServiceNow to hold Prometheus alert IDs. Do not confuse with the table label.

  • auth.instance - URL of the instance.

  • auth.username - name of the ServiceNow user account with access to Incident table.

  • auth.password - password of the ServiceNow user account.

  enabled: true
  incident_table: "incident"
  api_version: "v1"
  alert_id_field: "u_alert_id"
  auth:
    instance: ""
    username: "testuser"
    password: "testpassword"

Alertmanager: notifications to Slack

On managed clusters with limited Internet access, a proxy is required for StackLight components that use HTTP and HTTPS, are disabled by default, and need external access when enabled. The Slack integration depends on Internet access through HTTPS.



Example values

alertmanagerSimpleConfig.slack.enabled (bool)

Enables or disables Alertmanager integration with Slack. For details, see Prometheus Alertmanager documentation: Slack configuration. Set to false by default.

true or false

alertmanagerSimpleConfig.slack.api_url (string)

Defines the Slack webhook URL.

http://localhost:8888

(string)

Defines the Slack channel or user to send notifications to.


alertmanagerSimpleConfig.slack.route (map)

Defines the notifications route for Alertmanager integration with Slack. For details, see Prometheus Alertmanager documentation: Route.

  matchers: []
  routes: []

Alertmanager: Watchdog alert



Example values

prometheusServer.watchDogAlertEnabled (bool)

Enables or disables the Watchdog alert that constantly fires as long as the entire alerting pipeline is functional. You can use this alert to verify that Alertmanager notifications properly flow to the Alertmanager receivers. Set to true by default.

true or false

Byte limit for Telemeter client

For internal StackLight use only



Example values

telemetry.telemeterClient.limitBytes (string)

Specifies the size limit of the incoming data length in bytes for the Telemeter client. Defaults to 1048576.


Cluster size



Example values

clusterSize (string)

Specifies the approximate expected cluster size. Set to small by default. Other possible values include medium and large. Depending on the choice, appropriate resource limits are passed according to the resources or deprecated resourcesPerClusterSize parameter. The values differ by the OpenSearch and Prometheus resource limits:

  • small (default) - 2 CPU, 6 Gi RAM for OpenSearch, 1 CPU, 8 Gi RAM for Prometheus. Use small only for testing and evaluation purposes with no workloads expected.

  • medium - 4 CPU, 16 Gi RAM for OpenSearch, 3 CPU, 16 Gi RAM for Prometheus.

  • large - 8 CPU, 32 Gi RAM for OpenSearch, 6 CPU, 32 Gi RAM for Prometheus. Set to large only in case of lack of resources for OpenSearch and Prometheus.

small, medium, or large
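
For example, a cluster that is expected to run workloads might move from the default small profile to medium. This is only a sketch. Choose the size according to the resource limits listed above.

```yaml
# medium: 4 CPU, 16 Gi RAM for OpenSearch; 3 CPU, 16 Gi RAM for Prometheus
clusterSize: medium
```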




Example values

grafana.renderer.enabled (bool)

Removed in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0). Disables Grafana Image Renderer, for example, in resource-limited environments. Enabled by default.

true or false

grafana.homeDashboard (string)

Defines the home dashboard. Set to kubernetes-cluster by default. You can define any of the available dashboards.


High availability



Example values

highAvailabilityEnabled (bool) Mandatory

Enables or disables StackLight multiserver mode. For details, see StackLight database modes in Container Cloud Reference Architecture: StackLight deployment architecture. On managed clusters, set to false by default. On management clusters, true is mandatory.

true or false

Kubernetes network policies

Available since MCC 2.25.1 (Cluster releases 17.0.1 and 16.0.1)



Example values

networkPolicies.enabled (bool)

Enables or disables the Kubernetes Network Policy resource that allows controlling network connections to and from Pods deployed in the stacklight namespace. Enabled by default.

For the list of network policy rules, refer to StackLight rules for Kubernetes network policies. Customization of network policies is not supported.

true or false

Kubernetes tolerations



Example values

tolerations.default (slice)

Kubernetes tolerations to add to all StackLight components.

- key: "com.docker.ucp.manager"
  operator: "Exists"
  effect: "NoSchedule"

tolerations.component (map)

Defines Kubernetes tolerations (overrides the default ones) for any StackLight component.

  # elasticsearch:
  - key: "com.docker.ucp.manager"
    operator: "Exists"
    effect: "NoSchedule"
  - key: ""
    operator: "Exists"
    effect: "NoSchedule"

Log filtering for namespaces

Available since MCC 2.25.0 (Cluster releases 17.0.0 and 16.0.0)



Example values

logging.namespaceFiltering.logs.enabled (bool)

Limits the number of namespaces for Pods log collection. Enabled by default with the following list of monitored Kubernetes namespaces:

Kubernetes namespaces monitored by default
  • ceph
    If Ceph is enabled
  • ceph-lcm-mirantis
    If Ceph is enabled
  • default

  • kaas

  • kube-node-lease

  • kube-public

  • kube-system

  • lcm-system

  • local-path-storage

  • metallb

  • metallb-system

  • node-feature-discovery

  • openstack

  • openstack-ceph-shared
    If Ceph is enabled
  • openstack-lma-shared

  • openstack-provider-system

  • openstack-redis

  • openstack-tf-share
    If Tungsten Fabric is enabled
  • openstack-vault

  • osh-system

  • rook-ceph
    If Ceph is enabled
  • stacklight

  • system

  • tf
    If Tungsten Fabric is enabled

true or false

logging.namespaceFiltering.logs.extraNamespaces (map)

Adds extra namespaces to collect Kubernetes Pod logs from. Requires logging.enabled and logging.namespaceFiltering.logs.enabled set to true. Defines a YAML-formatted list of namespaces, which is empty by default.

      enabled: true
      extraNamespaces:
      - custom-ns-1

(bool)

Limits the number of namespaces for Kubernetes events collection. Disabled by default because the sysdig scanner is present on some MOSK clusters and because cluster-scoped objects produce events to the default namespace by default, but it is not passed to StackLight configuration in any way. Requires logging.enabled set to true.

true or false

(map)

Adds extra namespaces to collect Kubernetes events from. Requires logging.enabled and set to true. Defines a YAML-formatted list of namespaces, which is empty by default.

      enabled: true
      extraNamespaces:
      - custom-ns-1

Log verbosity



Example values

stacklightLogLevels.default (string)

Defines the log verbosity level for all StackLight components if not defined using component. To use the component default log verbosity level, leave the string empty.

  • trace - most verbose log messages, generates large amounts of data

  • debug - messages typically of use only for debugging purposes

  • info - informational messages describing common processes such as service starting or stopping; can be ignored during normal system operation but may provide additional input for investigation

  • warn - messages about conditions that may require attention

  • error - messages on error conditions that prevent normal system operation and require action

  • crit - messages on critical conditions indicating that a service is not working, working incorrectly or is unusable, requiring immediate attention

    Since Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0), the NO_SEVERITY severity label is automatically added to a log with no severity label in the message. This enables greater control over determining which logs Fluentd processes and which ones are skipped by mistake.

stacklightLogLevels.component (map)

Defines the log verbosity level for any StackLight component separately, overriding the default value. To use the component default log verbosity, leave the string empty.

  kubeStateMetrics: ""
  prometheusAlertManager: ""
  prometheusBlackboxExporter: ""
  prometheusNodeExporter: ""
  prometheusServer: ""
  alerta: ""
  alertmanagerWebhookServicenow: ""
  elasticsearchCurator: ""
  postgresql: ""
  prometheusEsExporter: ""
  sfNotifier: ""
  sfReporter: ""
  fluentd: ""
  # fluentdElasticsearch: ""
  fluentdLogs: ""
  telemeterClient: ""
  telemeterServer: ""
  tfControllerExporter: ""
  tfVrouterExporter: ""
  telegrafDs: ""
  telegrafS: ""
  # elasticsearch: ""
  opensearch: ""
  # kibana: ""
  grafana: ""
  opensearchDashboards: ""
  metricbeat: ""
  prometheusMsTeams: ""
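
The two keys can be combined so that one component is more verbose than the rest. In this sketch, the choice of warn as the common level and debug for Fluentd is purely illustrative. The level names come from the list under stacklightLogLevels.default.

```yaml
stacklightLogLevels:
  default: warn        # applies to every component without an explicit level
  component:
    fluentd: debug     # overrides the default for Fluentd only
    opensearch: ""     # empty string: use the component's own default level
```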




Example values

logging.enabled (bool) Mandatory

Enables or disables the StackLight logging stack. For details about the logging components, see Container Cloud Reference Architecture: StackLight Deployment architecture. Set to true by default. On management clusters, true is mandatory.

true or false

logging.level (string)

Removed in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Sets the least important level of log messages to send to OpenSearch. Requires logging.enabled set to true.

The default logging level is INFO, meaning that StackLight will drop log messages for the lower DEBUG and TRACE levels. Levels from WARNING to EMERGENCY require attention.


The FLUENTD_ERROR logs are of special type and cannot be dropped.

  • TRACE - the most verbose logs. Such level generates large amounts of data.

  • DEBUG - messages typically of use only for debugging purposes.

  • INFO - informational messages describing common processes such as service starting or stopping. Can be ignored during normal system operation but may provide additional input for investigation.

  • NOTICE - normal but significant conditions that may require special handling.

  • WARNING - messages on unexpected conditions that may require attention.

  • ERROR - messages on error conditions that prevent normal system operation and require action.

  • CRITICAL - messages on critical conditions indicating that a service is not working or working incorrectly.

  • ALERT - messages on severe events indicating that action is needed immediately.

  • EMERGENCY - messages indicating that a service is unusable.

logging.metricQueries (map)

Allows configuring OpenSearch queries for the data present in OpenSearch. Prometheus Elasticsearch Exporter then queries the OpenSearch database and exposes such metrics in the Prometheus format. For details, see Create logs-based metrics. Includes the following parameters:

  • indices - specifies the index pattern

  • interval and timeout - specify in seconds how often to send the query to OpenSearch and how long it can last before timing out

  • onError and onMissing - modify the prometheus-es-exporter behavior on query error and missing index. For details, see Prometheus Elasticsearch Exporter.

For usage example, see Create logs-based metrics.

logging.retentionTime (map)

Removed in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Specifies the retention time per index. Includes the following parameters:

  • logstash - specifies the logstash-* index retention time.

  • events - specifies the kubernetes_events-* index retention time.

  • notifications - specifies the notification-* index retention time.

The allowed values include integers (days) and numbers with suffixes: y, m, w, d, h, including capital letters.

    logstash: 3
    events: "2w"
    notifications: "1M"

Logging: Enforce OOPS compression

Available since MCC 2.25.0 (Cluster releases 17.0.0 and 16.0.0)



Example values


Enforces 32 GB of heap size, unless the defined memory limit allows using 50 GB of heap. Requires logging.enabled set to true. Enabled by default. When disabled, StackLight computes heap as ⅘ of the set memory limit for any resulting heap value. For more details, see Tune OpenSearch performance.

  enforceOopsCompression: true

Logging to external outputs

Available since MCC 2.23.0 (Cluster release 11.7.0)



Example values

logging.externalOutputs (map)

Specifies external Elasticsearch, OpenSearch, and syslog destinations as fluentd-logs outputs. Requires logging.enabled: true. For configuration procedure, see Enable log forwarding to external destinations.

      # disabled: false
      type: elasticsearch
      level: info
      plugin_log_level: info
      tag_exclude: '{fluentd-logs,systemd}'
      host: elasticsearch-host
      port: 9200
      logstash_date_format: '%Y.%m.%d'
      logstash_format: true
      logstash_prefix: logstash
        # disabled: false
        chunk_limit_size: 16m
        flush_interval: 15s
        flush_mode: interval
        overflow_action: block
      disabled: true
      type: opensearch

Logging to external outputs: secrets

Available since MCC 2.23.0 (Cluster release 11.7.0)



Example values

logging.externalOutputSecretMounts (map)

Specifies authentication secret mounts for external log destinations. Requires logging.externalOutputs to be enabled and a Kubernetes secret to be created under the stacklight namespace. Contains the following values:

  • secretName

    Mandatory. Kubernetes secret name.

  • mountPath

    Mandatory. Mount path of the Kubernetes secret defined in secretName.

  • defaultMode

    Optional. Decimal number defining secret permissions, 420 by default.

Secret mount configuration:

  - secretName: elasticsearch-certs
    mountPath: /tmp/elasticsearch-certs
    defaultMode: 420
  - secretName: opensearch-certs
    mountPath: /tmp/opensearch-certs

Elasticsearch configuration for the above secret mount:

      ca_file: /tmp/elasticsearch-certs/ca.pem
      client_cert: /tmp/elasticsearch-certs/client.pem
      client_key: /tmp/elasticsearch-certs/client.key
      client_key_pass: password

Logging to syslog

Deprecated since MCC 2.23.0 (Cluster release 11.7.0)


Since Container Cloud 2.23.0 (Cluster release 11.7.0), logging.syslog is deprecated in favor of logging.externalOutputs. For details, see Logging to external outputs.



Example values

logging.syslog.enabled (bool)

Enables or disables remote logging to syslog. Disabled by default. Requires logging.enabled set to true. For details and configuration example, see Enable remote logging to syslog.

true or false

(string)

Specifies the remote syslog host.


logging.syslog.level (string)

Removed in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Specifies logging level for the syslog output.


logging.syslog.port (string)

Specifies the remote syslog port.


logging.syslog.packetSize (string)

Defines the packet size in bytes for the syslog logging output. Set to 1024 by default. May be useful for syslog setups allowing packet size larger than 1 kB. Mirantis recommends that you tune this parameter to allow sending full log lines.


logging.syslog.protocol (string)

Specifies the remote syslog protocol. Set to udp by default.

tcp or udp

logging.syslog.tls.enabled (bool)

Optional. Disabled by default. Enables or disables TLS. Use TLS only for the TCP protocol. TLS will not be enabled if you set a protocol other than TCP.

true or false

logging.syslog.tls.verify_mode (int)

Optional. Configures TLS verification.

  • 0 for OpenSSL::SSL::VERIFY_NONE

  • 1 for OpenSSL::SSL::VERIFY_PEER



logging.syslog.tls.certificate (string)

Defines how to pass the certificate. secret takes precedence over hostPath.

  • secret - specifies the name of the secret holding the certificate.

  • hostPath - specifies an absolute host path to the PEM certificate.

  secret: ""
  hostPath: "/etc/ssl/certs/ca-bundle.pem"
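
A possible way to combine the syslog keys above for a TLS-protected TCP connection is sketched below. The port and packet size are illustrative values, and the nesting follows the dotted key names, which is an assumption. The remote host setting is omitted because its key name is not shown in this section.

```yaml
logging:
  enabled: true
  syslog:
    enabled: true
    port: "514"            # illustrative port
    protocol: tcp          # TLS is enabled only with the TCP protocol
    packetSize: "2048"     # illustrative; the default is 1024 bytes
    tls:
      enabled: true
      verify_mode: 1       # OpenSSL::SSL::VERIFY_PEER
      certificate:
        secret: ""         # when set, takes precedence over hostPath
        hostPath: "/etc/ssl/certs/ca-bundle.pem"
```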

tag_exclude (string)
Since MCC 2.23.0 (11.7.0)

Optional. Overrides tag_include. Sets logs by tags to exclude from the destination output. For example, to exclude all logs with the test tag, set tag_exclude: '/.*test.*/'.

How to obtain tags for logs

Select from the following options:

  • In the main OpenSearch output, use the logger field that equals the tag.

  • Use logs of a particular Pod or container by following the below order, with the first match winning:

    1. The value of the app Pod label. For example, for app=opensearch-master, use opensearch-master as the log tag.

    2. The value of the k8s-app Pod label.

    3. The value of the Pod label.

    4. If a release_group Pod label exists and the component Pod label starts with app, use the value of the component label as the tag. Otherwise, the tag is the application label joined to the component label with a -.

    5. The name of the container from which the log is taken.

The values for tag_exclude and tag_include are placed into <match> directives of Fluentd and only accept regex types that are supported by the <match> directive of Fluentd. For details, refer to the Fluentd official documentation.


tag_include (string)
Since MCC 2.23.0 (11.7.0)

Optional. Is overridden by tag_exclude. Sets logs by tags to include in the destination output. For example, to include all logs with the auth tag, set tag_include: '/.*auth.*/'.
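
As a sketch only, a tag filter could be attached to an external output as follows. The output name example-output is hypothetical, and the placement of the output under logging.externalOutputs is an assumption based on the key name and the indentation of the example in Logging to external outputs.

```yaml
logging:
  externalOutputs:
    example-output:              # hypothetical output name
      type: opensearch
      tag_include: '/.*auth.*/'  # forward only logs whose tag contains "auth"
```

Because tag_exclude overrides tag_include, setting both on the same output would make the exclusion pattern take effect.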


Monitoring of Ceph



Example values

ceph.enabled (bool)

Enables or disables Ceph monitoring on managed clusters. Set to false by default.

true or false

Monitoring of external endpoint



Example values

externalEndpointMonitoring.enabled (bool)

Enables or disables HTTP endpoints monitoring. If enabled, the monitoring tool performs the probes against the defined endpoints every 15 seconds. Set to false by default.

true or false

externalEndpointMonitoring.certificatesHostPath (string)

Defines the directory path with external endpoints certificates on host.

/etc/ssl/certs/

(slice)

Defines the list of HTTP endpoints to monitor. The endpoints must successfully respond to a liveness probe. For success, a request to a specific endpoint must result in a 2xx HTTP response code.


Monitoring of Ironic



Example values

ironic.endpoint (string)

Enables or disables monitoring of bare metal Ironic. To enable, specify the Ironic API URL.


ironic.insecure (bool)

Defines whether to skip the chain and host verification. Set to false by default.

true or false
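
For illustration, enabling bare metal Ironic monitoring with certificate verification kept on might look as follows. The endpoint value is a placeholder for the Ironic API URL of your environment.

```yaml
ironic:
  endpoint: "<Ironic API URL>"   # placeholder; specifying a URL enables monitoring
  insecure: false                # default: verify the certificate chain and host
```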

Monitoring of Mirantis Kubernetes Engine



Example values

mke.enabled (bool)

Enables or disables Mirantis Kubernetes Engine (MKE) monitoring. Set to true by default.

true or false

mke.dockerdDataRoot (string)

Defines the dockerd data root directory of persistent Docker state. For details, see Docker documentation: Daemon CLI (dockerd).


Monitoring of SSL certificates



Example values

sslCertificateMonitoring.enabled (bool)

Enables or disables StackLight to monitor and alert on the expiration date of the TLS certificate of an HTTPS endpoint. If enabled, the monitoring tool performs the probes against the defined endpoints every hour. Set to false by default.

true or false

(slice)

Defines the list of HTTPS endpoints to monitor the certificates from.


Monitoring of workload



Example values

metricFilter (map)

On clusters that run large-scale workloads, workload monitoring generates a large number of resource-consuming metrics. To prevent generation of excessive metrics, you can disable workload monitoring in the StackLight metrics and monitor only the infrastructure.

The metricFilter parameter enables the cAdvisor (Container Advisor) and kubeStateMetrics metric ingestion filters for Prometheus. Set to false by default. If set to true, you can define the namespaces to which the filter will apply. The parameter is designed for managed clusters.

  enabled: true
  action: keep
  namespaces:
  - kaas
  - kube-system
  - stacklight

  • enabled - enable or disable metricFilter using true or false

  • action - action to take by Prometheus:

    • keep - keep only metrics from namespaces that are defined in the namespaces list

    • drop - ignore metrics from namespaces that are defined in the namespaces list

  • namespaces - list of namespaces to keep or drop metrics from regardless of the boolean value for every namespace
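
The opposite approach, dropping metrics from selected namespaces while keeping everything else, can be sketched as follows. The namespace name is hypothetical.

```yaml
metricFilter:
  enabled: true
  action: drop          # ignore metrics from the namespaces listed below
  namespaces:
  - custom-ns-1         # hypothetical workload namespace
```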




Example values

nodeSelector.default (map)

Defines the NodeSelector to use for most StackLight pods (except some pods that refer to DaemonSets) if the NodeSelector of a component is not defined.

  role: stacklight

nodeSelector.component (map)

Defines the NodeSelector to use for particular StackLight component pods. Overrides nodeSelector.default.

  alerta:
    role: stacklight
    component: alerta
  # kibana:
  #   role: stacklight
  #   component: kibana
  opensearchDashboards:
    role: stacklight
    component: opensearchdashboards




Example values

elasticsearch.retentionTime (map)

Removed in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Specifies the retention time per index. Includes the following parameters:

  • logstash - specifies the logstash-* index retention time.

  • events - specifies the kubernetes_events-* index retention time.

  • notifications - specifies the notification-* index retention time.

The allowed values include integers (days) and numbers with suffixes: y, m, w, d, h, including capital letters.

By default, values set in elasticsearch.logstashRetentionTime are used. However, the elasticsearch.retentionTime parameters, if defined, take precedence over elasticsearch.logstashRetentionTime.

elasticsearch:
  retentionTime:
    logstash: 3
    events: "2w"
    notifications: "1M"
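
The precedence rule described above can be sketched as follows. The values are illustrative only, and the sketch applies only to Cluster releases that still support these parameters.

```yaml
elasticsearch:
  # Base retention time in days
  logstashRetentionTime: 5
  retentionTime:
    # Defined explicitly, so this value takes precedence
    # for the kubernetes_events-* index
    events: "2w"
```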

elasticsearch.logstashRetentionTime (int)

Removed in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0).

Defines the OpenSearch (Elasticsearch) logstash-* index retention time in days. The logstash-* index stores all logs gathered from all nodes and containers. Set to 1 by default.


Due to the known issue 27732-2, a custom setting for this parameter is dismissed during cluster deployment and changes to one day (default). Refer to the known issue description for the affected Cluster releases and available workaround.

1, 5, 15

elasticsearch.persistentVolumeClaimSize (string) Mandatory

Specifies the size of the OpenSearch (Elasticsearch) PVC(s). The number of PVCs depends on the StackLight database mode. For HA, three PVCs are created, each of the size specified in this parameter. For non-HA, one PVC of the specified size is created.


You cannot modify this parameter after cluster creation.


Due to the known issue 27732-1, which is fixed in Container Cloud 2.22.0 (Cluster releases 11.6.0 and 12.7.0), the OpenSearch PVC size configuration is dismissed during cluster deployment. Refer to the known issue description for the affected Cluster releases and available workarounds.

elasticsearch:
  persistentVolumeClaimSize: 30Gi

elasticsearch.persistentVolumeUsableStorageSizeGB (integer)

Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Optional. Specifies the number of gigabytes that is exclusively available for the OpenSearch data. Defines the ceiling for storage-based retention, where 80% of the defined value is assumed as the disk space available for normal OpenSearch node functioning. If not set (default), the number of gigabytes from elasticsearch.persistentVolumeClaimSize is used.

This parameter is useful in the following cases:

  • The real storage behind the volume is shared between multiple consumers. As a result, OpenSearch cannot use all elasticsearch.persistentVolumeClaimSize.

  • The real volume size is bigger than elasticsearch.persistentVolumeClaimSize. As a result, OpenSearch can use more than elasticsearch.persistentVolumeClaimSize.

elasticsearch:
  persistentVolumeUsableStorageSizeGB: 160
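
As a rough worked example of the 80% rule, assume a hypothetical setup where the PVC is 200Gi but the storage behind the volume is shared, so only 160 GB can be dedicated to OpenSearch. The claim size here is an assumption made for illustration.

```yaml
elasticsearch:
  # Hypothetical claim size; cannot be modified after cluster creation
  persistentVolumeClaimSize: 200Gi
  # Ceiling for storage-based retention:
  # 80% of 160 GB = 128 GB is assumed as available disk space
  persistentVolumeUsableStorageSizeGB: 160
```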

OpenSearch Dashboards extra settings



Example values

logging.dashboardsExtraConfig (map)

Additional configuration for opensearch_dashboards.yml.

logging:
  dashboardsExtraConfig:
    opensearch.requestTimeout: 60000

OpenSearch extra settings



Example values

logging.extraConfig (map)

Additional configuration for opensearch.yml.

logging:
  extraConfig:
    cluster.max_shards_per_node: 5000




Example values

prometheusServer.alertResendDelay (string)

Defines the minimum amount of time for Prometheus to wait before resending an alert to Alertmanager. Passed to the --rules.alert.resend-delay flag. Set to 2m by default.

2m, 90s

prometheusServer.alertsCommonLabels (dict)

Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Defines the list of labels to inject into firing alerts when they are sent to Alertmanager. Empty by default.

The following labels are reserved for internal purposes and cannot be overridden: cluster_id, service, severity.


When new labels are injected, Prometheus sends alert updates with a new set of labels, which can potentially cause Alertmanager to have duplicated alerts for a short period of time if the cluster currently has firing alerts.

alertsCommonLabels:
  region: west
  environment: prod

prometheusServer.persistentVolumeClaimSize (string) Mandatory

Specifies the size of the Prometheus PVC(s). The number of PVCs depends on the StackLight database mode. For HA, three PVCs are created, each of the size specified in this parameter. For non-HA, one PVC of the specified size is created.


You cannot modify this parameter after cluster creation.

prometheusServer:
  persistentVolumeClaimSize: 16Gi

prometheusServer.queryConcurrency (string)

Available since Container Cloud 2.24.0 (Cluster release 14.0.0). Defines the maximum number of concurrent queries. Passed to the --query.max-concurrency flag. Set to 20 by default.
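
The following sketch raises the limit above the default. The value 40 is an arbitrary illustration, not a recommendation, and is quoted because the parameter type is string.

```yaml
prometheusServer:
  # Passed to --query.max-concurrency; the default is 20
  queryConcurrency: "40"
```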


prometheusServer.retentionSize (string)

Defines the Prometheus database retention size. Passed to the --storage.tsdb.retention.size flag. Set to 15GB by default.

15GB, 512MB

prometheusServer.retentionTime (string)

Defines the Prometheus database retention period. Passed to the --storage.tsdb.retention.time flag. Set to 15d by default.

15d, 1000h, 10d12h
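
A minimal sketch that sets both retention limits together, reusing the example values above. In upstream Prometheus, when both time and size retention are specified, whichever limit is reached first triggers data removal. The sketch assumes that StackLight passes both flags through unchanged.

```yaml
prometheusServer:
  # Passed to --storage.tsdb.retention.time
  retentionTime: "10d12h"
  # Passed to --storage.tsdb.retention.size
  retentionSize: "512MB"
```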

Prometheus Blackbox Exporter



Example values

blackboxExporter.customModules (map)

Specifies a set of custom Blackbox Exporter modules. For details, see Blackbox Exporter configuration: module. The http_2xx, http_2xx_verify, http_openstack, http_openstack_insecure, tls, tls_verify names are reserved for internal usage and any overrides will be discarded.

customModules:
  http_post_2xx:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Content-Type: application/json
      body: '{}'

blackboxExporter.timeoutOffset (string)

Specifies the offset to subtract from timeout in seconds (--timeout-offset), upper bounded by 5.0 to comply with the built-in StackLight functionality. If nothing is specified, the Blackbox Exporter default value is used. For example, for Blackbox Exporter v0.19.0, the default value is 0.5.

timeoutOffset: "0.1"

Prometheus custom recording rules



Example values

prometheusServer.customRecordingRules (slice)

Defines custom Prometheus recording rules. Overriding of existing recording rules is not supported.

- name: ExampleRule.http_requests_total
  rules:
  - expr: sum by(job) (rate(http_requests_total[5m]))
    record: job:http_requests:rate5m
  - expr: avg_over_time(job:http_requests:rate5m[1w])
    record: job:http_requests:rate5m:avg_over_time_1w

Prometheus custom scrape configurations



Example values

prometheusServer.customScrapeConfigs (map)

Defines custom Prometheus scrape configurations. For details, see Prometheus documentation: scrape_config. The names of default StackLight scrape configurations, which you can view in the Status -> Targets tab of the Prometheus web UI, are reserved for internal usage and any overrides will be discarded. Therefore, provide unique names to avoid overrides.

customScrapeConfigs:
  custom-grafana:
    scrape_interval: 10s
    scrape_timeout: 5s
    kubernetes_sd_configs:
    - role: endpoints
    relabel_configs:
    - source_labels:
      - __meta_kubernetes_service_label_app
      - __meta_kubernetes_endpoint_port_name
      regex: grafana;service
      action: keep
    - source_labels:
      - __meta_kubernetes_pod_name
      target_label: pod

Prometheus metrics filtering

Available since Container Cloud 2.24.0 (Cluster release 14.0.0)



Example values

metricsFiltering.enabled (bool)

Configuration for managing Prometheus metrics filtering. When enabled (default), only actively used and explicitly white-listed metrics get scraped by Prometheus.

metricsFiltering:
  enabled: true

metricsFiltering.extraMetricsInclude (map)

List of extra metrics to whitelist, which are dropped by default. Contains the following parameters:

  • <job name> - scraping job name as a key for extra white-listed metrics to add under the key. For the list of job names, see White list of Prometheus scrape jobs. If a job name is not present in this list, its target metrics are not dropped and are collected by Prometheus by default.

    You can also use group key names to add metrics to more than one job using _group-<key name>. The following list combines jobs by groups:

    List of jobs by groups
     - blackbox
     - blackbox-external-endpoint
     - kubernetes-master-api
     - mcc-blackbox
     - mke-manager-api
     - msr-api
     - openstack-blackbox-ext
     - openstack-dns-probe # Since MOSK 24.3
     - refapp
     - helm-controller
     - kaas-exporter
     - kubelet
     - kubernetes-apiservers
     - mcc-controllers
     - mcc-providers
     - rabbitmq-operator-metrics
     - etcd-server
     - ucp-kv
     - cadvisor
     - calico
     - etcd-server
     - helm-controller
     - ironic
     - kaas-exporter
     - kubelet
     - kubernetes-apiservers
     - mcc-cache
     - mcc-controllers
     - mcc-providers
     - mke-metrics-controller
     - mke-metrics-engine
     - openstack-ingress-controller
     - postgresql
     - prometheus-alertmanager
     - prometheus-elasticsearch-exporter
     - prometheus-grafana
     - prometheus-libvirt-exporter
     - prometheus-memcached-exporter
     - prometheus-msteams
     - prometheus-mysql-exporter
     - prometheus-node-exporter
     - prometheus-rabbitmq-exporter
     - prometheus-relay
     - prometheus-server
     - rabbitmq-operator-metrics
     - telegraf-docker-swarm
     - telemeter-client
     - telemeter-server
     - tf-control
     - tf-redis
     - tf-vrouter
     - ucp-kv
     - alertmanager-webhook-servicenow
     - cadvisor
     - calico
     - etcd-server
     - helm-controller
     - ironic
     - kaas-exporter
     - kubelet
     - kubernetes-apiservers
     - mcc-cache
     - mcc-controllers
     - mcc-providers
     - mke-metrics-controller
     - mke-metrics-engine
     - openstack-ingress-controller
     - patroni
     - postgresql
     - prometheus-alertmanager
     - prometheus-elasticsearch-exporter
     - prometheus-grafana
     - prometheus-libvirt-exporter
     - prometheus-memcached-exporter
     - prometheus-msteams
     - prometheus-mysql-exporter
     - prometheus-node-exporter
     - prometheus-rabbitmq-exporter
     - prometheus-relay
     - prometheus-server
     - rabbitmq-operator-metrics
     - sf-notifier
     - telegraf-docker-swarm
     - telemeter-client
     - telemeter-server
     - tf-control
     - tf-redis
     - tf-vrouter
     - tf-zookeeper
     - ucp-kv
     - helm-controller
     - kaas-exporter
     - mcc-controllers
     - mcc-providers
     - mcc-controllers
     - mcc-providers
     - mcc-cache
     - mcc-controllers
     - mcc-controllers
     - mcc-providers


    The prometheus-coredns job from the go-collector-metrics and process-collector-metrics groups is removed in Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0).

  • <list of metrics to collect> - extra metrics of <job name> to be white-listed.

metricsFiltering:
  enabled: true
  extraMetricsInclude:
    cadvisor:
      - container_memory_failcnt
      - container_network_transmit_errors_total
    calico:
      - felix_route_table_per_iface_sync_seconds_sum
      - felix_bpf_dataplane_endpoints
    _group-go-collector-metrics:
      - go_gc_heap_goal_bytes
      - go_gc_heap_objects_objects

Prometheus Node Exporter



Example values

nodeExporter.netDeviceExclude (string)

Excludes monitoring of RegExp-specified network devices. The number of network interface-related metrics is significant and may cause increased Prometheus RAM usage in large clusters. Therefore, Prometheus Node Exporter collects information only about a basic set of interfaces (both host and container) and excludes the following interfaces from monitoring:

  • veth/cali - the host-side part of the container-host Ethernet tunnel

  • o-hm0 - the OpenStack Octavia management interface for communication with the amphora machine

  • tap, qg-, qr-, ha- - the Open vSwitch virtual bridge ports

  • br-(ex|int|tun) - the Open vSwitch virtual bridges

  • docker0, br- - the Docker bridge (master for the veth interfaces)

  • ovs-system - the Open vSwitch interface (mapping interfaces to bridges)

To enable the collection of information for the interfaces above, edit the list of excluded devices as needed.

nodeExporter:
  netDeviceExclude: "^(veth.+|cali.+|o-hm0|tap.+|qg-.+|qr-.+|ha-.+|br-.+|ovs-system|docker0)$"
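
For example, to start collecting metrics for the docker0 bridge while keeping the remaining exclusions, remove the docker0 alternative from the expression. This is a sketch derived from the example value above; verify the resulting expression against the interface names on your nodes.

```yaml
nodeExporter:
  # Same as the example expression, but without the docker0 alternative
  netDeviceExclude: "^(veth.+|cali.+|o-hm0|tap.+|qg-.+|qr-.+|ha-.+|br-.+|ovs-system)$"
```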

nodeExporter.extraCollectorsEnabled (slice)

Enables Node Exporter collectors. For a list of available collectors, see Node Exporter Collectors. The following collectors are enabled by default in StackLight:

  • arp

  • conntrack

  • cpu

  • diskstats

  • entropy

  • filefd

  • filesystem

  • hwmon

  • loadavg

  • meminfo

  • netdev

  • netstat

  • nfs

  • stat

  • sockstat

  • textfile

  • time

  • timex

  • uname

  • vmstat

extraCollectorsEnabled:
  - bcache
  - bonding
  - softnet

Prometheus Relay


Prometheus Relay is set up as an endpoint in the Prometheus datasource in Grafana. Therefore, all requests from Grafana are sent to Prometheus through Prometheus Relay. If Prometheus Relay reports request timeouts or exceeds the response size limits, you can configure the parameters below. In this case, Prometheus Relay resource limits may also require tuning.



Example values

prometheusRelay.clientTimeout (string)

Specifies the client timeout in seconds. If empty, defaults to a value determined by the cluster size: 10 for small, 30 for medium, 60 for large.


The cluster size parameters are available since Container Cloud 2.24.0 (Cluster release 14.0.0).


prometheusRelay.responseLimitBytes (string)

Specifies the response size limit in bytes. If empty, defaults to a value determined by the cluster size: 6291456 for small, 18874368 for medium, 37748736 for large.


The cluster size parameters are available since Container Cloud 2.24.0 (Cluster release 14.0.0).
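
A minimal sketch that sets both Prometheus Relay parameters explicitly instead of relying on the cluster size. The values reuse the defaults listed above for a large cluster and are quoted because both parameters are strings.

```yaml
prometheusRelay:
  # Client timeout in seconds
  clientTimeout: "60"
  # Response size limit in bytes: 37748736 = 36 * 1024 * 1024 (36 MiB)
  responseLimitBytes: "37748736"
```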


Prometheus remote write

Allows sending of metrics from Prometheus to a custom monitoring endpoint. For details, see Prometheus Documentation: remote_write.



Example values

prometheusServer.remoteWriteSecretMounts (slice)

Skip this step if your remote server does not require authorization. Defines additional mounts for remoteWrites secrets. Secret objects with the credentials needed to access the remote endpoint must be precreated in the stacklight namespace. For details, see Kubernetes Secrets.


To create more than one file for the same remote write endpoint, for example, to configure TLS connections, use a single secret object with multiple keys in the data field. Using the following example configuration, two files will be created, cert_file and key_file:

data:
  cert_file: aWx1dnRlc3Rz
  key_file: dGVzdHVzZXI=

remoteWriteSecretMounts:
- secretName: prom-secret-files
  mountPath: /etc/config/remote_write

prometheusServer.remoteWrites (slice)

Defines the configuration of a custom remote_write endpoint for sending Prometheus samples.


If the remote server uses authorization, first create secret(s) in the stacklight namespace and mount them to Prometheus through prometheusServer.remoteWriteSecretMounts. Then define the created secret in the authorization field.

remoteWrites:
-  url: http://remote_url/push
   authorization:
     credentials_file: /etc/config/remote_write/key_file
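
For the TLS case mentioned above, the two mounted files could be referenced as sketched below. This assumes that the entries of remoteWrites are passed to Prometheus as regular remote_write items, so that the upstream tls_config fields cert_file and key_file apply. The URL is a placeholder.

```yaml
prometheusServer:
  remoteWriteSecretMounts:
  - secretName: prom-secret-files
    mountPath: /etc/config/remote_write
  remoteWrites:
  - url: https://remote_url/push
    tls_config:
      # Files created from the keys of the prom-secret-files secret
      cert_file: /etc/config/remote_write/cert_file
      key_file: /etc/config/remote_write/key_file
```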

Resource limits



Example values

resourcesPerClusterSize (map)

Provides the capability to override the default resource requests or limits for any StackLight component for the predefined cluster sizes.


Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0), resourcesPerClusterSize is deprecated. Use the resources parameter instead.

StackLight components for resource limits customization


The below list has the componentName: <podNamePrefix>/<containerName> format.

alerta: alerta/alerta
alertmanager: prometheus-alertmanager/prometheus-alertmanager
alertmanagerWebhookServicenow: alertmanager-webhook-servicenow/alertmanager-webhook-servicenow
blackboxExporter: prometheus-blackbox-exporter/blackbox-exporter
elasticsearch: opensearch-master/opensearch # Deprecated
elasticsearchCurator: elasticsearch-curator/elasticsearch-curator
elasticsearchExporter: elasticsearch-exporter/elasticsearch-exporter
fluentdElasticsearch: fluentd-logs/fluentd-logs # Deprecated
fluentdLogs: fluentd-logs/fluentd-logs
fluentdNotifications: fluentd-notifications/fluentd
grafana: grafana/grafana
grafanaRenderer: grafana/grafana-renderer # Removed in MCC 2.27.0 (17.2.0 and 16.2.0)
iamProxy: iam-proxy/iam-proxy # Deprecated
iamProxyAlerta: iam-proxy-alerta/iam-proxy
iamProxyAlertmanager: iam-proxy-alertmanager/iam-proxy
iamProxyGrafana: iam-proxy-grafana/iam-proxy
iamProxyKibana: iam-proxy-kibana/iam-proxy # Deprecated
iamProxyOpenSearchDashboards: iam-proxy-kibana/iam-proxy
iamProxyPrometheus: iam-proxy-prometheus/iam-proxy
kibana: opensearch-dashboards/opensearch-dashboards # Deprecated
kubeStateMetrics: prometheus-kube-state-metrics/prometheus-kube-state-metrics
libvirtExporter: prometheus-libvirt-exporter/prometheus-libvirt-exporter
metricCollector: metric-collector/metric-collector
metricbeat: metricbeat/metricbeat
nodeExporter: prometheus-node-exporter/prometheus-node-exporter
opensearch: opensearch-master/opensearch
opensearchDashboards: opensearch-dashboards/opensearch-dashboards
patroniExporter: patroni/patroni-patroni-exporter
pgsqlExporter: patroni/patroni-pgsql-exporter
postgresql: patroni/patroni
prometheusEsExporter: prometheus-es-exporter/prometheus-es-exporter
prometheusMsTeams: prometheus-msteams/prometheus-msteams
prometheusRelay: prometheus-relay/prometheus-relay
prometheusServer: prometheus-server/prometheus-server
sfNotifier: sf-notifier/sf-notifier
sfReporter: sf-reporter/sf-reporter
stacklightHelmControllerController: stacklight-helm-controller/controller
telegrafDockerSwarm: telegraf-docker-swarm/telegraf-docker-swarm
telegrafDs: telegraf-ds-smart/telegraf-ds-smart # Deprecated
telegrafDsSmart: telegraf-ds-smart/telegraf-ds-smart
telegrafOpenstack: telegraf-openstack/telegraf-openstack # replaced with osdpl-exporter in 24.1
telegrafS: telegraf-docker-swarm/telegraf-docker-swarm # Deprecated
telemeterClient: telemeter-client/telemeter-client
telemeterServer: telemeter-server/telemeter-server
telemeterServerAuthServer: telemeter-server/telemeter-server-authorization-server
tfControllerExporter: prometheus-tf-controller-exporter/prometheus-tungstenfabric-exporter
tfVrouterExporter: prometheus-tf-vrouter-exporter/prometheus-tungstenfabric-exporter
resourcesPerClusterSize:
  # elasticsearch:
  opensearch:
    small:
      limits:
        cpu: "1000m"
        memory: "4Gi"
    medium:
      limits:
        cpu: "2000m"
        memory: "8Gi"
      requests:
        cpu: "1000m"
        memory: "4Gi"
    large:
      limits:
        cpu: "4000m"
        memory: "16Gi"

resources (map)

Provides the capability to override the container resource requests or limits for any StackLight component.

StackLight components for resource limits customization


The below list has the componentName: <podNamePrefix>/<containerName> format.

alerta: alerta/alerta
alertmanager: prometheus-alertmanager/prometheus-alertmanager
alertmanagerWebhookServicenow: alertmanager-webhook-servicenow/alertmanager-webhook-servicenow
blackboxExporter: prometheus-blackbox-exporter/blackbox-exporter
elasticsearch: opensearch-master/opensearch # Deprecated
elasticsearchCurator: elasticsearch-curator/elasticsearch-curator
elasticsearchExporter: elasticsearch-exporter/elasticsearch-exporter
fluentdElasticsearch: fluentd-logs/fluentd-logs # Deprecated
fluentdLogs: fluentd-logs/fluentd-logs
fluentdNotifications: fluentd-notifications/fluentd
grafana: grafana/grafana
grafanaRenderer: grafana/grafana-renderer # Removed in MCC 2.27.0 (17.2.0 and 16.2.0)
iamProxy: iam-proxy/iam-proxy # Deprecated
iamProxyAlerta: iam-proxy-alerta/iam-proxy
iamProxyAlertmanager: iam-proxy-alertmanager/iam-proxy
iamProxyGrafana: iam-proxy-grafana/iam-proxy
iamProxyKibana: iam-proxy-kibana/iam-proxy # Deprecated
iamProxyOpenSearchDashboards: iam-proxy-kibana/iam-proxy
iamProxyPrometheus: iam-proxy-prometheus/iam-proxy
kibana: opensearch-dashboards/opensearch-dashboards # Deprecated
kubeStateMetrics: prometheus-kube-state-metrics/prometheus-kube-state-metrics
libvirtExporter: prometheus-libvirt-exporter/prometheus-libvirt-exporter
metricCollector: metric-collector/metric-collector
metricbeat: metricbeat/metricbeat
nodeExporter: prometheus-node-exporter/prometheus-node-exporter
opensearch: opensearch-master/opensearch
opensearchDashboards: opensearch-dashboards/opensearch-dashboards
patroniExporter: patroni/patroni-patroni-exporter
pgsqlExporter: patroni/patroni-pgsql-exporter
postgresql: patroni/patroni
prometheusEsExporter: prometheus-es-exporter/prometheus-es-exporter
prometheusMsTeams: prometheus-msteams/prometheus-msteams
prometheusRelay: prometheus-relay/prometheus-relay
prometheusServer: prometheus-server/prometheus-server
sfNotifier: sf-notifier/sf-notifier
sfReporter: sf-reporter/sf-reporter
stacklightHelmControllerController: stacklight-helm-controller/controller
telegrafDockerSwarm: telegraf-docker-swarm/telegraf-docker-swarm
telegrafDs: telegraf-ds-smart/telegraf-ds-smart # Deprecated
telegrafDsSmart: telegraf-ds-smart/telegraf-ds-smart
telegrafOpenstack: telegraf-openstack/telegraf-openstack # replaced with osdpl-exporter in 24.1
telegrafS: telegraf-docker-swarm/telegraf-docker-swarm # Deprecated
telemeterClient: telemeter-client/telemeter-client
telemeterServer: telemeter-server/telemeter-server
telemeterServerAuthServer: telemeter-server/telemeter-server-authorization-server
tfControllerExporter: prometheus-tf-controller-exporter/prometheus-tungstenfabric-exporter
tfVrouterExporter: prometheus-tf-vrouter-exporter/prometheus-tungstenfabric-exporter
resources:
  alerta:
    requests:
      cpu: "50m"
      memory: "200Mi"
    limits:
      memory: "500Mi"

Using the example above, each pod in the alerta service requests 50 millicores of CPU and 200 MiB of memory and is hard-limited to 500 MiB of memory usage. Each configuration key is optional.


The performance of the logging mechanism depends on the cluster log load. If the cluster components send an excessive amount of logs, the default resource requests and limits for fluentdLogs (or fluentdElasticsearch) may be insufficient, which may cause its pods to be OOMKilled and trigger the KubePodCrashLooping alert. In this case, increase the default resource requests and limits for fluentdLogs. For example:

resources:
  # fluentdElasticsearch:
  fluentdLogs:
    requests:
      memory: "500Mi"
    limits:
      memory: "1500Mi"

Salesforce reporter

On managed clusters with limited Internet access, a proxy is required for the StackLight components that use HTTP or HTTPS, are disabled by default, and need external access when enabled. The Salesforce reporter depends on Internet access through HTTPS.



Example values

clusterId (string)

Unique cluster identifier clusterId="<Cluster Project>/<Cluster Name>/<UID>", generated for each cluster using Cluster Project, Cluster Name, and cluster UID, separated by a slash. Used for both sf-reporter and sf-notifier services.

The clusterId key is automatically defined for each cluster. Do not set or modify it manually.

sfReporter.enabled (bool)

Enables or disables reporting of Prometheus metrics to Salesforce. For details, see Container Cloud Reference Architecture: StackLight Deployment architecture. Disabled by default.

true or false

sfReporter.salesForceAuth (map)

Salesforce parameters and credentials for the metrics reporting integration.


Modify this parameter if sf-notifier is not configured or if you want to use a different Salesforce user account to send reports to.

salesForceAuth:
  url: "<SF instance URL>"
  username: "<SF account email address>"
  password: "<SF password>"
  environment_id: "<Cloud identifier>"
  organization_id: "<Organization identifier>"
  sandbox_enabled: "<Set to true or false>"

sfReporter.cronjob (map)

Defines the Kubernetes cron job for sending metrics to Salesforce. By default, reports are sent at midnight server time.

cronjob:
  schedule: "0 0 * * *"
  concurrencyPolicy: "Allow"
  failedJobsHistoryLimit: ""
  successfulJobsHistoryLimit: ""
  startingDeadlineSeconds: 200
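
To change the reporting time, only the schedule needs to differ from the defaults. The sketch below uses standard cron syntax (minute, hour, day of month, month, day of week) and assumes that the keys you omit keep their default values.

```yaml
sfReporter:
  cronjob:
    # 06:00 server time every Monday
    schedule: "0 6 * * 1"
```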

Storage class

In an HA StackLight setup, when highAvailabilityEnabled is set to true, all StackLight Persistent Volumes (PVs) use the Local Volume Provisioner (LVP) storage class so as not to rely on dynamic provisioners such as Ceph, which are not available in every deployment. In a non-HA StackLight setup, when no storage class is specified, PVs use the default storage class of the cluster.



Example values

storage.defaultStorageClass (string)

Defines the StorageClass to use for all StackLight Persistent Volume Claims (PVCs) if a component StorageClass is not defined in componentStorageClasses. To use the default storage class, leave the string empty.

lvp, standard

storage.componentStorageClasses (map)

Defines the storage class for an individual StackLight component, overriding the defaultStorageClass value. To use the default storage class, leave the string empty.

componentStorageClasses:
  elasticsearch: ""
  opensearch: ""
  fluentd: ""
  postgresql: ""
  prometheusAlertManager: ""
  prometheusServer: ""
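
A minimal sketch of how the two storage parameters combine, using the lvp and standard class names from the example values above. Whether these storage classes exist depends on your cluster.

```yaml
storage:
  # Used by every component that has no override below
  defaultStorageClass: lvp
  componentStorageClasses:
    # Override for Prometheus only
    prometheusServer: standard
```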