Pike, Queens
In OpenStack environments with OpenContrail and Barbican, if you use a
non-default Keystone domain, the LBaaS VIP cannot be created. LBaaS cannot
download a secret created by the Barbican user in any project other than the
project where opencontrail_barbican_user has admin privileges.
Workaround:
On every OpenStack controller node where the Barbican API is installed, add
the following configuration to /etc/barbican/policy.json:
barbican:
server:
policy:
all_domains_reader: 'user:<user_ID> and project:<project_ID>'
secret_acl_read: "'read':%(target.secret.read)s or rule:all_domains_reader"
container_acl_read: "'read':%(target.container.read)s or rule:all_domains_reader"
By default, LBaaS uses the admin user to obtain secrets from Barbican.
Replace <user_ID> and <project_ID> with the OpenStack IDs of this user
and of the project where this user has the admin role.
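The intent of these two rules can be sketched in a few lines of Python, as a simplified model of how oslo.policy evaluates them: a secret stays readable through its own ACL, and additionally becomes readable to the designated user in the designated project. The IDs below are hypothetical placeholders for <user_ID> and <project_ID>.

```python
USER_ID = "e2f1d3a0c5b34cd7"     # hypothetical <user_ID>
PROJECT_ID = "9c8b7a6d5e4f3a2b"  # hypothetical <project_ID>

def all_domains_reader(creds):
    # rule: 'user:<user_ID> and project:<project_ID>'
    return creds["user"] == USER_ID and creds["project"] == PROJECT_ID

def secret_acl_read(creds, target_secret_read):
    # rule: "'read':%(target.secret.read)s or rule:all_domains_reader"
    return target_secret_read or all_domains_reader(creds)

# The LBaaS admin user can now read a secret even without an ACL entry:
lbaas_creds = {"user": USER_ID, "project": PROJECT_ID}
print(secret_acl_read(lbaas_creds, False))  # True
```

Other users are unaffected: for them, the check falls back to the per-secret ACL, which is the default behavior.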
Log in to the Salt Master node.
Apply the following state:
salt -C 'I@barbican:server' state.apply barbican
This configuration adds appropriate rights to read the secrets and containers from Barbican.
Queens
The Reclass model for OpenStack Queens includes the deprecated Heat CloudWatch API, which may cause false positive alerts for the Heat CloudWatch service in StackLight LMA. The issue affects only existing deployments with OpenStack Queens.
Workaround:
Upgrade your MCP deployment to the Build ID 2019.2.0 as described in MCP Operations Guide: Upgrade MCP to a newer release version.
Open your Git project repository with the Reclass model on the cluster level.
In openstack/init.yml, specify the following parameter:
openstack_heat_cloudwatch_api_enabled: False
Log in to the Salt Master node.
Apply the haproxy state on all OpenStack controller nodes:
salt ctl* state.apply haproxy
Apply the nginx state on all proxy nodes:
salt prx* state.apply nginx
Queens
When resetting the OpenStack administrator password, the state.sls keystone
state does not apply the changes. The issue affects only the OpenStack
Queens release.
Workaround:
Log in to an OpenStack controller node.
Source the keystonercv3 file:
source /root/keystonercv3
Set a new password:
openstack user set admin --password <new_password>
Once done, the services that use the administrator password fail to authenticate until you complete the following steps.
From the Salt Master node, open the
/srv/salt/reclass/classes/<cluster_name>/infra/secrets.yaml file and
specify the new password in the keystone_admin_password parameter.
Re-run the Deploy - OpenStack Jenkins pipeline job.
Queens. Fixed in 2019.2.3
Changing the logging level for the OpenStack services may fail.
Workaround:
Apply the affected states on the compute nodes:
salt cmp* state.apply nova,neutron,cinder
Alternatively, re-run the Deploy - OpenStack Jenkins pipeline job.
Pike, Queens
In OpenStack Pike or Queens environments with Octavia, if a gtw node is
rebooted or the octavia-worker service is restarted during the creation,
update, or deletion of a load balancer or other resources, the stale load
balancer gets stuck in the PENDING_UPDATE or PENDING_DELETE state.
Workaround:
Log in to any OpenStack controller node.
Obtain the target load balancer ID:
openstack loadbalancer list | awk '/ PENDING_CREATE / {print $2}'
Choose from the following options:
For the MCP version 2019.2.4 and later, run the following command:
openstack loadbalancer delete --force <load_balancer_id>
Note
The --force flag requires admin rights and works only if the load balancer
was not updated during the last hour.
For the MCP versions older than 2019.2.4:
Log in to any dbs node.
Log in to the MySQL database:
mysql -uoctavia -p
Run the following command with the load balancer ID obtained in step 2. For example:
update load_balancer set provisioning_status='ERROR' \
where id='0fc571fe-6ad1-4311-ab13-765b5526cd30';
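The awk one-liner in step 2 can also be sketched in Python, which makes it easy to cover all three pending states mentioned in this issue. The listing below is an illustrative sample, not real command output:

```python
# Extract load balancer IDs stuck in a pending state from the table
# printed by `openstack loadbalancer list` (same idea as the awk filter).
sample = """\
+--------------------------------------+------+---------------------+
| id                                   | name | provisioning_status |
+--------------------------------------+------+---------------------+
| 0fc571fe-6ad1-4311-ab13-765b5526cd30 | lb1  | PENDING_UPDATE      |
| 11111111-2222-3333-4444-555555555555 | lb2  | ACTIVE              |
+--------------------------------------+------+---------------------+
"""

def stuck_ids(listing,
              states=("PENDING_CREATE", "PENDING_UPDATE", "PENDING_DELETE")):
    ids = []
    for line in listing.splitlines():
        # Split a table row into cells; separator lines produce too few cells.
        cols = [c.strip() for c in line.strip("|").split("|")]
        if len(cols) >= 3 and cols[-1] in states:
            ids.append(cols[0])
    return ids

print(stuck_ids(sample))  # ['0fc571fe-6ad1-4311-ab13-765b5526cd30']
```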
Pike, Queens
In OpenStack Pike or Queens environments with Octavia, if a gtw node
hosting the Octavia services has tenant network issues that make the
Octavia management network lb-mgmt-net unreachable from this gtw node,
the Octavia controller services lose connection to the amphora instances
and stop working properly.
Workaround:
If you run the Octavia services on all gtw nodes using
octavia_manager_cluster and only one gtw node has tenant network issues,
manually stop the Octavia controller services (octavia-health-manager,
octavia-housekeeping, octavia-worker) on the affected node until the
network issue on this node is resolved. In this case, the Octavia
controller services on the remaining nodes will continue working properly.
If you run the Octavia services only on the gtw01 node, manually stop the
Octavia controller services and choose from the following options:
Start the Octavia controller services on another gtw0x node:
Open your Git project repository with the Reclass model on the cluster level.
In cluster/<cluster_name>/infra/config/nodes.yml, change the node for the
Octavia services, for example, to gtw02:
parameters:
reclass:
storage:
node:
openstack_gateway_node02:
classes:
- cluster.${_param:cluster_name}.openstack.octavia_manager
params:
octavia_hm_bind_ip: ${_param:octavia_health_manager_node01_address}
Log in to the Salt Master node.
Apply the following states:
salt-call state.sls reclass.storage
salt '*' saltutil.refresh_pillar
salt -C 'I@neutron:client' state.sls neutron.client
salt '*' mine.update
For the gtw node where you moved the Octavia services, apply the Octavia
states. For example:
salt 'gtw02*' state.sls octavia
TECHNICAL PREVIEW Enable octavia_manager_cluster:
Open your Git project repository with the Reclass model on the cluster level.
In cluster/<cluster_name>/infra/config/init.yml, replace the following class:
- system.reclass.storage.system.openstack_gateway_single_octavia
with the following one:
- system.reclass.storage.system.openstack_gateway_cluster_octavia
Log in to the Salt Master node.
Apply the following states:
salt-call state.sls reclass.storage
salt '*' saltutil.refresh_pillar
salt -C 'I@neutron:client' state.sls neutron.client
salt '*' mine.update
salt -C "I@octavia:manager and not *01*" state.sls octavia
Pike, Queens
The Nova scheduler counts the disk space of volume-backed instances, which causes NoValidHostFound errors from Nova when booting an instance. The reason is that Nova considers the root volume size specified in the instance flavor to be consumed by that instance on the compute host even if the instance is booted from a Cinder volume and does not consume any disk resources on the compute host.
Workarounds:
If your cloud uses instances booted only or mostly from Cinder volumes, increase the disk overcommit ratio:
Open your Git project repository with the Reclass model on the cluster level.
In cluster/<cluster_name>/openstack/control.yml, increase the disk
allocation ratio as required using the disk_allocation_ratio parameter:
nova:
controller:
disk_allocation_ratio: <integer>
From the Salt Master node, apply the nova state:
salt 'ctl*' state.apply nova
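As a rough model of the scheduler arithmetic (simplified; real placement also weighs RAM, vCPUs, and other filters), increasing the ratio scales the disk capacity the scheduler believes a host has, so flavors with a large root disk still fit even though their instances boot from volumes:

```python
def schedulable_disk_gb(total_gb, allocated_gb, ratio):
    # Simplified model: the scheduler treats total_gb * ratio as usable
    # capacity and subtracts the disk that flavors of already-placed
    # instances claim, whether or not it is physically consumed.
    return total_gb * ratio - allocated_gb

# A hypothetical host with 1000 GB of local disk, where flavors of
# volume-backed instances already "claim" 900 GB:
print(schedulable_disk_gb(1000, 900, 1.0))  # 100  -> large flavors fail
print(schedulable_disk_gb(1000, 900, 3.0))  # 2100 -> new instances fit
```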
If only some instances boot from Cinder volumes, create separate flavors with a zero root disk size and use them when creating instances that boot from Cinder volumes.
Open your Git project repository with the Reclass model on the cluster level.
In cluster/<cluster_name>/openstack/control.yml, define a new flavor and
set disk to 0. Set other parameters as required. For example:
nova:
client:
enabled: true
server:
identity:
flavor:
flavor1:
flavor_id: 10
ram: 4096
disk: 0
vcpus: 1
From the Salt Master node, apply the novaclient state:
salt 'ctl*' state.apply novaclient
Pike, Queens
A Neutron port on a private network may receive traffic from other networks or VLANs during wiring. The workaround is to use iptables instead of the Open vSwitch security groups.
Workaround:
Open your Git project repository with the Reclass model on the cluster level.
In cluster/<cluster_name>/openstack/compute.yaml, set firewall_driver to
iptables_hybrid:
neutron:
compute:
firewall_driver: iptables_hybrid
Apply the neutron state from the Salt Master node:
salt -C 'I@neutron:server' state.sls neutron
Pike, Queens
The Keepalived service may fail during the upgrade from MCP versions lower than 2018.11.0 to 2019.2.0. The workaround is to disable Keepalived monitoring and enable it once you complete the upgrade.
Workaround:
Open your Git project repository with the Reclass model on the cluster level.
In classes/cluster/<cluster_name>/infra/init.yml, disable Keepalived
monitoring:
keepalived:
_support:
telegraf:
enabled: false
Refresh the pillars on all nodes:
salt -C "*" saltutil.refresh_pillar
Verify that no nodes respond to the following test.ping command:
salt -C "I@keepalived:_support:telegraf:enabled:True" test.ping
Apply the change:
salt -C 'I@telegraf:agent' state.sls telegraf
Once you complete the upgrade, revert step 2.
Apply the change:
salt -C 'I@telegraf:agent' state.sls telegraf
Pike, Queens
After deployment of an OpenStack environment, the VCP nodes may have an incorrect DNS server if MAAS is used. The reason is that during a VCP node boot, it obtains a DNS server from MAAS, which may differ from the DNS server specified in the deployment model.
To resolve the issue for the existing VCP nodes, remove the wrong DNS
server address from the /etc/resolv.conf configuration file.
To resolve the issue before deploying a new environment or adding new VCP
nodes to an existing environment, specify the network data in cloud-init.
To apply a workaround for existing VCP nodes:
Log in to the Salt Master node.
Obtain the MAAS DNS server address:
salt-call pillar.get maas:region:bind:host
Remove the MAAS DNS server address from the affected nodes:
salt -C '<target_compound>' cmd.run 'sed -i /<maas_server>/d /etc/resolv.conf'
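The sed expression simply drops every resolv.conf line that mentions the MAAS address. A Python sketch of the same edit, where 10.0.0.2 stands in as a hypothetical MAAS DNS address:

```python
def drop_nameserver(resolv_conf: str, maas_ip: str) -> str:
    # Keep every line that does not mention the MAAS DNS address,
    # mirroring: sed -i /<maas_server>/d /etc/resolv.conf
    kept = [line for line in resolv_conf.splitlines() if maas_ip not in line]
    return "\n".join(kept) + "\n"

before = "nameserver 10.0.0.2\nnameserver 172.18.176.6\n"
print(drop_nameserver(before, "10.0.0.2"))  # nameserver 172.18.176.6
```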
To apply a permanent solution for every new VCP node:
Log in to the Salt Master node.
Obtain the list of VCP nodes defined in the model:
salt '<any_kvm_node>' --out json pillar.get salt:control:cluster:internal:node | jq -r '.[] | keys[]'
Example of system response:
bmk01
cid01
cid02
cid03
...
prx01
prx02
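The jq filter '.[] | keys[]' takes each minion's pillar value and prints the node names defined under it. An equivalent in Python, using a hypothetical, shortened response shape:

```python
import json

# Hypothetical shortened shape of the pillar JSON the command returns:
# minion name -> mapping of VCP node definitions.
response = json.loads(
    '{"kvm01.local": {"cid01": {}, "prx01": {}, "bmk01": {}}}'
)

# Equivalent of the jq filter '.[] | keys[]' (jq sorts keys):
names = sorted(key for nodes in response.values() for key in nodes)
print(names)  # ['bmk01', 'cid01', 'prx01']
```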
Determine the VCP node pillars that contain the cloud-init data:
salt '<any_kvm_node>' --out yaml pillar.items | grep 'salt_control_cluster_node_cloud_init_'
Example of system response:
salt_control_cluster_node_cloud_init_openstack_control:
salt_control_cluster_node_cloud_init_openstack_dns:
salt_control_cluster_node_cloud_init_openstack_proxy:
salt_control_cluster_node_cloud_init_infra_storage:
salt_control_cluster_node_cloud_init_cicd_control:
salt_control_cluster_node_cloud_init_stacklight_telemetry:
Open your Git project repository with the Reclass model on the cluster level.
In classes/cluster/<cluster_name>/infra/kvm.yml, specify the network data
for the required nodes. For example, for the cid and prx nodes:
parameters:
_param:
salt_control_vcp_deploy_interface: 'ens2'
salt_control_vcp_deploy_interface_netmask: ${_param:deploy_network_netmask}
salt_control_vcp_deploy_interface_gateway: ${_param:deploy_network_gateway}
salt_control_vcp_dns_server_1: ${_param:dns_server01}
salt_control_vcp_dns_server_2: ${_param:dns_server02}
salt_control_cluster_node_cloud_init_network_data:
network_data:
links:
- type: 'phy'
id: ${_param:salt_control_vcp_deploy_interface}
name: ${_param:salt_control_vcp_deploy_interface}
services:
- type: "dns"
address: ${_param:salt_control_vcp_dns_server_1}
- type: "dns"
address: ${_param:salt_control_vcp_dns_server_2}
salt_control_common_network_data_networks_deploy_interface_no_dhcp_common: &common_no_dhcp_data
link: ${_param:salt_control_vcp_deploy_interface}
type: 'ipv4'
id: 'private-ipv4'
netmask: ${_param:salt_control_vcp_deploy_interface_netmask}
routes:
- gateway: ${_param:salt_control_vcp_deploy_interface_gateway}
network: '0.0.0.0'
netmask: '0.0.0.0'
salt_control_cluster_node_cloud_init_cicd_control:
network_data: ${_param:salt_control_cluster_node_cloud_init_network_data}
salt_control_cluster_node_cloud_init_openstack_proxy:
network_data: ${_param:salt_control_cluster_node_cloud_init_network_data}
salt:
control:
cluster:
internal:
node:
cid01:
cloud_init:
network_data:
networks:
- <<: *common_no_dhcp_data
ip_address: ${_param:cicd_control_node01_deploy_address}
cid02:
cloud_init:
network_data:
networks:
- <<: *common_no_dhcp_data
ip_address: ${_param:cicd_control_node02_deploy_address}
cid03:
cloud_init:
network_data:
networks:
- <<: *common_no_dhcp_data
ip_address: ${_param:cicd_control_node03_deploy_address}
prx01:
cloud_init:
network_data:
networks:
- <<: *common_no_dhcp_data
ip_address: ${_param:openstack_proxy_node01_deploy_address}
prx02:
cloud_init:
network_data:
networks:
- <<: *common_no_dhcp_data
ip_address: ${_param:openstack_proxy_node02_deploy_address}
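The network_data structure above follows the OpenStack network_data.json layout that cloud-init consumes. As a sanity check of the intent, this sketch shows how the dns services end up as resolv.conf entries after Reclass interpolation (the addresses are hypothetical values for dns_server01 and dns_server02):

```python
# Hypothetical rendered form of the network_data above.
network_data = {
    "links": [{"type": "phy", "id": "ens2", "name": "ens2"}],
    "services": [
        {"type": "dns", "address": "172.18.176.6"},
        {"type": "dns", "address": "172.18.224.6"},
    ],
}

# cloud-init turns each dns service entry into a nameserver line:
resolv_conf = "\n".join(
    "nameserver " + svc["address"]
    for svc in network_data["services"]
    if svc["type"] == "dns"
)
print(resolv_conf)
# nameserver 172.18.176.6
# nameserver 172.18.224.6
```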
Synchronize the Salt resources:
salt -C 'I@salt:control' saltutil.sync_all
Proceed with OpenStack environment deployment:
Automatically, as described in MCP Deployment Guide: Deploy an OpenStack environment.
Manually, by booting the VCP nodes:
salt -C 'I@salt:control' salt.control