Tungsten Fabric known issues and limitations¶
This section lists the Tungsten Fabric known issues with workarounds for the Mirantis OpenStack for Kubernetes release 21.6.
[10096] tf-control does not refresh IP addresses of Cassandra pods
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot
[15684] Pods fail when rolling Tungsten Fabric 2011 back to 5.1
[18148] TF resets BGP_ASN, ENCAP_PRIORITY, and VXLAN_VN_ID_MODE to defaults
[19195] Managed cluster status is flapping between the Ready/Not Ready states
Limitations¶
Tungsten Fabric does not provide the following functionality:
Automatic generation of network port records in DNSaaS (Designate), because Neutron with Tungsten Fabric as a backend is not integrated with DNSaaS. As a workaround, use the Tungsten Fabric built-in DNS service, which enables virtual machines to resolve each other's names.
Secret management (Barbican). You cannot use the certificates stored in Barbican to terminate HTTPS in a load balancer.
Role Based Access Control (RBAC) for Neutron objects.
Modification of custom vRouter DaemonSets based on the SR-IOV definition in the OsDpl CR.
[10096] tf-control does not refresh IP addresses of Cassandra pods¶
The tf-control service resolves the DNS names of Cassandra pods at startup and does not update them if the Cassandra pods get new IP addresses, for example, after a restart. As a workaround, to refresh the IP addresses of Cassandra pods, restart the tf-control pods one by one:
Caution
Before restarting the tf-control pods:
Verify that the new pods are successfully spawned.
Verify that no vRouters are connected to only one tf-control pod that will be restarted.
kubectl -n tf delete pod tf-control-<hash>
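The one-by-one restart can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names are hypothetical, the pods are selected by the tf-control- name prefix rather than by a label, and the checks from the Caution above remain a manual step that the operator confirms by pressing Enter.

```shell
# Sketch only: helper names are hypothetical and not part of the product.
# Assumption: tf-control pods can be identified by the "tf-control-" name prefix.
tf_control_pods() {
  kubectl -n tf get pods -o name | grep '^pod/tf-control-'
}

# Delete the pods one at a time and wait for operator confirmation in between.
restart_tf_control_one_by_one() {
  for pod in $(tf_control_pods); do
    kubectl -n tf delete "$pod"
    printf 'Deleted %s. Verify the replacement pod and vRouter connections, then press Enter: ' "$pod" >&2
    # Stop if no confirmation can be read, so the remaining pods stay untouched.
    read -r _reply || return 1
  done
}
```

Run restart_tf_control_one_by_one from an interactive shell so that each confirmation prompt is answered only after the checks pass.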
[13755] TF pods switch to CrashLoopBackOff after a simultaneous reboot¶
Rebooting all nodes of the Cassandra TFConfig or TFAnalytics cluster, performing maintenance, or any other circumstance that causes the Cassandra pods to start simultaneously may break the Cassandra TFConfig and/or TFAnalytics cluster. In this case, the Cassandra nodes do not join the ring and do not update the IP addresses of the neighbor nodes. As a result, the TF services cannot operate the Cassandra cluster(s).
To verify that a Cassandra cluster is affected:
Run the nodetool status command specifying the config or analytics cluster and the replica number:
kubectl -n tf exec -it tf-cassandra-<config/analytics>-dc1-rack1-<replica number> -c cassandra -- nodetool status
Example of system response with outdated IP addresses:
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  <outdated ip>  ?         256     64.9%             a58343d0-1e3f-4d54-bcdf-9b9b949ca873  r1
DN  <outdated ip>  ?         256     69.8%             67f1d07c-8b13-4482-a2f1-77fa34e90d48  r1
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  <actual ip>    3.84 GiB  256     65.2%             7324ebc4-577a-425f-b3de-96faac95a331  rack1
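To spot the affected entries quickly, the nodetool status output can be filtered for rows whose status letter is D (Down). The following is a minimal sketch: the function name cassandra_down_nodes is hypothetical, and it assumes that the Address column holds a single token such as an IP address.

```shell
# Sketch only: cassandra_down_nodes is a hypothetical helper name.
# Reads `nodetool status` output on stdin and prints the address of every row
# whose first column is D followed by a state letter (N, L, J, or M).
cassandra_down_nodes() {
  awk '$1 ~ /^D[NLJM]$/ { print $2 }'
}

# Possible usage (placeholders as in the command above):
# kubectl -n tf exec tf-cassandra-<config/analytics>-dc1-rack1-<replica number> \
#   -c cassandra -- nodetool status | cassandra_down_nodes
```

Empty output means that no node is reported as Down. Any printed address is a candidate outdated entry.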
Workaround:
Manually delete a Cassandra pod from the failed config or analytics cluster to re-initiate the bootstrap process for one of the Cassandra nodes:
kubectl -n tf delete pod tf-cassandra-<config/analytics>-dc1-rack1-<replica number>
[15684] Pods fail when rolling Tungsten Fabric 2011 back to 5.1¶
Some tf-control and tf-analytics pods may fail during the Tungsten Fabric rollback from version 2011 to 5.1. In this case, the control container of the tf-control pod and/or the collector container of the tf-analytics pod reports SYS_WARN messages such as:
… AMQP_QUEUE_DELETE_METHOD caused: PRECONDITION_FAILED - queue '<contrail-control/contrail-collector>.<nodename>' in vhost '/' not empty …
The workaround is to manually delete the queue that fails to be deleted by AMQP_QUEUE_DELETE_METHOD:
kubectl -n tf exec -it tf-rabbitmq-<num of replica> -- rabbitmqctl delete_queue <queue name>
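If the queue name is not evident from the log message, one way to find it is to list the queues with their message counts and keep only the non-empty ones that follow the naming pattern from the warning. The sketch below is illustrative: the function name stuck_contrail_queues is hypothetical, and it assumes the tab-separated output of rabbitmqctl list_queues name messages.

```shell
# Sketch only: stuck_contrail_queues is a hypothetical helper name.
# Reads the output of `rabbitmqctl list_queues name messages` on stdin
# (tab-separated: queue name, message count) and prints the names of
# non-empty contrail-control.* and contrail-collector.* queues.
stuck_contrail_queues() {
  awk -F'\t' '$1 ~ /^contrail-(control|collector)\./ && $2 > 0 { print $1 }'
}

# Possible usage (no -t, so that the piped output has no terminal line endings):
# kubectl -n tf exec tf-rabbitmq-<num of replica> -- \
#   rabbitmqctl list_queues name messages | stuck_contrail_queues
```

Each printed name can then be passed as <queue name> to the delete_queue command above.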
[18148] TF resets BGP_ASN, ENCAP_PRIORITY, and VXLAN_VN_ID_MODE to defaults¶
During LCM operations such as a Tungsten Fabric update or upgrade, the following parameters defined by the cluster administrator are reset to the following defaults upon the tf-config pod restart:
BGP_ASN to 64512
ENCAP_PRIORITY to MPLSoUDP,MPLSoGRE,VXLAN
VXLAN_VN_ID_MODE to automatic
As a workaround, manually set the values of the required parameters if they differ from the defaults:
controllers:
  tf-config:
    provisioner:
      containers:
      - env:
        - name: BGP_ASN
          value: <USER_BGP_ASN_VALUE>
        - name: ENCAP_PRIORITY
          value: <USER_ENCAP_PRIORITY_VALUE>
        name: provisioner
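After an LCM operation, it may help to check whether any of the three parameters is back at its default. The following sketch compares NAME=value lines against the defaults listed above. The function name params_at_defaults is hypothetical, and reading the values from the environment of the provisioner container is an assumption, not a documented procedure.

```shell
# Sketch only: params_at_defaults is a hypothetical helper name.
# Reads NAME=value lines on stdin (for example, the output of `env`) and
# prints each of the three parameters that currently holds its default value.
params_at_defaults() {
  awk -F= '
    $1 == "BGP_ASN"          && $2 == "64512"                   { print $1 }
    $1 == "ENCAP_PRIORITY"   && $2 == "MPLSoUDP,MPLSoGRE,VXLAN" { print $1 }
    $1 == "VXLAN_VN_ID_MODE" && $2 == "automatic"               { print $1 }
  '
}

# Possible usage (assumption: the variables are visible in the provisioner container):
# kubectl -n tf exec <tf-config pod> -c provisioner -- env | params_at_defaults
```

A parameter printed here needs attention only if the cluster administrator had set it to a non-default value before the operation.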
[19195] Managed cluster status is flapping between the Ready/Not Ready states¶
The status of a managed cluster may flap between the Ready and Not Ready states in the Container Cloud web UI. In this case, if the cluster Status field includes a message about not ready tf/tf-tool-status-aggregator and/or tf-tool-status-party deployments with 1/1 replicas, the status flapping may be caused by frequent updates of these deployments by the Tungsten Fabric Operator.
Workaround:
Verify whether the tf/tf-tool-status-aggregator and tf-tool-status-party deployments are up and running:
kubectl -n tf get deployments
Safely disable the tf/tf-tool-status-aggregator and tf-tool-status-party deployments through the TFOperator CR:
spec:
  controllers:
    tf-tool:
      status:
        enabled: false
      statusAggregator:
        enabled: false
      statusThirdParty:
        enabled: false
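The check in the first workaround step can be narrowed to the status deployments only. The sketch below filters the default kubectl get deployments table, assuming its usual NAME and READY columns. The function name not_ready_status_deployments is hypothetical.

```shell
# Sketch only: not_ready_status_deployments is a hypothetical helper name.
# Reads `kubectl -n tf get deployments` output on stdin and prints every
# tf-tool-status-* deployment whose READY column (ready/desired) does not match.
not_ready_status_deployments() {
  awk '$1 ~ /^tf-tool-status-/ {
    split($2, ready, "/")
    if (ready[1] != ready[2]) print $1
  }'
}

# Possible usage:
# kubectl -n tf get deployments | not_ready_status_deployments
```

Empty output means that all tf-tool-status-* deployments report the desired number of ready replicas. In that case, flapping in the web UI points to the frequent updates described above.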