Merative SPM on Kubernetes

Known issues

Some issues might require product changes to resolve them.

Helm Charts

Storage initialized using a runbook version older than 20.9.0

If storage was initialized using a version of the runbook older than 20.9.0, you must update your override values to include a supplemental group:

global:
  mq:
    security:
      context:
        supplementalGroups: [888]

This is due to an update in IBM MQ, the details of which can be found here.
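As a minimal sketch of applying this setting, the fragment can be kept in its own values file and passed to Helm alongside your existing overrides. The file name `mq-supplemental-groups.yaml`, the release name, and the chart reference below are placeholders, not names taken from this page; you could equally merge the key into your existing override file.

```shell
# Write the supplemental group override to a separate values file.
# The file name is a hypothetical choice for this example.
cat > mq-supplemental-groups.yaml <<'EOF'
global:
  mq:
    security:
      context:
        supplementalGroups: [888]
EOF

# Then pass it after your existing override file so that it takes precedence.
# <release-name>, <chart>, and <your-overrides> are placeholders:
# helm upgrade <release-name> <chart> -f <your-overrides>.yaml -f mq-supplemental-groups.yaml
```

When several `-f` files are given, Helm gives priority to the last one, so the extra file only adds the missing key without replacing your other values.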

WebSphere Liberty

Context Root Not Found error on the SPM home page, BIRT, or caseload summary pages

The Context Root Not Found error occurs because the deployment does not install the BIRT application.

The logs contain occurrences of the CWWKS4001I message

For example:

[1/22/19 8:48:18:272 GMT] 000000ba com.ibm.ws.security.token.internal.TokenManagerImpl I CWWKS4001I: The security token cannot be validated. This can be for the following reasons:
1. The security token was generated on another server using different keys.
2. The token configuration or the security keys of the token service that created the token has been changed.
3. The token service that created the token is no longer available.

The root cause is that users did not clear the browser cache after the application was redeployed, so their browsers might still hold old cookies.

After a redeployment or an upgrade, the application does not recognize the cookies that the browser presents to it, which causes the messages in the logs.

IBM MQ XAER_PROTO issue

The following issue has been observed in the JMS consumer pods when a newly deployed pod processes its initial JMS transactions. The exception is thrown by the IBM MQ Resource Adapter.

[INFO ] FFDC1015I: An FFDC Incident has been created: "javax.transaction.xa.XAException: The method 'xa_start' has failed with errorCode '-6'. com.ibm.tx.jta.impl.JTAXAResourceImpl.start 307" at ...
[INFO ] FFDC1015I: An FFDC Incident has been created: "javax.transaction.xa.XAException: The method 'xa_start' has failed with errorCode '-6'. com.ibm.tx.jta.impl.RegisteredResources.startRes 1053" at ...
[ERROR ] WTRN0078E: An attempt by the transaction manager to call start on a transactional resource has resulted in an error.
The error code was XAER_PROTO. The exception stack trace follows: javax.transaction.xa.XAException: The method 'xa_start' has failed with errorCode '-6'.
at com.ibm.mq.jmqi.JmqiXAResource.start(JmqiXAResource.java:980)
at com.ibm.mq.connector.xa.XARWrapper.start(XARWrapper.java:680)
at com.ibm.ws.Transaction.JTA.JTAResourceBase.start(JTAResourceBase.java:121)
at [internal classes]
at com.ibm.mq.connector.inbound.AbstractWorkImpl.run(AbstractWorkImpl.java:210)

The issue does not affect the state of the queue manager. It disrupts neither the connection between the JMS consumer pods and the IBM MQ queue manager nor the ability of the JMS consumers to process JMS messages.

Timeout messages WTRN0006W and WTRN0124I

WebSphere Liberty messages WTRN0006W and WTRN0124I may appear in messages.log as shown here:

WTRN0006W: Transaction 0000017A49B1BD8E0000000125A119FC115513F2FEA8810BAA357685E0C649F93911ADC10000017A49B1BD8E0000000125A119FC115513F2FEA8810BAA357685E0C649F93911ADC100000001 has timed out after 180 seconds.
WTRN0124I: When the timeout occurred the thread with which the transaction is, or was most recently, associated was Thread[Default Executor-thread-4,5,Default Executor Thread Group]. The stack trace of this thread when the timeout occurred was:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:218)
com.ibm.ws.threading.internal.BoundedBuffer.waitGet_(BoundedBuffer.java:176)
com.ibm.ws.threading.internal.BoundedBuffer.take(BoundedBuffer.java:647)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1085)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)

When these messages appear with a stack trace similar to the one above, they do not indicate a relevant error or a hung thread, as explained in this WebSphere Application Server support page, which also covers Liberty.
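To triage a large messages.log, a rough heuristic is to count how many WTRN0124I entries are followed by a stack trace that sits in `BoundedBuffer.take`, as in the example above. The function below is a sketch under that assumption only; the function name is hypothetical, and the pattern used to recognize an idle executor thread is inferred from the sample trace on this page rather than from any official rule.

```shell
# count_idle_timeouts FILE
# Prints "<idle> <total>": the number of WTRN0124I entries whose stack trace
# contains BoundedBuffer.take (thread waiting for work), and the total
# number of WTRN0124I entries in FILE.
count_idle_timeouts() {
  awk '
    # Each WTRN0124I message starts a new stack trace.
    /WTRN0124I/ { total++; in_trace = 1; seen = 0; next }
    # A new timestamped record or another WTRN message ends the trace.
    in_trace && (/^\[/ || /WTRN[0-9]+[A-Z]:/) { in_trace = 0 }
    # Count the trace once if it shows the thread waiting for work.
    in_trace && !seen && /com\.ibm\.ws\.threading\.internal\.BoundedBuffer\.take/ { idle++; seen = 1 }
    END { printf "%d %d\n", idle, total }
  ' "$1"
}
```

For example, `count_idle_timeouts messages.log` prints two numbers. If they are equal, every reported timeout matches the pattern described above; any difference points to traces that are worth reading individually.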

Method calls that cross the client/server boundary

Some application customizations may cause the client application to reference classes in the server, which is not possible because of the isolation and classloading policies. This results in exceptions such as java.lang.ClassCastException or java.lang.NoClassDefFoundError.

Workaround for client/server boundary errors

The workaround is to make the referenced class(es) available to both the client and server applications. For example, if the class is in the struct.jar file you can modify the build of your client application image (Client.ear) to include the jar file. Before building and deploying the application images, modify the ClientEAR.Dockerfile file to include these lines at the end:

RUN if [ "$EAR_NAME" = "Curam" ] ; then \
      cp /config/apps/CuramServerCode.ear/struct.jar /opt/ibm/wlp/usr/servers/defaultServer/apps/${EAR_NAME}.ear/ClientModule.war/WEB-INF/lib \
      && cp /config/apps/CuramServerCode.ear/struct.jar /config/apps/${EAR_NAME}.ear/ClientModule.war/WEB-INF/lib ; \
    fi
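After rebuilding the image, it can be useful to confirm that the copy took effect. The helper below is a hypothetical sketch, not part of the product tooling: it checks for struct.jar under the client module path used in the Dockerfile lines above and assumes you run it in a shell inside a container started from the rebuilt client image.

```shell
# check_struct_jar APPS_DIR [EAR_NAME]
# Succeeds if struct.jar exists in the client module's WEB-INF/lib directory.
# EAR_NAME defaults to "Curam", matching the Dockerfile condition above.
check_struct_jar() {
  apps_dir=$1
  ear_name=${2:-Curam}
  jar="$apps_dir/$ear_name.ear/ClientModule.war/WEB-INF/lib/struct.jar"
  if [ -f "$jar" ]; then
    echo "found: $jar"
  else
    echo "missing: $jar" >&2
    return 1
  fi
}
```

For example, `check_struct_jar /config/apps` checks the second copy target from the Dockerfile.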

Why the error occurs

As part of the SPM journey to Kubernetes, SPM made a number of changes in the areas of transaction isolation, messaging architecture, and elasticity so that SPM could leverage the benefits of Kubernetes.

For more information about SPM Kubernetes architecture changes, see Kubernetes architecture in the Social Program Management SPM on Kubernetes Guide.

The Social Program Management PDF documentation is available to download from Merative Support Docs.

Part of these changes included splitting the Curam applications into separate server and client .ear files, which provided the necessary flexibility and isolation when deploying and running in Kubernetes. However, classes that cross the client/server boundary cause errors due to this isolation. For SPM product classes that have this requirement, the Liberty shared folder is used. These product classes are the EJB local interfaces. For more information about EJB local interfaces, see Transaction isolation in the Social Program Management SPM on Kubernetes Guide.

For the Liberty classloader, SPM specifies the parentFirst classloader policy. This was done to increase cohesion and lower coupling between the client and server. With this policy, classes are loaded from the application server libraries before those in the application. The shared folder is loaded before the applications, and it loads all the common classes used by both the server code (CuramServerCode.ear) and the client code (Curam.ear). The server and client then load their own classes, reusing those already loaded from the shared folder.

Minikube

Minikube dashboard command on Red Hat

When you follow the steps in Kubernetes dashboard in Monitoring the application, the minikube dashboard command might not work on Red Hat operating systems. For more information, see Minikube issue 5815.

Limitations when using the Minikube none driver

There are a number of limitations associated with the Minikube none driver that are documented in Minikube documentation. You must evaluate the impacts of these limitations for your implementation.

In particular, the minikube ssh command is unavailable with the none driver, which might make it difficult to analyze and resolve problems. Switching to the kvm2 driver, for example, enables the use of minikube ssh and resolves issues with the Docker Registry.
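A minimal sketch of switching drivers follows. It assumes that KVM and libvirt are already installed on the host and that you can afford to lose the existing local cluster, because Minikube cannot change the driver of an existing cluster in place. The function name is hypothetical.

```shell
# Sketch only: recreates the local Minikube cluster with the kvm2 driver.
# WARNING: `minikube delete` removes the existing cluster and its data.
# The final command confirms that a shell inside the Minikube VM is reachable.
recreate_minikube_with_kvm2() {
  minikube delete &&
    minikube start --driver=kvm2 &&
    minikube ssh "uname -a"
}
```

Defining the function does not change anything. Call `recreate_minikube_with_kvm2` only after you have evaluated the impact, and then redeploy the application to the new cluster.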