Merative SPM on Kubernetes

Monitoring XML servers

Introduction

Monitoring of the XML server is available through:

  • Monitoring XML server statistics
  • Monitoring the XML server JVM
  • XML server health checks

Monitoring XML server statistics

SPM XML servers produce statistics for each job processed. For more information about XML server architecture, see XML server architecture in the Social Program Management XML Infrastructure Guide.

The Social Program Management PDF documentation is available to download from Merative Support Docs.

Out of the box, these statistics are written to text files in the /opt/ibm/Curam/xmlserver/stats/ folder, but they are not directly accessible to Prometheus or other monitoring tools.

SPM 8.0.1.0: This topic describes a sample implementation that makes XML server statistics available to Prometheus. Accessing XML server statistics in a Kubernetes deployment requires starting the XML server with the -forcestatswrite start option.

This option is only available in SPM V8.0.1 or higher. For more information about Statistics and starting the XML server, see Statistics and Starting the XML server in the Social Program Management XML Infrastructure Guide.

Sample implementation

The sample implementation is based on a proof-of-concept performed to illustrate the capability. Artifacts used for the proof-of-concept are provided as samples in the GitHub repo.

Notable decisions and limitations of the proof-of-concept:

  • The Prometheus client_java library is used to implement the Prometheus collector and exporter, hence they are coded in Java.
  • A sidecar container is used to run the Prometheus collector and exporter, as this allows for:
    • Avoiding any Kubernetes or Prometheus-specific code changes to the XML server code base. This is achieved by the use of persistent storage, which is described in more detail below.
    • Avoiding additional processes in the xmlserver container, which could affect the main xmlserver process.

At a high-level, providing XML server statistics to Prometheus involves the following steps:

  • Implementing a Prometheus collector and exporter to make the /opt/ibm/Curam/xmlserver/stats/ThreadPoolWorker-* file contents available to Prometheus
  • Creating a Docker image for the sidecar container that contains the Prometheus collector and exporter
  • Defining a sidecar container in the xmlserver deployment
  • Defining the necessary Prometheus configuration for the Kubernetes cluster. The sample deployment provided utilizes the OpenShift PodMonitor.

The role of persistent storage in monitoring XML server statistics

Persistent storage is required by this sample implementation so that:

  • The xmlserver-metrics sidecar container has access to the same persistent file system as the xmlserver container.
  • As the XML server writes statistics to the files in its ./stats folder, those same statistics records are available to the xmlserver-metrics sidecar.
  • No changes are required to the XML server code base or the xmlserver container.

The following describes the files involved in the use of persistent storage and how they interact:

  • For the xmlserver-metrics container, dockerfiles/Liberty/xmlserver-metrics/XMLServer-metrics.Dockerfile creates the /opt/ibm/Curam/xmlserver/ folder and the ./tmp and ./stats subfolders in the container’s local file system.
  • In helm-charts/xmlserver/templates/deployment.yaml, the xmlserver-metrics container defines the same volumeMounts: as the xmlserver container, and both reference the deployment’s persistentVolumeClaim:.
  • In the xmlserver-metrics container, the start-prometheus_collector_exporter.sh startup shell script ensures the persistent mount point is available and creates symbolic links from the container’s file system to the persistent file system.

More details on the above files can be found in the relevant sections that follow.

Prometheus collector and exporter

This section describes the Prometheus collector and exporter code provided in the xmlserver_prometheus.jar and xmlserver_prometheus-sources.jar files that reside in the dockerfiles/Liberty/xmlserver-metrics/ folder. You may find the following information useful in understanding the sample implementation and how to develop an implementation suitable for your specific environment and use of the XML server.

Choosing XML server statistics

The XML server records the following statistics for each job. For more information, see Statistics in the Social Program Management XML Infrastructure Guide.

Statistic | Description
Success | true, false (failed)
Job preview type | PDF, HTML, TEXT, RTF
Elapsed connection | elapsed time (milliseconds) from connection start to connection close
Elapsed job | job run time (milliseconds)
Elapsed job preview send | time (milliseconds) to send the preview data to the client
Job preview data length | length of the preview data (in bytes) sent to the client
Timestamp | timestamp (Java timestamp value) when the secure connection entered the system
Template ID | template ID processed
Template version | template version number
Template locale | locale of the template

For the proof-of-concept, we used the following statistics as illustrative examples:

  • Success - reported as success or failure
  • Job preview type - PDF, HTML, TEXT, or RTF
  • Elapsed connection
  • Job preview data length

Your own requirements may differ, depending on how your application uses the XML server.

Coding the Prometheus metric variables

These Prometheus client_java APIs illustrate coding of the above proof-of-concept statistics:

Jobs run:

import io.prometheus.client.Counter;

/** Prometheus counter - of xmlserver jobs. */
private static final Counter jobTotals = Counter.build()
    .namespace("curam_xmlserver")
    .name("jobs_total")
    .help("Job counts with labels for status=success|fail ; type=PDF|HTML|TEXT|RTF")
    .labelNames("status", "type")
    .register();

Job elapsed, or connect, time:

import io.prometheus.client.Histogram;

/** Prometheus histogram - job connect time. */
static final Histogram jobConnectTime = Histogram.build()
    .namespace("curam_xmlserver")
    .name("job_connect_seconds")
    .help("Job connect time")
    .buckets(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9,
             1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0)
    .register();

Job bytes processed:

/** Prometheus histogram - job data length. */
static final Histogram jobDataLength = Histogram.build()
    .namespace("curam_xmlserver")
    .name("job_data_length_bytes")
    .help("Job preview data length")
    .buckets(5000, 7000, 9000, 10000, 12000, 14000, 15000,
             20000, 25000, 30000, 35000, 50000)
    .register();
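
Prometheus histogram buckets are cumulative: each observation is counted in every bucket whose upper bound is greater than or equal to the observed value, and always in the implicit +Inf bucket. The following plain-Java sketch illustrates that behavior using the connect-time bucket bounds shown above. It does not use client_java; the class and method names are hypothetical. The conversion from milliseconds to seconds is an assumption based on the metric name job_connect_seconds and on the statistics being recorded in milliseconds.

```java
import java.util.Arrays;

/** Hypothetical illustration of cumulative histogram buckets; not part of the sample jar. */
class BucketSketch {

    /** Upper bounds (seconds) copied from the jobConnectTime histogram definition. */
    static final double[] CONNECT_BOUNDS = {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9,
            1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0};

    /** Assumption: elapsed milliseconds from the stats file are converted to seconds. */
    static double millisToSeconds(long millis) {
        return millis / 1000.0;
    }

    /** Returns cumulative counts per bucket; the last element is the implicit +Inf bucket. */
    static long[] cumulativeCounts(double[] bounds, double[] observations) {
        long[] counts = new long[bounds.length + 1];
        for (double value : observations) {
            for (int i = 0; i < bounds.length; i++) {
                if (value <= bounds[i]) {
                    counts[i]++; // every bucket with bound >= value is incremented
                }
            }
            counts[bounds.length]++; // +Inf bucket counts every observation
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] observed = {millisToSeconds(250), millisToSeconds(1200), millisToSeconds(7000)};
        System.out.println(Arrays.toString(cumulativeCounts(CONNECT_BOUNDS, observed)));
    }
}
```

An observation above the largest bound (5.0 seconds here) is visible only in the +Inf bucket, which is one reason to choose bucket bounds that cover the job times you expect in your environment.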

Java code for the Prometheus collector and exporter

This is a high-level, pseudo-code description of the Java implementation of the Prometheus collector and exporter:

main() method:

  • Process optional arguments
    • -httpPort <port> - Port on which the exporter listens for Prometheus scrape requests. Defaults to: 8080
    • -checkInterval <seconds> - Number of seconds between checks for new data in the ./stats/ folder and for termination. Defaults to: 10
  • Start the collector thread
  • Start the exporter thread
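
As a minimal sketch of the argument handling outlined above, the following code parses the two documented options and applies the documented defaults. The class name, field names, and the decision to reject unknown options are assumptions for illustration; the sample jar may handle arguments differently.

```java
/** Hypothetical argument parser mirroring the documented options and defaults. */
class CollectorArgs {
    final int httpPort;
    final int checkIntervalSeconds;

    CollectorArgs(int httpPort, int checkIntervalSeconds) {
        this.httpPort = httpPort;
        this.checkIntervalSeconds = checkIntervalSeconds;
    }

    static CollectorArgs parse(String[] args) {
        int port = 8080;   // documented default for -httpPort
        int interval = 10; // documented default for -checkInterval
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "-httpPort":
                    port = Integer.parseInt(requireValue(args, ++i, "-httpPort"));
                    break;
                case "-checkInterval":
                    interval = Integer.parseInt(requireValue(args, ++i, "-checkInterval"));
                    break;
                default:
                    // Assumption: unknown options are treated as errors.
                    throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        return new CollectorArgs(port, interval);
    }

    private static String requireValue(String[] args, int index, String option) {
        if (index >= args.length) {
            throw new IllegalArgumentException("Missing value for " + option);
        }
        return args[index];
    }

    public static void main(String[] args) {
        CollectorArgs parsed = parse(args);
        System.out.println("httpPort=" + parsed.httpPort
                + " checkInterval=" + parsed.checkIntervalSeconds);
    }
}
```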

Collector thread:

  • Identify all ThreadPoolWorker-* files in the ./stats folder
  • While the thread is not interrupted
    • Iterate over each file
      • If there is content
        • If the content has not been previously read
          • Create a java.io.RandomAccessFile in read mode
          • Bypass previously read content
          • Process each new line in the file
            • Call the io.prometheus.client.Counter.inc() method to increment status (success|fail) and type (PDF|HTML|TEXT|RTF)
            • Call the io.prometheus.client.Histogram.observe(double amt) method for elapsed time
            • Call the io.prometheus.client.Histogram.observe(double amt) method for bytes processed
          • Record the new number of bytes read from the file
      • Sleep for the specified interval
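
The "bypass previously read content" step can be sketched as follows: remember how many bytes of each file have been consumed, seek to that offset on the next pass, and read only what was appended. This is an illustration of the technique and not the code in xmlserver_prometheus.jar. The class and method names are hypothetical, and the sketch assumes the XML server appends whole lines to each statistics file.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of incremental reads from statistics files. */
class StatsTailer {

    /** Bytes already consumed, keyed by file path. */
    private final Map<String, Long> offsets = new HashMap<>();

    /** Returns the lines appended to the file since the previous call for that file. */
    List<String> readNewLines(File file) {
        List<String> lines = new ArrayList<>();
        long start = offsets.getOrDefault(file.getPath(), 0L);
        if (file.length() <= start) {
            return lines; // no content, or nothing new since the last pass
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(start); // bypass previously read content
            String line;
            while ((line = raf.readLine()) != null) {
                lines.add(line);
            }
            offsets.put(file.getPath(), raf.getFilePointer()); // record bytes read so far
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    /** Demo helper: appends text to a file, creating it if needed. */
    static void append(File file, String text) {
        try (FileWriter writer = new FileWriter(file, true)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File file = new File(System.getProperty("java.io.tmpdir"),
                "ThreadPoolWorker-demo-" + System.nanoTime());
        StatsTailer tailer = new StatsTailer();
        append(file, "first record\n");
        System.out.println(tailer.readNewLines(file)); // only the first record
        append(file, "second record\n");
        System.out.println(tailer.readNewLines(file)); // only the second record
        file.delete();
    }
}
```

In a real collector, each returned line would be parsed and passed to the counter and histogram calls listed above.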

Exporter thread:

  • Using the specified port create a new io.prometheus.client.exporter.HTTPServer.HTTPServer(int port)
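
To show what the exporter does at the HTTP level, the following sketch serves metrics text on a /metrics path. Note that it deliberately swaps in the JDK's built-in com.sun.net.httpserver.HttpServer so that it runs without client_java; the sample implementation uses io.prometheus.client.exporter.HTTPServer instead. The class and method names are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

/** Hypothetical exporter sketch using the JDK HTTP server instead of client_java. */
class MetricsEndpointSketch {

    /** Renders one counter sample in the Prometheus text exposition format. */
    static String renderJobsTotal(String status, String type, long count) {
        return "# TYPE curam_xmlserver_jobs_total counter\n"
                + "curam_xmlserver_jobs_total{status=\"" + status + "\",type=\"" + type + "\"} "
                + count + "\n";
    }

    /** Starts a server that answers requests under /metrics with the supplied text. */
    static HttpServer start(int port, Supplier<String> metricsText) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            server.createContext("/metrics", exchange -> {
                byte[] body = metricsText.get().getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders()
                        .set("Content-Type", "text/plain; version=0.0.4; charset=utf-8");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the payload only; call start(8080, ...) to serve it over HTTP.
        System.out.print(renderJobsTotal("success", "PDF", 3));
    }
}
```

With client_java, the rendering and HTTP handling are provided by the library, so the collector only needs to update the registered metrics.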

Creating the Docker image for the sidecar container

In support of building the sidecar Docker image the following files are included in the dockerfiles/Liberty/xmlserver-metrics folder of the GitHub repo:

  • The Dockerfile that builds the image: XMLServer-metrics.Dockerfile
  • The sample proof-of-concept implementation executable, xmlserver_prometheus.jar, and source, xmlserver_prometheus-sources.jar
  • The start shell script, ./xmlserver-metrics/start-prometheus_collector_exporter.sh, which is invoked by the xmlserver-metrics deployment’s cmd: structure. When invoked, it:
    • Ensures the persistent storage mount point is available and the expected directories exist.
    • Starts the Prometheus collector/exporter Java program in the background.
    • Waits for the Java program to terminate and logs the end of the process.
  • The stop shell script, ./xmlserver-metrics/stop-prometheus_collector_exporter.sh, which is invoked by the xmlserver-metrics deployment’s lifecycle: structure. When invoked, it:
    • Cleans up the .pid file created when the Java process was started.

Building the Docker image

Building the xmlserver-metrics image uses a similar process to building the other Docker images. To build the sidecar image run the following commands:

Note: In the following commands, the value of $PROJECT will change depending on deployment type:

  • For a Minikube deployment the value of $PROJECT is arbitrary.
  • For a Kubernetes deployment the value of $PROJECT must equate to a valid namespace in the customer’s account.
  • For an OpenShift deployment the value of $PROJECT must equate to a valid project. For this runbook it is best to stick with the value chosen for $PROJECT in Creating a CRC project.

cd $SPM_HOME/dockerfiles/Liberty/
docker build \
  --tag $DOCKER_REGISTRY/$PROJECT/xmlserver-metrics:latest \
  --build-arg "BASE_REGISTRY=registry.connect.redhat.com/" \
  --file xmlserver-metrics/XMLServer-metrics.Dockerfile .

Deploying the sidecar container and configuring for Prometheus

Sample files are provided to define the sidecar and the Prometheus configuration. Helm installs xmlserver-metrics only when these settings are provided:

  • global.apps.common.persistence.enabled: true
  • global.useBetaFeatures: true
  • xmlserver.metrics.enabled: true

For more information see the XML Server Configuration Reference.

The following files are included in the helm-charts/xmlserver/ folder of the GitHub repo for deploying the xmlserver-metrics sidecar container and configuring it for the XML server:

  • ./templates/deployment.yaml - defines the xmlserver-metrics sidecar
  • ./templates/monitor-xmlserver.yaml - defines a PodMonitor

The Prometheus port, xmlmetrics, referenced in both deployment files defaults to 8080.

Confirming the status of the xmlserver-metrics container

Once deployed, a Kubernetes get pods command will show how many of the containers are ready. For the xmlserver deployment it should show 2/2 (two of two) ready.

For deployment issues use the Kubernetes describe command against the pod to confirm the deployment of the xmlserver-metrics container.

You can view the xmlserver-metrics stdout using the Kubernetes logs command. You must specify -c xmlserver-metrics on the command to select the container; otherwise, it defaults to the xmlserver container. While the xmlserver-metrics container is running, log4j output is written to StatsForPrometheus.log in the pod’s persistent tmp folder: /opt/ibm/Curam/xmlserver/tmp.

Interacting with the XML server statistics

While ultimately you will want to view the XML server statistics in Prometheus or Grafana (not covered here), command line interactions are useful for testing and quickly getting performance indicators.

For instance:

  • Using the curl command in a pod’s shell you can get the current metrics (remember that with two containers your Kubernetes exec command must use -c xmlserver-metrics to enter the correct shell), for example:
    • curl -s http://localhost:8080/metrics - displays all the Prometheus data and help text
    • curl -s http://localhost:8080/metrics | grep "curam_xmlserver_jobs_total.status=.success" - displays the count of successful XML server jobs
    • curl -s http://localhost:8080/metrics | grep "curam_xmlserver_jobs_total.status" - displays the count of successful and failed jobs
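
If you want to process the /metrics output in a program rather than with grep, the following sketch sums the curam_xmlserver_jobs_total samples by status. The class and method names are hypothetical, and the sample text in main() is made-up illustrative data, not output captured from a real deployment.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for curam_xmlserver_jobs_total samples in /metrics output. */
class JobTotalsParser {

    // Matches: metric name, {labels}, value, and an optional trailing timestamp.
    private static final Pattern SAMPLE = Pattern.compile(
            "^curam_xmlserver_jobs_total\\{([^}]*)\\}\\s+(\\S+)(\\s+-?\\d+)?$");
    private static final Pattern STATUS = Pattern.compile("status=\"([^\"]*)\"");

    /** Sums job counts across all job types, keyed by the status label. */
    static Map<String, Double> totalsByStatus(String metricsText) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (String line : metricsText.split("\\R")) {
            Matcher sample = SAMPLE.matcher(line.trim());
            if (!sample.matches()) {
                continue; // skips # HELP, # TYPE, and all other metrics
            }
            Matcher status = STATUS.matcher(sample.group(1));
            if (status.find()) {
                totals.merge(status.group(1), Double.parseDouble(sample.group(2)), Double::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        String text = "# TYPE curam_xmlserver_jobs_total counter\n"
                + "curam_xmlserver_jobs_total{status=\"success\",type=\"PDF\",} 3.0\n"
                + "curam_xmlserver_jobs_total{status=\"fail\",type=\"PDF\",} 1.0\n";
        System.out.println(totalsByStatus(text));
    }
}
```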

Monitoring the XML server JVM

The performance of the Java Virtual Machine (JVM) running the XML server directly impacts the performance of the XML server.

The xmlserver image includes the Prometheus JMX Exporter and you can configure your deployment to enable monitoring of the XML server JVM. Your Helm override file needs to specify the following to enable the monitoring:

  • global.useBetaFeatures: true
  • xmlserver.jvmStats.enabled: true

Additionally you can configure the port and JMX Exporter configuration. This is the default configuration:

  • xmlserver.jvmStats.port: 8083
  • xmlserver.jvmStats.configYaml: ''

How you configure the JMX Exporter depends on how you consume the data.

XML server health checks

SPM 8.1.0.0: Readiness and liveness probes are available for xmlserver deployments. These probes complement each other to ensure that the xmlserver service is not accessed until it is ready and that it remains available.

In support of the probes, a folder, xmlserverprobes, has been added to the /opt/ibm/Curam/xmlserver folder in xmlserver containers.

To use these probes, enable them in your Helm override file. Depending on your local environment, you may need to adjust the timing values to achieve successful operation. See the XML Server Configuration Reference topic and the Kubernetes Configure Liveness, Readiness and Startup Probes page for more information.

readinessProbe

The readinessProbe depends on this info-level log4j message: "XML Server awaiting connections on port". If you have modified the default xmlserver log4j configuration to suppress this message, you cannot use the readinessProbe.

livenessProbe

The livenessProbe adds messages such as these to the /opt/ibm/Curam/xmlserver/tmp/xmlserver.log file for each probe invocation (by default, every 30 seconds):

[xmlserver] [INFO] [XMLConnectionHandler] Mon Apr 03 12:28:02 BST 2023 Connection received.
[xmlserver] [INFO] [XMLConnectionHandler] Awaiting job configuration.
[xmlserver] [INFO] [XMLConnectionHandler] Probe check received.