Monitoring webMethods components with Prometheus and Grafana

This article is a quick introduction to monitoring Software AG product components with Prometheus and Grafana in an environment that also contains third-party components. It covers the Micro Services Runtime (10.3 and later) and touches on Integration Server 9.9 (the same approach should also work for earlier and later versions).

Applicable Versions: MSR 10.3 and later, IS 9.9 and later

Overview

Monitoring components to ensure that they are available and functioning correctly is a requirement of almost any project. The Software AG products can mostly be monitored via Command Central, but that monitoring is limited. It doesn’t cater for third-party products (customers, quite rightly, don’t want to use multiple products to monitor their environment). It also doesn’t provide the type of metrics that operations teams are interested in, such as “is processing slowing down for component A?” or “is there a bottleneck in the processing pipeline?”.

There are many solutions to the problem. I am sure we have all worked in environments where the likes of Splunk, Graylog or the ELK stack are used for some form of monitoring or log scraping. With the increasing move to cloud-based environments, there is a growing need for products that let you monitor your AWS/ Azure/ <other> environments in addition to whatever you have running on premise, and that can monitor different versions of products. Prometheus and Grafana are a good combination for fulfilling these requirements. They are easy to install, easy to get to grips with and, of course, the cost is very attractive, especially on penny-pinching projects.

This combination is not the silver bullet that will solve all your problems and if you want to use this combination to do to-the-second monitoring, you may need to look at other alternatives. If you want to have a general view on the health of your environment and want to see what performance is like over the course of a day/ week, then this may be just the combination for you.

Some of the Software AG products already provide metrics in Prometheus format – the Micro Services Runtime (MSR), Apama correlators, API Gateway to name a few. However, these metrics are available only for the newer releases of the products, typically 10.3 and later (check the documentation before committing to anything!).

This article limits itself to MSR 10.3 and IS 9.9 (the latter because it predates the availability of metrics and was a requirement in a customer project), although I can attest that products like the Apama correlator and TCDB (using TMC) can be monitored just as well and just as easily.

Installation and setup

The installation used for preparing this article consists of CentOS 7 VMs running in VirtualBox (Version 5.2.22 r126460). If you use a different environment, the steps may differ and the commands will need to be changed accordingly.

 

Prometheus, Grafana and MSR 10.3 are all installed using Docker images. The Docker version is 18.09.8.

For the MSR, you need to accept the license agreement in the Docker store: https://hub.docker.com/_/softwareag-webmethods-microservicesruntime

It is assumed that you already have Docker and docker-compose installed. Minimal steps are given below in case you don’t; detailed instructions and other environments are out of scope for this article, so please refer to the Docker documentation where needed. Note also that Docker is used here simply for ease of use: this article does not describe a “best-practice Docker installation” and steers clear of any component of the Kubernetes stack.

  • Install Docker
    • sudo yum install -y yum-utils device-mapper-persistent-data lvm2
    • sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
    • sudo yum install docker-ce
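    • sudo usermod -aG docker <your user> (log out and back in for the group change to take effect)
    • sudo systemctl start docker (ensures that Docker is started; sudo systemctl enable docker also starts it at boot)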
  • Install Docker-compose
    • sudo curl -L "https://github.com/docker/compose/releases/download/1.24.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
    • sudo chmod +x /usr/local/bin/docker-compose

Install Prometheus and Grafana

Prometheus/ Grafana will be installed in Docker containers using docker-compose due to the ease of use and speed to get it up and running. Manual installation/ configuration is not covered but is very easy to do.

  • Search for the images you want to use (suggested are prom/prometheus and grafana/grafana)
    • docker search prometheus, docker search grafana (this also confirms that your web access is working…)
  • The setup assumes a Prometheus configuration file (prometheus.yml) in the directory from which you run the docker-compose command. A sample is given below; it has targets for Prometheus itself, for the MSR on port 5555 in another container and for an IS 9.9 installed on some server on port 5755. The scrape interval was shortened to provide somewhat quicker feedback:

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Scrape configurations: Prometheus itself, the MSR 10.3 container
# and the IS 9.9 server.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

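  # MSR 10.3 running in another container
  - job_name: 'MSRMetrics'
    metrics_path: /metrics
    basic_auth:
      username: Administrator
      password: '<password>'  # placeholder: replace with the real password
    static_configs:
    - targets: ['<server>:5555']

  # IS 9.9 has no built-in metrics; this path points at a custom service (see "The IS 9.9 metrics")
  - job_name: 'IS99Metrics'
    metrics_path: /customMetricsTest
    basic_auth:
      username: Administrator
      password: '<password>'  # placeholder: replace with the real password
    static_configs:
    - targets: ['<server>:5755']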

  • Create a docker-compose yaml file; sample content is given below. It installs one extra plugin, the diagram panel (useful if you want diagram-type panels, and an illustration of how to include panels that are not available in Grafana by default), and uses a persistent volume. Please change the admin password to something appropriate!

version: "3"
services:
  prometheus:
    image: prom/prometheus
    ports:
      - 9090:9090
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml

  grafana:
    restart: always
    image: grafana/grafana
    ports:
      - "3000:3000"
    environment:
      - GF_INSTALL_PLUGINS=jdbranham-diagram-panel
      - GF_SECURITY_ADMIN_PASSWORD=secret
    volumes:
      - grafana-storage:/var/lib/grafana
volumes:
  grafana-storage:

  • docker-compose up -d (run the command from the directory that contains your docker-compose.yaml file)
  • Check that you can access Prometheus and Grafana from a browser:
    • http://<server>:9090 (Prometheus)
      • Basic test: enter up in the expression field and execute the query; every scrape target that Prometheus can reach reports the value 1.
    • http://<server>:3000 (Grafana; log in as admin with the password set via GF_SECURITY_ADMIN_PASSWORD in the compose file. Without that variable the initial login is admin/admin.)
    • If you can’t access them, check that the ports are open in the firewall of the virtual machine (or that the firewall is not active).
    • Set up a connection to Prometheus in Grafana. Since Prometheus and Grafana run from the same docker-compose file, you can simply use “prometheus” as the server name (http://prometheus:9090). You can add Prometheus-specific dashboards from the Dashboards tab of the data source settings as well.

 

That’s it – Prometheus/ Grafana should now be up and running, so you can collect metrics (Prometheus) and display them (Grafana).
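The browser check can also be scripted. The following is a minimal sketch, assuming the standard Prometheus HTTP API endpoint /api/v1/query and the built-in up metric; the function names (build_query_url, targets_down, check) are made up for this illustration and are not part of any product.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, expr):
    """Return the instant-query URL of the Prometheus HTTP API for a PromQL expression."""
    query = urllib.parse.urlencode({"query": expr})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def targets_down(response_body):
    """Return the (job, instance) pairs whose `up` sample is not 1."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % payload.get("error"))
    down = []
    for series in payload["data"]["result"]:
        labels = series["metric"]
        _timestamp, value = series["value"]  # the value arrives as a string, e.g. "1"
        if float(value) != 1.0:
            down.append((labels.get("job"), labels.get("instance")))
    return down


def check(base_url):
    """Query a live Prometheus server; needs network access to base_url."""
    with urllib.request.urlopen(build_query_url(base_url, "up"), timeout=5) as resp:
        return targets_down(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumption: Prometheus is reachable on localhost:9090 as in the compose file above.
    print(check("http://localhost:9090"))
```

An empty list means that every configured target was scraped successfully; any entry names a job and instance that Prometheus could not reach.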

Install 10.3 MSR server

The next (and last) step in scope of this document is installing/ setting up the MSR container.

The reason for using the Docker image made available by Software AG is simply for speed and ease of use – think twice before using it as-is in any environment other than a sandbox!

The MSR install contains some custom packages and jar files – for this we do a custom build and then use docker-compose.

The following folder layout works quite easily (example only):

├── docker-compose.yaml
├── Dockerfile
├── jars
│   └── <any custom jar files>
├── MSR103.xml
└── packages
    └── <packages that you want to deploy>

The corresponding Dockerfile (a trailing \ means the instruction continues on the next line):

FROM store/softwareag/webmethods-microservicesruntime:10.3
# Deploy the custom package(s) into the default instance
COPY packages/<package 1>/ \
     /opt/softwareag/IntegrationServer/instances/default/packages/<package 1>/
# Add custom jar files
COPY jars/ /opt/softwareag/IntegrationServer/lib/jars/
# The license key in the MSR 10.3 image had expired, so a valid one is copied in here
COPY MSR103.xml \
     /opt/softwareag/IntegrationServer/instances/default/config/licenseKey.xml

The corresponding docker-compose.yaml file:

version: "3"
services:

  msr1:
    build:
      context: .
    ports:
      - 5555:5555

That’s it, you are done, just run:

docker-compose up -d

After a short while you should be able to access http://<server>:5555 in a browser (again, ensure that the port is open in the firewall or that the firewall is shut down).
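If you would rather wait for the container from a script than reload the browser, a small polling loop is enough. This is only a sketch using the Python standard library; wait_for_port is a hypothetical helper name, and an open TCP port only shows that something is listening, not that the MSR has finished starting.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while the port is closed or unreachable
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Assumption: the MSR container publishes port 5555 on this machine.
    print("MSR port open:", wait_for_port("localhost", 5555, timeout=120.0))
```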

You should now be able to see the metrics exposed by the MSR in Prometheus (look for metric names starting with sag_is_).
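To get a quick list of what the MSR exposes, you can save the output of its /metrics endpoint and filter it by prefix. The sketch below assumes only the general shape of the Prometheus text format (comment lines start with #, the metric name ends at the first { or space); metric_names is a hypothetical helper, and the sample names in the usage part are invented for illustration, not real MSR metrics.

```python
def metric_names(exposition_text, prefix="sag_is_"):
    """Return the sorted, de-duplicated metric names that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first "{" (labels) or whitespace (value).
        head = line.split("{", 1)[0].split()
        if head and head[0].startswith(prefix):
            names.add(head[0])
    return sorted(names)


if __name__ == "__main__":
    # Invented sample input; real MSR output will contain different names.
    sample = (
        "# HELP sag_is_demo_a made-up example\n"
        "# TYPE sag_is_demo_a gauge\n"
        "sag_is_demo_a 1\n"
        'sag_is_demo_b{instance="x"} 2\n'
        "some_other_metric 10\n"
    )
    for name in metric_names(sample):
        print(name)
```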

The IS 9.9 metrics

There is more than one way of creating custom metrics for versions that do not support Prometheus metrics out of the box (OOTB), or for metrics that are not exposed OOTB.

For information on how to do this, please refer to the section titled “INSTRUMENTING” on the Prometheus website (https://prometheus.io/docs/ ).

For the sake of simplicity on the IS side, one can simply expose a Flow/ Java service by creating a URL alias. Don’t use ports that are used for any other purpose, e.g. don’t do what I did in my simple test and use the primary port!

The important point is that the metrics must be in the text format that Prometheus expects, e.g.

# HELP pipeline_elapsed_time elapsed time
# TYPE pipeline_traversal_stats histogram
pipeline_traversal_stats_time1{host="prometheusTest:5755"} 0.45
pipeline_traversal_stats_time2{host="prometheusTest:5755"} 0.247
pipeline_traversal_stats_time3{host="prometheusTest:5755"} 0.42
pipeline_traversal_stats_time4{host="prometheusTest:5755"} 0.53
pipeline_traversal_stats_count1{host="prometheusTest:5755"} 4472
pipeline_traversal_stats_count2{host="prometheusTest:5755"} 8796
pipeline_traversal_stats_count3{host="prometheusTest:5755"} 9693
pipeline_traversal_stats_count4{host="prometheusTest:5755"} 7612

The above is, of course, rather simplistic and does not follow Prometheus best practice (refer to https://prometheus.io/docs/practices ): the HELP and TYPE lines name different metrics, and the samples are not real histogram series. It is functional, though.

These metrics can now be used to create custom Grafana dashboards.
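For comparison, a layout closer to Prometheus conventions uses one metric name per measurement and a label to tell the stages apart, instead of numbering the metric names. The following is a sketch of a small formatter for the text format, not the code used in the customer project: render_metric, the metric names and the stage label are assumptions made for this example, and the unit of the elapsed time is not stated in the sample above. The same logic could be written in an IS Java service.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline, as the text format requires for label values."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family; `samples` is a list of (labels_dict, value) pairs."""
    lines = [
        "# HELP %s %s" % (name, help_text),
        "# TYPE %s %s" % (name, metric_type),
    ]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                '%s="%s"' % (key, escape_label_value(str(val)))
                for key, val in sorted(labels.items())
            )
            lines.append("%s{%s} %s" % (name, pairs, value))
        else:
            lines.append("%s %s" % (name, value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    host = "prometheusTest:5755"
    # Hypothetical names; the values are taken from the sample above.
    print(render_metric(
        "pipeline_stage_elapsed_time", "Elapsed time per pipeline stage", "gauge",
        [({"host": host, "stage": "1"}, 0.45), ({"host": host, "stage": "2"}, 0.247)],
    ), end="")
    print(render_metric(
        "pipeline_stage_processed_total", "Items processed per pipeline stage", "counter",
        [({"host": host, "stage": "1"}, 4472), ({"host": host, "stage": "2"}, 8796)],
    ), end="")
```

With a stage label, a single Grafana query can show all stages, and adding a stage does not require a new metric name.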

Further steps

This article is intended as a “quick-start” guide only. It does not cover the operation of Prometheus/ Grafana, Docker or the instrumenting of custom metrics in any depth.

No security-related aspects are covered in this article – if you are running in an environment where all traffic must be encrypted (and you need to ensure that Prometheus is at least accessed with basic authentication), look at the information available on the respective websites and follow the guidelines given there.

The logical next step once you have your metrics available in Prometheus is to create custom Grafana dashboards. A very simple panel uses the metric as-is as its query, without any filter.

The queries must be refined, and panel types carefully chosen, to ensure that the dashboards are useful and usable to the end user. Just displaying all available metrics on one big dashboard defeats the purpose, as the user is likely to be overwhelmed by the amount of information made available.

A simple graph panel like that makes sense only if you specifically want to look at the numbers over a time period. If you need to be informed of any one of these exceeding a specific threshold, such a panel will be of little use, and something like a set of gauges (as much as these are abused) or singlestat values may be a better option.

As mentioned earlier in the article, one may (looking at reality, one WILL) need custom metrics to be made available to Prometheus and as such may need to look at writing custom exporters. There is already a wide variety of third-party exporters available, so check to see if you are perhaps already covered.

Metrics related to the individual servers can be exposed by running the official Prometheus node exporter on the servers in question, see https://github.com/prometheus/node_exporter

There is a variety of Grafana dashboards available already that can be easily customized or extended. For the MSR, refer to the article by Kalpesh (http://techcommunity.softwareag.com/pwiki/-/wiki/Main/Monitoring%20webMethods%20Microservices%20Runtime%20with%20Prometheus%20and%20Grafana). For other products, look at the dashboards available on the Grafana website under http://grafana.com/grafana/dashboards, one of these being for TCDB that was created by Anthony Dahanne (specific LATER versions only, check the requirements before you decide to use it!).

Alternatives to Prometheus

Prometheus is a convenient time series database, but it is by no means the only way to gather metrics that can be used for Grafana dashboards. A good variety of data source plugins are available on the Grafana website (https://grafana.com/grafana/plugins?type=datasource ), offering the possibility of connecting to a variety of data sources other than Prometheus.

Summary

This article barely scratches the surface of monitoring with Prometheus and Grafana. It is by design rather shallow and does not go into depth in any specific area, as it is intended as a starter that will hopefully make cross-product and cross-system monitoring a somewhat easier task.


We use IS 10.5 (not MSR). Is there something we need to do to enable the metrics endpoint? These servers run stand-alone and do not use Docker.

For versions where the metrics are not available, refer to the section “The IS 9.9 metrics”; you will need to do some development.