API Gateway High Availability and Sizing for 10.5

webMethods API Gateway tutorial

Author: Rizwan, Mohammed (Mohammed.Rizwan@softwareag.com)
Supported Versions: 10.5 and above

Introduction

This guide provides the information needed to set up a highly available API Gateway 10.5 deployment.

Components in API Gateway

API Gateway bundles the following components that enable you to get started with almost zero configuration. 

  • API Gateway Application (Server + UI)
  • Elasticsearch (shipped as InternalDataStore) - Primary data store for API metadata, policies, events, metrics, and so on
  • Kibana  - Dashboards and data visualization software
  • Filebeat - To help centralize API Gateway logs

Capacity Planning

Performance report

Details of API Gateway performance are available in the performance reports.

Elasticsearch Best Practices

Elasticsearch serves as a primary data store for API Gateway. While doing capacity planning, it is important to factor in the volume of data that might be stored in Elasticsearch. Refer to Elasticsearch Best Practices for more details. 

Clustering API Gateway

We must cluster API Gateways in order to achieve high availability. This involves clustering the data store (Elasticsearch) and API Gateway. The cluster configuration requirements for the different editions of API Gateway are listed below.

API Gateway Standard Edition
  • Data handled: Ports, Administration Configurations, Threat Protection policies
  • Elasticsearch cluster required: Yes (optional)
  • API Gateway cluster using Terracotta required: No
  • Additional comments: In a Standard Edition cluster, the DoS policy enforcement is done at the node level.

API Gateway Advanced Edition
  • Data handled: All
  • Elasticsearch cluster required: Yes
  • API Gateway cluster using Terracotta required: Yes
  • Additional comments: In an Advanced Edition cluster, the DoS policy enforcement is done at the node level.

Clustering the Elasticsearch instances enables the synchronization of data across the different nodes. In addition, API Gateway (Advanced) instances have to be clustered using Terracotta. API Gateway uses Terracotta's distributed caching for the following purposes:

  • In-memory Aggregation of metrics (used for Throttling, Monitor Service Performance, Monitor Service Level Agreements policies) across cluster nodes
  • Notification of the other nodes in a cluster about events such as Service Create, Service Update, and Policy Update. Although the data synchronization happens through Elasticsearch, API Gateway relies on this notification to update its in-memory caches.

Component Details

API Gateway (both Standard and Advanced editions) has a built-in data store (Elasticsearch). 

The minimum requirements for achieving high availability of API Gateway are as follows:

  1. 3 API Gateway Advanced Edition Instances
    • Though 2 nodes of API Gateway are sufficient, we recommend using 3 instances
    • If only 2 nodes of API Gateway are used, then it is mandatory to have 3 Elasticsearch instances to avoid a split-brain scenario
  2. 3 API Gateway Standard Edition (only needed in case of a paired deployment scenario)
    • Though 2 nodes of API Gateway are sufficient, we recommend using 3 instances
  3. 2 Terracotta Server instances   (Active-Passive)

Clustering in Paired Deployment Scenario

In this section, you will find the configurations required to set up the following deployment: Threat Protection in the DMZ, and Authentication and Policy enforcement in the Green Zone.

Configuring API Gateway Standard Edition

You do not have to cluster the API Gateway Standard Edition instances. To achieve high availability, you can simply add more nodes and register each one in the load balancer. However, this means that you must repeat the port and threat protection rule configurations on every Standard Edition node.

To avoid repeating these configurations, you can cluster just the Elasticsearch nodes. In this case, the sequence is as follows: start one instance of API Gateway and configure the external ports, registration ports, threat protection rules, and so on. These settings are stored in Elasticsearch and synchronized to the other Elasticsearch nodes. When you then (re)start the other API Gateway instances, they initialize with the data that is already synchronized, and no further configuration is needed. The only configuration required is the clustering of Elasticsearch; Terracotta is not needed. Follow the steps below to cluster the Elasticsearch nodes.

  • Stop all API Gateway and Elasticsearch instances
  • Cluster the Elasticsearch
  • Start the Elasticsearch cluster
  • Start the API Gateway application

Clustering the Elasticsearch

Clustering Elasticsearch is needed to synchronize data across the different Gateway nodes. As a general recommendation, in a clustered environment it is better to make the life cycle (start/stop) of Elasticsearch independent of that of API Gateway. To do this, set the following property to false in <INSTALL-DIR>\IntegrationServer\instances\<INSTANCE-NAME>\packages\WmAPIGateway\config\resources\elasticsearch\config.properties

Elasticsearch Autostart

pg.gateway.elasticsearch.autostart=false

With this setting, API Gateway does not attempt to start or stop Elasticsearch during its own startup or shutdown. You must therefore make sure that Elasticsearch is started before API Gateway is started, and that Elasticsearch is stopped only after API Gateway is stopped. This setting also lets you scale Elasticsearch independently of the API Gateway application.
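
Because Elasticsearch no longer starts together with API Gateway, a startup script has to enforce the order itself. The following is a minimal sketch, not part of the product: wait_for_es is a hypothetical helper that polls the Elasticsearch HTTP port until it answers, and the PROBE variable, retry count, and delay are assumptions to adapt to your environment.

```shell
#!/bin/sh
# Hypothetical helper: block until Elasticsearch answers on its HTTP port.
# PROBE is the command used to test the URL. It defaults to curl and can be
# overridden, for example with a stub when testing the script.
PROBE=${PROBE:-"curl -fsS -o /dev/null"}

# wait_for_es URL [MAX_TRIES] [DELAY_SECONDS]
# Returns 0 as soon as the probe succeeds, 1 if every attempt fails.
wait_for_es() {
  url=$1
  max_tries=${2:-30}
  delay=${3:-5}
  try=1
  while [ "$try" -le "$max_tries" ]; do
    # $PROBE is intentionally unquoted so that its options split into words.
    if $PROBE "$url" >/dev/null 2>&1; then
      return 0
    fi
    try=$((try + 1))
    sleep "$delay"
  done
  return 1
}
```

A startup script could call wait_for_es "http://localhost:9240" and launch API Gateway only if the call succeeds. For shutdown, the order is reversed: stop API Gateway first, then Elasticsearch.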

By default, API Gateway ships its own Elasticsearch, but it is also possible to use an external Elasticsearch as the data store for API Gateway. Refer to the Using External Elastic Search section of Elasticsearch Best Practices for more details. This document explains the configurations assuming that the Elasticsearch shipped with API Gateway is used. The details of API Gateway's connection to Elasticsearch are available in <INSTALL-DIR>\IntegrationServer\instances\<INSTANCE-NAME>\packages\WmAPIGateway\config\resources\elasticsearch\config.properties.

Elasticsearch Connection

pg.gateway.elasticsearch.hosts=localhost:9240
Advanced Elasticsearch Configuration: This section covers only the Elasticsearch configuration needed for clustering. For advanced configuration and tuning, refer to Elasticsearch Best Practices.

To cluster Elasticsearch, adapt the Elasticsearch configuration file <INSTALL-DIR>/InternalDataStore/config/elasticsearch.yml. You must provide the cluster configuration in this file before starting Elasticsearch for the very first time. When Elasticsearch first starts, the node auto-bootstraps itself into a new cluster. This cannot be undone by changing the configuration afterwards: Elasticsearch does not merge separate clusters once they have formed, even if you subsequently try to configure all the nodes into a single cluster. For more information, see Bootstrapping a cluster.

Configuring Elasticsearch cluster

If you have started API Gateway before setting up the Elasticsearch cluster configuration, perform the following steps before proceeding with the configuration:

  • Log off and exit from API Gateway
  • Delete the nodes folder from the <Installation Location>\InternalDataStore\data folder
  • Make the necessary cluster configuration
  • Start Elasticsearch

Open elasticsearch.yml from SAG_root/InternalDataStore/config on any node that you want to cluster.
The following sample shows how the configuration appears initially.

cluster.name: SAG_EventDataStore
node.name: node1
path.logs: ../../EventDataStore/logs
network.host: 0.0.0.0
http.port: 9240
discovery.seed_hosts: ["node1:9340"]
transport.tcp.port: 9340
path.repo: ['<SHARED/NETWORK_FILESYSTEM>']
cluster.initial_master_nodes: ["node1"]

Provide the name of the cluster in the cluster.name property. Elasticsearch forms a cluster from nodes that have the same cluster.name, so if there are three nodes in the cluster, the value of the cluster.name property must be the same on all three nodes. For example, cluster.name: "SAG_EventDataStore"

Provide the host names of all participating nodes (in this sample, the same as the values of the node.name property) and the ports they use, as seen in the transport.tcp.port property, in the discovery.seed_hosts property in the following format: host_name:port. If there are three nodes in the cluster, the value of the discovery.seed_hosts property looks like this: discovery.seed_hosts: ["node1:9340","node2:9340","node3:9340"]. List the names of all nodes in the cluster.initial_master_nodes property. Each name in this property must match the value of that node's node.name property.
Sample configuration of a node is as follows:

cluster.name: SAG_EventDataStore
node.name: node1
path.logs: ../../EventDataStore/logs
network.host: 0.0.0.0
http.port: 9240
discovery.seed_hosts: ["node1:9340", "node2:9340", "node3:9340"]
transport.tcp.port: 9340
path.repo: ['<SHARED/NETWORK_FILESYSTEM>']
cluster.initial_master_nodes: ["node1", "node2", "node3"]

The specified nodes are clustered.
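
The two list-valued properties are easy to mistype when the same file is edited by hand on every node. As a sketch only, the hypothetical helpers below print the discovery.seed_hosts and cluster.initial_master_nodes lines from a transport port and a list of node names. They assume, as in the sample above, that each node name is also a resolvable host name.

```shell
#!/bin/sh
# Hypothetical helpers that print the two discovery-related lines of
# elasticsearch.yml for a given transport port and list of node names.

# join_quoted SUFFIX NAME... -> "NAME1SUFFIX", "NAME2SUFFIX", ...
join_quoted() {
  suffix=$1
  shift
  out=""
  for name in "$@"; do
    if [ -n "$out" ]; then
      out="$out, "
    fi
    out="$out\"$name$suffix\""
  done
  printf '%s' "$out"
}

# seed_hosts_line PORT NAME...
seed_hosts_line() {
  port=$1
  shift
  printf 'discovery.seed_hosts: [%s]\n' "$(join_quoted ":$port" "$@")"
}

# initial_master_nodes_line NAME...
initial_master_nodes_line() {
  printf 'cluster.initial_master_nodes: [%s]\n' "$(join_quoted "" "$@")"
}

seed_hosts_line 9340 node1 node2 node3
initial_master_nodes_line node1 node2 node3
```

The printed lines can be pasted into elasticsearch.yml on each node, so that every node carries an identical list.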

cluster.initial_master_nodes:  The first time a cluster is started, cluster.initial_master_nodes must be set to perform cluster bootstrapping. It should contain the names of the master-eligible nodes in the initial cluster and be defined on every master-eligible node in the cluster. This setting helps prevent split brain, the existence of two masters in a single cluster.

discovery.seed_hosts: This setting provides Elasticsearch with a list of nodes that it should try to contact. Once the node contacts a member of this list, it receives the full cluster state, which lists all nodes in the cluster. It then contacts the master and joins the cluster.

path.repo: This is the location to which Elasticsearch writes snapshots, so it must be accessible to all the nodes.

Repeat the above configuration for the other Elasticsearch nodes. Then start all three Elasticsearch instances and, once they are up, check the cluster health using the URL http://<ElasticNode1>:9240/_cluster/health?pretty=true

Cluster Health

{
  "cluster_name" : "SAG_EventDataStore",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 11,
  "active_shards" : 22,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
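
In a script, the same response can be checked automatically rather than read by eye. The sketch below assumes no JSON parser such as jq is installed and extracts the fields with sed. The helper names health_string, health_number, and cluster_ready are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers that read fields from a _cluster/health response
# without requiring a JSON parser.

# health_string JSON FIELD -> value of a string field such as "status"
health_string() {
  printf '%s\n' "$1" |
    sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# health_number JSON FIELD -> value of an integer field such as "number_of_nodes"
health_number() {
  printf '%s\n' "$1" |
    sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# cluster_ready JSON EXPECTED_NODES
# Succeeds only if the status is green and all expected nodes have joined.
cluster_ready() {
  [ "$(health_string "$1" status)" = "green" ] &&
    [ "$(health_number "$1" number_of_nodes)" -eq "$2" ] 2>/dev/null
}
```

For example, cluster_ready "$(curl -s "http://<ElasticNode1>:9240/_cluster/health")" 3 succeeds only when the three-node cluster has formed and is green.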

As mentioned before, the API Gateway Standard Edition does not need to be clustered using Terracotta. The API Gateway nodes will just have to connect to Elasticsearch cluster configured above.   

IMPORTANT: Note the following:

  • Changes made in API Gateway after all nodes are up are not immediately visible on the other nodes, even though the data is synchronized through the Elasticsearch cluster. The other API Gateway nodes have to be restarted to see the new changes. This is not required in Advanced Edition, where Terracotta is used.
  • It is also acceptable not to cluster the Elasticsearch instances that are used with API Gateway Standard Edition. In this case, configurations such as ports and threat protection rules have to be performed manually on each of the API Gateway nodes.

Clustering the API Gateway Advanced Edition

Clustering the API Gateway Advanced Edition requires clustering of Elasticsearch and clustering of API Gateway nodes using Terracotta. 

Clustering the Elasticsearch

The configuration is similar to that of the Standard Edition. However, you need a separate Elasticsearch cluster here that caters to the Advanced Edition.

Clustering the API Gateway

After setting up the Elasticsearch cluster and API Gateway's connection to the Elasticsearch cluster, the next step to perform is to cluster the API Gateway nodes using Terracotta. 

First, configure the Terracotta Server Array (TSA). As a basic requirement, you have to configure two Terracotta Servers in Active-Passive mode. Refer to the Terracotta documentation for setting up a Terracotta cluster. A sample Terracotta server configuration file is provided for your reference: tc-config.xml

After setting up the Terracotta Server Array, you need to configure API Gateway for clustering. Starting with API Gateway version 10.3, you can use the API Gateway user interface to do this. Log in as a user with API Gateway Administration privileges, navigate to the Administration page, and select the Clustering section. Enable clustering and provide the TSA URL configured above. Perform these steps for all the API Gateway nodes. For detailed steps, refer to the API Gateway Configuration Guide.

Clustering in Full Deployment Scenario

In this section, you will find the configurations required to set up the following deployment: Threat Protection, Authentication, and Policy enforcement in the Green Zone.

This deployment requires the exact clustering configurations that are discussed above under the section Clustering the API Gateway Advanced Edition.

Scaling Tips

Although capacity planning helps to manage the demand, you still have to be prepared for spikes in demand. This section gives some tips for scaling the components of API Gateway up and down. As seen before, API Gateway bundles Elasticsearch, Kibana, and other components. More often than not, these components run on the same physical machine as the API Gateway application. However, there may be cases where you want Elasticsearch and Kibana to run on different physical machines from the API Gateway application, so that you can scale these components independently. The section on Dismantling API Gateway below points to the details of running these components on different physical machines.

Dismantling API Gateway

Refer to the document on Dismantling API Gateway.

Avoiding a single point of failure

Refer to the document on avoiding a single point of failure for Elasticsearch in Kibana.

Scale up

API Gateway

  • Scaling up an API Gateway would mean adding a new API Gateway node to an existing cluster.  Refer to the Clustering API Gateway section for the details. 
  • As long as the API Gateway node is configured properly for the cluster, it is just a matter of adding the node to the load balancer, or adding its IP to the DNS server if the LB is configured to use DNS load balancing. Setting "portClusteringEnabled" to true on all nodes lets the new node inherit the port settings so that it can start serving requests immediately.
  • In a paired deployment setup, if a new node is added to the DMZ, connections have to be established explicitly from all nodes in the green zone to the DMZ. You can use the API Gateway REST API to automate these port settings. The new node can then be added to the LB as described above.

Elasticsearch (InternalDataStore)

You can scale up Elasticsearch by simply adding a new node to the existing Elasticsearch cluster. API Gateway automatically discovers the new node. The "discovery.seed_hosts" property in the new node's elasticsearch.yml file should point to the other nodes in the cluster. This allows the new node to join the running cluster without a cluster restart, but you should also update "discovery.seed_hosts" on the other nodes so that the setting survives a restart. You can skip this if the scale-out is only temporary. You may also want to manage the "minimum number of master nodes" to avoid split brain as the cluster changes. The "minimum number of master nodes" should be a majority of the master-eligible nodes, that is (n/2)+1 with n/2 rounded down, where "n" is the number of master-eligible nodes. By default, all nodes are master-eligible.
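
As a worked example of the arithmetic, a majority of n master-eligible nodes is n/2 rounded down, plus 1. For an odd n this is the same as (n+1)/2. The helper name quorum below is hypothetical and serves only to illustrate the calculation.

```shell
#!/bin/sh
# quorum N -> smallest majority of N master-eligible nodes: floor(N/2) + 1
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Print the majority for a few cluster sizes.
for n in 3 4 5; do
  echo "master-eligible nodes: $n, majority: $(quorum "$n")"
done
```

Note that going from 3 to 4 master-eligible nodes raises the majority from 2 to 3 without letting the cluster survive any more node failures, which is one reason why an odd number of master-eligible nodes is commonly used.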

Adding a new node to a cluster

This section explains how to add a new node to an Elasticsearch cluster. You add nodes to a cluster by configuring them to find the existing cluster and then starting them. For example, consider a new node, node4, that is added to a cluster that already has three nodes: node1, node2, and node3.

Open elasticsearch.yml from SAG_root/InternalDataStore/config from the system where the new node is being added. The following configuration is a sample of how the configuration appears initially.

cluster.name: "SAG_EventDataStore"
node.name: node4
path.logs: SAG_root\InternalDataStore/logs
network.host: 0.0.0.0
http.port: 9240
discovery.seed_hosts: ["node4:9340"]
transport.tcp.port: 9340
path.repo: ['SAG_root\InternalDataStore/archives']
cluster.initial_master_nodes: ["node4"]

Provide the host name of the node, as seen in the node.name property, and the transport port used by the node, as seen in the transport.tcp.port property, in the discovery.seed_hosts property in the following format: host_name:port. For example, node4:9340.
Sample configuration after providing the new node details:

cluster.name: "SAG_EventDataStore"
cluster.initial_master_nodes: ["node1","node2","node3"]
node.name: node4
path.logs: SAG_root\InternalDataStore/logs
network.host: 0.0.0.0
http.port: 9240
discovery.seed_hosts: ["node1:9340","node2:9340","node3:9340","node4:9340"]
transport.tcp.port: 9340
path.repo: ['SAG_root\InternalDataStore/archives']

Save the configuration and start the new node. The new node then joins the cluster.
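
To confirm that the new node has joined, you can list the node names with the Elasticsearch _cat/nodes API and look for the new name. The following is a sketch under the assumption that the list is fetched with the h=name column selector, which returns one node name per line. node_listed is a hypothetical helper name.

```shell
#!/bin/sh
# node_listed NODE_LIST NAME
# Succeeds if NAME appears as a whole line in NODE_LIST, which is expected to
# hold one node name per line. Trailing whitespace is stripped before matching.
node_listed() {
  printf '%s\n' "$1" | sed 's/[[:space:]]*$//' | grep -qxF -- "$2"
}
```

For example, node_listed "$(curl -s "http://node1:9240/_cat/nodes?h=name")" node4 succeeds once node4 is part of the cluster.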

When you restart an Elasticsearch cluster, you must restart the master node first.

API Gateway ships 7 categories of indices by default. They are listed below.

  • gateway_default_<Gateway_Asset_Type> (store for core configurations, with one index for each type of gateway asset)
  • gateway_default_analytics (store for runtime transactions)
    • gateway_default_analytics_policyviolationevents
    • gateway_default_analytics_threatprotectionevents 
    • gateway_default_analytics_lifecycleevents
    • gateway_default_analytics_errorevents
    • gateway_default_analytics_performancemetrics
    • gateway_default_analytics_transactionalevents-000001
    • gateway_default_analytics_monitorevents
  • gateway_default_dashboard
  • gateway_default_license (store for license details)
  • gateway_default_audit (store for audit logs)
  • gateway_default_cache (store for cache statistics) 
  • gateway_default_log  (store for application logs)

The following command increases the number of replicas to 2 for the "gateway_default_analytics_transactionalevents" index.

curl -XPUT "http://host:9240/gateway_default_analytics_transactionalevents/_settings" -H 'Content-Type: application/json' -d '{
  "index": {
    "number_of_replicas": 2
  }
}'

Kibana

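To host Kibana on a dedicated machine, refer to the document referenced under Dismantling API Gateway. To scale out, put a load balancer in front of the Kibana nodes. To avoid a single point of failure for Elasticsearch, refer to the document referenced under Avoiding a single point of failure.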

Filebeat

No special configuration is needed because Filebeat scales up along with API Gateway.

Scale down

API Gateway

  • Put the node in "Quiesce" mode. The node then starts rejecting requests, and the LB routes requests to the other healthy nodes. Allow some cooling period for in-flight transactions to complete, then bring the instance down and remove it from the LB.
  • Scaling down is not straightforward for a paired gateway setup because of the P2P communication.
    • To scale down the DMZ nodes, remove them from the LB.
    • To scale down the green zone nodes in a paired gateway setup, disable the internal ports using the REST API, then bring the instance down. In-flight transactions will fail because the communication channel is closed.

Elasticsearch (InternalDataStore)

  • Instruct the cluster to exclude the node using the command below. You can also use _ip to match by IP address or _name to match by node name.
curl -XPUT "http://host:9240/_cluster/settings" -H 'Content-Type: application/json' -d '{
  "transient": {
    "cluster.routing.allocation.exclude._host": "mynode"
  }
}'
  • Allow some cooling period for the cluster to rebalance the shards. Monitor the cluster health at http://host:9240/_cluster/health?pretty=true until the status is "green" and "relocating_shards" is 0, then bring the instance down.
  • Readjust the "minimum number of master nodes" if required, as described in the scale-up procedure.
  • Adjust the number of replicas if required, as described in the scale-up procedure.
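
The exclusion request in the first step differs only in the attribute used to match the node. As a sketch, the hypothetical helper exclude_body below builds the request body for _host, _ip, or _name. It assumes that the value contains no characters that need JSON escaping.

```shell
#!/bin/sh
# exclude_body ATTRIBUTE VALUE
# Prints the JSON body for PUT _cluster/settings that moves shards away from
# the matching node. ATTRIBUTE must be _host, _ip or _name.
exclude_body() {
  case $1 in
    _host|_ip|_name) ;;
    *) echo "unsupported attribute: $1" >&2; return 1 ;;
  esac
  printf '{"transient":{"cluster.routing.allocation.exclude.%s":"%s"}}\n' "$1" "$2"
}

exclude_body _name node4
```

The output can be passed to curl with -d "$(exclude_body _name node4)", together with the Content-Type: application/json header.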


tc-config.xml (26.2 KB)
