API Gateway High Availability and Sizing

webMethods API Gateway tutorial

Author: Vaidyanathan, Praveen
Supported Versions: 9.12 and above

Introduction

This guide provides the information needed to set up a highly available API Gateway deployment.

Components in API Gateway

API Gateway bundles the following components, which enable you to get started with almost zero configuration.

  • API Gateway Application (Server + UI)
  • Elasticsearch (shipped as InternalDataStore) - Primary datastore for APIs metadata, policies, events, metrics etc
  • Kibana  - Dashboards and data visualization software
  • Filebeat - To help centralize API Gateway logs

Capacity Planning

Performance report

Details about API Gateway performance are available in the performance reports.

Sizing

Software AG can provide pointers that help perform capacity planning. Please contact Software AG support for recommendations.

Elasticsearch Best Practices

Elasticsearch serves as the primary data store for API Gateway. During capacity planning, it is important to factor in the volume of data that might be stored in Elasticsearch. Refer to Elasticsearch Best Practices for more details.

Clustering API Gateway

API Gateway instances must be clustered in order to achieve high availability. This involves clustering the data store (Elasticsearch) and API Gateway itself. The following table lists the cluster configuration requirements for the different editions of API Gateway.

API Gateway Edition | Data Handled | Elasticsearch Cluster - Required? | API Gateway Cluster using Terracotta - Required? | Additional comments
Standard | Ports; Administration Configurations; Threat Protection policies | Yes (Optional) | No | In a Standard Edition cluster, the DoS policy enforcement is done at the node level.
Advanced | All | Yes | Yes | In an Advanced Edition cluster, the DoS policy enforcement is done at the node level.

Clustering the Elasticsearch instances enables the synchronization of data across different nodes. In addition, API Gateway (Advanced) instances have to be clustered using Terracotta. API Gateway uses Terracotta's distributed caching for the following reasons:

  • In-memory aggregation of metrics (used for the Throttling, Monitor Service Performance, and Monitor Service Level Agreements policies) across cluster nodes
  • Notifying the other nodes in a cluster about an event such as Service Create, Service Update, Policy Update, etc. Though the data synchronization happens through Elasticsearch, API Gateway relies on these notifications to update its in-memory caches.

Component Details

API Gateway (both Standard and Advanced editions) has a built-in data store (Elasticsearch).

The following are the minimum requirements for achieving high availability of API Gateway:

  1. 3 API Gateway Advanced Edition instances
    • Though 2 nodes of API Gateway are sufficient, we recommend using 3 instances
    • If only 2 nodes of API Gateway are used, it is mandatory to have 3 Elasticsearch instances to avoid a split-brain scenario
  2. 3 API Gateway Standard Edition instances (only needed in case of a paired deployment scenario)
    • Though 2 nodes of API Gateway are sufficient, we recommend using 3 instances
  3. 2 Terracotta Server instances (Active-Passive)

Clustering in Paired Deployment Scenario

This section describes the configurations required to set up the following deployment: Threat Protection in the DMZ, and Authentication & Policy enforcement in the Green Zone.

Configuring API Gateway Standard Edition

It is acceptable not to cluster the API Gateway Standard Edition instances. To achieve high availability, you simply add more nodes and register each of them in the load balancer. This means that you have to repeat the port and threat protection rule configurations that you typically make in the Standard Edition on every node.

As a workaround to avoid repeating the configurations, you can cluster just the Elasticsearch nodes. The sequence in this case is as follows: start one instance of API Gateway and configure the External and Registration Ports, Threat Protection Rules, and so on. These settings are stored in Elasticsearch and synchronized across the other Elasticsearch nodes. When you then (re)start the other API Gateway instances, they initialize with the data that is already synchronized, and no further configuration is needed. In this case, the only configuration needed is the clustering of Elasticsearch; Terracotta is not required. Follow the steps below to cluster the Elasticsearch nodes.

  • Stop all API Gateway and Elasticsearch instances
  • Cluster the Elasticsearch
  • Start the Elasticsearch cluster
  • Start the API Gateway application

Clustering the Elasticsearch

Clustering Elasticsearch is needed to synchronize data across the different Gateway nodes. As a general recommendation, in a clustered environment it is better to make the life cycle (start/stop) of Elasticsearch independent of that of API Gateway. To do this, set the following property to false in <INSTALL-DIR>\IntegrationServer\instances\<INSTANCE-NAME>\packages\WmAPIGateway\config\resources\elasticsearch\config.properties

Elasticsearch Autostart

pg.gateway.elasticsearch.autostart=false

With this setting, API Gateway does not attempt to start or stop Elasticsearch during its own start or stop. It is therefore important to make sure that Elasticsearch is started before API Gateway is started, and that Elasticsearch is stopped only after API Gateway is stopped. The above setting also helps in scaling Elasticsearch independently of the API Gateway application.
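
Because the start order now has to be enforced manually, a start script can wait for Elasticsearch before launching API Gateway. The following is a minimal sketch, not a shipped utility: the function name wait_for_es, the retry count, and the delay are assumptions made for illustration, and the URL uses the default HTTP port 9240 shown in this document.

```shell
#!/bin/sh
# Poll the Elasticsearch HTTP port until it answers, then return 0.
# Arguments: base URL, maximum attempts (default 30), delay in seconds (default 5).
# wait_for_es is a hypothetical helper name, not part of API Gateway.
wait_for_es() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-5}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: silent, -f: fail on HTTP errors, -o /dev/null: discard the body
        if curl -s -f -o /dev/null "$url/_cluster/health"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (not run here): continue only once Elasticsearch answers.
# wait_for_es "http://localhost:9240" && echo "Elasticsearch is up"
```

The function only checks that the HTTP endpoint responds. It does not check that the cluster is green, so a stricter script might inspect the health response as well.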

By default, API Gateway ships its own Elasticsearch, but it is also possible to use an external Elasticsearch as the data store for API Gateway. Refer to the Using External Elastic Search section of Elasticsearch Best Practices for more details. This document explains the configurations assuming that the Elasticsearch shipped with API Gateway is used. The details of API Gateway's connection to Elasticsearch are available in <INSTALL-DIR>\IntegrationServer\instances\<INSTANCE-NAME>\packages\WmAPIGateway\config\resources\elasticsearch\config.properties.

Elasticsearch Connection

pg.gateway.elasticsearch.hosts=localhost:9240
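
Before starting a node, it can be useful to confirm which values the instance will actually use. The sketch below reads a single key from config.properties. The helper name get_prop is hypothetical, and it only handles plain key=value lines, which is an assumption about how the file is laid out.

```shell
#!/bin/sh
# Print the value of one key from a Java-style .properties file.
# Handles only simple "key=value" lines; strips a trailing carriage return
# in case the file uses Windows line endings.
# get_prop is a hypothetical helper name, not part of API Gateway.
get_prop() {
    awk -F= -v k="$2" '$1 == k { sub(/^[^=]*=/, ""); sub(/\r$/, ""); print; exit }' "$1"
}

# Example (not run here); CONFIG must point to the config.properties file
# of your own instance:
# get_prop "$CONFIG" pg.gateway.elasticsearch.autostart
# get_prop "$CONFIG" pg.gateway.elasticsearch.hosts
```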

Advanced Elasticsearch Configuration: Note that this section covers only the Elasticsearch configurations needed for clustering. For advanced configurations and tuning, refer to Elasticsearch Best Practices.

For clustering the Elasticsearch, the Elasticsearch configuration file <INSTALL-DIR>/InternalDataStore/config/elasticsearch.yml has to be adapted. Here the cluster name has to be specified and the cluster nodes have to be configured. A sample configuration looks like this:

Elasticsearch Cluster Configuration

cluster.name: APIG_EventDataStore
node.name: <ElasticNode1>
path.logs: ../../EventDataStore/logs
network.host: 0.0.0.0
node.master: true
node.data: true
http.port: 9240
discovery.zen.ping.unicast.hosts: ["ElasticNode1:9340", "ElasticNode2:9340", "ElasticNode3:9340"]
transport.tcp.port: 9340
path.repo: ['<SHARED/NETWORK_FILESYSTEM>']
discovery.zen.minimum_master_nodes: 2

discovery.zen.minimum_master_nodes: The minimum_master_nodes setting is extremely important to the stability of your cluster. This setting helps prevent split brain, the existence of two masters in a single cluster.

discovery.zen.ping.unicast.hosts: This works by providing Elasticsearch a list of nodes that it should try to contact. Once the node contacts a member of the unicast list, it will receive a full cluster state that lists all nodes in the cluster. It will then proceed to contact the master and join.

path.repo: This is the location to which Elasticsearch writes its snapshots. It is therefore important to use a location that is accessible to all the nodes.
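
Each entry in discovery.zen.ping.unicast.hosts must be a separately quoted host:port string. As a hedged illustration, the sketch below builds that list from a transport port and a set of host names. The function name unicast_hosts is hypothetical, and the host names are the placeholders used in the sample configuration.

```shell
#!/bin/sh
# Build the value for discovery.zen.ping.unicast.hosts from a transport port
# and a list of host names, quoting every entry separately.
# unicast_hosts is a hypothetical helper name.
unicast_hosts() {
    port="$1"
    shift
    list=""
    for host in "$@"; do
        # Add a comma separator before every entry except the first.
        list="${list:+$list, }\"$host:$port\""
    done
    printf '[%s]\n' "$list"
}

unicast_hosts 9340 ElasticNode1 ElasticNode2 ElasticNode3
# prints ["ElasticNode1:9340", "ElasticNode2:9340", "ElasticNode3:9340"]
```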

The above configuration has to be repeated on the other Elasticsearch nodes, each with its own node.name. After this, start all three Elasticsearch instances and, once they are up, check the cluster health using the URL http://<ElasticNode1>:9240/_cluster/health?pretty=true

Cluster Health

{
  "cluster_name" : "APIG_EventDataStore",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 20,
  "active_shards" : 40,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
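
The two fields that matter most at this point are status and number_of_nodes. The sketch below checks them from a script. It is an illustration under assumptions: the helper names health_field and cluster_formed are hypothetical, and the sed expression is a simple text match on the health response rather than a full JSON parser.

```shell
#!/bin/sh
# Extract one field from the _cluster/health JSON read on stdin.
# A plain text match, sufficient for the flat health response.
health_field() {
    sed -n 's/.*"'"$1"'" *: *"\{0,1\}\([^",}]*\).*/\1/p'
}

# Succeed only if the cluster is green and has the expected number of nodes.
# Argument: expected node count. Reads the health JSON on stdin.
cluster_formed() {
    expected="$1"
    json=$(cat)
    status=$(printf '%s\n' "$json" | health_field status)
    nodes=$(printf '%s\n' "$json" | health_field number_of_nodes)
    [ "$status" = "green" ] && [ "$nodes" = "$expected" ]
}

# Example (not run here); replace the placeholder with your host name:
# curl -s "http://<ElasticNode1>:9240/_cluster/health" | cluster_formed 3
```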

Start the API Gateway instances.

As mentioned before, API Gateway Standard Edition does not need to be clustered using Terracotta. The API Gateway nodes just have to connect to the Elasticsearch cluster configured above.

IMPORTANT: Note the following:

  • Any changes made in API Gateway after all nodes are up are not immediately visible on the other nodes, even though the data has been synchronized through the Elasticsearch cluster. The other API Gateway nodes have to be restarted to see the new changes. This is not required in Advanced Edition, where Terracotta is used.
  • It is also acceptable not to cluster the Elasticsearch instances that are used with API Gateway Standard Edition. In this case, configurations such as ports, threat protection rules, etc. have to be performed manually on each of the API Gateway nodes.

Clustering the API Gateway Advanced Edition

Clustering the API Gateway Advanced Edition requires clustering of Elasticsearch and clustering of API Gateway nodes using Terracotta. 

Clustering the Elasticsearch

The configuration is similar to that of the Standard Edition. However, you need a separate Elasticsearch cluster here that serves the Advanced Edition.

Clustering the API Gateway

After setting up the Elasticsearch cluster and API Gateway's connection to it, the next step is to cluster the API Gateway nodes using Terracotta.

First, you have to configure the Terracotta Server Array (TSA). As a basic requirement, you have to configure two Terracotta Servers in Active-Passive mode. Refer to the Terracotta documentation for setting up a Terracotta cluster. A sample Terracotta server configuration file is available for reference: tc-config.xml

After setting up the Terracotta Server Array, you need to configure API Gateway for clustering. Starting with API Gateway version 10.3, you can use the API Gateway user interface to do this. Log in as a user with API Gateway Administration privileges, navigate to the Administration page, and select the Clustering section. Enable clustering and provide the TSA URL configured above. Perform these steps on all the API Gateway nodes. For detailed steps, refer to the API Gateway Configuration Guide.

Clustering in Full Deployment Scenario

This section describes the configurations required to set up the following deployment: Threat Protection, Authentication & Policy enforcement in the Green Zone.

This deployment requires exactly the clustering configurations discussed above in the section Clustering the API Gateway Advanced Edition.

Scaling Tips

Although capacity planning helps to manage demand, you still have to be prepared for spikes in demand. This section gives some tips for scaling the components of API Gateway up and down. As seen before, API Gateway bundles Elasticsearch, Kibana, and other components. More often than not, these components run on the same physical machine as the API Gateway application. There may be cases, however, where you want Elasticsearch and Kibana to run on different physical machines than the API Gateway application, so that you can scale these components up and down independently. The section Dismantling API Gateway below provides details on how to run these components on different physical machines.

Dismantling API Gateway

Refer to the document Dismantling API Gateway.

Avoiding single point of failure

Refer to the document on avoiding a single point of failure for Elasticsearch in Kibana.

Scale up

API Gateway

  • Scaling up API Gateway means adding a new API Gateway node to an existing cluster. Refer to the Clustering API Gateway section for the details.
  • As long as the API Gateway node is configured properly for the cluster, it's just a matter of adding the node to the load balancer (LB), or adding its IP to the DNS server if the LB is configured to use DNS load balancing. Setting "portClusteringEnabled" to true on all nodes lets the new node inherit the port settings, so it can start serving requests immediately.
  • In a paired deployment setup, if a new node is added to the DMZ, connections have to be established explicitly from all nodes in the green zone to the DMZ. You could use the API Gateway REST API to automate these port settings. The new node can then be added to the LB as described above.

Elasticsearch (InternalDataStore)

You can scale up Elasticsearch by adding a new node to the existing Elasticsearch cluster. API Gateway automatically discovers the new node. In the new node's elasticsearch.yml file, "discovery.zen.ping.unicast.hosts" should point to the other nodes in the cluster. This allows the new node to join the running cluster without a cluster restart. You should also update "discovery.zen.ping.unicast.hosts" on the other nodes so that the setting survives a restart; this can be skipped if the scale-out is only temporary.

You may also need to adjust the "minimum number of master nodes" after the cluster change to avoid split brain. The value should be (n/2)+1, with n/2 rounded down, where "n" is the number of master-eligible nodes. By default, all nodes are master-eligible. Use "transient" instead of "persistent" if the change is only temporary; a "transient" setting does not survive a full cluster restart. The snippet below sets the "minimum number of master nodes" for a 5-node cluster.

curl -XPUT host:9240/_cluster/settings -H 'Content-Type: application/json' -d '{
    "persistent" : {
        "discovery.zen.minimum_master_nodes" : 3
    }
}'
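
The value to use for a given cluster size is a majority of the master-eligible nodes, that is, n divided by 2 using integer division, plus 1. The sketch below computes it; the function name min_master_nodes is a hypothetical helper, not an Elasticsearch or API Gateway command.

```shell
#!/bin/sh
# Majority (quorum) of master-eligible nodes: integer division of n by 2, plus 1.
# min_master_nodes is a hypothetical helper name.
min_master_nodes() {
    echo $(( $1 / 2 + 1 ))
}

min_master_nodes 3   # prints 2
min_master_nodes 4   # prints 3
min_master_nodes 5   # prints 3
```

Note that a 4-node cluster needs 3 master-eligible nodes to form a majority, so it tolerates the loss of only one node, the same as a 3-node cluster.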

As you add more nodes, you may increase the number of replicas of an index to improve search performance and availability. The default value is 1. API Gateway ships 7 indices: gateway_default, gateway_default_analytics, gateway_default_dashboard, gateway_default_license, gateway_default_audit, gateway_default_cache, and gateway_default_log. The following command increases the number of replicas to 2 for the "gateway_default_analytics" index.

curl -XPUT host:9240/gateway_default_analytics/_settings -H 'Content-Type: application/json' -d '{
  "number_of_replicas": 2
}'
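
Elasticsearch never places two copies of the same shard on one node, so a replica count higher than the number of data nodes minus 1 leaves shards unassigned and the cluster in "yellow" status. The sketch below caps a requested replica count accordingly and prints the settings body. The helper names max_replicas and replica_settings are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Largest replica count that can be fully allocated on the given number of
# data nodes: one shard copy per node, minus the primary.
max_replicas() {
    echo $(( $1 - 1 ))
}

# Print the index settings body for a requested replica count,
# capped by the number of data nodes.
# Arguments: requested replicas, number of data nodes.
replica_settings() {
    requested="$1"
    data_nodes="$2"
    cap=$(max_replicas "$data_nodes")
    if [ "$requested" -gt "$cap" ]; then
        requested="$cap"
    fi
    printf '{"number_of_replicas": %s}\n' "$requested"
}

replica_settings 2 3   # prints {"number_of_replicas": 2}
replica_settings 2 2   # prints {"number_of_replicas": 1}
```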

Kibana

Refer to the linked document for hosting Kibana on a dedicated machine. To scale out, put a load balancer in front of the Kibana nodes. Refer to the linked document to avoid a single point of failure for Elasticsearch.

Filebeat

No special configuration is needed, as Filebeat scales up along with API Gateway.

Scale down

API Gateway

  • Put the node in "Quiesce" mode. The node then starts rejecting requests, and the LB routes requests to the other healthy nodes. Allow a cooling period for in-flight transactions to complete, then bring the instance down and remove it from the LB.
  • Scaling down is not straightforward for a paired gateway because of the P2P communication.
    • To scale down the DMZ nodes, remove them from the LB.
    • To scale down the green zone nodes in a paired gateway setup, disable the internal ports using the REST API, then bring the instance down. In-flight transactions will fail because the communication channel is closed.

Elasticsearch (InternalDataStore)

  • Instruct the cluster to exclude the node using the code snippet below. You can also use _ip to match by IP address and _name to match by node name.
curl -XPUT host:9240/_cluster/settings -H 'Content-Type: application/json' -d '{
  "transient" : {
    "cluster.routing.allocation.exclude._host" : "mynode"
  }
}'
  • Allow a cooling period for the cluster to rebalance the shards. Monitor the cluster health using http://host:9240/_cluster/health?pretty=true until it turns "green", then bring the instance down.
  • Make sure to readjust the "minimum number of master nodes" if required, as described in the scale up procedure.
  • You may also have to adjust the number of replicas if required, as described in the scale up procedure.
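
A "green" status alone does not show that the excluded node has been emptied, so it can help to confirm that the node holds no shards before stopping it. The sketch below parses the output of the Elasticsearch _cat/allocation API restricted to the shards and node columns. The helper name shards_on_node is hypothetical, and the name passed to it must match the node name shown in the node column, which may differ from the host name used in the exclusion above.

```shell
#!/bin/sh
# Read "_cat/allocation?h=shards,node" output on stdin and print the number
# of shards still held by the named node (prints 0 if the node is not listed).
# shards_on_node is a hypothetical helper name.
shards_on_node() {
    awk -v n="$1" '$2 == n { count = $1 } END { print count + 0 }'
}

# Example (not run here); host and mynode are placeholders:
# curl -s "http://host:9240/_cat/allocation?h=shards,node" | shards_on_node mynode
```

When the function prints 0 for the node being removed, its shards have been relocated and the instance can be brought down.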