Concurrency and clustering in wM IS

In another thread there was a discussion related to concurrent processing within wM IS.

  • Transformers in a map step.
    Previous versions of SAG docs implied that the transformers can run concurrently. There are observers who indicate this never happens, and others who indicate they’ve seen it.

Relatedly, there was discussion about 2 behaviors related to IS cluster:

  • Execution of a given service moving from one node to another in the middle of execution.
    There are observers who indicate this never happens, and others who indicate it might do that.

  • Processing of messages from UM where one IS instance gets the message from a durable subscriber topic and dispatches that to another IS instance in the cluster.
    There is interest in learning how this would be configured as it seems counter to observed “UM client cluster” behavior where each node independently retrieves and processes messages from the queue for itself, not sending it to another node.

This thread was created to continue those conversations, if desired, and to avoid further overloading the original thread, which was about a completely different topic.

Side note: I cannot recall how to mark a topic as “this is a discussion so don’t pester me to mark a response as the ‘solution’”. Or if this needs to be in a different topic/forum area. I want to avoid being messaged/emailed and having the forums constantly prompting me to “mark it solved”. Any guidance about this would be appreciated.

2 Likes

This happens in a stateful cluster. It uses a database to store the pipeline and Terracotta to store the session info. It basically splits the pipeline to run simultaneously on different Integration Servers.

This was happening because of the acknowledgement mechanism of IS on UM. It used to retrieve another message even though there was only one active trigger on a stateful cluster of Integration Servers. This should not be the case anymore. When I had this issue, the latest version was 9.12, and I recall that it is now possible to receive one message at a time from one queue. I will add the documentation link if I find it.

Found a KB article about serial message processing. It's not mine, but it is the same use case.
https://empower.softwareag.com/sl24sec/SecuredServices/KCFullTextASP/viewing/view.asp?prdfamily=webMethods&KEY=132242-11840677&DSN=PIVOTAL&DST=TCD

JMS serial triggers are not supported in an IS cluster.

Clustered IS serial triggers are not supported with UM JMS. 

It is not part of the JMS 1.1 specification, and no other JMS provider supports it.

Software AG did add support for this in v9.9 and up, but it was for native IS triggers only. 

This was done because Broker had a feature called Order By Publisher that can be used to ensure serial processing across multiple clients. We needed to enable this support for Broker replacement. 

This isn't clearly documented, unfortunately. However, note the documentation reference below:

"Using webMethods Integration Server to Build a Client for JMS",
section "Consuming JMS Messages in Order with Multiple Consumers",
has an outline only for Broker as the JMS provider. 

There isn't a way to do it with UM.

To get serial processing in a cluster, the triggers will have to be native triggers and not JMS.

This should not be the case anymore, as IS should be able to process messages in a serial manner even in a stateful cluster.

This is the interesting part in my opinion. There is a common misconception that running services as transformers will make them execute in parallel. This is even a common job interview question. Some webMethods experts expect you to run services as transformers instead of invoke steps in order to increase parallelism.

Personally, I don't think it is a good workaround even if it does execute services in parallel. A workaround is a workaround, and if there is a proper way of implementing a feature, the workaround should be avoided. Proper parallelism is implemented using the publish-and-wait services:
https://documentation.softwareag.com/webmethods/integration_server/pie10-15/webhelp/pie-webhelp/#page/pie-webhelp%2Fto-publishing_documents_8.html
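
To make that concrete, here is a rough sketch of the asynchronous publish-and-wait pattern, written as the body of an IS Java service. The document type names are made up, and the built-in service field names (documentTypeName, document, receiveDocumentTypeName, async, waitTime, tag, receivedDocument) are recalled from the Built-In Services reference, so verify them against your IS version:

```java
// Hedged sketch: fan-out with pub.publish:publishAndWait in async mode, then
// collect the replies by tag. Assumes it runs inside IS as a Java service.
import com.wm.app.b2b.server.Service;
import com.wm.data.*;
import com.wm.lang.ns.NSName;

public final class ParallelFanOutSketch {

    // Publish one request asynchronously and return the correlation tag.
    static String publishAsync(IData requestDoc) throws Exception {
        IData in = IDataFactory.create();
        IDataCursor c = in.getCursor();
        IDataUtil.put(c, "documentTypeName", "my.docs:workRequest");      // hypothetical doc type
        IDataUtil.put(c, "document", requestDoc);
        IDataUtil.put(c, "receiveDocumentTypeName", "my.docs:workReply"); // hypothetical reply type
        IDataUtil.put(c, "async", "true");      // do not block on publish
        IDataUtil.put(c, "waitTime", "30000");  // ms to wait for the reply
        c.destroy();
        IData out = Service.doInvoke(NSName.create("pub.publish:publishAndWait"), in);
        IDataCursor oc = out.getCursor();
        String tag = IDataUtil.getString(oc, "tag");
        oc.destroy();
        return tag;
    }

    // Block for the reply that matches a previously returned tag.
    static IData waitForReply(String tag) throws Exception {
        IData in = IDataFactory.create();
        IDataCursor c = in.getCursor();
        IDataUtil.put(c, "tag", tag);
        c.destroy();
        IData out = Service.doInvoke(NSName.create("pub.publish:waitForReply"), in);
        IDataCursor oc = out.getCursor();
        IData reply = IDataUtil.getIData(oc, "receivedDocument");
        oc.destroy();
        return reply;
    }
}
```

The idea is to fire all the publishes with async set to true first and only then collect the replies by tag, so the slow calls genuinely overlap instead of running one after another.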

Even if transformers execute in parallel, the MAP step will still block its thread and wait until the last transformer finishes executing. In other words, if one service call executes in 30 seconds and another in 5 seconds, the parent thread is occupied for the full 30 seconds while waiting for the responses. Again, I am not implying that this works; I don't know whether transformers execute in parallel or not.
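
To illustrate the timing point with plain Java (nothing IS-specific here), even with true parallel execution the calling thread is parked until the slowest task returns, so the 30-second call dominates:

```java
// Pure-JDK illustration: two parallel "transformers", caller still waits ~30 s.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingFanOut {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        long start = System.nanoTime();

        Future<String> slow = pool.submit(() -> { Thread.sleep(30_000); return "slow done"; });
        Future<String> fast = pool.submit(() -> { Thread.sleep(5_000);  return "fast done"; });

        // The calling thread blocks here until BOTH results are available.
        System.out.println(fast.get() + ", " + slow.get());
        System.out.println("elapsed ~" + (System.nanoTime() - start) / 1_000_000_000L + " s"); // ~30
        pool.shutdown();
    }
}
```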

Thanks for the additional info!

As noted before, I've never seen this, and I've never seen it described such that a service execution thread can move between nodes. As @Rupinder_Singh described for long-running conversations, or as @Percio_Castro1 described for checkpointed services in the face of failures, execution may go to any node.

Jumping nodes in the middle of a service execution makes little sense to me. There would be the context switch and the handoff, and then presumably the activity MUST come back to the original node so that the caller can get a response from the machine it connected to in the first place.

But I remain open to the possibility that this is possible given the right configuration. I would very much like to know what that configuration might be. It would need more detail than just “IS cluster” – IS clustering has a lot of common misconceptions too, so if anyone has more details, that would be great.

The differences in observations may be due to using UM via “native wM Messaging” vs. JMS. I’ve never used the JMS connectivity with Broker or UM – it is a complete pain compared to the simplicity of “native”. Fortunately, we’ve never had any need to connect to any other JMS provider; messaging is implemented entirely within our wM installation, so we never use JMS.

But still, your description stated that one IS node retrieved the message from UM and handed it off to another IS node. I’m not aware of any configuration where IS does that. Do you have more detail, beyond IS stateful cluster?

IMO/IME, it is indeed a misconception. I have seen interview question templates that have this too. Based upon my observations, this is just incorrect.

Appreciate the openness on this. I would offer that folks consider that there have been three people in the other thread, all of whom have been working with IS a very long time, who unequivocally state that it does not do this. I don’t say this to mean longevity is proof, but rather that if this were the case, it would seem it would have been encountered at some point. Of course, if anyone has docs/evidence that transformers in a MAP step are executed in parallel, there are many people who would be very interested.

1 Like

I learned my lesson that time and promised myself never to use it unless there is a strong reason. We were using UM as the entry point to our integrations, and I was at the beginning of my career.

It has been several years, I don’t work for that company anymore, and I don’t even live in that country either, so other than the documentation I have no proof. I might squeeze in a POC at my current work, but unfortunately not anytime soon.
https://documentation.softwareag.com/webmethods/integration_server/pie10-15/webhelp/pie-webhelp/#page/pie-webhelp%2Fto-clustering_overview_2.html%23

You know, (almost) everything is a web service in Integration Server, and as long as you have the pipeline you can resubmit/trigger any service payload at any given time. Stateful clusters keep the session data in Terracotta and keep the pipeline information in the database, like when we enable auditing and save the pipeline upon failure/always, etc. It keeps the pipeline (basically the inputs of the service to be executed) and, from there, any Integration Server can pick up that pipeline and execute it. It doesn’t necessarily mean they run in parallel, again. If the flow of the service is iterative, it won’t execute in parallel; rather, the pipeline will be executed on whichever node is available at the time.

This caused us a problem when consuming messages from UM. We set one trigger enabled and one trigger disabled so that we would limit the active triggers for that queue. But since any Integration Server could take over the message and continue processing, the node with the disabled trigger took over the execution for that message, and the node with the enabled trigger consumed another message because it didn’t have an active thread for that queue at that time. I tried changing the acknowledgement mechanism to fix the issue, but it required deep manual development at the time. If the enabled trigger had waited to send the ack, or had waited to consume another message before sending the ack, my problem would have been solved. I don’t remember which was the case, though; it might be either. It certainly felt like too much engineering would be required in that architecture, and we were using Integration Server only as a message consumer back then, so we just disabled one node to fix the issue. It wasn’t an elegant solution, but it was the simplest and required the least work.

We may have to decompose this topic into even smaller subtopics :laughing:

I hope I comprehended what is being discussed well enough not to throw this topic onto an unrelated tangent, but here are my 2 cents when it comes to messaging:

As long as two (or more) Integration Servers have the same client prefix when connecting to the messaging provider (UM or Broker), it will be seen as a single client (whether stateful clusters are used or not). This essentially creates a single queue in the messaging provider for that given client. Two (or more) Integration Servers should never be able to pick up the same message at the same time because the message can only be dequeued once since it’s only one queue being shared. Of course, if one IS picks up the message and fails to acknowledge it within the specified ack window, that message will become available to be picked up by other IS’s with that same client ID. If I understand correctly, however, you’re saying that an IS, whose trigger was disabled, somehow had its trigger service executed with a message that was originally subscribed by that same trigger on a different IS, is that right?

When it comes to delivering message to specific IS’s, it is possible by having a connection that has a different client prefix in each IS. You can then use pub.publish:deliver to deliver to a specific IS, but I have personally not seen a use case that made this option worthwhile. Similarly, you can broadcast a single message to all IS’s in a cluster via a publish as long as the IS’s use a connection with a different client prefix. Again, this is possible because a different client prefix results in separate queues being created in the messaging provider and it’s not necessarily a function of a stateful IS cluster. The Process Engine used this approach via the PE_NONTRANSACTIONAL_ALIAS connection to broadcast messages to different Integration Servers in a stateless cluster.
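
For illustration, here is a rough sketch of that delivery pattern as an IS Java service body. The document type and destination ID are hypothetical, and the field names (documentTypeName, document, destId) are recalled from the Built-In Services reference, so double-check them for your version:

```java
// Hedged sketch: deliver a document to one specific client (e.g. one IS that
// connects with its own client prefix) instead of publishing to all subscribers.
import com.wm.app.b2b.server.Service;
import com.wm.data.*;
import com.wm.lang.ns.NSName;

public final class DeliverToNodeSketch {
    static void deliverTo(String destClientId, IData doc) throws Exception {
        IData in = IDataFactory.create();
        IDataCursor c = in.getCursor();
        IDataUtil.put(c, "documentTypeName", "my.docs:nodeCommand"); // hypothetical doc type
        IDataUtil.put(c, "document", doc);
        IDataUtil.put(c, "destId", destClientId); // the target client's ID/prefix (assumed field name)
        c.destroy();
        Service.doInvoke(NSName.create("pub.publish:deliver"), in);
    }
}
```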

When it comes to the behavior of stateful clusters:

I honestly haven’t heard of the feature that allows for different steps of a given service to be executed in different servers in the cluster. Yes, I’m aware that session objects are stored in the distributed Terracotta cache, allowing perhaps for a service to persist information into a session object that can be retrieved by another IS in the cluster (e.g. a shopping cart use case). I’m also aware of the “checkpoint restart” pattern that the documentation recommends but which does not come in handy in the real world very often (if ever). However, automatic storage of the pipeline into the database and automatic orchestration of the execution of different steps across different servers is news to me. Sounds interesting… but also very expensive in terms of performance. I would love to hear details so I can give it a try.

Percio

2 Likes

Yes. If you check the IS clustering guide, there are steps you need to configure in order to have a stateful cluster. One of them is having a distributed cache to store session data, but this has nothing to do with the service payload transfer; it is just to keep expensive session data shared across the cluster. The other is configuring the JDBC pools so that they can keep their session data in the database. I believe they were ISCoreAudit and ISInternal, but only one of them might be enough. If you check the architecture of the stateful cluster, you will see there is a database component. This database is used as a layer between Integration Servers to share pipeline data. It is not that expensive, since IS already does the same thing with its internal DB. You only add a network layer so that load can be distributed equally, an external RDBMS usually performs better than the internal DB, and if you are already saving the pipeline upon failure, the drawback is almost none.

If you don’t configure the database or the Terracotta Server Array, that means you don’t have a stateful cluster; an Integration Server without the stateful cluster configuration is a stateless one. The part about the prefixes and such is also a requirement, but it is irrelevant here. The issue was caused because the active node didn’t have a process running in the background, hence it assumed it had finished processing the message and consumed another one. The other node, which had its trigger disabled, was just running the payloads it was able to run. So whenever the node with the inactive trigger took over the payload, the other one thought it had finished processing and consumed another message. This was 8 years ago, and I believe this behavior of threads mixing together has been fixed already. It has nothing to do with JMS or prefixes; it was a multi-threading problem.

This is not entirely true, since the internal database also consumes disk I/O, memory, and CPU time. Configuring external DBs is good for performance. The extra overhead caused by the network layer can be ignored, since load balancing is more efficient with stateful clusters.

I guess we can agree to disagree on whether storing input pipelines for services is expensive; after all, “expensive” is a relative term that depends on pipeline size, physical resources, and how long your service spends doing other things. I will say, though, that if it weren’t expensive, it would be on by default and everyone would do it, but that’s not the case.

But I digress… let’s pretend for a second that it’s not expensive. I configure two IS servers to be in a stateful cluster with a shared database and a shared Terracotta cache. I then configure auditing so that the input pipeline for all services is saved. So far, so good. The piece I’m missing is: what do I do next to cause a single service execution to be shared across the two different servers?

Percio

1 Like

Using stateful clusters has benefits as well as drawbacks. It certainly isn’t easy to configure, that’s one thing. The benefit of having it is reliability. Assume there is a crash in production. Since it’s a production environment, there will be in-progress service runs when it crashes. If it is a stateful cluster, the request will be completed on another node, so no client will be affected and no data will be lost.

This is not a valid argument. Half of the world is driving in the opposite direction according to the other half, but that doesn’t make driving on the left or the right any more correct than the other. It is not the default because we need two extra components to implement it. Stateless clusters are much easier to configure.

From the documentation:

Failover Support for Stateful Clusters
Failover support enables recovery from system failures that occur during processing, making your applications more robust. For example, by having more than one Integration Server, you protect your application from failure in case one of the servers becomes unavailable. If the Integration Server to which the client is connected fails, the client automatically reconnects to another Integration Server in the cluster.
Note: Integration Server clustering provides failover capabilities to clients that implement the webMethods Context and TContext classes. Integration Server does not provide failover capabilities when a generic HTTP client, such as a web browser, is used.
You can use failover support with stateful clusters.

I would love for this to be true, but it isn’t. There’s nothing in a stateful cluster that causes the transaction to be automatically restarted on another node in the cluster, let alone restarted from where the other server left off. State information has to be maintained by the code along the way for this to be true (either in the session object, in some other object in the distributed cache, or via some other shared component like a database), and the client has to retry the transaction to cause it to be transferred to the other server. If the client is a Java client using a TContext or Context class, the failover will be done automatically, but again, it happens in the form of a retry from the client. It’s not something that happens on the server.
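
To make that concrete, here is a minimal sketch of what the client-side retry looks like with the plain Context class; per the docs, TContext does roughly this for you. Hostnames, credentials, and the service name are hypothetical, and the key point is that the retry re-runs the whole service from the top on the other node:

```java
// Hedged sketch of client-driven failover: try node 1, and if the call fails,
// re-invoke the same service on node 2. Nothing resumes mid-execution server-side.
import com.wm.app.b2b.client.Context;
import com.wm.data.IData;
import com.wm.data.IDataFactory;

public final class ClientSideFailover {
    private static final String[] NODES = { "is-node1:5555", "is-node2:5555" }; // hypothetical hosts

    static IData invokeWithFailover(IData input) throws Exception {
        Exception last = null;
        for (String node : NODES) {
            Context ctx = new Context();
            try {
                ctx.connect(node, "Administrator", "manage");      // hypothetical credentials
                return ctx.invoke("my.folder", "myService", input); // hypothetical service
            } catch (Exception e) {
                last = e; // node down or call failed; fall through to the next node
            } finally {
                try { ctx.disconnect(); } catch (Exception ignore) { /* not connected */ }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        invokeWithFailover(IDataFactory.create());
    }
}
```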

The only automated retry I’m aware of that gracefully causes a transaction to move from one server to another in the event of a crash, for example, is when messaging is used. But then again, this is not a feature of the stateful cluster but of the messaging infrastructure, given that if the message is not acknowledged by one IS, it will eventually be made available for pickup by a different IS using the same client prefix. Even in this scenario, though, execution will restart from the beginning of the trigger service. Nothing will cause the transaction to restart from the point of failure unless the code maintains and checks for state.

Percio

2 Likes

As noted in the doc, failover occurs ONLY with HTTP clients that use the wM IS API. Which, IME, is basically nobody – I’ve never seen anyone use the IS client libraries. No other clients support doing that, and the server has no role in this failover. It is entirely up to the client using Context/TContext to see the error and resend, which will go to the other node. It has been this way ever since I can remember.

That description does not indicate that a service that is in the middle of executing can suddenly jump to running on another node. It is akin to the checkpoint-and-resume behavior @Percio_Castro1 describes.

This checkpoint is any step in the flow. After any step finishes, a new pipeline is created and saved into the database.

This post is pretty old. Back then (I was still in college at the time) they were using Oracle Coherence as the cache layer, if my memory is accurate. After 20 years I wouldn’t assume everything is still the same. Sometimes I see workarounds implemented decades ago because of a product incapability, and people still implementing the same workaround even though that incapability is long gone. It’s like applying Windows 95 patches to Windows 11. There was no Universal Messaging or Terracotta back then; not even Windows 7.

This claims the opposite of the documentation, which has had the same information for decades. Are you absolutely sure it wasn’t a bug or a configuration error? And if this is a bug (since you claim it doesn’t work as the documentation says, it should be a bug), has it really gone unfixed for so many years? I am pretty sure it continues processing if the configuration is done properly.

Below is the chart of what clustering does and what it doesn’t do. If you claim it doesn’t do something from this chart, you should create a support ticket and ask them to fix it, IMO.
https://documentation.softwareag.com/webmethods/integration_server/pie10-15/webhelp/pie-webhelp/#page/pie-webhelp%2Fto-clustering_overview_10.html%23

I can absolutely confirm that service inputs and outputs are not stored in the database for sharing across multiple servers. You can save parts or all of the pipeline for audit logging, but that is not recommended as it is a performance issue unless used selectively. And that feature is available irrespective of whether a cluster is present or not.

It’s not technically possible for every step in a service to serialize its input and output, as that would be terribly slow. And even if that weren’t so, keeping track of what executed where wouldn’t be simple. Stateful clusters only allow session state to be shared, which means that if there is data that needs to be shared between multiple calls, for example a call to add to a cart followed by another call to check out, then both calls can access the same state even when the second call goes to a different server because of load balancing.

Object serialization is one of the most expensive operations in Java, and the pipeline is a very complex object with multiple object references. Serializing all of that is so expensive that it’s never recommended. Serializing it to a database at every step would mean the server would not even be usable.
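
As a rough, pure-JDK illustration of that cost (a HashMap standing in for a pipeline, which actually undersells the real IData object graph):

```java
// Serialize a modest pipeline-like structure once and report size and time.
// Doing this after every FLOW step, per request, multiplies the cost accordingly.
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

public class SerializationCost {
    public static void main(String[] args) throws Exception {
        Map<String, Object> pipeline = new HashMap<>();
        for (int i = 0; i < 1_000; i++) {
            Map<String, Object> row = new HashMap<>();
            row.put("id", i);
            row.put("payload", "some business data ".repeat(20));
            pipeline.put("record" + i, row);
        }

        long start = System.nanoTime();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(pipeline); // one "checkpoint" of one pipeline
        }
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(bytes.size() + " bytes serialized in " + micros + " µs");
        // ...and that is before any network hop or database write.
    }
}
```

And that is one checkpoint of one pipeline; multiply it by every step of every concurrent service invocation, plus the database round trip, and the overhead is hard to ignore.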

I am making these arguments just to try to convince everybody. Otherwise, I know for sure that this doesn’t happen.

Rupinder

1 Like

I’ve seen nothing that indicates each step is checkpointed and that execution can jump back and forth between nodes during a single invocation. Multiple calls, whether or not they are within a stateful conversation, can certainly bounce between nodes as the LB routes each one per its config (round robin, least busy, etc.). To my knowledge, IS itself does nothing explicit to manage which node executes a given request. But that is indeed what this thread is for – to fill any gaps I may have. I appreciate your continued input.

A reasonable point. I am not making any assumption; I’ve just not seen anything that indicates this is something that is done. The docs you’ve shared do not describe what you describe. There is nothing about storing the pipeline in the DB after every FLOW step, and nothing about each step being a checkpoint – only that the FLOW service must explicitly invoke a service to establish a checkpoint. In the doc link you shared in the response to @Percio_Castro1 there is one thing that sticks out for me:

The client automatically retries the service…

This is mentioned in most of the lines of that chart. And where it isn’t mentioned, the doc indicates a manual restart of the service is needed. What is not emphasized here, but is noted on another page, is that the only client type to which this is available is HTTP clients that use the Context/TContext classes. Not sure how many in the SAG wM universe actually use those. This caveat/limitation has been in the documentation since the beginning; they have never changed it.

Of course, if I’m missing or misreading what has been shared please chime in.

I recall when Terracotta replaced Coherence. I was hopeful about possible additional capabilities but, if memory serves, it was just a different implementation of the same capabilities – possibly simply to get away from Oracle. (Replacing Broker with UM was IMO a similar “lateral” move, save for the HA/clustering feature, but that’s a different topic. :slight_smile: )

Just a quick note to share my view that this exchange is awesome! I really appreciate the sharing of info and experiences from everyone!

I think we’re all after the same thing – correct understanding of the capabilities of wM IS and IS clustering.

1 Like

In the case of a Process Engine cluster configured across multiple IS instances, BPM execution works with a subscription trigger and a transition trigger. The QoS setting (Optimize Locally) determines whether a transition document is published to UM/Broker, and the next step of the BPM model execution could then run on a different IS that is part of the PE cluster.

In the process instance monitoring view, you can see which step executed on which IS.

This happens only because there is pub/sub happening between those steps via a messaging product, not because of IS by itself.

When a series of steps that are part of a flow service executes, they all execute on the same server where the service was invoked.

1 Like

I don’t see anyone showing any proof that it doesn’t do what the documentation indicates. Everybody is referencing their past “experiences”. The problem with experience is that it is not documented and it is subjective. Engineers don’t rely on experiences; they rely on proofs of concept. It may very well be a configuration error that kept you from observing this node-switching behavior. I might be wrong as well, and so might the documentation, but I refuse to believe the documentation has been providing inaccurate information for decades. Someone would have tried this feature, failed, created a ticket, and demanded SAG fix it by now.

Since all of you strongly disagree with me, can you tell me when each of you last used a stateful cluster and observed its behavior? I used stateful clusters with versions 9.6 and 10.5 and am working on a POC for version 10.15 on a Kubernetes cluster. This is one of my test cases. I will update this thread once I have that test data available.

If what that line meant was the client’s manual retries, it wouldn’t make sense, would it? If it relied on the client’s manual retries, an F5 would be enough for that feature and there wouldn’t be any benefit to having a stateful cluster. Any operation that fails mid-execution can be retried anyway, as long as you have an F5 as the cluster IP. The F5 would know a server is offline and would deliver the request to another one, and it would always start from the beginning, not from the middle.

Engin,

I’ve gone through the documentation and nothing I have stated contradicts it. In fact, the link you provided as a response to my post reiterates what I’ve stated. The words “the client automatically retries/restarts” are used over and over. At the same time, I’ve looked for information that supports your claim, but I’m failing to find it.

One thing I think we can safely establish is that the documentation could be a bit clearer. Certain sections leave room for (mis)interpretation, which leads to this type of confusion. For example, the point that a couple of us have been trying to make – that automatic “failover” is a responsibility of the client – is just briefly touched upon in the section “Failover Support for Stateful Clusters”, and they do so in the form of a note. Given the title of that section, they could have gone into more detail.


The authors do elaborate a bit more on those two Java classes later on, though.

Now, regarding checkpointing after each step and automatic failover, it sounds like a really cool feature. However, let’s consider the daunting technical details of the implementation for two very simple use cases:

(1) A Flow service with a LOOP. Are the steps in the LOOP also checkpointed so execution can recover mid-LOOP? If so, is the pipeline for each step of each iteration of the LOOP stored in the database? Consider a LOOP that iterates hundreds or thousands of times and the impact that would have on a simple mapping service.

(2) A service that calls multiple adapter services participating in a single LOCAL_TRANSACTION. What happens if the service begins to execute on one node, the JDBC transaction is initiated there, an adapter service is executed there, but then it fails over to another node? How could the transaction be continued from the other node?

These are just two simple use cases that make the concept fairly impractical.

Having said all this, I completely agree with you that as engineers, we can take a break from talking about documentation and hypotheticals and we can put this thing to the test. If you will soon have a stateful cluster available that you can play with, I’ll wait for your results. Otherwise, I’d be happy to spin one up.

Thanks for your contributions here,
Percio

@engin_arlak the problem with your ask is that you are asking people to prove a negative. I can confirm that I am not speaking from guesswork. I don’t know of any documentation that says each FLOW step is executed by serializing the pipeline to disk. And I have been doing webMethods for 25 years, out of which some were as the head of the webMethods product line. I can confirm this was never the intended behavior or a side effect of anything planned. The only ways to have services restart on another server are by using the Process Engine, Messaging, or Guaranteed Delivery. And none of those have the capability to restart in the middle of a service. The only way to do that is to custom-code it using checkpoints, which are hardly ever used.

1 Like

My POV is the documentation does not state the behavior you have described. Nowhere have I seen it state that it records the pipeline for every FLOW step*** and the IS nodes in cooperation somehow determine which node is going to run a step to move the FLOW service execution forward. Nor have I seen that IS itself will dispatch a message from a queue to another IS instance.

*** the PE/PRT “steps” as noted by @Senthilkumar_G are different, and use messaging to “hop” nodes – and I assume you’re not referring to this

I have not used them in 10+, maybe 15+, years except for PE/PRT, where it is used because it is required for multi-node operation of PE/PRT. Other than that, I have never had a need. But in the spirit of “engineers don’t rely on experiences”, I’m still hoping to find documentation that supports your descriptions. Side note: proofs of concept are experiences too, no? :slight_smile:

It does not say, nor mean, client manual retries. It means client automatic retries – but only if the client is using Context/TContext. It applies to nothing else. These, too, will start at the beginning on retry, unless the service being called has explicitly implemented checkpointing and has logic to pick up from the last checkpoint.

Stating the obvious and I know we all get this, but just to be clear: Stateful cluster is useful for state sharing, when there are multiple calls for a given interaction from a client. Call one goes to node 1, it stores the state in session or pub.storage, returns, call two, perhaps to a different service, goes to node 2 (or 3 or 4) and that node can read the state from the session or pub.storage, etc. The stateful cluster is for managing the multiple interactions of a client with the server cluster, not for the clustered IS nodes to bounce the execution around themselves.

If memory serves, IS clustering did indeed try to provide load balance when it first came out. That feature was removed long ago.

Correct. To continue mid-execution requires that the service be coded to explicitly support checkpoints. If there is only one IS (which there never is for production, but just mentioning it as a thought exercise), then the service can resume mid-execution as needed without an IS stateful cluster. If there are multiple nodes, an IS stateful cluster allows the subsequent call to be routed to any node in the cluster, with the service using the shared Terracotta-stored state to know where to resume. But again, based on my understanding of the documentation, this requires the service to be explicitly coded to support it. It is not automatic, with the runtime tracking each step.

I think everyone is very open to learning that this behavior does exist. Just have not seen any docs that provide that so far.

1 Like

By the way, also stating the obvious, but I want to make sure we’re all on the same page: even when the “checkpoint restart” pattern is used, execution of the service still starts from the beginning of the top-level service. As far as I know, there’s nothing in the platform that allows execution to start at an arbitrary step in the service. It’s just that the service implements a bunch of if…then…else… statements (or BRANCHes in our world) to determine where to start from.

Also, this pattern is not dependent on a stateful cluster. The “checkpoint” can be implemented in a number of different ways. Using pub.cache or pub.storage are two options, but a custom implementation using some other shared persistence layer would work just the same. The pattern is not even platform-dependent for that matter.
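
For anyone who hasn’t seen the pattern, here is a minimal sketch in plain Java. SharedStore is a hypothetical stand-in for whichever persistence you pick (pub.storage, pub.cache, a database table); the key point is that execution always re-enters at the top and the code itself decides which steps to skip:

```java
// Hedged sketch of checkpoint-restart: the caller retries the whole service,
// and stored state determines which already-completed steps are skipped.
import java.util.Optional;

interface SharedStore {                       // hypothetical shared-persistence abstraction
    Optional<String> get(String key);
    void put(String key, String value);
    void remove(String key);
}

public final class CheckpointRestart {
    private final SharedStore store;
    CheckpointRestart(SharedStore store) { this.store = store; }

    void process(String orderId) {
        // Always enter at the top; the stored checkpoint says what is already done.
        String checkpoint = store.get("order:" + orderId).orElse("START");

        if ("START".equals(checkpoint)) {
            reserveInventory(orderId);
            store.put("order:" + orderId, "INVENTORY_RESERVED");
            checkpoint = "INVENTORY_RESERVED";
        }
        if ("INVENTORY_RESERVED".equals(checkpoint)) {
            chargePayment(orderId);
            store.put("order:" + orderId, "PAYMENT_CHARGED");
            checkpoint = "PAYMENT_CHARGED";
        }
        if ("PAYMENT_CHARGED".equals(checkpoint)) {
            shipOrder(orderId);
            store.remove("order:" + orderId); // finished; clear the checkpoint
        }
    }

    private void reserveInventory(String id) { /* step 1 */ }
    private void chargePayment(String id)    { /* step 2 */ }
    private void shipOrder(String id)        { /* step 3 */ }
}
```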

Last but not least, for what it’s worth, I never found this pattern to be useful in the real world. If a service is complex enough that it requires checkpointing multiple times in its execution, then in my experience, it should be refactored or redesigned.

Percio

1 Like