Transformers run concurrently or in parallel

engin_arlak · February 20, 2024, 4:27pm

How to connect to the SQL DB from inside of a Java Service?

Java is only a nightmare without a secret ingredient. Some time ago I wrote myself a helper that can analyze a pipeline’s signature and generate Java classes for all input and output document types, with preset code to read from and write into the pipeline (similar to what the stock “Generate code for implementing” option provides), so that only the data transformation itself needs to be done manually. It’s somewhat crude, and generated class names can get messy for complex structures, but the ability to manipulate data in any way that Java allows is the reason I prefer it to flow services for any serious task. And as I said above, this particular task does require some juggling with received data, the additional SQL calls not being the biggest of the inconveniences.

Most of the data transformations are automatic, you shouldn’t need custom Java service for that purpose.

I usually don’t like overdoing Java services. It can be a real nightmare if you need to make changes to that service years later. Are there supposed to be external calls between database queries? If not, you can still use SQL code to implement your logic. I don’t like overdoing SQL queries either, but for your requirement it might make sense.

reamon · February 20, 2024, 5:38pm

Transformers are not executed in parallel. The order of their execution is undefined, so the documentation says treat them as though they run in parallel, but they do not actually do so.

Edit: I cannot find existing docs that state “can view transformers as if they run in parallel” so perhaps that was removed at some point in the past.

reamon · February 20, 2024, 5:47pm

Lots of discussions on the boards about Java vs. FLOW. I usually object to the generalization that Java is “better” for complex (or “serious”) activity and FLOW is for simple stuff. I disagree. Java services are certainly a necessity in the Integration Server space but IMO, these are to be minimized. YMMV.

engin_arlak · February 20, 2024, 5:48pm

But it won’t be sequential, will it? It won’t wait for the previous service call to be completed before executing the next one. If that’s not the case, it’s a very common misconception. They even ask this question frequently in job interviews.

Personally I don’t prefer using services as transformers anyway. There is a proper mechanism for executing services in parallel. Using transformers for that purpose is a workaround IMO, and all workarounds should be avoided if there is a proper solution for a requirement. I use the mechanism below to execute services in parallel.

https://documentation.softwareag.com/webmethods/integration_server/pie10-15/webhelp/pie-webhelp/pie-webhelp/to-publishing_documents_7.html#

reamon · February 20, 2024, 6:11pm

From my observations, yes, it is sequential. From the docs:

Transformers act as collection of INVOKE steps embedded in a single MAP step. However, transformers in a MAP step are independent of each other, do not execute in a specific order, and might not execute in the same order each time the MAP step runs. Consequently, the output of one transformer cannot be used as the input to another transformer.

There was a thread many moons ago that covered this briefly. In that thread it has a quote from older docs:

When inserting transformers, assume that webMethods Integration Server concurrently executes the transformers at run time.

This is undoubtedly the source of the apparent misconception and is no longer in the docs. I have not come across any evidence that indicates transformers run concurrently. If anyone has such evidence, I would be very appreciative if it could be shared.

I too use published docs (sometimes locally) to introduce parallelism or async activity when needed.

engin_arlak · February 20, 2024, 6:17pm

I didn’t know where that rumor originated. I never liked that approach even when I thought it was running in parallel. Now I have a very good reason to strongly object. Thanks for the info.

reamon · February 20, 2024, 6:37pm

I agree that using transformers should be limited. The most useful scenario is what the docs describe: “perform multiple data transformations on the pipeline contents in a single flow step” – mapping elements in doc A to elements in doc B with representation changes. E.g. data object to a string. true/false or 1/0 to yes/no.

They are useful for limiting the scope of the pipeline to the called service too – but that’s usually a rare need and can be managed with the scope property for the FLOW step.

Holger_von_Thomsen · February 20, 2024, 7:01pm

Hi,

additionally, remember that transformers will only be executed when their output is mapped to the output pipeline of the MAP step in which they are invoked.

When transformers are running concurrently, this does not imply that they are really running in parallel.
It only means that some sets of the transformers might be invoked in parallel, but these sets do not need to be the same for each invocation of the MAP step.

Regards,
Holger

reamon · February 20, 2024, 8:27pm

Do you have any docs or other information that indicates that this ever happens? Based upon the info I’m aware of, this never happens. To my knowledge, transformers are invoked serially (order is not defined) within the same thread as the service containing the MAP step.

engin_arlak · February 20, 2024, 8:35pm

He says it doesn’t imply that …

It is still possible though. If you implement stateful clusters, some of the work can be passed down to another node. I remember back in 9.6 I wasn’t able to consume messages in their respective order. They were always passing some of the work to another node.

reamon · February 20, 2024, 9:04pm

@engin_arlak Perhaps I’ve misunderstood @Holger_von_Thomsen’s comments, but I took them to mean that, for example, of 5 transformers in a MAP, perhaps 3 of them are invoked in parallel as a set. I’ve never seen anything along those lines.

Can you imagine trying to manage that within the run-time code? And waiting for the thread joins? And never having a thread-lock due to contention? If transformers were run in parallel, I would think we’d see a bunch of discussions on the forums about various issues related to that (other than the common “transformers can’t loop over a list”).

I assume you mean messages from Broker or UM. If so, that’s due to trigger configuration, not transformers.

Service execution does not jump between IS instances mid-execution. Once a server starts a service, it runs there to the end. I think that was another common misconception about IS clusters – that each step in an executing service could run on any node. That is not the case.

engin_arlak · February 20, 2024, 9:22pm

This is not true in a stateful cluster, at least it wasn’t when I had the issue at 9.6.

Broker was capable of consuming messages in order, UM wasn’t. Let me explain what made me strongly think it wasn’t possible to consume in their respective order.

When we subscribed to a queue (it was a topic with a durable subscriber, but that’s not related to the issue), we tried disabling the trigger on one node and kept it active on the other. We had a stateful cluster with 2 nodes, we set the trigger type to serial, and then we overloaded the queue. What surprised us was that the active node was consuming a message, passing the payload to the other node right away, and then consuming another message. We couldn’t limit this to a single message at any given time. Even with a disabled trigger, the nodes passed their workload to each other. This is a feature of stateful clusters. I created a support ticket and asked for a fix, and they told me that Integration Server 9.6 doesn’t support consuming messages in serial order when it is subscribed to UM. They told me to create a feature request, which I did. I believe it is possible to consume messages serially now; I think it was first introduced in 10.5 or a later version. It still passes workload to other nodes, but now it should be possible to limit the number of active threads per queue across the cluster. I haven’t needed to implement the same use case again, but I remember it’s written in the documentation.

This was one of my “what the heck” moments back then.

reamon · February 20, 2024, 11:10pm

Interesting. I’ve never seen nor experienced this. In any type of clustering. But I guess if the service fails midway, and the client “resumes”, it could go to another node. But I’ve never implemented anything like that. All I’ve ever worked with is stateless services (IS clusters too).

But even so, I’d be surprised if transformers were part of the reason. Guess I’d need to see more of what you had implemented to understand.

I have never seen that behavior. But we don’t have IS clusters (we have 2 nodes in an LB cluster, IS cluster not enabled). In previous lives there were IS clusters, but I never saw that then either. But I guess I can see how it might do that – the messaging client on any node retrieves from Broker/UM and dispatches it to any node in the IS cluster. But that’s speculation on my part about how it works. I’d be interested in learning more about the setup you had.

But I still am skeptical of an execution of a FLOW service changing nodes mid-execution. You’ve described how things seem to get dispatched for messaging, which makes sense. But not in the middle.

Yes, we have this implemented with UM. We waited to move off of Broker until it supported that. The trigger can be serial or concurrent with 1 thread – in both cases only one thread in the cluster (not IS clustering, but 2 identically configured IS instances) is active for a given durable subscriber. We use concurrent with 1 so that with EM we can browse the queue. When serial, cannot do that.

Perhaps we can continue this exchange in personal messages. I’m very interested in learning more about what you experienced.

jahntech.cj · February 21, 2024, 8:19am

I would be very interested, too!

engin_arlak · February 21, 2024, 2:43pm

We can create a new topic for this since both of you are interested, some other people might be as well.

Service wasn’t failing when this was happening. It is not directly related to transformers, but stateful clustering makes Integration Servers act as a single Integration Server, and they both can work on the same payload. In theory (I didn’t have any reason to observe this behavior) they can work on the same payload and start from different segments. That segment can be a transformer, since they are called when their output is needed.

Percio_Castro1 · February 21, 2024, 3:24pm

I know the topic has moved away from the original subject a bit, but for what it’s worth, in 20+ years of dealing with the Integration Server, my experience is the exact same as Rob’s. Unless you specifically implement checkpoints into your services that allow the transaction to be picked up from where it was left off, there’s nothing about a stateful or stateless cluster that gives you this functionality. When a transaction starts executing on a node in the cluster, it will complete on that node unless:

(1) the service explicitly transfers execution to another node via, for example, a publish/deliver, JMS send, HTTP call, or a pub.remote call, or;

(2) the service keeps checkpoints throughout the execution, and upon failure, the client resubmits the transaction causing it to be routed to another node in the cluster via a load balancer, and the other node uses the checkpoint to pick up the transaction from where the first node left off.

The documentation suggests that checkpoints can be created by using pub.storage or pub.cache service, but I’ve never seen this implemented in the field and I would personally discourage this practice:

https://documentation.softwareag.com/webmethods/integration_server/pie10-15/webhelp/pie-webhelp/index.html#page/pie-webhelp/to-cache_folder.html

https://documentation.softwareag.com/webmethods/integration_server/pie10-15/webhelp/pie-webhelp/index.html#page/pie-webhelp%2Fto-clustering_overview_9.html%23

When it comes to transformers, my experience is again the same as Rob’s in that they are not executed in parallel. I’d say I’m 100% sure, but I want to leave room for growth, so I’ll say 99.99%. Their graphical representation gives this false perception, but they are executed in sequence. It’s a fairly simple thing to test.

HTH,
Percio
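
One way to run that test, sketched in plain Java: have the service used as a transformer record the thread it ran on and its start/end times, then check the samples for overlap. This is only the measuring logic. The Integration Server wrapper (reading a label from the pipeline, writing the sample to a log) is left out, and `TransformerProbe`, `probe`, `ranSerially` and `runInParallel` are hypothetical names, not part of any webMethods API. The parallel run is there only as a control to show the check can fail.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical probe: records where and when each call ran. */
class TransformerProbe {

    /** One observation: which thread ran the call, and over which time interval. */
    static final class Sample {
        final String label;
        final String thread;
        final long startNanos;
        final long endNanos;

        Sample(String label, String thread, long startNanos, long endNanos) {
            this.label = label;
            this.thread = thread;
            this.startNanos = startNanos;
            this.endNanos = endNanos;
        }
    }

    /** Body of the probe: hold the thread briefly so overlapping calls would be visible. */
    static Sample probe(String label, long holdMillis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(holdMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Sample(label, Thread.currentThread().getName(), start, System.nanoTime());
    }

    /** True if every sample ran on the same thread and no two intervals overlap. */
    static boolean ranSerially(List<Sample> samples) {
        for (int i = 0; i < samples.size(); i++) {
            for (int j = i + 1; j < samples.size(); j++) {
                Sample a = samples.get(i);
                Sample b = samples.get(j);
                if (!a.thread.equals(b.thread)) {
                    return false;
                }
                if (a.startNanos < b.endNanos && b.startNanos < a.endNanos) {
                    return false;
                }
            }
        }
        return true;
    }

    /** Control case: runs the probe on separate threads on purpose. */
    static List<Sample> runInParallel(int count, long holdMillis) {
        List<Sample> samples = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String label = "parallel-" + i;
            Thread t = new Thread(() -> samples.add(probe(label, holdMillis)), "worker-" + i);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new ArrayList<>(samples);
    }

    public static void main(String[] args) {
        List<Sample> serial = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            serial.add(probe("serial-" + i, 50));
        }
        System.out.println("serial calls detected as serial:   " + ranSerially(serial));
        System.out.println("parallel calls detected as serial: " + ranSerially(runInParallel(3, 50)));
    }
}
```

On Integration Server the idea would be to drop the same probe service into one MAP step several times, map each output, run the flow, and feed the recorded samples to `ranSerially`. If the posts above are right, it should report one thread and no overlap.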

rupinder1 · February 21, 2024, 3:45pm

This thread kind of went off the rails so I don’t want to add to confusion by jumping into something I did not completely comprehend, but wanted to make sure there is no confusion on the execution of transformers in a flow service. Transformers do not get executed in parallel and are always executed in the same thread that the parent flow is executing in. The only exceptions to transformer behaviors are:

Multiple transformers in a single MAP step are not executed in a predictable order.
They work on a subset of the pipeline variables that are explicitly mapped into their inputs and out of their outputs.
They don’t get executed if no mapping exists into their input or out of their output.

Rupinder
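
The three behaviors listed above can be pictured with a toy model in plain Java. This is not the Integration Server runtime and uses none of its APIs. `MapStepModel`, `Transformer`, `run` and `demo` are hypothetical names, and the choice to read every transformer's inputs from a copy of the pipeline taken before the MAP step is an assumption made to reflect the documented rule that one transformer's output cannot feed another.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy model of a MAP step with transformers. NOT the Integration Server runtime. */
class MapStepModel {

    /** Hypothetical stand-in for a transformer: a service plus its input and output links. */
    static final class Transformer {
        final String name;
        final Map<String, String> inputLinks;   // pipeline variable -> service input
        final Map<String, String> outputLinks;  // service output -> pipeline variable
        final Function<Map<String, Object>, Map<String, Object>> service;

        Transformer(String name, Map<String, String> inputLinks, Map<String, String> outputLinks,
                    Function<Map<String, Object>, Map<String, Object>> service) {
            this.name = name;
            this.inputLinks = inputLinks;
            this.outputLinks = outputLinks;
            this.service = service;
        }
    }

    /** Runs the transformers and returns the names of those that executed, in run order. */
    static List<String> run(Map<String, Object> pipeline, List<Transformer> transformers) {
        // Assumption: inputs are read from the pipeline as it was before the MAP step.
        Map<String, Object> snapshot = new HashMap<>(pipeline);
        List<Transformer> order = new ArrayList<>(transformers);
        Collections.shuffle(order);              // order is not predictable
        List<String> executed = new ArrayList<>();
        for (Transformer t : order) {            // plain loop: serial, on the caller's thread
            if (t.outputLinks.isEmpty()) {
                continue;                        // output not mapped: transformer is skipped
            }
            Map<String, Object> in = new HashMap<>();
            for (Map.Entry<String, String> link : t.inputLinks.entrySet()) {
                if (snapshot.containsKey(link.getKey())) {
                    in.put(link.getValue(), snapshot.get(link.getKey()));  // only mapped variables
                }
            }
            Map<String, Object> out = t.service.apply(in);
            for (Map.Entry<String, String> link : t.outputLinks.entrySet()) {
                if (out.containsKey(link.getKey())) {
                    pipeline.put(link.getValue(), out.get(link.getKey()));
                }
            }
            executed.add(t.name);
        }
        return executed;
    }

    /** Two transformers, the second one (wrongly) relying on the output of the first. */
    static Map<String, Object> demo() {
        Map<String, Object> pipeline = new HashMap<>();
        pipeline.put("flag", "true");
        Transformer toYesNo = new Transformer("toYesNo", Map.of("flag", "in"), Map.of("out", "answer"),
            in -> {
                Map<String, Object> out = new HashMap<>();
                out.put("out", "true".equals(in.get("in")) ? "yes" : "no");
                return out;
            });
        Transformer toUpper = new Transformer("toUpper", Map.of("answer", "in"), Map.of("out", "shout"),
            in -> {
                Map<String, Object> out = new HashMap<>();
                if (in.containsKey("in")) {
                    out.put("out", in.get("in").toString().toUpperCase());
                }
                return out;
            });
        run(pipeline, List.of(toYesNo, toUpper));
        return pipeline;
    }

    public static void main(String[] args) {
        Map<String, Object> result = demo();
        System.out.println("answer = " + result.get("answer"));                // answer = yes
        System.out.println("shout present = " + result.containsKey("shout"));  // shout present = false
    }
}
```

In this model `toUpper` never produces `shout`, whichever order the shuffle picks, because it reads `answer` from the pipeline as it stood before the step. Chaining the two would need a second MAP step (or a regular INVOKE) after the first.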

Holger_von_Thomsen · February 21, 2024, 4:45pm

Hi,

the output of a transformer MUST be mapped, otherwise it won’t be executed.
Services which do not have a defined output signature cannot be used as a transformer.
Transformer services do not require a defined input signature, but can have one if necessary.

Regards,
Holger

engin_arlak · February 21, 2024, 5:12pm

It doesn’t have much to do with transformers. It’s a stateful cluster feature. From the documentation:
In a stateful cluster, the session state for clients connected to cluster nodes is stored in a distributed cache in a Terracotta Server Array. The cache makes state available to all servers in the cluster, which enables multi-step transactions in a conversation to begin on one node in the cluster and continue on other nodes.

So in theory, one transformer can be executed on one node and another one on another node in a stateful cluster. I observed this behavior in the past. (Again, it’s not directly related to transformers.)
https://documentation.softwareag.com/webmethods/integration_server/pie10-15/webhelp/pie-webhelp/#page/pie-webhelp%2Fto-clustering_overview_3.html%23

Need to clarify this, I didn’t observe transformers executing in parallel, but rather I observed that any workload can be transferred to another node at any given time in stateful clusters. In fact it is impossible to predict which one will take over that payload in a stateful cluster.

rupinder1 · February 21, 2024, 5:17pm

@Holger_von_Thomsen, I stand corrected. The output of the transformer is all that matters. And services without a defined signature can be invoked as transformers as long as their output is mapped during execution.

Rupinder
