losing time on the broker

Hi,

I have 4 IS instances connected to the same Broker. Some flows use services on different IS instances (the flows are designed via the logical_server).
While running stress tests, we noticed that the flow loses time every time it changes IS. It looks like most of the lost time (approx. 70% of the total flow time) is spent on the Broker.

We tried adjusting these extended settings:
watt.server.broker.producer.multiclient=10
watt.server.broker.replyConsumer.fetchSize=20
watt.server.broker.replyConsumer.multiclient=10
watt.server.broker.replyConsumer.sweeperInterval=30000

But the result is not yet satisfactory…
So here’s my question: is there anything else to set to make access from the IS to the Broker really quicker?

thanks
Sylvain

Sylvain,

What do you mean by “the flow loses time every time it changes IS”?

Mark

For example: in the Modeler, we design a 2-step process.
Step A runs on IS A and step B on IS B.
When we look in the wmprocessstep table, we can see that going from A to B takes a few seconds, even if nothing is invoked in A or B.

In real life, for a 10-step process (let’s say 5 steps on each IS, changing IS in the middle of the process):

  • time between consecutive steps on the same IS (1 to 2, 2 to 3…): 5 ms
  • time between step 5 and step 6 (changing IS): 5 sec
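
As a rough illustration of what these figures imply, here is a small sketch of the arithmetic. The function name and constants are made up for this example, and it only counts transition time, not the time spent inside the steps themselves:

```python
# Figures taken from the post: a 10-step process has 9 transitions,
# one of which (step 5 to step 6) moves from one IS to the other.
INTRA_IS_MS = 5      # transition between two steps on the same IS
CROSS_IS_MS = 5000   # transition that changes IS


def transition_breakdown(steps: int, cross_is_transitions: int) -> dict:
    """Return total transition time and the share spent on cross-IS hops."""
    transitions = steps - 1
    intra_ms = (transitions - cross_is_transitions) * INTRA_IS_MS
    cross_ms = cross_is_transitions * CROSS_IS_MS
    total_ms = intra_ms + cross_ms
    return {
        "intra_ms": intra_ms,
        "cross_ms": cross_ms,
        "total_ms": total_ms,
        "cross_share": cross_ms / total_ms,
    }


if __name__ == "__main__":
    print(transition_breakdown(steps=10, cross_is_transitions=1))
```

With these numbers, the single IS change accounts for nearly all of the transition time, which is why it stands out so clearly in the stress tests.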

So I guess (wrongly?) that the transition takes place on the Broker…
We are able to reproduce this problem on several flows.
thanks for your help

“It looks like most of the lost time (approx. 70% of the total flow time) is spent on the Broker.”

“So I guess (wrongly?) that the transition takes place on the Broker…”

“We are able to reproduce this problem on several flows.”

Step transitions happen through the Broker, not on the Broker. The PRT determines that it needs to publish a step transition document. It creates the document and publishes it to the Broker. The Broker places the document on the appropriate IS B queue. IS B gets the transition document and the PRT does what it needs to establish the runtime environment (the pipeline) and invokes the appropriate model step.
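
To make that sequence concrete, here is a toy simulation of the hand-off. It is only a sketch: the `Broker` class, the client name `IS_B`, and the document fields are invented for illustration and are not the webMethods API.

```python
import queue


class Broker:
    """Toy stand-in for the Broker: one in-memory queue per subscribing IS."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, client_id):
        self.queues[client_id] = queue.Queue()

    def publish(self, target, document):
        # The Broker only places the document on the target's queue.
        self.queues[target].put(document)

    def fetch(self, client_id):
        return self.queues[client_id].get_nowait()


def run_transition():
    """Walk one step transition from IS A to IS B and record each hop."""
    broker = Broker()
    broker.subscribe("IS_B")
    trace = []

    # IS A: the PRT builds the transition document and publishes it.
    doc = {"process_instance": "p-1", "next_step": "B", "pipeline": {"orderId": 42}}
    trace.append("IS_A: publish transition document")
    broker.publish("IS_B", doc)
    trace.append("Broker: enqueue for IS_B")

    # IS B: the PRT retrieves the document, rebuilds the pipeline, invokes the step.
    received = broker.fetch("IS_B")
    trace.append("IS_B: fetch transition document")
    pipeline = dict(received["pipeline"])
    trace.append(f"IS_B: invoke step {received['next_step']}")
    return trace, pipeline


if __name__ == "__main__":
    for line in run_transition()[0]:
        print(line)
```

Note that the toy Broker does nothing but queue the document; all the other work in the trace belongs to one of the two IS instances.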

Be careful about what component you attribute the time spent to. It’s most likely that the majority of the time is not actually on the Broker, but on the two IS instances as they write state to disk, publish the doc, receive the doc, read state as needed, etc.
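
One way to check this is to timestamp each phase and add up the time per component. The sketch below shows the bookkeeping only; the phase names and the millisecond values are placeholders I made up, not measurements from any real system, so substitute your own timings.

```python
from collections import defaultdict


def attribute_time(phases):
    """Sum elapsed time per component.

    phases: iterable of (component, phase_name, start_ms, end_ms).
    Returns {component: (elapsed_ms, share_of_total)}.
    """
    totals = defaultdict(int)
    for component, _phase_name, start_ms, end_ms in phases:
        totals[component] += end_ms - start_ms
    grand_total = sum(totals.values())
    return {c: (ms, ms / grand_total) for c, ms in totals.items()}


# Placeholder values for illustration only, NOT real measurements.
EXAMPLE_PHASES = [
    ("IS_A", "write state", 0, 40),
    ("IS_A", "publish doc", 40, 60),
    ("Broker", "queue doc", 60, 65),
    ("IS_B", "receive doc", 65, 85),
    ("IS_B", "read state", 85, 100),
]

if __name__ == "__main__":
    for component, (ms, share) in attribute_time(EXAMPLE_PHASES).items():
        print(f"{component}: {ms} ms ({share:.0%})")
```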

I’d suggest that, unless there is some explicit benefit to splitting this process across multiple IS instances, you don’t do so. Splitting just introduces overhead, which is apparently at an unacceptable level in this case.

Thanks for your response.

Step transitions happen through the Broker, not on the Broker.
Yes, sorry about my English! :)

I agree, most of the time is lost during the publish from the IS to the Broker and then when the second IS has to retrieve this document from the Broker. So is there a mysterious parameter to set to make the IS poll the Broker more often? The extended settings I tried didn’t change anything…

I’m new on this project and was surprised by their choice of splitting the processes. Apparently, they think it’s a good way to handle load balancing and RAM consumption. It’s the first time I’ve worked on such an architecture, so for now I don’t really know…

Splitting a process among IS instances certainly distributes the load, but I’m not sure it’s a good load-balancing or fault-tolerant approach. If there is more than one “A” and “B” instance then it’s probably okay. But if there is only one of each, then a failure of either one will cause all processes to fail, or at least be delayed until service is restored.

Another approach to consider is to use an external load balancing device which routes incoming documents to an active instance in the cluster. Once on that instance, processing of the document completes entirely on that server. Each active instance in the cluster can be doing work. If one fails or is taken out of service for maintenance the others will still function.
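
Here is a minimal sketch of that routing idea, assuming simple round-robin selection. The class and the instance names are hypothetical; a real load balancing device would also handle health checks, session affinity, and so on.

```python
import itertools


class RoundRobinBalancer:
    """Route each incoming document to the next active instance in turn."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.active = set(self.instances)
        self._cycle = itertools.cycle(self.instances)

    def mark_down(self, name):
        # Instance failed or was taken out of service for maintenance.
        self.active.discard(name)

    def mark_up(self, name):
        if name in self.instances:
            self.active.add(name)

    def route(self, document):
        """Return the instance that will process the whole document."""
        if not self.active:
            raise RuntimeError("no active instance available")
        # At most one full pass over the list is needed to find an active one.
        for _ in range(len(self.instances)):
            candidate = next(self._cycle)
            if candidate in self.active:
                return candidate
        raise RuntimeError("no active instance available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["IS_1", "IS_2", "IS_3"])
    lb.mark_down("IS_2")
    print([lb.route(doc) for doc in ("d1", "d2", "d3", "d4")])
```

Because each document stays on the instance it was routed to, no step transition has to cross between servers, and taking one instance down only shrinks the pool.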

There is a tech note on Advantage named “Optimizing Publish/Subscribe Solutions” which covers the configuration parameters that can be adjusted to improve pub/sub performance. Be advised that this document talks about client-side queueing, a feature which has been deprecated.

Rob,

Yes, there are several instances, not just A and B. One server runs 3 instances, another one runs 4.
I’ll look at the wM PDF and let you know what my investigation turns up.
For now, I’ve tried the following extended settings:

watt.server.broker.producer.multiclient=20
watt.server.control.triggerInputControl.delayIncrementInterval=600000
watt.server.control.triggerInputControl.delays=50,1000,1000,5000

During the new stress test we noticed a 5% gain, so nothing really convincing…
Thank you for your answer