To use or not to use the Broker?

I have an architecture decision to make and wanted to see if you folks out there agree with my line of thought…

One of the general guidelines we follow is to use the Broker as a means of decoupling the source and target in an integration. After performing some basic mapping on the data received from a source system, we publish it (usually in a canonical form at this point) to the Broker. The subscribing service (residing on either a local or a remote IS server) then takes the data and performs the necessary data manipulation before sending it off to the receiving system. This provides:

  1. a temporary queuing mechanism in case the receiving system cannot process the data as fast as the source is sending it. This helps avoid waiting threads/sessions on the IS server.
  2. load balancing - our IS servers are clustered with 2 or more instances and we try to do concurrent processing wherever possible, so documents get distributed more evenly between the IS instances.

The integration I’m looking into right now has a JMS provider as the target system. Since sending data onto a JMS topic/queue already provides a decoupling mechanism, there might not be much benefit to introducing the Broker into the integration. No Broker = a more straightforward integration = fewer points of failure.
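For illustration, here’s roughly what the target-side send would look like in plain JMS, a minimal sketch assuming a javax.jms 1.1 provider (the JNDI names and payload are made up):

```java
import javax.jms.*;
import javax.naming.InitialContext;

public class QueueSend {
    public static void main(String[] args) throws Exception {
        // Look up the provider's objects from JNDI; names are hypothetical
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/TargetQueue");

        Connection conn = cf.createConnection();
        try {
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            // PERSISTENT delivery: the provider stores the message until the
            // consumer acknowledges it; that is the decoupling point
            producer.setDeliveryMode(DeliveryMode.PERSISTENT);
            producer.send(session.createTextMessage("<order>…</order>"));
        } finally {
            conn.close();
        }
    }
}
```

Once the send returns, IS is done; the provider holds the message until the target consumes it.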

Appreciate any comments/feedback. Thanks!

IMO, you already have a decoupling mechanism by using IS. End points don’t connect directly to each other but instead communicate via IS as an intermediary.

Broker is useful for pub/sub needs, but IMO the vast majority of business integrations do not need pub/sub.

There have been a few threads about this over the years. A search of the forums will yield some useful and varying schools of thought on the topic.

reamon, thanks for the quick response.

I guess I’m looking at decoupling in terms of the processing that happens within webMethods, as opposed to the entire end-to-end process.

I do agree that the traditional use of the Broker (or any messaging engine) is primarily for cases where there are multiple consumers for the same set of data. However, over the years of running into performance issues with webMethods, we have adopted and even promoted the use of the Broker for the reasons I mentioned earlier.

The Broker is heavily used here at our shop and there are times I question this practice… hence my question in the first place. I am considering going against our usual practice for this particular integration I am currently reviewing.

Luckily, the Broker I have to admin is pretty well built and has been working very well for us. Most problems we encounter usually stem from the IS server or the source and target systems.

Decoupling of the decoupling mechanism is getting a bit esoteric, no? :wink:

Can you describe the value you’ve seen from using the pub/sub model? How often have new targets been added with zero change to the source and the (supposed) canonical document? What additional facilities have you had to put in place to track what’s going on, because the doc in the middle simply disappears?

The threads that I mention provide my viewpoints on the usefulness of Broker. IMO, the default approach would be to not use Broker and justify why an integration needs it (decoupling the communication level isn’t a compelling reason, IMO). Others have the opposite approach.

I concur that most issues encountered occur in IS. Broker rarely has issues–IMO this is mainly because it doesn’t do much of anything. It does its one thing well. All of the hard work is generally done at the end points or in IS.

We are planning to use the same model. This will be useful when you want to set a timeout for the response. We need to meet a 2-second SLA. If I make a direct SOAP/HTTP call, the service will just hang there. The webMethods Architect recommends IS-Broker-IS. This separates the front-end service from the back-end services.

The timeout on a request/response is a decent capability. Hopefully there are more capabilities than this that you’re after though! :slight_smile:

Using watt.net.timeout should achieve the same thing–though it’s an IS global setting unfortunately.
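For comparison, at the plain-Java level a per-call timeout is straightforward; a minimal sketch (the endpoint URL is made up):

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class TimedCall {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://backend.example.com/service"); // hypothetical endpoint
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(2000); // fail fast if the endpoint is unreachable
        conn.setReadTimeout(2000);    // give up if no data arrives within the SLA window
        try (InputStream in = conn.getInputStream()) {
            while (in.read() != -1) {
                // consume the response; a SocketTimeoutException aborts the wait
            }
        }
    }
}
```

Something along those lines per call, rather than one global watt.net.timeout for everything.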

As for separating services via a publish, they are really just coupled in another way. Assume you didn’t have Broker. How might you dynamically select which responder to use? An IS local publish is one way. A config file that indicates the service to call is another.
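A bare-bones sketch of the config-file variant; the file name, entries, and doc type are all made up:

```java
import java.io.FileInputStream;
import java.util.Properties;

public class ResponderLookup {
    public static void main(String[] args) throws Exception {
        // routes.properties maps a document type to a responder service, e.g.
        //   order=acme.orders:processOrder
        //   invoice=acme.billing:processInvoice
        Properties routes = new Properties();
        try (FileInputStream in = new FileInputStream("routes.properties")) {
            routes.load(in);
        }

        String docType = "order";
        String responder = routes.getProperty(docType);
        if (responder == null) {
            throw new IllegalStateException("No responder configured for " + docType);
        }
        // In IS, this name would feed a dynamic invoke or an HTTP call
        System.out.println("Would invoke: " + responder);
    }
}
```

Swapping responders is then a config edit rather than a code change; same net effect as retargeting a subscription.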

This won’t address load balancing or load sharing though, right? That can be addressed at appropriate points by invoking services through HTTP instead of directly–sending the request through a hardware load balancer.

I’m not sure introducing another component, and the polling time associated with IS pulling docs from Broker, helps with a 2 sec SLA.

My point isn’t to bash Broker, but instead that we need to recognize what Broker really does and that a solution must consider all the components along the entire path. Using Broker doesn’t magically cure end-point timeouts, it doesn’t add all that much more in terms of decoupling, and adds not insignificant complexity to a solution.

There are times when using Broker is a perfect fit. IMO, most times it is not. The real value in the platform is in IS.

I’d be interested in the criteria the wM Architect used in recommending IS-broker-IS. I hope it is something more than the company line of “that’s the way to do integration.”

I used to be in the “always use Broker” camp. Over time and some projects, I backed off of that. The complexity most often isn’t offset by benefits.

I’m sure others will contribute their views and experiences!

reamon, sorry I was out a couple of days… had a production issue to attend to.

So going back to the benefits we see or hope to see in a pub/sub model:

  1. As samwmusers mentioned, timeout is one of the issues it helped us with, because there are occasions where the target system takes longer than expected to process the data, and either our IS or the source system times out the session, unnecessarily flagging failures. Also, as you mentioned, the timeout setting on the IS server is global, and I don’t think it should be set to anything more than 2 minutes anyway. Don’t want too many sessions/threads lingering around.

  2. In terms of load balancing, let’s say a single incoming request generates multiple outgoing requests. The pub/sub model helps distribute the work between IS instances in the cluster. Instead of looping over and processing every line item on one instance (and possibly causing performance problems in that one instance), we publish the line items out and let the trigger service on all IS instances process them concurrently (albeit I have observed it’s almost never a 50/50 distribution). A sketch of this fan-out follows this list.

  3. Another benefit is utilizing the Broker’s guaranteed delivery mechanism to help recover from transient errors on the IS server, but this is something I have yet to really confirm in action. We hope that by publishing the data (as soon as possible in the process), the Broker would redeliver the doc to the other trigger instance if the first instance does not acknowledge for some reason. So we are looking for pub/sub to give us an “automatic retry” feature… within a limited capacity.
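To make item 2 concrete, here is the fan-out pattern expressed in generic JMS terms for illustration (our actual publishes go through the Broker via IS); the JNDI names and payloads are made up:

```java
import javax.jms.*;
import javax.naming.InitialContext;

public class LineItemFanOut {
    public static void main(String[] args) throws Exception {
        // Hypothetical JNDI names; any JMS provider works the same way
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/LineItemQueue");

        String[] lineItems = {"item-1", "item-2", "item-3"}; // stand-in payloads

        Connection conn = cf.createConnection();
        try {
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            producer.setDeliveryMode(DeliveryMode.PERSISTENT);
            for (String item : lineItems) {
                // One message per line item; competing consumers on the
                // clustered IS instances then pick them up in parallel
                producer.send(session.createTextMessage(item));
            }
        } finally {
            conn.close();
        }
    }
}
```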

I think you are on the right track with the Broker; however, there are a couple of things to clear up:

- The Broker is not delivering anything; the IS Dispatcher pulls the docs/messages out of the Broker. The Broker is a fairly dumb switch, as Rob pointed out. The IS Dispatcher is the real brains. The interaction between the Dispatcher and the Broker is very robust and gives you a lot out of the box, whether you are doing pub/sub or a 1-1 integration. Stuff you would otherwise have to account for in code.

- Automatic retry is kind of misleading and its behavior should be well understood before becoming reliant on it. It does work very well, but you do need to configure your triggers and flow services correctly in order for it to work. A lot of folks get confused on the difference between a service exception and a system exception (runtime), which is important when it comes to retries. See the sketch after this list.

- The Broker is very fast as a storage and switch mechanism. The polling time can be adjusted for retrieving docs from the Broker. On average I’ve found about an extra 50ms added to an integration, which really isn’t very much.
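To make the retry distinction concrete, here is the rough shape of a trigger-invoked IS Java service. The exception classes are the real IS API as I understand it (ISRuntimeException signals a retryable system error; a plain ServiceException does not get retried); the backend call is a made-up placeholder:

```java
import com.wm.app.b2b.server.ISRuntimeException;
import com.wm.app.b2b.server.ServiceException;
import com.wm.data.IData;

public final class ProcessDoc {
    // Shape of the service a trigger invokes with the published document
    public static void processDoc(IData pipeline) throws ServiceException {
        try {
            deliverToBackend(pipeline); // hypothetical delivery logic
        } catch (java.net.ConnectException e) {
            // System/transient error: ISRuntimeException tells the trigger
            // the document is retryable, so it gets redelivered
            throw new ISRuntimeException();
        } catch (Exception e) {
            // Service/business error: no retry; log it and move on
            throw new ServiceException(e.getMessage());
        }
    }

    private static void deliverToBackend(IData doc) throws Exception {
        // stand-in for the real target-system call
    }
}
```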

Rob and I don’t really agree :p: when it comes to Broker usage, but that’s okay. That’s a really nice thing about the webMethods platform. There are a lot of different ways to do the same thing.

I think as long as you have consistent, repeatable patterns that are fairly easy to support, agile for the business, and resilient to change, then you are on the right path regardless of the technology components. I personally have had a lot of success using messaging with our implementations. But I could take those same design principles and apply them to just about any technology, although it would be more work. :wink:

  1. Seems to me that a system that is taking too long will still cause an IS session to linger. The requester IS instance may time out waiting for a response via the Broker, but the responder IS instance will be stuck in its SOAP/HTTP call, right? IMO, request/reply over Broker isn’t a good thing to do. If a wait-for-response timeout is needed, I think there are ways to work that without resorting to Broker.

  2. This is exactly the type of scenario where the thought about inserting HTTP calls instead of publishes would apply. Instead of calling publish, call an IS service via HTTP through a virtual hostname/IP through a load balancer. Structure the entry point on the target service to spawn a processing thread and return immediately (sketched after this list). Same net result as with using Broker. (Just pointing out alternative approaches.)

  3. As Mark pointed out, the triggers and trigger services must be configured correctly. And the trigger services need to be coded correctly. Guaranteed delivery is between the Broker and its clients. Once in a client, such as IS, you’re on your own to make sure it doesn’t get dropped. IS guaranteed delivery assures that a service gets invoked. How that service behaves is up to you.
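A rough sketch of that spawn-and-return entry point in plain Java; the pool size and names are arbitrary:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncEntryPoint {
    // Bounded pool so a burst of requests can't exhaust server threads
    private static final ExecutorService POOL = Executors.newFixedThreadPool(10);

    // Called by the HTTP-facing entry service; returns immediately
    public static String accept(final String document) {
        POOL.submit(new Runnable() {
            public void run() {
                process(document); // the slow work happens off the request thread
            }
        });
        return "ACCEPTED"; // the requester gets an ack, much like a publish would give
    }

    private static void process(String document) {
        // stand-in for the real target-system delivery
    }
}
```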

I guess my main point is that the bulk of the “don’t lose it”, “keep track of what happened”, and “don’t wait forever” work is at the edges. IS is where you’ll need to do this and Broker doesn’t bring all that much to the table. It can be used to load balance message traffic to the IS instances but that was never its intended use. It can be used to facilitate retries but you need to be diligent about its use since one bad message, retrying forever, can gum up the works.

Broker is great for pub/sub, fire-and-forget type of operations. Beating a dead horse here, but most business integrations don’t fit that model. So you end up creating additional integrations and facilities to turn fire-and-forget into track-retryIfNeeded-and-report-back operations–things that are most often easier to do if fire-and-forget wasn’t introduced into the picture to start with.

griffima/reamon… I understand both sides of the argument… and yes this is also an endless debate here among my colleagues.

Basically, we have a set of guidelines/best practices to follow but at the same time evaluate each integration individually and design them accordingly.

So on this particular integration I am currently reviewing, the target is a JMS provider. I don’t particularly like the idea of having 2 messaging systems in a single integration.

It is a fun debate! But then one must ultimately make a choice and go. I really don’t have any problems with using Broker. It is using Broker blindly (which isn’t the case here) that causes me to get worked up! Too many folks use Broker simply because it’s there and feel it is the only thing that will decouple the source and target systems.

As for the use of 2 messaging systems, I can see the initial “blech” reaction. But if the use of Broker is the standard approach to connect the source side of the integration to the target side, then what is on the other side of the target shouldn’t much matter. The IS target service would get the message from Broker and pass it on to its target system, regardless of using HTTP, FTP, JMS, CORBA, RMI, et al. I assume that the JMS component isn’t part of the standard integration layer, so the fact that JMS happens to be used for communication/transport in this case shouldn’t cause much concern.

Hi,
I am facing a problem related to the Broker Server.
It did a core dump and it is not starting now.
This is the error I am getting:

Opening QS data stores.
Phase 1: Configuration QS instance “qs:///wmbadv1/Broker/data/BrokerConfig.qs” ok.
Phase 1: Runtime Data QS instance “qs:///wmbadv1/Broker/data/BrokerData.qs” ok.
Phase 1: QS Complete.

Phase 2: Opening Broker Storage Abstraction Layer (SAL).
Phase 2: Checking Broker Server Configuration Data.
Phase 2: Checking Broker “BABrokerDev”.
Phase 2: “BABrokerDev” has 7 Client Groups.
Phase 2: “BABrokerDev” has 76 Event Types.
Phase 2: “BABrokerDev” has 38 Clients.
Phase 2: Broker SAL Configuration Session “qs:///wmbadv1/Broker/data/BrokerConfig.qs” ok.
[BRC.300.1082] Phase 2: ERROR: Cannot create recovery session for “qs:///wmbadv1/Broker/data/BrokerData.qs”: 1122: Invalid argument
Phase 2: ERROR: Broker SAL Runtime Data Session “qs:///wmbadv1/Broker/data/BrokerData.qs” has errors.
[BRC.300.1077] Phase 2: ERROR: Broker SAL Non-Recoverable Failure! Further fixing may be required!
[Aborting]

If anyone knows how to resolve this, please let me know.
Thanks,
mallika