The publish/subscribe approach to data delivery/dissemination is a powerful one. I won’t go into any descriptive detail about what pub/sub is here (we can do that separately if needed for folks not familiar with the concept).
Rather, I’m interested in learning what other people’s views are regarding when to use a pub/sub approach in an integration.
My intent isn’t to focus on the differences between ES and IS, but rather explore the “value-add” of pub/sub itself. For sake of discussion, let’s assume that IS itself has a pub/sub facility, and thereby avoid product comparisons.
What are the guidelines and rules-of-thumb for the “appropriate” application of a pub/sub solution?
Based on volume?
Based on within the firewall and outside the firewall?
Based on number of potential subscribers and the stability of the subscriber list?
Based on the nature of the process being automated?
Pub/sub is a great decoupling mechanism. I contend, however, that there are a number of common practices that defeat this decoupling. Request/reply is one of them. Not using canonical docs is another. Publishers tracking the progress of subscribers is yet another. I further contend that most integrations based on pub/sub do not need to use pub/sub, but do so simply because that’s how the tool they used works.
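To make the request/reply point concrete, here is a minimal Java sketch of the contrast. All names here are hypothetical, for illustration only, not any product’s API. Bolting a correlated reply onto a publish re-introduces exactly the coupling pub/sub is supposed to remove:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch of the coupling contrast; illustration only, not a real broker API.
public class RequestReplyCoupling {

    // Pure pub/sub: the contract mentions only the topic and the document.
    // The publisher cannot know, and does not care, who consumes it.
    static void publish(String topic, String doc) {
        System.out.println("published to " + topic + ": " + doc);
    }

    // Request/reply bolted onto pub/sub: the publisher now owns a reply
    // channel and blocks on a specific responder's behavior and uptime.
    static String publishAndWait(String topic, String doc,
                                 BlockingQueue<String> replyChannel,
                                 long timeoutMs) throws InterruptedException {
        publish(topic, doc);
        return replyChannel.poll(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        publish("order.created", "{\"orderId\":42}"); // fire-and-forget

        BlockingQueue<String> replies = new ArrayBlockingQueue<>(1);
        // No responder exists in this demo, so the "publisher" stalls for
        // the full timeout; exactly the coupling pub/sub was meant to remove.
        String ack = publishAndWait("order.created", "{\"orderId\":43}", replies, 500);
        System.out.println("reply: " + ack); // null after the timeout expires
    }
}
```

The fire-and-forget call completes regardless of who, if anyone, is listening; the publish-and-wait variant stalls for the full timeout whenever the one responder it implicitly depends on is absent or slow.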
What business processes fit the pub/sub model? Which don’t?
Rob, forcing Request/Reply functionality into the Publish/Subscribe model is one of my biggest architectural no-nos.
The entire premise of Publish/Subscribe is based on the fundamental notion that each application operates in ignorance. Data is data and that is all that is relevant. It doesn’t matter where it comes from; it is only data.
Pub/Sub fits best when a data source has multiple targets. A data stream spit from a socket, for example, that must be digested by two or more applications somewhere else. Where the value of Pub/Sub really comes in, though, is when daisy-chained applications stuck together with bubble gum and thumbtacks get decoupled and attached to the brokering hub.
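A toy illustration of that decoupling, using a hypothetical in-memory hub rather than a real broker: the source publishes once, and targets can be attached or removed without the source ever changing.

```java
import java.util.*;
import java.util.function.Consumer;

// Hypothetical in-memory hub, for illustration only; not a real broker.
class Hub {
    private final Map<String, List<Consumer<String>>> topics = new HashMap<>();

    void subscribe(String topic, Consumer<String> handler) {
        topics.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    void publish(String topic, String doc) {
        // Fan-out: every subscriber digests the same document.
        topics.getOrDefault(topic, List.of()).forEach(h -> h.accept(doc));
    }
}

public class DaisyChainVsHub {
    public static void main(String[] args) {
        Hub hub = new Hub();

        // Two targets digest the same stream; neither knows about the other,
        // and neither is daisy-chained behind the other.
        hub.subscribe("feed.ticks", doc -> System.out.println("pricing app got " + doc));
        hub.subscribe("feed.ticks", doc -> System.out.println("audit app got " + doc));

        // The source publishes once. A third consumer can be attached
        // later with no change whatsoever to the source.
        hub.publish("feed.ticks", "{\"symbol\":\"XYZ\",\"px\":101.25}");
    }
}
```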
The most difficult part of Pub/Sub is enforcing it. Business users must recognize that “the way it’s always been done” wasn’t always the way it was done, if you know what I mean. With fewer people and more tasks, though, it is understandable (but not forgivable!) that business users often spend far less time than required to sort through the nasties of Pub/Sub and make it work.
To illustrate the advantages of a strong Pub/Sub model, I like to take a group of business owners/users into a conference room for a game of “Whisper Down the Lane”. It’s funny how the fifth person in line repeats a message that doesn’t even resemble the original sentence.
I end the exercise by saying “Why play ‘Whisper Down the Lane’ when you can broadcast the same message over the PA system to everyone at once?”
That seems to get the point across… for 30 minutes, in any case!
I am curious what others think. Thanks for starting this thread, Rob.
NB: Apologies for bumping an old thread, but this topic really tempted me to do so.
I will outline the benefits of the pub/sub model below:
Multiple targets – When a data source has multiple targets, the pub/sub model finds its best use. The publisher publishes the document once, and every system that requires the same data can subscribe to that document and fetch it.
Decoupled and flexible to scale – Software built on the pub/sub model is loosely coupled (publisher and subscriber are decoupled) and hence flexible to future change. Adding or changing functionality won’t send ripple effects across the system, because a new subscriber can be attached, or an existing one changed, without touching the publisher.
Guaranteed delivery – The model supports checkpoints: when the intermediary platform or a target system is unavailable, delivery of the data is re-attempted (see the sketch below).
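As a rough sketch of what that re-attempt behavior looks like (illustrative only; real brokers such as Universal Messaging implement durable subscriptions and redelivery internally, so you would not normally write this loop yourself):

```java
import java.util.concurrent.TimeUnit;

// Illustrative retry loop only; a real messaging provider does this for you.
public class RedeliverySketch {

    interface Delivery {
        void attempt(String doc) throws Exception; // throws if the target is down
    }

    // Re-attempt delivery with a capped exponential backoff. The document
    // is held (the "checkpoint") until a delivery attempt succeeds.
    static void deliverWithRetry(Delivery target, String doc,
                                 int maxAttempts) throws InterruptedException {
        long backoffMs = 100;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                target.attempt(doc);
                System.out.println("delivered on attempt " + attempt);
                return;
            } catch (Exception e) {
                System.out.println("attempt " + attempt + " failed: " + e.getMessage());
                TimeUnit.MILLISECONDS.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, 5_000);
            }
        }
        System.out.println("exhausted retries; park the document on a dead-letter queue");
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Simulated target that is down for the first two attempts.
        deliverWithRetry(doc -> {
            if (++calls[0] < 3) throw new IllegalStateException("target unavailable");
        }, "{\"txnId\":7}", 5);
    }
}
```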
The cons that I see in this:
Infrastructure overhead and regular UM patching required – The messaging provider, e.g. Universal Messaging (UM), which acts as the intermediary for this process, must be regularly patched and frequently monitored, with issues addressed promptly; otherwise we can see a lot of data loss, which can be quite impactful, especially in banking transactions.
An almost 20-year-old post. Never thought it would still be around after so much time.
Things I would change about it:
“Decoupled” - pub/sub is not a “decoupling mechanism.” It is a “loose-coupling” mechanism. Publishers and subscribers are loosely coupled in a number of ways, but there is still a level of coupling. Discontinue the publisher and the subscribers will notice. Or change it in significant ways (frequency, scope, etc.) and the subscribers will notice. Certainly the level of change possible without impacting others is high (depending upon the details, of course), but it is not unlimited. Decoupled systems don’t interact at all.
“Canonical” - as with so many terms in IT, this one got diluted to the point of meaninglessness. The idea is still good – a “standard” (canonical) definition of a document (or data model) that is intended to support, but not be tied to, the participating end points to the degree possible. But for many that morphed into treating any published document definition as “canonical,” as if simply being published made it so, which was never the intent. These days I avoid the use of the word canonical at almost all costs.
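To pin down what I mean, here is an entirely hypothetical example of the idea: a canonical definition owned by the integration layer, with each endpoint-specific shape mapped to it at the edge, so the canonical type never absorbs any one application’s quirks. All type and field names below are made up for illustration.

```java
// Illustrative only: a "canonical" customer document defined on its own terms.
public class CanonicalExample {

    // The canonical definition: owned by the integration layer, named and
    // versioned independently of any one application's schema.
    record CanonicalCustomer(String customerId, String fullName, String isoCountry) {}

    // An endpoint-specific shape from, say, a CRM system.
    record CrmContact(String crmGuid, String first, String last, String countryName) {}

    // Mapping happens at the edge; the canonical type never imports
    // CRM-specific ideas, so a second or third source maps the same way.
    static CanonicalCustomer fromCrm(CrmContact c) {
        return new CanonicalCustomer(
                "crm:" + c.crmGuid(),
                c.first() + " " + c.last(),
                toIso(c.countryName()));
    }

    static String toIso(String countryName) {
        return countryName.equalsIgnoreCase("United States") ? "US" : "??";
    }

    public static void main(String[] args) {
        System.out.println(fromCrm(new CrmContact("a1b2", "Ada", "Lovelace", "United States")));
    }
}
```

Adding a second source system means writing one more edge mapping; the canonical definition, and every subscriber reading it, stays untouched.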
Reviewing this thread has made me realize I probably need to advocate for the use of pub/sub more often than I have been the past few years. The ubiquity of, and focus on, application APIs, particularly so-called “REST” APIs, has led to point-to-point solutions more often than is perhaps prudent.
Regarding Himanshu’s note about UM, things seem to have calmed down/stabilized. I was disappointed when Broker was “deprecated” in favor of UM. We ended up going through growing pains with UM similar to those we went through with Broker many years ago. As Himanshu indicated, it is indeed important to have a reliable messaging engine.