Take a look at the GEAR material on Advantage. It should be helpful.
Restoring the environment to a point in time can be VERY problematic. Integration data traffic is inherently transient, and it is VERY difficult to recover properly at the integration layer. It is usually better to have the end-points resend documents as needed than to try to have the integration layer figure things out. It is usually best to focus your backup and recovery efforts on being able to restore service, not on recovering data that was in flight.
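To make the "end-points resend, integration layer restores service only" idea concrete, here is a generic sketch. This is not webMethods code; `Sender`, `Receiver`, the `transport` callback, and the document IDs are all invented for illustration. The sender keeps each document until it is acknowledged and can resend after an outage; the receiver suppresses duplicates by document ID, so resending is always safe and nobody has to reconstruct in-flight traffic.

```python
# Hypothetical sketch (not a webMethods API): recovery is owned by the
# end-points, not the integration layer.

class Sender:
    def __init__(self, transport):
        self.transport = transport
        self.unacked = {}          # doc_id -> document, kept until acked

    def send(self, doc_id, doc):
        self.unacked[doc_id] = doc
        try:
            self.transport(doc_id, doc)
        except ConnectionError:
            pass                   # keep the doc; resend_pending() retries later

    def ack(self, doc_id):
        self.unacked.pop(doc_id, None)

    def resend_pending(self):
        # After the integration layer is restored to service, just resend.
        for doc_id, doc in list(self.unacked.items()):
            self.transport(doc_id, doc)


class Receiver:
    def __init__(self):
        self.seen = set()
        self.processed = []

    def receive(self, doc_id, doc):
        if doc_id in self.seen:    # duplicate from a resend: ignore it
            return
        self.seen.add(doc_id)
        self.processed.append(doc)
```

Because the receiver is idempotent, the sender never needs to know whether a document actually made it through before the outage; it can resend everything unacknowledged without risk of double-processing.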
Of course others may have different experiences and points of view.
GEAR does not say much about it … it recommends stopping wM and backing up the directory where it resides, along with the associated database.
But my client's environment is 24x7, 52 weeks a year, with a 99.9% uptime requirement. Even wM does not seem to know how to do that … I am wondering how their big customers, like Motorola, do their backups?
Yep, we are thinking of having at least two Brokers (in the same territory), each attached to a couple of IS instances:
Bkr#1 --- IS-A
  |   --- IS-B
  |
Bkr#2 --- IS-C
      --- IS-D

Both brokers are in the same territory.
But what would happen if, while we are performing a backup on, say, IS-A (which is down for the purpose of the backup), we experience a problem with IS-B? The cluster will try to fail over to IS-A, but it will be unable to, since IS-A is down for backup …
There must be a way to back up the whole landscape in a hot-backup fashion, like we do with databases.
Your client has 99.9% uptime. I think the scenario you described, IS B going down while IS A is being backed up, would be considered part of that 0.1%. Nothing, absolutely nothing, is 100% guaranteed.
How will we reprocess the failed transactions, or the documents that are published on Bkr#1 while IS-A is down? In my past experience we included Trading Networks in our architecture so that we could resubmit failed transactions, but if TN is not there, how will we resubmit the documents from the Broker?
thanks,
What if I fall into that 0.1%? My client is in the financial/banking business, so their fear is losing transactions.
Mark's strategies will work, I agree. What if I add a third IS (Z) that can act as the "fail over" node while either A or B is down for backup? Would that be a fair statement?
So if IS-A is down for backup (B and Z are up) and I get into trouble, B can fail over to Z (or vice versa). Does the wM clustering mechanism allow three IS instances in a cluster, or is there a limit of two?
Bkr#1 --- IS-A
  |   --- IS-B
  |   --- IS-Z
  |
Bkr#2 --- IS-C
      --- IS-D
      --- IS-Z
I suggested using a hardware load balancer for IS traffic, not configuring an IS cluster. IS software clustering doesn't do anything to help you with failover in most cases (the exception is custom Java clients that connect using the TContext class).
IS clustering can help with load balancing across multiple IS servers. It is also the only out-of-the-box way to perform load balancing for non-HTTP IS transactions such as those initiated from third-party messaging solutions such as MQ Series or MSMQ.
I prefer hardware-based load balancing for IS HTTP/S transactions for simplicity's sake, and because the load balancer's built-in utilities make it easy to shut off traffic to a single IS host.
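A generic sketch of what that buys you (this is not a real load-balancer product API; `LoadBalancer`, `drain`, and the host names are invented for illustration): requests round-robin across the hosts in rotation, and an operator can drain a single IS host, say for its backup window, while traffic continues to the remaining hosts.

```python
import itertools

class LoadBalancer:
    """Toy model of an LB fronting several IS hosts."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.disabled = set()            # hosts drained by the operator
        self._rr = itertools.cycle(self.hosts)

    def drain(self, host):
        # "Shut off traffic" to one host, e.g. before backing it up.
        self.disabled.add(host)

    def enable(self, host):
        self.disabled.discard(host)

    def pick(self):
        # Round-robin over the hosts, skipping any that are drained.
        for _ in range(len(self.hosts)):
            host = next(self._rr)
            if host not in self.disabled:
                return host
        raise RuntimeError("no IS hosts available")
```

The clients only ever see the load balancer's address, so taking IS-A out of rotation for its backup is invisible to them, which is exactly the property the software-clustering scenario above was missing.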
You should set your client’s expectations that high-availability configurations are usually very expensive and attempt to understand their tolerance for the increased costs before spending too much time designing exotic architecture configurations that might not be financially viable.