Monitoring Enterprise Server

I question if it’s possible to effectively monitor Enterprise Server applications. webMethods gives us reasonable ways to monitor brokers themselves. With simple API programs (and I assume tools such as the webMethods Monitor) we can tell if queues are getting too large, or if adapters are not connected.

On the application side, we can trap errors using try-catch blocks. However, sometimes applications throw errors that we don’t catch. For these untrapped errors, all we get is Adapter::errorNotify, which (up to v5, at least) doesn’t even tell us which adapter gave rise to the error. All you get is an error message, such as:

(316) Could not execute SQL statement “SELECT trim(t1.WMEDBT), trim(t1.WMEDTN), trim(t1.WMEDUS), trim(t1.WMAN8), trim(t1.WMDL01), trim(t1.WMCTYA), trim(t1.WMAN81), trim(t1.WMDL10), trim(t1.WMSTATE), trim(t1.WMSTATEDES), trim(t1.WMCTPE), trim(t1.WMDL02), trim(t1.WMMKT4), trim(t1.WMMKT5), trim(t1.WMCTR), trim(t1.WMACFL), trim(t1.WMEDSP), t1.wm_rowid FROM TESTDTA.F0000194 t1 ORDER BY wm_rowid”.
(42000/904) ORA-00904: invalid column name

java.sql.SQLException: ORA-00904: invalid column name 

Here at J&J, we have brokers with many integrations from several organizations within our enterprise that operate completely independently. It may not be easy for a human to tell the source of an error from the information above. And it would be very difficult to automate the process of monitoring such errors. For example, we have an integration that writes all adapter:errorNotify documents to a file. Someone should get an e-mail if this file gets an entry. But who should get the e-mail? We have perhaps 50 developers, scattered across 10 different companies within J&J.

If only I had a clean way of determining which integration this error was associated with, I could filter the errors by integration and establish an owner for each integration.

This is the only issue I’ve found so far with managing applications. As I’ve said, errors that we trap can be handled nicely.

Thanks in advance for any help/thoughts.

Best Regards,

Mike

The error message field in an Adapter::errorNotify event will have a simple error message as you’ve described. However, the _env structure of that document contains lots of other information, including the publication time, adapter name, and (I think) integration component that generated the error. This should be plenty of information for you to route the error to the appropriate administrator.

Having said that, it still would be nice to have an easier way to monitor complete integrations that may spread across multiple servers/adapters/brokers. The webMethods Modeller tool (aka Business Integrator) is an attempt to build such a monitoring structure. I suggest you look into using this tool in the future for enhanced monitoring capabilities.

Agree with Skip.

Here are a couple of general suggestions regarding error handling:

  1. For adapter errors, include a try/catch step to catch the adapter errors and publish Adapter::errorNotify or any other known document type. In addition to the exception message and stack trace carried by the exception itself, you might want to use the Java introspection APIs to figure out which particular integration component is throwing the error and make this part of the “body” of the exception (or you could hardcode the name of the integration component). Using the webMethods API you can also get some more platform-related details.

  2. At steps where you have custom business logic that might throw an exception, wrap it in try/catch and throw your own adapter exceptions (these get published as Adapter::errorNotify documents and are then handled as in the previous point).

  3. “Log the exception”. You have a few options: send an e-mail notification upon receiving Adapter::errorNotify, log it to a file, or pass it to an external “error management/monitoring” application that has business rules for deciding who gets the e-mail, based on the body of the exception and other environment properties (maybe even using some pattern matching on the error message). (As an aside, we have actually built an interesting add-on application around this; send me an e-mail if you need more info.)

  4. If possible, log the integration using the Enterprise Logger Adapter. You can then “correlate” the activation ID published with the Adapter::errorNotify document with the activation ID recorded in the logging database. Using the webMethods Integration Monitor, or the Integration Log database itself, that gives you the integration component and instance that caused the exception. In some scenarios it is even possible to resubmit the document using the webMethods Integration Monitor. The activation ID is published as part of the envelope (_env) structure.

Best Regards

Hitesh
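
The correlation described in point 4 amounts to a join on the activation ID. Below is a minimal sketch in Python, assuming the logged rows have already been exported from the logging database as (activation ID, integration component) pairs. The `activation` field name, the sample IDs, and the component names are hypothetical, and no webMethods API is used.

```python
def index_log_rows(log_rows):
    """Map activation ID -> integration component from logged rows."""
    index = {}
    for activation_id, component in log_rows:
        # Keep the first component logged for each activation.
        index.setdefault(activation_id, component)
    return index


def component_for_error(error_doc, index):
    """Look up the component that logged the error's activation ID."""
    activation = error_doc.get("_env", {}).get("activation")
    return index.get(activation, "unknown")


# Hypothetical data standing in for the logging database and an error document.
log_rows = [
    ("act-001", "OrderSync"),
    ("act-002", "AddressBookLoad"),
    ("act-002", "AddressBookLoad.step2"),
]
error_doc = {
    "errorText": "ORA-00904: invalid column name",
    "_env": {"activation": "act-002"},  # assumed field name
}

index = index_log_rows(log_rows)
print(component_for_error(error_doc, index))
```

Errors whose activation ID never reached the log come back as "unknown", which is a reasonable bucket to send to a default administrator.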

This is useful feedback. Looking at the complete adapter:errorNotify document, I see that we have access to the adapter which published the error:

event Adapter::errorNotify {
unicode_string adapterType = "JDBC";
unicode_string errorCategory = "Adapter";
unicode_string errorText = "(316) Could not execute SQL statement \"SELECT   trim(t1.WMEDBT), trim(t1.WMEDTN), trim(t1.WMEDUS), trim(t1.WMAN8), trim(t1.WMDL01), trim(t1.WMCTYA), trim(t1.WMAN81), trim(t1.WMDL10), trim(t1.WMSTATE), trim(t1.WMSTATEDES), trim(t1.WMCTPE), trim(t1.WMDL02), trim(t1.WMMKT4), trim(t1.WMMKT5), trim(t1.WMCTR), trim(t1.WMACFL), trim(t1.WMEDSP), t1.wm_rowid  FROM TESTDTA.F0000194 t1 ORDER BY wm_rowid\".\n(42000/904) ORA-00904: invalid column name\n\n    java.sql.SQLException: ORA-00904: invalid column name\n";
long eventId = 0;
struct {
    date enqueueTime;
    date recvTime;
    unicode_string pubId;
    int age;
} _env = {
    enqueueTime = "04/25/2003 08:47:12.362";
    recvTime = "04/25/2003 08:47:12.362";
    pubId = "MDDUSN084_GSCJDESrcJDBCAdapter";
    age = 0;
};
};

I still don’t see how I can know the integration, but at least this tells me what company created the integration and possibly the application.
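
Routing on pubId could look roughly like the Python sketch below, which pulls pubId out of a text dump like the one above and looks up an owner in a table of patterns. The patterns, the e-mail addresses, and the assumption that the pubId naming convention identifies the owning group are all made up for illustration; nothing here is a webMethods API.

```python
import re

# Hypothetical routing table: pubId pattern -> owner address.
# Both the patterns and the addresses are invented for this example.
OWNERS = [
    (re.compile(r"_GSCJDE"), "gsc-jde-support@example.com"),
    (re.compile(r"JDBCAdapter$"), "dba-team@example.com"),
]
DEFAULT_OWNER = "broker-admins@example.com"

# Matches the assignment `pubId = "..."`, not the field declaration `pubId;`.
PUB_ID = re.compile(r'pubId\s*=\s*"([^"]*)"')


def extract_pub_id(event_text):
    """Return the pubId value from a textual event dump, or None."""
    match = PUB_ID.search(event_text)
    return match.group(1) if match else None


def owner_for(event_text):
    """Pick the first owner whose pattern matches the pubId."""
    pub_id = extract_pub_id(event_text)
    if pub_id is not None:
        for pattern, owner in OWNERS:
            if pattern.search(pub_id):
                return owner
    return DEFAULT_OWNER


sample = (
    'struct { unicode_string pubId; int age; } _env = { '
    'pubId = "MDDUSN084_GSCJDESrcJDBCAdapter"; age = 0; };'
)
print(owner_for(sample))
```

The table is ordered, so more specific patterns should come first; anything that matches nothing falls through to the default address.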

Mike,

You may want to create an errorUDM to capture all the information you want, such as integration, application, integration components and so on. We did this and it works fine.
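
One possible shape for such an error document, sketched as a plain Python data class. The field names are guesses based on the items listed above (integration, application, integration component) and are not the actual document type anyone in this thread defined; the sample values are invented.

```python
from dataclasses import dataclass, asdict


@dataclass
class ErrorUDM:
    """Hypothetical custom error document; every field name is an assumption."""
    integration: str
    application: str
    component: str
    error_text: str
    owner: str = "unassigned"


def build_error(integration, application, component, exc):
    """Wrap a caught exception with the context that errorNotify lacks."""
    return ErrorUDM(integration, application, component, str(exc))


try:
    raise RuntimeError("ORA-00904: invalid column name")
except RuntimeError as exc:
    doc = build_error("AddressBookSync", "JDE", "SrcJDBCQuery", exc)

print(asdict(doc))
```

Because the integration name travels with the error, a subscriber can filter on it directly instead of inferring the source from the message text.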

Has anyone tried BMC Patrol for webMethods? We at Dell Computer Corporation are trying to get away from monitoring webMethods with webMethods. BMC has “knowledge modules” (KMs) for enterprise server, adapters, IS, etc.

-Roy

Roy,

Wells Fargo is using this too…
FYI, there is another J2EE/JMX product you may want to take a look at. It is very nice and provides some insight into your integrations; best of all, it is based on open standards.
http://www.intersperse.com

J-