Server Internal Email Notification spamming

We have an IS 6.1 server whose Internal Email address (which receives reports of critical log entries) and Service Email address (which receives messages about service failures) both point to mail lists.
The configuration for email notifications is found on the server settings page at:
Settings > Logging

The reason for using mail lists in our environment is that multiple individuals need this information; however, the same problem would exist if we had configured only a single email address.

The Problem:
Last week, a developer's flow service errored and generated more than 19,000 entries to the listservs, taking them down and causing much grief for our email administrator.

We have been asked to disable the email notifications that the WM Server can generate. From a WM administrator's standpoint, does anyone know a way to stop this kind of spamming by the WM Server? The ideal situation would be a single warning that the service/critical error had occurred, followed later by a message that 18,999 more are pending but suppressed, or some similar solution. We do get valid informational messages from the current configuration, but on occasion we get these spam problems.
Anyone have any clues? How are you handling this situation? I plan to file a request with WM support to see what is “best practice” from the vendor perspective.
Thanks in advance for any thoughts/ideas you may have.

Randy,
We use a custom common error handler for our flow services. We require each flow service to use it, and we trap all errors through it. The handler is configurable as to the level at which an alarm is raised, i.e. low, medium, or high. We do not use the internal logging email, as it tends to be a little too verbose. Instead we can tell the handler to just write to a log, send an email, or page someone, depending on the severity. Even so, we have had these mass emails ourselves.
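
To make the severity routing concrete, here is a minimal sketch in Python rather than flow. All names (`Severity`, `handle_error`, `send_email`, `send_page`) are hypothetical, and the mapping of medium to email and high to email plus page is an assumption for illustration, not a description of any real handler.

```python
import logging
from enum import IntEnum

log = logging.getLogger("common_error_handler")


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def handle_error(service, message, severity, send_email, send_page):
    """Route one error by severity and return the channels that were used.

    Assumed policy: every error is logged, MEDIUM and above is e-mailed,
    HIGH is also paged. send_email and send_page are caller-supplied
    callables (hypothetical), so the routing can be tested without a
    mail server.
    """
    channels = ["log"]
    log.error("%s: %s", service, message)
    if severity >= Severity.MEDIUM:
        send_email(service, message)
        channels.append("email")
    if severity >= Severity.HIGH:
        send_page(service, message)
        channels.append("page")
    return channels
```

Keeping the routing in one place means the thresholds can be changed without touching the individual flow services.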

However, if the flow service that generated 19,000 entries represents a single atomic transaction with each iteration of the flow, welcome to the world of real-time, event-driven integration. While there are no built-in solutions for this, you will probably get some answers from folks here on custom solutions for correlating uncorrelated errors. In other words, if transaction A fails because a database call fails due to an infrastructure error, transaction B (same flow) will also fail with the same error. However, the Integration Server is not going to make that correlation, even though we can infer it.

In the above case you could have the flow service use retry logic and keep the other 18,999 in the broker queue until the database became available again. If it is a logic error, have your developers test better.
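
The retry idea can be sketched roughly as follows. This is plain Python, not flow or broker configuration; `TransientError`, `call_with_retry`, and the exponential delay schedule are assumptions chosen for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. database down)."""


def call_with_retry(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts
    and re-raises the last TransientError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up: this is the point to raise a single alert
            sleep(base_delay * 2 ** (attempt - 1))
```

With something like this, an outage produces one failure after the retries are exhausted instead of one alert per queued document.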

markg
http://darth.homelinux.net

Randy-
Is this a production system that generated 19k? (Hey I got you beat, we once did one that generated 154,000 e-mails). Anyway, I hope that this developer error was in your dev/test systems only. I would suggest that once you have tested the e-mail notification process, you disable it in non-production systems.

Haven’t we all been threatened by our email administrators?

markg
http://darth.homelinux.net

I’ve seen an approach that is simple but works reasonably well.

The idea is to keep track of the types of e-mail alerts you’re sending and to send the same type only once within a configurable timeframe.

For example, we had a service that was called whenever an e-mail was to be potentially sent. We used the receiver and the TN doc type as the key of an entry made into the repository. That entry holds the time of the last e-mail submission. The service would first check if the entry existed and if so, check to see if enough time had elapsed. If not enough time had passed, no e-mail would be sent (but a server.log entry made). If the entry did not exist or the time had elapsed, then the e-mail would be sent and the entry added/updated. This prevented e-mail floods.

Maybe this approach would work in your scenario, with keys and time lags that are appropriate.
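
A minimal sketch of that check, in Python for brevity. The original kept the last-sent time in the repository; the in-memory dict here, the `AlertThrottle` class, and the injectable `clock` are assumptions made so the example is self-contained.

```python
import time


class AlertThrottle:
    """Allow at most one alert per key within min_interval seconds."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_sent = {}  # key -> time the last e-mail was sent

    def should_send(self, key):
        """Return True if an e-mail for this key may be sent now."""
        now = self.clock()
        last = self.last_sent.get(key)
        if last is not None and now - last < self.min_interval:
            return False  # too soon: caller writes a log entry instead
        self.last_sent[key] = now  # add or update the entry
        return True
```

The caller would build the key from whatever identifies "the same kind of alert", for example `(receiver, doc_type)`, and send the e-mail only when `should_send(key)` returns True.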

Thanks for your suggestions thus far. This confirms what I expected; however, I do have a case logged with webMethods and will share the official word I get from support with this list.
I am still open to all other ideas. And yes, Roger, this was a development system, but it could happen today in production if bad code slipped through our migration methodology and QA processes.

Rob,
Isn’t there a more refined way to handle this than the approach you mentioned above? I too have had this problem, but this solution somehow doesn’t seem appealing to me.

Regards,
Pradeep

I guess that would depend on your definition of “refined” or “appealing.” The approach has these benefits:

  • Relatively simple.
  • Caller sets the key so can be as fine or coarse grained as desired.
  • Avoids e-mail floods.
  • But still alerts to a problem.

It has these drawbacks:

  • Uses the repository to store the last sent time (this is a minor issue since if you lose the repository for some reason, it doesn’t matter) and adds some I/O overhead
  • Support people need to understand that not every error will be alerted. If an alert is received, support needs to check for all errors of that type, not just the one they received the e-mail for.

I’m not aware of anything out-of-the-box that would achieve the desired behaviour. I’d be interested in hearing about other approaches.
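
One possible variation, sketched in Python, softens the second drawback by counting what was suppressed, which comes close to the "single warning, then a message that 18,999 are suppressed" behaviour asked for at the top of the thread. The `SummarizingThrottle` class, its message texts, and the idea of calling `flush()` from a scheduled job are all assumptions, not an existing product feature.

```python
import time


class SummarizingThrottle:
    """Alert once per key per window and count the errors that were suppressed."""

    def __init__(self, window, clock=time.monotonic):
        self.window = window
        self.clock = clock
        self.state = {}  # key -> [window_start, suppressed_count]

    def record(self, key):
        """Return a message to e-mail for this error, or None to suppress it."""
        now = self.clock()
        entry = self.state.get(key)
        if entry is not None and now - entry[0] < self.window:
            entry[1] += 1  # inside the window: count it, send nothing
            return None
        suppressed = entry[1] if entry is not None else 0
        self.state[key] = [now, 0]  # start a new window for this key
        if suppressed:
            return f"{key}: error occurred ({suppressed} earlier errors suppressed)"
        return f"{key}: error occurred"

    def flush(self):
        """Return one summary per key with suppressed errors and reset the counts."""
        messages = []
        for key, entry in self.state.items():
            if entry[1]:
                messages.append(f"{key}: {entry[1]} further errors suppressed")
                entry[1] = 0
        return messages
```

Without a periodic `flush()`, the suppressed count would only be reported when the next error arrives after the window has passed.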

Just for information, this is the response from vendor support with my reply.


Subject: RE: 1-56126286 | Internal Email notification spamming

Randy,

Thank you for your recent support request. This issue is being tracked as SR 1-56126286.

To answer your questions -

Integration Server provides a capability called “Event Handling”, in which you can write services (event handlers), subscribing to specific events.

There are different types of events you can subscribe to, depending on requirement.

For your case, you might be interested in subscribing to “Exception” events.
Further, you can add filter conditions to event subscriptions, which will allow you to send notifications selectively.

E.g., you might want to send an email alert to a specific set of engineers when a specific exception is encountered.

At a high level, you will need to write a flow service, which uses “pub.client:smtp” to send an email to designated users.

This flow service will subscribe to “Exception” events, and will check the value of “errorType”, in the input it receives, when “Exception” event occurs on IS.

Detailed steps for building Event Handlers, and different types of Events, are documented in our “Developer Users Guide” (DeveloperUsersGuide.pdf), bundled along with the product installation. “Chapter 15: Subscribing to Events” elaborates on this feature.

Let me know if this information helps.

webMethods Technical Services
US Phone: 1-888-222-8215

www.webMethods.com
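
The filtering step described in the support reply can be sketched outside of flow as follows. This is Python for illustration only: the event is modelled as a plain dict, only the `errorType` field name comes from the reply above, and `make_exception_handler`, `alert_types`, and `send_email` are hypothetical names standing in for the real event handler and the call to pub.client:smtp.

```python
def make_exception_handler(alert_types, send_email):
    """Build a handler that e-mails only for the listed errorType values."""
    alert_types = frozenset(alert_types)

    def on_exception(event):
        # event is a dict standing in for the handler's input document.
        error_type = event.get("errorType")
        if error_type not in alert_types:
            return False  # not an exception type we alert on
        send_email(f"Exception event: {error_type}")
        return True

    return on_exception
```

Note that filtering by type alone does not limit volume: a handler like this would still send 19,000 e-mails for 19,000 matching exceptions unless it is combined with some form of throttling.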


My response dated… Wed, 08 Jun 2005 10:50:37 -0400

To support:
Yes, we are evaluating that as an option.
What we really want is an escalation path. Judging from the thread I started on the wmusers list,
http://www.wmusers.com/wmusers/messages/6861/50034.shtml?1117575555 this appears to be a common problem that customers are solving with their own custom solutions.

We too are looking at a custom solution.
It would truly be beneficial for you as the vendor to provide a “configurable” utility within the product to handle an escalation path for such error events.
Maybe an addition to PSUtilities if the vendor would not support a “pseudo” generic solution?
You have something similar for the excessive login error messages that are generated.
If you decide to file this with backline support as a feature request on our behalf,
please also mention that the Service Email designation and Internal Email designation cannot be used (at least in 6.1) for multiple email addresses.
This forces one to use an email list or mailbox designation for this configuration to have more than one person notified.
Using mail lists has email administration ramifications.
Regards,
Randy