Service to return top most repeating error log entries

Ok, thanks. Will try it out :slight_smile:

But this will only return the newest fifty lines of the log, not the entries that are logged more often than others.

Might be a task for the ELK stack.

Regards,
Holger

Hi Holger,
Yes, but I assume there is no max value that you can pass to numlines, so I think passing something like 999999999999999999999 should be fine, right? Nevertheless, I will test it when I have the time. I will have to implement logic to pull the most frequent entries out of the logEntries String list. I just hope that it won’t take forever for the flow to complete, since the plan is to call the flow on all prod servers from all clusters.
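The counting step described above can be sketched roughly as follows. This is only an illustration in Python, not IS Flow or Java code, and it assumes each entry has already been reduced to its message text (raw error log lines with timestamps would never repeat exactly). The function name `top_repeating` and the sample messages are made up for the example.

```python
from collections import Counter

def top_repeating(log_entries, n=5):
    """Return the n most frequent entries as (entry, count) pairs."""
    # Strip whitespace and skip blank lines so trivially different
    # copies of the same message are counted together.
    counts = Counter(entry.strip() for entry in log_entries if entry.strip())
    return counts.most_common(n)

# Hypothetical sample data standing in for the logEntries String list.
sample = [
    "NullPointerException in orders:process",
    "Timeout calling billing:charge",
    "NullPointerException in orders:process",
    "Timeout calling billing:charge",
    "NullPointerException in orders:process",
    "Disk quota exceeded",
]

print(top_repeating(sample, 2))
```

Counting is a single pass over the list, so the runtime grows linearly with the number of lines requested; the memory needed to hold the list itself is the more likely limit.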

br,
n23

Hi Niemand23,

I feel that using some sort of scripting is the better option. @Holger_von_Thomsen, @niemand23, @John_Carter4, @reamon, what is your opinion?

Be careful with setting too large an upper limit; you could crash your IS with an out-of-memory error!
Trying to count error messages will be very difficult without doing some kind of query, in which case we go back to @reamon's advice, which is to use a DB query.

As an aside, in 10.7 we have a new statistics dashboard and this might be a good candidate to be added to the services tab. We already have something similar for APIs called “Top 5 slow APIs”, so it might be useful to add a “top 5 failing services” to the services tab.
regards,
John.

Ok, thanks for the advice!
To give some more context on why this is needed: we have a lot of flows that were implemented with absolutely zero regard for support and monitoring. For some of them we have already implemented Elasticsearch logging, but there are just too many of them, and we cannot afford to just monitor the error logs on all prod servers. That is why, in our case, the process will assign a task to whoever is on call to check the most repeated errors (we are not using the task engine but a custom webMethods solution). After the errors are solved, the task will be completed. We are also not using the business console for this but a custom DSP, as I think it is better for monitoring because you always know for sure which flow is invoked via Ajax.

A (lesser-known) nifty option, from a long-term stats collection standpoint, would be to use the Event Manager feature of the IS to create a subscription to Exception events, using Designer. Your subscribing service can then write the error type (and/or message) to a table that has a counter, which you increment for every recurrence of the same type of exception.
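As a rough sketch of the counter-table idea, the snippet below uses Python with an in-memory SQLite database purely for illustration. On the IS the subscribing service would be a Flow or Java service writing to whatever database you use; the table name `error_counts`, its columns, and the function names are all assumptions made for this example.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open the store and create the (hypothetical) counter table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS error_counts ("
        " error_type TEXT PRIMARY KEY,"
        " occurrences INTEGER NOT NULL)"
    )
    return conn

def record_exception(conn, error_type):
    """Increment the counter for error_type, creating the row on first sight."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE error_counts SET occurrences = occurrences + 1"
            " WHERE error_type = ?",
            (error_type,),
        )
        if cur.rowcount == 0:
            # No existing row was updated, so this is the first occurrence.
            conn.execute(
                "INSERT INTO error_counts (error_type, occurrences)"
                " VALUES (?, 1)",
                (error_type,),
            )

def top_errors(conn, n=5):
    """Return the n most frequent error types as (error_type, occurrences)."""
    return conn.execute(
        "SELECT error_type, occurrences FROM error_counts"
        " ORDER BY occurrences DESC, error_type LIMIT ?",
        (n,),
    ).fetchall()

conn = open_store()
for err in ["ServiceException", "IOException", "ServiceException"]:
    record_exception(conn, err)
print(top_errors(conn))
```

The update-then-insert pattern assumes a single writer. With several servers writing to one shared table, you would probably want the database's own upsert or merge statement instead.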

If you don’t want to depend on a point of failure (i.e., database), you can write the events to the filesystem or WxConfig (best option) if you are using it.

I understand that this is essentially duplicating the Error Log entries and adding performance overhead, but it’s a reliable option and you can expand to other types of events in the long run. A caveat is that the performance overhead introduced is proportional to the number of service exceptions (i.e., events).

Check the Service Development Help guide under Designer.

We too use internal services. The intent of my post above is really just to make sure you knew that “you’re on your own” when using these. Sounds like you do!

Good idea,
Forgot about that,
The service is

pub.event:addSubscriber

and you have to set eventType to “Error”, “Error Event” or “Exception” :thinking:
Can’t remember, you will have to experiment.

You then specify a service that implements the spec

pub.event:exception

This is a great feature that is undersold. We should give it a proper admin UI!

thnx @Venkata_Kasi_Viswanath_Mugada1 for that
regards,
John.

@John_Carter4, indeed! I rue the fact that this useful feature is seldom used or publicized. It is a great aid for operational support.

I had leveraged this feature to build a monitoring solution for a customer 9 years ago (on v8.2) and have never seen it being used elsewhere - even in our Software AG delivery engagements.

A subscriber can be added via Designer and the pipeline can be saved to see the message contents. The documentation lists all the event types and the usage guidelines, so we don’t have to resort to trial and error :wink:

Yes,
I’m going to keep a close eye on Brainstorm in the future. I have also used this feature actively in the past and find it very useful for implementing non-invasive error handlers and monitoring tools.
regards,
John.

@sai-krishna.atmuri , “better” always needs qualification. What are the pros and cons? Those within the context of a given environment will help determine which option suits the need for the given constraints. There is some appeal to scripting depending upon a number of factors – which scripting language? where does it run? does it provide capabilities beyond what other options provide? Is the talent pool sufficient to support it long term, or will the knowledge disappear when the original developer leaves the company?

Good call on leveraging the events facility.

Events used to feed ElasticSearch might be a great combination. Could use the spiffy ES tools to visualize and do all sorts of other cool things.

Edit: Of course LogStash would be an alternative to consider, connecting directly with file-based or DB-based server, error, stats, audit, etc. logs that the wM components produce.
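To give a feel for the events-to-Elasticsearch idea, here is a small Python sketch that turns captured exception events into a request body for the Elasticsearch `_bulk` API, which expects newline-delimited JSON: one action line followed by one document line per event, with a trailing newline. The index name `wm-exceptions` and the event fields (`service`, `error`, `time`) are placeholders, not what pub.event:exception actually delivers, and the HTTP call itself is left out.

```python
import json

def to_bulk_body(events, index_name="wm-exceptions"):
    """Build an NDJSON body for the Elasticsearch _bulk endpoint."""
    lines = []
    for event in events:
        # Action line: index the following document into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the event document itself.
        lines.append(json.dumps(event))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical events, as a subscribing service might have collected them.
events = [
    {"service": "orders:process", "error": "NullPointerException",
     "time": "2021-03-01T10:15:00Z"},
    {"service": "billing:charge", "error": "Timeout",
     "time": "2021-03-01T10:16:30Z"},
]

print(to_bulk_body(events), end="")
```

The body would be sent with a `Content-Type: application/x-ndjson` header. Batching events like this, rather than sending one request per exception, keeps the overhead on the IS side lower.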

Many thanks guys! Now I think that I have enough solution proposals :slight_smile: I did not know that I can handle all exceptions using pub.event:addSubscriber. I think that it will be useful to send the exceptions to elasticsearch.

You can do more than just Exception events; give the documentation a check.
Can you pick a solution, so that this topic is closed as solved?

KM

Sort of interesting and a bit odd to view the community chats the same as support tickets – something to be “closed as solved.” Why would that matter in any way?

I’ve always found that skipping directly to a “solution” (or the best fit) saves time. This is especially true with posts that are often revived by people asking “I have the same problem, how did you solve it?” without going through all the responses.

No harm, no foul eh!

Ah, okay. I guess I was too focused on the “closed” part. Marking a response as “Selected Answer”, “Solution”, or “Solved”, etc. is definitely useful.

Indeed, I didn’t word it right with “closed” :slight_smile:

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.