Service to return the most frequently repeating error log entries

Hi there,
Is it possible to implement a service which would return top most repeating error log entries from the webMethods intergration server. This is needed for adding another step to our daily checks process which automates checking various components on all prod servers such as adapter connections, schedulers, triggers… From my point of view, it would be helpful to get a report of top most repeating error log entries from prod servers.

Thank you in advance,
n23

If you’re storing the error data in a DB (rather than a file), then this becomes an exercise in writing the right SQL query.

If in a file, the job is a bit more complex, and may be better suited to other tools such as a log watcher or something designed for processing semi-structured/unstructured data.
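To make the DB route concrete, here is a minimal sketch of the kind of grouped count involved, using an in-memory SQLite stand-in. The table and column names (`WMERROR`, `MSGTEXT`, `AUDITTIMESTAMP`) are hypothetical; check the actual schema of your JDBC-pool error destination before adapting this.

```python
import sqlite3

# In-memory stand-in for the IS error-log database.
# Table and column names are hypothetical -- verify against your schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WMERROR (MSGTEXT TEXT, AUDITTIMESTAMP TEXT)")
conn.executemany(
    "INSERT INTO WMERROR (MSGTEXT, AUDITTIMESTAMP) VALUES (?, ?)",
    [
        ("Connection refused", "2021-01-01"),
        ("Connection refused", "2021-01-02"),
        ("Null pointer", "2021-01-02"),
    ],
)

# The core of the exercise: group identical messages and rank by frequency.
top_errors = conn.execute(
    """
    SELECT MSGTEXT, COUNT(*) AS occurrences
    FROM WMERROR
    GROUP BY MSGTEXT
    ORDER BY occurrences DESC
    LIMIT 10
    """
).fetchall()

print(top_errors)  # [('Connection refused', 2), ('Null pointer', 1)]
```

The same `GROUP BY` / `ORDER BY COUNT(*) DESC` shape applies whatever the real database is.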


Hi reamon,
Thanks for the answer! It seems that our ISs are configured to store the error log entries in WMERROR* files inside the /logs directory. I also checked the WMERROR table, but it does not contain any entries. I would rather not involve any third-party apps for this. In this case, I will check whether I can somehow define a schema for the file and try to parse it.

Best regards,
n23

If you upgrade to 10.7, there is an official API for accessing the server logs.
If you’re on an older version, take a look at this service.

wm.server.query:getPartialLog?log=error&numLines=50&startLine=0

Other possible arguments:
descendchecked
startDate
endDate

Better than going via the DB: the DB schema can change between versions, and the above works for both file and DB destinations.
regards,
John.


Downside: customer use of internal services is not supported.

True, but neither is direct DB access, and the service is less likely to change.

Apologies for implying that DB access would be preferable to the internal services. Definitely try the services instead. Just a caveat to the OP that SAG support won’t help, with either approach, should issues be encountered.

No problems Rob :stuck_out_tongue_winking_eye:

You’re right, neither is supported, but the services behind the admin DSP pages rarely change. Also, our migration to the new admin UI and the (documented) Admin API means these services will now never change, as they will be replaced. The biggest risk is that in the future they might disappear, but that is unlikely (unless you choose to; we might make them optional in the future), as we rarely remove existing features, to ensure backward compatibility.
regards
John.
PM

Thank you guys for your answers!
@John Right now we are on version 9.9, but we plan to upgrade to 10.7. I will try out the service wm.server.query:getPartialLog. I assume that I will have to use the same hack I use when calling internal services: using service invoke instead of calling the service directly.
@reamon I know that using internal services is not best practice, but unfortunately there are no better alternatives. For instance, for checking the state of triggers we are using an internal service: wm.server.triggers:getTriggerReport.

You can call it via http, I tested it using an external client with

http://localhost:5555/invoke/wm.server.query/getPartialLog?log=error&numLines=50&startLine=0

or, if you want to call it from a flow service, you can cheat by adding a debugLog step to the flow, then looking at the step’s properties and replacing the service attribute with “wm.server.query:getPartialLog”.

boom, you can now use the service just like a normal service, complete with all the inputs and outputs.
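For an external client, the invoke URL follows a fixed pattern (`/invoke/<folder>/<service>?args`). A small sketch of building such a URL programmatically; the helper name `build_invoke_url`, and the host/port, are assumptions for illustration, and a real call would also need Administrator credentials (e.g. HTTP Basic Auth):

```python
from urllib.parse import urlencode

def build_invoke_url(host, port, service, **params):
    """Build an IS HTTP invoke URL: http://host:port/invoke/<folder>/<name>?args.
    The service name uses the usual folder:service notation."""
    folder, _, name = service.rpartition(":")
    return f"http://{host}:{port}/invoke/{folder}/{name}?{urlencode(params)}"

url = build_invoke_url(
    "localhost", 5555,
    "wm.server.query:getPartialLog",
    log="error", numLines=50, startLine=0,
)
print(url)
# http://localhost:5555/invoke/wm.server.query/getPartialLog?log=error&numLines=50&startLine=0
```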

regards,
John.

Ok, thanks. Will try it out :slight_smile:

But this will only return the newest fifty lines of the log, not the entries that are logged most often.

Might be a task for the ELK stack.

Regards,
Holger

Hi Holger,
Yes, but I assume there is no max value you can pass to numLines, so I think passing something like 999999999999999999999 should be fine, no? Nevertheless, I will test it when I have the time. I will have to implement logic to pull the most frequent entries out of the logEntries string list. I just hope it won’t take forever for the flow to complete, since the plan is to call the flow on all prod servers in all clusters.
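The frequency-counting part is straightforward once the lines are in hand. A sketch, assuming each log line starts with a timestamp and timezone (the exact WMERROR line format varies, so the regex below is an assumption to adjust): stripping the timestamp first is what lets repeats of the same message group together.

```python
import re
from collections import Counter

def top_errors(log_lines, n=5):
    """Return the n most frequent messages in a list of error-log lines.
    Assumes lines look like '2021-03-01 10:15:00 CET <message>';
    adjust the regex to your actual log format."""
    counts = Counter()
    for line in log_lines:
        # Drop the leading 'date time zone ' prefix so identical messages match.
        msg = re.sub(r"^\S+ \S+ \S+ ", "", line).strip()
        if msg:
            counts[msg] += 1
    return counts.most_common(n)

lines = [
    "2021-03-01 10:15:00 CET [ISS.0086.0001E] Connection refused",
    "2021-03-01 10:16:00 CET [ISS.0086.0001E] Connection refused",
    "2021-03-01 10:17:00 CET [ISS.0086.0002E] Null pointer",
]
print(top_errors(lines))
```

This runs in a single pass over the list, so even a large pull from getPartialLog should count quickly; the slow part will be fetching and transferring the lines themselves.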

br,
n23

Hi Niemand23,

Using some sort of scripting is a better option, I feel. @Holger_von_Thomsen, @niemand23, @John_Carter4, @reamon, your opinion?

Be careful with putting too large an upper limit; you could crash your IS with an out-of-memory error!
Trying to count error messages will be very difficult without doing some kind of query, in which case we are back to @reamon’s advice, which is to use a DB query.

As an aside, in 10.7 we have a new statistics dashboard, and this might be a good candidate for the services tab. We already have something similar for APIs called “Top 5 slow APIs”, so it might be useful to add a “Top 5 failing services” to the services tab.
regards,
John.


Ok, thanks for the advice!
To give some more context why this is needed, we have a lot of flows which were implemented with absolutely 0 regards to support and monitoring. For some of them, we already implemented elastic search logging but there are just too many of them and we cannot afford to just monitor the error logs on all prod servers which is why in our case the process will assign a task (we are not using the task engine but a custom webMethods solution) to whoever is on call to check most repeating errors and after solving them, the task will be completed (we are also not using the business console for this but a custom dsp as I think that it is better for monitoring because you always know for sure which flow is invoked via Ajax).

A (less-known) nifty option, from a long-term stats-collection standpoint, would be to use the Event Manager feature of the IS to create a subscription to Exception events, using Designer. Your subscribing service can then write the error type (and/or message) to a table that has a counter you can increment for every re-occurrence of the same type of exception.

If you don’t want to depend on a point of failure (i.e., database), you can write the events to the filesystem or WxConfig (best option) if you are using it.

I understand that this essentially duplicates the Error Log entries and adds performance overhead, but it’s a reliable option, and you can expand to other types of events in the long run. A caveat is that the overhead introduced is proportional to the volume of service exceptions (i.e., events).
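The counter-table idea boils down to an upsert per event. The subscriber itself would be a flow or Java service inside the IS; this Python sketch only illustrates the increment-on-reoccurrence logic, with a hypothetical table name (`ERROR_COUNTS`) and SQLite standing in for the real database:

```python
import sqlite3

# Hypothetical counter table for exception events; in practice the
# subscribing IS service would perform this upsert via a JDBC adapter.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ERROR_COUNTS (ERROR_TYPE TEXT PRIMARY KEY, OCCURRENCES INTEGER)"
)

def record_exception(error_type):
    """Increment the counter for this error type, inserting it on first sight."""
    conn.execute(
        """
        INSERT INTO ERROR_COUNTS (ERROR_TYPE, OCCURRENCES) VALUES (?, 1)
        ON CONFLICT(ERROR_TYPE) DO UPDATE SET OCCURRENCES = OCCURRENCES + 1
        """,
        (error_type,),
    )

# Simulate three exception events arriving from the Event Manager.
for e in ["ServiceException", "ServiceException", "OutOfMemoryError"]:
    record_exception(e)

rows = dict(conn.execute("SELECT ERROR_TYPE, OCCURRENCES FROM ERROR_COUNTS"))
print(rows)
```

With the counts maintained continuously, the “top repeating errors” report becomes a trivial ORDER BY on the counter column rather than a scan of the raw log.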

Check the Service Development Help guide under Designer.


We too use internal services. The intent of my post above is really just to make sure you knew that “you’re on your own” when using these. Sounds like you do!

Good idea,
I forgot about that.
The service is

pub.event:addSubscriber

and you have to set eventType to “Error”, “Error Event” or “Exception” :thinking:
Can’t remember, you will have to experiment.

You then specify a service that implements the spec

pub.event:exception

This is a great feature that is undersold. We should give it a proper admin UI!

thnx @Venkata_Kasi_Viswanath for that
regards,
John.

@John_Carter4, indeed! I rue the fact that this useful feature is seldom used or publicized. It is a great aid for operational support.

I leveraged this feature to build a monitoring solution for a customer 9 years ago (on v8.2) and have never seen it used elsewhere, even in our Software AG delivery engagements.

A subscriber can be added via Designer and the pipeline can be saved to see the message contents. The documentation lists all the event types and the usage guidelines, so we don’t have to resort to trial and error :wink: