Monitoring scheduled tasks

I am trying to create a monitor that will generate an alert when a scheduled task has been running too long. By calling:

com.wm.app.b2b.server.scheduler.ScheduledTask[] schTasks = com.wm.app.b2b.server.scheduler.ScheduleManager.getTasks();

I get all the scheduled tasks and find the ones that are running. By comparing scheduledTask.getNextRun() to the current time, I was hoping to determine whether the task should have run again but has not because the previous instance is still running. Then, using scheduledTask.getInterval(), I can determine how far overdue it is and whether to generate a minor or a major alert.
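
To make the intended minor/major decision concrete, here is a minimal sketch of it as a pure function, separate from the Integration Server API. The class name, the Level enum and the multiplier parameters are hypothetical names for illustration. It assumes getNextRun() returns epoch milliseconds and that getInterval() is in the same unit, which I have not verified.

```java
/** Hypothetical helper, not part of the webMethods API. */
final class OverdueClassifier {

    enum Level { NONE, MINOR, MAJOR }

    /**
     * Classifies how overdue a task is.
     *
     * @param nowMillis       current time in epoch milliseconds
     * @param nextRunMillis   value of getNextRun(), assumed to be epoch milliseconds
     * @param intervalMillis  repeat interval, assumed to be in milliseconds
     * @param minorMultiplier overdue by more than this many intervals is a minor alert
     * @param majorMultiplier overdue by more than this many intervals is a major alert
     */
    static Level classify(long nowMillis, long nextRunMillis, long intervalMillis,
                          long minorMultiplier, long majorMultiplier) {
        long overdue = nowMillis - nextRunMillis;
        if (overdue <= 0) {
            return Level.NONE; // the next run is still in the future
        }
        if (overdue > majorMultiplier * intervalMillis) {
            return Level.MAJOR;
        }
        if (overdue > minorMultiplier * intervalMillis) {
            return Level.MINOR;
        }
        return Level.NONE;
    }

    public static void main(String[] args) {
        long interval = 30000L;  // 30 seconds
        long nextRun = 1000000L; // arbitrary reference time
        // Overdue by 1.5 intervals with multipliers 1 and 3
        System.out.println(classify(nextRun + 45000L, nextRun, interval, 1, 3));
        // Overdue by 4 intervals with multipliers 1 and 3
        System.out.println(classify(nextRun + 120000L, nextRun, interval, 1, 3));
    }
}
```

The major threshold is checked first so that a task that is badly overdue is never reported as only a minor alert.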

Unfortunately, when I test it with a scheduled service that simply sleeps (calls Thread.sleep(n)), getNextRun() always returns a time in the future, and a different one on every call. I would expect a service that is scheduled to run, say, every 30 seconds and has been running for several minutes without ending to return the same getNextRun() until it ends and starts again. Is my expectation incorrect? Is there a better way to monitor for a runaway scheduled service?

What is your IS version?

Do you see any particular errors in the logs? Try restarting the IS. All the scheduler information is stored in the IS_USERTASKS table, which is what the Schedulers page displays.

HTH,
RMG

I have tested this on WM 7.1.2. We also have 8.2, where I have not tried it yet, but I need it to work on 7.1.2, 8.2 and eventually 9.6.

There are no errors. I log the data and can indeed see that the getNextRun() method is returning a new value every time.

I do see the services in the IS table.

Hi Mehdi,

Could you share the code so that we can comment on it more usefully? I suspect there might be an issue in your code.

Thanks,

Here is the code snippet:


		try {

			// Get all scheduled tasks and their state
			com.wm.app.b2b.server.scheduler.ScheduledTask [] schTasks = com.wm.app.b2b.server.scheduler.ScheduleManager.getTasks();

			for ( int i = 0; i < schTasks.length; i ++ ) {
				com.wm.app.b2b.server.scheduler.ScheduledTask st = schTasks[ i ];

				logMsg(  "[gxs.smg.service.util:monitorScheduledServices] "
						+ "[" + i + "] " + st.getService() + " State: " + st.getState(), JournalLogger.VERBOSE2 );

				// We only care about running services
				if ( com.wm.app.b2b.server.scheduler.ScheduledTask.STATE_RUNNING != st.getState() ) {
					continue;
				}

				Date now = new Date();
				String sn = ( -1 < st.getService().indexOf( ":" ) ? st.getService().substring( st.getService().indexOf( ":" ) + 1 ) : st.getService() );

				logMsg( "[gxs.smg.service.util:monitorScheduledServices] "
						+ "Running service: '" + st.getService() + "'"
						+ " Interval: " + st.getInterval()
						+ "\n" + sn + " NextRun: " + st.getNextRun()
						+ "\n" + sn + "     Now: " + now.getTime()
						+ " Running too long? " + ( now.getTime() > st.getNextRun() )
						+ " >> " + st.toString()
						, JournalLogger.DEBUG
						);

				// Alert if the service has been running too long
				if ( now.getTime() > st.getNextRun() ) {

					long diff = now.getTime() - st.getNextRun();

					if ( diff > ( majorAlert * st.getInterval() ) ) {
						logMsg( "[gxs.smg.service.util:monitorScheduledServices] "
								+ "Major Alert: scheduled service: '" + st.getService()
								+ "' is still running when it should have run already! "
								+ " Interval: " + st.getInterval()
								+ " NextRun: " + st.getNextRun()
								+ " Now: " + now.getTime()
								+ " Diff: " + ( diff / 1000 ) + " s" // diff is in milliseconds
								+ " >> " + st.toString()
								, JournalLogger.ERROR
								);
						continue;
					}

					if ( diff > ( minorAlert * st.getInterval() ) ) {
						logMsg( "[gxs.smg.service.util:monitorScheduledServices] "
								+ "Minor Alert: scheduled service: '" + st.getService()
								+ "' is still running when it should have run already! "
								+ " Interval: " + st.getInterval()
								+ " NextRun: " + st.getNextRun()
								+ " Now: " + now.getTime()
								+ " Diff: " + ( diff / 1000 ) + " s" // diff is in milliseconds
								+ " >> " + st.toString()
								, JournalLogger.ERROR
								);
						continue;
					}

				}

			}
		} catch ( Exception e ) {
			logMsg( "[gxs.smg.service.util:monitorScheduledServices] "
					+ "Unexpected Result: Encountered an exception while monitoring scheduled services: " + e.getMessage(), JournalLogger.ERROR );
			e.printStackTrace();
		}
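
Since getNextRun() moved on every call in my test, one possible alternative is for the monitor itself to remember when it first saw each task in the running state and to alert on that elapsed time instead. The following is only a sketch of that idea, not something taken from the Integration Server API. RunningSinceTracker and observe() are hypothetical names, and the key could be something like the value of st.getService().

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: remembers when each task was first observed running. */
final class RunningSinceTracker {

    // task key -> time (epoch ms) at which the monitor first saw it running
    private final Map<String, Long> firstSeenRunning = new HashMap<String, Long>();

    /**
     * Records one observation of a task.
     *
     * @return milliseconds since the task was first seen running without an
     *         intervening non-running observation; 0 if it is not running or
     *         this is the first time it has been seen running
     */
    long observe(String taskKey, boolean running, long nowMillis) {
        if (!running) {
            firstSeenRunning.remove(taskKey); // forget tasks that have finished
            return 0L;
        }
        Long since = firstSeenRunning.get(taskKey);
        if (since == null) {
            firstSeenRunning.put(taskKey, Long.valueOf(nowMillis));
            return 0L;
        }
        return nowMillis - since.longValue();
    }
}
```

The obvious limitation is that the monitor only sees what it polls: if a task ends and starts again between two polls, the tracker treats it as one long run. The monitor would therefore need to poll more often than the shortest task interval, and the tracker would have to be kept somewhere that survives between monitor runs (for example a static field, which is an assumption about how the monitor service is deployed).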

FYI, I ran this service on WM 8.2 and got the same result.

In 8.2, the getTasks() method has been changed and requires two new parameters:

com.wm.app.b2b.server.scheduler.ScheduleManager.getTasks(internalTasksOnly, thisISOnly)

After setting those (getTasks(false, true)), getNextRun() still returns an updated value every time the call is made!
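
Because the same monitor has to run on 7.1.2 and 8.2, where the getTasks() signature differs, one way to avoid compiling two versions might be to look the method up by reflection. This is a rough sketch under assumptions: that both variants are public static, and that the two new parameters are primitive booleans (the names internalTasksOnly and thisISOnly are from the signature above, the types are my guess). GetTasksCompat and the two stand-in manager classes are hypothetical and exist only so the sketch runs on its own.

```java
import java.lang.reflect.Method;

/** Hypothetical helper: calls whichever getTasks variant the given class offers. */
final class GetTasksCompat {

    /** Stand-in for a manager with the older no-argument signature. */
    public static final class OldStyleManager {
        public static String[] getTasks() {
            return new String[] { "old" };
        }
    }

    /** Stand-in for a manager with the newer two-argument signature. */
    public static final class NewStyleManager {
        public static String[] getTasks(boolean internalTasksOnly, boolean thisISOnly) {
            return new String[] { "new", String.valueOf(internalTasksOnly), String.valueOf(thisISOnly) };
        }
    }

    /** Tries getTasks(boolean, boolean) first and falls back to getTasks(). */
    static Object getTasks(Class<?> managerClass, boolean internalTasksOnly, boolean thisISOnly)
            throws Exception {
        try {
            Method m = managerClass.getMethod("getTasks", boolean.class, boolean.class);
            return m.invoke(null, Boolean.valueOf(internalTasksOnly), Boolean.valueOf(thisISOnly));
        } catch (NoSuchMethodException e) {
            // Older signature: no parameters
            Method m = managerClass.getMethod("getTasks");
            return m.invoke(null);
        }
    }
}
```

In the real monitor the first argument would presumably be com.wm.app.b2b.server.scheduler.ScheduleManager.class and the result would be cast to ScheduledTask[]. I have not tried that against an actual server.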

OK, that makes sense. Thanks for the update.