Scheduler threads stuck waiting because of a trigger task

We are running version 8.2 on a Unix box.
For the past few days the schedulers have stayed in the running state for very long periods of time.
After a restart everything went back to normal.

Analysing the thread dumps taken at the time of the issue, we found that the following is putting the scheduler threads into a waiting state.

The schedulers are waiting for monitor entry because the following TriggerTask has a lock on the resource.

"TriggerTask:4:FXrLXXX.SubUpdxxxxxxxxxx.trigger:subXXXXXXXXXs" prio=7 tid=6000000003abf800 nid=107 lwp_id=991486 in Object.wait() [87fffffec76ff000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <87fffffef3866098> (a com.wm.driver.data.fs.FSData$Hash$Ent)
at com.wm.driver.data.fs.FSData$Hash$Ent.getEntry(FSData.java:3733)
- locked <87fffffef3866098> (a com.wm.driver.data.fs.FSData$Hash$Ent)
at com.wm.driver.data.fs.FSData$Hash.get(FSData.java:3653)
at com.wm.driver.data.fs.FSData.getEntry(FSData.java:2085)
at com.wm.driver.data.fs.FSDEntry.getReferencedEntry(FSDEntry.java:455)
at com.wm.driver.data.fs.FSDEntry.getIntRef3(FSDEntry.java:461)
at com.wm.driver.data.fs.FSDEntry.getNext(FSDEntry.java:355)
at com.wm.driver.data.fs.FSData.getNullEntry(FSData.java:2204)
- locked <87fffffee197fae8> (a java.lang.Object)
at com.wm.driver.data.fs.FSData.allocDataBlock(FSData.java:2497)
- locked <87fffffee197fad8> (a java.lang.Object)
at com.wm.driver.data.fs.FSData.allocDataBlock(FSData.java:2452)
at com.wm.driver.data.fs.FSData.createDataChain(FSData.java:2425)
at com.wm.driver.data.fs.FSDirCursor._encode(FSDirCursor.java:630)
at com.wm.driver.data.fs.FSDirCursor._setKey(FSDirCursor.java:625)
at com.wm.driver.data.fs.FSDirCursor._insertAfter(FSDirCursor.java:747)
- locked <87fffffee1a74b30> (a com.wm.driver.data.fs.FSDEntry$FSDEntryRef)
at com.wm.driver.data.fs.FSDirCursor.insertAfter(FSDirCursor.java:483)
at com.wm.util.data.TxnData$Element.backingCommit(TxnData.java:2295)
at com.wm.util.data.TxnData$Txn.commitTXN(TxnData.java:4219)
- locked <87fffffee1d4fd98> (a java.lang.Object)
at com.wm.util.data.TxnData$Txn.commitTXN(TxnData.java:4116)
at com.wm.app.store.impl.TSConsumer._persist(TSConsumer.java:460)
at com.wm.app.store.impl.TSConsumer.persist(TSConsumer.java:383)
at com.wm.app.b2b.server.dispatcher.PersistenceManager.persistToTriggerStores(PersistenceManager.java:409)
- locked <87fffffee0d61cc0> (a com.wm.app.store.impl.TSConsumer)
at com.wm.app.b2b.server.dispatcher.trigger.TriggerManager.deliverToTriggers(TriggerManager.java:522)
at com.wm.app.b2b.server.dispatcher.LocalProducer.deliverToTriggers(LocalProducer.java:33)
at com.wm.app.b2b.server.dispatcher.LocalProducer.persist(LocalProducer.java:51)
at com.wm.app.b2b.server.dispatcher.Dispatcher.publish(Dispatcher.java:266)
at com.wm.app.b2b.server.dispatcher.DispatchFacade.publish(DispatchFacade.java:223)
at com.wm.app.b2b.server.dispatcher.DispatchFacade.publish(DispatchFacade.java:186)
at wm.server.publish.publish(publish.java:304)
at sun.reflect.GeneratedMethodAccessor182.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.wm.app.b2b.server.JavaService.baseInvoke(JavaService.java:443)
at com.wm.app.b2b.server.invoke.InvokeManager.process(InvokeManager.java:643)
at com.wm.app.b2b.server.util.tspace.ReservationProcessor.process(ReservationProcessor.java:41)
at com.wm.app.b2b.server.invoke.StatisticsProcessor.process(StatisticsProcessor.java:44)
at com.wm.app.b2b.server.invoke.ServiceCompletionImpl.process(ServiceCompletionImpl.java:243)
at com.wm.app.b2b.server.invoke.ValidateProcessor.process(ValidateProcessor.java:51)
at com.wm.app.b2b.server.invoke.PipelineProcessor.process(PipelineProcessor.java:171)
at com.wm.app.b2b.server.ACLManager.process(ACLManager.java:276)
at com.wm.app.b2b.server.invoke.DispatchProcessor.process(DispatchProcessor.java:30)
at com.wm.app.b2b.server.AuditLogManager.process(AuditLogManager.java:363)
at com.wm.app.b2b.server.invoke.InvokeManager.invoke(InvokeManager.java:547)
at com.wm.app.b2b.server.invoke.InvokeManager.invoke(InvokeManager.java:386)
at com.wm.app.b2b.server.ServiceManager.invoke(ServiceManager.java:234)
at com.wm.app.b2b.server.ServiceManager.invoke(ServiceManager.java:94)
at com.wm.app.b2b.server.Service.doInvoke(Service.java:652)
at com.wm.app.b2b.server.Service.doInvoke(Service.java:540)
at pub.publish.publish(publish.java:146)
at sun.reflect.GeneratedMethodAccessor181.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.wm.app.b2b.server.JavaService.baseInvoke(JavaService.java:443)
at com.wm.app.b2b.server.invoke.InvokeManager.process(InvokeManager.java:643)
at com.wm.app.b2b.server.util.tspace.ReservationProcessor.process(ReservationProcessor.java:41)
at com.wm.app.b2b.server.invoke.StatisticsProcessor.process(StatisticsProcessor.java:44)
at com.wm.app.b2b.server.invoke.ServiceCompletionImpl.process(ServiceCompletionImpl.java:243)
at com.wm.app.b2b.server.invoke.ValidateProcessor.process(ValidateProcessor.java:51)
at com.wm.app.b2b.server.invoke.PipelineProcessor.process(PipelineProcessor.java:171)
at com.wm.app.b2b.server.ACLManager.process(ACLManager.java:276)
at com.wm.app.b2b.server.invoke.DispatchProcessor.process(DispatchProcessor.java:30)
at com.wm.app.b2b.server.AuditLogManager.process(AuditLogManager.java:363)
at com.wm.app.b2b.server.invoke.InvokeManager.invoke(InvokeManager.java:547)
at com.wm.app.b2b.server.invoke.InvokeManager.invoke(InvokeManager.java:386)
at com.wm.app.b2b.server.ServiceManager.invoke(ServiceManager.java:234)
at com.wm.app.b2b.server.BaseService.invoke(BaseService.java:194)
at com.wm.lang.flow.FlowInvoke.invoke(FlowInvoke.java:324)
at com.wm.lang.flow.FlowState.invokeNode(FlowState.java:584)
at com.wm.lang.flow.FlowState.step(FlowState.java:444)
at com.wm.lang.flow.FlowState.invoke(FlowState.java:409)
at com.wm.app.b2b.server.FlowSvcImpl.baseInvoke(FlowSvcImpl.java:1057)
at com.wm.app.b2b.server.invoke.InvokeManager.process(InvokeManager.java:643)
at com.wm.app.b2b.server.util.tspace.ReservationProcessor.process(ReservationProcessor.java:41)
at com.wm.app.b2b.server.invoke.StatisticsProcessor.process(StatisticsProcessor.java:44)
at com.wm.app.b2b.server.invoke.ServiceCompletionImpl.process(ServiceCompletionImpl.java:243)
at com.wm.app.b2b.server.invoke.ValidateProcessor.process(ValidateProcessor.java:51)
at com.wm.app.b2b.server.invoke.PipelineProcessor.process(PipelineProcessor.java:171)
at com.wm.app.b2b.server.ACLManager.process(ACLManager.java:276)
at com.wm.app.b2b.server.invoke.DispatchProcessor.process(DispatchProcessor.java:30)
at com.wm.app.b2b.server.AuditLogManager.process(AuditLogManager.java:363)
at com.wm.app.b2b.server.invoke.InvokeManager.invoke(InvokeManager.java:547)
at com.wm.app.b2b.server.invoke.InvokeManager.invoke(InvokeManager.java:386)
at com.wm.app.b2b.server.ServiceManager.invoke(ServiceManager.java:234)
at com.wm.app.b2b.server.BaseService.invoke(BaseService.java:194)
at com.wm.lang.flow.FlowInvoke.invoke(FlowInvoke.java:324)
at com.wm.lang.flow.FlowState.invokeNode(FlowState.java:584)
at com.wm.lang.flow.FlowState.step(FlowState.java:444)
at com.wm.lang.flow.FlowState.invoke(FlowState.java:409)
at com.wm.app.b2b.server.FlowSvcImpl.baseInvoke(FlowSvcImpl.java:1057)
at com.wm.app.b2b.server.invoke.InvokeManager.process(InvokeManager.java:643)
at com.wm.app.b2b.server.util.tspace.ReservationProcessor.process(ReservationProcessor.java:41)
at com.wm.app.b2b.server.invoke.StatisticsProcessor.process(StatisticsProcessor.java:44)
at com.wm.app.b2b.server.invoke.ServiceCompletionImpl.process(ServiceCompletionImpl.java:243)
at com.wm.app.b2b.server.invoke.ValidateProcessor.process(ValidateProcessor.java:51)
at com.wm.app.b2b.server.invoke.PipelineProcessor.process(PipelineProcessor.java:171)
at com.wm.app.b2b.server.ACLManager.process(ACLManager.java:276)
at com.wm.app.b2b.server.invoke.DispatchProcessor.process(DispatchProcessor.java:30)
at com.wm.app.b2b.server.AuditLogManager.process(AuditLogManager.java:363)
at com.wm.app.b2b.server.invoke.InvokeManager.invoke(InvokeManager.java:547)
at com.wm.app.b2b.server.invoke.InvokeManager.invoke(InvokeManager.java:344)
at com.wm.app.b2b.server.dispatcher.trigger.Trigger.invokeService(Trigger.java:411)
at com.wm.app.b2b.server.dispatcher.trigger.Trigger.processMessage(Trigger.java:302)
at com.wm.app.b2b.server.dispatcher.trigger.DefaultTriggerTaskHelper.process(DefaultTriggerTaskHelper.java:253)
at com.wm.app.b2b.server.dispatcher.trigger.TriggerTask.run(TriggerTask.java:286)
at com.wm.util.pool.PooledThread.run(PooledThread.java:131)
- locked <87fffffee1da1d00> (a com.wm.app.b2b.server.TMPooledThread)
at java.lang.Thread.run(Thread.java:619)
Locked ownable synchronizers:
- None
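
To see which thread owns the monitor that blocked threads are queuing on, the "- locked", "- waiting to lock" and "- waiting on" lines of a dump can be cross-referenced. The following is only a minimal Python sketch, not a webMethods tool: the function names and the two-thread sample are invented for illustration, and it assumes HotSpot-style dump text like the one above. A monitor that a thread is "waiting on" is treated as released, because a thread inside Object.wait() gives up that monitor even though a lower frame still lists it as locked.

```python
import re
from collections import defaultdict

# Thread header line, e.g. "TriggerTask:4:..." prio=7 tid=...
# Typographic quotes are accepted too, since forum software often converts them.
THREAD_RE = re.compile(r'^\s*["“](?P<name>.+?)["”]\s')
MONITOR_RE = re.compile(
    r'^\s*-\s+(?P<kind>locked|waiting to lock|waiting on)\s+<(?P<addr>[0-9a-fA-Fx]+)>'
)
KIND_TO_KEY = {
    "locked": "held",
    "waiting to lock": "blocked_on",
    "waiting on": "waiting_on",
}


def parse_monitors(dump_text):
    """Return {thread name: {"held": set, "blocked_on": set, "waiting_on": set}}."""
    threads = {}
    current = None
    for line in dump_text.splitlines():
        header = THREAD_RE.match(line)
        if header:
            current = {"held": set(), "blocked_on": set(), "waiting_on": set()}
            threads[header.group("name")] = current
            continue
        monitor = MONITOR_RE.match(line)
        if monitor and current is not None:
            current[KIND_TO_KEY[monitor.group("kind")]].add(monitor.group("addr"))
    # A thread inside Object.wait() has released the monitor it waits on.
    for info in threads.values():
        info["held"] -= info["waiting_on"]
    return threads


def find_blockers(threads):
    """Map each blocked thread to the threads holding a monitor it wants."""
    owners = defaultdict(set)
    for name, info in threads.items():
        for addr in info["held"]:
            owners[addr].add(name)
    blockers = {}
    for name, info in threads.items():
        for addr in info["blocked_on"]:
            blockers.setdefault(name, set()).update(owners.get(addr, set()))
    return blockers


# Hypothetical two-thread sample (names and addresses are made up).
SAMPLE = '''"TaskA" prio=5 tid=1 nid=1 in Object.wait()
  - waiting on <00000000aaaa0001> (a java.lang.Object)
  - locked <00000000aaaa0001> (a java.lang.Object)
  - locked <00000000aaaa0002> (a java.lang.Object)

"TaskB" prio=5 tid=2 nid=2 waiting for monitor entry
  - waiting to lock <00000000aaaa0002> (a java.lang.Object)
'''

if __name__ == "__main__":
    print(find_blockers(parse_monitors(SAMPLE)))
```

Run against a full dump that also contains the blocked scheduler threads, this would show whether they are queuing on one of the monitors the TriggerTask still holds, such as the TSConsumer <87fffffee0d61cc0>, as opposed to <87fffffef3866098>, which the TriggerTask is itself waiting on.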

Please suggest why this is happening.

Thanks

Regards
Hari

Hi Hari,

can you provide a list of the fixes applied to your installation (ideally from UpdateManager)?

Please check the file system of your installation for the following directories under IntegrationServer:
DocumentStore
WmRepository4
XAStore
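
A quick way to look at these three directories is to report whether each one exists and how much data it holds. This is a rough Python sketch under the assumption that presence and size are what matters here; the function name and the idea of passing the IntegrationServer directory as an argument are illustrative only.

```python
from pathlib import Path

# Directory names taken from the list above.
STORE_DIRS = ("DocumentStore", "WmRepository4", "XAStore")


def store_sizes(is_home, names=STORE_DIRS):
    """Return {directory name: total size in bytes, or None if it is missing}."""
    report = {}
    for name in names:
        directory = Path(is_home) / name
        if not directory.is_dir():
            report[name] = None
            continue
        # Sum the sizes of all regular files below the directory.
        report[name] = sum(
            f.stat().st_size for f in directory.rglob("*") if f.is_file()
        )
    return report
```

Calling store_sizes() with the path of the IntegrationServer directory returns one entry per store, which makes an unusually large or missing store easy to spot.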

Are there any messages in server.log or error log related to document store draining?

This might be related to the size of the documents the trigger subscribes to.

Please take 3 thread dumps at 1-minute intervals before restarting the next time this issue occurs.
These can be visualized with a thread dump analyzer like Samurai (a log and thread dump analysis tool).
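
Taking the dumps can be scripted so that the interval is kept even under pressure. The sketch below is one possible approach in Python, assuming a JDK whose jstack tool is on the PATH and can attach to the IS process; the function name and file naming are made up. Where jstack is not available, sending `kill -3 <pid>` makes the JVM write the dump to its own standard output instead.

```python
import subprocess
import time
from pathlib import Path


def take_thread_dumps(pid, out_dir, count=3, interval_s=60,
                      run=subprocess.run, sleep=time.sleep):
    """Write `count` jstack dumps of process `pid` to `out_dir`, `interval_s` apart.

    `run` and `sleep` are parameters only so they can be replaced in a dry run.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(count):
        # `jstack <pid>` prints the dump on stdout; check=True raises on failure.
        result = run(["jstack", str(pid)],
                     capture_output=True, text=True, check=True)
        path = out_dir / f"threaddump_{pid}_{i + 1}.txt"
        path.write_text(result.stdout)
        written.append(path)
        if i < count - 1:
            sleep(interval_s)  # no pause needed after the last dump
    return written
```

For example, take_thread_dumps(12345, "dumps") would leave three numbered files in a dumps directory, where 12345 stands in for the process ID of the Integration Server JVM.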

Regards,
Holger

Dear Holger,

Thanks for your reply.

We are using:

Version: 8.2.1.0
Updates: IS_8.2_SP1_Core_Fix10
Build Number: 315

There are no errors related to document store draining.

But we found out later that the Extended Settings are not loaded at all on the IS Admin page.
Are these two issues related?

Hi Hari,

do you have SCG_Audit_Fix, SCG_DataDirect_Fix and SCG_LWQ_Fix applied to your IntegrationServer installation?

These fixes relate to auditing and might resolve some issues in the auditing context.

Is there an error message regarding the extended settings not being loaded?

The Extended settings list is empty by default but this does not mean that the settings have not been loaded.
Their visibility can be changed via the “Show and hide” link or by explicitly editing the list.

Please check under IntegrationServer/config if your server.cnf is intact.
There might be other config files affected too.

Regards,
Holger

Dear Holger,

Thanks again for your reply.

No, we don’t have any of SCG_Audit_Fix, SCG_DataDirect_Fix or SCG_LWQ_Fix applied.
There is no error regarding the extended settings.

We have some custom-defined Extended Settings values, for example:
watt.server.trigger.interruptRetryOnShutdown=true
watt.server.trigger.monitoringInterval=600
These were not visible after the restart, so we entered them again manually.
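
Whether such values actually made it into the configuration file can be checked directly. The following Python sketch assumes that server.cnf under IntegrationServer/config consists of plain key=value lines; the function names are hypothetical, and the expected values are simply the two settings quoted above.

```python
from pathlib import Path

# The two custom settings mentioned above and the values they should have.
EXPECTED = {
    "watt.server.trigger.interruptRetryOnShutdown": "true",
    "watt.server.trigger.monitoringInterval": "600",
}


def read_settings(cnf_path):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    text = Path(cnf_path).read_text(encoding="utf-8", errors="replace")
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_settings(cnf_path, expected=EXPECTED):
    """Return {key: (expected, actual)} for keys that are missing or differ.

    `actual` is None when the key is not present in the file at all.
    """
    actual = read_settings(cnf_path)
    return {
        key: (value, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    }
```

An empty result from check_settings("IntegrationServer/config/server.cnf") would mean both settings are present with the expected values; the path is relative to the installation directory and may need adjusting.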

Regards
Hari

Hi Hari,

it looks like you are hitting some sort of deadlock on the triggers' local document stores.

Please check if it is possible to apply the latest version of the mentioned fixes plus SCG_Core_Fix.

Additionally, you should check whether it is possible to update your IS to SP2.
This will be necessary anyway when preparing a migration to a more recent 9.x version.

Regards,
Holger