We have a strange issue that happens sporadically. BPM works fine for about 90% of the instances. For a few instances, however, execution just gets stuck while moving from one step to the next (it is not a data issue). The process status shows STARTED, though there is no activity in that particular step. We increased the debug level and captured the following in the log. Could anybody please assist if you know what these messages mean?
We are facing the same issue (IS 6.5.2). We are trying to run two non-clustered IS instances with their PRTs pointing to the same DB, and in about 10% of cases we see the same issue as described above. If we switch off one of the IS instances, the issue disappears.
I’ve seen this issue before, but in 6.1. WM could never give me a root cause for it. Query the wmprocessstep table and see whether the “hung” step attempted to execute on an IS node that none of the other steps executed on. The only reliable solution we ever came up with was to enable “optimize locally”, though this has its downsides as well…
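In case it helps, here is a rough Python sketch of that check. It assumes you have already pulled the step rows out of wmprocessstep as (instance id, step id, server id) tuples; those field names, and the function name `find_off_node_steps`, are my own placeholders, so check them against the actual columns in your schema. It only flags steps that ran on a different node from most of the other steps in the same instance. It does not explain why they hung.

```python
from collections import Counter, defaultdict


def find_off_node_steps(step_rows):
    """Group step rows by process instance and flag steps that ran on a
    different IS node than most of the instance's other steps.

    step_rows: iterable of (instance_id, step_id, server_id) tuples.
    The tuple layout is an assumption, not the real table definition.
    Returns {instance_id: [(step_id, server_id), ...]} for instances
    whose steps were spread over more than one node.
    """
    by_instance = defaultdict(list)
    for instance_id, step_id, server_id in step_rows:
        by_instance[instance_id].append((step_id, server_id))

    suspects = {}
    for instance_id, steps in by_instance.items():
        counts = Counter(server for _, server in steps)
        if len(counts) < 2:
            # Every step ran on the same node, nothing to flag.
            continue
        # On a tie, the node seen first is treated as the "usual" one.
        usual_server = counts.most_common(1)[0][0]
        suspects[instance_id] = [
            (step, server) for step, server in steps if server != usual_server
        ]
    return suspects
```

If the stuck instances show up in the result with the hung step as the odd one out, that would match what I saw on 6.1.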
I’ve seen the suggestions by arulchristhuraj and jlammers be quite effective at a client where PRT was used extensively. The biggest stability gain came from keeping the DB records trimmed to about two weeks’ worth of data. Optimize locally was also effective, and I wonder about the real value of having the steps of a single process bounce around multiple IS instances anyway.
The problem is that we cannot use the optimize locally option, because we need to be able to resubmit a process in the middle. In addition, a process might run for a long time, and we would not be able to recover it in case of server failure.
I’m currently experimenting a bit with DB performance, but I assume this will only reduce the likelihood of the issue, not prevent it. Nevertheless, I agree that if the likelihood is low enough, it might be an acceptable workaround for us.