Recently, we have had some issues with the Integration Server Job Scheduler.
We are using webMethods 10.3 with Integration Server Core Fix #4.
We have about 40 scheduled jobs in our Integration Server and, for a reason we have not found yet, a scheduled job randomly gets “Stopped” with the following message:
([is-host]:[is-port]) Unknown Service: [package].[folder].[flow-service]. Scheduled task will not run
However, the service referenced in “[package].[folder].[flow-service]” does exist, and I simply have to re-enable the job to get it running again, without any issue.
Is there anything I should do to fix this issue? It happens at random times. Today I had to start my shift earlier because of an occurrence that my co-worker reported to me.
As a quick fix, you have to restart the job.
The next time the issue occurs, verify that the JDBC pool for the IS core alias is serving enough connections and that the resource → thread allocation is sufficient.
This does not sound like a threading/pooling issue to me, as that should result in different error messages.
More likely, the scheduler tried to run a service that was not available at that moment, e.g. because of deployment activities or because the package containing the service was reloading just when the service was due to run.
Thank you for your advice. After reading Holger_von_Thomsem's answer, I remembered that this type of issue used to happen some time after we deployed one of our flow service projects from QA to PRD.
And I think this makes sense, because yesterday we deployed some packages to PRD, but the issue was only detected a few hours ago today - the job had been stopped since yesterday afternoon.
I will let you guys know if I figure out anything else.
Hi @Renan_Lopes1 ,
As the exception “Unknown Service: [package].[folder].[flow-service]. Scheduled task will not run” says, the service was not found at the moment the scheduler task tried to invoke it. It could be that the package was being reloaded at that time. Checking the server log for package-loading messages around that time might help.
Another reason could be that some other IS node is pointing to the same database (the one specified in the ISInternal pool) and that particular task’s target is “Any Server”. You may want to double-check that no other nodes are using the same database unless that is intentional.
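The server-log check suggested above could be sketched roughly as follows. This is a minimal sketch, not a definitive tool: it assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, and the keywords used to spot package activity (`load`, `install`) are hypothetical guesses rather than official message IDs, so adjust both to what your server.log actually contains. The function name `package_events_near_failures` is made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed line prefix: "YYYY-MM-DD HH:MM:SS ..." (adjust to your server.log format).
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
FAILURE_MARKER = "unknown service"   # text of the scheduler error, lower-cased
PACKAGE_HINTS = ("load", "install")  # hypothetical keywords; "load" also matches "reload"


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")


def package_events_near_failures(lines, package, window_seconds=300):
    """Pair each 'Unknown Service' line with package-related lines close in time."""
    window = timedelta(seconds=window_seconds)
    failures, events = [], []
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is None:
            continue  # skip lines without a timestamp, e.g. stack traces
        lowered = line.lower()
        if FAILURE_MARKER in lowered:
            failures.append((stamp, line.rstrip()))
        elif package.lower() in lowered and any(h in lowered for h in PACKAGE_HINTS):
            events.append((stamp, line.rstrip()))
    return [
        (failure, [text for when, text in events if abs(when - stamp) <= window])
        for stamp, failure in failures
    ]
```

Feeding it the lines of the server log and the package name returns, for each "Unknown Service" entry, the package-related lines logged within five minutes of it. An empty list next to a failure would suggest looking at the other explanation, a second node sharing the same database.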
There is an option in Deployer to suspend scheduled tasks during deployments and reactivate them once the deployment is finished.
In this case the scheduler won't attempt to run while the deployment is ongoing.
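For maintenance that does not go through Deployer, the same suspend-then-reactivate pattern can be scripted. The following is only a sketch of the pattern, not Deployer's implementation: `tasks_suspended`, `suspend` and `resume` are hypothetical names, and the two callables are assumed to wrap whatever mechanism you use to suspend and resume a scheduled task by ID on your server.

```python
from contextlib import contextmanager


@contextmanager
def tasks_suspended(task_ids, suspend, resume):
    """Suspend the given scheduled tasks, run the body, then resume them.

    `suspend` and `resume` are hypothetical callables that take one task ID;
    they stand in for the real scheduler calls of your environment.
    """
    suspended = []
    try:
        for task_id in task_ids:
            suspend(task_id)
            suspended.append(task_id)
        yield suspended
    finally:
        # Resume only what was actually suspended, even if the body raised.
        for task_id in suspended:
            resume(task_id)


if __name__ == "__main__":
    # Stand-in callables that only print; replace them with real scheduler calls.
    with tasks_suspended(["task-1", "task-2"],
                         suspend=lambda t: print("suspend", t),
                         resume=lambda t: print("resume", t)):
        print("deploying packages...")
```

Resuming in a `finally` block means a failed deployment does not leave tasks suspended indefinitely. Whether tasks should come back automatically after a failed deployment is a judgment call for your environment.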
This is a good reminder of why it is important, especially in production systems, to gracefully suspend schedulers, listeners, and triggers before doing maintenance.
Otherwise, when unloading packages, you risk that the system loses in-flight transactions and responds to new transactions with unexpected errors.
Also, even if you have retry and queueing implemented properly for such cases, many people forget about the side effects of timeouts (e.g. on the sending system side, so out of your control) that can, in the worst case, cause not only lost transactions but also duplicate executions (you complete the original feed and the sending system just tries again because it timed out). Such timeout