webMethods 8.2 package inconsistency and wrong status

Hello,
We are using webMethods Integration Server 8.2.1.0 (build 315) on a Solaris 10 platform for our client, and we are seeing a very strange behavior that is infrequent and not reproducible.

I’ve tried to find something relevant in the webMethods tech forum but did not succeed, so I am creating this post.

I hope this is the right forum for my message.

We have 4 Integration Servers (IS) distributed on 2 Solaris 10 servers. We have trouble with one of those 4 IS, on only one server (always the same IS; we have not noticed the trouble on the other IS deployed on this server).
Many packages are deployed on these IS (and some of them are deployed on the 2 IS at the same time).
Some of these packages are out-of-the-box webMethods packages and others are our own packages.

Sometimes (very rarely, no more than once a year) our client creates a trouble ticket because services are no longer available.
It was not the same package each time the trouble occurred (but each time it was one of our own packages).

We have checked the server.log and found no unusual entries.

We have tried disabling then enabling the package, and unloading then reloading it, but nothing worked.

We have also noticed that the package status shown in the GUI is not consistent with the real package status.

When the trouble appears, the only way we have found to get back to stable behavior is to stop and restart the IS.

Has anyone noticed such behavior and found a way to fix it on their Integration Server?

To be efficient the next time it happens, what are the best practices for finding the cause when we notice such weird behavior?

Thanks in advance for your answers.

M. Cheve

A few questions:

  1. Can you get the exact error the client system receives when they report that services are "not available anymore"?
  2. What do you mean by "package status on GUI are not consistent with real package status"?

Hi Manuel,

In addition to Tong’s questions, here are some more:

Is the affected package specific to one IS, or is it one which is spread across several ISes?
In the latter case, is it always failing on the same IS, or does this occur on any IS of the set?

Are all IS running the same set of Fixes?
Are all IS running the same version of the JVM?
Are the IS communicating with each other?
If so, in which way? (Shared Broker, Remote Invoke?)

Are the packages dependent on each other?
In that case there might be a load-order race condition causing the affected package to fail the reload or re-enable, due to circular or missing dependencies.
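
To illustrate the kind of dependency check meant here, below is a minimal Python sketch. It assumes the declared dependencies of each package have been copied by hand into a dictionary; the function names and the sample data are hypothetical and this is not a webMethods API. It only looks for circular or missing dependencies in that hand-made map.

```python
from typing import Dict, List, Optional, Set


def find_dependency_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one circular dependency chain (first name repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    state = {pkg: WHITE for pkg in deps}
    path: List[str] = []

    def visit(pkg: str) -> Optional[List[str]]:
        state[pkg] = GREY
        path.append(pkg)
        for dep in deps.get(pkg, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so we closed a loop
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[pkg] = BLACK
        return None

    for pkg in list(deps):
        if state[pkg] == WHITE:
            found = visit(pkg)
            if found:
                return found
    return None


def find_missing_dependencies(deps: Dict[str, List[str]],
                              installed: Set[str]) -> Dict[str, List[str]]:
    """Map each package to the dependencies that are not in the installed set."""
    missing = {}
    for pkg, required in deps.items():
        absent = [d for d in required if d not in installed]
        if absent:
            missing[pkg] = absent
    return missing


if __name__ == "__main__":
    # Hypothetical data: package names and dependencies are placeholders.
    declared = {"PACK_A": ["PACK_B"], "PACK_B": ["PACK_C"], "PACK_C": ["PACK_A"]}
    print(find_dependency_cycle(declared))
    print(find_missing_dependencies(declared, {"PACK_A", "PACK_B"}))
```

If the first function returns a chain, or the second returns a non-empty map, that would be worth looking at before the next reload of the affected package.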

Regards,
Holger

Thanks for your answer.
Here are my additions:

1- Indeed, we noticed the trouble through application errors. I’ve checked the tickets the client created, but there were no webMethods error logs, so we don’t have the exact error.

2- The package that deploys the unavailable services is shown as enabled and loaded, and the service browser shows all its services, whereas one or several of those services are not available.
On the other hand, we noticed something like a “status inconsistency”: we tried to disable a package PACK_XXX_1, which is required by another package PACK_XXX_2 (the one we had trouble with, which was shown as disabled), and we got a message such as “Cannot disable PACK_XXX_1. Enabled packages PACK_XXX_2 depend on PACK_XXX_1”

Hello Holger,
Here are my answers:

Is the affected package specific for one IS, or is it one which is spread across several ISes?

In the latter case, is it always failing on the same IS, or does this occur on any IS of the set?

No, it is a specific one, and each time it occurred (very rarely) it was on the same IS.

After thinking about it and checking the tickets our client created, each time the problem occurred it was also the same package (I wrote this incorrectly in my first post).

Are all IS running the same set of Fixes? Yes, I think so, but I will check it tomorrow on our client servers.

Are all IS running the same version of the JVM? Yes, I’m quite sure that all ISes are running on a Solaris 10 server with java version “1.6.0_31”. But I have to check it on the client server; there may be something like a minor release difference between our servers and theirs.

Are the IS communicating with each other? Generally speaking yes, but I’m not sure this is the case with the one we are talking about.

If so, in which way? (Shared Broker, Remote Invoke?)

Are the packages dependent on each other? Same answer: generally speaking yes, but I’m not sure this is the case with the one we’re talking about.

In that case there might be a load-order race condition causing the affected package to fail the reload or re-enable, due to circular or missing dependencies. Thanks for this observation, we will check it.

Still very confusing.
If there is no error on the IS side, how can you say the issue only happened on one instance and not on the others?

Sometimes a "service unavailable" error can be generated by a firewall, so the original error the client system receives is critical for understanding this issue.

Hi all,

I’ve taken some time to complete my previous answers.

Holger : Is the affected package specific for one IS, or is it one which is spread across several ISes?
In the latter case, is it always failing on the same IS, or does this occur on any IS of the set?

Me : No, it is a specific one, and each time it occurred (very rarely) it was on the same IS.
After thinking about it and checking the tickets our client created, each time the problem occurred it was also the same package (I wrote this incorrectly in my first post).

Holger : Are all IS running the same set of Fixes?
Me : Yes, I think so, but I will check it tomorrow on our client servers.
The four ISes are all deployed on the same webMethods Integration Server 8.2.1.0 build 315.

Holger : Are all IS running the same version of the jvm?
Me : Yes, I’m quite sure that all ISes are running on a Solaris 10 server with java version “1.6.0_31”. But I have to check it on the client server; there may be something like a minor release difference between our servers and theirs.
I’ve checked it, and the JVM used there is 1.6.0_24 (50.0), 64-bit, so there is a difference from the one we use on our dev and integration servers (“1.6.0_31”).
The four ISes all run with this same JVM.

Holger : Are the IS communicating with each other? If so, in which way? (Shared Broker, Remote Invoke?)
Are the Packages dependent on each other?
Me : Generally speaking yes, but I’m not sure this is the case with the one we are talking about.
Some of our packages publish documents to a Broker and others process those documents.
But this is not the case for the package involved in my problem.
This package depends on other packages (on the same IS), but when this package stopped working, the other packages (with dependencies) continued to work.
In our case, this package invokes some Java code (ServicesTec class).

Holger : In that case there might be a load-order race condition causing the affected package to fail the reload or re-enable, due to circular or missing dependencies.
Me : Thanks for this observation, we will check it. Indeed, this package depends on 3 other packages; I will ask our client to be careful with these 3 packages.

Thanks a lot for your efforts to understand our problem!

Regards
Manuel

Tong : If there is no error on the IS side, how can you say the issue only happened on one instance and not on the others?
Me : I’ve checked all the information we have about the trouble, and the only errors we can see on the IS side are some “Unknown service” log entries. We also noticed that no services are deployed in this package (whereas there should be some).
As all the services of this package are undeployed (we don’t understand why), our application does not work and we see a lot of application errors.
But I repeat, it has happened twice in 2 years. The first time, we decided to stop and start our IS and everything worked fine afterwards.

Tong : Sometimes, service unavailable can be generated by firewall. So the original error the client system receive is critical for understanding this issue.
Me : That could be useful information; I will check with our client whether there was something wrong with the network infrastructure (firewalls, proxies, …)

When you see the Unknown service error, do you see something like:
Unloading **** package
Loading **** package
in the server log around that time period?
The only cause I can think of for a transient Unknown service error is the package being reloaded (by code or manually).
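
If it helps for next time, here is a rough Python sketch of that check. It assumes the server log is plain text and that the three message fragments quoted above ("Unknown service", "Unloading", "Loading") appear literally in it; the real wording and layout of your log may differ, so adjust the markers. The function name and the line-count window are my own choices and nothing here is a webMethods tool.

```python
from typing import Iterable, List, Tuple

# Assumed message fragments; adjust them to what your server.log really contains.
ERROR_MARKER = "Unknown service"
RELOAD_MARKERS = ("Unloading", "Loading")


def correlate_reloads(lines: Iterable[str],
                      window: int = 50) -> List[Tuple[int, str, List[str]]]:
    """For each error line, list package load/unload lines within `window` lines of it.

    Returns tuples of (1-based line number, error line, nearby reload lines).
    """
    all_lines = [line.rstrip("\n") for line in lines]
    reload_idx = [i for i, text in enumerate(all_lines)
                  if any(marker in text for marker in RELOAD_MARKERS)]
    results = []
    for i, text in enumerate(all_lines):
        if ERROR_MARKER in text:
            nearby = [all_lines[j] for j in reload_idx if abs(j - i) <= window]
            results.append((i + 1, text, nearby))
    return results


if __name__ == "__main__":
    # Made-up sample lines, only to show the output shape.
    sample = [
        "some unrelated entry",
        "Unloading **** package",
        "Loading **** package",
        "Unknown service (hypothetical service name)",
    ]
    for number, error, nearby in correlate_reloads(sample, window=5):
        print(number, error)
        for entry in nearby:
            print("    near:", entry)
```

To run it on a real file, pass the open file object instead of the sample list. An error line with an empty "near" list would suggest the package was not reloaded around that time, which would point away from my reload theory.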

You can rule out the firewall case: if you see the Unknown service error on your IS, that means the client has already reached your IS.

Hi Manuel,

You can check the ACL settings of the service which is missing.

Sometimes it has been noticed that services aren’t visible to a user who doesn’t have READ/WRITE/Execute ACL permission for that particular service.

Regards,
Kuldeep
