Timeout on v8.2 WSDL retrieval causes WS consumer call to fail

The first symptom is that when you call an 8.2 web service consumer service you get this sort of message:

Then if you hit the WSDL address directly, you find that WSDL generation takes quite a long time (e.g. perhaps a minute…).

This problem seems to have started only recently. Not sure what the cause is.

Generating a thread dump while the WSDL is waiting to return yields:

Trying to sort out whether there’s some low-level disk I/O issue on that box, but has anyone got any ideas?

So we figured the problem out.
Warning: this may come and bite others if they have large installations (e.g. lots of packages/big classpath etc).

The problem was down to Xerces’s behaviour of checking multiple locations before falling back to its default parser configuration.

We needed to add this to the system properties (via the watt.config.systemProperties extended setting):

org.apache.xerces.xni.parser.XMLParserConfiguration=org.apache.xerces.parsers.XIncludeAwareParserConfiguration

See http://xerces.apache.org/xerces2-j/faq-xni.html#faq-2 for the technical details and the steps it goes through to work out which default parser to use.

So if you put in the property you get the same result as step 4 without going through steps 2 & 3, which were the slow bit. I think the problem was that it was accessing LOTS of files/locations rather than just one slow one.
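
The lookup order described in that FAQ can be sketched roughly like this. It is an illustration using plain JDK calls, not Xerces’s actual code: the class and method names (ParserConfigLookup, resolve) are made up, and the file and resource locations in steps 2 and 3 are my reading of the FAQ, so treat them as assumptions.

```java
import java.io.File;
import java.net.URL;

// Hypothetical illustration of the lookup order; NOT Xerces's own implementation.
final class ParserConfigLookup {
    static final String KEY =
            "org.apache.xerces.xni.parser.XMLParserConfiguration";
    static final String DEFAULT =
            "org.apache.xerces.parsers.XIncludeAwareParserConfiguration";

    // Returns a short description of which step decided the configuration.
    static String resolve(ClassLoader loader) {
        // Step 1: system property. Cheap, no disk access.
        String fromProperty = System.getProperty(KEY);
        if (fromProperty != null) {
            return "step 1: " + fromProperty;
        }
        // Step 2: properties file under the JRE's lib directory (assumed
        // location). A real implementation would read the key from the file;
        // this sketch only checks that the file exists.
        File propsFile = new File(System.getProperty("java.home"),
                "lib" + File.separator + "xerces.properties");
        if (propsFile.isFile()) {
            return "step 2: " + propsFile.getPath();
        }
        // Step 3: services-style resource lookup through the classloader.
        // With many jars or package classloaders this is the costly part.
        URL service = loader.getResource("META-INF/services/" + KEY);
        if (service != null) {
            return "step 3: " + service;
        }
        // Step 4: nothing found, fall back to the built-in default.
        return "step 4: " + DEFAULT;
    }
}
```

With the property set, resolve returns at step 1 and never touches the disk or the classpath, which matches what we saw once the setting was in place.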

Now it might be that the classloader was looking through all the packages, or it might have been some critical point with patches/jars and the like, or maybe some interaction with Tomcat in there… Point is that this eliminates the problem altogether anyhow.

So without this property set you will get a slowdown in WSDL generation over time, up until you reach a point where it becomes noticeable or impacts web service consumer services (with 8.2 mode on) as well.

Hope that helps. It was a pretty rough problem to have: an integration layer that couldn’t do web services.

Hmm. Seems odd that 2 relatively simple steps would add much time. How much time are we really talking about here? Is it under load when this occurs? (Threads backing up waiting their turn to read a file?)

Are you seeing this issue with web services on IS 8.2 with SP2 in place? I really wonder why having lots of packages or a big classpath would cause a timeout.

It sounds like it might be related to a socket/network timeout?

Rob -
Time? Well, we’re talking a minimum of 4-5 seconds, and up to around a minute when it was under some load.

Add in the setting and it went to instant. We thought initially it was our shared disk on our old box, but then it “infected” our VM server version as well.

RMG:
The problem was repeated (or very large) classpath searching on disk, which then caused the timeout as the web service connector tried to retrieve the WSDL via URL.
We didn’t have SP2, but we did have various core fixes. I think it was also searching through each package classloader (we have a large number of packages).

We got a debug classloader from Software AG that printed out what it was doing, and then the cause became obvious.

Anyhow, the problem is resolved - it was confusing and didn’t seem to make sense at the start, but we went through the process and got it solved.

If you’re on an 8.2 install I’d be keeping an eye on how long WSDL generation is taking after any patch or periodically as you add new packages etc. You don’t want this cropping up in PROD.
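
One way to keep an eye on it is to time a plain HTTP fetch of the WSDL every so often. Below is a minimal sketch using only the JDK; WsdlTimer and its methods are hypothetical names, the timeout is whatever you choose, and it assumes the WSDL URL can be fetched without authentication (add credentials if your server requires them).

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.concurrent.Callable;

// Hypothetical helper for timing a WSDL download; not part of any product.
final class WsdlTimer {

    // Runs the task once and returns the elapsed wall-clock time in ms.
    static long timeMillis(Callable<?> task) throws Exception {
        long start = System.nanoTime();
        task.call();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    // Downloads the full response body and returns its size in bytes.
    static long fetch(String url, int timeoutMillis) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        long total = 0;
        try (InputStream in = conn.getInputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n; // count bytes so the whole WSDL is actually read
            }
        } finally {
            conn.disconnect();
        }
        return total;
    }
}
```

Something like `WsdlTimer.timeMillis(() -> WsdlTimer.fetch(wsdlUrl, 120_000))`, run after each patch or new package, gives you a number to compare over time.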

Thanks for elaborating and cautioning users on this issue.

Can you tell us what process change you made?

Do you think 8.2 SP2 takes care of the issue? What was SAG’s take on getting rid of this timeout issue?

TIA,
RMG