VMWare Experience

Per Advantage, the official support policy for webMethods on VMWare is:

Is your company using VMWare in production? If so, what is your experience with stability, performance, uptime, etc.? Are you using VMWare in any non-production environments?

Mark

I know many individuals who have routinely run their own version of 6.x on VMWare on their laptops… I don’t recall them having any issues with it. I think I came across a customer or two that had looked at doing this, but I do not recall if they actually moved forward with it.

wM tech support (particularly TS Managers), or your local webM SE, would probably be the best source of this kind of information.

-Dan

I have heard today that some customers report that VMWare may not always allocate the desired amount of CPU and memory resources to a virtual machine in high volume scenarios. This is apparently the reason WM takes the position stated above related to supporting production environments that use VMWare. Having done my share of performance-tuning and troubleshooting, I can actually understand this position.

So, is anyone using VMWare in high-volume production environments? If so, what is your experience?

Mark

We use VMWare for dev and test, but use direct hardware for UAT and live.

I’ve also seen what was almost certainly a VMWare bug on our test server where VMWare didn’t correctly allocate memory to the OS instance. Ultimately, we had to rebuild the VMWare instance which made the problem go away.

However, WM’s support policy for VMWare sounds a bit after-the-fact and is probably invalid. In WM’s own words, they only certified their software “to run on specific operating systems and JVMs, but not on particular hardware configurations.” But then they say VMWare is “categorized as part of the hardware infrastructure”. So they should not be concerned by hardware, since they didn’t previously put conditions on hardware. For instance, I don’t recall WM support policies that read something like this: “IS 6.1 is certified to run on Intel Socket 360 Pentium IIIS 1.26 GHz, SuperMicro Chipsets, blah blah…”

We’ve mentioned this to WM in an SR when they first informed us about this policy - we didn’t hear back from them.

WM’s VMware support statement only has a leg to stand on if the specific operating systems and JVMs that they support, in turn, have hardware support policies that exclude VMWare.

Sonam,

What do you mean by rebuild the VM? We are currently experiencing problems in our production system - on a near daily basis the java.exe slowly climbs to 99% and then we have to bounce the system. There seems to be no evidence that would prove this to be a problem due to volume in the system. It starts at about 30% and through the day it climbs to 99%, slowly increasing and never decreasing, until it hangs. IS is running on NT machines. Any ideas?

Bryan,

Couple of things to check. First off, within the IS service usage, see if you can establish which service is causing the trouble. Also, depending on the version of IS, check the JVM version. Starting from the command line and taking a thread dump would also be useful. There were some weird bugs on some older 1.3 JVMs, particularly the 1.3 from IBM they distributed for the longest time. It might be worth swapping that out if you get truly desperate.
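For anyone who wants to see which threads are burning CPU from inside the JVM, here is a minimal sketch. It assumes Java 8 or later, so it will not run as-is on the 1.3/1.4 JVMs that IS 6.x shipped with, and the class and method names (`BusyThreadsSketch`, `topThreadsByCpu`) are made up for illustration. It is not a webMethods API and does not replace the IS Service Usage screen.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of webMethods IS.
final class BusyThreadsSketch {

    // Returns up to `limit` lines, one per thread, ordered by CPU time (highest first).
    // Returns an empty list if the JVM does not support or has disabled CPU timing.
    static List<String> topThreadsByCpu(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return report;
        }

        // Collect {threadId, cpuNanos} pairs for threads that are still alive.
        List<long[]> samples = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id); // -1 if the thread has died
            if (cpuNanos >= 0) {
                samples.add(new long[] {id, cpuNanos});
            }
        }
        samples.sort((a, b) -> Long.compare(b[1], a[1]));

        for (long[] sample : samples) {
            if (report.size() >= limit) {
                break;
            }
            ThreadInfo info = mx.getThreadInfo(sample[0]);
            if (info == null) {
                continue; // thread ended between the two calls
            }
            report.add(info.getThreadName() + " cpuMs=" + (sample[1] / 1_000_000L));
        }
        return report;
    }

    public static void main(String[] args) {
        for (String line : topThreadsByCpu(10)) {
            System.out.println(line);
        }
    }
}
```

Running it a few minutes apart and comparing the figures should show whether one thread keeps accumulating CPU time, which is the kind of pattern that would match a process that only ever climbs.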

Tate

Bryan - your problem may be unrelated to VMWare - perhaps a wM adapter or custom code is causing the issue.

Do what Tate has suggested - check the ‘Service Usage’ screen and ensure no long-running threads exist and get a thread dump, so you can report it to wM.

In our case rebuilding the VMWare ‘test’ instance involved reinstalling the OS and copying over the IS instance from ‘UAT’ - this got test working again.

You might try Stack Trace http://tmitevski.users.mcs2.netarray.com/stacktrace.jsp to capture thread dumps when IS is running as a Windows service.
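To give an idea of what an in-process thread dump looks like, here is a minimal sketch using `Thread.getAllStackTraces()`, which exists from Java 5 onwards (so not on the older 1.3/1.4 JVMs). This is not the code of the Stack Trace JSP linked above, and the names `ThreadDumpSketch` and `dumpAllThreads` are hypothetical.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not the Stack Trace JSP and not a webMethods API.
final class ThreadDumpSketch {

    // Builds a plain-text dump of every live thread in the current JVM.
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" daemon=").append(thread.isDaemon())
               .append(" state=").append(thread.getState())
               .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Because the dump is produced by code running inside the JVM, it does not depend on having a console attached, which is the limitation when IS runs as a Windows service.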

Thank you all for the responses! I will bring up the stack trace process to the team; it sounds helpful. I believe an app called Wiley is being installed to dig into the JVM metrics.

We have also been running VMWare in our production environment for the past 4 years. We didn’t encounter any problems until last week. In our setup, IS 6.1 and Broker 6.1 are hosted on the same VMWare instance. The host OS is Linux: ES2.1

Last week the Integration Server process “disappeared” from the server (VMWare). There are no messages in any of the logs (server.log or OS level logs) to indicate what happened. The IS was functioning normally, and there was no abnormal load. There was also no “core dump” to be found on the server. Basically, there is no trace of what caused the IS process to be killed.

Do you know of any VMWare or Linux OS level bug that would cause the java process (IS) to be killed? Under what conditions would the IS process get killed without generating any messages in any of the logs?

To clarify things, what VMWare Product / Version / HostOS / GuestOS are people talking about here? It is quite possible that certain combinations of HostOS + GuestOS could cause problems with IS, especially if you’re running versions of GuestOS / HostOS not explicitly supported by VMWare (and by extension, versions of GuestOS not supported by IS/Broker).

In the hands of inexperienced sysadmins, I can see how VMWare would destabilize IS running on it, especially through transient problems (network, I/O, disk, # of processes, somebody running a virus scanner on your host Windows, whatnot). VMWare ESX would probably be much less prone to these problems, but I haven’t played with that at all.

But, that’s not to say I don’t like VMWare with IS/Broker. Since there are many combinations of Product/Version/HostOS/GuestOS, with even more coming (i.e. IS 6.5 SP1 supporting Solaris 10 on AMD64), I think wM is wise not to officially support VMWare. It’d be a support nightmare. But for people who know what they’re doing (like Sonam’s configuration), they’re a great tool for the sysadmins.

BTW, I think the best virtualization platform for supporting commercial applications (in particular, wM IS/Broker) would be Solaris 10’s “zones”. Each zone would be running the same OS as the host (or “global zone”), so technical support, whether in-house or wM, might have fewer issues regarding “compatibility”. As zones are pretty much a standard part of the OS, it’d be hard for wM to not support them.

Also, zones are not “full-virtualization”, in that they aren’t emulating hardware, but just provide the privilege separation required to keep the zones separate (which is why BSD’s “jail”, on which Solaris zones are based, is called such). That means there’s also much less of a performance hit from having to emulate network, disks, etc.

At least that’s the way things stand right now. When Intel’s Vanderpool and AMD’s Rev F (virtualization technologies) are supported by VMWare ESX, the performance gap might not be as significant. 'Til then, I would much prefer Solaris 10 (on SPARC or AMD64).