Entire timeout error 00740074

Hi,

Our system is experiencing intermittent timeout errors on a particular web service (SOAP XML).

This service request originates from a Natural program and calls out through EntireX to the service running on a Solaris system.

95% of the time it works with no issues but occasionally it times out. Between the mainframe and Solaris systems lie a Datapower appliance, SAML token service, TIBCO layer and the obligatory networking infrastructure. None of these appear to be showing any timeout

The timeout settings in our EntireX broker on the mainframe are at least 180 seconds up to 15 minutes.

When this particular call fails it always returns the timeout error exactly 55 seconds after the request. None of our timeout settings on the mainframe Broker or midrange RPC server are set to 55 seconds so I’m at a loss as to where this request is timing out.

We have many other web services running through the same broker and RPC server with no issues.

I can only assume that the issue is not with the mainframe or RPC server as it works fine for so many other web services which run at much higher throughput and frequency.

Any thoughts?

Thanks,
Chris

Here is the RPC server config (XXX’d out the sensitive stuff):

Example server configuration

parameter for xml stream parser

entirex.sdk.xml.runtime.xmlparserfactory=com.ctc.wstx.stax.WstxInputFactory

xmlruntime configuration file

entirex.sdk.xml.runtime.configurationfile=entirex.xmlrpcserver.configuration.xml

Basic properties

entirex.server.brokerid=XXXXXXXXXX
entirex.server.serveraddress=RPC/XXXXXXX/CALLNAT
entirex.server.userid=XXXXXXX

Workers and attach server

entirex.server.fixedservers=no
entirex.server.minservers=1
entirex.server.maxservers=5
#entirex.server.restartcycles=15

modifying restartcycles to 1440 – attempt a connection every minute for # 1,440 minutes (24 hours) – if the Mainframe is down for more than 24 hours, # we have bigger problems!

entirex.server.restartcycles=1440

Security and encryption

entirex.server.password=
entirex.server.security=
entirex.server.encryptionlevel=0

Location Transparency

entirex.server.logicalservice=
entirex.server.logicalbrokerid=
entirex.server.logicalsetname=DefaultSet
entirex.location.transparency.config=
entirex.location.transparency.ini=

Internationalization

LOCAL added by IP Australia

entirex.server.codepage=ISO8859-1

Data compression

entirex.server.compresslevel=0

Timeouts

entirex.server.waitattach=600S
entirex.server.waitserver=300S
entirex.timeout=0

Tracing

entirex.server.verbose=y
entirex.server.logfile=/data/logs/EntireX/trace.log
entirex.trace=3

Monitoring

entirex.server.monitorport=

#Dynamic Deployment
entirex.server.allowdevelopment=true

And the mainframe broker attributes:

***********************************************************************    
** **                                                                      
** ETB Attribute File **                                                   
** **                                                                      
***********************************************************************    
** ENTIRE Broker attributes *******************************************    

DEFAULTS = BROKER
BROKER-ID = XXXXXX
* TRANSPORT = TCP * Starts TCP communicator
* TRANSPORT = TCP-SSL * Starts TCP and SSL communicator
TRANSPORT = TCP-NET * Starts TCP and Entire Network comm.
* TRANSPORT = TCP-SSL-NET * Starts TCP, SSL and Entire Network comm.
* TRANSPORT = SSL * Starts only SSL communicator
* TRANSPORT = NET * Starts only Entire Network communicator
ACCOUNTING = NO
AUTOLOGON = YES * LOGON required ?
SECURITY = NO * running with SECURITY ?
SERVICE-UPDATES = YES * reread ATTRIBUTE file ?
NUM-WORKER = 6
TRACE-LEVEL = 0
* CMDLOG = YES * command logging on ?
CLIENT-NONACT = 15M
DYNAMIC-MEMORY-MANAGEMENT = YES
MAX-MESSAGE-LENGTH = 10000000
CONV-DEFAULT = UNLIM * CONVERSATIONs per SERVICE
LONG-BUFFER-DEFAULT = UNLIM * LONG BUFFERs per SERVICE
SHORT-BUFFER-DEFAULT = UNLIM * SHORT BUFFERs per SERVICE
NUM-UOW = 100 * maximum umber of UOW’s
* TRAP-ERROR = 0088
* TRMODE = WRAP
* TRBUFNUM = 5
PSTORE = HOT
PSTORE-REPORT = NO
PSTORE-TYPE = ADABAS
PSTORE-VERSION = 3
UWTIME = 1D
UWSTATP = 1
WAIT-FOR-ACTIVE-PSTORE = YES
* ------------- some example TCP parameters ----------------------
DEFAULTS = TCP
CONNECTION-NONACT = 15M
* HOST = IP-address
PORT = XXXXX
RESTART = YES
RETRY-LIMIT = 20
RETRY-TIME = 60S
REUSE-ADDRESS = YES
* STACK-NAME = stack-name
* ------------- ADABAS Defaults -----------------
DEFAULTS = ADABAS
BLKSIZE = 4000
DBID = XXX
FNR = XX
FORCE-COLD = NO
MAXSCAN = 3000
OPENRQ = N
SVC = 249
* ------------- some example NET parameters ----------------------
DEFAULTS = NET
ADASVC = 249
EXTENDED-ACB-SUPPORT = YES
FORCE = YES
IUBL = 8192000
LOCAL = NO
NABS = 200000
NCQE = 100
NODE = 555
TIME = 180

* ------------- ICU Code Page defaults for IP Australia -------            

DEFAULTS = CODEPAGE
* Broker Locale String defaults *
* DEFAULT_ASCII = ISO8859-1
* DEFAULT_EBCDIC_IBM = ibm-1047
DEFAULT_EBCDIC_IBM = ibm-1047_P100-1995
* ------------- NATURAL RPC services --------------------------
DEFAULTS = SERVICE
CONV-NONACT = 15M
SERVER-NONACT = 15M
NOTIFY-EOC = NO
* TRANSLATION = SAGTCHA
CONVERSION = (SAGTCHA,TRACE=0,OPTION=STOP)
* TRANSLATION = USRTCHA * user specific translation
CLASS = RPC, SERVER = XXXXXXXX, SERVICE = CALLNAT
* ------------- XML RPC services --------------------------
DEFAULTS = SERVICE
CONV-NONACT = 15M
SERVER-NONACT = 15M
NOTIFY-EOC = NO
* TRANSLATION = SAGTCHA
CONVERSION = (SAGTCHA,TRACE=0,OPTION=STOP)
* TRANSLATION = USRTCHA * user specific
CLASS = RPC, SERVER = XXXXXXX, SERVICE = CALLNAT
DEFERRED = YES
* ------------------------------------------------------- END -

I should also add that bouncing the RPC server does seem to stop the errors occurring for a while. We bounced it about 2 weeks ago now and haven’t had an error since. Prior to that, bouncing the RPC server seemed to fix it but eventually the errors came back and increased in frequency until we bounced it again.

Hi Chris,
it appears that the timeout is caused by the RPC-specific Natural profile parameter TIMEOUT. This parameter specifies the time the client wants to wait for an answer for its request.
Please increase TIMEOUT accordingly.
Best regards, Wolfgang

Hi Wolfgang,

Thanks for your prompt and helpful reply.

That certainly appears to be why it’s timing out at 55 seconds.

Now we need to work out why it’s timing out at all as this service certainly shouldn’t be taking that long. When things are going well, it replies in less than 1 second so the 55 second limit should be ample.

I suspect that the real issue may be further downstream. The intermittent nature of it makes me think it is a performance issue of some sort.

We’re now looking to see if the there are enough threads available in the jBoss connection thread pool for this web service.

Regards,
Chris

The newest version of EntireX has an Application Monitoring functionality that could help you see where in the request/response process you are finding the most delay causing your timeout. Eric Wood gave a webinar on this, and you can watch a repeat of it by going to softwareag.com/us and then clicking on Customers->User Groups and then the Webinars tab, and finally clicking on the link to ‘View recordings from past webinars’. This will take you to a YouTube channel where you can find the “Application Monitoring with EntireX” webinar.

Hi Brian,

Thanks for that. That looks like a very useful tool.

The presentation only looks at transactions which originate externally calling into the mainframe. I wonder if it can provide any monitoring for calls which originate within a mainframe Natural application calling to external systems (which is the service we’ve been having trouble with).

Presumably it can’t give database monitoring for such calls but even some fairly generic round trip performance stats would still be useful.

It also mentions that it only monitors syncronous calls. We have a couple of pseudo asyncronous calls (no response payload but still receives a HTTP 200 response) originating from the mainframe using Reliable RPC. I wonder if this monitoring tool would be of any use for those.

Regards,
Chris

Chris,

Actually, my understanding is this will work with any RPC Server process, not just Natural RPC Servers. The only thing was db time could only be counted for Adabas, so other db time would be counted as part of the RPC program runtime.

You define application name where you name specific RPC, server, service definitions in the Broker attribute file, so say you have XML RPC Servers, you can define the app name there that will be used by the application monitoring process and will be able to collect and display metrics related just to that XML RPC Server.

-Brian

Chris,

I neglected to answer the second part about asynchronous calls via reliable RPC. We have those as well. Since there is no expected response, my guess is you really do lose any ability to do monitoring with those. I could be wrong, but I would bet what you say is true.

Maybe a future version of this will be able to put it’s hooks into one-way traffic, though it may be more appropriate to pursue that in the SOA layer using BAM.

-Brian

Brian, many thanks for posting this information.

I suggest to take a look at the documentation:
http://techcommunity.softwareag.com/ecosystem/documentation/webmethods/wmsuites/wmsuite9-9/EntireX/9-9_EntireX/appMon/overview.htm
especially this part:
http://techcommunity.softwareag.com/ecosystem/documentation/webmethods/wmsuites/wmsuite9-9/EntireX/9-9_EntireX/appMon/components.htm#components

EntireX 9.9 has a nice enhancement to the Data Collector, you can write your own processing logic for the events in Java. So for example you can store them in a database or you can send alerts for certain events.
See http://techcommunity.softwareag.com/ecosystem/documentation/webmethods/wmsuites/wmsuite9-9/EntireX/9-9_EntireX/relnotes/v99.htm#v99_other_appMon

To use Application Monitoring there is no need to upgrade your whole environment to 9.9 or 9.7 (although that’s probably a good idea). EntireX components support this since version 9.0.
So it’s sufficient just to install the version 9.9 Data Collector (select Core only).

We’ve found the solution to this problem.

The maxservers parameter in the EntireX server properties file was too low.

It was set to 5. The default is 32.

The following error was in the RPC server log at the same time our request was sent to the web service:

Unfortunately, the error message in the mainframe was a timeout. It would be more useful if the error message stated the real reason why the call failed. The request simply disappeared into the ether.

We’ve now set maxservers to 50 and the timeouts have stopped.

We’ve recently had a substantial increase in web service calls originating from the mainframe due to some new processes being brought online and this is why our RPC server count was exceeding the maximum available.

Thanks all for your advice on our issue.

Regards,
Chris

The 0074 0074 timeout error message is actually the “real” error message. The only thing the Broker knows is that a client does a send and requests a reply specifying the maximum wait time. If this wait time expires and no reply from a server arrived in the Broker the Broker will send back the 0074 0074 to the client. There can be several reasons why no reply arrived in the Broker.

It is not an error if the maximum number of server replicates is reached. The Broker queues outstanding client requests until a free server is available.

We have a mix of synchronous calls and pseudo asynchronous calls which use an ADABAS based persistent store managed by the broker.

The pseudo asynchronous calls do queue up (up to 24 hours) if there’s a problem. The synchronous calls may queue up but only within the constraints of the time out.

We’re having more problems as of today. The XML RPC server shows the settings I initially posted in this thread but it has stopped behaving like it can replicate server instances. Instead it is trying to process everything as a single queue.

When our persistent store calls start they hog a server instance in the XML RPC server so any other synchronous calls end up timing out again. This has happened before and we just bounced the XML RPC server and it was all good again.

Then even that wouldn’t work and we’d have to change the server properties file and then bounce to get it to behave. Today even that isn’t working. We’ve bounced the broker, Natural RPC server and XML RPC server and our synchronous and pseudo asynchronous calls will not coexist.

The only recent changes are an ADABAS upgrade we did a couple of weeks ago (from 8.2.4 to 8.2.6) and the midrange guys say they upgraded Java to 1.6.0_111.

I’m stumped. Need to fix this by the morning.

Please help.

Regards,
Chris

It’s difficult to make an educated guess without any additional data (logs, version numbers,…). The best thing is to contact support and provide all the data you have. I assume you have a rather old version of the XML RPC Server because newer versions require Java 7 or even Java 8. What version are you using and do you have the latest fix?

You should check 1) if you have defined a SERVER-LIMIT in the Broker attribute file and 2) if there is another Server active which is using the same server name.

Hi Rolfe,

Here are the XML RPC Server properties contents and broker attribute file (I’ve sanitised IP addresses and server names for security):

2016-06-15 16:25:21.730 Start of EntireX XML RPC Server, Version: 8.1.3.00, Date: 19 May 2011, EntireX XML RPC Server 8.1.3.00 SunOS 5.10 sparcv9.
Using property file /opt/softwareag/EntireX/config/entirex.xmlrpcserver.properties
Using the following properties:
entirex.location.transparency.config:
entirex.location.transparency.ini:
entirex.sdk.default.trace.propertiesfile: system properties
entirex.sdk.xml.runtime.configurationfile: /opt/softwareag/EntireX/config/entirex.xmlrpcserver.configuration.xml
entirex.sdk.xml.runtime.xmlparserfactory: com.ctc.wstx.stax.WstxInputFactory
entirex.server.allowdevelopment: true
entirex.server.brokerid: xxx:99999
entirex.server.codepage: ISO8859-1
entirex.server.compresslevel: 0
entirex.server.customclass:
entirex.server.encryption:
entirex.server.encryptionlevel: 0
entirex.server.environment:
entirex.server.fixedservers: no
entirex.server.logfile: /data/logs/EntireX/trace.log
entirex.server.logicalbrokerid:
entirex.server.logicalservice:
entirex.server.logicalsetname: DefaultSet
entirex.server.maxservers: 50
entirex.server.minservers: 1
entirex.server.monitorport: 0
entirex.server.monitorremote: no
entirex.server.name: EntireX XML RPC Server, Version: 8.1.3.00, Date: 19 May 2011
entirex.server.password: ********
entirex.server.passwordencrypt: yes
entirex.server.properties: /opt/softwareag/EntireX/config/entirex.xmlrpcserver.properties
entirex.server.restartcycles: 1440
entirex.server.security:
entirex.server.serveraddress: RPC/XXX/CALLNAT
entirex.server.serverlog:
entirex.server.usecodepage: no
entirex.server.userid: XXX
entirex.server.verbose: y
entirex.server.waitattach: 600S
entirex.server.waitserver: 300S
entirex.timeout: 0
entirex.trace: 3

2016-06-15 16:25:21.738/main Trace started:
2016-06-15 16:25:21.738/main EntireX XML RPC Server, Version: 8.1.3.00, Date: 19 May 2011
2016-06-15 16:25:21.738/main Java version=1.6.0_111
2016-06-15 16:25:21.738/main Java vendor=Sun Microsystems Inc.
2016-06-15 16:25:21.738/main Java class version=50.0
2016-06-15 16:25:21.738/main OS name=SunOS
2016-06-15 16:25:21.738/main OS version=5.10
2016-06-15 16:25:21.738/main OS architecture=sparcv9
2016-06-15 16:25:21.738/main Default encoding=UTF8
2016-06-15 16:25:21.738/main Country=AU Language=en
2016-06-15 16:25:21.738/main Language code=en
2016-06-15 16:25:21.738/main Country code=null
2016-06-15 16:25:21.738/main Timezone=Australia/NSW
2016-06-15 16:25:21.738/main File encoding=UTF-8
2016-06-15 16:25:21.738/main Do not start ServerMonitorDispatcher (port:0).
2016-06-15 16:25:21.753/Worker-1 Start of Worker-1
2016-06-15 16:25:21.783/Worker-1 BrokerId xxx:99999?poolsize=0 resolved to xxx:99999 (no socketpool)
2016-06-15 16:25:21.790/Worker-1 Running on an ASCII machine.
2016-06-15 16:25:21.790/Worker-1 Using API version 3 for Broker xxx:99999
2016-06-15 16:25:21.790/Worker-1 KERNELVERS(BID=mvs:18027,UID=PHOENIX,PU=0000000000000000000000000000)



** ETB Attribute File **



** ENTIRE Broker attributes *******************************************
DEFAULTS = BROKER
BROKER-ID = ETB999

  • TRANSPORT = TCP * Starts TCP communicator
  • TRANSPORT = TCP-SSL * Starts TCP and SSL communicator
    TRANSPORT = TCP-NET * Starts TCP and Entire Network comm.
  • TRANSPORT = TCP-SSL-NET * Starts TCP, SSL and Entire Network comm.
  • TRANSPORT = SSL * Starts only SSL communicator
  • TRANSPORT = NET * Starts only Entire Network communicator
    ACCOUNTING = NO
    AUTOLOGON = YES * LOGON required ?
    SECURITY = NO * running with SECURITY ?
    SERVICE-UPDATES = YES * reread ATTRIBUTE file ?
    NUM-WORKER = 6
    TRACE-LEVEL = 0
  • CMDLOG = YES * command logging on ?
    CLIENT-NONACT = 15M
    DYNAMIC-MEMORY-MANAGEMENT = YES
    MAX-MESSAGE-LENGTH = 10000000
    CONV-DEFAULT = UNLIM * CONVERSATIONs per SERVICE
    LONG-BUFFER-DEFAULT = UNLIM * LONG BUFFERs per SERVICE
    SHORT-BUFFER-DEFAULT = UNLIM * SHORT BUFFERs per SERVICE
    NUM-UOW = 100 * maximum umber of UOW’s
  • TRAP-ERROR = 0088
  • TRMODE = WRAP
  • TRBUFNUM = 5
    PSTORE = HOT
    PSTORE-REPORT = NO
    PSTORE-TYPE = ADABAS
    PSTORE-VERSION = 3
    UWTIME = 1D
    UWSTATP = 1
    WAIT-FOR-ACTIVE-PSTORE = YES
  • ------------- some example TCP parameters ----------------------
    DEFAULTS = TCP
    CONNECTION-NONACT = 15M
  • HOST = IP-address
    PORT = 99999
    RESTART = YES
    RETRY-LIMIT = 20
    RETRY-TIME = 60S
    REUSE-ADDRESS = YES
  • STACK-NAME = stack-name
  • ------------- ADABAS Defaults -----------------
    DEFAULTS = ADABAS
    BLKSIZE = 4000
    DBID = 999
    FNR = 99
    FORCE-COLD = NO
    MAXSCAN = 3000
    OPENRQ = N
    SVC = 249
  • ------------- some example NET parameters ----------------------
    DEFAULTS = NET
    ADASVC = 249
    EXTENDED-ACB-SUPPORT = YES
    FORCE = YES
    IUBL = 8192000
    LOCAL = NO
    NABS = 200000
    NCQE = 100
    NODE = 999
    TIME = 180
  • ------------- some example SECURITY parameters -----------------
  • DEFAULTS = SECURITY
    • TRACE-LEVEL = 0
    • ERRTXT-MODULE = NA2MSG0
    • SAF-CLASS = NBKSAG
    • INCLUDE-CLASS = YES
    • INCLUDE-NAME = YES
    • INCLUDE-SERVICE = YES
    • UNIVERSAL = NO
    • CHECK-IP-ADDRESS = NO
    • WARN-MODE = NO
    • IGNORE-STOKEN = NO
    • TRUSTED-USERID = NO
    • PROPOGATE-TRUSTED-USERID = NO
    • SECURITY-LEVEL = AUTHORIZATION
      ** ENTIRE Broker Services *********************************************
    • ------------- ICU Code Page defaults for IP Australia -------
      DEFAULTS = CODEPAGE
    • Broker Locale String defaults *
    • DEFAULT_ASCII = ISO8859-1
    • DEFAULT_EBCDIC_IBM = ibm-1047
      DEFAULT_EBCDIC_IBM = ibm-1047_P100-1995
    • ------------- NATURAL RPC services --------------------------
      DEFAULTS = SERVICE
      CONV-NONACT = 15M
      SERVER-NONACT = 15M
      NOTIFY-EOC = NO
    • TRANSLATION = SAGTCHA
      CONVERSION = (SAGTCHA,TRACE=0,OPTION=STOP)
    • TRANSLATION = USRTCHA * user specific translation
      CLASS = RPC, SERVER = NATRPCP1, SERVICE = CALLNAT
    • ------------- XML RPC services --------------------------
      DEFAULTS = SERVICE
      CONV-NONACT = 15M
      SERVER-NONACT = 15M
      NOTIFY-EOC = NO
    • TRANSLATION = SAGTCHA
      CONVERSION = (SAGTCHA,TRACE=0,OPTION=STOP)
    • TRANSLATION = USRTCHA * user specific
      CLASS = RPC, SERVER = XXX, SERVICE = CALLNAT
      DEFERRED = YES
    • ------------------------------------------------------- END -

There are no other servers active in this environment.

Thanks,
Chris

Hi Chris,

I’m in a conference this week so it’s difficult to help here.

From the log I see that the first worker is starting, there should also be the start of the attach thread.

You can set entirex.timeout=10 to see if you have network issues. Are there any exceptions in the log?