keepAliveTimeout in pub.client:http

Hi Experts,

IS 9.12 is consuming the REST API of target system to invoke. And the target system’s idle time to close connection is 30 seconds.

I am trying to achieve HTTP TCP connection close within 20 seconds before my target system issue FIN packet (close connection request), so I tried to set the keepAliveTimeout to 20, however based on TCPDUMP, there was no FIN ACK initiate from IS after 20 seconds.

Please help to understand how keepAliveTimeout works. Thanks in advance.

Note: For now setting maxKeepAliveConnections to 0, so IS send request to target system and immediately close the connection.

keepAliveTimeout is for inbound connections and is now set at the port level.
For outbound connections you can configure timeout of the service pub.client:http directly, however it is in milliseconds, NOT seconds.

A global property watt.net.timeout can be set if you don’t want to set each outbound http call separately.
regards,
John.

2 Likes

Thanks for your response @John_Carter4

keepAliveTimeout is for inbound connections and is now set at the port level”, if its for inbound connections, how does setting keepAliveTimeout in pub.client:http work exactly when I am calling a target system? Sorry, may be my understanding about the setting is wrong and need to understand it clearly.

Thanks,
Niranth

The timeout in the input for pub.client:http service, is as the documentation says pub.client:http
timeout

String Optional. Time, measured in milliseconds, that Integration Server waits for a response from the remote server before timing out and terminating the request. If you do not specify a value, Integration Server uses the value of the watt.net.timeout server configuration parameter.

The watt.net.timeout works at the global level , from the admin guide .
Specifies the number of seconds the server waits for an HTTP request to be fulfilled before the request times out.

You can try setting the input timeout in the pub.client:http to 20000 and try if it works for you.
-NP

Hi @Nagendra_Prasad

Thanks for your response. The timeout is working. My question is on keepAliveTimeout input for pub.client:http service :slight_smile:

keepAliveTimeout is for inbound connections and is now set at the port level”, if its for inbound connections, how does setting keepAliveTimeout in pub.client:http work exactly when I am calling a target system? Sorry, may be my understanding about the setting is wrong and need to understand it clearly.

Regards,
Niranth

The keepAliveTimeout in relation to pub.client:http is rather obscure and is related to how we manage idle outbound threads. It effectively decides how long to keep these threads around when not in use. You should not need to set it and it isn’t relevant to your use case.
regards,
John.

1 Like

I’d be interested in understanding the utility of doing this. In the various HTTP interactions we have this particular item has never come up. We have never needed to directly manage connection closing. We use a timeout but that’s so that our pub.client:http call doesn’t wait forever for a response from the target.

What is the concern if the target sends a FIN to IS?

Ah sorry, I see what you are asking now. You don’t want the idle connection to be kept, but closed almost immediately. Are you returning the response as a stream ? If so have you closed the stream, because the http connection will not be released until you do irrespective of the value for keepAliveTimeout.

Hi @reamon,

You are right, we usually do not use keepAliveTimeout and this particular scenario haven’t faced.

Recently we observed that there were duplicate HTTP requests sent to target system, so analyzed TCP DUMP and observed below pattern causing the duplicate request

  • Target System —> if connection is idle for 30 seconds, Target system will request to close connection (FIN,ACK)
  • IS —> ACK
  • There is a gap of more than 30 seconds before IS send the FIN,ACK and during this, there is a HTTP request A to Target System [as per TCP thing I understand from google, target system should not process this request and must reset the connection by sending RST,ACK but some how its processing the request (may be abnormal behavior)]
  • IS —> later after the gap IS sends the FIN,ACK
  • Target System ----> ACK
  • Now IS will open new connection and send again the same HTTP request A and its causing duplicate request at target system

Not understanding why target System processed the HTTP request A when it sent close connection already during the gap above mentioned.

So as a temporary solution, to avoid duplicate requests we close connection from IS before target close the same. (until target system rectify this behavior)

Note: As we could not achieve this by setting keepAliveTimeout,we set maxKeepAliveConnections to 0 where IS send HTTP requests to target system and immediately close the connection. But need to understand how does setting keepAliveTimeout in pub.client:http work exactly when I am calling a target system.

Hope I explained this well :smiley:

Regards,
Niranth

Hi @John_Carter4 ,

Thank you for your response. Yes right, we able to achieve close connection immediately by setting maxKeepAliveConnections to 0 where IS send HTTP request to target system and immediately close the connection.

Just wanted to understand on keepAliveTimeout, as I tried few scenarios to understand its behavior but not getting it.

Note: We are returning the response as bytes not as stream.

Regards,
Niranth

Can you post your service that is using pub.client:http and indicate how it is being initiated. I will investigate this with our R&D team.
thanks,
John.

Thanks for the detail!

testPdfZipFileTosendTarget.zip (3.7 KB)

Hi @John_Carter4 ,

Attaching the service that is calling pub.client:http… this service created by me to reproduce the scenario we were facing. Note: The issue is not with all the target system.

What we observed from TCP dump is when the target system is idle for say 30 seconds, it will request to close connection (FIN,ACK) and IS will acknowledge with ACK. But before IS issue close connection (FIN,ACK), there is a HTTP request to target System where there will be connection reset from target system and IS issue the same HTTP request again to target. I guess this is expected behavior from TCP protocol. Want to understand who initiate the new connection, is it pub.client.http or the TCP protocol ?

Also, wanted to understand more on how keepAliveTimeout is used and in what scenarios.

Below is the IS server log with trace level:

Invoking service pub.client:http :-

2022-06-15 11:22:35 IST [ISS.0015.0005D] Invoking service pub.client:http.
2022-06-15 11:22:35 IST [ISS.0014.0016T] Invoking service pub.client:http
**2022-06-15 11:22:35 IST [ISC.0064.0002T] HTTP Get http://localhost:5554/rest/flow/restt/writePdfZipFile?json=NZ+7** ---> **[u]this is the http request after target system request TCP close connection and before IS close connection[/u]**
2022-06-15 11:22:35 IST [ISC.0064.0015T] **Opening Socket http://localhost:5554/rest/flow/restt/writePdfZipFile?json=NZ+7** ---> **[u]opening the socket[/u]**
2022-06-15 11:22:35 IST [ISC.0064.0021T] Successfully connected to  host: localhost  port: 5554
2022-06-15 11:22:35 IST [ISS.0142.0001D] Registered Lifecycle Listener Socket[addr=localhost/127.0.0.1,port=5554,localport=59533]
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> GET /rest/flow/restt/writePdfZipFile?json=NZ+7 HTTP/1.1
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> User-Agent: Mozilla/4.0 [en] (WinNT; I)
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> Accept: image/gif, */*
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> Host: localhost:5554
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> Content-Type: application/x-www-form-urlencoded
2022-06-15 11:22:35 IST [ISS.0142.0002D] Unregistered Lifecycle Listener Socket[addr=localhost/127.0.0.1,port=5554,localport=59533] is true
2022-06-15 11:22:35 IST [ISC.0064.0015T] **Opening Socket http://localhost:5554/rest/flow/restt/writePdfZipFile?json=NZ+7** --> **[u]Again opening new socket[/u]**
2022-06-15 11:22:35 IST [ISC.0064.0021T] Successfully connected to  host: localhost  port: 5554
2022-06-15 11:22:35 IST [ISS.0142.0001D] Registered Lifecycle Listener Socket[addr=localhost/127.0.0.1,port=5554,localport=59590]
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> GET /rest/flow/restt/writePdfZipFile?json=NZ+7 HTTP/1.1
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> User-Agent: Mozilla/4.0 [en] (WinNT; I)
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> Accept: image/gif, */*
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> Host: localhost:5554
2022-06-15 11:22:35 IST [ISC.0038.0002D] --> Content-Type: application/x-www-form-urlencoded
2022-06-15 11:22:35 IST [ISC.0040.0001D] HTTP/1.1 200
2022-06-15 11:22:35 IST [ISC.0038.0002D] <-- Content-Type: text/html; charset=UTF-8
2022-06-15 11:22:35 IST [ISC.0038.0002D] <-- Content-Length: 75
2022-06-15 11:22:35 IST [ISS.0142.0002D] Unregistered Lifecycle Listener Socket[addr=localhost/127.0.0.1,port=5554,localport=59590] is true
2022-06-15 11:22:35 IST [ISC.0049.0008D] Sequence : index=2 depth=2
2022-06-15 11:22:35 IST [ISC.0049.0008D] Sequence MAIN: index=2 depth=1
2022-06-15 11:22:35 IST [ISC.0049.0005D] Invoke : index=1 depth=1

TCP DUMP:

Target system issue close connection after its idle for 30 seconds

|836|11:22:31.778008|127.0.0.1|127.0.0.1|TCP|44|5554 → 59533 [FIN, ACK] Seq=155 Ack=203 Win=2619648 Len=0|
|---|---|---|---|---|---|---|
|837|11:22:31.778026|127.0.0.1|127.0.0.1|TCP|44|59533 → 5554 [ACK] Seq=203 Ack=156 Win=2619392 Len=0|

IS send http request to target system before IS close connection (there will be connection RST,ACK from target) :

|887|11:22:35.202978|127.0.0.1|127.0.0.1|HTTP|239|GET /rest/flow/restt/writePdfZipFile?json=NZ+7 HTTP/1.1 |
|---|---|---|---|---|---|---|
|888|11:22:35.203017|127.0.0.1|127.0.0.1|TCP|44|5554 → 59533 [RST, ACK] Seq=156 Ack=398 Win=0 Len=0|
|889|11:22:35.224904|127.0.0.1|127.0.0.1|TCP|56|[TCP Retransmission] [TCP Port numbers reused] 59581 → 6849 [SYN] Seq=0 Win=65535 Len=0 MSS=65495 WS=256 SACK_PERM=1|
|890|11:22:35.224933|127.0.0.1|127.0.0.1|TCP|44|6849 → 59581 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0|
|891|11:22:35.315526|127.0.0.1|127.0.0.1|TCP|56|59590 → 5554 [SYN] Seq=0 Win=65535 Len=0 MSS=65495 WS=256 SACK_PERM=1|
|892|11:22:35.315574|127.0.0.1|127.0.0.1|TCP|56|5554 → 59590 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=65495 WS=256 SACK_PERM=1|
|893|11:22:35.315598|127.0.0.1|127.0.0.1|TCP|44|59590 → 5554 [ACK] Seq=1 Ack=1 Win=2619648 Len=0|
|894|11:22:35.315703|127.0.0.1|127.0.0.1|HTTP|239|GET /rest/flow/restt/writePdfZipFile?json=NZ+7 HTTP/1.1 |
|895|11:22:35.315717|127.0.0.1|127.0.0.1|TCP|44|5554 → 59590 [ACK] Seq=1 Ack=196 Win=2619648 Len=0|
|896|11:22:35.317187|127.0.0.1|127.0.0.1|HTTP|198|HTTP/1.1 200 OK  (text/html)|
|897|11:22:35.317211|127.0.0.1|127.0.0.1|TCP|44|59590 → 5554 [ACK] Seq=196 Ack=155 Win=2619392 Len=0|

Hi @Niranth_NS,
Can you please do the following and send post results here

  1. Set watt.net.socketpool.debug=true to get more details about each outbound call from IS. This output goes to wrapper.log and this property takes effect (turning on/off) within few seconds when configured from the IS Admin extended settings page.

  2. If pub.client:http KeepaliveTimeout parameter doesn’t work, try setting the global parameter watt.net.clientKeepaliveTimeout=20

  3. Additionally, try setting these parameters to a higher value to ensure they are not interfering:
    watt.net.clientKeepaliveUsageLimit=100000 and
    watt.net.clientKeepaliveAgingLimit=1800000.

Depending on this I may then ask you raise an incident with our support team, so that I can engage some R&D people to look at this more fully.
regards,
John.

Hi @John_Carter4 ,

I saw you reply jus now before I posted my service. Sure, noted on this and try with all these settings and update you the results here.

Regards,
Niranth

Can you confirm what you said here, do you mean “target systems
i.e. is this problem only occurring with this application ?
Also what you mean by idle, are you sure that the target system is not forgetting to close the request, is the service thread on the IS blocked during the period ?

It could be that it is the target application that is not completing the request properly, the connection is then forced closed by the target system, which would in turn trigger webMethods to retry. Do you think this is a possible scenario ?

regards,
John.

Hi @John_Carter4 ,

Thank you so much for your advise on doing the extended settings change to get detailed logs on sockets.

I understood from the logs that there will be a socket pool sweeper executes every 60 seconds and If we set pub.client:http KeepaliveTimeout say for 20 seconds.

So, for example there is a socket pool sweeper executed at 3:00:00 PM, at same time IS send http request to target system (New socket is created) and this TCP socket connection is expired at 3:00:20 PM (as per keep alive timeout set to 20 seconds), but socket is not closed at 3:00:20PM. It will get closed at next socket pool sweeper execution i.e. at 3:01:00 PM where sweeper looks for all stale connection to close. Please correct me if my understanding is wrong.

wrapper.log bit below for 278130337:

INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.594 (tid=1228792482) : *** <NetURLConn> starting request; (retryAvailable=true) (connected=false) (outputopen=false) (postOutput=null)
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.594 (tid=1228792482) : <SockPool> sock request for key http@5554@localhost
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.594 (tid=1228792482) : 	socket pool NOT found... creating a new one for key http@5554@localhost; handshake is : true
INFO   | jvm 7    | 2022/06/15 15:05:13 | >>>>> Connection attempt SUCCEEDED in 0milli-seconds
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.594 (tid=1228792482) : 	Socketpool is empty... created new socket 278130337
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.594 (tid=1228792482) : <NetURLConn> received socket 278130337
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.594 (tid=1228792482) : *** <NetURLConn> connected=true (isRetry=false)
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.594 (tid=1228792482) : *** <NetURLConn> posted request; starting to read response; ISresponseValid=true
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.596 (tid=1228792482) : *** <NetURLConn> response code is : 200
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.596 (tid=1228792482) : <put> sock request for key http@5554@localhost
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.596 (tid=1228792482) : <SocketPool> adding socket 278130337 to the cache
INFO   | jvm 7    | 2022/06/15 15:05:13 | SocketPool> 2022-06-15 15:05:13.596 (tid=1228792482) : *** <HttpURLConn> releasing socket 278130337
INFO   | jvm 7    | 2022/06/15 15:05:38 | SocketPool> 2022-06-15 15:05:38.046 (tid=1697219554) : <SocketMonitor> sweeping for stale sockets
INFO   | jvm 7    | 2022/06/15 15:05:38 | SocketPool> 2022-06-15 15:05:38.046 (tid=1697219554) : <SocketMonitor> Now processing pool http@5554@localhost (socket count=1)
INFO   | jvm 7    | 2022/06/15 15:05:38 | SocketPool> 2022-06-15 15:05:38.051 (tid=1697219554) : <SocketPool> adding socket 278130337 to the cache
INFO   | jvm 7    | 2022/06/15 15:05:38 | SocketPool> 2022-06-15 15:05:38.051 (tid=1697219554) : <SocketMonitor> socket 278130337 is NOT Stale
INFO   | jvm 7    | 2022/06/15 15:05:38 | SocketPool> 2022-06-15 15:05:38.051 (tid=1697219554) : <SocketMonitor> Finished processing pool http@5554@localhost
INFO   | jvm 7    | 2022/06/15 15:06:38 | SocketPool> 2022-06-15 15:06:38.051 (tid=1697219554) : <SocketMonitor> sweeping for stale sockets
INFO   | jvm 7    | 2022/06/15 15:06:38 | SocketPool> 2022-06-15 15:06:38.051 (tid=1697219554) : <SocketMonitor> Now processing pool http@5554@localhost (socket count=1)
INFO   | jvm 7    | 2022/06/15 15:06:38 | SocketPool> 2022-06-15 15:06:38.051 (tid=1697219554) : <SocketMonitor> socket 278130337 is Stale; closing it
INFO   | jvm 7    | 2022/06/15 15:06:38 | SocketPool> 2022-06-15 15:06:38.051 (tid=1697219554) : <SocketMonitor> Closed socket=278130337)
INFO   | jvm 7    | 2022/06/15 15:06:38 | SocketPool> 2022-06-15 15:06:38.051 (tid=1697219554) : <SocketMonitor> Finished processing pool http@5554@localhost
INFO   | jvm 7    | 2022/06/15 15:07:38 | SocketPool> 2022-06-15 15:07:38.052 (tid=1697219554) : <SocketMonitor> sweeping for stale sockets

Yes and that explains why the keepAlive value is not being respected, the sweeper interval has to be lower than the keepAlive, so I would recommend changing the default from 60 via the extended setting.

watt.net.socketpool.sweeperInterval=10

regards,
John.

1 Like

Thanks @John_Carter4 . It works and understand how pub.client:http’s keepAliveTimeout works. :slight_smile:

Regards,
Niranth