Hi! Every couple of days, my Apache web server stops responding. When I
look at the connections with netstat, I see that there are a large
number* of TCP connections in the CLOSE_WAIT state. From my limited
understanding of TCP, I believe this means that the HTTP client has
indicated that it is done with the connection and has asked to terminate
it, but my server has not agreed to terminate the connection. I’m guessing
that the problem is more likely to be with my Tomcat or Lucee setup than a
bug in Apache, hence my asking for help here. My current workaround is to
restart Apache when this occurs.
Are there any error conditions that could result in Tomcat/Lucee leaking
TCP connections?
Are there any settings I can change in Lucee or Tomcat to prevent
connection leakage?
Thanks!
* “Large number” means about 10x the number seen in normal operation.
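To put a number on "large" and watch it over time, the netstat output can be tallied by TCP state. This is only a minimal sketch: it assumes the six-column layout that `netstat -ant` prints on Linux, and the function name `count_tcp_states` and the sample addresses are made up for illustration.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in `netstat -ant`-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with the protocol (tcp, tcp4, tcp6) and end with
        # the state. Header rows and non-TCP rows are skipped.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states


# Hypothetical sample, shaped like Linux `netstat -ant` output.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:80             203.0.113.7:51234       CLOSE_WAIT
tcp        1      0 10.0.0.5:80             198.51.100.9:40022      CLOSE_WAIT
tcp        0      0 10.0.0.5:80             192.0.2.44:50111        ESTABLISHED
"""

for state, count in count_tcp_states(SAMPLE).most_common():
    print(state, count)
```

On a real server the text would come from running netstat itself (for example via `subprocess.run(["netstat", "-ant"], capture_output=True, text=True)`), and logging the CLOSE_WAIT count every few minutes should show whether it creeps up steadily or jumps all at once.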
Maybe you have a bug in your app that prevents connections from closing, e.g. an unhandled exception that leaves these connections in a zombie state. Might be worth reviewing app/error logs to see if something obvious appears.
Thanks, Robert. I had seen that SO post, and I believe I upvoted that great
answer!
Since CFML apps aren’t generally responsible for handling their own HTTP
connections, I wonder what kinds of things one could do in CFML that could
cause connection leakage.
I could try to correlate the last requests from each hung connection (by
looking up their IP in the Apache access log), but I’m not sure if there
would be lines in the access log for these requests gone bad. It’s worth a
shot, I suppose.
Leon
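The correlation described above (remote IP of each hung connection, looked up in the access log) could be sketched roughly as follows. The assumptions are loud ones: netstat output in the Linux `netstat -ant` layout, an access log whose first field is the client IP (as in Apache's common and combined formats), and helper names (`close_wait_peers`, `last_request_per_peer`) invented for this example.

```python
def close_wait_peers(netstat_output):
    """Return the set of remote IPs that have a connection in CLOSE_WAIT."""
    peers = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if (len(fields) >= 6 and fields[0].startswith("tcp")
                and fields[-1] == "CLOSE_WAIT"):
            # Foreign address is the fifth column, formatted as ip:port.
            ip = fields[4].rsplit(":", 1)[0]
            # IPv4-mapped IPv6 addresses are usually logged as plain IPv4.
            if ip.startswith("::ffff:"):
                ip = ip[len("::ffff:"):]
            peers.add(ip)
    return peers


def last_request_per_peer(access_log_lines, peers):
    """Map each peer IP to the last access-log line that starts with it."""
    last = {}
    for line in access_log_lines:
        parts = line.split(None, 1)
        if parts and parts[0] in peers:
            last[parts[0]] = line.rstrip("\n")  # later lines overwrite earlier ones
    return last


def peers_without_log_line(peers, last):
    """Peers in CLOSE_WAIT that never appear in the access log."""
    return peers - set(last)
```

Peers that come back from `peers_without_log_line` are the case worried about above: a connection that went bad before Apache wrote a log line for it. If the remote addresses turn out to be a backend (for example localhost) rather than clients, the access log will not help and the proxied service is the place to look.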
Update:
Our Apache proxies requests to a couple of other services (one being
Lucee). We suspected that one of these services (probably the non-Lucee
service) might be misbehaving and failing to disconnect in some cases,
leaking connections. We added ProxyTimeout 60 to our Apache config, and
it seems like the server has been more stable since that change.
If you’re not already doing this, you can probably find the culprit
(slightly laboriously) by adding timeouts line by line, then checking
netstat. Something like this should be supported in (I think) Apache 2.4+:
Ken