Lucee freezes all websites

Hello! Lucee randomly freezes all websites for no apparent reason. The apps are well tuned. This is what I see in catalina.log.
Can someone help? This is driving me crazy.

21-Feb-2019 00:52:36.093 SEVERE [http-nio-8888-exec-25] org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun
 java.lang.ThreadDeath
        at java.lang.Thread.stop(Thread.java:853)
        at lucee.commons.io.StopThread.run(SystemUtil.java:1549)

21-Feb-2019 01:28:59.845 SEVERE [http-nio-8888-exec-53] org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun
 java.lang.ThreadDeath
        at java.lang.Thread.stop(Thread.java:853)
        at lucee.commons.io.StopThread.run(SystemUtil.java:1549)

21-Feb-2019 01:29:54.707 SEVERE [http-nio-8888-exec-59] org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun
 java.lang.ThreadDeath
        at java.lang.Thread.stop(Thread.java:853)
        at lucee.commons.io.StopThread.run(SystemUtil.java:1549)

System Information

Version Lucee 5.2.9.31
Version Name Velvet
Release date Sep 17, 2018
ColdFusion® compatibility version 2016.0.03.300357
OS Linux (4.9.0-8-amd64) 64bit
Servlet Container Apache Tomcat/8.5.24
Java 1.8.0_192 (Oracle Corporation) 64bit
Architecture 64bit

Same problem here, started today:

Lucee 5.3.2.15-SNAPSHOT
Apache Tomcat/8.5.32
OpenJDK 11.0.1
Windows Server 2012 (6.2) 64bit

25-Feb-2019 11:41:46.978 SEVERE [ajp-nio-8009-exec-5] org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun 
 java.lang.ThreadDeath
	at java.base/java.lang.Thread.stop(Thread.java:942)
	at lucee.commons.io.StopThread.run(SystemUtil.java:1491)

System property [org.owasp.esapi.opsteam] is not set
System property [org.owasp.esapi.devteam] is not set

In my case it seems related to pages containing the cfhttp tag. Perhaps there is a problem in how Lucee handles HTTPS errors/timeouts. In any case, I use a timeout of 10 seconds for the cfhttp tag.

Really, all the thread stop stuff tells you is that Lucee is killing long-running threads.

  • Use FusionReactor to see why the threads are running long.
  • Add timeouts to all your cfhttp calls.
  • Raise your page timeout, as Lucee is very destructive when it kills long-running pages and can leave the threads zombied, since a hard kill doesn't clean up the thread.
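
To make the last point a bit more concrete, here is a minimal plain-Java sketch of the alternative to a hard kill: wait for the work with a bounded timeout and, if the limit is hit, ask the worker to stop via interruption. This is not Lucee's implementation. The class `BoundedTask`, the method `runWithTimeout`, and the sample tasks are all hypothetical names made up for illustration. The stack traces above show `Thread.stop` being called, which is a deprecated JDK method precisely because it can leave shared state half-updated.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not taken from Lucee or Tomcat.
final class BoundedTask {

    // Runs the task on a worker thread and waits at most timeoutMillis for it.
    // Returns "done: <result>", "timed out", "failed: <cause>" or "interrupted".
    static String runWithTimeout(Callable<String> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> future = pool.submit(task);
        try {
            return "done: " + future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Cooperative cancel: interrupts the worker instead of calling Thread.stop().
            future.cancel(true);
            return "timed out";
        } catch (ExecutionException e) {
            return "failed: " + e.getCause();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the caller can see it.
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Fast task: finishes well inside the limit.
        System.out.println(runWithTimeout(() -> "ok", 1000));
        // Slow task: sleeps past the limit and gets interrupted.
        System.out.println(runWithTimeout(() -> {
            Thread.sleep(5000);
            return "late";
        }, 200));
    }
}
```

The slow task only stops because `Thread.sleep` reacts to interruption. Code that ignores interrupts would keep running, which is presumably why a server ends up falling back to a hard kill in the first place.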

@Leftbower This line here

System property [org.owasp.esapi.opsteam] is not set

is unrelated. That's just some cruft from the ESAPI lib. I actually added some Java system props to shut that message up in CommandBox, since it was harmless but was leaking to the console.

Do you think it would be possible to include the URL of the stopped thread in the logs?

It would be great!

Bug filed, please vote for it: https://luceeserver.atlassian.net/browse/LDEV-2171

Would wrapping cfhttp in cftry help or not? Many thanks.

In Tomcat, it's usually the start of a GC cycle that causes the issue.

If you are not already logging garbage collection, add these to your Java options:

-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCApplicationConcurrentTime
-XX:+PrintHeapAtGC
-Xloggc:YOURDRIVELETTER:\apache\apache-tomcat-8.5.32\logs\gc.log
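
One caveat: the flags above are the Java 8 spellings. From Java 9 on, GC logging moved to unified logging (`-Xlog:gc*`), so the OpenJDK 11 setup mentioned earlier in the thread would need that form instead. As a complement to the log file, GC activity can also be read from inside a JVM through the standard `java.lang.management` API. The sketch below is only an illustration: `GcSnapshot` and `snapshot` are made-up names, and it reports on whatever JVM it runs in, so to say anything about Tomcat it would have to be loaded inside the Tomcat process.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; reports GC totals for the JVM it is running in.
final class GcSnapshot {

    // One line per collector: name, number of collections, accumulated time in ms.
    // The JDK reports -1 for a value the collector does not expose.
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", totalMillis=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        snapshot().forEach(System.out::println);
    }
}
```

Taking two snapshots a minute apart and comparing `totalMillis` gives a rough idea of how much time went into GC over that window, which can then be lined up against the timestamps of the freezes.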

Additionally, if you have not done so, as a best practice all web applications on Windows should be placed on their own volume.

Putting a cfhttp tag inside a cftry tag would have no effect on whether it times out or runs for a long time.
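
The same distinction can be shown in plain Java as an analogy (this is not CFML and not how Lucee implements cfhttp). In the sketch below the timeout settings are what bound the call, while the try/catch only decides what happens once the call has already failed. `HttpTimeoutDemo`, `fetch`, and `fetchFromSilentServer` are hypothetical names, and the "silent server" is just a local socket that never answers, used as a stand-in for a slow remote endpoint.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;

// Hypothetical illustration; an analogy for cfhttp timeout vs. cftry.
final class HttpTimeoutDemo {

    // Requests the URL and returns "status <code>", "timeout" or "error: <message>".
    static String fetch(String url, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            // These two settings bound the call; the catch blocks below do not.
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            return "status " + conn.getResponseCode();
        } catch (SocketTimeoutException e) {
            return "timeout";
        } catch (IOException e) {
            return "error: " + e.getMessage();
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    // Binds a local port that completes TCP connects but never sends a response,
    // then fetches from it so the read timeout is the only thing ending the call.
    static String fetchFromSilentServer(int timeoutMillis) {
        try (ServerSocket silent = new ServerSocket(0)) {
            return fetch("http://127.0.0.1:" + silent.getLocalPort() + "/", timeoutMillis);
        } catch (IOException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchFromSilentServer(500));
    }
}
```

Without the two `set...Timeout` calls, the request to the silent server would simply hang, try/catch or not.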

I have a Lucee server on Debian Linux in a KVM virtual machine. My applications are located on an NFS-mounted volume. Could this cause problems of any kind? Aren't cfm templates compiled on the local machine before execution?
Thanks in advance.

NFS is a whole other topic in itself. It very well could be the issue with your applications. There are many known issues with NFS, along with workarounds, guides, and other options for its use.

I avoid NFS whenever possible and opt for SSHFS or SMB, as both are less problematic. Additionally, I try to avoid executing binaries from network shares; instead I place the executable code on the network share and have the application server(s) run from a native drive.