Lucee Service Shutdown problem in Windows


#1

I’ve just installed a fresh Windows Server 2016 Box and installed the 5.2.6.60 Windows Installer / IIS.
All works fine, but… If I szop the Lucee Service from Service Manager (or with net stop lucee from command line), it takes a veery long time until the service is stopped (about 2-3 Minutes!). In the catalina-Log, I see this message:

22-Mar-2018 09:15:33.799 SCHWERWIEGEND [Thread-12] org.apache.catalina.core.ContainerBase.stopInternal A child container failed during stop
 java.util.concurrent.ExecutionException: org.apache.catalina.LifecycleException: Failed to stop component [StandardEngine[Catalina].StandardHost[www.myserver.ch]]
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.catalina.core.ContainerBase.stopInternal(ContainerBase.java:999)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:221)
	at org.apache.catalina.core.StandardService.stopInternal(StandardService.java:502)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:221)
	at org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:810)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:221)
	at org.apache.catalina.startup.Catalina.stopServer(Catalina.java:447)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.catalina.startup.Bootstrap.stopServer(Bootstrap.java:398)
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:485)
Caused by: org.apache.catalina.LifecycleException: Failed to stop component [StandardEngine[Catalina].StandardHost[www.myserver.ch]]
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:233)
	at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1424)
	at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1413)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.catalina.LifecycleException: A child container failed during stop
	at org.apache.catalina.core.ContainerBase.stopInternal(ContainerBase.java:1006)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:221)
	... 6 more

22-Mar-2018 09:15:33.812 SCHWERWIEGEND [Thread-12] org.apache.catalina.startup.Catalina.stopServer Catalina.stop: 
 org.apache.catalina.LifecycleException: Failed to stop component [StandardServer[8005]]
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:233)
	at org.apache.catalina.startup.Catalina.stopServer(Catalina.java:447)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.catalina.startup.Bootstrap.stopServer(Bootstrap.java:398)
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:485)
Caused by: org.apache.catalina.LifecycleException: Failed to stop component [StandardService[Catalina]]
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:233)
	at org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:810)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:221)
	... 7 more
Caused by: org.apache.catalina.LifecycleException: Failed to stop component [StandardEngine[Catalina]]
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:233)
	at org.apache.catalina.core.StandardService.stopInternal(StandardService.java:502)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:221)
	... 9 more
Caused by: org.apache.catalina.LifecycleException: A child container failed during stop
	at org.apache.catalina.core.ContainerBase.stopInternal(ContainerBase.java:1006)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:221)
	... 11 more

Any idea, what can cause this? I’ve also tried to update to the latest BonCode connector and also to Tomcat 8.0.50 - no success… JVM is 1.8.0_162-b12.


#2

If it’s a clean install, shouldn’t your Tomcat version be 8.5.28?


#3

Actually, the one bundled with the windows installer was 8.0.32.
I’ve upgraded to 8.0.50.


#4

I got a step further with this issue… In the meantime, I’ve installed everything manually (like described on these pages: http://docs.lucee.org/guides/running-lucee/windows.html. Tomcat is now 8.5.29; Java 1.8.0_162-b12. Everything starts up fine. As soon as I connect to an SQL Server 2016 (with driver from Microsoft), I cannot stop Tomcat. If there isn’t established a Database-Connection; Tomcat can be stopped within seconds; otherwise it can take up to 5 minutes. Also tried the JDTS Driver; same issue.
Googling around I found informations that JDBC drivers a registered in the JVM as singletons. Maybe this causes a problem?
What will stop Tomcat immediately is, if I kill the process “tomcat8.exe” within task manager. But this is probably not the finest solution…


#5

Have you tried out the latest 5.2.7 snapshot?

Which version of the sql drivers extensions are you running?

if it’s happening with both sql drivers, so it’s probably not related to the jdbc driver

if you try and stop the service and then shutdown/restart the mssql server, does the hanging tomcat service also stop?


#6

Just tried the latest 5.2.7-Snapshot. Same issue.
MSSQL-Driver is 6.2.2.jre8. I’ve seen that there’s a 6.4 Release from Microsoft (https://www.microsoft.com/en-us/download/confirmation.aspx?id=56615). But actually don’t know how I can update that lucee will use that version.
I’ve also tries to restart mssql server - still hanging tomcat service…


#7

I’d suggest it’s might be something related to your code, have you tried stripping your code base until it doesn’t happen?


#8

Zac, thanks for your input. I think, I’ve figured out what’s causing the Tomcat hangs…
I’ve deleted the whole “WEB-INF” folder from the webroot; a new, fresh one is created with empty lucee-web-settings. With that I can start - run - stop the site without any issues. As soon as I create an “Event Gateway” (DirectoryWatcher) within the Web Admin for that site, the Gateway actually runs and works, but - this is what leeds to the non-stopping Tomcat instance. I’ve left all parameters for the DirectoryWatcher-EventGateway at it’s defaults. As soon as I remove the Event Gateway in the Web-Admin, the service can be stopped without any issues/delays.


#9

glad i could help, can you file a bug in jira?


#10

Just created bug LDEV-1763…


#11

I had a play and also i can’t reproduce the error either

I did make some changes to the directory watcher to make it more user friendly


#12

Thanks for your investigations, Zac.
Your changes - will those be in the next snapshot? Or how will I get those?

Just a small recap of my issue:
If I enter this line:
<gateway cfc-path="lucee.extension.gateway.DirectoryWatcher" custom="deleteFunction=onDelete&amp;addFunction=onAdd&amp;directory=D%3A%5Ctemp&amp;interval=60000&amp;recurse=true&amp;extensions=*&amp;changeFunction=onChange" id="myTestGateway" listener-cfc-path="lucee.extension.gateway.DirectoryWatcherListener" read-only="false" startup-mode="automatic"/>

in my lucee-web.xml.cfm (between <gateways></gateways>); the Tomcat-Service cannot be stopped anymore from the Windows Service Manager; I have to kill the tomcat8.exe process manually.
Did you do your checks also on WinServer 2016? Can it be a security setting within Windows that prevents stopping tomcat if there’s an Event Gateway installed? Is there some information available, how those Event Gateways are running internally?


#13

the pull request needs to be accepted, nothing stopping you just dropping those files into your local server tho


#14

Ok - I’ve taken your modified “DirectoryWatcher.cfc” and put that into [lucee-web]\WEB-INF\lucee\components\lucee\extension\gateway. I see now a little more information in the DirectoryWatcher.log. But unfortunately nothing leeding to “why-service-cannot-be-stopped”. I’ll try to investigate into Windows Security; otherwise I’m going to reinstall everything again… :frowning:


#15

what’s your listener.cfc look like? how many files in the directory being watched?


#16

For my tests, I took the standard listener; the one that is only logging if something changes within the directory. The files being watched are actually for my tests only a few within D:\temp.
I’ve now really reduced the logic; also create a new iis-site without any special stuff (like coldbox or other frameworks); only with one index.cfm. As soon as I register the Gateway; Tomcat cannot be stopped.
In the “DirectoryWatcher.log”, I just found something strange:

"Severity","ThreadID","Date","Time","Application","Message"
"INFO","w3svc6-startStop-1","03/29/2018","08:48:22","","init g01 [D:\Data\testweb\WEB-INF\lucee\components\lucee\extension\gateway\DirectoryWatcherListener.cfc]"
"INFO","Thread-12","03/29/2018","08:48:22","","start g01 Directory[D:\temp]"
"INFO","localhost-startStop-2","03/29/2018","08:48:45","","stop g01"
"INFO","Thread-26","03/29/2018","08:48:45","","start g01 Directory[D:\temp]"

Line 1 => Header
Line 2+3 => those are written if I start Tomcat
now the interesting thing: If I stop Tomcat from Service manager, Line 4 AND 5 are written… Line 4 says startStop; and Line 5 starts the gateway up again? Maybe this causes the problem? Why does the gateway-thread start again if I tell Tomcat to stop?

Again the Log entries I see in catalina.log. Is there a possibility to get the information WHICH “child container failed during stop”?

29-Mar-2018 08:48:50.783 SCHWERWIEGEND [Catalina-startStop-2] org.apache.catalina.core.ContainerBase.stopInternal A child container failed during stop
 java.util.concurrent.ExecutionException: org.apache.catalina.LifecycleException: Failed to stop component [StandardEngine[Catalina].StandardHost[localhost].StandardContext[/host-manager]]
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.catalina.core.ContainerBase.stopInternal(ContainerBase.java:997)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:226)
	at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1437)
	at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1426)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.catalina.LifecycleException: Failed to stop component [StandardEngine[Catalina].StandardHost[localhost].StandardContext[/host-manager]]
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:238)
	... 6 more
Caused by: org.apache.catalina.LifecycleException: Failed to stop component [WebappLoader[/host-manager]]
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:238)
	at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5473)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:226)
	... 6 more
Caused by: java.lang.NullPointerException
	at lucee.runtime.osgi.EnvClassLoader.load(EnvClassLoader.java:120)
	at lucee.runtime.osgi.EnvClassLoader.getResource(EnvClassLoader.java:51)
	at lucee.runtime.osgi.EnvClassLoader.findResource(EnvClassLoader.java:264)
	at org.apache.juli.ClassLoaderLogManager.readConfiguration(ClassLoaderLogManager.java:461)
	at org.apache.juli.ClassLoaderLogManager$3.run(ClassLoaderLogManager.java:430)
	at org.apache.juli.ClassLoaderLogManager$3.run(ClassLoaderLogManager.java:426)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.apache.juli.ClassLoaderLogManager.getClassLoaderInfo(ClassLoaderLogManager.java:426)
	at org.apache.juli.ClassLoaderLogManager.reset(ClassLoaderLogManager.java:361)
	at org.apache.juli.logging.LogFactory.release(LogFactory.java:235)
	at org.apache.catalina.loader.WebappClassLoaderBase.clearReferences(WebappClassLoaderBase.java:1549)
	at org.apache.catalina.loader.WebappClassLoaderBase.stop(WebappClassLoaderBase.java:1483)
	at org.apache.catalina.loader.WebappLoader.stopInternal(WebappLoader.java:447)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:226)
	... 8 more

#17

i just updated the my pull request for the directoryWatcher, I realised that it scans the dir target twice on start up


#18

I’ve just found a nasty problem which might explain what your experiencing,
you might have a couple of watchers running which you can’t see, hence why
your server could be I/O bound and won’t shutdown

https://luceeserver.atlassian.net/browse/LDEV-1777


#19

I’ve tried to test LDEV-1777. You’ve changed a file called “services.gateway.list.cfm”. I’ve found a file called like that in a WEB-INF-Folder (WEB-INF\lucee\temp\compress\ef…\services.gateway.list.cfm). I think, this file is replaced/checked everytime Tomcat is starting up? Where do I have to replace that services.gateway.list.cfm file with the new one?


#20

the context files only get updated if you upgrade lucee, they get deployed the first time the web context is accessed

have you checked for open file handles?

when i was playing with this, i’d left one open by mistake
and it caused all sort of pain. i.e fileopen without a fileclose?

i.e. does it happen after restarting the windows machine? does it get progressively worse?