New Lucee deployment won't start without restarting Tomcat

I’m using Lucee 5.2.3.35 on Tomcat 8 on an AWS EC2 instance (behind ELB). I only deploy a small number of Lucee files as part of my application:

  1. WEB-INF/lib/lucee-5.2.3.35.jar
  2. WEB-INF/lucee-server/context/lucee-server.xml
  3. WEB-INF/lucee/lucee-web.xml.cfm

As expected, Lucee copies out all the other files it needs when it first runs, but then onApplicationStart does not run. If I restart the Tomcat server, the application starts and everything seems fine.

There are no error messages in application.log and I’m at a loss as to how to diagnose the problem.

The biggest clue I have is that there are errors in the Tomcat log that I do not see on the previous version of Lucee:

2017-09-04 13:20:11.530 copy bundled extension to local extension directory (if never done before)
2017-09-04 13:20:11.530 detected Extensions to install (new;new-fresh):7E673D15-D87C-41A6-8B5F1956528C605F;name=MySQL;label=MySQL;version=5.1.38, 99A4EF8D-F2FD-40C8-8FB8C2E67A4EEEB6;name=MSSQL;label=MS SQL Server;version=4.0.2206.100, 671B01B8-B3B3-42B9-AC055A356BED5281;name=PostgreSQL;label=PostgreSQL;version=9.4.1212, 2BCD080F-4E1E-48F5-BEFE794232A21AF6;name=JDTsSQL;label=jTDS (MSSQL);version=1.3.1, CED6227E-0F49-6367-A68D21AACA6B07E8;name=Admin;label=Lucee Administrator;version=1.0.0.3, D46D49C3-EB85-8D97-30BEC2F38561E985;name=Doc;label=Lucee Documentation;version=1.0.0.2, 17AB52DE-B300-A94B-E058BD978511E39E;name=S3;label=S3;version=0.9.4.113, 87FE44E5-179C-43A3-A87B3D38BEF4652E;name=EHCache;label=EHCache;version=2.10.0.22, D46B46A9-A0E3-44E1-D972A04AC3A8DC10;name=Chart;label=CFChart;version=1.0.19.24, FAD1E8CB-4F45-4184-86359145767C29DE;name=Hibernate;label=Hibernate;version=3.5.5.65, EFDEB172-F52E-4D84-9CD1A1F561B3DFC8;name=Lucene;label=Lucene;version=2.4.1.29, 66E312DD-D083-27C0-64189D16753FD6F0;name=PDF;label=PDF;version=1.0.0.62, FAD67145-E3AE-30F8-1C11A6CCF544F0B7;name=Form;label=Form tags;version=1.0.0.7;since=5.1.0.21
lucee.runtime.exp.ApplicationException: Extension [19m0lm7ck5i6a.lex] was not found at [/var/lib/tomcat8/webapps/ROOT/WEB-INF/lucee-server/context/extensions/installed/19m0lm7ck5i6a.lex]
	at lucee.runtime.extension.RHExtension.toResource(RHExtension.java:670)
	at lucee.runtime.extension.RHExtension.<init>(RHExtension.java:194)
	at lucee.runtime.config.XMLConfigAdmin._hasRHExtensions(XMLConfigAdmin.java:6584)
	at lucee.runtime.config.XMLConfigAdmin.hasRHExtensions(XMLConfigAdmin.java:6574)
	at lucee.runtime.config.DeployHandler.deployExtension(DeployHandler.java:161)
	at lucee.runtime.config.DeployHandler.deployExtensions(DeployHandler.java:140)
	at lucee.runtime.engine.CFMLEngineImpl.<init>(CFMLEngineImpl.java:355)
	at lucee.runtime.engine.CFMLEngineImpl.getInstance(CFMLEngineImpl.java:620)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at lucee.loader.engine.CFMLEngineFactory.getEngine(CFMLEngineFactory.java:1543)
	at lucee.loader.engine.CFMLEngineFactory._getCore(CFMLEngineFactory.java:561)
	at lucee.loader.engine.CFMLEngineFactory.initEngine(CFMLEngineFactory.java:352)
	at lucee.loader.engine.CFMLEngineFactory.initEngineIfNecessary(CFMLEngineFactory.java:279)
	at lucee.loader.engine.CFMLEngineFactory.getInstance(CFMLEngineFactory.java:175)
	at lucee.loader.engine.CFMLEngineFactory.getInstance(CFMLEngineFactory.java:213)
	at lucee.loader.servlet.LuceeServlet.init(LuceeServlet.java:42)
	at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1227)
	at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1140)
	at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1027)
	at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:5038)
	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5348)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
	at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:753)
	at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:729)
	at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:717)
	at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1129)
	at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1871)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
lucee.runtime.exp.ApplicationException: Extension [wnakxokve7b0.lex] was not found at [/var/lib/tomcat8/webapps/ROOT/WEB-INF/lucee-server/context/extensions/installed/wnakxokve7b0.lex]
	at lucee.runtime.extension.RHExtension.toResource(RHExtension.java:670)
	at lucee.runtime.extension.RHExtension.<init>(RHExtension.java:194)
	at lucee.runtime.config.XMLConfigAdmin._hasRHExtensions(XMLConfigAdmin.java:6584)
	at lucee.runtime.config.XMLConfigAdmin.hasRHExtensions(XMLConfigAdmin.java:6574)
	at lucee.runtime.config.DeployHandler.deployExtension(DeployHandler.java:161)
	at lucee.runtime.config.DeployHandler.deployExtensions(DeployHandler.java:140)
	at lucee.runtime.engine.CFMLEngineImpl.<init>(CFMLEngineImpl.java:355)
	at lucee.runtime.engine.CFMLEngineImpl.getInstance(CFMLEngineImpl.java:620)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at lucee.loader.engine.CFMLEngineFactory.getEngine(CFMLEngineFactory.java:1543)
	at lucee.loader.engine.CFMLEngineFactory._getCore(CFMLEngineFactory.java:561)
	at lucee.loader.engine.CFMLEngineFactory.initEngine(CFMLEngineFactory.java:352)
	at lucee.loader.engine.CFMLEngineFactory.initEngineIfNecessary(CFMLEngineFactory.java:279)
	at lucee.loader.engine.CFMLEngineFactory.getInstance(CFMLEngineFactory.java:175)
	at lucee.loader.engine.CFMLEngineFactory.getInstance(CFMLEngineFactory.java:213)
	at lucee.loader.servlet.LuceeServlet.init(LuceeServlet.java:42)
	at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1227)
	at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1140)
	at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1027)
	at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:5038)
	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5348)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
	at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:753)
	at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:729)
	at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:717)
	at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1129)
	at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1871)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

However, those files are there when I look for them and the app starts fine when Tomcat is restarted. Could the copy process be out of order with the startup process?

[Update 5 Sep 2017]

Adding the missing extension files to the distribution makes the Tomcat errors messages go away, but onApplicationStart is still not called. If I restart the Tomcat service, onApplicationStart is called and the application works fine.

The deployment takes just under 30 seconds, during which time there are likely to be requests coming in. Could this be a timing issue?

I can find no relevant error messages in Tomcat’s catalina.out or Lucee’s application.log. Where else should I be looking?

bump

I’m happy to investigate this myself, but I need a steer on debugging the startup process. Can I increase the logging output somehow?

In case it isn’t clear: this problem makes Lucee unusable on AWS ELB and other automatic load balancing environments because new instances will not start without user intervention.

Hey Rupert,

We had a similar issue on AWS as well. We are using Apache but you should be able to get everything up and running with your own environment using:

Disable auto startup for Lucee and Apache:

$ chkconfig httpd off
$ chkconfig lucee_ctl off

Setup your own Lucee and Apache startup:

$ cd /etc/rc.d/
$ vi rc.local
$ /sbin/service lucee_ctl start
$ /bin/sleep 30s
$ /sbin/service httpd start

The sleep command will give Lucee the time it needs to startup before launching your web service. Obviously, you can increase the time if your Lucee install takes longer. Hope this helps! :sunglasses:

1 Like

@wwilliams Thank you - this is very useful. Unfortunately I have limited control over the starting and stopping of the services as they are all part of the Amazon Machine Image, which is copied anew onto each server instance as it spins up. I could provide patches or make a custom image, but working out exactly what to change is hit-and-miss and could take a long time.

It is quite possibly the same pathology you were seeing though: HTTP requests being sent to Tomcat and then to Lucee before Lucee has finished getting started.

How did you work out what was happening and do you know any way to increase the debug output during startup?

[Update]

I don’t think it’s early HTTP requests: even if no requests come in for several minutes, the application still doesn’t start cleanly. It feels like Lucee is taking a long time to start and Tomcat can’t cope for some reason.

I was able to reduce the chance of the problem happening by reducing the start up time. I did this by disabling TLD scanning.

The problem still happens occasionally, which is obviously unacceptable. The best workaround I could find was to restart the Tomcat server straight after deployment had finished using ELB deployment hooks.

This solution is rather brittle as it relies on implementation details of the AWS Tomcat AMI stack. This really needs addressing in Lucee if it’s going to be a reliable platform for scalable cloud-based web servers.