EventGateways silently stop after multiple hours

Version: Lucee 5.3.2.77
Servlet Container: WildFly / Undertow - 1.4.24.Final
Java: 1.8.0_201 (Oracle Corporation) 64bit
OS Linux: (4.14.123-111.109.amzn2.x86_64) 64bit
CommandBox: v4.7.0+00026

I have 3 EventGateways, each of which simply makes an HTTP request back to Lucee using <cfhttp> every second. They all run fine for multiple hours, but then simply stop making the request, with no error that I can find. The status of each EventGateway, as viewed from the Web Admin, remains at Running. If I try to stop or restart any of them through the Web Admin, it just hangs at “Stopping…”. Through some logging and thread dumps I’ve been able to find out the following:

From the thread dump, this is the thread running the EventGateway:

"Thread-7" #34 prio=5 os_prio=0 tid=0x00007f0085054800 nid=0x1183 in Object.wait() [0x00007f0069a44000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at lucee.commons.io.IOUtil.copy(IOUtil.java:354)
        - locked <0x00000004b6222ed8> (a java.lang.Object)
        at lucee.commons.io.IOUtil.copy(IOUtil.java:310)
        at lucee.commons.io.IOUtil.toString(IOUtil.java:745)
        at lucee.commons.io.IOUtil.toString(IOUtil.java:705)
        at lucee.runtime.tag.Http._doEndTag(Http.java:1247)
        at lucee.runtime.tag.Http.doEndTag(Http.java:685)
        at reportprocessor_cfc$cf.udfCall(/services/ReportProcessor.cfc:14)
        at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:106)
        at lucee.runtime.type.UDFImpl._call(UDFImpl.java:342)
        at lucee.runtime.type.UDFImpl.call(UDFImpl.java:215)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:648)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:570)
        at lucee.runtime.ComponentImpl.call(ComponentImpl.java:1918)
        at lucee.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:768)
        at lucee.runtime.PageContextImpl.getFunction(PageContextImpl.java:1707)
        at services.daemonservice_cfc$cf.udfCall(/services/DaemonService.cfc:22)
        at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:106)
        at lucee.runtime.type.UDFImpl._call(UDFImpl.java:342)
        at lucee.runtime.type.UDFImpl.call(UDFImpl.java:215)
        at lucee.runtime.type.scope.UndefinedImpl.call(UndefinedImpl.java:765)
        at lucee.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:768)
        at lucee.runtime.PageContextImpl.getFunction(PageContextImpl.java:1707)
        at services.basegateway_cfc$cf.udfCall(/services/BaseGateway.cfc:38)
        at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:106)
        at lucee.runtime.type.UDFImpl._call(UDFImpl.java:342)
        at lucee.runtime.type.UDFImpl.callWithNamedValues(UDFImpl.java:205)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:649)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:570)
        at lucee.runtime.ComponentImpl.callWithNamedValues(ComponentImpl.java:1937)
        at lucee.runtime.ComponentPageImpl.callWDDX(ComponentPageImpl.java:664)
        at lucee.runtime.ComponentPageImpl.call(ComponentPageImpl.java:206)
        at lucee.runtime.PageContextImpl._doInclude(PageContextImpl.java:942)
        at lucee.runtime.PageContextImpl._doInclude(PageContextImpl.java:834)
        at lucee.runtime.listener.ModernAppListener._onRequest(ModernAppListener.java:216)
        at lucee.runtime.listener.MixedAppListener.onRequest(MixedAppListener.java:42)
        at lucee.runtime.PageContextImpl.execute(PageContextImpl.java:2409)
        at lucee.runtime.PageContextImpl._execute(PageContextImpl.java:2399)
        at lucee.runtime.PageContextImpl.executeCFML(PageContextImpl.java:2374)
        at lucee.runtime.gateway.GatewayEngineImpl.call(GatewayEngineImpl.java:392)
        at lucee.runtime.gateway.GatewayEngineImpl.callOneWay(GatewayEngineImpl.java:356)
        at lucee.runtime.gateway.CFCGateway.callOneWay(CFCGateway.java:170)
        at lucee.runtime.gateway.CFCGateway.doStart(CFCGateway.java:101)
        at lucee.runtime.gateway.GatewayThread.run(GatewayThread.java:44)

And for context, here’s the relevant portion of services/ReportProcessor.cfc:

File: \app\cfm\services\ReportProcessor.cfc
13:     <cffunction name="run">
14:     	<cfhttp url="#variables.apiUrl#" method="GET">
15:     	<cfif cfhttp.statusCode eq "200 OK">
16:     		<cfreturn />
17:     	</cfif>
18:     	<cfset var errorMsg = "Error Contacting Queue Service">
19:     	<cfthrow type="ReportProcessor" message="#errorMsg#" detail="statusCode: #cfhttp.statusCode# #cfhttp.fileContent#">
20:     </cffunction>

I log all requests in the onRequestEnd method of Application.cfc, and that includes the thread name.
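For anyone wanting to reproduce this kind of logging, here is a minimal sketch, not the exact code in use. The application name, the `requests` log file name, the `request.startTick` variable, and the use of `cgi.script_name` for the path are all assumptions made for illustration.

```cfml
<!--- Application.cfc (sketch): time each request and log it when it ends --->
<cfcomponent output="false">
    <!--- hypothetical application name --->
    <cfset this.name = "requestLoggingSketch">

    <cffunction name="onRequestStart" returntype="boolean" output="false">
        <cfargument name="targetPage" type="string" required="true">
        <!--- remember when the request started (assumed variable name) --->
        <cfset request.startTick = getTickCount()>
        <cfreturn true>
    </cffunction>

    <cffunction name="onRequestEnd" returntype="void" output="false">
        <cfargument name="targetPage" type="string" required="true">
        <cfset var elapsed = getTickCount() - request.startTick>
        <!--- "requests" is an assumed log file name --->
        <cflog file="requests" type="information"
               text="Request: #cgi.script_name# - Time: #elapsed# ms">
    </cffunction>
</cfcomponent>
```

The sketch never reads the thread name explicitly. In the entries produced by <cflog>, the second column appears to be the name of the thread that wrote the entry.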

"INFO","XNIO-1 task-26","06/28/2019","08:04:24","","Request: /rest/v1/report/checkQueue - Time: 104 ms"

So this tells me that the request itself isn’t getting stuck in any of my code, as it makes it to the onRequestEnd handler just fine. I looked in the thread dump for the thread XNIO-1 task-26 and found the following:

"XNIO-1 task-26" #150 prio=5 os_prio=0 tid=0x00007f0014011800 nid=0x1201 waiting on condition [0x00007f0041cc0000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000004a538c100> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
        - None

So if I’m understanding this correctly, the EventGateway executed the <cfhttp> line properly and is now stuck waiting on the response. The request itself went through the Lucee request lifecycle properly, but gets stuck after Lucee hands the request off to the servlet container. The response never makes it back to the <cfhttp> call that requested it, so the gateway thread just sits there waiting.
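One mitigation that might be worth trying is an explicit `timeout` on the <cfhttp> call, so a response that never arrives cannot block the gateway thread indefinitely. The following is only a sketch. The 10-second value is arbitrary, and I have not verified that the timeout covers the response-body read at IOUtil.copy where the thread dump shows the thread waiting.

```cfml
<cffunction name="run">
    <!--- timeout is in seconds; 10 is an arbitrary value chosen for illustration --->
    <cfhttp url="#variables.apiUrl#" method="GET" timeout="10">
    <cfif cfhttp.statusCode eq "200 OK">
        <cfreturn />
    </cfif>
    <!--- any other status (presumably including a timed-out call) ends up here --->
    <cfset var errorMsg = "Error Contacting Queue Service">
    <cfthrow type="ReportProcessor" message="#errorMsg#" detail="statusCode: #cfhttp.statusCode# #cfhttp.fileContent#">
</cffunction>
```

If the timeout does take effect, a stalled call should fail the status check and throw, rather than leaving the gateway hung. That would at least turn the silent stop into an error that shows up in the logs.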

Anyone have any insight on what I can do to prevent this?

Edit: Forgot to add Lucee version