Hi all,
I’m running a pair of Google Compute Engine VMs with the following configuration:
Intel(R) Xeon(R) CPU @ 2.20GHz / Installed RAM 16.0 GB
Windows Server 2022 Datacenter Version 21H2
IIS 10.0.20348.1
Lucee 5.2.9.31
Servlet container: Apache Tomcat/8.5.35
Java: 1.8.0_192 (Oracle Corporation) 64-bit
Request timeout is set to 50 seconds and no concurrent requests are queued.
Application Listener is set to “Mixed”.
The SQL target is a remote SQL Server 15.0.2095.3 (Web edition).
The datasources use the SQL Server JDBC driver 6.2.2.jre8.
lucee-server.xml settings:
<default-resource-provider arguments="lock-timeout:1000;" class="lucee.commons.io.res.type.file.FileResourceProvider"/>
<resource-provider arguments="lock-timeout:10000;case-sensitive:false;" class="lucee.commons.io.res.type.http.HTTPSResourceProvider" scheme="https"/>
The servers are dedicated to API calls, meaning a large number of requests (roughly a million) per day.
Most calls run SQL queries and return a limited amount of data.
The machines sit behind a GCE load balancer with a simple probe (a cfm template) called every 5 seconds for health status.
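For reference, a minimal sketch of what such a probe template could look like (the file name and the `myDSN` datasource name are placeholders, not our actual setup; a probe that touches the datasource catches more failure modes than a static response):

```cfml
<!--- healthcheck.cfm (hypothetical name): minimal LB health probe.
      Returns HTTP 200 when Lucee and the datasource both respond,
      503 otherwise, so the load balancer can mark the host unhealthy. --->
<cfsetting showdebugoutput="false" requesttimeout="4">
<cftry>
    <cfquery name="ping" datasource="myDSN">
        SELECT 1 AS ok
    </cfquery>
    <cfheader statuscode="200" statustext="OK">
    <cfoutput>OK</cfoutput>
    <cfcatch>
        <cfheader statuscode="503" statustext="Service Unavailable">
        <cfoutput>FAIL</cfoutput>
    </cfcatch>
</cftry>
```

Keeping the probe's `requesttimeout` well below the 50-second request timeout means a hung datasource fails the probe quickly instead of tying up a thread for the full timeout.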
The typical workload of the application should not be CPU-intensive; still, CPU usage anywhere between 10% and 80% would be “normal”. On the Overview page (when all is OK) I see JVM heap at 46% and non-heap at 10%.
When serving normally, the Commons Daemon Service Runner consumes up to 5 GB and the IIS worker process around 200 MB; overall CPU sits around 10% and memory around 50%.
I’ve noticed a particular behaviour that I cannot explain.
In Task Manager I can see the Commons Daemon Service Runner going high on memory while its CPU usage stays flat at zero. An IIS worker process seems to follow the same pattern (no CPU, high memory).
When this happens, the probe either responds “normally” or times out, several times within a short window (5-10 seconds), causing the LB to misdirect API requests to the failing host.
As you’ve already guessed, I’m not a sysadmin/DevOps person [at all], so simple advice, pointers, or configuration checks would be greatly appreciated. I suspect we’re facing a “too many threads” issue, as described here: https://lucee.daemonite.io/t/server-possibly-crashing-due-to-too-many-threads/5129/20 but this alone may not explain it.
(and please excuse my English & bleeding eyes)