We’re currently using this official Lucee image from Docker Hub (I think):
In production, we are running on Kubernetes (K8), which works well most of the time. But sometimes the K8 liveness probe (which hits port :8500 directly) just stops getting a response, and nginx, on the same pod, can’t hit the upstream :8500 port either. This causes K8 to pull the pod out of rotation and restart it.
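To make the failure mode concrete, here is a rough sketch of the kind of check I mean, assuming the probe is a plain HTTP GET against port 8500. The `probe` function, the `/` path, and the 2-second timeout are placeholders I made up for illustration, not our real probe config. Run from inside the pod, something like this might at least tell a refused connection apart from one that is accepted but never answered:

```python
import http.client
import socket


def probe(host: str, port: int, path: str = "/", timeout: float = 2.0) -> str:
    """Send one HTTP GET and return a short label describing what happened.

    Hypothetical helper: path and timeout are illustrative defaults only.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        # Any HTTP status counts as "the server answered".
        return f"responded:{conn.getresponse().status}"
    except ConnectionRefusedError:
        # Nothing is accepting connections on that port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # The connect or the response took longer than `timeout`.
        return "timeout"
    except (OSError, http.client.HTTPException) as exc:
        # Reset connections, malformed responses, and similar failures.
        return f"error:{exc.__class__.__name__}"
    finally:
        conn.close()
```

Calling `probe("127.0.0.1", 8500)` in a loop during an incident would show whether the port goes to `refused` or to `timeout`, which are two quite different symptoms.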
I cannot find any evidence of why Lucee is becoming unresponsive. If I look in FusionReactor around the time of a given restart, all of the graphs look totally normal. Example (the missing data is the restart):
Just prior to the blip, all graphs look fine - no memory issues, no CPU issues, no requests piling up. Smooth sailing. Then, out of nowhere, Lucee stops responding.
I saw in another thread that I might get a better idea if I look at the Tomcat logs (since it might be Tomcat refusing connections, and not necessarily something in Lucee itself). In our platform, we just slurp everything from stdout into our log aggregator. So, if I can figure out how to pipe the Tomcat logs to stdout, I can probably access them. But, this goes way beyond my day-to-day knowledge. I’ve tried Googling, but I can’t figure it out.
In the container, I can see that there is a file:
But, when I look at the contents, the last timestamp is from 02-Nov-2020 (a year ago). So, I don’t think I’m even looking in the right place.
I’m totally lost. Any help would be much appreciated.