5.2.7.63 upgrade and cfclasses


#1

Just to report an issue we experienced upgrading from 5.2.7.62 to the 5.2.7.63 patch.

After applying it locally, apps took an unusually long time to load, but did eventually work.

In production, things were much worse: FusionReactor showed requests failing to complete and eventually timing out. The logs were full of thread death errors.

I’d experienced this before and found that the same solution worked again: clearing the cfclasses folder in each web context. We did this before applying the patch to the other servers, and they had no issues.
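For anyone wanting to script that cleanup, here is a minimal Python sketch. It assumes the compiled classes live under `WEB-INF/lucee/cfclasses` inside each web context, which may differ on your install, and `clear_cfclasses` is just an illustrative helper name, not anything Lucee ships.

```python
import shutil
from pathlib import Path


def clear_cfclasses(web_context_root: Path) -> int:
    """Empty the cfclasses folder of one web context.

    Assumes the folder is <web_context_root>/WEB-INF/lucee/cfclasses
    (adjust if your layout differs). The folder itself is kept; only its
    contents are deleted. Returns the number of top-level entries removed.
    """
    cfclasses = web_context_root / "WEB-INF" / "lucee" / "cfclasses"
    if not cfclasses.is_dir():
        return 0  # nothing to clear for this context

    removed = 0
    for entry in cfclasses.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # compiled classes are grouped in subfolders
        else:
            entry.unlink()
        removed += 1
    return removed
```

You would call it once per web context root, for example `clear_cfclasses(Path("/path/to/your/webroot"))`, with the path replaced by your own.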

Has anyone else seen this? Any ideas why existing files in the cfclasses folder might lead to this?

We didn’t see this at all on our previous upgrade to 5.2.7.62, which went smoothly.


#2

There was an issue in the build process that has been solved in the meantime. Best to upload it again and give it another shot.


#3

We actually had an issue before 5.2.7.63. While we aren’t exactly sure what caused it, we know it wasn’t an update. The last update to our Lucee servers (5 of them - same build, but different scripts) was done on May 30, 2018 to 5.2.7.60. We performed weekly reboots, but that is all, since we applied no patches to the servers.

Out of the blue on Tuesday, we started seeing timeouts within our Lucee logs and AJP errors within our Apache logs for about 1-2 minutes.

[Tue Jun 19 09:09:54.495906 2018] [proxy_ajp:error] [pid 4507] (70007)The timeout specified has expired: [client 10.0.21.51:50938] AH00878: read response failed from (null) ()
[Tue Jun 19 09:09:55.395691 2018] [proxy_ajp:error] [pid 4513] (70007)The timeout specified has expired: AH01030: ajp_ilink_receive() can’t receive header
[Tue Jun 19 09:09:55.395735 2018] [proxy_ajp:error] [pid 4513] [client 10.0.11.51:45308] AH00992: ajp_read_header: ajp_ilink_receive failed
[Tue Jun 19 09:09:55.395742 2018] [proxy_ajp:error] [pid 4513] (70007)The timeout specified has expired: [client 10.0.11.51:45308] AH00878: read response failed from (null) ()
[Tue Jun 19 09:09:59.085687 2018] [proxy_ajp:error] [pid 4518] (70007)The timeout specified has expired: AH01030: ajp_ilink_receive() can’t receive header
[Tue Jun 19 09:09:59.085742 2018] [proxy_ajp:error] [pid 4518] [client 10.0.11.51:45320] AH00992: ajp_read_header: ajp_ilink_receive failed
[Tue Jun 19 09:09:59.085749 2018] [proxy_ajp:error] [pid 4518] (70007)The timeout specified has expired: [client 10.0.11.51:45320] AH00878: read response failed from (null) ()
[Tue Jun 19 09:09:59.281565 2018] [proxy_ajp:error] [pid 4519] (70007)The timeout specified has expired: AH01030: ajp_ilink_receive() can’t receive header
[Tue Jun 19 09:09:59.281609 2018] [proxy_ajp:error] [pid 4519] [client 10.0.11.51:45324] AH00992: ajp_read_header: ajp_ilink_receive failed
[Tue Jun 19 09:09:59.281616 2018] [proxy_ajp:error] [pid 4519] (70007)The timeout specified has expired: [client 10.0.11.51:45324] AH00878: read response failed from (null) ()
[Tue Jun 19 09:10:00.039193 2018] [proxy_ajp:error] [pid 4529] (70007)The timeout specified has expired: AH01030: ajp_ilink_receive() can’t receive header
[Tue Jun 19 09:10:00.039239 2018] [proxy_ajp:error] [pid 4529] [client 10.0.21.61:45986] AH00992: ajp_read_header: ajp_ilink_receive failed
[Tue Jun 19 09:10:00.039246 2018] [proxy_ajp:error] [pid 4529] (70007)The timeout specified has expired: [client 10.0.21.61:45986] AH00878: read response failed from (null) ()
[Tue Jun 19 09:10:00.124696 2018] [proxy_ajp:error] [pid 4535] (70007)The timeout specified has expired: AH01030: ajp_ilink_receive() can’t receive header
[Tue Jun 19 09:10:00.124730 2018] [proxy_ajp:error] [pid 4535] [client 10.0.11.61:48164] AH00992: ajp_read_header: ajp_ilink_receive failed
[Tue Jun 19 09:10:00.124736 2018] [proxy_ajp:error] [pid 4535] (70007)The timeout specified has expired: [client 10.0.11.61:48164] AH00878: read response failed from (null) ()
[Tue Jun 19 09:10:01.130885 2018] [proxy_ajp:error] [pid 4530] (70007)The timeout specified has expired: AH01030: ajp_ilink_receive() can’t receive header
[Tue Jun 19 09:10:01.130929 2018] [proxy_ajp:error] [pid 4530] [client 10.0.11.51:45342] AH00992: ajp_read_header: ajp_ilink_receive failed
[Tue Jun 19 09:10:01.130937 2018] [proxy_ajp:error] [pid 4530] (70007)The timeout specified has expired: [client 10.0.11.51:45342] AH00878: read response failed from (null) ()
[Tue Jun 19 09:10:01.431865 2018] [proxy_ajp:error] [pid 4528] (70007)The timeout specified has expired: AH01030: ajp_ilink_receive() can’t receive header
[Tue Jun 19 09:10:01.431910 2018] [proxy_ajp:error] [pid 4528] [client 10.0.21.51:50988] AH00992: ajp_read_header: ajp_ilink_receive failed
[Tue Jun 19 09:10:01.431918 2018] [proxy_ajp:error] [pid 4528] (70007)The timeout specified has expired: [client 10.0.21.51:50988] AH00878: read response failed from (null) ()

We tried a number of different things over 2 days and nothing helped. No memory or hardware issues. Zero changes to our scripts (as mentioned above - not all servers run the same scripts) and ALL SERVERS HAD THE SAME ISSUE.

Our last-ditch effort was to update Lucee to 5.2.7.63 to see if that helped. It didn’t!

I messaged @Julian_Halliwell and he sent me what he had done. We immediately cleared out the cfclasses directory on each machine and rebooted them. That was 36 hours ago and everything looks to be working great!

Huge thanks to @Julian_Halliwell!

QUESTION: What would cause this, and why did clearing out the cfclasses directory resolve it (I actually know the answer, but figure someone might clarify here)?


#4

Just trying to get a few questions answered about this cfclasses “bug”:

QUESTION: What would cause this, and why did clearing out the cfclasses directory resolve it (I actually know the answer, but figure someone might clarify here)?

QUESTION: Do we need to clear out the cfclasses directory on every reboot, or was this a one-time issue?

QUESTION: @micstriit, how do we know whether we have the correct build of Lucee 5.2.7.63 that will not have this issue?

Thanks in advance!


#5

Lucee should (and normally does) clear the cfclasses folder with every update. This is necessary because an update can also change what the generated bytecode looks like.
If Lucee is not able to clear the cfclasses folder (access rights?), it logs this to the error stream (in Tomcat, that is catalina.out).
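To rule out the access-rights case, a quick check along these lines might help. This is only a sketch: `can_clear` is a made-up helper name, the path you pass in is whatever your cfclasses folder actually is, and the result is only meaningful if you run it as the same OS user the Lucee/Tomcat process runs as.

```python
import os
from pathlib import Path


def can_clear(cfclasses: Path) -> bool:
    """Return True if the current user could delete entries in cfclasses.

    Deleting entries from a directory needs write and search (execute)
    permission on the directory itself, so that is what gets checked.
    A missing folder counts as False.
    """
    return cfclasses.is_dir() and os.access(cfclasses, os.W_OK | os.X_OK)
```

If this returns False for the Lucee user, the automatic cleanup on update would presumably fail too, and catalina.out is the place to look for the corresponding message.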

We did not have different 5.2.7.63 versions; in fact, we never have differences within the same version.
There was an issue in the deploy packages produced by the build, such as Lucee Express or the WAR.
The Lucee core itself was never affected.