Regarding further progress on my Lucee fork discussed here:
https://lucee.daemonite.io/t/java-16-performance-boost-and-lucee-mods/8430
I had theorized that Application.cfc could be optimized internally, and now I have proven it. Regardless of whether this behavior change breaks the CFML language standard, I wanted to tinker and see how much of the overhead of Application.cfc and the surrounding Lucee internals could be removed. My benchmarking told me something was wrong here: Lucee is at least twice as fast when you delete Application.cfc. This overhead is very significant because the configuration is reloaded on every request instead of once during start-up, and there is also disk I/O.
Internally in Lucee's Java code, I found 2 ways to optimize this and achieve a massive performance increase while still allowing concurrent database access and overall thread safety.
#1 I found that PageSourceImpl.java constantly performs disk I/O to verify the existence of Application.cfc.
To fix this, I created a static ConcurrentHashMap cache of "exists" results in physcalExists (typo is intentional) in PageSourceImpl.java. I also added a cache clear to the call function of the PagePoolClear tag, so that existence is tested again only when the application developer asks for it.
When a developer sets their application to "never" inspect templates, they would expect that to also mean Application.cfc doesn't get inspected or cause disk I/O overhead, but unfortunately Lucee still performs this disk I/O for Application.cfc alone. I believe there was caching there at one point that was later removed, because a hashmap remnant is still in the code. Perhaps there are compatibility reasons not to do this, but it's a very good option for those willing to optimize further for best performance!
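For anyone who wants to picture change #1, here is a minimal standalone sketch of the idea. It is not the code from my fork or from Lucee: the class `ExistsCache`, the method names, and the `DISK_CHECKS` counter are all hypothetical and only illustrate a static ConcurrentHashMap that remembers "exists" results until it is explicitly cleared.

```java
import java.io.File;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the cache described above; not Lucee's PageSourceImpl.
final class ExistsCache {
    // Path -> result of the last real file system check.
    private static final ConcurrentHashMap<String, Boolean> EXISTS = new ConcurrentHashMap<>();

    // Counts real disk checks so the effect of the cache can be observed.
    static final AtomicInteger DISK_CHECKS = new AtomicInteger();

    // Touches the disk only the first time a path is seen; later calls hit the map.
    static boolean physicalExists(String path) {
        return EXISTS.computeIfAbsent(path, p -> {
            DISK_CHECKS.incrementAndGet();
            return new File(p).exists();
        });
    }

    // Would be called from whatever plays the role of PagePoolClear.
    static void clear() {
        EXISTS.clear();
    }
}
```

The trade-off is the one described above: a file that is created or deleted on disk is not noticed until `clear()` runs, which is acceptable when templates are set to never be inspected.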
#2 I found that the configuration of Application.cfc is reloaded on every request. While this behavior might ensure maximum compatibility, it would make sense to create a mode and documentation describing a way to load Application.cfc values once and only once during the lifetime of the application scope. This is a massive issue for performance because Application.cfc processing is not trivial.
To fix this, I was able to update PageContextImpl.java initApplicationContext and ModernAppListener.java to utilize a ConcurrentHashMap for both the component instance and the applicationContext instance. I recognize other changes would be required for the other application context modes like classic and mixed, but in my Lucee fork, I have removed those to avoid any additional code execution.
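The shape of change #2 can be sketched the same way. This is an illustration under assumptions, not the actual PageContextImpl or ModernAppListener code: `AppContextCache`, `AppSettings`, `load`, and the setting shown are hypothetical names, and the expensive Application.cfc processing is replaced by a trivial loader.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: load application settings once, then reuse them on every request.
final class AppContextCache {
    // Immutable settings holder standing in for the application context.
    static final class AppSettings {
        final String appPath;
        final Map<String, String> values;

        AppSettings(String appPath, Map<String, String> values) {
            this.appPath = appPath;
            this.values = values;
        }
    }

    private static final ConcurrentHashMap<String, AppSettings> CONTEXTS = new ConcurrentHashMap<>();

    // Counts how often the expensive load actually runs.
    static final AtomicInteger LOADS = new AtomicInteger();

    // Stand-in for instantiating Application.cfc and reading its settings.
    private static AppSettings load(String appPath) {
        LOADS.incrementAndGet();
        return new AppSettings(appPath, Map.of("sessionManagement", "false"));
    }

    // Every request calls this; only the first call per application pays for load().
    static AppSettings getOrLoad(String appPath) {
        return CONTEXTS.computeIfAbsent(appPath, AppContextCache::load);
    }

    // Forces the next request to reload, for example after a deploy.
    static void reset(String appPath) {
        CONTEXTS.remove(appPath);
    }
}
```

Because `computeIfAbsent` runs the loader at most once per key even under concurrent requests, all threads end up sharing the same instance, so the shared object should be treated as read-only.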
These 2 optimizations give Lucee a massive performance increase for simple requests in simple load testing using a Java Future thread pool inside the IntelliJ editor, not on a production server. A production system could be even better.
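For reference, a thread pool load test of this kind can be as small as the sketch below. It is not my actual benchmark script: `RequestBenchmark`, `run`, and the `request` callable are hypothetical, and the callable stands in for whatever executes one CFML request in-process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness: runs the same request many times and reports requests per second.
final class RequestBenchmark {
    static double run(Callable<String> request, int threads, int total) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>(total);
            long start = System.nanoTime();
            // Queue every request up front, then wait for all of them to finish.
            for (int i = 0; i < total; i++) {
                futures.add(pool.submit(request));
            }
            for (Future<String> f : futures) {
                f.get();
            }
            long elapsedNanos = System.nanoTime() - start;
            return total / (elapsedNanos / 1_000_000_000.0);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as `RequestBenchmark.run(() -> "hello world", 4, 100_000)` measures only the harness itself; a few warm-up runs before the timed one keep JIT compilation from skewing the result.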
My Lucee fork was already 2 or 3 times faster before these changes, but now it is extremely fast, close to raw Java server socket limits. Original Lucee would struggle to reach 4,000 or 5,000 requests per second on my system, but now I can hit 25,000 to 48,000 requests per second on my quad-core Intel 4790K CPU on Windows, running the same "hello world" kind of CFML code as before. Amazing! These CFML requests are made internally from a custom CLI script and not over the network, since I'm just trying to benchmark and optimize the CFML engine request flow. I'm sure the TCP network overhead would slow it down some, but I just removed a bottleneck in Lucee, which lets it handle the normal CFML request flow up to 10 times faster. I have also made many other optimizations and removed features, which add to the performance gains, but I believe the 2 items listed here are very significant and easy to address in the original Lucee code that everyone uses.
I also verified that these optimizations are not in the current master branch of Lucee on GitHub, so the opportunity still exists.