I have been running some tests on 126.96.36.199 and that fixes a lot of issues we had with previous versions, although I don’t know how stable it is as we are just moving over to it.
It would be great to get the below into the next sprint
Stating the obvious but software like Lucee is complex and a version that may be “stable” in one situation could be unstable in another.
We started having apparent memory leak problems with 4.5 over a year ago which we isolated to a particular application, but couldn’t determine the root cause (despite having FusionReactor, which just showed waiting/blocked threads piling up).
Having upgraded a client installation from 4.5 to 188.8.131.52 and found it very stable (still running smoothly as I write), we optimistically did the same with our own, only for it to bomb immediately forcing a roll back.
It was only when we got positive reports from others similarly affected that we gave the 184.108.40.206 SNAPSHOT a try and as I’ve mentioned in other posts, it has indeed proved very stable… for us… in our environment…running our particular apps with their particular workloads.
But sadly it would appear even this version is not stable for everyone.
I’m certainly happy to commend this release based on our experience, but YMMV.
5.2.5 is not stable for me. I’m stuck on 220.127.116.11 right now. See
All News related posts are served from here:
Unfortunately, the development of the new website is lagging behind the move in comms. We do update the main web site blog section with a link through to here. But looks like that has been delayed a little bit over the break.
When the new site is released this year, new posts will be dynamically represented on the main site.
This is our experience too. We appear to have no issues with the current release.
The LAS team has grown in 2017: additional Java development resources, project management and admin support. Member and supporter contributions will start to make a real difference in our resourcing capacity in the coming year.
@modius and @Julian_Halliwell Out of curiosity, are either of you using query caching in your applications? I’ve been reviewing a heap dump with @sbleon today on https://luceeserver.atlassian.net/browse/LDEV-1640 and have discovered 93% of his heap space being consumed with cached queries. We’re still digging into it so it’s just a lead at this point, but it could possibly be what makes the issue completely absent for some people and present for others.
Do you mean query of queries sort of stuff, or cachedwithin sort of stuff?
While it’s possible somewhere in our portfolio it is not something we typically do. If we need to cache something we would more likely externalise the data in memcached or similar.
So without any sort of close scrutiny I’d say, no; we don’t use query caching.
You might be onto something, Brad. No, we hardly use the built-in cachedwithin etc. query caching at all. In fact we reduced its use in favour of our own caching mechanism a while ago in case it was contributing to the memory problems we were experiencing at the time.
To add to this, Leon is making large use of cachedwithin=0, whose behavior was changed recently in this ticket:
However, it looks as though it possibly was implemented in such a way that even though Lucee returns an un-cached query, it doesn’t actually clear the items from the cache. See this related ticket
If some developers are using cachedwithin=0 AND that value incorrectly leaves the items in the cache AND the RamCache has no limits set, then it seems reasonable that this could be the source of memory issues that only affect some people.
This is a total guess though right now. I need to hear back from some other people who are still claiming to have memory issues on 18.104.22.168 to see if this profile possibly fits their usage.
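To make the hypothesis above concrete, here is a toy Python model of it. This is not Lucee's code: `ToyQueryCache`, `evict_on_zero` and `simulate` are hypothetical names, and the "leaky" branch only mimics the behaviour suspected in the tickets (a cachedwithin=0 call returns a fresh result but leaves the old entry in an unbounded cache). Treat it as a sketch under those assumptions.

```python
class ToyQueryCache:
    """Toy stand-in for an unbounded in-memory query cache (NOT Lucee's RamCache)."""

    def __init__(self, evict_on_zero):
        self.entries = {}                   # no size limit, like a cache with no limits set
        self.evict_on_zero = evict_on_zero  # hypothetical switch: the suspected bug vs. the expected behaviour

    def run_query(self, sql, cachedwithin, execute):
        if cachedwithin > 0:
            # Normal caching path: store the result on first use, reuse it afterwards.
            if sql not in self.entries:
                self.entries[sql] = execute(sql)
            return self.entries[sql]
        # cachedwithin == 0: the caller always gets a fresh, un-cached result.
        if self.evict_on_zero:
            self.entries.pop(sql, None)     # expected: the stale entry is dropped
        return execute(sql)                 # suspected bug: the stale entry stays behind


def simulate(evict_on_zero, distinct_queries):
    """Cache many distinct queries, then try to flush each with cachedwithin=0."""
    cache = ToyQueryCache(evict_on_zero)
    execute = lambda sql: [("row for", sql)]    # fake database call
    for i in range(distinct_queries):
        sql = f"SELECT * FROM orders WHERE id = {i}"
        cache.run_query(sql, cachedwithin=60, execute=execute)
        cache.run_query(sql, cachedwithin=0, execute=execute)
    return len(cache.entries)


if __name__ == "__main__":
    print("entries left, suspected behaviour:", simulate(False, 1000))
    print("entries left, expected behaviour: ", simulate(True, 1000))
```

In this model the suspected behaviour leaves one entry behind per distinct query, so memory use grows with the number of distinct queries, while an app that never uses query caching would see nothing at all. That would match the "absent for some, present for others" pattern, if the guess is right.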
Just to quickly document another snafu we’ve just encountered switching to this release. As mentioned we’ve been using the 22.214.171.124-SNAPSHOT in production since its release in October because it solved our memory issues, but now that 126.96.36.199 is officially released we want to be back on the stable release update channel.
This should have been straightforward being the same version, but after switching .lco files to the latest in one of our instances and restarting, we found that request execution times were terrible. No memory problems or stuck requests as before, but high CPU and very long running requests which eventually timed out.
This server has a lot of different applications running on it, some constantly busy, others only accessed at certain times. After 10 minutes or so things seemed to settle down, but when a new application was accessed for the first time, the same thing started again: CPU thrashing, long requests and timeouts (the logs were full of Java thread death events).
Eventually I found a solution: clearing the WEB-INF\cfclasses folder before the first request prevented the issue, and the app loaded normally with none of the symptoms described.
An edge case of “side”-grading, but just in case anyone else is in the same situation.
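For anyone wanting to script the cfclasses clean-up described above, here is a minimal Python sketch. The function name `clear_cfclasses` and the example path are hypothetical, and it assumes it is run while the server is stopped or at least before the application receives its first request. It empties the folder but leaves the folder itself in place.

```python
import shutil
from pathlib import Path


def clear_cfclasses(web_inf):
    """Delete everything inside <web_inf>/cfclasses and return the number of
    top-level items removed. Returns 0 if the folder does not exist."""
    cfclasses = Path(web_inf) / "cfclasses"
    if not cfclasses.is_dir():
        return 0
    removed = 0
    for child in cfclasses.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)    # compiled classes are grouped in sub-folders
        else:
            child.unlink()          # stray files or symlinks at the top level
        removed += 1
    return removed


# Example call (hypothetical path, adjust to your own web context):
# clear_cfclasses(r"D:\sites\example\WEB-INF")
```

The classes in that folder are compiled output, so in our case they were simply regenerated on the next request.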
I’ve been doing some load testing on my server using Webserver Stress Tool 8 with simulations of 45 users and a random number of clicks per user over a 20 minute period.
The webserver is running:
Win 2012 R2
4 X6560 Xeon Processors / 8GB Ram
For testing purposes max heap was set to 2GB
Running Mura 7 connecting to mySQL database
The site I’m testing ran at around a 13% average heap and 8-10% non-heap on version 188.8.131.52. It also ran at about 13% heap on 184.108.40.206 but I saw a gradual increase of the non-heap (about 1 - 2% over a 24 hr period). There haven’t been any code changes to the site.
When the simulation first starts, the Java heap begins to climb; Lucee appears to reclaim much of the heap, but as the test continues the heap climbs and Lucee doesn’t reclaim more than about 20% of the heap (sometimes as little as 2%). At the beginning of each test Lucee reclaims about 20% of the heap. During each test, CPU was fluctuating between 66% and 99%.
On average, at the end of each 20 minute test heap was at 75-80%. When I started the initial test I had a 3% heap and 3% non-heap (non-heap climbs to about 10% and stays steady around 10% throughout the duration of the tests). If I re-initiated the test, there was an initial reclaiming of about 20% of the heap. However, if I increase the duration of the test to 40 minutes, the server eventually becomes unresponsive with a 95% heap, 10% non-heap and CPU railing at 99%.
I ran the test on the same server on a simple static site with no database connections, and there was nominal heap growth and the heap was fully reclaimed.
I think this leans towards the hypothesis of cached queries not being released…
We are still stuck on ColdFusion because of this little beast: https://luceeserver.atlassian.net/browse/LDEV-1147
There was a workaround suggested, which unfortunately didn’t work. I would really appreciate it if you could find the time to dig into this.
Thanks and a happy and successful new year!
Testing on Lucee 220.127.116.11-SNAPSHOT and things look good. Heap is actively reclaimed and non-heap is not climbing above 12%.
Hi @phal0r. That ticket is being investigated, so keep an eye out for updates.
Thanks for that info @kuc_its. Helpful for us (and for the community).
thanks for the heads up. Will do that
Unfortunately it appears I may have been premature in my assumption that this issue has been resolved.
See my post here - Lucee 5.2.x Java Heap Issues
@phal0r Can you please see my comments at https://luceeserver.atlassian.net/browse/LDEV-1147 and see if you can add something that will help me get the SQL code to work in a SQL client? Thank you.