Very high Old Gen usage

My app is a data processing pipeline: it matches different pieces of data to each other, does some cleaning and transforms, and loads and saves all of this data from/to a database, with a fair amount of JSON (de)serialization. I can go into more detail if it turns out to be useful.

My problem is that this seemingly simple series of pipeline stages ends up consuming 18-24 GB of RAM and never drops below 18 GB once it has run. Eventually I hit 100% CPU as the GC can’t reclaim anything. My worry is that I have a fundamental misunderstanding of some aspect of CFML that is causing this memory leak.

The structure of all of my components and methods is the same: I declare a local variable ( var v = {} ) and store all data specific to that method in it. There are application scope lookups that are populated when the app starts, and a few of them change slightly over the course of execution, but mostly they are fixed. Some data ends up in the request scope to be shared by different processes, always locked when I need to write something from a multi-threaded block (like a closure that can be made parallel). But nothing heavy, mostly outcome codes and some data for a UI.
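To make that pattern concrete, here is a minimal sketch of what one of these methods roughly looks like. The component, the method name, the `application.lookups.statusCodes` lookup, the `request.outcomes` array, and the lock name are all made up for illustration; this is not my real code, just the shape of it.

```cfml
// RecordMatcher.cfc (hypothetical name, illustration only)
component {

    public struct function matchRecords( required array records ) {
        // All method-specific data lives in one var-scoped struct.
        var v = {};

        // Lookup populated at application start (assumed name and shape).
        v.lookup = application.lookups.statusCodes;

        // Shared, lightweight outcome data kept in the request scope.
        if ( !structKeyExists( request, "outcomes" ) ) {
            request.outcomes = [];
        }

        // The second argument (true) asks Lucee to run the closure in parallel.
        // Note that the closure reads v from the enclosing method.
        arguments.records.each( function( rec ) {
            var code = v.lookup[ rec.status ] ?: "UNKNOWN";

            // Writes to the shared scope from a parallel block are always locked.
            lock name="requestOutcomes" type="exclusive" timeout="5" {
                arrayAppend( request.outcomes, code );
            }
        }, true );

        return { processed: arrayLen( arguments.records ) };
    }

}
```

The sketch assumes each record is a struct with a `status` key. The part I am least sure about is the closure: it refers to `v` from the surrounding method, and I do not know whether that affects when `v` becomes collectable.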

As I understand GC, both the local and request variables should be collected. I know there’s a time delay before something is considered collectable, but I’m sitting at 30+ minutes of no execution and still seeing 18 GB in the old gen. Some of my components are cached by framework/1, some are invoked directly. I cannot even get a starting point of which code is causing all of this uncollectable data.

I’ve installed FusionReactor, but I can’t make sense of its output. When the profiler does point to my code (as opposed to the underlying Java), there’s no method name or line number. The Heap Histogram is a huge list of Java classes that I can’t map back to my code.

So what I’m wondering is whether there’s a memory management concept I don’t know about. For example, do variables in a persistent scope never get collected? How do I know when variables are passed by reference, so that a pointer to them might stay alive? Are there any special considerations for closures? Any help with FusionReactor is also greatly appreciated, as their onboarding documentation is not great. Everything I’ve found about memory management in ColdFusion is about GC settings and heap size, not about the code itself, so I’ll take anything on the subject.

OS: Windows Server 2022 Datacenter
Java Version: 11.0.17 (Eclipse Adoptium) 64bit
Tomcat Version: 9.0.68
Lucee Version: 5.4.3.2

Let me try some more specific questions that might help me and others with core concepts:

  1. Is there any difference in garbage collection between methods that are (cf)invoked vs ones that are cached in the application (or any other persistent) scope?
    a. Variables within both are scoped using var v = {}; v.someVar = "thing";
  2. Is it kosher to reuse a variable, or does GC see that as more frequent use and promote it to old gen? And is there any GC difference between the following, where I reinitialize every time vs reuse?
    a. <cfset var v = 1 />, <cfset var v = 2 />, etc
    b. <cfset var v = 1 />, <cfset v = 2 />
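
To spell out what I mean in question 1, here are the two calling styles side by side. This is only a sketch: `pipeline.RecordMatcher`, `matchRecords()`, and the `application.recordMatcher` key are hypothetical names, and `records` stands in for real input.

```cfml
<cfscript>
    // Placeholder input; real data comes from the database.
    records = [];

    // (1) Component created fresh for each call and then discarded.
    matcher = new pipeline.RecordMatcher();
    resultA = matcher.matchRecords( records );

    // (2) Component created once, kept in the application scope, and reused.
    if ( !structKeyExists( application, "recordMatcher" ) ) {
        application.recordMatcher = new pipeline.RecordMatcher();
    }
    resultB = application.recordMatcher.matchRecords( records );
</cfscript>
```

In both cases the method itself only uses var-scoped data, so my assumption is that the locals should be collectable after the call returns either way. That assumption is exactly what I would like confirmed or corrected.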