UDF Performance: Include (udf.cfm) vs Application Singleton (udf.cfc)

I have a legacy app that I’m finally bringing into the 21st Century. It uses a bunch of UDFs, mostly from cflib.org. Never mind for now that I need to pick through all of those and either update them or replace them with more modern solutions. I’m well aware. :wink:

The point of this topic is that I assumed moving all those UDFs into the Application scope would maximize performance, since they would be loaded into memory once in onApplicationStart() instead of on every Request. But, as has been discussed before, doing so appears to decrease performance for some reason.

What hasn’t been shared before is a comparison using simple timers, which shows slightly longer run times when UDFs are called from the Application scope. My first thought was that it must be because I was testing with just a very simple function, but I got the same results after beefing up both udf.cfm and udf.cfc (see below) with more than 2400 lines of code from the cflib functions.

Is there something incorrect about my test code? Is there caching of my udf.cfm that is just as good as Application scope caching? Or is it simply because all .cf* files are compiled into Java binaries?

It’s not much of a difference, but I’m curious to know the reason why. And the whole point was to improve performance, so if that’s not happening then should I even bother moving the UDFs into Application scope?

Application.cfc

component {
	function onApplicationStart() {
		// Instantiate User Defined Functions (UDF) as Singletons so they're cached for the entire Application
		Application.udf = new udf();
	}
}

udf.cfm

<cfscript>
function hello() {
	echo('hello from included udf.cfm!');
}
</cfscript>

udf.cfc

component {
	function hello() {
		echo('hello from Application.udf!');
	}
}

timer-udf-app.cfm

<cfscript>
Application.udf.hello();
</cfscript>

timer-udf-include.cfm

<cfscript>
include "udf.cfm";
hello();
</cfscript>

timer-udf-include-vs-app.cfm

<cfscript>
// cfhttp() is used to simulate multiple Requests
echo('udf via app');
timer type="outline" {
	loop times=1000 {
		cfhttp(method="get" url="http://localhost/timer-udf-app.cfm");
	}
}
echo('udf via include');
timer type="outline" {
	loop times=1000 {
		cfhttp(method="get" url="http://localhost/timer-udf-include.cfm");
	}
}
/*
Test Results With Simple UDFs:
 udf via app 2651ms
 udf via include 2515ms

Test Results When Using Large UDFs:
 udf via app 2792ms
 udf via include 2624ms
*/
</cfscript>

OS: Pop!_OS (Ubuntu fork) 21.10
Java Version: 11.0.6
Tomcat Version: 9.0.31
Lucee Version: 5.3.9.141

Sorry, I don’t understand.
“appstop.cfm” is invoked on each request?

I enabled nano timing (createObject("java", "java.lang.System").nanoTime();) and ran the same tests inline, comparing an application-scoped CFC with an inline UDF (equivalent to a UDF CFIncluded once into a global request/form/url scope). The results were interesting (and saddening, since I’m moving UDFs to an application-scoped variable too). I chose not to incur the overhead of an external CFHTTP connection or to grow the output buffer by using echo (or writeOutput, when writing cross-compatible CFML).

With the CFML below (run on Adobe ColdFusion 2021, sorry), calling a function on an application-scoped CFC can take almost twice as long. (Please retest this on Lucee and see what the results are.)

if (!isDefined("application.hello")){
	application.hello = new core.hello();  //This is the same function as the inline Hello UDF function below.
}
function hello() {
	return 'hello';
}

cf_timer(type="outline", label="global udf", nano=1) {
	for (i = 1; i <= 10000; i++) {
		a = hello();
	}
}
cf_timer(type="outline", label="application cfc", nano=1) {
	for (i = 1; i <= 10000; i++) {
		a = application.hello.hello();
	}
}

NOTE: To measure performance using nanoTime, use my CFTimer CFTag. (If not, feel free to rewrite the above CFML to use getTickCount() instead.)
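For anyone without the custom tag, the same comparison can be sketched with getTickCount(), as suggested above. This is only a rough sketch: it assumes the same core.hello component from the test above is available, and the variable names (iterations, udfMs, cfcMs) are made up for illustration.

```cfml
<cfscript>
// Inline UDF, same as in the test above.
function hello() {
	return 'hello';
}

// Assumption: core/hello.cfc exists, as in the test above, and exposes hello().
if (!structKeyExists(application, "hello")) {
	application.hello = new core.hello();
}

iterations = 10000;

// Time the inline (global) UDF.
start = getTickCount();
for (i = 1; i <= iterations; i++) {
	a = hello();
}
udfMs = getTickCount() - start;

// Time the same function called through the application-scoped CFC.
start = getTickCount();
for (i = 1; i <= iterations; i++) {
	a = application.hello.hello();
}
cfcMs = getTickCount() - start;

writeOutput("global udf: #udfMs# ms<br>application cfc: #cfcMs# ms");
</cfscript>
```

getTickCount() only resolves to milliseconds, so with 10,000 iterations some runs may report 0 ms. Raising the iteration count should give steadier numbers.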

Times

UDF         CFC         Unit
25          31          ms
15          16          ms
0           31          ms
16          16          ms
0           16          ms
0           344         ms
13244700    29579500    ns
12401000    24100000    ns
9051600     16483900    ns
12353100    25154100    ns
6400400     20242800    ns
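
To put the nanosecond rows in per-call terms, the averages can be worked out with a few lines of cfscript. This is just arithmetic on the figures listed above, assuming each run made 10,000 calls as in the test loop; the variable names are illustrative.

```cfml
<cfscript>
// Nanosecond timings copied from the table above (10,000 calls per run).
udfNs = [13244700, 12401000, 9051600, 12353100, 6400400];
cfcNs = [29579500, 24100000, 16483900, 25154100, 20242800];
calls = 10000;

udfAvg = arrayAvg(udfNs);
cfcAvg = arrayAvg(cfcNs);

writeOutput("udf avg per call: #round(udfAvg / calls)# ns<br>");
writeOutput("cfc avg per call: #round(cfcAvg / calls)# ns<br>");
writeOutput("cfc / udf ratio: #round(cfcAvg / udfAvg * 100) / 100#");
</cfscript>
```

By that arithmetic the gap is a little over one microsecond per call (roughly 1.07 µs versus 2.31 µs), so the ratio looks large while the absolute cost per call stays small.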

No that was just to demonstrate that I was refreshing the Application scope before each test as I was developing my test code.

Okay, but then is your test performing a proper comparison per request?

And what if the UDF library is much larger than the simple hello() function?

I’m not certain that my testing methodology is 100% correct, but I’m satisfied with my use of Lucee’s timer command. I’ve never had to add my IP address to a list of debugging IP addresses to use it, and I don’t need nanosecond precision when milliseconds are already more than sufficient.

I also have no interest in writing cross-compatible “CFML” (actually cfscript … I only use the ugly old “markup language” tags when mixed in with HTML presentation) because:

  1. I’m never going back to Adobe ColdFusion.
  2. echo() is way more elegant than the cumbersome writeOutput().
  3. This is, after all, the Lucee forum. :stuck_out_tongue:

Regardless, your test has confirmed what mine has indicated, and I appreciate you taking the time to share your results and your equal concern about how apparently moving UDFs into the Application scope not only doesn’t improve performance, but surprisingly decreases performance slightly.

So my questions remain:

Anyone know the answers?

Sorry… I’ve been professionally developing CFML applications since “Cold Fusion 3” (back in '99) and haven’t had the luxury of ever being able to write Lucee-only CFML. I’m attempting to port my many CFinclude-based UDF libraries to Lucee, but haven’t been successful yet. Many Lucee contributors release CFML libraries that work on both platforms, so sharing cross-compatible CFML for testing purposes isn’t something that Lucee developers all ignore by proxy.

Regarding “proper comparison”, your sample CFML performs a blind CFHTTP GET without any follow-up comparison. Are you even sure that the response returns a 200 OK and the expected payload?
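
A minimal sketch of such a check, assuming the URL and the expected text from the original test files (the result variable name res is arbitrary):

```cfml
<cfscript>
// URL taken from the original test; adjust host/port as needed.
cfhttp(method="get", url="http://localhost/timer-udf-app.cfm", result="res");

// statusCode is a string such as "200 OK"; val() extracts the leading number.
if (val(res.statusCode) != 200) {
	throw(message="Unexpected status: #res.statusCode#");
}

// Confirm the body contains the text that udf.cfc's hello() is expected to output.
if (!findNoCase("hello from Application.udf!", res.fileContent)) {
	throw(message="Unexpected payload");
}

writeOutput("ok");
</cfscript>
```

Running this once before the timed loops would at least rule out timing 404s or error pages.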

The “per request” performance didn’t interest me as much as the raw performance regarding reuse of a non-Application-scoped function versus a CFIncluded UDF (without the added CFHTTP overhead). The UDF & method I used returned a value, it just didn’t blindly output it into the output buffer. (I try not to write functions that do this.) We currently use lots of UDFs to sanitize & normalize data (using 3rd party libraries like jsoup & junidecode) and I’d rather use the method that’s the most performant on all CFML server platforms. “Why” is not as important to me as “which” method developers should consider using going forward for best performance.

And I started in '97 with an Allaire Cold Fusion version I don’t even remember, coding for Intel, but hey, this isn’t a contest. :wink:

But I must say Lucee isn’t a luxury for me. It was the choice I made in my full migration to open source, and of course it’s saved me lots of $$$, which is important for a sole proprietor. In fact, from the money perspective, Adobe is the luxury.

I would think the per request issue is at the very heart of the matter, because it’s all about whether the UDFs in Application scope are indeed cached in requests subsequent to the application start, as opposed to the cfincluded UDFs which I previously assumed were freshly interpreted with every request, but now it’s looking like the compiled Java binaries make that insignificant.
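
One way to check whether the same Application-scoped instance really is reused across requests is to print its Java identity hash on each request. This is only a sketch, assuming Application.udf has been created in onApplicationStart() as in the original post; the file name is hypothetical.

```cfml
<cfscript>
// check-udf-instance.cfm (hypothetical file name)
// identityHashCode() stays the same for as long as the same Java object is in use,
// so an unchanged value across requests suggests the singleton is being reused.
id = createObject("java", "java.lang.System").identityHashCode(Application.udf);
echo("Application.udf identity: #id#");
</cfscript>
```

If the number changes between requests, the component is being recreated somewhere. If it stays the same, the instance is cached, and the timing difference has to come from somewhere else.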

Whether theoretical or simply practical, I appreciate both perspectives, and the bottom line is we’re both waiting for the same answers.

Lucee 6 supports nano for cftimer; just be aware it’s not super reliable on Windows.

https://luceeserver.atlassian.net/browse/LDEV-3250

@Zackster do you have any insight re: my original topic here?