UDF Performance: Include (udf.cfm) vs Application Singleton (udf.cfc)

LionelHolt · July 3, 2022, 1:05am

I have a legacy app that I’m finally bringing into the 21st Century. It uses a bunch of UDFs, mostly from cflib.org. Never mind for now that I need to pick through all those and either update them or replace them with more modern solutions. I’m well aware.

The point of this topic is that one of the things I assumed I needed to do for maximum performance is to move all those UDFs into the Application scope so they’re only loaded once into memory onApplicationStart() instead of with every Request, but as has been discussed before, for some reason that appears to introduce a performance decrease.

What hasn’t been shared before is a comparison of simple timers which indicate the slightly longer run times when UDFs are called from the Application scope. My first thought was that it must be because I was testing with just a very simple function, but I got the same results after beefing up both udf.cfm and udf.cfc (see below) with more than 2400 lines of code from the cflib functions.

Is there something incorrect about my test code? Is there caching of my udf.cfm that is just as good as Application scope caching? Or is it simply because all .cf* files are compiled into Java binaries?

It’s not much of a difference, but I’m curious to know the reason why. And the whole point was to improve performance, so if that’s not happening then should I even bother moving the UDFs into Application scope?

Application.cfc

component {
	function onApplicationStart() {
		// Instantiate User Defined Functions (UDF) as Singletons so they're cached for the entire Application
		Application.udf = new udf();
	}
}

udf.cfm

<cfscript>
function hello() {
	echo('hello from included udf.cfm!');
}
</cfscript>

udf.cfc

component {
	function hello() {
		echo('hello from Application.udf!');
	}
}

timer-udf-app.cfm

<cfscript>
Application.udf.hello();
</cfscript>

timer-udf-include.cfm

<cfscript>
include "udf.cfm";
hello();
</cfscript>

timer-udf-include-vs-app.cfm

<cfscript>
// cfhttp() is used to simulate multiple Requests
echo('udf via app');
timer type="outline" {
	loop times=1000 {
		cfhttp(method="get" url="http://localhost/timer-udf-app.cfm");
	}
}
echo('udf via include');
timer type="outline" {
	loop times=1000 {
		cfhttp(method="get" url="http://localhost/timer-udf-include.cfm");
	}
}
/*
Test Results With Simple UDFs:
 udf via app 2651ms
 udf via include 2515ms

Test Results When Using Large UDFs:
 udf via app 2792ms
 udf via include 2624ms
*/
</cfscript>

OS: Pop!_OS (Ubuntu fork) 21.10
Java Version: 11.0.6
Tomcat Version: 9.0.31
Lucee Version: 5.3.9.141

Roberto_Marzialetti · July 3, 2022, 7:12am

Sorry, I don’t understand.
“appstop.cfm” is invoked on each request?

Jamo · July 3, 2022, 5:07pm

I enabled nano timing (createObject("java", "java.lang.System").nanoTime();) and performed the same tests inline using an application-based CFC and an inline UDF (the same as a one-time CFIncluded UDF to a global request/form/url scope) and the results were interesting (and saddening since I’m moving UDFs to an application-scoped VAR too). I chose not to incur the overhead of an external CFHTTP connection or increase the output buffer by using echo (or writeOutput when writing cross-compatible CFML).

Using the CFML below (using Adobe ColdFusion 2021, sorry), using an application CFC function can take almost twice as long to perform. (Please retest this using Lucee and see what the results are.)

if (!isDefined("application.hello")){
	application.hello = new core.hello();  //This is the same function as the inline Hello UDF function below.
}
function hello() {
	return 'hello';
}

cf_timer(type="outline", label="global udf", nano=1) {
	for (i = 1; i <= 10000; i++) {
		a = hello();
	}
}
cf_timer(type="outline", label="application cfc", nano=1) {
	for (i = 1; i <= 10000; i++) {
		a = application.hello.hello();
	}
}

NOTE: To measure performance using nanoTime, use my CFTimer CFTag. (If not, feel free to rewrite the above CFML to use getTickCount() instead.)
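For anyone without the custom tag, the same comparison can be sketched with getTickCount(), which exists on both Lucee and Adobe ColdFusion. This is only a minimal sketch under a few assumptions: it reuses the core.hello component path from the snippet above, the variable names (iterations, startMs, udfMs, cfcMs) are made up for illustration, and millisecond resolution is coarse enough that very short runs will often report 0.

```cfml
<cfscript>
// Assumes core/hello.cfc exists and exposes hello(), as in the snippet above.
if (!structKeyExists(application, "hello")) {
	application.hello = new core.hello();
}

// Inline UDF, equivalent to one defined in a cfincluded file.
function hello() {
	return 'hello';
}

iterations = 10000; // hypothetical loop count, same as the test above

// Time the plain UDF call.
startMs = getTickCount();
for (i = 1; i <= iterations; i++) {
	a = hello();
}
udfMs = getTickCount() - startMs;

// Time the application-scoped CFC method call.
startMs = getTickCount();
for (i = 1; i <= iterations; i++) {
	a = application.hello.hello();
}
cfcMs = getTickCount() - startMs;

writeOutput("global udf: #udfMs# ms<br>application cfc: #cfcMs# ms");
</cfscript>
```

Running the two loops several times, and in both orders, helps separate a real difference from warm-up and timer noise.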

Times

UDF	CFC	Unit
25	31	ms
15	16	ms
0	31	ms
16	16	ms
0	16	ms
0	344	ms
13244700	29579500	ns
12401000	24100000	ns
9051600	16483900	ns
12353100	25154100	ns
6400400	20242800	ns

LionelHolt · July 3, 2022, 8:08pm

No, that was just to demonstrate that I was refreshing the Application scope before each test as I was developing my test code.

LionelHolt · July 3, 2022, 9:33pm

Okay, but then is your test performing a proper comparison per request?

And what if the UDF library is much larger than the simple hello() function?

I’m not certain that my testing methodology is 100% correct, but I’m satisfied with my use of Lucee’s timer command. I’ve never had to add my IP address to a list of debugging IP addresses to use it, and I don’t need nanosecond precision when milliseconds are already more than sufficient.

I also have no interest in writing cross-compatible “CFML” (actually cfscript … I only use the ugly old “markup language” tags when mixed in with HTML presentation) because:

I’m never going back to Adobe ColdFusion.
echo() is way more elegant than the cumbersome writeOutput().
This is, after all, the Lucee forum.

Regardless, your test has confirmed what mine has indicated, and I appreciate you taking the time to share your results and your equal concern about how apparently moving UDFs into the Application scope not only doesn’t improve performance, but surprisingly decreases performance slightly.

So my questions remain:

Anyone know the answers?

Jamo · July 3, 2022, 11:50pm

Sorry… I’ve been professionally developing CFML applications since “Cold Fusion 3” (back in '99) and haven’t had the luxury of ever being able to write Lucee-only CFML. I’m attempting to port my many CFinclude-based UDF libraries to Lucee, but haven’t been successful yet. Many Lucee contributors release CFML libraries that work on both platforms, so sharing cross-compatible CFML for testing purposes isn’t something that Lucee developers all ignore by proxy.

Regarding “proper comparison”, your sample CFML performs a blind CFHTTP GET without any follow-up comparison. Are you even sure that the response returns a 200 OK and the expected payload?
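One way to make that check explicit is to capture the cfhttp result once before the timed loop and verify it. This is a rough sketch, not the original test: the variable name res is hypothetical, the URL and expected text are taken from the test files earlier in the thread, and val() is used on statusCode because that field is normally a string such as "200 OK".

```cfml
<cfscript>
// Fetch the test page once and keep the response in a local struct.
cfhttp(method="get", url="http://localhost/timer-udf-app.cfm", result="res");

// val() reads the leading number from a status string like "200 OK".
if (val(res.statusCode) != 200) {
	throw(message="Unexpected status: #res.statusCode#");
}

// Confirm the body contains the text that udf.cfc is expected to echo.
if (!find("hello from Application.udf!", res.fileContent)) {
	throw(message="Response did not contain the expected payload");
}

writeOutput("Response OK, safe to start the timed loop");
</cfscript>
```

Doing the check outside the timed loop keeps the verification cost out of the measurement.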

The “per request” performance didn’t interest me as much as the raw performance regarding reuse of an Application-scoped function versus a CFIncluded UDF (without the added CFHTTP overhead). The UDF & method I used returned a value, it just didn’t blindly output it into the output buffer. (I try not to write functions that do this.) We currently use lots of UDFs to sanitize & normalize data (using 3rd party libraries like jsoup & junidecode) and I’d rather use the method that’s the most performant on all CFML server platforms. “Why” is not as important to me as “which” method developers should consider using going forward for best performance.

LionelHolt · July 4, 2022, 12:20am

And I started with Allaire Cold Fusion version I don’t even remember in '97 coding for Intel, but hey this isn’t a contest.

But I must say Lucee isn’t a luxury for me. It was the choice I made in my full migration to open source, and of course it’s saved me lots of $$$, which is important for a sole proprietor. In fact, from the money perspective, Adobe is the luxury.

I would think the per request issue is at the very heart of the matter, because it’s all about whether the UDFs in Application scope are indeed cached in requests subsequent to the application start, as opposed to the cfincluded UDFs which I previously assumed were freshly interpreted with every request, but now it’s looking like the compiled Java binaries make that insignificant.

Whether theoretical or simply practical, I appreciate both perspectives, and the bottom line is we’re both waiting for the same answers.

Zackster · July 4, 2022, 1:48pm

Lucee 6 supports nano for cftimer; just be aware it’s not super reliable on Windows.

https://luceeserver.atlassian.net/browse/LDEV-3250

LionelHolt · July 4, 2022, 5:29pm

@Zackster do you have any insight re: my original topic here?