Built-in functions vs member functions?

This was mentioned in passing in another thread but I thought it was worth discussing in its own thread.

CFML today has a lot of built-in functions with names of the form _typeName_Something() and recently it added many of those as member functions on objects of the appropriate type.

For Lucee, would it be a worthwhile cleanup to remove those built-in functions, leaving just the member functions?

That would remove most (all?) of the array*(), struct*(), query*(), image*() functions (and perhaps some of the xml*() functions too?).

Removing the list*() functions and file*() functions would be tricky in several cases: the former are based on strings (so s.list*() member functions could replace them) and the latter are mostly based on strings (but having string member functions is “wrong” for file operations).

3 Likes

Although I favor member functions, when I tested performance they were ~10% slower than the headless functions. That alone made me stick with the headless functions.

A reasonable concern. Perhaps @micstriit can comment on performance, and on whether he believes the overhead could be reduced to a negligible difference?

For Lucee lang I’m in favour of removing typeNameSomething() built-in functions and using member functions as the only option (provided there’s no performance overhead as noted above). I think s.list*() is a good solution for list functions as well.

On a related note - perhaps this should be a separate topic - but has anyone discussed namespacing/packaging some (or all) of the other global functions? Math.abs(), File.open(), etc. That might make sense for many of the global built-in functions that aren’t related to data types, but it would also depend on what functions are left hanging around. It could be a difficult cleanup :smile:

2 Likes

That was going to be my next thread, depending on how this thread goes… :smile:

1 Like

Haha :slight_smile: (And perhaps there’s also Java methods to consider…)

To continue the discussion in smaller steps perhaps I should say why I’m leaning towards preferring member functions.

The first reason for me is for Lucee lang to be providing positive steps forward from “plain CFML” (for want of a better term). Developers could choose to stop using the typeNameSomething() built-in functions and instead use the member functions only, and some may already be doing this with good success. So my thinking is that Lucee lang should not only “encourage” it but enforce it.

The second reason is developer familiarity/appeal. When you learn most languages you’re often taught that “too many globals are bad”. For Lucee lang to appeal to a wider audience the huge number of global functions or the lack of organisation may be a turn off.

The third reason is documentation. Many developers may go looking for information about the data type they are working with (at least in most object-based languages). CFML documentation has typically just “grouped” all the functions using some kind of tag, e.g. “string functions”, but the data types themselves haven’t been given too much love. It would be nice to have the member functions documented under a data types section, and this would then clean up a lot of methods from the global functions documentation so that they become easier to browse as well.

1 Like

+1 non-financial vote for removing typeNameSomething()

If the performance concerns can be worked around, I’m all for following @seancorfield’s suggestion (and looking into bundling some of the other global functions in some sensible way).

1 Like

That is on our todo list …

2 Likes

Built-in functions can already be hard-wired by the compiler; member functions cannot, because the type is unknown at compile time.

Another option would be to set the status of these built-in functions to hidden, which would make it easier to migrate existing apps.

Am I correct that the difference is between a static method dispatch (on the page context class?) vs a dynamic dispatch (on the evaluated object)? I know you can’t get parity on that but I was surprised at @kliakos’s assertion of a 10% overhead overall.

Yep, 10% is correct, and maybe even higher. Of course I am talking about microseconds, so you have to execute a function something like 10000 times to lose a millisecond.

You can try it out too:

So it’s not worth worrying about then.


Adam

I have never measured that. What I can say is that we have added a new interface for functions, so that we don’t have to use reflection for member functions.
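For anyone wondering what "an interface instead of reflection" means in practice, here is a minimal Java sketch of the general idea. It is not Lucee's actual code: `MemberFunction`, `StringLen` and `InvokeSketch` are hypothetical names made up for illustration, and a plain `String` stands in for a CFML value.

```java
import java.lang.reflect.Method;

// Hypothetical interface, NOT Lucee's real API: one call() entry point per function.
interface MemberFunction {
    Object call(Object target);
}

// Hypothetical implementation of a "len" member function for strings.
final class StringLen implements MemberFunction {
    @Override
    public Object call(Object target) {
        return ((String) target).length();
    }
}

final class InvokeSketch {
    // Reflective path: the method is looked up by name and invoked indirectly.
    static Object viaReflection(Object target, String methodName) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Interface path: an ordinary interface method call, no lookup by name.
    static Object viaInterface(MemberFunction fn, Object target) {
        return fn.call(target);
    }
}
```

Both paths return the same value; the difference is only in how the call is reached. The sketch makes no claim about how large the saving is in Lucee itself.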

Is this included in the current version? I just tested my gists with Lucee 5 on trycf and got the same result, member functions are slower. In some cases, like isEmpty() for Strings for example, it’s almost 50% slower.

Well, this is one of those things.

I’d personally say I’d opt for language clarity and take a somewhat limited performance hit (in particular if we talk about 10% on the microsecond level).

On the other hand — before making a final decision it’d certainly be interesting to see if @micstriit can reproduce those performance penalties and see if there’s maybe a way to improve on them.

I can never hard-wire things the same way with member functions as we can with built-in functions. Take this example:
1: ArrayLen(array);
2: array.len();

In line 1 the compiler recognizes the built-in function “arrayLen” and hard-wires it in the bytecode.
But on line 2 this is impossible, because for the compiler this could be anything:

  • member function
  • component function
  • Java object method
  • web service SOAP call
  • COM object …

So only the runtime can decide what to do based on the type of the object “array”. That is the nature of a dynamic language.
Btw, this is also something Groovy does at runtime, in contrast to Java.
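To make the line 1 / line 2 difference concrete, here is a rough Java sketch of the two call shapes. It assumes nothing about Lucee's internals: `DispatchSketch`, `arrayLen` and `invokeMember` are hypothetical names, and the runtime checks cover only three receiver types for illustration.

```java
import java.util.List;
import java.util.Map;

final class DispatchSketch {
    // "Line 1" shape: the function is known at compile time,
    // so the generated code can call it directly.
    static int arrayLen(List<?> array) {
        return array.size();
    }

    // "Line 2" shape: the receiver's type is only known at runtime,
    // so something has to inspect it before it can pick an implementation.
    static Object invokeMember(Object receiver, String name) {
        if ("len".equals(name)) {
            if (receiver instanceof List) {
                return ((List<?>) receiver).size();     // array member function
            }
            if (receiver instanceof String) {
                return ((String) receiver).length();    // string member function
            }
            if (receiver instanceof Map) {
                return ((Map<?, ?>) receiver).size();   // struct member function
            }
        }
        // A real runtime would go on to try components, Java objects, etc.
        throw new IllegalArgumentException(
            "No member function '" + name + "' for " + receiver.getClass().getName());
    }
}
```

The extra type checks in `invokeMember` are the kind of work the direct call never has to do, whatever the concrete mechanism looks like in a given engine.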

Yes, the interface is already in place, though not with all functions (Lucee generates a proxy if necessary). The overhead is always the same, but when you have a method that executes very fast, like “isEmpty”, the overhead makes up a bigger percentage of the execution time.
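The arithmetic behind that can be sketched in a few lines of Java. The 100 ns per-call overhead is simply what "10000 calls to lose a millisecond" works out to; the base costs of 1000 ns and 200 ns are assumed values chosen to reproduce the 10% and 50% figures from this thread, not measurements. `OverheadSketch` is a hypothetical helper.

```java
final class OverheadSketch {
    // Relative slowdown in percent when a fixed per-call overhead
    // is added on top of a function's own cost.
    static double slowdownPercent(double baseNanos, double overheadNanos) {
        return 100.0 * overheadNanos / baseNanos;
    }

    // Total time lost, in milliseconds, over a number of calls.
    static double totalLossMillis(long calls, double overheadNanos) {
        return calls * overheadNanos / 1_000_000.0;
    }

    public static void main(String[] args) {
        double overhead = 100.0; // ns per call: 1 ms spread over 10000 calls
        System.out.println(slowdownPercent(1000.0, overhead)); // assumed slower function
        System.out.println(slowdownPercent(200.0, overhead));  // assumed very fast function
        System.out.println(totalLossMillis(10_000, overhead));
    }
}
```

With these assumed inputs the same fixed overhead shows up as a 10% slowdown on the slower function and a 50% slowdown on the fast one, while 10000 calls still cost only about a millisecond in total.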

2 Likes

Sounds pretty reasonable. You do need to make a tradeoff in performance to use member functions in a dynamic language.

It kind of bothers me that I have to use a slower function in order to have cleaner code or readability. But that’s the way it is.

1 Like

If you’d never had the top-level built-ins, only the member functions, you would have just accepted “that’s the way it is”. In Java, method calls are dynamically dispatched so they’re slower than static member function calls – but folks don’t write static member functions everywhere for extra performance.

Abstraction always has a cost but abstraction is important and nearly always worth the cost.

1 Like