Built-in functions vs member functions?

seancorfield · June 19, 2015, 10:37pm

This was mentioned in passing in another thread but I thought it was worth discussing in its own thread.

CFML today has a lot of built-in functions with names of the form typeNameSomething(), and it recently added many of those as member functions on objects of the appropriate type.

For Lucee, would it be a worthwhile cleanup to remove those built-in functions, leaving just the member functions?

That would remove most (all?) of the array*(), struct*(), query*(), image*() functions (and perhaps some of the xml*() functions too?).

Removing the list*() functions and file*() functions would be tricky in several cases: the former are based on strings (so s.list*() member functions could replace them) and the latter are mostly based on strings (but having string member functions is “wrong” for file operations).

kliakos · June 19, 2015, 10:42pm

Although I favor member functions, when I tested performance they were ~10% slower than the headless functions. This alone made me stick to the headless functions.

seancorfield · June 19, 2015, 11:06pm

A reasonable concern. Perhaps @micstriit can comment on performance, and on whether he believes the overhead could be reduced to a negligible difference?

justincarter · June 19, 2015, 11:22pm

For Lucee lang I’m in favour of removing typeNameSomething() built-in functions and using member functions as the only option (provided there’s no performance overhead as noted above). I think s.list*() is a good solution for list functions as well.

On a related note - perhaps this should be a separate topic - has anyone discussed namespacing/packaging some (or all) of the other global functions? Math.abs(), File.open(), etc. That might make sense for many of the global built-in functions that aren’t related to data types, but it would also depend on what functions are left hanging around. It could be a difficult cleanup.

seancorfield · June 19, 2015, 11:24pm

That was going to be my next thread, depending on how this thread goes…

justincarter · June 19, 2015, 11:58pm

Haha. (And perhaps there are also Java methods to consider…)

To continue the discussion in smaller steps perhaps I should say why I’m leaning towards preferring member functions.

The first reason for me is for Lucee lang to be providing positive steps forward from “plain CFML” (for want of a better term). Developers could choose to stop using the typeNameSomething() built-in functions and instead use the member functions only, and some may already be doing this with good success. So my thinking is that Lucee lang should not only “encourage” it but enforce it.

The second reason is developer familiarity/appeal. When you learn most languages you’re often taught that “too many globals are bad”. For Lucee lang to appeal to a wider audience the huge number of global functions or the lack of organisation may be a turn off.

The third reason is documentation. Many developers may go looking for information about the data type they are working with (at least in most object-based languages). CFML documentation has typically just “grouped” all the functions using some kind of tag, e.g. “string functions”, but the data types themselves haven’t been given too much love. It would be nice to have the member functions documented under a data types section, and this would then clean up a lot of methods from the global functions documentation so that they become easier to browse as well.

webonix · June 20, 2015, 12:53am

+1 non-financial vote for removing typeNameSomething()

agentK · June 20, 2015, 2:13am

If the performance concerns can be worked around, I’m all for following @seancorfield’s suggestion (and looking into bundling some of the other global functions in some sensible way).

micstriit · June 20, 2015, 4:31am

That is on our todo list …

micstriit · June 20, 2015, 4:37am

Built-in functions can already be hard-wired by the compiler; member functions cannot, because the type is unknown at compile time.

Another option would be to set the status of these built-in functions to hidden, which would make it easier to migrate existing apps.

seancorfield · June 20, 2015, 4:59am

Am I correct that the difference is between a static method dispatch (on the page context class?) vs a dynamic dispatch (on the evaluated object)? I know you can’t get parity on that but I was surprised at @kliakos’s assertion of a 10% overhead overall.

kliakos · June 20, 2015, 7:37am

Yep, 10% is correct and maybe even higher. Of course I am talking about microseconds, so you have to execute a function something like 10000 times in order to lose a millisecond.

You can try it out too:

ArrayAppend: TryCF.com
Array.Append: TryCF.com

adam_cameron · June 20, 2015, 8:04am

So it’s not worth worrying about then.

–
Adam

micstriit · June 20, 2015, 4:26pm

I have never measured that. What I can say is that we have added a new interface for functions, so that we don’t have to use reflection for member functions.

kliakos · June 20, 2015, 5:20pm

Is this included in the current version? I just tested my gists with Lucee 5 on trycf and got the same result: member functions are slower. In some cases, like isEmpty() for Strings for example, it’s almost 50% slower.

agentK · June 21, 2015, 12:57am

Well, this is one of those things.

I’d personally say I’d opt for language clarity and take a somewhat limited performance hit (in particular if we talk about 10% on the microsecond level).

On the other hand — before making a final decision it’d certainly be interesting to see if @micstriit can reproduce those performance penalties and see if there’s maybe a way to improve on them.

micstriit · June 23, 2015, 7:08am

I can never hard-wire things the same way with member functions as we can with built-in functions. Take this example:
1: ArrayLen(array);
2: array.len();

In line 1 the compiler recognizes the built-in function “arrayLen” and hard-wires it into the bytecode.
But on line 2 this is impossible, because to the compiler this could be anything:

member function
component function
Java object method
web service SOAP call
COM object …

So only the runtime can decide what to do based on the type of the object “array”. That is the nature of a dynamic language.
Btw, this is also something Groovy does at runtime, in contrast to Java.

Yes, the interface is already in place, though not for all functions (Lucee generates a proxy if necessary). The overhead is always the same, but when you have a method that executes very fast, like “isEmpty”, the overhead makes up a bigger percentage of the execution time.
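
To make the distinction above a bit more concrete, here is a rough sketch in Java (the platform Lucee runs on). It is only an illustration under stated assumptions: DispatchSketch, arrayLen, invokeLen and HasLen are made-up names, not Lucee internals, and a real runtime would have to handle many more cases (Java objects, web services, COM objects and so on).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Lucee code base.
final class DispatchSketch {

    // Stand-in for a component that defines its own len() method.
    interface HasLen {
        int len();
    }

    // "Built-in function" path: the compiler knows the exact target, so the
    // call can be emitted as a direct static invocation.
    static int arrayLen(List<?> array) {
        return array.size();
    }

    // "Member function" path: only the runtime knows what `target` is, so it
    // must inspect the type before it can decide what target.len() means.
    static int invokeLen(Object target) {
        if (target instanceof List<?>) {
            return arrayLen((List<?>) target);      // array member function
        }
        if (target instanceof String) {
            return ((String) target).length();      // string member function
        }
        if (target instanceof HasLen) {
            return ((HasLen) target).len();         // component function
        }
        throw new IllegalArgumentException("no len() available for " + target);
    }

    public static void main(String[] args) {
        List<String> array = Arrays.asList("a", "b", "c");
        System.out.println(arrayLen(array));    // direct call, prints 3
        System.out.println(invokeLen(array));   // type checks first, prints 3
        System.out.println(invokeLen("abcd"));  // prints 4
    }
}
```

The type checks in invokeLen cost roughly the same on every call, which is why they weigh more heavily on a very cheap operation than on an expensive one.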

kliakos · June 23, 2015, 9:36am

Sounds pretty reasonable. You do need to make a tradeoff in performance to use member functions in a dynamic language.

It kind of bothers me that I have to use a slower function in order to have cleaner code or readability. But that’s the way it is.

seancorfield · June 23, 2015, 2:52pm

If you’d never had the top-level built-ins, only the member functions, you would have just accepted “that’s the way it is”. In Java, method calls are dynamically dispatched so they’re slower than static member function calls – but folks don’t write static member functions everywhere for extra performance.
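
For readers less familiar with the Java distinction mentioned above, a minimal sketch follows. Animal, Dog and StaticVsVirtual are hypothetical example classes, chosen only to show the two kinds of call; nothing here measures performance.

```java
// Hypothetical example classes, used only to contrast the two kinds of call.
class Animal {
    // Instance method: the implementation is chosen from the runtime class
    // of the receiver (dynamic dispatch).
    String sound() {
        return "...";
    }

    // Static method: the target is fixed at compile time.
    static String kind() {
        return "animal";
    }
}

class Dog extends Animal {
    @Override
    String sound() {
        return "woof";
    }

    // Hides Animal.kind(); static methods are not overridden.
    static String kind() {
        return "dog";
    }
}

final class StaticVsVirtual {
    public static void main(String[] args) {
        Animal a = new Dog();
        System.out.println(a.sound());      // prints woof, picked at runtime
        System.out.println(Animal.kind());  // prints animal, picked at compile time
    }
}
```

The call a.sound() keeps working for any future subclass without changing the call site, which is the kind of abstraction the dynamic dispatch pays for.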

Abstraction always has a cost but abstraction is important and nearly always worth the cost.