"Strict" decision functions?

bennadel · July 18, 2022, 11:45am

ColdFusion has a long history of being able to seamlessly cast between String and non-String data types. Which is why the following all yield true:

isBoolean( "yes" )
isBoolean( "true" )
isBoolean( true )
isBoolean( 1 )

In many cases, this is great functionality. However, when validating data before it reaches the persistence layer, it would be nice to have access to stricter testing. I think this could either be available as an optional argument, like:

isBoolean( "yes", false ) => true
isBoolean( "yes", true ) => false

… where that second argument is a “strict mode” setting.

Or, a better option would be a series of new isStrict{type}() functions like:

isStrictBoolean() ~= isInstanceOf( value, "java.lang.Boolean" )
isStrictNumeric() ~= “???”
isStrictDate() ~= isInstanceOf( value, "java.util.Date" )
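
As a rough user-land illustration of the isStrict{type}() idea (not a proposal for the actual implementation), the checks could be sketched with isInstanceOf(). The function names are hypothetical, and using java.lang.Number for the numeric case is only an assumption, since the mapping above leaves it open. Which Java class backs a given CFML value varies by engine and version, so results may differ between Lucee and Adobe ColdFusion.

```cfml
<cfscript>
// Hypothetical helpers: none of these are built-in functions.
// Each one checks the underlying Java type instead of castability.

function isStrictBoolean( required any value ) {
    return isInstanceOf( value, "java.lang.Boolean" );
}

// Assumption: any java.lang.Number subclass counts as "strictly numeric".
function isStrictNumeric( required any value ) {
    return isInstanceOf( value, "java.lang.Number" );
}

function isStrictDate( required any value ) {
    return isInstanceOf( value, "java.util.Date" );
}

function isStrictString( required any value ) {
    return isInstanceOf( value, "java.lang.String" );
}
</cfscript>
```

With helpers like these, a value such as "yes" still passes isBoolean() but would be expected to fail isStrictBoolean(), because it is a string underneath.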

ASIDE: We already have some prior art here in terms of isNumericDate(), which is basically a lower-level validation on the isDate() BIF.

I’ve never really had to think about this before because I’ve always saved data to a Relational Database system where the DB schema basically enforces this for me. But, if I needed to store data in a Document store, or a flat-file, or something more loosey-goosey, then enforcing strict data types pushes “left” and it becomes my responsibility.

bennadel · July 18, 2022, 11:48am

Building on this idea, it would be nice to have an isString() / isStrictString() method, which we’ve never had (likely because, historically, in the early days, all SIMPLE values were strings).

AdamCameron · July 18, 2022, 5:09pm

What would they do? What’s a “strict” string as opposed to any other sort of string?

Are you meaning like isStrictString(1234) would be false cos 1234 is not "1234"?

–
Adam

AdamCameron · July 18, 2022, 5:12pm

I prefer the long version to adding more params.

Not convinced by “Strict” being the right term. I think it leans too much on “everything else is loose”, rather than focusing on what these functions actually do, which is a type check. So I first thought isTypeBoolean (etc), but not sure that’s actually an improvement.

The general idea though? Solid.

It’s really sad that CFML decided the original isBoolean etc functions actually mean canBeUsedAsABooleanSometimesGoodLuck rather than what the function name actually implies it checks.

–
Adam

bennadel · July 18, 2022, 5:14pm

Good point about string vs strictString … I was thinking in terms of “simple values”, and forgot to actually map my thinking over to strings.

Re: strict vs another term, I definitely don’t feel strongly - it was just the first one that popped to mind… I mean, right after isBooleanForRealz()

AdamCameron · July 18, 2022, 5:29pm

+1 for this

markdrew · July 18, 2022, 5:58pm

Also in favour of longer function names vs more params in this case.
I don’t think using the word strict in this case is bad though, as we have the idea of “strict type checking” vs dynamic type checking (“is it” vs “can it be”). And it’s succinct enough.

I am just thinking of the case of isValid() and how that would work, since that might need another parameter.

Also for isStrictArray() I am guessing it would barf on non CFML arrays (I am not sure what it does now tbh)

bennadel · July 18, 2022, 6:06pm

From what I saw in my testing, isValid( value, "boolean" ) has the same behavior as isBoolean(), insofar as it is loose in its casting.

I hadn’t even thought about this for non-simple values, such as Arrays and Structs. It might not apply as much there. Though, I have been bitten in the past when a non-“Map” gets returned from Java and Lucee doesn’t quite know what to do with it (or rather, that it doesn’t have member methods). Interesting thought!

AdamCameron · July 18, 2022, 7:19pm

A lot of ppl get bitten like that when they call String.split and then get uppity when they can’t call CFML member functions on the resultant String[].

Still: it would not occur to those ppl to check before they use either.

Thinking about this, perhaps we’re treating the symptom not the cause here. Do you actually want strict typing on method params? Is this how this arose?

–
Adam

bennadel · July 18, 2022, 8:16pm

This came about (in my mind) because I’m building up some complex data with Form Posts, and I want to store it with the correct data type, not a castable data-type. Basically, I want a way to do what the relational database would have done for me, at least from a validation stand-point. Meaning, if the object-path:

foo.bar.baz.isActive

… is a Boolean, I want to be able to do:

if ( ! isBoolean( foo.bar.baz.isActive ) ) {
    throw( "..." );
}

… and make sure that it’s a true/false value, and not a yes or a 1 kind of thing.

Ultimately, all form Posts are strings. And the “controller layer” needs to convert all those strings into the right data types. And then my “business layer” needs to enforce those values to make sure the controller layer didn’t mess up.

At least, that’s where I’m coming from at the moment.

bdw429s · July 18, 2022, 8:34pm

One of the things I’ve always liked about CF is generally not needing to know or care what specific Java class was being used by the “simple value” I’m working with. The few times I’ve really needed to know is when sharing this data via a format such as JSON or WDDX where having the correct type can be a requirement of the client consuming the data.

But in those cases, I don’t necessarily need to test the type, I just need to force it. I.e., if my JSON needs a proper string and not a number, then I can pass it through toString() and then it doesn’t really matter what it was because now I know what it is! Now, there aren’t necessarily great first class functions for doing this. For example, in CFConfig when writing WDDX for Adobe CF, I force numeric strings to numbers like this:

foo = mySetting + 0;

and truthy values to proper booleans like this:

bar = !!anotherSetting;

So basically, so long as the variable is floating around in my CF code, I’m more than happy for it to be some castable version of the data-- who cares!? I only need to force the type when I get to the part where I store it in a data transfer format that cares. For the sake of discussion, I’m curious if you really need a set of functions to detect the underlying type, or a set of functions to FORCE the underlying type.
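
To make the "force it, don't test it" approach concrete, here is a minimal sketch that coerces values only at the point of serialization. The variable names and sample values are hypothetical, and the exact JSON produced by serializeJSON() differs between engines and versions, so treat it as an illustration rather than a guarantee.

```cfml
<cfscript>
// Hypothetical values as they might arrive from a form post (all strings).
mySetting = "42";
anotherSetting = "yes";

// Coerce only here, at the boundary where the output format cares about types.
payload = {
    "count": mySetting + 0,         // arithmetic forces a numeric value
    "enabled": !!anotherSetting,    // double negation forces a boolean
    "label": toString( mySetting )  // toString() forces a string
};

writeOutput( serializeJSON( payload ) );
</cfscript>
```

Everywhere else in the application the values can stay in whatever castable form they arrived in.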

bennadel · July 18, 2022, 8:42pm

For the most part, I totally agree with you! In fact, I don’t worry about the data most of the time since I’m going to use <cfqueryparam> to store it in the DB, and between the query-param and the DB schema, all the conversions “just work”.

In this, I’m actually storing data in a JSON file, which is how it occurred to me that I had the wrong data type (I could see it in the physical .json file).

So, to your point, it then becomes a question of: 1) Do I coerce the values in my “business logic”? Or 2) do I enforce the values in my business logic and require my “controller logic” to do the casting and pass in the right stuff?

As I think we’re saying, historically I’ve done the former since I had so much auto-casting in place. But, since I don’t have a DB - just a .json file - this time, I’m kind of leaning towards the latter.

bdw429s · July 18, 2022, 8:55pm

I would personally wait until the part of the app that cares. i.e.

Don’t do it in the “business logic” unless the business logic needs to know or care what the actual types are.
Don’t do it in the controller unless the controller needs to know or care what the actual types are.
When the data is written out from native CF types to JSON: if the JSON needs to care, then enforce it then.

This seems like a fantastic opportunity to make a new library/module for ForgeBox which will take a JSON schema file (which is JSON itself) and then take an incoming data structure and validate/force the data over to the types required in the JSON. Honestly, that could even be a core feature of the CF engines…

serializeJSON( var=myData, schema=JSONSchema )

and then it would auto-cast types based on the schema, throwing if not possible.
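
Short of a full JSON Schema implementation, a user-land version of this "cast or throw" idea might look like the sketch below. The castToTypes() function, the "CastError" exception type, and the flat key-to-type map are all hypothetical stand-ins. A real JSON Schema is considerably richer, and nested data is not handled here.

```cfml
<cfscript>
// Hypothetical helper: casts each key of `data` to the type named in `types`,
// throwing if a key is missing or its value cannot be cast.
// `types` maps key names to "boolean", "number", or "string".
function castToTypes( required struct data, required struct types ) {
    var result = {};

    for ( var key in types ) {
        if ( ! structKeyExists( data, key ) ) {
            throw( type = "CastError", message = "Missing key: #key#" );
        }

        var value = data[ key ];

        switch ( types[ key ] ) {
            case "boolean":
                if ( ! isBoolean( value ) ) {
                    throw( type = "CastError", message = "#key# is not castable to boolean" );
                }
                result[ key ] = !!value;
                break;
            case "number":
                if ( ! isNumeric( value ) ) {
                    throw( type = "CastError", message = "#key# is not castable to number" );
                }
                result[ key ] = value + 0;
                break;
            case "string":
                if ( ! isSimpleValue( value ) ) {
                    throw( type = "CastError", message = "#key# is not castable to string" );
                }
                result[ key ] = toString( value );
                break;
            default:
                throw( type = "CastError", message = "Unsupported type for #key#: #types[ key ]#" );
        }
    }

    return result;
}

// Usage: loosely typed input goes in, coerced values come out (or it throws).
payload = castToTypes(
    { "isActive": "yes", "count": "3", "name": "Widget" },
    { "isActive": "boolean", "count": "number", "name": "string" }
);
</cfscript>
```

This reuses the loose isBoolean() / isNumeric() checks on purpose: they decide whether a cast is possible, and the coercion then fixes the type just before serialization.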

bennadel · July 18, 2022, 9:02pm

That would be more “ColdFusiony”. I could enforce the data-type in the “gateway” (ie, persistence layer), which is basically what the <cfqueryparam> and DB-schema would be doing traditionally.

I’m so used to thinking about “flat data”, that I don’t really have an instinct for how to go about managing complex, deep data. I’ll have to play around with that. I was really liking that my data persistence was basically just:

fileWrite( filePath, serializeJson( payload ), "utf-8" )

I appreciate the push-back.

AdamCameron · July 18, 2022, 9:34pm

No, it absolutely does not.

Not the job of the controller.

But sure, your model needs to know how to accept raw values from the controller and deal with them (reject validation, coerce into correct types(*)). Just not in the controller.

–
Adam

(*) what Brad says later notwithstanding… delay until you know how it needs to be used… but this is never the job of the controller.

AdamCameron · July 18, 2022, 9:44pm

Yeah, hence my question re why are we doing this? Do we need to know, or do we just need it to be a certain type.

foo = mySetting + 0;

bar = !!anotherSetting;

There’s a certain pragmatism about me (yes, sometimes I shove my dogmatism aside) that goes “yeah one could do it that way, and there’s well-established precedent”, but then I think “this is treating the symptom not the cause” (*).

I’d like to be able to go mySetting.asAFrickinIntegerLikeISaidItWas() (jury out on the method name there, but something like that), rather than monkeying around hand-cranking coercion.

But that said, that’s nowt to do with more accurate type-checking.

There’s merit in being able to go <cfargument type="anActualFrickinIntegerPlease" name="i"> or for other purposes isItActuallyAnIntegerWhyDoINeedToAskStuffLikeThis(arguments.i).

It’s not always the case (or is it?) that one then needs to go on and force that square peg into a round hole, so being able to do the check is also a thing.

–
Adam

(*) I’m mindful of the other thread I replied to where I said “this can already be done easily right now, so do we need this new feature?”

AdamCameron · July 18, 2022, 9:58pm

I think you nailed it with the ForgeBox suggestion.

There’s no need for this sort of thing to be baked into the language. It’s dead easy to do with CFML, it doesn’t need to be in CFML.

Leave Adobe(*) to do stuff that outright can’t be done in CFML now, which are basically language constructs, or methods that lie very close to the inbuilt data-types and belong there.

–
Adam

(*) oops this is a Lucee forum. Well…, yeah… suggest it to Adobe and then wait, I guess [cringe]

markdrew · July 19, 2022, 9:26am

I have actually wanted to have JSON validation in the language or as a library, but there were a few standards floating about when I last looked. Still, having a corresponding JSONValidate() function to match XMLValidate() would be super handy.

bennadel · July 19, 2022, 9:51am

Very interesting perspective. My business layer will definitely do validation based on constraints; but, it’s always assumed that it is at least getting the right types. In my mind, the controller layer is the “delivery mechanism”, which (to me) means that it is responsible for taking inputs of some sort and feeding them into the “business layer”.

So, for example, if I have an API controller that accepts payloads of type application/json, the controller layer would be responsible for parsing that JSON before pushing data down into the “business layer”. This way, the business layer doesn’t have to know that the source is a JSON payload vs. a FORM post vs. something else altogether.

Similarly, I might have a business layer method like:

deleteWidgets( required array ids )

And, in my controller layer, let’s say I have a series of checkboxes that allow the end-user to select which widgets to delete:

<input type="checkbox" name="id" value="1" /> Widget 1 <br />
<input type="checkbox" name="id" value="2" /> Widget 2 <br />
<input type="checkbox" name="id" value="3" /> Widget 3 <br />

When I go to process this form, I’d likely have a call that looks like this, in order to break the form string up into an array of values:

service.deleteWidgets( form.id.listToArray() )

Because, the .deleteWidgets() method doesn’t know that I’m using checkboxes to deliver this data. As such, it doesn’t know that a comma-delimited list would be used. And, tomorrow, maybe I change the checkboxes to use a different notation:

<input type="checkbox" name="id[]" value="1" /> Widget 1 <br />
<input type="checkbox" name="id[]" value="2" /> Widget 2 <br />
<input type="checkbox" name="id[]" value="3" /> Widget 3 <br />

… and now the form.id payload is already an array, and I can just do:

service.deleteWidgets( form.id )

The “business layer” didn’t need to change simply because the “delivery mechanism” / “controller layer” changed. Because - to me - it’s the job of the controller layer to do all that transformation.

Not sure if that helps clarify how I’m looking at the different layers.

Zackster · July 19, 2022, 9:53am

JSON extension

ValidateJson()