Support FetchSize=1 row by row database streaming and preparedstatement reuse in pure CFML

brucekirkpatrick · August 9, 2015, 5:04pm

On the lucee lang spreadsheet there was a question like: remove query data type? and everyone voted no. Instead of asking to remove it, the question should have been:

Can we retrieve other more efficient data types after executing a query without being forced to always return a massive “query” data type?

Can we get an OO database api in pure CFML that works like this please?

d=new getDatabaseConnection({datasource: "myDatasource", fetchLimit:1, closeOnError:true, closeOnRequestEnd:true});

// allow 2 ways to build a query

// method 1
d.setSQL("select * from user limit ?, ?");
d.setParams([0, 1000000]);
resultObject=d.executeQuery();
while(resultObject.hasMoreRows()){
	struct=resultObject.getNextRow();  // defaults to struct data type.
  echo(struct.id);
}
// change params and reuse of same connection and preparedstatement java objects for next query
d.setParams([1000000, 1000000]); 
resultObject=d.executeQuery();

// continue to support for..in looping to retrieve structs one row at a time.
for(struct in resultObject){
  echo(struct.id);
}

// method 2
resultObject=d.executeQuery("select * from user limit ?, ?", [0, 1000000]);  

// allow retrieving the data as a simple array instead of struct, for even more efficient data type
while(resultObject.hasMoreRows()){
	arr1=resultObject.getNextRow('array');  
        echo(arr1[1]);
}
// allow us to manually close connection
d.close();

// allow us to check connection status
if(d.isClosed()){
echo('db was closed');
}

Notice that I allow auto-close of connection as an option on the connection to simplify the code, otherwise you’d need try/catch boilerplate code on every query written this way to avoid bugs/errors causing exhaustion of all connections. Auto-close management should be left on by default, and only disabled when developer wants to manage the connection in a more complex way.

cfquery and the huge query data type it creates wastes a lot of memory and cpu time and causes a greater amount of garbage collection after each query then there would be with a smaller fetchSize. To workaround, we have to write extra code when we need to split large results into many separate queries using offset,perpage limit. For example, I often pull 30 records out at a time, instead of all of them, and this seems like an acceptable balance, but I’d rather just have more fine grained control of the data types returned like you can in pretty much every other language.

lazy="true" wasn’t a good enough solution since the heap will still run out of memory if too much data is accessed and there is no reuse of a connection allowed.

kliakos · August 11, 2015, 11:24am

IMO, the whole purpose of CFML is to avoid this verbosity.

If you want more control of underlying objects, shouldn’t you use use plain java directly?

brucekirkpatrick · August 11, 2015, 1:32pm

Name any language worth using that doesn’t have these features? Every other language I have looked at has this. We have this for fileopen/fileread already and it only helped us do more with CFML, we need it for the rest of the i/o, query, http etc.

brucekirkpatrick · August 11, 2015, 1:47pm

The other thread about member functions said the static built in functions are faster then member functions.

I wouldn’t mind if a query api operated like fileopen/fileread does by passing an object around.

queryObject=queryConnect({...}); 
if (queryHasMoreRows(queryObject)) {
  struct=queryNextRowAsStruct(queryObject);    
  arr1=queryNextRowAsArray(queryObject);    
}
queryClose(queryObject); 
etc.

kliakos · August 11, 2015, 2:05pm

If the fact that the whole query isn’t set into a variable proves to have significant performance benefits or improves memory allocation, I would use such an interface.

Othewize the “for in” loop is super simple and works just fine.

clitnak · August 13, 2015, 5:34pm

I think it would be great if query fetching one record at a time was consumed by callback. Something like (but better than):

new QueryRecordSelector("SELECT * FROM myHugeTable").each( function( record ) { 
     writeOutput("<td>#record.name#</td>");flush();
  })

That would make me happy. My 2 cents.