JTDS timeouts

taivo · August 14, 2019, 6:49am

Wondering if anyone can give me some advice on how to troubleshoot seemingly random database connection failures on Lucee 5.2.9.31 using JTDS 1.31 driver? Seems until recently it was stable but over last week have been getting timeouts. Tried testing switch to MS JDBC driver, but it doesn’t seem to like some of the Lucee query syntax in the app, and rewriting all of it to use QueryExecute (which seems to work) instead of building the query object and the code breaking on calling .getResult() will take too long. Just need to get this legacy app stable again.

This is the stacktrace, down to where it hits custom code. The queries are optimized, the tables are well indexed, and the queries always execute in under 1s. However I seem to be getting the odd 70s timeout. Anyhow, any advice appreciated!

lucee.runtime.exp.RequestTimeoutException: request /index.cfm ([app-root]\index.cfm) has run into a timeout (70 seconds) and has been stopped. at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(Unknown Source) at java.net.SocketInputStream.read(Unknown Source) at java.net.SocketInputStream.read(Unknown Source) at java.io.DataInputStream.readFully(Unknown Source) at java.io.DataInputStream.readFully(Unknown Source) at net.sourceforge.jtds.jdbc.SharedSocket.readPacket(SharedSocket.java:850) at net.sourceforge.jtds.jdbc.SharedSocket.getNetPacket(SharedSocket.java:731) at net.sourceforge.jtds.jdbc.ResponseStream.getPacket(ResponseStream.java:477) at net.sourceforge.jtds.jdbc.ResponseStream.read(ResponseStream.java:114) at net.sourceforge.jtds.jdbc.ResponseStream.peek(ResponseStream.java:99) at net.sourceforge.jtds.jdbc.TdsCore.wait(TdsCore.java:4127) at net.sourceforge.jtds.jdbc.TdsCore.executeSQL(TdsCore.java:1086) at net.sourceforge.jtds.jdbc.JtdsStatement.executeSQL(JtdsStatement.java:563) at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.execute(JtdsPreparedStatement.java:784) at lucee.runtime.type.util.QueryUtil.execute(QueryUtil.java:301) at lucee.runtime.type.QueryImpl.execute(QueryImpl.java:282) at lucee.runtime.type.QueryImpl.(QueryImpl.java:226) at lucee.runtime.tag.Query.executeDatasoure(Query.java:951) at lucee.runtime.tag.Query.doEndTag(Query.java:700) at com.gateways.bookinggateway_cfc$cf.udfCall1(/model/com/GATEWAYS/BookingGateway.cfc:518) at com.gateways.bookinggateway_cfc$cf.udfCall(/model/com/GATEWAYS/BookingGateway.cfc)

Terry_Whitney · August 14, 2019, 2:27pm

Check your request time out settings in yourserverinstallURL/lucee/admin/server.cfm?action=server.request

Zackster · August 14, 2019, 2:41pm

Are your indexes in unicode? Lucee data sources defaults to casting everything to unicode which may bypass indexes which aren’t

taivo · August 14, 2019, 3:35pm

@Zackster, thanks for this insight!

Seems my DB collation is SQL_Latin1_General_CP1_CI_AS.

In our DB, Primary Key & Foreign Key Indexes are all of type integer, and all joins are made on indexed integer columns.

Nonclustered indexes can be on non-integer columns (such as dates, which have a collation value of NULL) and varchar (which have DB collation) and INCLUDE such columns as well.

Could this be something, then?

For example:

CREATE NONCLUSTERED INDEX [index_example] ON [schema_name].[table_name]
(
[foreign_key_to_table_1] ASC,
[foreign_key_to_table_2] ASC,
[date_value_with_NULL_collation] ASC
)
INCLUDE (
[primary_key_of_table_name],
[foreign_key_to_table_3],
[foreign_key_to_table_4],
[foreign_key_to_table_5],
[date_value_with_NULL_collation],
[integer_column],
[integer_column],
[integer_column],
[integer_column],
[integer_column],
[integer_column],
[varchar_with_SQL_Latin1_General_CP1_CI_AS_collation]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

Thanks again for your help!

taivo · August 14, 2019, 3:45pm

P.S. I’m still thinking it has something to do with JTDS driver version (1.3.1) and Lucee version (5.2.9.31), and perhaps Java version (1.8.0_191_b12) as this is another error I’ve started getting intermittently now:

Catch Object:

Message: Invalid state, the Connection object is closed.
SQLState: HY010
StackTrace:
lucee.runtime.exp.DatabaseException: Invalid state, the Connection object is closed. at net.sourceforge.jtds.jdbc.JtdsConnection.checkOpen(JtdsConnection.java:1744) at net.sourceforge.jtds.jdbc.JtdsConnection.setAutoCommit(JtdsConnection.java:2276) at lucee.runtime.db.DatasourceManagerImpl.end(DatasourceManagerImpl.java:324) at lucee.runtime.db.DatasourceManagerImpl.end(DatasourceManagerImpl.java:305) at lucee.runtime.tag.Transaction.doFinally(Transaction.java:157) at com.services.externalbookingservice_cfc$cf.udfCall1(/model/com/SERVICES/ExternalBookingService.cfc:291) …

P.P.S. I’ve tried upgrading to latest stable version of Lucee and errors became more frequent.

Zackster · August 14, 2019, 4:19pm

Did u try the latest MS SQL 7 extension which was just released recently?

taivo · August 14, 2019, 4:53pm

Yes I have. I am having some Lucee syntax issues with db query code with the MS driver.

Old tag-type functions seem to work fine, and queryExecute in cfscript seems to work fine.

However building queries seems to fail consistently on the ‘getResult()’ call, and I would prefer not to have to rewrite a tonne of legacy code. But that may be something I have to do if I can’t solve this soon…

Example:

sql = ’ SELECT *
FROM [schema].[table]’ ;

// Coding method 1:
// this works on MS SQL driver
qResult = queryExecute(
sql: sql,
options: { datasource: ‘our_database_name’ },
params: {
ids: {type: ‘INTEGER’, value: arguments.ids, list=“true”},
start: {type: ‘TIMESTAMP’, value: arguments.start},
end: {type: ‘TIMESTAMP’, value: arguments.end}
}
);

// Coding method 2:
queryObj = new query();
queryObj.addParam( name=“ids”, value=“#arguments.ids#”, cfsqltype=“CF_SQL_INTEGER”, type=“IN”, list=“true” );
queryObj.addParam( name=“start”, value=“#arguments.seasonStart#”, cfsqltype=“CF_SQL_DATE”, type=“IN” );
queryObj.addParam( name=“end”, value=“#arguments.seasonEnd#”, cfsqltype=“CF_SQL_DATE”, type=“IN” );
queryObj.setDatasource( getConfigBean().getDSN() );
queryObj.setSQL( sql );
result = queryObj.execute();
// This next line fails on MS SQL driver:
qResult = result.getResult();

taivo · August 14, 2019, 4:54pm

Thanks, I don’t believe increasing the timeout setting would solve this satisfactorily, 70s wait is too long for a sub-second query.

Terry_Whitney · August 14, 2019, 6:29pm

Where is your SQL Server located?

Is it on the same host as your lucee install?

If it is, use ip address of 127.0.0.1 instead of hostname “localhost” in your db configuration
if it’s not, I would highly suggest you run ping tests between both hosts, log and compare.

If any of this is running windows, check for pending updates. Patch as necessary or at least restart your network services.

taivo · August 23, 2019, 4:47pm

Terry, no, application and database are not on the same machine.

Our network guru solved this issue, but thank-you everyone for your suggestions! More tools in my arsenal for next time.

FYI the problem was multifacted:
(1) our SQL Server operates in a Cluster, and CAU (cluster-aware-updating) had updated one server to SP2, one server to SP4.
(2) Another part of the problem was that Google’s DNS servers were included in our settings. Removed those and rebooted our DNS servers.
(3) We still have a few application-level timeouts happening, but this is where I shine so the solution is close at hand.

Thanks again!