Lock request time out period exceeded

Looking for some advice on how to get rid of DB lock timeouts.

After upgrading from Lucee 5.2.9.31 and moving from a colocation to the cloud (GCP), we are experiencing intermittent database Lock request timeouts (SQL Server 2022, Cloud SQL on GCP; all Ubuntu VMs). Message from SQL Server is “Lock request time out period exceeded.”

First couple of weeks in the cloud was flawless, then we began to experience 4/10 requests timing out, which would lead to Thread Death errors. Discovered that this was due to the default setting of locktimeout=-1 which means infinite wait for locks.

Current datasource settings that works 99% of the time with locktimeout=0 is:

    datasources["database_name"] = {
      class: "com.microsoft.sqlserver.jdbc.SQLServerDriver", 
      bundleName: "org.lucee.mssql", 
      bundleVersion: "12.4.2.jre8",
      connectionString: "jdbc:sqlserver://XXX.XXX.XXX.XXX:1433;DATABASENAME=database_name;trustServerCertificate=true;lockTimeout=0;SelectMethod=direct",
      username: "user_name",
      password: "encrypted:encrypted_password",
      // optional settings
      connectionLimit: -1, // default:-1
      liveTimeout: 15, // default: -1; unit: minutes
      storage: false, // default: false
      validate: true, // default: false
    };

We use a ‘luceesessions’ database with storage:true for our session storage. Otherwise same connection settings as above.

We use very few database locks, and have never had this issue even a single time prior to our upgrade from Lucee 5.2.9.31, Java 8, Tomcat 8. The queries it is hitting a lock timeout for do not make sense from a lock/resource contention perspective, though my next experimental step is to try to get rid of all of them… however this feels like guesswork.

Stack is below, but we began with Java 11, Lucee 6.1, and Tomcat 9. Which ran fine for a couple of weeks then slowed to a crawl with DB lock timeouts. So we reverted to Lucee 5.4 with Java 8, however we are still experiencing the same issues.

I have tried to change the locktimeout value (which is in ms) to 5000, and the requests timeout and fail almost immediately. I feel like there is some other setting I’m missing or messing up?

Don’t forget to tell us about your stack!

OS: Ubuntu 22.04
Java Version: 1.8.0_401
Tomcat Version: 9.095
Lucee Version: 5.4.6.9

p.s. I have run this to look for blocked threads right after getting this error, but it always returns zero:

SELECT @@TRANCOUNT

So to follow up here in case anyone reads this: after reverting to memory-based sessions this seems to have cleared up the locktimeout issues. We were using our primary database for session storage, which seemed to be the culprit. Will be looking at Redis or something in future, but have moved our average page response time from ~3s (with periods of 15-30s) down to ~300ms. Happy coding!