Solved: MySql Communication Link Error

Hi Everyone,

I am sure that previously I have written about exceptions that were being thrown in our application about Communication Link Error or
The last packet sent successfully to the server was XX milliseconds ago

We use MariaDB - which is a community driven fork of MySQL - after Oracle got weird about licensing.

I wanted to help future me and anyone else that might search for it - in a Lucee context - by documenting what we found to be the cause of the issue.

TL;DR;
Do not have

  • Incorrectly / inconsistent transaction nesting
  • RETURN statements “IN” transaction blocks.

Do have

  • Consistent use of transaction / try/catch blocks;
transaction {
    try {
        //doStuff
        transactionCommit();
        var result = true;
    } catch (any e) {
        //do stuff (logging / cleanup)
        transactionRollback();
        var result = false;
    }
} //End transaction

return result;

Full story;
We first noticed it after doing a new developer install with the latest Lucee and the latest extensions.
Additionally confounding the issue for us, is that our production system runs code that is basically more than 2 years older than our current source code - which has been updated along the way with cherry-picked code changes - that were done by hand, to bend the new code into working with our old platform.

So even these new code changes on our production boxes, weren’t exactly as coded in our repos / dev workstations / test servers, which makes it spectacularly difficult to ask; “Why does this work in Production but not on my local machine?”.

We started by doing all the obvious things.

  • Downgrading extensions
  • Upgrading extensions
  • Mixing and matching extensions,
  • Using the Lucee MySQL extension
  • Swapping it out for the “MariaDB” extension.
  • Using the Lucee ORM extension - various versions
  • Using the Ortus ORM extension - various versions

We changed every JDBC setting and used every combination - (that was an exhausting and very time consuming task)

On and off - we have struggled with this error for the last 2 years - which has stopped us from being able to implement our new code-base and CI/CD pipelines for managing upgrades to our production servers.

I have spent the past 4 weeks, solely working on this task and via Google, found quite a few Java based user forum posts - where they complained of the same error, while using Hibernate - where the majority of cases that I followed, were caused by incorrectly nested transactions.

I reached out to @bdw429s @michaelborn_me and @jclausen at Ortus for some help and they were extremely generous with their time and advice.

Which was initially about how more recent Hibernate versions are “stricter” and “less sympathetic” to dodgy coding practices.

So as a test - I removed ALL transactions - and ormFlush()ed after every ORM function.
This didn’t help us either.

I was also asked:

By any chance is there a return statement inside of one of your transaction blocks?
I have noticed that a return statement inside a transaction block, with Lucee, will keep the transaction open.

And we also had some discussion around our ORM / Application.cfc settings, too.
automanageSession = false,
flushAtRequestEnd = false,
Which means that we have to self-manage our transactions and ormFlush() (Because of course - we don’t use transaction consistently, for all ORM tasks - that would just be too easy!)

Where the advice back was clarification that;

ORMFlush, inside of a transaction, just flushes the SQL to the Hiberate buffer. Outside of a transaction, it will write directly to the datbase.

We also discussed the JDBC settings and the fact that Lucee uses two separate transaction managers, one for plain SQL and a separate one for Hibernate.@jclausen - gave me a collection of settings that successfully work for them in ContentBox.

Subsequently, I started focusing in on our ORM / Hibernate, code.
What I found was a lot of very inconsistent code across our application.

  • Some code used transaction {} blocks - some didn’t.
  • Some code used transactionCommit() and transactionRollback() - some didn’t.
  • Some code used a combination (commit for the positive path - nothing for the negative - or vice-versa)
  • Some code was;
transaction {
    try {
    } catch {
    }
}
  • Some was;
try {
    transaction  {
    }
} catch {
}

We have particularly problemtic process that;

  • Create a Lucee Datasource
  • Create some DB tables and populate data - directly in THIS function.
  • Via cfhttp call a Taffy API endpoint - that does more DB shenanigans
    • Within that : start a new thread and run ormReload()
  • Do more “stuff”
  • Start yet another thread, run ormReload() again.

So - now we finally have it working - by following;
Do not have

  • Incorrectly / inconsistent transaction nesting
  • RETURN statements “IN” transaction blocks.

Do have

  • Consistent use of transaction / try/catch blocks;
transaction {
    try {
        //doStuff
        transactionCommit();
        var result = true;
    } catch (any e) {
        //do stuff (logging / cleanup)
        transactionRollback();
        var result = false;
    }
} //End transaction

return result;
2 Likes

@Gavin_Baumanis really awesome writeup! Thank you for sharing!!!

1 Like

Should the return result; be outside the transaction {} block?

1 Like

Thanks for sharing. I encountered the error yesterday and got your solution with the latest newsletter. That saved me a lot of time.

1 Like

Thanks for catching the error in my write-up.
Fixed for future me and everyone else!