Https://dev.lucee.org/t/increase-in-communications-link-failure-errors/6399

Hi everyone,

I am seeing a lot of the following errors using MySQL datasources in our Prod instance.

"JDBC begin failed: ;JDBC begin failed: ;lucee.runtime.exp.NativeException: JDBC begin failed:
... Caused by: org.hibernate.TransactionException: JDBC begin failed:
... Caused by: com.mysql.cj.jdbc.exceptions.CommunicationsException: The last packet successfully received from the server was 97,851,282 milliseconds ago.
... Caused by: com.mysql.cj.exceptions.CJCommunicationsException: The last packet successfully received from the server was 97,851,282 milliseconds ago.
... Caused by: java.net.SocketException: Connection timed out (Write failed)

I have found this thread;
https://lucee.daemonite.io/t/increase-in-communications-link-failure-errors/6399

Before I went ahead and altered the global timeout, as suggested in that thread - I thought I would shoot off a message and ask if there was a “verified” solution (and reasoning) ?

(In that the thread above has the global timeout change, “I clicked do not use-SSL” and one about bundle names.)

Our PROD datasources ALL have do not use SSL - Staging uses the new driver AND SSL is true

Our prod instance is using v8.0.11 of the driver - which I know is old.
We have upgraded our staging instance to the latest, where I still the same error message on occasion.

Lastly, I read a Java related thread about the same issue and stated to use a datasource “pool”, where the pool checks for a live connection before it is offered to the app. Is this something to investigate?

Thanks in advance.

** MySQL driver : 8.0.11
OS: Amazon Linux 2 (CentOs7)
Java Version: 11
Tomcat Version:8
Lucee Version: 5.3.8.206

This is a cloud engineering issue, not a lucee issue.

You have a noisy neighbor. You would do well to provision a transactional vm and place additional caching mechanisms in your application if you have a busy app / site.

QoQ and cachedwithin are 2 of the most powerful features completely underrated by nearly every CF developer.

1 Like

@Terry_Whitney! I’m using qoq a lot already :smiley:. When I don’t have too many records in a table, but need extensive filtering for different results (e g. for creating a deeper multilevel navigation drawer). I query all the rows and save it to a components constructor variable. Then I just query the query. Feels way more natural then using fancy reduce() and map() functions with closures.

1 Like

I think altering the global timeout on your database will only change the time you need to wait before seeing this error. We run a script every 8 hours with the line below, which solved the issue for us:

<cfset dbpoolclear()>

This is addressed in [LDEV-3139] - Lucee

Simon

If I run this command during the middle of the day - will it effect currently running SQL commands / transactions?

Will my active users get an error if they’re doing a DB process?

Thanks!

I can’t speak to the details, but we have been using this function for a long time and have never seen an error.
Simon

Thanks very much for the quick reply Simon.

Have an awesome day!

Does anyone know how database pools work?
WRT to communication failures.

If I have a pool limit of 100 and a timeout of 1 minute;
How does using dbPoolClear() every 8 hours help?
If my connection timeout is 1 minute - or any value less than 8 hours - how will it assist in stopping communication errors?