Connection to DB shuts down after several hours

Hi,

All is well for a several hours - then this:

Type: database
Messages: Could not create connection to database server. Attempted reconnect 3 times. Giving up.

The past two times this happened, restarting Lucee caused the connection to reestablish - now, restarting Lucee the DB connection won’t restablish.

I have a 2012 Lucee 4.5 machine running in same environment using same user/pass and running without issues.

Any ideas?

Windows 2016, Lucee 5.2.9.31 (installed few days ago)

MySQL 5.7.2.3

this.datasources[“@@@”] = { class: ‘com.mysql.cj.jdbc.Driver’ , bundleName: ‘com.mysql.cj’ , bundleVersion: ‘8.0.15’ , connectionString: ‘jdbc:mysql://@@@:3306/@@@?useUnicode=true&characterEncoding=UTF-8&zeroDateTimeBehavior=CONVERT_TO_NULL&tinyInt1isBit=true&serverTimezone=US/Pacific&autoReconnect=true&jdbcCompliantTruncation=true&allowMultiQueries=true&useLegacyDatetimeCode=true’ , username: ‘@@@’ , password: “encrypted:@@@” // optional settings , blob:true // default: false , clob:true // default: false , connectionLimit:-1 // default:-1 , alwaysSetTimeout:true // default: false };

In trying to setup a new driver in Lucee Admin - after adding the old JDBC driver - I now cannot access the DB page in admin:

Lucee 5.2.9.31 Error (java.lang.NullPointerException)
Message java.lang.NullPointerException
Stacktrace The Error Occurred in
/admin/services.datasource.list.cfm: line 143
called from /admin/services.datasource.cfm: line 108
called from /admin/web.cfm: line 433
Java Stacktrace lucee.runtime.exp.NativeException: java.lang.NullPointerException
at lucee.runtime.db.DCStack.openConnectionsIn(DCStack.java:104)
at lucee.runtime.db.DCStack.openConnections(DCStack.java:117)
at lucee.runtime.db.DatasourceConnectionPool.openConnections(DatasourceConnectionPool.java:339)
at lucee.runtime.db.DatasourceConnectionPool.openConnections(DatasourceConnectionPool.java:348)
at lucee.runtime.tag.Admin.doGetDatasources(Admin.java:4329)
at lucee.runtime.tag.Admin._doStartTag(Admin.java:593)

Removed the entry from lucee-web.xml.cfm so I could edit the db page - tried to add a new DS and received warning about needing to run: mysqladmin flush-hosts amazon – Did so, and now all is well again.

As some background that I think is likely the issue - AWS environment - the primary IP was released from the server - there was no mapping for the primary IP after this happened, so the next interface was used when connecting to DB - all while Lucee was running. The DB started blocking requests.

Resolved by binding a new IP to the primary - rebooting.

Ok, nothing changed - server sat for several hours and now the error has appeared again.

Type: database
Messages: Could not create connection to database server. Attempted reconnect 3 times. Giving up.

Restarted Lucee and all is well.

Very much at a loss. Any help much appreciated!

This type of error is out of my league. But sometimes a lesser skilled person has something to offer to at least track down the problem. So… have you tried setting this connection up in the Lucee admin instead of in the Application.cfc file?

Really appreciate the reach out, ty.

I did actually setup the DB in the admin… latest thing I’ve tried is to set the connection timeout to 0 – the default is 1. Thing I just noticed though, in the connection string - I see the default is 1 and the unit is seconds?

this.datasources.>>>>> // optional settings , blob:true // default: false , clob:true // default: false , connectionLimit:-1 // default:-1 , connectionTimeout: // default: 1; unit: seconds , alwaysSetTimeout:true // default: false };

The “unit” part of the string isn’t there when you actually have a positive value… and it should be minutes, as per the UI. It’s actually commented out anyway but I’m assuming this is a property I can pass if I paste this into my Application.cfc

Not sure if it’s worth opening a ticket on this but by default, unit def should not be 1 second.

Any chance you can see the logs on the DB server? This could be something like your driver opening connections that then expire or don’t expire and then opening up new connections that work. Usually you get connection limit info from the logs on the DB server.

Does it work when you use a more-standard DB server that you have access to the logs on?

Hi Jordan,

I’m using RDS on AWS, just checked the logs but I’m not capturing any history - about to change!

So, the issue has gone away so I think it’s safe to assume the default unit being applied as the standing issue. one second surely was the continual connect and disconnect that caused the db server to start refusing connections over time.

What harm can it do - am I safe to leave the connectionTimeout set to zero do u think?

In answer to your Q - it’s only this machine I’ve had issues with - it’s the only one running 5.x

Damn - just happened again:
Type: database
Messages: Could not create connection to database server. Attempted reconnect 3 times. Giving up.

This was with a 0 connection timeout… Perhaps that’s the issue?

Here are some command to look at.

SHOW GLOBAL VARIABLES LIKE "%timeout%"
SHOW SESSION VARIABLES LIKE "%timeout%"
SHOW STATUS WHERE `variable_name` = 'Threads_connected';

@kabutotx - I ran those commands and compared to my dev box - all were default. Thing is, I have one App server running in this environment with no issues at all. It’s just this new App server that is being blocked.

Ok, so in the RDS logs I found that the IP of my new server was being resolved unsuccessfully, so eventually it was blocked and thus the cause of this issue. I don’t need my App Servers to be resolved by DNS - wondering if anyone has an opinion on this?

I added this to a parameter group - applied and rebooted:

skip_name_resolve=1

This appears to have resolved the issue I was having - will post back in a few days.

Wow, it happened again! I’ve opened a ticket with AWS RDS - will post back once I get some help, though any opinions here welcomed of course!

Amazon responded - I added hostname resolution to the VPC (i should have realized). I’ve restored set skip_name_resolve=0 in a parameter group.

I’m no longer seeing any resolution issues in the logs and Lucee appears to be connecting without issue.

Ok, it just happened again…

Looking at the Client Connections in MySQL Workbench - when my App Server connected - it was immediately disconnected. So the ZERO timeout - is actually a 0 timeout - it connects and disconnects immediately.

I’ve changed the timeout back to 1 - (minute) - I’m seeing a consistent connection for the App Server.

…and again… Rebooting Lucee fixes the issue.

Digging into “Aborted Connections” on MySQL

I don’t think that’s exactly correct. I’m not sure what specific “connectionTimout” attribute you’re using where, but I assume it’s this one?

https://dev.mysql.com/doc/dev/connector-net/8.0/html/P_MySql_Data_MySqlClient_MySqlConnection_ConnectionTimeout.htm

That page says ‘0’ means no timeout. So, there must be something else preventing the connection. Anything in the logs about max connections? Anything from AWS that denies the connection?

-JM

I really appreciate the response! I want to be able to run 5.x at some stage.

So there were two issues:

On the private VPC, public hostnames needed to be enabled.

Secondly, the peer connection from the Private to DMZ needed DNS resolution to be enabled - this KB was sent to me by AWS:

Although I made these changes, I never confirmed it worked, I had another major issues with 5.x - .jars that refused to find classes from other jars, even though the bundle link from Lucee Admin shows all jars were loaded. To resolve this, I’m assuming I’ll have to rebuild the jars to include all dependent jars… I’ve slated this for a future point - then I can see if the hostname resolution fixed the issue…

Yeah, sorry for the (rather extreme) delay in my response. I just didn’t notice until yesterday that you had responded. Very sorry that happened!

1 Like