PostgreSQL DB connections not being released with 42.7.5 JDBC driver

Hi there!

We recently updated our dev environment to the latest version of Lucee. In this context, we’ve also updated the underlying JVM and Tomcat versions.

Setup now:

OS: Debian 6.1.0-22-amd64
Java Version: 21.0.7 (Eclipse Adoptium) 64bit
Tomcat Version: 9.0.105
Lucee Version: 5.4.7.3
PostgreSQL extension: 42.7.5

Previous setup:

OS: Debian 6.1.0-22-amd64
Java Version: 17.0.15+6-Debian-1deb12u1 / OpenJDK Runtime Environment Temurin-11.0.22+7 (build 11.0.22+7)
Tomcat Version: 9.0.86
Lucee Version: 5.4.5.23
PostgreSQL extension: 42.7.3

Generally, the updated versions work as expected. Though we now experience one issue: Idle connections to our PostgreSQL DB don’t seem to get released anymore, so the number of connections is steadily increasing.

For testing, we’ve reduced the connection limit from 100 down to 50 and and the connection idle timeout from 5 min to 1 min. That didn’t have any effect, though.

Any idea how to fix that issue?

Sebastian

1 Like

That’s a lot of change at once, of course. I understand: ya gotta do what ya gotta do. :slight_smile:

So first, have you tried simply reverting the Postgres driver back to what it was? That would seem to have the most direct impact on the db connection management, so is worth ruling out first. Then the Lucee version. (And some will suggest you try 6 or 7, but I appreciate that may seem to be “too much work” at this point.)

Second, it’s not clear: did anything about Postgres itself change? Or the server on which it runs, if it’s separate from Lucee’s?

Finally, as for the upgrades regarding Lucee, were those done in-place, or as a new Lucee install? And if the latter, is it on a different server than the first? (We can’t presume either way, since you show your OS version being unchanged.)

Sorry for only questions, but they may help narrow the path to resolution.

Known issue. See DB Connection idle timeout isn't honored in Lucee 6 - #2 by psarin and Jira (and the history specific to v5.x).

If you are able to upgrade to the latest RC for 6.2.2, it may solve your problem. However, it is a big upgrade (upgrade of Tomcat + Java).

Does the problem still occur with the older 42.7.3 JDBC driver?

There really aren’t that many changes to 5.4 between those two LTS releases

I just published the latest 42.7.7 postgres driver (it’s a minor CVE update, the 42.7.6 release had a lot of fixes)

https://luceeserver.atlassian.net/browse/LDEV-5661

Another user had also reported problems with the 42.7.5 driver

Thank you for the quick replies! And excuse my late reply!

That’s what delayed my reponse. We were checking the different PostgreSQL driver versions when I posted my question. And that took several days to get proper results.

We can now confirm that 42.7.5 is the culprit. 42.7.3 was working fine. We’ll try out 42.7.7 once available, though it doesn’t seem to be offered yet in the Lucee admin.

To answer the other questions and for reference if anybody else runs into problems when upgrading:

We also just started to (finally) evaluate the upgrade to 6.2, though wanted to upgrade to the latest bugfix release of 5.4, first.

No, the PostgreSQL version did not change, nor the (separate) servers on which Lucee and the DB run.

The upgrade was done in-place. The upgrade to 6.2 will be a new install.

Thank you all for your help!

Sebastian

1 Like

there was a caching issue on the extension update provider, just deployed an update to address that.

42.7.7 should show up now in the admin?

Yep, now it’s available. Thanks!

Installed it. I’ll provide feedback once we know whether this version fixed the issue or not.

Sebastian

1 Like

with 6.2.2.54-RC and 42.7.7 the issue seems to be resolve for us.

2 Likes

@SebastianZ, great to hear. Hope it works out well.

And @moonpo, do you work with him, or are you just saying you had the same problem? :slight_smile:

@carehart we were facing the issue somehow I see that my previous message didn’t come up into the thread properly.

Our previous stack was like that
Lucee 6.2.1.122
Java 21.0.7 (Eclipse Adoptium) 64bit
PostgreSQLextension 42.7.5
Apache Tomcat/11.0.6
OS Linux (6.8.0-1029-aws) 64bit

And now upgrading to the new RC version and the new postgresql driver everything is running smooth.

1 Like

Ah, thanks for the clarifications. :slight_smile:

FYI, we could now confirm that the issue is fixed in version 42.7.7 of the PostgreSQL extension.

Sebastian

1 Like

I have to come back to this topic. The connections now did accumulate again after them being stable for the last two weeks. So the issue doesn’t seem to be completely fixed in 42.7.7.

I couldn’t pinpoint the reason for this yet. Though from what I can see, the idle connections are not closed as expected.

Current setup is a connection limit of 50 and 1m of connection timeout and idle timeout but the connections stay open for much longer, sometimes several minutes up to several days.

My hypothesis is that this might be related to queries executed inside threads, as we have a few of them running inside threads, but we also see idle connections with queries that are unrelated to threading.

So, generally speaking, the connection pooling doesn’t work as expected. And the issue is not totally gone yet, even when the connections didn’t accumulate over the past two weeks.

Sebastian

We see the same. Things are better with Lucee 6.2.2 RC 54+ and 42.7.7 driver but occasionally see hung connections. Narrowed it down to, whenever a thread doesn’t complete successfully (e.g., timeout due to long running, or error within thread processing), all open connections that the thread used remain open in Postgres and remain idle forever, even with setting keep alive or connection time out’s.

1 Like

Are you able to repo this problem locally?

@psarin nice analysis, lead me right to this potential fix

I was doing some code review the other day and found this which might solve this problem

I tried to create a reduced test case for this using the following script:

<cfparam name="url.useThread" default="false">
<cfparam name="url.throwError" default="false">
<cfparam name="url.causeTimeout" default="false">

<cffunction name="runQuery">
	<cfquery name="qTest" datasource="mydb">
		SELECT * FROM test
	</cfquery>

	<cfdump var="#qTest#">
</cffunction>

<cfif url.useThread>
	<cfthread action="run" name="testThread" timeout="15">
		<cfset runQuery()>

		<cfif url.causeTimeout>
			<cfset sleep(60000)>
		</cfif>

		<cfif url.throwError>
			<cfthrow message="This is a test error from the thread.">
		</cfif>
	</cfthread>
<cfelse>
	<cfset runQuery()>

	<cfif url.causeTimeout>
		<cfsetting requesttimeout="10">
		<cfset sleep(60000)>
	</cfif>

	<cfif url.throwError>
		<cfthrow message="This is a test error.">
	</cfif>
</cfif>

Though in all cases the connection got closed again a few seconds after the query execution.

Sebastian

1 Like

@Zackster Any chance you can backport your fix for 7.0 to 6.2 and 5.4?

Could you reproduce the issue? Unfortunately, I couldn’t break it down to a simple reproducible test case, so far.

@psarin Do you have a test case you could post?

Sebastian

That fix is included in the latest 6.2.2.90-RC

1 Like