Locking sessions across a cluster and seeing changes to session variables immediately?

Redtopia · March 20, 2019, 4:39pm

Background: I’m implementing an auto-login function in application.cfc that looks for a token that’s stored in a cookie, and then uses it to authenticate the user. Once a token is used, a new one is generated, and the cookie is updated. If a token is not found, the cookie is deleted. A session lock is used to prevent multiple concurrent requests from attempting to log the user in, which would have unintended side effects.

All the core functionality for this works (on a single node), but I need to make the code cluster-friendly. In application.cfc, this.sessionCluster = true; is set, and session storage is a shared memcached instance. This all works fine.

The main questions I have are (referencing the code below):

1. The code uses an exclusive session lock to prevent multiple requests from executing the login code at the same time. How would you replace the session lock below with one that locks the session across the whole cluster?
2. The code assumes that changes to session variables can be seen immediately. Is this true when a session variable is changed on one node, while a concurrent request on another node tries to access that same variable? If not, is there a way to refresh the session scope to ensure you’re getting the latest?

Below is the autoLogin() function (works on a single node):

private void function autoLogin () {

	// multiple, concurrent requests could be hitting this on different nodes in the cluster

	// if we're already logged in, nothing to do
	if (session.isLoggedIn) {
		return;
	}

	// get the auth token if it exists
	var token = cookie.keyExists("auth") && isValid("uuid", cookie.auth) ? cookie.auth : "";
	if (token == "") {
		// if a token doesn't exist, nothing to do
		return;
	}

	// assertion: user is not logged in and an auth token exists - login using token
	// but we need to make sure that only one request can attempt to login using the token
	
	// lock the session to block other requests - how would you do this on a cluster?
	lock scope="session" type="exclusive" timeout="10" throwontimeout=false {
		// check if logged in again - another thread may have succeeded while this
		// thread was waiting for the lock to open
		if (!session.isLoggedIn) {
			// we can only call this once if user is not logged in!
			application.auth.loginWithToken(authToken=token);
		}
	}

} // autoLogin()

bdw429s · March 21, 2019, 6:29pm

There is no built in way to lock a session across the cluster. You’d have to have a shared external resource like the DB or a custom key in Memcache to manually build something like that.
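
A rough sketch of the database variant, under stated assumptions: a shared table named session_locks (lock_key as primary key, plus an expires_at column) and a default datasource set in Application.cfc. The table and the two helper functions are hypothetical, not anything built into Lucee. The primary key is what makes the lock cluster-wide, because only one INSERT for a given key can succeed no matter which node issues it.

```cfml
// Hypothetical helpers. Assumes a shared table along the lines of:
//   CREATE TABLE session_locks (lock_key VARCHAR(100) PRIMARY KEY, expires_at DATETIME)
private boolean function acquireClusterLock (required string lockKey, numeric ttlSeconds = 10) {
	// remove a stale lock left behind by a request that died before releasing it
	queryExecute(
		"DELETE FROM session_locks WHERE lock_key = :k AND expires_at < :ts",
		{ k: arguments.lockKey, ts: { value: now(), cfsqltype: "cf_sql_timestamp" } }
	);
	try {
		// the primary key lets this insert succeed for exactly one request
		queryExecute(
			"INSERT INTO session_locks (lock_key, expires_at) VALUES (:k, :exp)",
			{
				k: arguments.lockKey,
				exp: { value: dateAdd("s", arguments.ttlSeconds, now()), cfsqltype: "cf_sql_timestamp" }
			}
		);
		return true;
	} catch (database e) {
		// simplification: any database error is treated as "someone else holds the lock"
		return false;
	}
}

private void function releaseClusterLock (required string lockKey) {
	queryExecute("DELETE FROM session_locks WHERE lock_key = :k", { k: arguments.lockKey });
}

// usage inside autoLogin(), in place of the session lock
var lockKey = "autologin-" & session.sessionid;
if (acquireClusterLock(lockKey)) {
	try {
		if (!session.isLoggedIn) {
			application.auth.loginWithToken(authToken=token);
		}
	} finally {
		releaseClusterLock(lockKey);
	}
}
```

Unlike the session lock, this version does not wait: a request that loses the race simply skips the login attempt. The expiry check uses each node's own clock, so it assumes the nodes are reasonably in sync. It also does nothing about whether another node can see the updated session variable, which is a separate question.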

As far as how often the session data is read and written from cache, I’m not sure any longer. Back when I wrote the Couchbase extension I did a lot of testing on this and the Couchbase web UI made it easy to “see” the reads and writes. Lucee used to read the cache once at the start of a request and write it once at the end. Later, Micha changed it to perform a second read before writing to see if anything had changed and attempt to merge the keys. Later, he talked about changing Lucee to read and write each top level key in the session scope separately, but I’m not sure if that ever happened. At Ortus, we stopped using the session storage for a while due to a bunch of bugs in Lucee 4.x and I haven’t touched it in a while. I assume those bugs are gone, but once we rewrote our apps to use the session storage module to leverage the cache directly we never really had a reason to go back. I would recommend you find a way to log the reads and writes on your cache and do some testing to see how it works now.

bdw429s · March 21, 2019, 6:34pm

I forgot to add: is there a way you can refactor your logic to not need the locking? I didn’t take the time to absorb everything you’re doing there, but it would be all around easier if the locking just wasn’t needed.

Redtopia · March 21, 2019, 6:49pm

Unfortunately, I can’t think of a way to eliminate the lock in this case.

justincarter · March 21, 2019, 11:10pm

One way to eliminate the need for the lock (or to have it work) is to set this.sessioncluster = false and to use a load balancer with sticky sessions / session affinity (I find using a cookie is better than using client IP) so that the requests for an individual user will always go to the same backend application server. If that backend node goes down the user’s session data will still be persisted in your memcached server, and when they get switched to a new backend node their session will be read from the session store and they will be able to continue without having to establish a new session.

This may or may not be a trivial change depending on your environment, but I think it should solve the problem. This is how I always deploy apps that use a session store.

My understanding of how sessioncluster works based on the behaviour I’ve seen (and I could be wrong…) is:

sessioncluster set to true will always read the session from the session store at the start of a request, and persist it back into the session store at the end of a request – this is problematic if concurrent requests to different nodes read and write to the session at the same time?
sessioncluster set to false will read the session from the session store if it doesn’t already exist in memory, changes to the session are immediately made in memory and then persisted back into the session store at the end of a request – concurrent requests to the same node will always see the latest session data
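
For reference, a minimal sketch of the Application.cfc settings being discussed. The application name and the cache connection name "sessionCache" are placeholders; the cache connection itself (memcached in this thread) is assumed to be defined separately in the Lucee admin or configuration.

```cfml
component {
	this.name = "myApp";                  // placeholder application name
	this.sessionManagement = true;
	this.sessionStorage = "sessionCache"; // placeholder: name of the cache connection holding sessions
	this.sessionCluster = false;          // pair with sticky sessions on the load balancer
}
```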

Redtopia · March 22, 2019, 1:08am

Thanks Justin… I want to avoid sticky sessions because I want stateless nodes in the cluster. I’ve got sessionCluster turned on, and it works fine with multiple nodes.

I’ll happily write an article on how to do this once I get it figured out! I’ve been trying to get answers on locking sessions across a cluster and concurrent session variable access for a while, but nobody seems to know the answers, and testing this is difficult to do (though not impossible). To me, it seems like these are core architectural patterns for modern web development, and the fact that I can’t get authoritative answers to these basic questions is frustrating.

justincarter · March 22, 2019, 2:22am

I agree docs are sorely needed, and perhaps the sessioncluster variable being a boolean is what leads to some of the confusion in how we expect it to work too, since I think it just controls when the session is read; true means “always read the session from the session store on every request”, false means “only read the session from the session store if it’s not already in memory”?

The CFML Session is a (potentially) large data structure, and when it’s stored in “external memory” (i.e. memcached) it’s probably only efficient for the application server to read it once at the start of a request and then write it once at the end of a request. I think this is how sessioncluster set to true behaves. This means that if the session is read from and written to on different nodes concurrently, there is the potential to lose data / overwrite a session object where individual values inside the session might go missing, because the whole object is serialised each time – each individual key to N levels of depth is not stored separately (again, from my understanding). So to me, sessioncluster set to true isn’t really usable when concurrent writes to the session could / need to occur.
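
The lost-update scenario described above can be imitated in a few lines of cfscript. This is only a simulation of the read-once / write-once pattern, with a plain variable standing in for the session store and two structs standing in for the nodes. It is not a trace of what Lucee actually does, and it ignores any key merging Lucee may perform before writing.

```cfml
// the "session store" holds one serialized copy of the whole session
store = serializeJSON({ cart: 1 });

// two concurrent requests on different nodes each read the session at the start
nodeA = deserializeJSON(store);
nodeB = deserializeJSON(store);

// each request changes a different key in its own in-memory copy
nodeA.isLoggedIn = true;
nodeB.cart = 2;

// each request writes the whole object back at the end; the last writer wins
store = serializeJSON(nodeA);
store = serializeJSON(nodeB);

// node B's copy never contained isLoggedIn, so node A's change has been overwritten
result = deserializeJSON(store);
loginSurvived = structKeyExists(result, "isLoggedIn"); // evaluates to false
```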

I’ve found an old Railo ticket that describes this issue here:
https://issues.jboss.org/browse/RAILO-2619

With sessioncluster set to false, in combination with sticky sessions, I still consider the nodes to be “stateless” in so far as the underlying session data is always persisted in your session store (i.e. memcached), so if a node dies and a user gets moved to another node their session will continue uninterrupted and you haven’t lost any user session state. You could think of it as using the JVM memory as an additional caching layer to reduce the number of hits to the caching server, which would also improve performance (less hits to the cache server means less network traffic and less resource usage on the cache server itself, less network traffic means less latency in retrieving the session data, etc).

Perhaps @micstriit or @Gert can clarify, or anyone else who has run up against issues with session storage / clustered sessions.

Redtopia · March 22, 2019, 3:33am

Thanks for the explanation Justin, you’ve clarified my understanding of what this.sessionCluster actually does, so thank you for that. As you’ve said, you “think” this is how it works, and @bdw429s has also mentioned some issues with locking sessions on older versions of Railo and Lucee 4.x, so it would be nice to get some additional confirmation/clarity on this.

Assuming your explanation is accurate, I think that using sticky sessions would be fine, and it obviously eliminates some of the complexity, re: concurrent changes to session data. I mainly want to make sure that my cluster can scale and that nobody loses their session when a node goes down.

The app is behind an AWS Elastic Load Balancer (round-robin), so I assume I would have to make a change to that load balancer configuration, and I see an article about it here: Configure sticky sessions for your Classic Load Balancer - Elastic Load Balancing.

Could you expand on what you said here:

Thanks!

justincarter · March 22, 2019, 3:47am

Some load balancers can do sticky sessions by client IP address, which sometimes doesn’t work well if you have a high volume of traffic coming from a small number of IP addresses, such as users within a large organisation or cases where you have a reverse proxy in front of the load balancer like AWS CloudFront or CloudFlare SSL proxying. That can result in an uneven load distribution because all of those users might end up on the same backend node, whereas using a cookie for the sticky session is usually fine (as long as the reverse proxy allows it through) because each client gets their own cookie from the load balancer and you’ll have a more normal distribution.

If you’re using AWS ELBs/ALBs, I think they always use a cookie when you enable session stickiness rather than client IP, so you probably don’t have those concerns.