Request Timeout


#1

We recently worked on the Request Timeout functionality in Lucee to counter changes in Java 8 that affected the existing functionality.
The remaining problem is that every request timeout can leave the server in an unstable state, because it can kill a thread that is inside a monitor, holding a lock, writing a file, …
So request timeouts should be avoided wherever possible.

This raises the question: can we improve the current implementation to avoid request timeouts?

Raise awareness of Request Timeouts happening
Show in the admin or the debug output how many request timeouts have occurred since startup,
maybe with a little more info (in condensed form).

Make request timeout smarter
Instead of simply stopping a thread after a certain time, Lucee should analyze the thread first:

  • is the thread blocked (for example, waiting on a network stream) or still running?
  • log information about long-running requests and compare current requests with that history
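
As a rough illustration of the first point, here is a minimal Java sketch (Lucee runs on the JVM) that sorts a thread into a coarse category based on `Thread.getState()`. The class and category names are hypothetical and not part of Lucee. One caveat: the JVM typically reports a thread that is blocked in socket I/O as RUNNABLE, so detecting "waiting on a network stream" would also require inspecting the thread's stack trace.

```java
final class ThreadStateProbe {

    /** Coarse categories for a request thread (hypothetical names). */
    enum Category { NOT_STARTED, RUNNING, BLOCKED_ON_LOCK, WAITING, FINISHED }

    /** Maps the JVM thread state onto one of the categories above. */
    static Category classify(Thread t) {
        switch (t.getState()) {
            case NEW:
                return Category.NOT_STARTED;
            case BLOCKED:
                return Category.BLOCKED_ON_LOCK;   // waiting to enter a monitor
            case WAITING:
            case TIMED_WAITING:
                return Category.WAITING;           // sleep, wait, join, park
            case TERMINATED:
                return Category.FINISHED;
            default:
                return Category.RUNNING;           // RUNNABLE, may include native I/O
        }
    }
}
```

A timeout handler could call `classify` on the request thread before deciding whether to stop it immediately or give it more time.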

Soft timeout
Instead of only supporting a forced stop, Lucee should also offer a more graceful stopping mechanism to choose from.
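
One possible shape for such a soft timeout, sketched in plain Java: interrupt the thread, wait for a grace period, and report whether it ended on its own. The class name, method name and grace period are assumptions for illustration only; this is not how Lucee currently implements timeouts.

```java
final class SoftTimeout {

    /**
     * Asks the thread to stop via interrupt and waits up to graceMillis.
     * Returns true if the thread ended on its own. A false result means the
     * caller still has to decide whether to escalate to a forced stop.
     */
    static boolean requestStop(Thread t, long graceMillis) {
        if (graceMillis <= 0) {
            // join(0) would wait forever, so reject it explicitly
            throw new IllegalArgumentException("graceMillis must be positive");
        }
        t.interrupt();                      // cooperative: wakes sleep/wait/join
        try {
            t.join(graceMillis);            // give finally blocks time to run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve our own interrupt
        }
        return !t.isAlive();
    }
}
```

Code that never checks the interrupt flag and never blocks will ignore the request, which is why a forced stop would still be needed as a fallback.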

Log more useful information
For example, log a category for each request timeout (network, code, waiting, …).
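
A minimal Java sketch of how timeouts could be counted per category since startup, so the admin or debug output could show them in condensed form. All names are hypothetical, and the OTHER category is an assumption standing in for whatever else the list would contain.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

final class TimeoutStats {

    /** Categories from the suggestion above; OTHER is an assumed catch-all. */
    enum Category { NETWORK, CODE, WAITING, OTHER }

    // The map itself is never modified after construction; only the counters change.
    private final Map<Category, AtomicLong> counts = new EnumMap<>(Category.class);

    TimeoutStats() {
        for (Category c : Category.values()) {
            counts.put(c, new AtomicLong());
        }
    }

    /** Records one request timeout of the given category. */
    void record(Category c) {
        counts.get(c).incrementAndGet();
    }

    long count(Category c) {
        return counts.get(c).get();
    }

    long total() {
        long sum = 0;
        for (AtomicLong n : counts.values()) {
            sum += n.get();
        }
        return sum;
    }

    /** One-line summary suitable for a log line or debug output. */
    String summary() {
        StringBuilder sb = new StringBuilder("total=").append(total());
        for (Category c : Category.values()) {
            sb.append(' ').append(c).append('=').append(count(c));
        }
        return sb.toString();
    }
}
```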

Any other ideas on how to improve this?


#2

This absolutely explains the problem we had on a server here. Running IIS, Tomcat, Lucee, Java 8 stuff and FusionReactor led to a server crash once a week. After uninstalling FusionReactor everything ran smoothly. The symptoms were tasks that never ended and stacked up until the server crashed. It seems that the timeout protection in FR was the problem, in conjunction with the behaviour you mentioned above.


#3

Just keep in mind that any solution must still allow the request to be killed, otherwise we make the server more vulnerable to a DoS attack.


#4

why that? but sure we will maybe extend the current functionality…


#5

If a malicious user can figure out a way to send a request that never ends, they can very easily make the server unusable.

see https://www.cvedetails.com/vulnerability-list/opdos-1/denial-of-service.html


#6

One thing I’ve noticed in the past, maybe not with Lucee but with ACF (so I’m not sure if it is the same in Lucee), is that the Request Timeout is not a hard and fast limit. It is only “reviewed” between tags, and the request stops if the timeout has been exceeded at that point. So if you get stuck in a tag like CFQUERY or CFHTTP when the timeout is reached, it doesn’t have any effect until after that tag has finished.

Something else I’ve always thought was kind of silly is that request timeout is an error so therefore caught by error handling, but as the Request Timeout has been reached there is no time left to process anything in the error handling. So it would be good, if, in the OnError and CFCATCH, you could specify a new timeout for that, giving it additional time to process the error handling.


#7

that is only the case with ACF, not Lucee…


#8

like i said, the idea is not to remove functionality, the idea is to extend the current implementation


#9

back in the day with classic ASP, there was a feature which let you test if the client connection was still active,
there’s no point in generating a very complex html report if the browser has stopped listening, i.e. it’s going nowhere.

perhaps there could be a way to check if the request is due to be terminated and allow cleanup to occur.

i’ve often done this with session or application flags so i can interrupt long running requests doing a lot of work
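
a minimal Java sketch of that flag pattern, for illustration: the long-running job checks a shared flag between units of work and always runs its cleanup. the class and method names are made up, and the flag here is a plain `AtomicBoolean` standing in for a session or application variable.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class CancellableJob {

    // Shared flag; another request (or a timeout monitor) can set it.
    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);
    private volatile boolean cleanedUp = false;

    /** Asks the job to stop at the next checkpoint. */
    void cancel() {
        cancelRequested.set(true);
    }

    /** Processes up to totalItems items and returns how many were completed. */
    int run(int totalItems) {
        int processed = 0;
        try {
            for (int i = 0; i < totalItems; i++) {
                if (cancelRequested.get()) {
                    break;              // checkpoint between units of work
                }
                processed++;            // stand-in for one unit of real work
            }
        } finally {
            cleanedUp = true;           // stand-in for releasing locks, closing files
        }
        return processed;
    }

    boolean isCleanedUp() {
        return cleanedUp;
    }
}
```

the catch is the same as with any cooperative approach: it only helps if the work is split into units small enough that the flag gets checked often.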


#10

That depends on the type of the script. If the result is intended for the user to view in the browser, then sure.

But if the result is saved to a data store or simply performs some maintenance then that is incorrect.


#11

there isn’t a function which returns the time remaining before the request times out, is there?


#12

I have a suspicion that this is happening to us sporadically. ColdBox interceptors use a SynchronizedMap for storing sets of CFCs that will be invoked when an event is called. Trouble is, these are single threaded and READS are locked and block any other threads from reading. We make a lot of use of interceptors and I have a feeling (plus some thread dumps that point to it) that a request timeout while one of these locks is held can leave the server in a state where it can’t serve requests (because the lock is never released).

I would love to know if that could indeed be resolved, but I would also like to see an option to turn off Lucee request timeouts altogether, either engine-wide or per page context.

This would enable things like using CFConcurrent to spawn long lived thread pools that are able to schedule Lucee code, etc. without said background threads being subject to timeout.

There is already a requestTimeout=-1 setting that applies the MAX timeout, but could it be used instead to turn timeouts off for the request?

And could PageContext have that added as a method? i.e. to set/get whether or not request timeout is enabled for that context?


#13

there’s a bit of related discussion going on over in jira

LDEV-1956 - DirectoryCopy doesn’t check the thread interrupt
https://luceeserver.atlassian.net/browse/LDEV-1956