Duplicate query_string and RequestTimeOut passed by the scheduled task

Hi,

I created a ticket a bit too quickly and saw the message that I should rather submit the bug here first to confirm that it is indeed a bug. So here it is!

When we create a scheduled task with the url: https://www.example.com/sub-folder/update.cfm?myParameter=HelloWorld

The scheduler seems to call the url with a duplicated query_string. The web server receives this: myParameter=HelloWorld&RequestTimeout=50&myParameter=HelloWorld&RequestTimeout=50

At least, that is what happens when you go to the web admin, open the scheduled task and click Execute.

Lucee Version: 5.3.8.201

I can’t reproduce that?

this is my test scheduler.xml

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?><schedule>
<task autoDelete="false" hidden="false" interval="60" name="test" paused="false" port="8888" proxyHost="" proxyPassword="" proxyPort="0" proxyUser="" publish="false" readonly="false" resolveUrl="false" startDate="{d '2022-07-20'}" startTime="{t '00:00:00'}" timeout="15000" unique="false" url="http://127.0.0.1:8888/task.cfm?a=1&amp;b=2"/></schedule>

task.cfm

<cfscript>
    f = "c:\temp\shed.txt";
    fileAppend(f, chr(10) & "---#now()#------" & chr(10));
    fileAppend(f, cgi.HTTP_USER_AGENT & chr(10));
    fileAppend(f, url.tojson() & chr(10));
    fileAppend(f, form.tojson() & chr(10));
</cfscript>

shed.txt

---{ts '2022-07-20 02:44:35'}------
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36
{"a":"1","b":"2"}
{}

---{ts '2022-07-20 02:44:47'}------
CFSCHEDULE
{"a":"1","b":"2","RequestTimeout":"15"}
{}

---{ts '2022-07-20 02:45:00'}------
CFSCHEDULE
{"a":"1","b":"2","RequestTimeout":"15"}
{}

---{ts '2022-07-20 02:46:00'}------
CFSCHEDULE
{"a":"1","b":"2","RequestTimeout":"15"}
{}

Here is my scheduler.xml, which calls a website on the same server:

<task autoDelete="false" hidden="false" interval="28800" name="Test" paused="false" port="443" proxyHost="" proxyPassword="" proxyPort="0" proxyUser="" publish="false" readonly="false" resolveUrl="false" startDate="{d '2022-07-20'}" startTime="{t '03:00:00'}" timeout="50000" unique="false" url="https://www.example.com:443/fr/sub-folder/update2.cfm?myParam=HelloWorld"/>

When I dump the URL scope or the CGI scope, url params are duplicated :

myParam=HelloWorld&RequestTimeout=50&myParam=HelloWorld&RequestTimeout=50

I investigated further to find where it came from.

  1. If I call the url directly with a web browser, the url params are ok (not duplicated)
  2. If I change the url in the scheduled task to point to another website on another server, the url params are ok (not duplicated)
  3. If I change the url in the scheduled task back to the website on the same server, the url params are duplicated

Maybe it’s not coming from the sender (Lucee), but from the receiver (IIS, Website itself, or whatever is between Lucee and the website like a proxy). I’ll come back when I find out!

I found the problem.

In IIS, we use URL Rewrite to redirect http requests to https, like this:

[Image: iis-url-rewrite]

See the problem? The person who created this rule made a mistake. REQUEST_URI already contains the url params, so if the Append query string checkbox is checked, the url params are duplicated.
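
For anyone who prefers reading the rule in web.config rather than in the IIS Manager dialog, here is a rough sketch of what a corrected http-to-https rule could look like. The rule name, the match pattern and the `{HTTPS}` condition are assumptions for illustration, not a copy of our actual rule. The part that matters is `appendQueryString="false"`, which corresponds to unchecking Append query string.

```xml
<!-- Hypothetical sketch: rule name, match pattern and condition are assumed -->
<rule name="Redirect HTTP to HTTPS" stopProcessing="true">
  <match url="(.*)" />
  <conditions>
    <!-- Only fire for requests that arrived over plain http -->
    <add input="{HTTPS}" pattern="^OFF$" />
  </conditions>
  <!-- {REQUEST_URI} already carries the query string, so it must not be
       appended a second time; with appendQueryString="true" the params
       would arrive twice, as seen above. -->
  <action type="Redirect"
          url="https://{HTTP_HOST}{REQUEST_URI}"
          redirectType="Permanent"
          appendQueryString="false" />
</rule>
```

The alternative would be to keep Append query string checked and build the target from a path-only value instead of `{REQUEST_URI}`. Either way, the query string should come from exactly one source.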

This raises a question. In the scheduled task, the url is clearly specified as HTTPS. Does Lucee make an HTTP call first anyway?

so, what do your web server logs say?

When I hit the Execute button, here is what I see in the IIS logs:

2022-07-20 14:55:48 XXX.XXX.XXX.XXX GET /fr/sub-folder/update2.cfm NewTestForHTTPS=1&RequestTimeout=50 443 - XXX.XXX.XXX.XXX CFSCHEDULE - 301 0 0 2
2022-07-20 14:55:48 XXX.XXX.XXX.XXX GET /fr/sub-folder/update2.cfm NewTestForHTTPS=1&RequestTimeout=50 80 - XXX.XXX.XXX.XXX CFSCHEDULE - 301 0 0 0
2022-07-20 14:55:49 XXX.XXX.XXX.XXX GET /fr/sub-folder/update2.cfm NewTestForHTTPS=1&RequestTimeout=50 443 - XXX.XXX.XXX.XXX CFSCHEDULE - 200 0 0 1045

Aside from what is happening in this issue… are the scheduler and the web server on the same box, in a classic environment? If so, it would make sense to have the URL call Tomcat directly through port 8888 (bypassing IIS).
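
As a rough illustration of that suggestion, the task entry in scheduler.xml could point at Tomcat instead of IIS. This is only a sketch reusing the attributes from the task posted above. It assumes Tomcat's HTTP connector listens on 8888 (as in the test setup earlier in this thread) and that a request to 127.0.0.1 reaches the right site. If Tomcat serves several hosts, a host name that Tomcat maps to the site would be needed instead.

```xml
<!-- Hypothetical variant of the task above: only port and url are changed.
     Assumes Tomcat listens on 8888 and 127.0.0.1 resolves to the correct site. -->
<task autoDelete="false" hidden="false" interval="28800" name="Test"
      paused="false" port="8888" proxyHost="" proxyPassword="" proxyPort="0"
      proxyUser="" publish="false" readonly="false" resolveUrl="false"
      startDate="{d '2022-07-20'}" startTime="{t '03:00:00'}" timeout="50000"
      unique="false"
      url="http://127.0.0.1:8888/fr/sub-folder/update2.cfm?myParam=HelloWorld"/>
```

A call like this never passes through the IIS rewrite rules, so it also will not exercise them, which may or may not be what you want.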

that’s a style preference. for example, if you have Cloudflare in front of your site, they block the CFSCHEDULE UA by default

Having been bitten by that, Lucee 6.0 now lets you specify a custom UA for cfschedule [LDEV-2999]

I ran this test:

  1. I disabled the URL Rewrite rule for HTTP to HTTPS.
  2. I double-checked my url in the scheduled task and the protocol is HTTPS.
  3. I executed the scheduled task manually with the button Execute.
  4. The scheduled task hit my cfm page and I dumped the CGI scope and got this :
request_url : http://www.xxxxxxxx.......
server_port : 80
server_port_secure : 0

And the IIS logs look like this :

2022-07-20 15:22:38 XXX.XXX.XXX.XXX GET /fr/sub-folder/update2.cfm ScheduledTaskExecuteTest=1&RequestTimeout=50 443 - XXX.XXX.XXX.XXX CFSCHEDULE - 301 0 0 1
2022-07-20 15:22:39 XXX.XXX.XXX.XXX GET /fr/sub-folder/update2.cfm ScheduledTaskExecuteTest=1&RequestTimeout=50 80 - XXX.XXX.XXX.XXX CFSCHEDULE - 200 0 0 1040

So, is it possible that Lucee makes the scheduled task call in HTTP even if the url is in HTTPS?

Or maybe it first makes the call over https and it fails for some reason I don’t know, then tries over http, and if there’s a rewrite rule, there’s a redirect and there it works?

I don’t see any logic which does that fallback?

those logs clearly state you are redirecting from 443 to 80, with a 301.

301 is a permanent redirect, unlike 302 which is temp and not cacheable, so then you have config caching coming into play

Do you have any other rewrite rules lurking on IIS? It looks like it is hitting the https and then redirecting to http…

Indeed, it is another rewrite rule (canonical domain) that is misconfigured. It redirects to the site in HTTP… Shame on me…

In my defense, I didn’t make this particular website. Sorry to waste so much time on this. Twice it was a rewrite rule issue. Can I buy you a coffee (or a huge bag of coffee beans) to make amends?

don’t worry about it, this is all great documentation for the next person facing the same problem!

It was because we had a rewrite rule like this:

It’s all good when you call https://www.example.com in the web browser, because the HTTP_HOST will be www.example.com, exactly matching the pattern in the URL Rewrite rule. So no redirect happens.

But Lucee calls the domain with the port, like this: https://www.example.com:443, so the HTTP_HOST is www.example.com:443, which doesn’t match the pattern, and a redirect happens. The developer forgot to update this URL Rewrite rule when he migrated the site from http to https, so the redirect still pointed to http.

I’ll make the correction and maybe adjust the pattern too.
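
For the record, this is roughly the direction I have in mind for the corrected canonical-domain rule. It is a sketch only: the rule name and the exact pattern are assumptions, not what is deployed. The two changes are that the host pattern also accepts an explicit `:443`, and that the redirect target uses https.

```xml
<!-- Hypothetical sketch: rule name and pattern are assumed -->
<rule name="Canonical host name" stopProcessing="true">
  <match url="(.*)" />
  <conditions>
    <!-- Treat www.example.com and www.example.com:443 as canonical;
         any other Host value is redirected -->
    <add input="{HTTP_HOST}" pattern="^www\.example\.com(:443)?$" negate="true" />
  </conditions>
  <!-- {R:1} is the path only, so the query string is appended once by default -->
  <action type="Redirect"
          url="https://www.example.com/{R:1}"
          redirectType="Permanent" />
</rule>
```

With a pattern like this, a scheduled task calling https://www.example.com:443/... should no longer be bounced to http.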

By the way, thank you for your answers and time guys!

Come on @TonyMonast!!! No shame at all!!! We’ve all been there in very similar situations. That’s been a very interesting read and a very valuable post for future reference. Glad you worked it out with @martin s and the others’ help.