Cfcontent type pdf

OS: CentOS 8
Java Version: 11.0.4 (AdoptOpenJDK) 64bit
Tomcat Version: Apache Tomcat/9.0.24
Lucee Version: Lucee 5.3.3.62

I’m trying to solve a problem with a PDF download. The problem occurs only in Firefox on Android.

Desktop browsers and Chrome on Android work fine.

The file is downloaded, but if I click Open … Firefox says “this file cannot be opened.”

I’m delivering a PDF document with cfcontent ("/a.cfm").

When I download the same file directly from the web server ("/aa.pdf"), everything works as expected.

My assumption is that something in the HTTP headers sent with the download prevents Firefox from opening the file.

The code I’m using (a.cfm):

<cfheader name="Content-Disposition" value="attachment;filename=""aa.pdf"";">
<cfheader name="Content-Length" value="100569">
<cfheader name="Content-Type" value="application/pdf">

<cfcontent type="application/pdf" file="#fn#" deletefile="no" reset="yes">
</cfcontent>

I tried a few things: changing the content type, adding Content-Length, removing Content-Disposition …

I debugged the request with wget.

The only difference I found is the content type:

Lucee: Content-Type: application/pdf;charset=UTF-8
Direct: Content-Type: application/pdf
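
For anyone who wants to check their own responses for this, here is a minimal Python sketch (not CFML, and not part of the original post) that flags a charset parameter on a non-text media type. The function name is hypothetical, and treating only text/* as textual is a simplifying assumption.

```python
from email.message import Message


def charset_on_non_text_type(content_type_value):
    """Return True if a non-text media type carries a charset parameter."""
    msg = Message()
    msg["Content-Type"] = content_type_value
    # get_content_charset() returns None when no charset parameter is present
    has_charset = msg.get_content_charset() is not None
    # Simplification: only text/* is treated as a textual type here
    is_text = msg.get_content_maintype() == "text"
    return has_charset and not is_text


# The two header values captured with wget in this thread
print(charset_on_non_text_type("application/pdf;charset=UTF-8"))
print(charset_on_non_text_type("application/pdf"))
```

The header parsing is delegated to the standard library's email.message module, so parameters with odd spacing or casing are handled without hand-written string splitting.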

  • Has anyone experienced something similar?
  • Does anyone have a hint on how to set the content type in Lucee without the “;charset=UTF-8” (which I assume is the problem for Firefox)?
  • Does anyone have another idea why Firefox doesn’t like the PDF download?

Complete Request:

Per direct httpd:


---request begin---
GET /aa.pdf HTTP/1.1
User-Agent: Wget/1.19.5 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: devsite.lan
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Server: nginx/1.14.1
Date: Mon, 01 Feb 2021 23:20:59 GMT
Content-Type: application/pdf
Content-Length: 100569
Last-Modified: Fri, 18 Oct 2019 22:11:10 GMT
ETag: "5daa387e-33d0"
Strict-Transport-Security: max-age=63072000
Accept-Ranges: bytes

Per Lucee:


---request begin---
GET /aa.cfm HTTP/1.1
User-Agent: Wget/1.19.5 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: devsite.lan
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200
Server: nginx/1.14.1
Date: Mon, 01 Feb 2021 23:44:54 GMT
Content-Type: application/pdf;charset=UTF-8
Content-Length: 100569
Connection: keep-alive
Set-Cookie: cfid=ecc15606-c88e-4978-be44-10e3671ccd8e;Path=/;Expires=Mon, 22-Feb-2021 01:22:58 UTC;HTTPOnly
Set-Cookie: cftoken=0;Path=/;Expires=Mon, 22-Feb-2021 01:22:58 UTC;HTTPOnly
Content-Disposition: attachment;filename="aa.pdf";
Content-Language: de-CH
Strict-Transport-Security: max-age=63072000

there’s already a bug about the extra charset stuff with cfcontent

https://luceeserver.atlassian.net/browse/LDEV-2995

there’s a possible workaround here

Hello Zac, Thank you for your quick reply!

I tried the Hack. But it didn’t work for my PDF Problem.

I will investigate further.

Another workaround would be to change the Content-Type header, clearing that charset, at the fronting nginx web server. I did some similar header manipulation with Apache2 in the past (adding same-site cookie attributes). Here is a module for that in nginx.

hello andreas, thanks a lot for that hint! I will give it a try.

@rolf.anderegg, can you share an example PDF, if possible, to replicate the issue? For me it works fine with both Firefox and Chrome.

Sure… here is slightly modified code that includes a PDF in the same directory.

example.zip (1.5 KB)

For me this still reproduces the Error ONLY on Android Firefox

I was able to reproduce the issue on a Galaxy M51 with Firefox. It looks like Lucee/Tomcat is sending the web charset along, and that is not applicable to PDFs, because PDF is a binary format. See: what is the correct charset for a application/pdf content-type http response header? - Stack Overflow

While Google Chrome handles the files OK, I’ve seen Firefox on mobile being very sensitive to non-RFC-compliant definitions (in my case it was cookies whose attributes were set twice: once as a cookie attribute in Tomcat and once by a “set header” rule in Apache2).

I wish Tomcat had something like the mod_headers module.

@rolf.anderegg

Please try the following code:

<!--- Resolve sample.pdf relative to the directory of this template --->
<cfset strPath = GetDirectoryFromPath( GetCurrentTemplatePath() ) />
<cfset fn="#strPath#/sample.pdf">

<!--- cfheader name="Content-Type" value="application/pdf" charset=" "--->
<!--- cfheader name="Content-Disposition" value="attachment;filename=""aa.pdf"";"--->
<cfscript>
// Set the content type directly on the underlying servlet response
response = getPageContext().getResponse();
response.reset();
response.setContentType('application/pdf');
</cfscript>
<!--- Note: no type attribute on cfcontent here --->
<cfcontent file="#fn#" deletefile="no" reset="yes"></cfcontent>

I’ve added this problem to the task with a link back

so when exactly is adding the charset appropriate / required? only for text types?

https://www.w3.org/International/articles/http-charset/index

I can’t say for sure, Zac, but to me it looks like Lucee takes the default web charset from the server administrator (I think the same would apply to Application.cfc) and silently adds it to the headers that cfcontent creates, also ignoring any preceding cfheader charset. But I didn’t look further into it. When I set the web charset to ISO-8859-1 in the server administrator, the server responded with application/pdf;charset=ISO-8859-1. That is also not RFC compliant for the application/pdf MIME type. So I tried to find a way to leave the charset somehow “blank”, with no encoding. Then I found this old but very good post and looked at ServletResponse (Java(TM) EE 7 Specification APIs). I’m really not sure, but it worked for me on my local dev.

I think this applies to documents of type text only. The PDF RFC doesn’t mention charset encoding at all (see here). Applying a charset to a PDF would be similar to applying one to files like JPGs or MP4s. Of course PDFs contain text, but they are binary files.

I guess this should also be configurable per cfcontent usage (for text types), i.e. add a charset attribute?

text types are sniffed by the following logic

Oh hell yeah!!! That would have been useful. It would give great power and control. The only risk is that developers might also add non-compliant charsets to all kinds of files. Another option would be a cfheader value like charset=“none” or charset=“false” to reset the charset attribute, so that it can be stripped from the server response.

@andreas

WoooooT ! :wink: Thank you andreas! This workaround works and all headers are set perfectly.

@Zac_Spitzer

IMHO yes, as far as I could find by googling this problem. Only text types (“text/json css html …”) should have a charset. Binary types (“application/xxxx image/xxx video/…”) should never have a charset, because the format is binary.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types

If I had to implement this, I would probably give “text/…” types the charset of the page / the setting in Lucee, as it is now (for programmers who don’t want to dig deeper into it).

If someone explicitly adds a charset to the type attribute, I would deliver that charset (maybe they have a web page that is UTF-8 and have to deliver some existing text content that is not UTF-8 encoded).
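
To make that proposal concrete, here is a rough sketch of the decision logic in Python (not CFML). This is only an illustration of the rule suggested above, not how Lucee or ACF actually behave; the function name, its signature and the default charset value are all assumptions.

```python
def build_content_type(type_attr, default_charset="UTF-8"):
    """Build a Content-Type value from a cfcontent-style type attribute.

    Hypothetical helper: name and signature are illustrative only.
    """
    parts = [p.strip() for p in type_attr.split(";") if p.strip()]
    if not parts:
        raise ValueError("type attribute must not be empty")
    media_type, params = parts[0].lower(), parts[1:]

    # An explicit charset in the type attribute is delivered as given.
    if any(p.lower().startswith("charset=") for p in params):
        return ";".join([media_type] + params)

    # text/* without an explicit charset falls back to the configured default.
    if media_type.startswith("text/"):
        return ";".join([media_type] + params + ["charset=" + default_charset])

    # Everything else (application/*, image/*, video/*, ...) gets no charset.
    return ";".join([media_type] + params)


print(build_content_type("application/pdf"))
print(build_content_type("text/html"))
print(build_content_type("text/plain;charset=ISO-8859-1"))
```

With this rule, application/pdf would go out without any charset, while text types keep today's default unless the developer overrides it.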

ok, I’ve proposed three new attributes: compress, isBinary and charset
https://luceeserver.atlassian.net/browse/LDEV-2995?focusedCommentId=47589

one question, what happens when you include charset in the mime type? what does ACF do?

I’ve seen that in a few samples?

alas <cfdocument> makes the same mistake

https://luceeserver.atlassian.net/browse/LDEV-3262

Setting

<cfcontent file="#fn#" type="application/pdf" deletefile="no" reset="yes"></cfcontent>

results in Content-type: application/pdf;charset=ISO-8859-1 (this is the web charset I’ve set in my server administrator for testing this issue)

I’ve just tested the code with ACF 2016, 2018 and 2021 using the (“awesome”) CommandBox, and all versions behaved just like Lucee: they also add the charset to the content type application/pdf (no matter whether the content type is set with cfheader or cfcontent).

Ok, let’s see who fixes this first!

I originally found this problem via either the Lighthouse (or another tool, I forget) webpage analysis in Chrome Dev tools, which flagged this header as being wrong.

What I can say is that many browser vendors cope pretty well with non-compliant response headers like these, either by ignoring them or by deciding for themselves how to act. E.g. Google Chrome tends to determine the charset by analyzing the loaded text rather than by simply reading the response headers.

As an example: I had a case with cookies where I also set the attributes “httponly”, “secure” and “samesite” via response headers in Apache2, just to ensure they are always set by default. The result was that some cfcookie calls were setting these attributes as well, which gave something like this:

Set-Cookie: … HttpOnly; HttpOnly; Secure; Secure;

This isn’t allowed by the specifications, but it worked very well with a good variety of browsers; they just dropped the duplicate attributes. When I installed Firefox on a new mobile, all my sessions on that development app got lost all the time. I had to debug it with my mobile in dev mode and a USB cable connected to my notebook. Desktop Firefox worked without any issues, just like all the other browsers I tested.
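
A quick way to spot this kind of duplication when inspecting response headers is sketched below in Python. The function name and the sample cookie value are made up for illustration; the check simply splits on semicolons and compares attribute names case-insensitively.

```python
def duplicated_cookie_attributes(set_cookie_value):
    """Return attribute names that appear more than once in a Set-Cookie value."""
    # The first segment is the cookie's name=value pair, not an attribute
    segments = set_cookie_value.split(";")[1:]
    seen, duplicates = set(), []
    for segment in segments:
        name = segment.split("=", 1)[0].strip().lower()
        if not name:
            continue  # skip empty segments, e.g. after a trailing semicolon
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates


# Hypothetical cookie value with the doubled attributes described above
print(duplicated_cookie_attributes("cfid=abc; HttpOnly; HttpOnly; Secure; Secure;"))
```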

While I think Lucee should do this right and in compliance, I’m pretty sure Firefox Mobile will soon take these response headers and just drop that charset, just like Firefox Desktop does.