Page encoding (UTF-8) randomly fails

Occasionally a page (.cfm) is displayed as gibberish characters, i.e. the UTF-8 encoding is broken. This happens despite specifying the following at the beginning of each page:

<!DOCTYPE html>
<html lang="en-US">

and

<meta http-equiv="content-type" content="text/html; charset=utf-8"><meta charset="utf-8">

in the <head> of each page. I have also set UTF-8 in the Lucee admin character encoding settings and added URIEncoding="UTF-8" to each of the connectors in server.xml.

It happens randomly. If I open the same page in a private window, it (sometimes) fixes itself. Refreshing the same page over and over sometimes fixes it too, without changing anything else. Sometimes the entire page is affected, other times just parts of it.
I’m using the AJP/1.3 connector, by the way:

<Connector protocol="AJP/1.3" port="8009" redirectPort="8443" address="127.0.0.1" connectionTimeout="15000" maxThreads="150" secretRequired="false" allowedRequestAttributesPattern=".*" socket-binding="ajp" scheme="http" URIEncoding="UTF-8" />

Any suggestions?

My guess is that your page contains content in mixed encodings. I’ve seen Chrome guess the encoding when a page mixes encodings, with a similar effect. That guess may even override the headers or HTML encoding directives. Many developers assume that setting those HTML directives fixes the issue, but they are only part of the picture. What I’d do is inspect the encoding of everything that is read: your cfm templates, include files, components, and even the datasources delivering dynamic content.
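One way to start that inspection is a small script that flags source files whose bytes are not valid UTF-8. This is only a rough sketch: the function name and the extension list are made up for illustration, and a file that passes is not guaranteed to have been saved as UTF-8 (pure ASCII passes too), so treat a hit as a lead rather than a diagnosis.

```python
from pathlib import Path

# Hypothetical default: adjust the extensions to match your own code base.
CFML_EXTENSIONS = {".cfm", ".cfc", ".cfml"}


def find_non_utf8_files(root, extensions=CFML_EXTENSIONS):
    """Return the sorted paths under root whose bytes do not decode as UTF-8."""
    bad = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            try:
                # Strict decoding raises on any byte sequence that is not UTF-8.
                path.read_bytes().decode("utf-8")
            except UnicodeDecodeError:
                bad.append(str(path))
    return sorted(bad)
```

Calling `find_non_utf8_files("/path/to/webroot")` (the path is a placeholder) returns the templates worth opening in an editor that shows the file encoding.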

Look at the following answer as a starting point:

Thank you for the feedback, but if that were true, the issue wouldn’t “solve itself” after a refresh or after reloading the same page in a private window, without changing anything else. Would it?

What I’d do is check Jira for similar issues affecting encodings in your Lucee version. If you are using a stable release, no such issue is reported there, and your CFML code is pretty straightforward, then I’m pretty sure it has something to do with your application settings and mixed encodings.

What I’ve read in the past is that the HTML encoding directives you’ve mentioned above are the first to be bypassed whenever they become untrustworthy (as I understand it, browsers try to detect the encoding if there are mismatched encodings in response headers, content, meta tags, etc.).
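To check for that kind of mismatch, you could compare the charset in the HTTP Content-Type header with the one declared in the page's meta tag. The sketch below assumes you have already captured the header value and the raw response body (for example from the browser's network tab); the function names are hypothetical and the regular expressions are a rough approximation, not a full HTML parser.

```python
import re


def header_charset(content_type):
    """Return the lowercased charset from a Content-Type header value, or None."""
    match = re.search(r'charset\s*=\s*"?([\w.:-]+)', content_type or "", re.IGNORECASE)
    return match.group(1).lower() if match else None


def meta_charset(body):
    """Return the lowercased charset of the first <meta> declaration, or None.

    Only the first 1024 bytes are scanned, since that is where browsers
    look for the declaration.
    """
    head = body[:1024].decode("ascii", errors="ignore")
    match = re.search(r'''<meta[^>]*charset\s*=\s*["']?([\w.:-]+)''', head, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charsets_agree(content_type, body):
    """True only if the header declares a charset and the meta tag matches it."""
    declared = header_charset(content_type)
    return declared is not None and declared == meta_charset(body)
```

If `charsets_agree` returns False for a garbled response but True for a good one, the header is the first thing to look at, since browsers generally give it precedence over the meta tag.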

UPDATE: I have isolated the issue. It seems to have something to do with the cache module of my web server (reverse proxy), OpenLiteSpeed, which sits in front of Lucee.
When I disable LSCache, the problem disappears. While I still don’t know why the module randomly breaks the encoding of (supposedly) cached pages, we can rule out that it has anything to do with Lucee.
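For anyone trying to narrow down what the cache changes, one option is to save the response headers of a good and a garbled load of the same URL and diff them. This is only a sketch of that comparison, not an explanation of the cause; the function name is hypothetical and the header values in the usage example are invented.

```python
def diff_headers(good, bad):
    """Return {name: (good_value, bad_value)} for every header that differs.

    Header names are compared case-insensitively; a header missing on one
    side shows up with None for that side.
    """
    g = {name.lower(): value.strip() for name, value in good.items()}
    b = {name.lower(): value.strip() for name, value in bad.items()}
    return {
        name: (g.get(name), b.get(name))
        for name in sorted(g.keys() | b.keys())
        if g.get(name) != b.get(name)
    }


# Invented sample values, for illustration only.
good = {"Content-Type": "text/html; charset=UTF-8", "Server": "LiteSpeed"}
bad = {"content-type": "text/html", "server": "LiteSpeed"}
print(diff_headers(good, bad))
```

A difference in Content-Type or Content-Encoding between the two captures would be a reasonable place to keep digging.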

@stevieosaurus thanks for posting back! I’m pretty sure this post will help others who run into the same issue in the future.
