Lucee charset UTF-8 issue

Hello,
we use Lucee on Red Hat Linux with an Apache 2.4 web server. We have installed the latest stable version of Lucee. The client requests that come in to the Apache instance carry a lot of header variables, all of them UTF-8 encoded. But somewhere in the chain
Apache-Tomcat-Lucee-Lucee_App, one of the systems involved seems to decode them as ISO 8859.
We have German names as part of these header variables. For example, the original name “Bäcker” arrives as “B%C3%A4cker”, the UTF-8 encoded name (that’s correct). But in the end the name “BÃ¤cker” is shown in the Lucee web app. That is the encoded “B%C3%A4cker” decoded as ISO 8859.
We already set URIEncoding in the server.xml and web.xml of the Tomcat instance, but it didn’t solve our issue. Do you have an idea where we should add a UTF-8 encoding entry (Apache config, Tomcat config, or Lucee config) so that the name is never decoded as ISO 8859 anywhere in the communication chain on the server?

BR,
Andreas
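
The symptom described in this post can be reproduced outside the server. The following is a minimal Java sketch, not taken from the Lucee or Tomcat code: it percent-decodes the same header value once as UTF-8 and once as ISO-8859-1. The class and method names are made up for illustration, and it assumes Java 10 or later for the `URLDecoder.decode(String, Charset)` overload.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name is not part of Lucee or Tomcat.
final class PercentDecodeDemo {

    // Decode a percent-encoded value, reading the escaped bytes as UTF-8.
    static String decodeAsUtf8(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    // Decode the same value, reading each escaped byte as one ISO-8859-1 character.
    static String decodeAsLatin1(String value) {
        return URLDecoder.decode(value, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String header = "B%C3%A4cker";
        // Unicode escapes keep this file independent of the compiler's source encoding.
        String expected = "B\u00e4cker";       // the intended name, with a-umlaut
        String mojibake = "B\u00c3\u00a4cker"; // two characters in place of the umlaut

        System.out.println(decodeAsUtf8(header).equals(expected));   // true
        System.out.println(decodeAsLatin1(header).equals(mojibake)); // true
    }
}
```

The two bytes C3 A4 form one character in UTF-8 but two separate characters in ISO-8859-1, which is why the wrongly decoded name is one character longer than the original.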

Check your Apache configuration. In /etc/httpd/conf/httpd.conf you more than likely have
“AddDefaultCharset UTF-8”

Comment that out, then restart Apache:
systemctl restart httpd

Hi Terry,

thanks for the response, but I think I have to explain it in a bit more detail.
Here is what we have done:

  1. The request from the browser arrives at Apache.
  2. In the Apache config the line “AddDefaultCharset UTF-8” is active.
  3. We have set up a forensic log file on Apache. The incoming request has a valid, UTF-8 encoded value in the header field.
  4. The request is forwarded from Apache to Tomcat via AJP.
  5. In the Tomcat config (server.xml) we have activated the following entries:
<Connector port="8888" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" />
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" URIEncoding="UTF-8" />
  6. In the Tomcat config (web.xml) we have activated the following entries:
<filter>
    <filter-name>setCharacterEncodingFilter</filter-name>
    <filter-class>org.apache.catalina.filters.SetCharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <async-supported>true</async-supported>
</filter>
<filter-mapping>
    <filter-name>setCharacterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
  7. We have increased the log level for the incoming requests on the Tomcat side (server.xml):
<Valve className="org.apache.catalina.valves.AccessLogValve"
     directory="logs" prefix="xxx." suffix=".txt" 
     pattern="%{X-lastname}i" resolveHosts="false" />
  8. In the resulting log file we can see that the header content is still UTF-8 encoded when it arrives as an input parameter at the Tomcat instance.
  9. Within the Lucee application the header value is shown as the wrong, ISO 8859 decoded value.

So from my point of view, Tomcat or an application within Tomcat is decoding the header value the wrong way, as ISO 8859 instead of UTF-8. But I can’t identify which processing step in the Tomcat container is responsible.

BR,
Andreas
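
If the wrong decoding step cannot be located, one common workaround is to undo it in the application: turn the mis-decoded string back into its original bytes using ISO-8859-1, then decode those bytes as UTF-8. The sketch below shows the idea in plain Java. It assumes the displayed value really is UTF-8 bytes that were read as ISO-8859-1, which this thread has not confirmed; `HeaderCharsetRepair` and `repair` are hypothetical names, not an existing Lucee or Tomcat API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Lucee, Tomcat, or the servlet API.
final class HeaderCharsetRepair {

    // Assumption: 'misdecoded' came from UTF-8 bytes that were read as ISO-8859-1.
    // ISO-8859-1 maps every byte to exactly one character, so encoding the string
    // back with it recovers the original bytes, which can then be read as UTF-8.
    static String repair(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String shownInApp = "B\u00c3\u00a4cker"; // the wrongly decoded name
        String intended = "B\u00e4cker";         // the name with a-umlaut

        System.out.println(repair(shownInApp).equals(intended)); // true
    }
}
```

This is a workaround rather than a fix: applying `repair` to a value that was already decoded correctly would damage it, so it only makes sense once the wrong decoding has been confirmed for that specific header.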

Hi Grobi13,
have you set the Web Charset to UTF-8 in the Lucee admin (/lucee/admin/server.cfm?action=server.charset)?

I have a pretty default Apache/Tomcat setup and I can’t reproduce your issue. Maybe you can give us some example code, so we can make sure that your code is valid?

Hi,

yes, UTF-8 is also set in the Lucee admin GUI.
I can give you the app, no problem. How can I do that? Do you have an upload server, or can I send it via email?

BR,
Andreas

I don’t think you want to do #6. If Lucee isn’t generating UTF-8, this could cause the browser to interpret the result incorrectly.

Make sure Lucee is generating the right content with the right encoding and headers; don’t use a servlet filter to override it.

Thanks; I will try it.

Hi Grobi13,
have you solved the encoding issue somehow? I have the same problem here. Header information is incorrectly encoded. All special characters are replaced with the same combination “�”.

BR,
Thomas

@tab You always need to make sure that the encodings are set correctly across the complete communication chain, starting from wherever the payload originates. You may switch to another encoding along the way, as long as the text can be converted back without information loss, but then you need to pass that information along. Take a look at this post on SO; it may help you.
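
The point about converting back without information loss can be checked directly. The following Java sketch (class and method names are hypothetical) decodes a byte sequence with a given charset and encodes it again; if the bytes come back unchanged, that step in the chain is reversible. It is an illustration of the idea, not a diagnosis of the setup in this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class for illustration only.
final class CharsetRoundTrip {

    // True if decoding the bytes with 'decodeAs' and encoding the result
    // again with the same charset yields exactly the original bytes.
    static boolean isReversible(byte[] original, Charset decodeAs) {
        String decoded = new String(original, decodeAs);
        return Arrays.equals(original, decoded.getBytes(decodeAs));
    }

    public static void main(String[] args) {
        String name = "B\u00e4cker"; // name with a-umlaut
        byte[] utf8Bytes = name.getBytes(StandardCharsets.UTF_8);
        byte[] latin1Bytes = name.getBytes(StandardCharsets.ISO_8859_1);

        // UTF-8 bytes read as ISO-8859-1 look wrong but lose nothing.
        System.out.println(isReversible(utf8Bytes, StandardCharsets.ISO_8859_1)); // true

        // ISO-8859-1 bytes read as UTF-8 are malformed; the decoder substitutes
        // the replacement character and the original byte is gone.
        System.out.println(isReversible(latin1Bytes, StandardCharsets.UTF_8));    // false
    }
}
```

Once a value shows the replacement character “�”, the original bytes can no longer be recovered from that string, so the wrong decoding has to be fixed at the step where it happens.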