Ignore non-printing characters when importing from txt file

This is an issue I have had several times in the past but I have only now managed to get to the bottom of the cause.

I am importing some data by uploading a CSV file and then using cfhttp to load the CSV into a query variable. So far so good. But when I try to read the query, which has a column called “title”, I sometimes get an error claiming that “Title” doesn’t exist and that the available columns are “[TITLE,…”, which puzzled me for ages.

If I open the CSV in Notepad, Notepad++ or Excel everything looks fine.

This morning I opened the file on the command line with more and discovered that the file actually starts with “´╗┐Title”… These special characters at the front do not show up in any of the editors above!

Is there a way to get Lucee to ignore these (it doesn’t display them either!) or, failing that, a way for Notepad++ to display them so I can delete them before the upload?

I usually use a hex editor in that situation; HxD could do the job.

It’s probably the byte order mark (BOM). Lucee should read and honor it when parsing text… but apparently doesn’t?

https://notepadunix2dos.info/removebom.html

https://en.m.wikipedia.org/wiki/Byte_order_mark
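If it is a UTF-8 BOM, the file starts with the three bytes EF BB BF, and a console that uses an OEM code page (850 is assumed here) would show them as the odd characters seen in more. Below is a minimal sketch, in Python rather than CFML, of checking for and removing those bytes before the upload. The function name strip_utf8_bom and the sample CSV content are made up for illustration; this is not how Lucee itself handles the file.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical CSV content that starts with a UTF-8 BOM (bytes EF BB BF).
sample = codecs.BOM_UTF8 + b"Title,Author\r\nSome book,Someone\r\n"

# Decode the first bytes the way a console on code page 850 would (assumed code page).
print(sample[:8].decode("cp850"))

# After stripping, the first column name no longer has an invisible prefix.
cleaned = strip_utf8_bom(sample)
header = cleaned.decode("utf-8").splitlines()[0].split(",")
print(header)
```

If the file is being decoded in Python anyway, the built-in "utf-8-sig" codec drops a leading BOM during decoding, so the manual check is only needed when working on raw bytes.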

You should submit a Jira ticket for the BOM to be processed correctly.

Done. [LDEV-2400] - Lucee

As I asked on the bug, what encoding did the HTTP headers declare for the file content?