As another alternative method, I have used the following function in the
past to parse csv style strings. I tested it against your sample file and
it seemed to work OK:
<cfargument name=“escapeChar” type=“string” default=“”“” required=“false”>
<cfargument name=“escapedEscapeChar” type=“string” default=“”“”“”
required=“false”>
var currentVal = “”; // string containing the value currently being parsed
var inEscapedCell = false; // set to true when we are parsing an escaped
cell value
var i = 1;
var currentChar = “”;
var rowDelimCheck = “”;
var escapeDelimCheck = “”;
var escapedEscapeCheck = “”;
var aDataSet = arrayNew(1);
var aRowSet = arrayNew(1);
// we parse our string a character at a time
for(i=1; i LTE len(arguments.csvString); i=i+1){
currentChar = mid(arguments.csvString, i, 1);
rowDelimCheck = mid(arguments.csvString, i, len(arguments.rowDelim));
escapeDelimCheck = mid(arguments.csvString, i, len(arguments.escapeChar));
escapedEscapeCheck = mid(arguments.csvString, i,
len(arguments.escapedEscapeChar));
// are we in an escaped cell???
if(inEscapedCell){
// we are in an escaped cell, we need to check for our ending escape
character
if((NOT escapeDelimCheck IS arguments.escapeChar) OR (escapedEscapeCheck IS
arguments.escapedEscapeChar)){
if(escapedEscapeCheck IS arguments.escapedEscapeChar){
currentVal = currentVal & arguments.escapeChar; // log our character
i = i + len(arguments.escapedEscapeChar) - 1;
}else{
currentVal = currentVal & currentChar; // log our character
}
}else{
i = i + len(arguments.escapeChar) - 1;
inEscapedCell = false; // break from our escaped cell
}
}else{
// are we at the end of our cell???
if((currentChar IS arguments.itemDelim) OR (rowDelimCheck IS
arguments.rowDelim)){
arrayAppend(aRowSet, currentVal);// commit our current value
currentVal = “”; // get ready for our next value
if(rowDelimCheck IS arguments.rowDelim){
arrayAppend(aDataSet, duplicate(aRowSet));// commit our current row
aRowSet = arrayNew(1); // get ready for our next row
i = i + len(arguments.rowDelim) - 1;
}
}else{
if(escapeDelimCheck IS arguments.escapeChar){
i = i + len(arguments.escapeChar) - 1;
inEscapedCell = true; // open our escaped cell clause
}else{
currentVal = currentVal & currentChar; // log our character
}
}
}
}
return aDataSet;
<cfset csv = fileRead(expandPath(“./sample.csv”))>
<cfset parsed = csvToArray(csv, “|”, chr(10))>
All the best,
MartinOn Monday, 5 September 2016 12:06:33 UTC+1, Peter Whitby wrote:
Evening, All.
I’m converting a truly ancient application from CF6 to lucee.
Everything works great, the actual code migration went very smoothly and
I’ve managed to get everything working, as well as cleaning up a few old
irritations. So far, so good.
However, I’m having a problem when using the production data. The app uses
cfhttp to create an sql-like result set from a CSV file and, when a single
or double quote appears in a field, lucee appears to ignore the field
separators, line feeds , etc and keeps reading until it finds another
instance of the same character. This may be several lines/records later and
usually ends with a an error stating that there is an incorrect number of
columns in the input line.
I’m running on Ubuntu server 16.10 with the latest Apache2. Lucee was
installed with ‘lucee-5.0.0.252-pl0-linux-x64-installer.run’.
cfhttp is:
I’ve tried encasing the fields of the csv file in different textdelimiters
(’ " ` and others); tried different separator characters and I’m all but
tearing my hair out. I could write a preprocessing filter script but that
wasn’t needed in CF6 and I’m wondering why it’s needed in lucee.
What have I missed?
Let me know if more doco is required and I’ll be happy to post it.
Thanks,
Pete