I’ve just discovered that one of my regexp checks is not working anymore … no changes have been made to the code, only an upgrade from Railo to Lucee (I used Railo for a long time and now use Lucee).
The regexp is:
nom: {regexp:“^[éèëêàäâûüùïîöôÿç’a-zA-Z\ _-]{2,50}$”,erreur=“invalid_nom”, msg=“Votre nom n’est pas valide. Sa longueur doit être comprise entre 2 et 50 caracteres.”},
Having special characters inside a cfm/cfc file can be problematic when files are read back in, and should be avoided imho. It’s better to have the regex something like this:
“^[\x8C\x9C\xC0\xC2\xC6-\xCB\xCE\xCF\xD4\xD9\xDB\xDC\xE0\xE2\xE6-\xEB\xEE\xEF\xF4\xF9\xFB\xFC 'a-zA-Z_-]{2,50}$”
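Where all those \x escapes are the hexadecimal codes of these characters: http://character-code.com/french-html-codes.php (note that \x8C and \x9C are the Windows-1252 positions of Œ and œ; in Unicode those two are U+0152 and U+0153).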
Also, I see your regex is currently not checking for the uppercase variants of the special characters. The regex I suggested here will check for the uppercase variant as well.
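As a rough illustration of the escaped character class suggested above, here is a minimal Python sketch (Python rather than CFML, purely so it can be run standalone; `NAME_RE` and `is_valid_name` are made-up names). One assumption to flag: in Python's `re`, `\x8C` and `\x9C` denote control characters rather than Œ and œ, so the sketch uses `\u0152` and `\u0153` for those two. Whether Lucee's regex engine behaves the same way is not something this sketch checks.

```python
import re

# The suggested class, kept as pure-ASCII escapes so the source file
# contains no accented literals.
# Assumption: \u0152 / \u0153 replace \x8C / \x9C, which in Python
# (and in Unicode generally) are control characters, not the ligatures.
NAME_RE = re.compile(
    r"^[\u0152\u0153\xC0\xC2\xC6-\xCB\xCE\xCF\xD4\xD9\xDB\xDC"
    r"\xE0\xE2\xE6-\xEB\xEE\xEF\xF4\xF9\xFB\xFC 'a-zA-Z_-]{2,50}$"
)


def is_valid_name(name: str) -> bool:
    """Return True if name is 2-50 characters drawn from the allowed class."""
    return NAME_RE.match(name) is not None


if __name__ == "__main__":
    for candidate in ["St\xe9phane", "\xc9MILE", "Jean-Luc", "Bj\xf6rn"]:
        print(candidate, is_valid_name(candidate))
```

Note that the suggested class has no `\xE4`, `\xF6` or `\xFF`, so ä, ö and ÿ from the original pattern would be rejected unless they are added back.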
Kind regards,
Paul Klinkenberg
On 22 Feb 2016, at 16:51, Stéphane MERLE <@Stephane_MERLE> wrote:
Hi,
I’ve just discovered that one of my regexp checks is not working anymore … no changes have been made to the code, only an upgrade from Railo to Lucee (I used Railo for a long time and now use Lucee).
The regexp is:
nom: {regexp:“^[éèëêàäâûüùïîöôÿç’a-zA-Z\ _-]{2,50}$”,erreur=“invalid_nom”, msg=“Votre nom n’est pas valide. Sa longueur doit être comprise entre 2 et 50 caracteres.”},
Stéphane
On Tuesday, 23 February 2016 10:32:43 UTC+1, Paul Klinkenberg wrote:
Well, I have seen numerous occasions where special characters were garbled in a CFML template (or any text document, for that matter).
That usually occurred after the file was FTP’d, updated in an editor with a different character set, or subjected to any other action that changes the file’s character set.
Of course, measures can be taken to prevent this from happening, but I have had my fair share of bugs at different companies and platforms with this exact problem.
Paul
On 23 Feb 2016, at 00:24, Adam Cameron <@Adam_Cameron> wrote:
On Monday, 22 February 2016 21:28:55 UTC, Paul Klinkenberg wrote:
Having special characters inside a cfm/cfc file can be problematic when files are read back in, and should be avoided imho.
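The kind of garbling described above can be reproduced in a few lines. This is only a sketch in Python (not CFML), assuming the file was saved as UTF-8 and then read back as ISO-8859-1; the shortened pattern and the helper names `misread_as_latin1` and `matches` are invented for the illustration.

```python
import re

# A shortened version of the original pattern, with literal accented characters.
PATTERN_AS_WRITTEN = "^[éèàç a-zA-Z_-]{2,50}$"


def misread_as_latin1(text: str) -> str:
    """Simulate text saved as UTF-8 but decoded as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


# Each accented letter becomes two unrelated characters (é turns into Ã©),
# so the character class no longer contains é, è, à or ç.
PATTERN_AS_MISREAD = misread_as_latin1(PATTERN_AS_WRITTEN)


def matches(pattern: str, name: str) -> bool:
    """Return True if the name matches the given pattern."""
    return re.match(pattern, name) is not None


if __name__ == "__main__":
    print(matches(PATTERN_AS_WRITTEN, "Stéphane"))
    print(matches(PATTERN_AS_MISREAD, "Stéphane"))
```

Plain ASCII names still pass the misread pattern, which is one reason this sort of breakage can go unnoticed for a while.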
Of course, for nearly any language besides English, “special” characters
are normal. English is the outlier here. Interfaces will have words
containing them, and these characters simply can’t be avoided without
misspellings - or using images in place of text - for Chinese words as an
example. Both workarounds aren’t at all ideal. Dealing with various and
varying character sets is a pain, but unavoidable in my opinion - unless
one works only in English.
The issue I typically run across in Switzerland is that someone will give
me a text in an encoding other than UTF-8, and then I have to convert it.
If it’s data, I’ll import it into MySQL and convert the charset there.
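For plain text files, the conversion step can be sketched like this in Python. It assumes the source encoding is already known (guessing it is a separate problem), and `convert_to_utf8` is a hypothetical helper name, not part of any library.

```python
def convert_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known source encoding and re-encode them as UTF-8."""
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


if __name__ == "__main__":
    # The same byte means different things in different encodings:
    # 0x9C is the ligature oe in Windows-1252 but a control character
    # in ISO-8859-1, so the declared source encoding matters.
    print(convert_to_utf8(b"\x9cuvre", "cp1252"))
    print(convert_to_utf8(b"\x9cuvre", "iso-8859-1"))
```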
Aria Media Sagl
+41 (0)76 303 4477 cell
skype: ariamedia
On Sat, Feb 27, 2016 at 9:08 PM, Adam Cameron <@Adam_Cameron> wrote:
On Monday, 22 February 2016 21:28:55 UTC, Paul Klinkenberg wrote:
Having special characters inside a cfm/cfc file can be problematic when
files are read back in, and should be avoided imho.
What?
Why?
On Tuesday, 23 February 2016 09:32:43 UTC, Paul Klinkenberg wrote:
Well, I have seen numerous occasions where special characters were
garbled in a CFML template (or any text document, for that matter).
That usually occurred after the file was FTP’d, updated in an editor with
a different character set, or subjected to any other action that changes
the file’s character set.
Well yeah, all valid observations. I think it’s more just something to be
mindful of, than actively avoid. That said… having this sort of content
in a code file kinda suggests there’s hard-coded content in the code - I
imagine this is where this sort of thing mostly comes from - which is
probably rather more an issue.
I do find that charset encoding is a topic that a lot of CFML devs
(perhaps not just CFML ones) do seem to struggle with.
I s’pose it’s just more complexity and “moving parts” that can
contribute to possible problems.
On Saturday, 27 February 2016 22:37:46 UTC, Nando Breiter wrote:
Of course, for nearly any language besides English, “special” characters are normal.
Yeah, I do wish people would stop using such jingoistic terms. I assure you that to a lot of English speakers, there’s nothing “special” about some other language’s character set.
It’s even worse when some muppets - usually when talking about password strength - refer to punctuation as “special characters”. I know IT people are - on the whole - reasonably poor at written communication, but even to them, how are things like commas and full stops “special”?
[sigh]
I get your point, Adam, but you have to admit that, particularly in the single-language English-speaking countries (the UK, AU, NZ and the US), “special” (non-ASCII) characters are NOT the norm for a lot of English speakers, and quite frankly a lot of applications are not being built to handle non-ASCII character sets.
Hence the point Nando makes is absolutely correct.
Cheers
Kai