Allowed characters in Lucee installation path

Don’t forget to tell us about your stack!

OS: Win 10
Java Version: 1.8.0_271
Tomcat Version: Whichever comes with lucee-express-5.3.8.201
Lucee Version: 5.3.8.201

Hello there,
I was experiencing problems with a CommandBox server the other day, where the PDF Extension would not load. With Brad Wood’s help I could boil it down to a problem with Lucee, which does not seem to like “special characters” in its installation path. Just renaming the directory I got from extracting the zip file of the express version from “lucee-express-5.3.8.201” to “lucee-express-5.3.8.201[brackets]” results in a Java exception being thrown when running startup.bat.
So the question is: is that a bug? Or is there some place in the docs that specifies an allowed character set for the Lucee installation path?

What’s the exception / stack trace?

stack_tomcat.txt (57.9 KB)

Looks like com.sun.org.apache.xerces is internally escaping the [:

Caused by: com.sun.org.apache.xerces.internal.util.URI$MalformedURIException: Opaque part contains invalid character: \

I’d suggest simply not using brackets in filenames (KISS principle).

Ah I see. Same thing is happening with ( parentheses ) if anyone else stumbles upon this. Thanks.

Even though some file systems do support naming conventions that include special characters, for performance and sanity it’s always best to keep things simple. In addition to the K.I.S.S. principle mentioned by @Zackster, this is what we suggest for development:

  • keep the path short
  • keep the path name consistent
  • avoid spaces
  • avoid special characters
  • always use one case (lower case suggested); do not use MiXeD CaSE
  • use the application setup’s default path whenever possible
  • set up a sub-site / app path outside the default path for development or production values

In all installation paths (and paths in general), it is good (and in my opinion proper) to stick to DOS legacy rules. This means:

  • No spaces
  • No special characters
  • Alphanumeric only, with underscores (no dashes/hyphens)
  • and for filenames, a single . (dot) before the extension

Call me old-fashioned if you like… but as a systems architect I’ve seen the slow devolution of file and path names over the last three decades. We’ve all taken up bad habits, when really it takes a micro effort to be proper and avoid potential macro problems.

Maybe this is your problem, maybe not. But it is certainly a contributing factor to the confusion.

You’re probably right about being cautious in general, but it would have been nice to get some sort of feedback that something is not working. Tracking down such bugs can be very cumbersome, since Lucee apparently started without any issue while some of its modules did not. And since these modules belong to the default setup, Lucee evidently does not support unusual paths, which is perfectly fine by me. But maybe it should tell the user so from the get-go and not start at all.

Nah, honestly, just don’t use silly characters in paths.

You learnt your lesson, and anyone else can find this post via Google.

Whether or not it’s common, I believe this is still a bug. I’m not sure if it’s a bug in Lucee’s core or in the underlying library, but it’s likely a mishandling of the File or Path object as a URL. I’ve seen Java code that treats a path as a URL, which works fine until you get a special character, and then it reveals the bug.

If we look at the Lucee source code, we can see the TagLibFactory is passing the path into the InputSource like so:

is.setSystemId(res.getPath());

and the docs for InputSource say the system ID should be a URI:
https://docs.oracle.com/javase/7/docs/api/org/xml/sax/InputSource.html#setSystemId(java.lang.String)
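To make the "fine until you get a special char" point concrete, here is a minimal sketch. It uses java.net.URI as a stand-in parser, which is an assumption on my part: the stack trace comes from an internal Xerces URI class whose rules may differ in detail. The class and method names are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class RawPathAsUri {
    // Returns true if the string parses as a URI under java.net.URI's rules.
    // (Hypothetical helper; stands in for whatever URI parsing the XML parser does.)
    static boolean parsesAsUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String plainPath = "C:/foo/baz";            // happens to look like a URI with scheme "C"
        String bracketPath = "C:/foo [bar]/baz";    // space and brackets are not legal here
        String escapedUri = "file:/C:/foo%20%5Bbar%5D/baz";

        System.out.println(plainPath + " parses: " + parsesAsUri(plainPath));
        System.out.println(bracketPath + " parses: " + parsesAsUri(bracketPath));
        System.out.println(escapedUri + " parses: " + parsesAsUri(escapedUri));
    }
}
```

The plain path only works by accident, because the drive letter is read as a URI scheme. As soon as the path contains a character that a URI does not allow unescaped, parsing fails.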

Look at this example from the CommandBox REPL to see how a path differs from a URI:

CFSCRIPT-REPL: f = createOBject( 'java', 'java.io.File' ).init( 'C:/foo [bar]/baz' )
[Object java.io.File]
CFSCRIPT-REPL: f.getPath()
C:\foo [bar]\baz
CFSCRIPT-REPL: f.toURI().toString()
file:/C:/foo%20%5Bbar%5D/baz

Chances are, the fix is as simple as passing the correctly escaped URI into the InputSource’s setSystemId() method.
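A rough sketch of what that could look like, assuming a plain java.io.File. Lucee’s `res` is its own resource type, so the real fix would need whatever the equivalent conversion is there. SystemIdSketch, toSystemId and inputSourceFor are hypothetical names for illustration, not Lucee code.

```java
import java.io.File;
import org.xml.sax.InputSource;

class SystemIdSketch {
    // Convert a file to an escaped file: URI string instead of using the raw path.
    static String toSystemId(File file) {
        return file.toURI().toString();
    }

    // Build an InputSource whose system ID is a URI, as the InputSource docs ask for.
    static InputSource inputSourceFor(File file) {
        InputSource is = new InputSource();
        is.setSystemId(toSystemId(file));
        return is;
    }

    public static void main(String[] args) {
        // Example path only; the file does not need to exist for this demo.
        File f = new File("/tmp/foo [bar]/baz.tld");
        System.out.println("getPath():  " + f.getPath());
        System.out.println("system ID:  " + inputSourceFor(f).getSystemId());
    }
}
```

The space and brackets come out percent-encoded in the system ID, the same way they do in the REPL output above.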

OK then, @aremus, can you file a bug, link back to this thread, and post the link to the bug here?

https://luceeserver.atlassian.net/browse/LDEV-3684

Personally, I think allowing characters that aren’t really needed is a regression and opens more avenues for hackers. Maybe it should be explicitly stated that the installation path should follow conventional file naming conventions instead?
