I can see that Apache Tika is included with Lucee 5.2, and Tika can presumably extract the contents of RTF documents, among other file formats. In my case there’s an org.apache.tika.core-1.10.0.jar in the bundles directory.
However, it does not seem to get used, or it doesn’t work the way I expect. After indexing a few RTF documents with <cfindex type="file">, I can search them with <cfsearch>, but the “Summary” in the search result always contains raw RTF markup.
Is this expected behavior? Shouldn’t I get the plain text contents of the file, i.e. something that can actually be shown to the user?
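
A minimal sketch of the kind of setup described above might look like the following. The collection name, the paths, and the search term are placeholders chosen for illustration, not the actual values, and the collection is assumed not to exist yet.

```cfml
<!--- Placeholder names: "rtfdocs", both paths, and the search term are hypothetical --->
<cfcollection action="create" collection="rtfdocs" path="/path/to/collections">

<!--- Index a single RTF file into the collection --->
<cfindex action="update" collection="rtfdocs" type="file" key="/path/to/docs/example.rtf">

<!--- Search the collection and print the summary column of each hit --->
<cfsearch collection="rtfdocs" criteria="example" name="results">
<cfoutput query="results">
    #encodeForHTML(results.summary)#<br>
</cfoutput>
```

With a setup along these lines, the summary column is where the raw RTF markup shows up instead of readable text.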