I have problems with collections. PDFs are not properly indexed; I see just a few words from the whole PDF text in the collection files.
I tested it on different Windows Server versions (2012, 2016) and different Lucee versions
(Lucee 18.104.22.168, Lucee 22.214.171.124)
I create a collection:
<cfcollection action="create" categories="no" collection="collectionName" path="C:\Projekte\xxx\pdfs\">
I update the collection:
<cfindex collection="collectionName" action="refresh" key="C:\Projekte\xxx\pdfs\" type="path" urlpath="" extensions=".pdf, .doc, .docx, .xls, .xlsx" recurse="yes">
I created a little search to test it:
<form name="fCollSearch" method="POST">
<input type="text" name="osp" value="#osp#">
</form>
<cfif osp neq "">
<cfsearch
name = "mySearch"
collection = "collectionName"
criteria = "#osp#"
contextpassages = "1"
maxrows = "100">
</cfif>
The same test on a ColdFusion server works fine.
Don’t bother, use Elasticsearch instead… the search stuff in Lucee is very old and crufty.
Thanks for the reply. This is not an option for us. We will store highly sensitive data, and we do not want it indexed by an Elasticsearch server. I guess we will switch back to ColdFusion because of the many problems Lucee is causing us. But thanks a lot, Zac. You have often helped me in this forum!
Are you assuming that the Elasticsearch server will be a third party service? It’s open source software that you can easily install locally. It’s similar to Solr which is what ColdFusion and Lucee use, and in fact both Solr and Elastic are based on Lucene. We’ve found Elastic better to work with and more performant than the built-in Solr engines. Setting it up separately is a little more work, but it’s been well worth it.
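To give an idea of what a local setup looks like from CFML, here is a rough sketch, not our actual code. It assumes Elasticsearch is running on the same machine at http://localhost:9200 with authentication switched off (newer versions enable security by default, so you would need HTTPS and credentials there). The index name "documents", the document ID and the file name example.pdf are made up for illustration. It also assumes you have already extracted the text from the PDF yourself, since a plain Elasticsearch install does not parse PDFs on its own.

```cfml
<!--- Assumption: local Elasticsearch without authentication; "documents" is a hypothetical index name --->
<cfset esUrl = "http://localhost:9200/documents">

<!--- Hypothetical document; the PDF text must be extracted before indexing --->
<cfset doc = {
    "path": "C:\Projekte\xxx\pdfs\example.pdf",
    "content": "extracted text goes here"
}>

<!--- Index (or overwrite) the document with ID 1 --->
<cfhttp url="#esUrl#/_doc/1" method="put" result="indexResult">
    <cfhttpparam type="header" name="Content-Type" value="application/json">
    <cfhttpparam type="body" value="#serializeJSON(doc)#">
</cfhttp>

<!--- Full-text search on the "content" field using the form value osp --->
<cfset searchBody = { "query": { "match": { "content": osp } } }>
<cfhttp url="#esUrl#/_search" method="post" result="searchResult">
    <cfhttpparam type="header" name="Content-Type" value="application/json">
    <cfhttpparam type="body" value="#serializeJSON(searchBody)#">
</cfhttp>

<!--- The matching documents are in hits.hits of the JSON response --->
<cfset hits = deserializeJSON(searchResult.fileContent).hits.hits>
<cfdump var="#hits#">
```

Nothing in this leaves the machine: both requests go to localhost. Keep in mind that a freshly indexed document only becomes searchable after the index refreshes, which by default happens about once a second.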
Sorry to hear you’ve had other problems with Lucee. We had to work through quite a few issues as well, but in the end we’re very glad we switched away from ACF.
Correction: Lucee uses Lucene directly, not Solr.