Highlighting search terms in cfsearch

I don’t seem to be able to get search terms highlighted in cfsearch results in the summary field. I have indexed a query and can get results but I can’t highlight them.

The attributes contexthighlightbegin and contexthighlightend are listed in the Lucee docs. In the Adobe docs the terms are stated to be bold in the summary field by default. I have tried setting the attributes to bold tags explicitly, as the highlighting doesn’t appear by default.

I created the collection in Lucee (using either the web admin/search page or CFML):

<cfcollection
action = "create"
collection = "lexicon"
path= "c">

(this created a directory ‘c’ for the index in the same directory)

I then indexed a query…

<cfquery name="getLexicon" datasource="sastra">
    select le_id, le_entry, le_definition from sas_lexicon
</cfquery>

<cfindex
query="getLexicon"
collection="lexicon"
action="Update"
type="Custom"
key="le_id"
title="le_entry"
body="le_definition"
>

<h2>Indexing Complete</h2>

Then I searched the collection using the cfsearch tag and output the results with ColdBox's html.table:

<form action="" method="POST">
    <cfoutput><input name="search" value="#search#"></cfoutput>
    <input type="submit">
</form>

<cfsearch
 name = "mySearch"
collection = "lexicon"
criteria = "#search#"
contexthighlightbegin="<b>"
contexthighlightend="</b>"
contextpassages="1"
contextbytes="500"
maxrows = "100">

<br/>
 
<cfoutput>
    #html.table( data=mySearch, includes="key,title,summary", class="table table-striped" )#
</cfoutput>

But alas no context highlighting in the summary.

Any suggestions most appreciated. Or is this simply not working in the engine?

Also, why is the version of Lucene used so old? Apache Lucene is now at 8.10. Is it worth updating the Lucene version used by Lucee so as to support more languages?

Update: I’ve downloaded the Lucee source and found the file Lucee/core/src/main/java/lucee/runtime/tag/Search.java seems to contain the code for handling the criteria highlighting. I’m not a Java wiz but I’m interested in it. If someone might be able to cast their eye over this file it would be awesome as a functioning full text search would be so useful! It would be great to see the search term highlighted in context.

I’m wondering if there might be an issue setting the additional attributes or status.

I have found a page that lists the criteria syntax for those interested in writing search queries using cfsearch:
https://lucene.apache.org/core/2_9_4/queryparsersyntax.html

I found the following functions in Lucee/core/src/main/java/lucee/runtime/search/AddionalAttrs.java

public AddionalAttrs(int contextBytes, int contextPassages, String contextHighlightBegin, String contextHighlightEnd) {
	this.contextBytes = contextBytes;
	this.contextPassages = contextPassages;
	this.contextHighlightBegin = contextHighlightBegin;
	this.contextHighlightEnd = contextHighlightEnd;
}

public static void setAddionalAttrs(int contextBytes, int contextPassages, String contextHighlightBegin, String contextHighlightEnd) {
	setAddionalAttrs(new AddionalAttrs(contextBytes, contextPassages, contextHighlightBegin, contextHighlightEnd));
}

Does it make sense that setAddionalAttrs() is calling itself with a new object? Maybe I don’t understand Java well, but I thought maybe this might trip up the setting on the context variables, all of which I can’t get to work.
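
On the question above: Java allows several methods to share a name as long as their parameter lists differ (overloading). The four-argument setAddionalAttrs(...) is therefore most likely delegating to a separate one-argument overload rather than calling itself. Here is a minimal sketch of that pattern. The names ContextAttrs, set and current are hypothetical, and the sketch assumes the one-argument overload simply stores the object, since the real one-argument Lucee method is not shown in this thread.

```java
// Hypothetical stand-in for the class quoted above; this is not Lucee's code.
final class ContextAttrs {
    final int contextBytes;
    final int contextPassages;
    final String highlightBegin;
    final String highlightEnd;

    // Assumption: the attributes are simply stored; real storage may differ.
    private static ContextAttrs current;

    ContextAttrs(int contextBytes, int contextPassages,
                 String highlightBegin, String highlightEnd) {
        this.contextBytes = contextBytes;
        this.contextPassages = contextPassages;
        this.highlightBegin = highlightBegin;
        this.highlightEnd = highlightEnd;
    }

    // Overload 1: takes four values, builds an object, delegates to overload 2.
    static void set(int contextBytes, int contextPassages,
                    String highlightBegin, String highlightEnd) {
        // The argument is a ContextAttrs, so the compiler picks set(ContextAttrs).
        // This is not a recursive call to the four-argument method.
        set(new ContextAttrs(contextBytes, contextPassages, highlightBegin, highlightEnd));
    }

    // Overload 2: takes the object itself and stores it.
    static void set(ContextAttrs attrs) {
        current = attrs;
    }

    static ContextAttrs current() {
        return current;
    }

    public static void main(String[] args) {
        set(500, 1, "<b>", "</b>");
        // prints "<b>term</b>"
        System.out.println(current().highlightBegin + "term" + current().highlightEnd);
    }
}
```

If the real code follows this shape, the delegation itself would not stop the context attributes from being set. The problem would then have to lie in how the stored values are read later.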

The first thing I would do is debug the form, as most of the time form.XXXX is being referenced incorrectly by your tag or script.

As I read this:

page 1 (SEARCH) is pushing form.search to page 2 (cfsearch)
page 2 is not looking for form.search and is instead looking for just search

I use the following code (Credit @bennadel ?) to debug items

<cfloop
list="application,session,variables,client,url,form,request,server,cgi"
index="i">
<cfdump var="#evaluate(i)#" label="#i#">
</cfloop>

It would be great to update to the latest version of Lucene, but we just need somebody to volunteer to do it.

Are there instructions how to do it?

Does Lucee use Lucene, while Adobe CF uses Solr?

Thanks Terry,

I used cfparam early to set a default value for the form. I had been experimenting with coldbox and worked out I could run templates with the virtual route without coldbox complaining so in this instance the template was posting to itself and was working.

<cfparam name="form.search" default="lim">

<form action="" method="POST">
    <cfoutput><input name="search" value="#search#"></cfoutput>
    <input type="submit">
</form>

<cfsearch
name = "mySearch"
collection = "lexicon"
criteria = "#search#"
status="r"
suggestions="always"
contextPassages = "1"
ContextHighlightBegin="<b>"
ContextHighlightEnd="</b>"
maxrows = "100">

Thanks for the debugging snippet. I will give it a go! :slight_smile:

Hi @Zackster, I’m pretty rusty on a lot of fronts but interested in getting it going. I haven’t built large Java applications, but I’d like to try to build Lucee, and potentially figure out what is happening here and try to get a newer version going. It all requires a lot of fiddling and experimenting!

I downloaded Maven and the latest Oracle SE JDK (17) onto an Ubuntu 20.04 server. I’m using Visual Studio Code with the MS Java plugins installed on Windows, connecting to the Ubuntu VM with the remote explorer over ssh.

I tried to build Lucee with Maven but I got an error which I believe relates to the javax.script JavaScript engine (Nashorn) having been removed from recent JDKs, so perhaps I need an older version.

Is there a cheat sheet or guide for building Lucee that you know of? In other words what can I download to produce a working build of Lucee?

Some things would be marvelously useful (which may exist!?):

  • A description of the Lucee application. How it works and its structure. How to add to it.
  • A one-pager explaining how to build Lucee on your own machine, including what software, which versions, where to download and what to do!
  • A video on the Lucee development environment setup and build process
  • Information about how to contribute

I can’t promise a result but I used to do lots of CF up to around CF8 and I’m just getting back to it and having fun. I’d like to learn more.

Thanks to everyone pushing Lucee forward. I have found it very heartening to know CFML doesn’t rely on a corporation but a community and seems to have a bright future.

I think the latest supported Java version is 11; see
https://luceeserver.atlassian.net/browse/LDEV-3526

Did you also try it with Ant? I did some builds successfully with Ant. I was also never able to build Lucee with Maven, but my Java experience is pretty limited.

I’ve managed to create a dev environment with Eclipse to play around with the Lucee source in the past, but I didn’t find out how to “hot update” the classes of the Lucee engine in Tomcat. Building the complete source takes time.

I’ve seen that there is a Lucee debug loader in the code, and I’ve managed to run it, but also without hot update capability.

It’s a bit complicated with extensions, but I have developed a toolchain which allows building/compiling an extension and running the relevant Lucee test cases using a simple bash/batch file.

There is already an automated GitHub action in place, so if you clone that repo and commit a change it will automatically do the whole process for you.

To compile any lucee extension, you just need to run ant in the root directory.

I’ll write up some documentation tomorrow.

Use the -DdeployLco ant option (see the above building docs) to hot deploy, which just drops the .lco file into the /deploy folder, where it is auto-installed every 60s.

Hi @andreas, I haven’t tried with ant yet. Thank you. I will give that a try. Wish me luck. :slight_smile:

I have been reading about changes to Java in recent years. It sounds like there is a lot to keep on top of. The javax and jakarta namespace changes seem like interesting developments. I don’t understand the impact on Lucee but it would make sense to be built with free software and the future unencumbered by creative lawyers.

I was using Oracle SE JDK (17) but it ran into a snag building with maven.

It is the same as in this thread: “Build failure Lucee Java returned: 1”.

I’ll try ant but I read maven is preferred by some.

I’m rusty with Java too… Although I lived in Jakarta for a long time :slight_smile:

I’m not sure this is the place for it but I managed to build Lucee with ant on Ubuntu Server 20.04.

edward@cfml:~/src/Lucee/loader$ mvn -v
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
Apache Maven 3.8.3 (ff8e977a158738155dc465c6a97ffaf31982d739)
Maven home: /opt/apache-maven-3.8.3
Java version: 11.0.11, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
Default locale: en_US, platform encoding: UTF8
OS name: "linux", version: "5.4.0-88-generic", arch: "amd64", family: "unix"

edward@cfml:~/src/Lucee/loader$ ant -version
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
Apache Ant(TM) version 1.10.11 compiled on July 10 2021

With Java SE JDK 17:

$ ant fast
/home/edward/src/Lucee/loader/build.xml:55: The following error occurred while executing this line:
/home/edward/src/Lucee/ant/build-core.xml:89: Unable to create javax script engine for javascript

BUILD FAILED
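
The message above suggests that Ant asked the running JDK for a JavaScript engine through javax.script and did not get one. The Nashorn engine that used to answer such requests was removed from the JDK in version 15, which would fit the failure on 17 and the success on 11. Here is a small sketch for checking a given JVM. The class name ScriptEngineCheck is made up for illustration, and the output depends on the JDK it runs under.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

// Hypothetical helper; not part of Lucee or Ant.
final class ScriptEngineCheck {

    // True if this JVM can supply a javax.script engine registered under the given name.
    static boolean hasEngine(String name) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(name);
        return engine != null; // getEngineByName returns null when no engine matches
    }

    // One-line summary including the Java version in use.
    static String describe(String name) {
        String javaVersion = System.getProperty("java.version");
        return "Java " + javaVersion + ": script engine '" + name + "' is "
                + (hasEngine(name) ? "available" : "NOT available");
    }

    public static void main(String[] args) {
        System.out.println(describe("javascript"));
    }
}
```

Running it once per installed JVM (for example with each path listed by update-alternatives) shows which ones can serve a build step that needs a JavaScript engine.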

I installed a few JVMs. I saw on the forum that 11 was used to produce a build so I gave up on 17.

edward@cfml:~/src/Lucee/loader$ sudo update-alternatives --list java
/usr/lib/jvm/java-11-openjdk-amd64/bin/java
/usr/lib/jvm/jdk-11.0.12/bin/java
/usr/lib/jvm/jdk-17/bin/java

I set java to be /usr/lib/jvm/java-11-openjdk-amd64/bin/java

Running ant returned:

[javac] /home/edward/src/Lucee/ant/build-core.xml:380: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds

So I added to loader/build.xml

  <presetdef name="javac">
    <javac includeantruntime="false" />
  </presetdef>

I re-ran ant… looks like just a difference in the number of decimal places in the failing tests

$ ant 

.
.
.

     [java]    [script] Failed: test.functions.ACos
     [java]    [script]         checking acos() function
     [java]    [script]         Expected [0.795398830184] but received [0.7953988301841436]
     [java]    [script]                 /home/edward/src/Lucee/test/functions/ACos.cfc:6
     [java]    [script]                 /home/edward/src/Lucee/test/_testRunner.cfc:328
     [java]    [script]                 /home/edward/src/Lucee/test/run-tests.cfm:201
     [java]    [script] 
     [java]    [script] Failed: test.functions.ASin
     [java]    [script]         Checking ASin() function
     [java]    [script]         Expected [0.304692654015] but received [0.3046926540153975]
     [java]    [script]                 /home/edward/src/Lucee/test/functions/ASin.cfc:5
     [java]    [script]                 /home/edward/src/Lucee/test/_testRunner.cfc:328
     [java]    [script]                 /home/edward/src/Lucee/test/run-tests.cfm:201
     [java]    [script] 
     [java]    [script] Failed: test.functions.Atn
     [java]    [script]         Checking Atn() function
     [java]    [script]         Expected [0.291456794478] but received [0.2914567944778671]
     [java]    [script]                 /home/edward/src/Lucee/test/functions/Atn.cfc:5
     [java]    [script]                 /home/edward/src/Lucee/test/_testRunner.cfc:328
     [java]    [script]                 /home/edward/src/Lucee/test/run-tests.cfm:201
     [java]    [script] 
     [java]    [script] Failed: test.functions.Cos
     [java]    [script]         checking Cos() function
     [java]    [script]         Expected [-0.9899924966] but received [-0.9899924966004454]
     [java]    [script]                 /home/edward/src/Lucee/test/functions/Cos.cfc:5
     [java]    [script]                 /home/edward/src/Lucee/test/_testRunner.cfc:328
     [java]    [script]                 /home/edward/src/Lucee/test/run-tests.cfm:201
     [java]    [script] 
     [java]    [script] Failed: test.functions.CreateTimeSpan
     [java]    [script]         checking CreateTimeSpan() function
     [java]    [script]         Expected [1.042372685185:] but received [1.0423726851851853:]
     [java]    [script]                 /home/edward/src/Lucee/test/functions/CreateTimeSpan.cfc:5
     [java]    [script]                 /home/edward/src/Lucee/test/_testRunner.cfc:328
     [java]    [script]                 /home/edward/src/Lucee/test/run-tests.cfm:201
     [java]    [script] 
     [java]    [script] Failed: test.tickets.LDEV1824
     [java]    [script]         checking Numeric member function
     [java]    [script]         Expected [1.047197551197] but received [1.0471975511965979]
     [java]    [script]                 /home/edward/src/Lucee/test/tickets/LDEV1824.cfc:11
     [java]    [script]                 /home/edward/src/Lucee/test/_testRunner.cfc:328
     [java]    [script]                 /home/edward/src/Lucee/test/run-tests.cfm:201

FAILS.
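
To check the "just a difference in the number of decimal places" reading, the received values can be rounded to 12 decimal places and compared with the expected strings. The sketch below does that for five pairs copied from the log. The class name RoundingCheck and the choice of HALF_UP rounding are assumptions for illustration, and nothing here shows how the Lucee test suite itself compares values.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper for inspecting the log values; not part of the Lucee tests.
final class RoundingCheck {

    // Round to 12 decimal places and drop trailing zeros.
    static String round12(double value) {
        return BigDecimal.valueOf(value)
                .setScale(12, RoundingMode.HALF_UP)
                .stripTrailingZeros()
                .toPlainString();
    }

    public static void main(String[] args) {
        // Expected strings and received values copied from the failing tests above.
        String[] names    = { "ACos", "ASin", "Atn", "Cos", "LDEV1824" };
        String[] expected = { "0.795398830184", "0.304692654015", "0.291456794478",
                              "-0.9899924966", "1.047197551197" };
        double[] received = { 0.7953988301841436, 0.3046926540153975, 0.2914567944778671,
                              -0.9899924966004454, 1.0471975511965979 };

        for (int i = 0; i < names.length; i++) {
            String rounded = round12(received[i]);
            System.out.println(names[i] + ": rounded=" + rounded
                    + " matches=" + rounded.equals(expected[i]));
        }
    }
}
```

All five pairs agree once rounded, which is consistent with the failures being about output precision rather than wrong results.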

$ export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8
$ ant fast

→ [javac] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8

not complaining about US ASCII now…


BUILD SUCCESSFUL


Fire all trumpets!

Lots of handy information here. This is great.

That is AWESOME @Zackster !!! Really great! Going to test it as soon as I can!

If you are interested in an alternative, client-side solution, I recommend using the Mark.js JavaScript library to perform highlighting.

Mark.js “can be used to dynamically mark search terms or custom regular expressions and offers you built-in options like diacritics support, separate word search, custom synonyms, iframes support, custom filters, accuracy definition, custom element, custom class name and more.”

I like this because I can make highlighting interactive on a single webpage that uses ajax without depending on the server to hard-code the highlights using HTML.
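
For comparison, the same idea of wrapping matched terms in a marker element can be applied on the server by post-processing each summary before output. The Java sketch below shows that. The names SummaryHighlighter and highlight are made up for illustration, and this is not what cfsearch does internally. It matches the literal term case-insensitively and ignores Lucene query syntax such as wildcards or boolean operators.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a workaround sketch, not Lucee's highlighting code.
final class SummaryHighlighter {

    // Wrap every case-insensitive occurrence of term in text with begin/end markers.
    static String highlight(String text, String term, String begin, String end) {
        if (text == null || term == null || term.isEmpty()) {
            return text; // nothing to highlight
        }
        // Pattern.quote treats the term as a literal string, not as a regex.
        Pattern pattern = Pattern.compile(Pattern.quote(term),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        // $0 is the matched text itself, so the original casing is preserved.
        String replacement = Matcher.quoteReplacement(begin) + "$0"
                + Matcher.quoteReplacement(end);
        return pattern.matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // prints "<b>Lim</b> is a <b>lim</b>it"
        System.out.println(highlight("Lim is a limit", "lim", "<b>", "</b>"));
    }
}
```

If the summary text can contain markup, it should be HTML-escaped before the markers are added, so that only the inserted tags are rendered as HTML.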