Solr

Geoff_Parkhurst · March 11, 2015, 3:31pm

Hi folks

Is anyone using SOLR with Railo / Lucee? By that I mean a separate
SOLR server / cluster - not an inbuilt cfsearch / cfcollection.

I can’t seem to see an extension or plugin to do the remote connection
unless I’m missing some config option somewhere.

SOLR fits our needs perfectly (shardable, faceted searching and all
that), but a bit of Stack Overflow seems to intimate we’d need to roll
our own java-based persistent http connection pool to get good
performance.

It’s something we could consider sponsoring or paying someone to
open-source if need be (CMD are you listening?) but I hope we’d not be
reinventing the wheel…

Any input appreciated.

Best,
Geoff

Igal · March 11, 2015, 3:35pm

Geoff,

I think that most of us use ElasticSearch instead of SOLR.
see https://www.elastic.co/

Igal Sapir
Lucee Core Developer
Lucee.org http://lucee.org/

Geoff_Parkhurst · March 11, 2015, 10:13pm

Many thanks for the input all. I’ll definitely take a look at
ElasticSearch and the cfsolrlib.

The SO question was this by the way - not that old (2013) - someone
trying to connect ACF10 to ElasticSearch as it turns out:

Understanding Persistent HTTP Connections in ColdFusion - Stack Overflow

–
You received this message because you are subscribed to the Google Groups
“Lucee” group.
To unsubscribe from this group and stop receiving emails from it, send an
email to lucee+unsubscribe@googlegroups.com.
To post to this group, send email to lucee@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/lucee/CAC0HRKmXWD4RuZC0xCaxieFoFzpBcFmjMM6Fq0R0ceXvFX0z1w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Robert_Munn · March 11, 2015, 7:30pm

Maybe the SO post is out of date. If you use cfsolrlib, which uses Java-based solrj under the covers, you are already using connection pooling. See this post:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201311.mbox/<5294B1E2.6000202@elyograg.org>

My only issue with cfsolrlib is that it seems to be using the XML format for indexing and querying, which was good five years ago but is now unnecessary as Solr supports JSON. Might be worth investing some time to fork Shannon’s cfsolrlib repo and patch the library to use JSON as an optional format.
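To illustrate the JSON route: indexing over JSON is a plain HTTP POST to the core's update handler. This is a minimal sketch, not cfsolrlib code. It assumes a JDK 11+ `java.net.http` client, a Solr instance on `localhost:8983` and a core called `products`; the class name, method name and sample document are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class SolrJsonUpdateSketch {

    // Builds (but does not send) a JSON update request for one Solr core.
    // baseUrl and core are placeholders, not values from this thread.
    public static HttpRequest buildUpdateRequest(String baseUrl, String core, String jsonDocs) {
        URI uri = URI.create(baseUrl + "/" + core + "/update?commit=true");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonDocs))
                .build();
    }

    public static void main(String[] args) {
        // The update handler accepts a JSON array of documents.
        String docs = "[{\"id\":\"1\",\"name\":\"Red roses\"}]";
        HttpRequest request = buildUpdateRequest("http://localhost:8983/solr", "products", docs);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request is then a single `HttpClient.send(...)` call on whatever client the application already shares.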

If you want to roll your own, you could use:

http://adiabata.com/cfx_http5.cfm

or

http://hc.apache.org/httpcomponents-client-ga/index.html

I haven’t used ElasticSearch, but it seems to be more popular among new projects than Solr.

Geoff_Parkhurst · March 12, 2015, 10:22am

Thanks Igal. How are you connecting to an ES instance? Looks to me
like the same scenario as SOLR:

- Use cfhttp (slow - create connection, authenticate, get data, close
  connection)
- Roll your own persistent connection pool with Java
- Invoke some pre-built wrapped Java driver
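
The second option can be sketched in a few lines. This is only an illustration under stated assumptions: it swaps in the JDK 11+ built-in `java.net.http.HttpClient` (which keeps connections alive and reuses them when one instance is shared) rather than any library named in this thread, and the host, core name and class name are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

public class PersistentSearchClientSketch {

    // One client for the whole application: connections are kept alive
    // and reused across requests instead of being opened per call.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2))
            .build();

    // Builds a Solr-style select URL; baseUrl and core are placeholders.
    public static URI buildSelectUri(String baseUrl, String core, String query) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + core + "/select?wt=json&q=" + q);
    }

    // Sends the query over the shared client and returns the raw response body.
    public static String search(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) {
        // Only prints the URL; call search(uri) once a server is reachable.
        URI uri = buildSelectUri("http://localhost:8983/solr", "products", "red roses");
        System.out.println(uri);
    }
}
```

From CFML the class could be loaded as a Java object and kept in a long-lived scope, so every request goes through the same client.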

Was there a Railo elasticsearch extension at one time or did I imagine
that? Can’t seem to find one under Lucee…

Many thanks
Geoff

Julian_Halliwell · March 12, 2015, 11:39am

I looked at Jason’s client too a while ago. Nice piece of work, but it
was written for an older ES release and I had issues with 1.x.

Having previously used the embedded Solr in CF9 I ended up writing my
own wrapper which mimics the CF/Solr behaviour. It uses (cf)http and
seems to perform well, but then my needs are fairly small-scale and
I’m not using the full clustering/sharding capability.

On 12 March 2015 at 11:01, Andrew Dixon <@Andrew_Dixon> wrote:

I was actually looking at ES last night and on the ES site there is a link
to this CFML project on Github:

https://github.com/jasonfill/ColdFusion-ElasticSearch-Client

It says it is still beta but there hasn’t been a commit for 5 months, so I’m
not sure what is going on with it. I did tweet at Jason Fill and asked but
I’ve not heard back. It appears however to work ok for what I wanted to do,
but looking in the source it is using http requests, but it honestly didn’t
feel slow, but then I guess it depends what you are doing.

Igal · March 12, 2015, 9:33pm

well, when I spoke with the people at Elastic a couple of years ago they
said that since we use Apache HttpClient the connections are reused by
default, so we’re good (unlike other platforms where a new connection
was created for each request). TBH I never tested that myself because I
never bumped into performance issues.

I’m not sure what you mean by “authenticate”? are you planning to front
elasticsearch with a proxy server? are you planning to use elastic’s
Shield? (I imagine your servers sit behind a firewall and are
communicating between themselves on a LAN).

you should definitely run some tests first, and please share with us
your results when you have them.
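
A test like that does not need much tooling. The sketch below is one possible harness, not something used by anyone in this thread: it times repeated calls and reports nearest-rank percentiles. The class and method names are made up, and the empty task is a stand-in for a real search call.

```java
import java.util.Arrays;

public class LatencySketch {

    // Nearest-rank percentile over a sorted copy of the samples.
    public static long percentile(long[] samples, double pct) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    // Runs the task `runs` times and records each duration in nanoseconds.
    public static long[] measure(Runnable task, int runs) {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        // Placeholder task: swap in a real cfhttp-equivalent or client call.
        long[] samples = measure(() -> { }, 1000);
        System.out.println("median ns: " + percentile(samples, 50));
        System.out.println("p95 ns:    " + percentile(samples, 95));
    }
}
```

Comparing the median and the 95th percentile for each connection method says more about response times under load than a single average would.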

if performance is an issue then look into the Bulk API that I mentioned
in a previous email on this thread.

Igal Sapir
Lucee Core Developer
Lucee.org http://lucee.org/

Geoff_Parkhurst · March 12, 2015, 12:34pm

Thanks for that. It’s that underlying connectivity that concerns me; I
can’t shake the feeling that cfhttp is not the right method for
performance due to all the overheads with connection / auth etc.

I think we’d hit performance problems at both ends - getting fast
response times for e-commerce customers, and bulk inserting / updating.

(One of) the PHP SOLR libraries (solarium) has a sub-set of connection
methods (“adaptors”) so you can choose to cURL, or http, or zend etc.

I’ve not yet looked at Jason’s code but if that connection method is
abstracted into its own chunk, perhaps I could build on that with a
persistent java connection pool or something…

We’re not yet ready to turn our whole ecom site AJAX’y and make the
client call SOLR / ES directly…

Still digging anyhow - many thanks

Geoff_Parkhurst · March 12, 2015, 9:20pm

Well, right now, 5 front-end web servers are maintaining about 200
connections to our DBs and servicing about 200 requests per second.

The majority of those will be front-end catalogue requests - so maybe
30 - 40 cfhttp calls per second per server. (But we get 10x this
traffic at Christmas / Valentine’s - which is why we need the
shardability)

It’s the response time that I’m most interested in maintaining, though.

You’re right, I’d need to do some testing before writing off cfhttp. It
just feels like a lot of setup / authenticate / close connection
traffic (lag) we could do without…

On 12 March 2015 at 15:09, Igal @ Lucee.org <@Igal> wrote:

how many requests per second do you expect?

Igal · March 12, 2015, 3:09pm

I’ve toyed with the Bulk API in the past –
http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/bulk.html
– but TBH the http client is rather efficient and I mostly use it, so
test it before you conclude that it’s too slow.
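
For context, the link above is to the Java API, but the same batching idea is available over plain HTTP through the `_bulk` endpoint, which takes newline-delimited JSON: one action line, then one source line, per document. The sketch below only builds such a body. It is an illustration under assumptions: the index name, ids and class name are hypothetical, the ids are assumed to need no JSON escaping, and older releases may also expect a `_type` in the action line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BulkBodySketch {

    // Builds a newline-delimited bulk body from id -> JSON source pairs.
    // Assumes index and ids are plain identifiers that need no escaping.
    public static String buildBulkBody(String index, Map<String, String> docsById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> doc : docsById.entrySet()) {
            // Action line: what to do and where.
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(doc.getKey()).append("\"}}\n");
            // Source line: the document itself, on a single line.
            body.append(doc.getValue()).append('\n');
        }
        // The body must end with a newline, which the loop guarantees.
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"name\":\"Red roses\"}");
        docs.put("2", "{\"name\":\"White lilies\"}");
        System.out.print(buildBulkBody("products", docs));
    }
}
```

One POST of this body replaces many single-document requests, which is where the saving on bulk inserts and updates would come from.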

how many requests per second do you expect?

Igal Sapir

Lucee Core Developer
Lucee.org http://lucee.org/

andrew · March 12, 2015, 11:01am

Hi Geoff,

I was actually looking at ES last night and on the ES site there is a link
to this CFML project on Github:

https://github.com/jasonfill/ColdFusion-ElasticSearch-Client

It says it is still beta but there hasn’t been a commit for 5 months, so
I’m not sure what is going on with it. I did tweet at Jason Fill and asked
but I’ve not heard back. It appears however to work ok for what I wanted to
do, but looking in the source it is using http requests, but it honestly
didn’t feel slow, but then I guess it depends what you are doing.

Kind regards,

Andrew
about.me http://about.me/andrew_dixon
mso http://www.mso.net - Lucee http://lucee.org - Member

alexskinner · March 12, 2015, 7:12pm

I recommend this

https://github.com/DominicWatson/cfelasticsearch

micstriit · March 13, 2015, 4:00pm

FYI, Lucee 5 will move the search (Lucene) to an extension, so you could
even write an extension that replaces the current Lucene implementation.

Micha

Dominic_Watson · March 13, 2015, 10:03am

On 12 March 2015 at 19:12, Alex Skinner <@Alex_Skinner> wrote:

I recommend this

GitHub - DominicWatson/cfelasticsearch: A CFML wrapper (and more) to http://www.elasticsearch.org/. Providing stupendously simple and powerful search engine building in CFML applications.

That hasn’t been touched since 2012 and I believe only just got started. I
wouldn’t recommend it! Indeed, I might just take it down. (this is not just
me being defensive about my own code).

D