Is anyone using SOLR with Railo / Lucee? By that I mean a separate
SOLR server / cluster - not an inbuilt cfsearch / cfcollection.
I can’t seem to see an extension or plugin to do the remote connection
unless I’m missing some config option somewhere.
SOLR fits our needs perfectly (shardable, faceted searching and all
that), but a bit of Stack Overflow seems to intimate we’d need to roll
our own Java-based persistent HTTP connection pool to get good
performance.
It’s something we could consider sponsoring or paying someone to
open-source if need be (CMD are you listening?) but I hope we’d not be
reinventing the wheel…
Maybe the SO post is out of date. If you use cfsolrlib, which uses
Java-based solrj under the covers, you are already using connection pooling.
See this post:
My only issue with cfsolrlib is that it seems to be using the XML format for
indexing and querying, which was good five years ago but is now unnecessary
as Solr supports JSON. Might be worth investing some time to fork Shannon’s
cfsolrlib repo and patch the library to use JSON as an optional format.
I haven’t used ElasticSearch, but it seems to be more popular among new
projects than Solr.
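To make the JSON point concrete, here is a minimal Java sketch of a faceted Solr query that asks for a JSON response instead of XML. It relies only on the standard Solr request parameters `q`, `wt=json`, `facet` and `facet.field`; the class name, host, core and field names are hypothetical, and this is not how cfsolrlib or solrj build their requests.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of cfsolrlib or solrj.
final class SolrJsonQuery {

    // Builds a /select URL that requests a JSON response (wt=json)
    // with facet counts for a single field.
    static String selectUrl(String baseUrl, String core, String query, String facetField) {
        return baseUrl + "/solr/" + core + "/select"
                + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&wt=json"
                + "&facet=true"
                + "&facet.field=" + URLEncoder.encode(facetField, StandardCharsets.UTF_8);
    }
}
```

For example, `SolrJsonQuery.selectUrl("http://localhost:8983", "products", "red roses", "category")` gives a URL that can be fetched with any HTTP client, cfhttp included.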
On Mar 11, 2015, at 8:31 AM, Geoff Parkhurst <@Geoff_Parkhurst> wrote:
Hi folks
Is anyone using SOLR with Railo / Lucee? By that I mean a separate
SOLR server / cluster - not an inbuilt cfsearch / cfcollection.
I can’t seem to see an extension or plugin to do the remote connection
unless I’m missing some config option somewhere.
SOLR fits our needs perfectly (shardable, facetted searching and all
that), but a bit of Stack Overflow seems to intimate we’d need to roll
our own java-based persistent http connection pool to get good
performance.
It’s something we could consider sponsoring or paying someone to
open-source if need be (CMD are you listening?) but I hope we’d not be
reinventing the wheel…
I looked at Jason’s client too a while ago. Nice piece of work, but it
was written for an older ES release and I had issues with 1.x.
Having previously used the embedded Solr in CF9 I ended up writing my
own wrapper which mimics the CF/Solr behaviour. It uses (cf)http and
seems to perform well, but then my needs are fairly small-scale and
I’m not using the full clustering/sharding capability.
On 12 March 2015 at 11:01, Andrew Dixon <@Andrew_Dixon> wrote:
I was actually looking at ES last night and on the ES site there is a link
to this CFML project on Github:
It says it is still beta but there hasn’t been a commit for 5 months, so I’m
not sure what is going on with it. I did tweet at Jason Fill and asked but
I’ve not heard back. It appears however to work ok for what I wanted to do,
but looking in the source it is using http requests, but it honestly didn’t
feel slow, but then I guess it depends what you are doing.
Well, when I spoke with the people at Elastic a couple of years ago they
said that since we use Apache HttpClient the connections are reused by
default, so we’re good (unlike other platforms where a new connection
is created for each request). TBH I never tested that myself because I
never bumped into performance issues.
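A minimal Java sketch of the connection reuse described above. Note the swap: it uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) rather than Apache HttpClient, which is what the message refers to. The class name, index name and timeouts are hypothetical. The idea is the same in both libraries: create one client, share it, and let it keep connections alive between requests.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical wrapper around one shared, long-lived HTTP client.
final class SharedSearchClient {

    // Created once and reused for every call, so the client can keep
    // connections to the search server open instead of reconnecting.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_1_1)
            .connectTimeout(Duration.ofSeconds(2))
            .build();

    // Builds a POST to <baseUrl>/<index>/_search with a JSON query body.
    static HttpRequest buildSearchRequest(String baseUrl, String index, String jsonQuery) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/_search"))
                .timeout(Duration.ofSeconds(5))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonQuery))
                .build();
    }

    // Sends the request through the shared client and returns the raw JSON body.
    static String search(String baseUrl, String index, String jsonQuery)
            throws IOException, InterruptedException {
        HttpResponse<String> response = CLIENT.send(
                buildSearchRequest(baseUrl, index, jsonQuery),
                HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Whether cfhttp gets the same reuse in practice is exactly the thing worth measuring before building anything custom.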
I’m not sure what you mean by “authenticate”. Are you planning to front
Elasticsearch with a proxy server? Are you planning to use Elastic’s
Shield? (I imagine your servers sit behind a firewall and communicate
with each other over a LAN.)
You should definitely run some tests first, and please share your
results with us when you have them.
If performance is an issue then look into the Bulk API that I mentioned
in a previous email on this thread.
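For reference, a rough Java sketch of what a Bulk API request body looks like, assuming the newline-delimited format of Elasticsearch's `_bulk` endpoint: one action line, then one document line, each terminated by a newline. The `_type` field matches the 1.x releases discussed in this thread (later releases dropped it). The class name and sample values are hypothetical, and the index, type and id values are assumed to be plain identifiers that need no JSON escaping.

```java
import java.util.Map;

// Hypothetical helper that assembles a body for POST /_bulk.
final class BulkBody {

    // jsonDocsById maps a document id to its already-serialised JSON source.
    static String build(String index, String type, Map<String, String> jsonDocsById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : jsonDocsById.entrySet()) {
            // Action line: says what to do with the document that follows.
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type)
                .append("\",\"_id\":\"").append(entry.getKey())
                .append("\"}}\n");
            // Source line: the document itself, on a single line.
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }
}
```

Sending many documents in one request like this means one HTTP round trip per batch rather than one per document, which matters far more for bulk inserts and updates than the choice of HTTP client.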
On 12 March 2015 at 15:09, Igal @ Lucee.org <@Igal> wrote:
how many requests per second do you expect?
Well, right now, 5 front-end web servers are maintaining about 200
connections to our DBs and servicing about 200 requests per second.
The majority of those will be front-end catalogue requests - so maybe
30 - 40 cfhttp calls per second per server. (But we get 10x this
traffic at Christmas / Valentine’s - which is why we need the
shardability)
It’s the response time that I’m most interested in maintaining, though.
You’re right, I’d need to do some testing before writing off cfhttp; it
just feels like a lot of setup / authenticate / close connection
traffic (lag) we could do without…
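One way to run the kind of test discussed here is to time the same search call repeatedly and look at percentiles rather than the average. The Java sketch below is a hypothetical harness, not a real benchmark tool: it runs any `Runnable` a fixed number of times and reports nearest-rank percentiles of the elapsed time.

```java
import java.util.Arrays;

// Hypothetical latency probe for comparing connection strategies.
final class LatencyProbe {

    // Runs the call `runs` times and returns the elapsed times in
    // nanoseconds, sorted ascending.
    static long[] measureNanos(Runnable call, int runs) {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            call.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples;
    }

    // Nearest-rank percentile over an ascending array, p in (0, 100].
    static long percentile(long[] sortedSamples, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedSamples.length);
        return sortedSamples[Math.max(0, rank - 1)];
    }
}
```

Passing a `Runnable` that performs one search request, first through a fresh connection and then through a shared client, would show whether the connection setup cost is large enough to matter at 30 - 40 calls per second per server.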
Thanks for that. It’s that underlying connectivity that concerns me; I
can’t shake the feeling that cfhttp is not the right method for
performance due to all the overheads with connection / auth etc.
I think we’d hit performance problems at both ends - getting fast
response times for e-commerce customers, and bulk inserting / updating.
(One of) the PHP SOLR libraries (solarium) has a sub-set of connection
methods (“adaptors”) so you can choose to cURL, or http, or zend etc.
I’ve not yet looked at Jason’s code but if that connection method is
abstracted into its own chunk, perhaps I could build on that with a
persistent Java connection pool or something…
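The adaptor idea could be sketched in Java roughly as follows. This is a hypothetical design, not taken from solarium or from Jason's client: the search code talks to a small transport interface, and the actual connection method (cfhttp-style one-off requests, a pooled Java client, and so on) is swapped in behind it.

```java
// Hypothetical transport abstraction: one method that posts a JSON body
// to a path on the search server and returns the raw response body.
interface SearchTransport {
    String post(String path, String jsonBody);
}

// The search logic depends only on the interface, so the connection
// method can be replaced without touching the calling code.
final class SearchGateway {
    private final SearchTransport transport;

    SearchGateway(SearchTransport transport) {
        this.transport = transport;
    }

    // Sends a JSON query to the _search endpoint of the given index.
    String search(String index, String jsonQuery) {
        return transport.post("/" + index + "/_search", jsonQuery);
    }
}
```

A pooled implementation and a simple one can then be compared side by side, and a stub transport makes the calling code testable without a running search server.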
We’re not yet ready to turn our whole ecom site AJAX’y and make the
client call SOLR / ES directly…
Still digging anyhow - many thanks
On 11 March 2015 at 15:35, Igal @ Lucee.org <@Igal> wrote:
Geoff,
I think that most of us use ElasticSearch instead of SOLR.
see https://www.elastic.co/
Thanks Igal. How are you connecting to an ES instance? Looks to me
like the same scenario as SOLR:
- Use cfhttp (slow - create connection, authenticate, get data, close connection)
- Roll your own persistent connection pool with Java
- Invoke some pre-built wrapped Java driver
Was there a Railo elasticsearch extension at one time or did I imagine
that? Can’t seem to find one under Lucee…
FYI, Lucee 5 will move the search (Lucene) to an extension, so you could
even write an extension that replaces the current Lucene implementation.
Micha
That hasn’t been touched since 2012 and I believe only just got started. I
wouldn’t recommend it! Indeed, I might just take it down. (this is not just
me being defensive about my own code).
D
On 12 March 2015 at 19:12, Alex Skinner <@Alex_Skinner> wrote: