einnocent (19)
#1
No mention of Terracotta is made in the book. Some info on how to integrate Hibernate Search with TC would be great, especially since TC's memory distribution seems like a natural solution for a Lucene Directory in a clustered environment.

To be fair, TC doesn't yet support HS, though Compass does, and apparently the solution that allows Compass to work with TC also works with HS. Not surprising, since it's a TC Directory object that interfaces with Lucene independently of the framework.

https://jira.terracotta.org/jira/browse/CDV-558
emmanuel.bernard (101)
#2
Re: With Terracotta
Terracotta and JBoss Cache both offer this kind of replicated / distributed Lucene Directory. But due to the nature of the Lucene index and the need for a global pessimistic lock, there is little advantage in using these approaches IMO. As a matter of fact, I have never seen such deployments. I will add a mention of this kind of approach in the Clustering chapter.

One cool thing that JBoss Cache does now is the ability to search the actual cache in full-text mode thanks to Hibernate Search; that is pretty cool.
emmanuel.bernard (101)
#3
Re: With Terracotta
BTW, for the sake of completeness, here are a few pros and cons.
If your index does not change too often and is small enough to fit in memory, it's a nice way to replicate it.
When the index is updated frequently, the pessimistic lock is acquired across the whole cluster, which can lead to scalability issues.
When the index is too big, the "virtual heap space" faults frequently, leading to a lot of network traffic.
ikarzali (3)
#4
Re: With Terracotta
With the TC Directory, there is no global pessimistic lock. Compass and Lucene on TC both perform VERY WELL.

Cheers,

--Ari
emmanuel.bernard (101)
#5
Re: With Terracotta
If there is no global pessimistic lock, then Lucene won't work. The Directory data structure has to be protected by it, and Lucene does acquire a pessimistic lock.

From what I know of Terracotta, they bytecode-enhance the JDK (which, BTW, is unfortunately not allowed by the Java license - but that's another story). In particular, the concurrent package is enhanced to transform intra-VM locks into inter-VM locks. Since Shay uses a ConcurrentHashMap to store Lucene locks, this map is replicated across the Terracotta cluster and the semantics of ConcurrentHashMap are respected => the Lucene lock is shared across the cluster.
I am not entirely certain a ConcurrentHashMap is good enough, though, as you are only guaranteed to see the last committed change while another change could be happening in parallel. It's likely to be a small window, but it is unsafe. Using either synchronized blocks or a Lock object (including when reading) would make the code safe, but that is probably slower on a Terracotta cluster (the semantics are much stronger and not easy to implement cluster-wide in an efficient manner).

http://svn.compass-project.org/svn/compass/trunk/src/main/src/org/compass/needle/terracotta

This is fine for low-write applications.
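
For illustration, a map-backed Lucene Lock along the lines discussed above might look roughly like this (just a sketch of the idea, not the actual Compass code; the class and field names are made up, and it assumes the Lucene 2.x Lock API):

    import java.util.concurrent.ConcurrentHashMap;
    import org.apache.lucene.store.Lock;

    // Sketch: a Lucene Lock whose state lives in a ConcurrentHashMap.
    // Under Terracotta the map (and thus the lock entries) would be shared
    // across the cluster; locally it behaves like a plain in-VM lock.
    public class MapBackedLock extends Lock {
        private static final Object MARKER = new Object();
        private final ConcurrentHashMap<String, Object> locks;
        private final String name;

        public MapBackedLock(ConcurrentHashMap<String, Object> locks, String name) {
            this.locks = locks;
            this.name = name;
        }

        public boolean obtain() {
            // putIfAbsent is atomic: only one thread/node wins the lock
            return locks.putIfAbsent(name, MARKER) == null;
        }

        public void release() {
            locks.remove(name);
        }

        public boolean isLocked() {
            // a plain read: fine only as long as nobody relies on it for correctness
            return locks.containsKey(name);
        }
    }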
emmanuel.bernard (101)
#6
Re: With Terracotta
Oops, I hadn't finished. Assuming the lock semantics are respected cluster-wide, performance will likely be totally fine for low- or medium-write applications.

The part I haven't really tested is the performance difference between an in-memory distributed directory (with page faults when the Directory is too big) and traditional local file system access, for a variety of Directory sizes. If the in-memory model is good, then it would be worth proposing it in Hibernate Search as an alternative to the FSMaster / FSSlave DirectoryProviders (the lock problem being solved by the JMS approach in HSearch).
manik (2)
#7
Re: With Terracotta
CHMs don't provide adequate locking for Lucene Directory impls. Lucene writes indexes by:

1. Acquiring a write lock (WL)
2. Reading indexes
3. Updating the indexes read
4. Writing the indexes
5. Releasing the WL

CHMs (and any distributed synchronization based on a CHM's method boundaries) will only guarantee thread safety for steps 2 and 4, independently of each other. That is no good, and it completely breaks the guarantees Lucene expects when it calls lock and unlock on the directory impl.
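
To spell that out with a rough sketch (pseudo-Java, not real Lucene internals; the read/apply/write helper methods are made up), the whole sequence has to sit inside one cluster-wide critical section; making steps 2 and 4 individually safe does not make the combination atomic:

    Lock writeLock = directory.makeLock("write.lock");
    if (!writeLock.obtain()) {
        throw new IOException("index is locked by another writer");
    }
    try {
        SegmentInfos infos = readSegmentInfos(directory);  // step 2: read
        applyPendingChanges(infos);                        // step 3: update what was read
        writeSegmentInfos(directory, infos);               // step 4: write back
    } finally {
        writeLock.release();                               // step 5
    }
    // A CHM can make the read and the write individually safe, but nothing
    // stops another node from committing between them unless the surrounding
    // lock itself is honored cluster-wide.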
manik (2)
#8
Re: With Terracotta
The JBC-based directory I have implemented uses tokens to demarcate sync boundaries, and these are cluster-wide. This is OK for low- to medium-write apps, as you said.

I've got a more efficient distributed lock impl based on cooperative locks in the pipeline (for some daft reason a lot of people want one!) which will make the JBC-backed directory quicker, but the fundamental scalability issue with the very concept of a cluster-wide lock still exists.

I did consider a more optimistic approach to implementing the directory, a distributed analogue of a CAS spin-lock if you want to call it that, but the Directory interface doesn't help at all in that there is no support for retries. My next plan is to fix the Directory interface (extend it, perhaps) and submit it to Lucene, but it may take a while before that is accepted and usable. Once it is available, optimistic approaches would work, which would make things scale pretty well in a cluster.
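
Purely to illustrate the idea, such an optimistic approach might look something like this (entirely hypothetical; none of these methods exist on today's Directory interface, which is exactly the limitation mentioned above):

    // Hypothetical optimistic commit loop; retry support would first have to be
    // added to (or layered on top of) the Directory contract.
    while (true) {
        IndexSnapshot current = clusteredDirectory.readSnapshot();   // versioned view of the index
        IndexSnapshot updated = applyPendingWrites(current);         // do the work without holding a lock
        if (clusteredDirectory.commitIfUnchanged(current.version(), updated)) {
            break;  // CAS-style commit succeeded
        }
        // another node committed first: throw away our snapshot and retry
    }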

Cheers,
Manik
emmanuel.bernard (101)
#9
Re: With Terracotta
Manik,
Compass stores the lock name in a CHM, so step 3 is mostly safe too (it uses putIfAbsent and remove).
The code is safe as long as nobody relies on isLocked(), which only does a read. Digging through the Lucene code more, only NativeFSLock makes active use of isLocked(). The isLocked method is also exposed in the public API. So a CHM (where a lock is represented as an entry) is safe as long as you don't use NativeFSLock and your application makes no sensitive use of isLocked().

I guess I was a bit too alarmist.

I think that's the technique you are using in JBoss Cache too.

It would be nice if Lucene qualified the "quality" required of isLocked(). The Javadoc seems to indicate that the constraints are loose.

Interesting stuff.
eellis (1)
#10
Re: With Terracotta
Hi guys,

I'm a former Terracotta development engineer who worked on early attempts to create a Terracotta Integration Module (TIM) for Lucene. This was several years ago, so the implementation may have improved, as Ari stated, but in principle my findings were exactly on par with what Emmanuel is saying. Sharing the RAMDirectory across a cluster (clustered ConcurrentHashMap) works fine if all you want to do is cluster Lucene for the sake of having residual support for another clustered app that doesn't heavily rely on Lucene - for example, to cluster JIRA. If your goal of having Lucene in a cluster is to distribute load and improve performance for heavy read/write-intensive applications... you may be in for a disappointment.

My testing (limited and several years ago) showed that real-time distributed locking/updates to an index under heavy load, as Emmanuel has stated, produce far too much network traffic and latency. At the time there was no simple solution to this incredibly difficult problem.

My best suggestion for scaling Lucene (implemented while working for a large ISP) would be to take the more difficult route of letting go of the support frameworks (Hibernate, etc.) and having Lucene run standalone on a dedicated set of servers, possibly maintaining their indexes independently or with some sort of batched multicast for updates. Possibly RMI or HTTP to some rudimentary load balancer choosing from, say, three Lucene search servers. Basically, run in parallel or scale up (not out).
kimchy (1)
#11
Re: With Terracotta
The Compass implementation of the Lucene Directory does not use Lucene's RAMDirectory, but instead uses a much more optimized memory-based directory that works well with Terracotta (and should also work better as a pure RAM directory...).

It works well with Lucene; as Emmanuel found, isLocked is not required to be implemented. Also, there is an experimental lock factory based on Terracotta's internal ManagerUtil class. In any case, with the Terracotta and Compass integration there are no broken lock semantics that might break Lucene; it actually works very well.

The performance itself should be pretty good, also for write-heavy operations, by the way. I got very good numbers with the Compass implementation (don't use the one that comes with Terracotta, which is based on RAMDirectory). Locking semantics in Lucene are the same regardless of where you store your index, and you solve that by sharding your index. Sharding works even better when you have an in-memory image of the index.
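
As a side note, the read side of sharding is cheap in plain Lucene: each shard is its own Directory with its own write lock, and searches simply aggregate the shards. A minimal sketch, assuming the Lucene 2.x API (the shardDir variables are placeholders for the per-shard Directory instances):

    // Each shard has its own Directory, IndexWriter and write lock,
    // so writers on different shards never contend with each other.
    IndexReader[] shards = new IndexReader[] {
        IndexReader.open(shardDir0),
        IndexReader.open(shardDir1),
        IndexReader.open(shardDir2)
    };
    IndexSearcher searcher = new IndexSearcher(new MultiReader(shards));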

With Terracotta (and also with GigaSpaces/Coherence, but in a different manner), you don't have to have the whole index in memory.

Last, all of Compass's directory implementations can be used completely on their own, or integrated with pure Lucene, Hibernate Search, Solr, etc.

Cheers,
Shay Banon
Compass
ikarzali (3)
#12
Re: With Terracotta
Right. As Kimchy (Shay Banon, the inventor of Compass) explains, we wrote a new TCDirectory for Lucene / Compass. The locking strategy is totally different now. The RAMDirectory implemented by eellis was completely drop-in, but once we built tools to visualize the lock and locality issues in a clustered application, Shay (kimchy) quickly found that RAMDirectory didn't work well at all in high-write scenarios. That necessitated using MgrUtil to acquire and release locks explicitly, not transparently, and we moved from a global write lock (which was, IIRC, acquired for every byte written to an index) to a fine-grained write lock per index, acquired once for the entire index update. We even broke the once-required global lock into striped locks across map segments.

It went over 100X faster than RAMDirectory and is plenty fast for people to use now.

Sorry, but I am hazy on the details. Shay did this work while we were at a conference together, sitting on the floor, in the span of 3 hours end to end. It was quite remarkable to watch. But all the assertions that Lucene clustering is slow / bad / can be made better on TC were correct for eellis's implementation and are no longer valid claims if one takes the TCDirectory from inside Compass and uses it with vanilla Lucene (or just upgrades to Compass).

Cheers,

--Ari
ikarzali (3)
#13
Re: With Terracotta
BTW, emmanuel,

Terracotta's relationship with Sun is in good standing; we are partners. Asserting that we violate the license is not good form, and it is not true. Please refrain from doing so in the future.

I would appreciate it.

--Ari
einnocent (19)
#14
Re: With Terracotta
Wow! This thread has been really informative. Thanks to everyone for their input.

If I may attempt to summarize, it looks like the clustering issue can be broken down into two basic issues (assuming you follow the master/slave model):
1 -- Informing the master of data changes so that it may update the Lucene index.
2 -- Propagating the updated Lucene index to the slaves.

I list below the solutions I've come across for each, both in HSiA and in discussion at work (I have to implement a distributed search solution).

Issue #1: Informing the master
a -- Hibernate Search+JMS [in HSiA]. Nearly-immediate index updates on master; out-of-the-box solution.
b -- Terracotta Hibernate Module propagates object changes across the cluster. Seems like this should work, but I have not done a proof of concept. Index updates should also be nearly immediate.
c -- A timed job (with Quartz, etc.) on the master that re-indexes based on lastModifiedDate (or whatever) at whatever interval you choose.

Issue #2: Propagating the index
n -- FS[Master/Slave]DirectoryProvider [in HSiA]. Rsync-ish file copy after a configurable refresh timeout. Out of the box with HS. Updates not immediate. (See the configuration sketch after this list.)
o -- TCDirectory: thread contributors claim this works well. I believe them. Index updates immediate. Requires TC.
p -- JdbcDirectory: updates immediately but doesn't seem suitable for heavy traffic. See HSiA 10.1 circa 18 Aug 2008. (Seems like this might work better if it used tables/rows/columns instead of BLOBs? Perhaps easier said than done?)
q -- rsync indexes: probably not suitable for large indexes unless you're sharding. (??) Not really necessary when you've got "option n".
r -- SAN: If you got the hardware, why not? Smoke 'em if you got 'em.
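
For reference, the out-of-the-box combination of options 1 a and 2 n boils down to a handful of Hibernate Search properties, roughly like this (the paths, queue names, and refresh period are placeholders; double-check the exact property names against the HS docs for your version):

    # Master node: owns the authoritative index and copies it to a shared source directory
    hibernate.search.default.directory_provider = org.hibernate.search.store.FSMasterDirectoryProvider
    hibernate.search.default.indexBase = /var/lucene/index
    hibernate.search.default.sourceBase = /mnt/shared/lucene/source
    hibernate.search.default.refresh = 1800

    # Slave nodes: periodically pull the index copy, and send their own changes to the master over JMS
    hibernate.search.default.directory_provider = org.hibernate.search.store.FSSlaveDirectoryProvider
    hibernate.search.default.indexBase = /var/lucene/index
    hibernate.search.default.sourceBase = /mnt/shared/lucene/source
    hibernate.search.default.refresh = 1800
    hibernate.search.worker.backend = jms
    hibernate.search.worker.jms.connection_factory = java:/ConnectionFactory
    hibernate.search.worker.jms.queue = queue/hibernatesearch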

Note that to have updates be (nearly) immediately searchable, both issues must provide an "immediate" solution.

Any errors or omissions? Also, I'm probably revealing my ignorance here, but why wouldn't distributed ehcache work to solve issue #1? Thanks!

Erik Innocent
emmanuel.bernard (101)
#15
Re: With Terracotta
Ari
I was just referring to the standard Java license:

http://java.sun.com/j2se/1.5.0/jdk-1_5_0_16-license.txt

D. Java Technology Restrictions. You may not create,
modify, or change the behavior of, or authorize your
licensees to create, modify, or change the behavior of,
classes, interfaces, or subpackages that are in any way
identified as "java", "javax", "sun" or similar convention
as specified by Sun in any naming convention designation.


I am glad you have a different agreement with Sun. To be honest, it would be cool if everybody could benefit from the same agreement, though I understand why Sun put the restriction in place in the first place.
emmanuel.bernard (101)
#16
Re: With Terracotta
Hi Erik
A few remarks

Issue #1 b: the index sync is truly immediate, unlike #1 a.

Issue #2 o: I believe there is some kind of GigaSpaces equivalent too. AFAIR Shay did both the Terracotta and GigaSpaces integrations at the same time. There is also a JBoss Cache solution (https://svn.jboss.org/repos/jbosscache/jbosscache-lucene/jbosscache/) that Manik developed based on some discussions I had with him.
Issue #2 r: a SAN only solves part of the problem, but it's true you can share the same directory for all slaves (though your network traffic will increase).

Shay is right: if you shard your indexes, it will smooth things out for all solutions.
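
On the sharding note, in Hibernate Search this is just configuration; something along these lines (the index name and shard count are placeholders, and the exact property names may vary by HS version):

    # Split the com.acme.Product index into 4 shards, each with its own Directory and lock
    hibernate.search.com.acme.Product.sharding_strategy.nbr_of_shards = 4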
emmanuel.bernard (101)
#17
Re: With Terracotta
Thanks to Ari and Shay for stepping up on the in-memory approach.
To be honest, I have always been skeptical of this approach, but I stand corrected and have revised my position. It's a good tool to have in one's arsenal.
The best advice is to test and see what fits your needs... as usual.