Purely negative queries don't work as one might expect. To get the behavior you want, AND the negative query with a MatchAllDocsQuery (as a MUST or SHOULD clause; it doesn't matter which).
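For example, here's a minimal sketch of that workaround (Lucene 2.x/3.x API; the field and term are made up for illustration):

BooleanQuery query = new BooleanQuery();
// MUST_NOT alone matches nothing; MatchAllDocsQuery gives the
// negative clause something to subtract from.
query.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
query.add(new TermQuery(new Term("status", "deleted")),
          BooleanClause.Occur.MUST_NOT);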
The book covers Lucene 3.0 and above, so this is expected behavior.
Personally, I think Solr (1.4) is the best thing you can evaluate for your company. It's Lucene, but with all the infrastructure that you'd otherwise end up writing yourself on top of Lucene.
You have to remove stop word filtering from the analyzer you're using with the highlighter.
That seems right - are you changing the analyzer you're indexing with or the analyzer you're passing to the highlighter? Best to make sure you change both instances.
We're going into pure Lucene support here, and it's best if you use the java-user@lucene.apache.org mailing list for your specific highlighting questions.

This forum is purely about supporting the Lucene in Action book.

Thanks for understanding.
We won't be including the javadocs for any 3rd-party code, but will instead refer to those projects' home pages. Maybe we need to include more information on those 3rd-party APIs, with pointers and version numbers? Also, I don't think javadocs for the LIA2 code would be very helpful, as we have an entire book to "doc" that code.

Was there something in particular that you were looking for and found difficult?
Yes, all such reference oddities should be resolved before going to print.
Great points, Jason! I always try to make any example code be the best example of what I'd do "for real" as well, which is why we take unit testing so seriously in the bulk of the example code.

As for throws Exception - typically this is fine in JUnit tests, at least (I'm not sure whether you're saying we shouldn't do that on testXXX methods, or just in main() examples). If a test throws an exception, it's a failure, and failures are all that we're looking for - and hopefully never seeing.
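To make the pattern concrete, here's a sketch of a test in the style of the book's 1.4-era examples (the index path, field, and term are illustrative):

import junit.framework.TestCase;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.*;

public class SearchTest extends TestCase {
    // Declaring throws Exception keeps the example uncluttered; any
    // exception that escapes is reported by JUnit as a failure/error.
    public void testSimpleSearch() throws Exception {
        IndexSearcher searcher = new IndexSearcher("build/index");
        Hits hits = searcher.search(new TermQuery(new Term("contents", "lucene")));
        assertTrue(hits.length() > 0);
        searcher.close();
    }
}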

Points taken!
Further - I'm working (slowly, sorry) on a Solr chapter for the book. And as Otis mentioned, I'll also write a case study for "LucidFind", our all-things Lucene search engine at http://www.lucidimagination.com/search
All of those are correct.
Your best bet is to inquire on the java-user list, which I see you've done.
That's great feedback on the first edition. We'll be taking that into consideration before the second edition goes final.

Erik
I just got the latest PDF and looked at the Solr section. Ummm.... what about the solr-ruby library? It would reduce the code shown dramatically, handle errors, guarantee that fields with XML tags in them would index properly, and much more.
And I got the print book in the mail yesterday, and the Solr examples are not up to par. And there are a few mentions of Solr that make it sound unstable, etc - which is absolutely not true.
I see you asked and were answered more appropriately on the general@ e-mail list.
This is a question best suited for the java-user e-mail list, so please take further questions there.

However, sure, Lucene could do what you're asking if you index the file name as one field and also the pages it refers to in a separate field for display purposes.
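A rough sketch of what such a document might look like (Lucene 1.4-era API; the field names and values are invented, and writer is an open IndexWriter):

Document doc = new Document();
doc.add(Field.Keyword("filename", "report.pdf"));  // searchable, stored literally
doc.add(Field.UnIndexed("pages", "12,47,93"));     // stored only, for display
doc.add(Field.Text("contents", contentsOfFile));   // tokenized full text
writer.addDocument(doc);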
Yeah, Rossetti Archive files are TEI-like, a fork from TEI to implement relationships in a custom way. But it's close enough for discussion.

As for XQuery, eXist, that kind of thing - it all depends on the types of queries you need to perform. For Rossetti, they originally had it in Tamino doing XPath queries that were truly ridiculous and slow, but all they really wanted were some simple fielded query options (title, author, etc.) and full-text search - so Lucene fit perfectly. Users don't want to enter XPaths or XQueries, and in general that is unnecessary for findability. However, the structural stuff is important for sophisticated rendering, and this is where XTF really shines.

I definitely would encourage you to go to Solr first - it's where I do all my work these days and what Collex is built upon. There is some effort in Solr (see Solr's issue tracker for "payload") to get structural info in there and usable for rendering and highlighting.

This thread is now a bit off-topic for this forum, but one of strong interest to me. Feel free to e-mail me off list to take it further if you like.
You mean like this? http://www.nines.org/collex and http://www.rossettiarchive.org/rose

This was a project I worked on for a long while. It is indexing content of various formats. The Rossetti Archive is entirely in TEI(-like) format, and a lot of the content in NINES originates from TEI as well.

There is also XTF, the eXtensible Text Framework, which does a spectacular job with TEI and uses Lucene underneath.
This forum is for questions regarding the Lucene book, not Lucene in general. Please inquire on the java-user Lucene e-mail list for general support.

However, look into the SpanQuery family or the Highlighter for the capabilities you're after.
Please ask general Lucene questions on the java-user e-mail list. This forum is for discussions related to the book only.

However....

You are indexing untokenized fields, so they are indexed literally, exactly as you specified them; thus there is only a single hit for your *single-term* PhraseQuery. You could just as easily have used a TermQuery, since you aren't really searching for a phrase at all.
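In other words (a sketch; your actual field and term will differ):

// For an untokenized (keyword) field, the indexed term is the entire
// original string, so a single-term "phrase" is just a term lookup:
Query query = new TermQuery(new Term("id", "doc-001"));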

Please review the Analysis and Searching chapters in Lucene in Action for further details on these points.
Please ask general Lucene questions (unrelated to the book "Lucene in Action") on the java-user@lucene e-mail list.

But, look at IndexReader's API - there are delete methods there.
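For example (Lucene 2.x method names; check the javadocs for your version, and the field/term here are invented):

IndexReader reader = IndexReader.open("build/index");
// Delete every document whose "id" field contains this exact term
reader.deleteDocuments(new Term("id", "doc-001"));
reader.close();  // deletes are flushed when the reader is closed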
That variation was intentional in some spots, though I don't have the book handy to say exactly where or why. I was aware of the canonical phrase and the reason it uses "jumps", but had diverged from it, possibly for stemming in some cases.

Otis can chime in too, but I'll answer first - the 2nd edition effort has just recently begun. I would not expect it to be published until late 2008 at the earliest.
The code is available here:

http://www.ehatchersolutions.com/downloads/LuceneInAction.zip

Sorry for the inconvenience.
Solr is built on Lucene, and it would be fairly easy to leverage the HTML tokenizers from its code in a Lucene application. Just for the record.
There is an HTML stripping Analyzer built into Solr that you could use.

Your question is best asked in the java-user or solr-user forums though. This forum is strictly for support of the Lucene book contents.
This is not the appropriate forum for Lucene support; this place is dedicated to supporting the book only.

However, a PageFilter isn't really necessary. You can page through Hits or use other .search() methods to get pages of results of a Query. Your PageFilter would need a Query to filter and may not give the same number of documents for each "page", depending on where in the index the results matched.
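A rough sketch of paging without any filter (Hits-based API from the book's era; pageNumber, the page size, and the field name are placeholders):

Hits hits = searcher.search(query);
int pageSize = 10;
int start = pageNumber * pageSize;
int end = Math.min(start + pageSize, hits.length());
for (int i = start; i < end; i++) {
    Document doc = hits.doc(i);  // Hits fetches documents lazily
    System.out.println(doc.get("title"));
}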

Join the java-user e-mail list to continue discussion of this topic.
drawback - takes up more disk space

advantage - you store the value for later retrieval
The answer is above in this same thread.
In Lucene 1.x, Field.Keyword is untokenized. Field.Text is tokenized. Whether you store a field or not depends on your needs. Field.Text is stored if a String value is passed, unstored if a Reader is passed.
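To illustrate (Lucene 1.x API; the names and values are invented):

Document doc = new Document();
doc.add(Field.Keyword("isbn", "1932394281"));           // untokenized, stored
doc.add(Field.Text("title", "Lucene in Action"));       // tokenized, stored (String value)
doc.add(Field.Text("contents", new FileReader(file)));  // tokenized, unstored (Reader value)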
Andy - it's actually not a big deal to divulge such information. I just asked our publisher and they said our current sales are between 10k and 15k. 20k, here we come!
Ah, so NetBeans got in the way initially. Thanks for being patient and glad things are working for you now.
Give these steps a try:

- Unzip the Lucene In Action code ZIP file into a clean directory.
- Type "ant"
- Type "ant test"
- Copy/paste the entire results of those two ant runs back here.
Any chance you've got a different/newer version of Lucene in your CLASSPATH somewhere?
This is the wrong forum for such a question. Please post to the java-user@lucene.apache.org e-mail list. This forum is strictly for support of the Lucene in Action book.
You need to subscribe (see the Lucene website for details). I just approved the message you sent, though, so it's already out there. But you'll want to subscribe to see replies.
Please provide details of how you ran the code and we'll help out.

Have you tried simply running "ant" as stated in the code download instructions? It's preconfigured to run all the examples easily.
You have likely pointed the indexer's index directory at your data directory. Be careful with the command-line arguments.

try out the code download from here: http://www.lucenebook.com/LuceneInAction.zip

It'll make your life easier: you'll have a ready-made launching platform for all of the examples, making it easier to get a handle on how things work and then borrow the bits you need to build your own applications.
Lucene is a general-purpose full-text search engine. It does not, itself, know anything about files, databases, or any other source of information - it has to be fed information, ultimately as document fields that are Strings (in the java.lang.String sense). So yes, you can certainly index a database, easily. Check out the Compass framework, or write some custom code to interact with Lucene and your database and you'll realize how easy it is.
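As a sketch of the custom-code route (assumes an open JDBC connection and IndexWriter; the table and column names are hypothetical, and the field types follow the 1.x API):

Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery("SELECT id, title, body FROM articles");
while (rs.next()) {
    Document doc = new Document();
    doc.add(Field.Keyword("id", rs.getString("id")));  // primary key, untokenized
    doc.add(Field.Text("title", rs.getString("title")));
    doc.add(Field.Text("body", rs.getString("body")));
    writer.addDocument(doc);
}
writer.optimize();
writer.close();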

Also, pick up a copy of Lucene in Action!

Another tip is to post general Lucene questions to the appropriate Lucene e-mail list (java-user@lucene, for example).

Erik
This is a question best asked on the java-user list, but one option is to use the new FunctionQuery that is in Solr, though not yet migrated to Lucene itself. This would allow runtime score adjusting in the manner you specified. Well, I dunno if you'd be able to use score per se, but you could add additional factors.
Please keep this discussion on the java-user list. This forum is strictly for topics related to the book "Lucene in Action", not a general Lucene support forum.
The score does not take into account how well a term matched for FuzzyQuery. That's just the way it's built currently. The score is based on term frequency of the actual matching term. FuzzyQuery gets rewritten as a BooleanQuery with all matching terms OR'd.

If you want to get at the raw difference that Lucene uses, it's the Levenshtein distance algorithm; you can use the code in FuzzyTermEnum.java as a starting point. You can probably use that code directly somehow, or at least borrow the similarity computation.
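If all you need is the edit distance itself, the classic dynamic-programming version is tiny (plain Java, no Lucene dependency):

public static int levenshtein(String a, String b) {
    // d[i][j] = edits to turn the first i chars of a into the first j of b
    int[][] d = new int[a.length() + 1][b.length() + 1];
    for (int i = 0; i <= a.length(); i++) d[i][0] = i;
    for (int j = 0; j <= b.length(); j++) d[0][j] = j;
    for (int i = 1; i <= a.length(); i++) {
        for (int j = 1; j <= b.length(); j++) {
            int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
            d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                               d[i - 1][j - 1] + cost);
        }
    }
    return d[a.length()][b.length()];
}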
Sorry about the site... having server memory issues. Here's a link to download the examples:

http://www.ehatchersolutions.com/downloads/LuceneInAction.zip
These questions are general Lucene usage questions. This forum is focused purely on discussion pertaining to "Lucene in Action". Please ask these on the java-user@lucene.apache.org mailing list (subscription required before posting; see Apache's mailing list page for info on signing up), where you'll find a lot of helpful folks and usually near-immediate replies.
See above!
Yes, we are planning a 2nd edition. The road from LIA1 to LIA2 has been a combination of busyness, laziness, and now plain pragmatism. Lucene has been undergoing massive changes (for the better, if you can imagine that Lucene could actually get better!) which are still works in progress. Otis and I have decided that it is best to wait at least the next 6 months or so and let Lucene stabilize in functionality, and we'll shake it out in the 2nd edition. Don't hold your breath, except only briefly at times. I can't say when for sure, but I'd put money on seeing a new edition before 2010.
To try out the examples, I'd stick with the download you'll find from lucenebook.com, but for new code I'd certainly recommend upgrading to Lucene 2.x. You can find some notes about what it takes to adapt our example code to the Lucene 2.0 API here:

http://www.nabble.com/Lucene-in-Action-examples-complie-problem-tf2418478.html#a6743189
PhraseQuery will also score higher by proximity. Depending on how you generate your queries, you may want to OR the original query with a PhraseQuery or SpanNearQuery of all the terms the user enters, with some slop factor.
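Something along these lines (Lucene 2.x API; a sketch that assumes the user's terms are already analyzed into a terms array and originalQuery already exists - the slop and boost values are arbitrary):

PhraseQuery phrase = new PhraseQuery();
for (int i = 0; i < terms.length; i++) {
    phrase.add(new Term("contents", terms[i]));
}
phrase.setSlop(5);      // allow terms up to 5 positions apart
phrase.setBoost(2.0f);  // weight proximity matches higher

BooleanQuery combined = new BooleanQuery();
combined.add(originalQuery, BooleanClause.Occur.SHOULD);
combined.add(phrase, BooleanClause.Occur.SHOULD);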
http://www.manning.com/about/faq_misc.html#free

In other words, you can get the pdf version but you'll have to pay directly at manning.com for it.
Yes, you'll want to index a separate document for each record in your case. And it's quite doable. Lucene laughs at 50k documents.
You won't need to step into Lucene's code to troubleshoot this.

Have a look at your index with Luke and see if you can search using its interface. If that works, then something is awry in how you're using Searcher I'd guess... maybe your query is not being parsed as you expect?
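A quick way to check the parsing (a sketch; substitute your own analyzer, default field, and input):

QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
Query query = parser.parse(userInput);
// Print the parsed structure to compare against what you expected
System.out.println("Parsed query: " + query.toString("contents"));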
Vishnu,

This forum is strictly for comments/questions/issues about the Lucene in Action book. It is not a general Lucene support forum. The java-user e-mail list at Apache is the appropriate place to post your question.

Erik
I see you posted this to the java-user list also, so we'll let the follow-up occur there.

Erik
A second edition is in the works, but is not expected to be ready until mid-to-late 2007. Lucene's API has changed a bit in version 2.0, but the code is easily adaptable (see a recent post from myself to the java-user e-mail list).
Lucene does not have any direct connection to your database. It is up to you, the developer, to tie the two together somehow. For example, you can index your database primary keys as a field for every document and then fetch the rows when you display the hits.
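On the display side, that looks roughly like this (assumes a stored "pk" field; loadRow() and render() are hypothetical helpers standing in for your JDBC and UI code):

Hits hits = searcher.search(query);
for (int i = 0; i < hits.length(); i++) {
    String pk = hits.doc(i).get("pk");  // stored primary key
    render(loadRow(pk));                // fetch the full row, then display it
}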

I hope that helps. If you need more direction, sign up for the Lucene user e-mail list where you'll find many helpful people that have advice on database integration with Lucene.
Since in the example data set there is only a single instance of each term per document, the length of the vector will always be 1. Maybe I'm missing the details you're describing, though. I'd be curious if you have a concrete example of where the computation fails to pick the best category. Thanks!
Also, the .properties files contain the sample book data, which is what gets indexed into build/index when you run "ant".
The index of the sample data is built the first time you run "ant" and it ends up in build/index.
The example data used by the book is in the "data" directory of the source code structure after it is unzipped.
Thanks for bringing this to the attention of readers. It is a known issue, of course. The book was written against Lucene 1.4.x, and the code is backwards compatible compilation-wise with Lucene 1.9.x. Lucene 2.0 removed lots of deprecated methods by design and community discussion.

Months ago before the release of Lucene 2.0 I updated all of the code examples to run with Lucene 2.0, noting all the changes required (which were all minor adjustments like you've found). I had intended to release that code, but haven't yet. I'll make that a priority and get it published as soon as possible.
See:

http://www.nabble.com/Lucene-in-Action-examples-complie-problem-tf2418478.html#a6743189

This details all the steps necessary to update the existing 1.4-based code to the new 2.0 API.

Let me know if you still have questions about the new API.
You've posted this to the wrong forum. This is for "Lucene in Action".
Please use this forum for questions regarding the book "Lucene in Action". For general Lucene questions, post to the official Java Lucene users list (java-user).
Could you provide the specific command and error message and stack trace that you encountered?

Are you using our code in your own context, or running the downloadable demo code as-is?
This is a general Lucene question, but this forum is specifically for topics about the book. Please inquire on the java-user e-mail list instead.
If a single document contains both Greek and English, you have some tough choices to make... do you want it indexed twice, once with each analyzer? Or do you want the analysis process to detect the language of each token (which is nearly impossible) and choose the right analyzer at that point? Or some other way?

As for doing two passes - you can pass an analyzer to the addDocument() method; it is overloaded to allow per-document control. Perhaps that is what you want to do.
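That overload is used like so (a sketch; the language-detection check isGreek() is entirely hypothetical and up to you):

Analyzer analyzer = isGreek(doc) ? greekAnalyzer : englishAnalyzer;
writer.addDocument(doc, analyzer);  // per-document analyzer override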

In the future, please ask this type of general Lucene question on the java-user e-mail list. This forum is for questions specifically about Lucene in Action, not general Lucene questions.
This forum is strictly for issues related to Lucene in Action's text or supplied code.

However, I'll reply briefly here - it seems that Field-centric boosts would be a good thing to try. I really don't recommend MultiFieldQueryParser, personally.

Another option is to utilize Sort, first sorting by score, secondarily by the fields you desire to break ties with.

Use of Explanation will help greatly in understanding how scores are working.
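The tie-breaking sort looks roughly like this (Lucene 2.x API; "date" is a placeholder field):

Sort sort = new Sort(new SortField[] {
    SortField.FIELD_SCORE,                         // primary: relevance
    new SortField("date", SortField.STRING, true)  // secondary: field, descending
});
Hits hits = searcher.search(query, sort);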

Please follow up to this topic by signing up for the java-user e-mail list and asking further there.
This forum is strictly for questions about the book "Lucene in Action" and the associated code. For general Lucene issues, please ask on the java-user e-mail list after subscribing to it.
I'm the java-user moderator, and I allowed your message to pass through. Be sure to subscribe officially though (see the Lucene website for details) to have your messages flow through unmoderated.
Your question is a general Lucene usage question and is best asked on the java-user e-mail list. This forum is for questions directly pertaining to "Lucene in Action".
I recommend, if you have such an enormous list, that you page through it (perhaps using JDBC if it's coming from a database) rather than returning all matches at once. No matter what, you'll need to create the BitSet initially, which is known to be a potentially expensive operation. This is why it is recommended that Filters be cached and reused for subsequent searches, and only invalidated when the filter needs to change.
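In Lucene 2.x terms, the cache-and-reuse pattern is roughly (the field and term are invented):

// Build once and reuse across searches; the underlying BitSet is
// computed on first use per IndexReader and cached thereafter.
Filter categoryFilter = new CachingWrapperFilter(
    new QueryWrapperFilter(new TermQuery(new Term("category", "books"))));

Hits hits = searcher.search(userQuery, categoryFilter);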
Just to add a bit of trivia beyond referring you to java-user: you can adjust the scoring so that term frequency is a major component (perhaps even the only one) by toying with a custom Similarity implementation. We did not feel this was common enough to discuss in the first edition of LIA.
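A sketch of the kind of tweak I mean - neutralizing idf so raw term frequency dominates (whether that's appropriate for your needs is another question):

public class TfHeavySimilarity extends DefaultSimilarity {
    // A constant idf removes the term-rarity factor from scoring,
    // leaving term frequency as the dominant component.
    public float idf(int docFreq, int numDocs) {
        return 1.0f;
    }
}

// ... then, before searching:
searcher.setSimilarity(new TfHeavySimilarity());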
Though I can't imagine why it would hang, you do have to be careful with encoding. The JVM's default file encoding will be used with the FileInputStream, so the example code is not going to work properly if you're indexing files of various encodings. You may need to modify the example code to read each file appropriately.
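The usual fix is to be explicit about the charset rather than relying on the platform default (a sketch, assuming UTF-8 files):

// Read the file with a known encoding instead of the JVM default
Reader reader = new BufferedReader(
    new InputStreamReader(new FileInputStream(file), "UTF-8"));
doc.add(Field.Text("contents", reader));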
The necessary TextMining JAR files are in the Lucene in Action free code download available from http://www.lucenebook.com - if that is what you need.
Sure.... pull the text out of the database, toss it into Lucene, then search it.

Really there isn't anything special about indexing database data versus indexing any other type of content. There are stock solutions available. Hibernate, for example, has a hook to index to Lucene. DBSight is a full solution for indexing DB content with Lucene, and searching it.
> Hi,
> In the book Lucene in action page 181 to 182,
> there is an example about how to use multisearcher to
> search many indices. I have a few questions about
> this example:
>

> 1. What's the RMI interface in that example?
> 2. What's the class for that interface?

RemoteSearchable is the RMI-exposed class; it extends UnicastRemoteObject and implements the Searchable interface.

> 3. Page 183, there is a line of code
>

> Hits hits = searcher.search(query);
>

> It seems that hits result can be sent back from RMI
> server to the client, but Hits is not a Serializable
> class, how can hits be sent back?

RemoteSearchable is a Searchable, not an IndexSearcher. It gets wrapped inside a MultiSearcher (which is used here because it takes an array of Searchables and delegates to them). Right, Hits is not Serializable, so Hits stays entirely local; TopDocs is what travels over RMI under the covers.
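The client side of that arrangement looks roughly like this (the RMI name "//localhost/Searchable" is illustrative; use whatever name the server bound):

Searchable remote = (Searchable) Naming.lookup("//localhost/Searchable");
Searcher searcher = new MultiSearcher(new Searchable[] { remote });
Hits hits = searcher.search(query);  // Hits stays local; TopDocs crosses the wire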

I hope this info helps.

Erik
Lucene does build up caches for sorting. If you use the same IndexReader/IndexSearcher across searches, then successive searches with sorting should be fast. I presume you're creating new IndexSearcher instances each time? Try warming up a single instance and reusing it, and see if that fixes things.
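A sketch of the reuse-and-warm pattern (the index path, warm-up term, and sort field are placeholders):

// Create once and share; don't open a new IndexSearcher per request.
IndexSearcher searcher = new IndexSearcher("build/index");

// A throwaway sorted search populates the sorting caches up front.
searcher.search(new TermQuery(new Term("contents", "warmup")),
                new Sort("date"));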
> is it possible for lucene to "resort" the index while
> update it ?
>
> (i use the indexorder to save time and memory)

Resort what exactly? Documents are ordered by insertion time. Terms are sorted during indexing.

Please bring this issue to the general java-user@lucene.apache.org e-mail list as this is really a general Lucene topic rather than one related to the book. Thanks.
There is an ISOLatin1AccentFilter in Lucene's Subversion repository now, and it will be built into the core for the next release. You can use it inside a custom analyzer to flatten accented characters to their ASCII equivalents.
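A custom analyzer around it might look like this (a sketch; choose whatever tokenizer chain you actually need):

public class AccentFlatteningAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new StandardTokenizer(reader);
        stream = new LowerCaseFilter(stream);
        stream = new ISOLatin1AccentFilter(stream);  // é -> e, ü -> u, etc.
        return stream;
    }
}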
I'm sorry to defer this, but your questions are general questions unrelated to the book itself. This forum is specific to the book, such as errata.

Please ask on the java-user e-mail list where you'll find a very helpful and responsive community, including myself and Otis.

A couple of quick responses... you cannot write to an index from two different processes simultaneously. You'll need to queue or centralize this in some manner. As for incremental indexing, you could certainly do that, but you will be responsible for managing the IndexSearcher instances which will determine how fresh the content is during queries.

Erik


> Hi,
>
> I need to index content in a database. The content in
> database can be searched and updated by two separate
> applications on a frequent basis (with different user
> interfaces, one is a web application whereas the
> other one is not). I have two questions to ask:
>
> 1- Is it possible for two separate applications to
> simultaneously update the same index? (so that as
> soon as the content changes through either app, the
> index is updated immediately)
>
> 2- Is it a good idea to use the incremental indexing
> provided by Lucene to keep the index updated if the
> content is being changed on a Frequent basis? (I've
> read somewhere that in such cases the search engine
> will be adding or updating content all of the time.
> If an error occurs during the indexing process, the
> index may be out of synchronization with the
> content).
>
> Any suggestions for the best strategy to update index
> in my scenario will be greatly appreciated.
>
> Regards
bib - there is no difference between highlighting a search result and highlighting any other text based on search terms. The main trick is having position offsets handy, which can come from (re-)analyzing a document. In order to highlight the entire document rather than have it fragmented, you'll need to create a NullFragmenter (we use this on lucenebook.com) to hand to the Highlighter.

I hope this helps. With the examples in the book and the above hints it should be possible to do what you need, but if you need more assistance, let us know.
Here's the NullFragmenter we use with lucenebook.com:

package lia.web;

import org.apache.lucene.search.highlight.Fragmenter;
import org.apache.lucene.analysis.Token;

public class NullFragmenter implements Fragmenter {
    public void start(String s) {
    }

    public boolean isNewFragment(Token token) {
        return false;
    }
}

The Highlighter API should lead you to how to use this fragmenter.
I recommend just posting a message to the java-dev e-mail list, where all the experts lurk. If you don't want it identified with any particular company or with yourself, post somewhat anonymously from a generic e-mail account, perhaps.

I don't know what other Lucene savvy folks do consulting-wise, so I can't recommend anyone in particular. There are several folks that contribute heavily that are not committers at this time also.

As for Doug - it is tough being someone like him when everyone assumes he's too good or busy or expensive to be asked. Then no one asks! The top folks are also regular folks and quite approachable.
I'm not quite following what you're after. You want faceted browsing?

I'm doing faceted browsing (complete with counts per facet) with Lucene by caching a BitSet for each facet value, and computing the count for each facet as the cardinality of ANDing its cached BitSet with the query's BitSet. I'm using term enumeration via an IndexReader to build the cached sets.

I'm sure this or some other techniques could be used to achieve what you're after.
> hey Erik,
> thanks for the reply,
> yes, i am looking for faceted browsing,
> is there anyway you can send me a code sample of what
> you are doing? or maybe elaborate a little on exactly
> what you are doing?

Hopefully this will come out ok as I paste in some code below. I have a method that loads facets when my search server starts up:

private void loadFacets() throws IOException {
    System.out.println("Loading facets for " + reader.numDocs() + " documents ...");

    facetCache = new HashMap();

    String[] fields = {"creator", "date", "type", "archive"};
    String[] unvalued = {"<unknown>", "<undated>", "<untyped>", "<unknown>"};

    TermDocs termDocs = reader.termDocs();

    for (int i = 0; i < fields.length; i++) {
        String field = fields[i];
        System.out.println("  for field " + field);
        Map bitsetValues = new HashMap();

        BitSet catchall = new BitSet(reader.numDocs());

        TermEnum termEnum = reader.terms(new Term(field, ""));
        while (true) {
            Term term = termEnum.term();
            // term is null once the enumeration is exhausted
            if (term == null || !term.field().equals(field)) break;

            termDocs.seek(term);
            BitSet bitSet = new BitSet(reader.numDocs());
            while (termDocs.next()) {
                bitSet.set(termDocs.doc());
            }

            catchall.or(bitSet);

            bitsetValues.put(term.text(), bitSet);

            if (!termEnum.next()) break;
        }

        // invert the catchall bitset so it indicates all the documents
        // that do _not_ have any values for this facet
        catchall.flip(0, reader.numDocs());

        String unvaluedLabel = unvalued[i];
        if (bitsetValues.containsKey(unvaluedLabel)) {
            System.err.println("Field " + field + " already contains a value " + unvaluedLabel);
        }
        bitsetValues.put(unvaluedLabel, catchall);

        facetCache.put(field, bitsetValues);
    }

    System.out.println("Done loading facets.");
}

And then when I search, I use this method:

private Hashtable facets(final BitSet constraintMask) {
    // Loop over all facets, applying the constraint mask to every one
    Hashtable constrainedFacets = new Hashtable();

    Set keySet = facetCache.keySet();

    BitSet tempBitSet = new BitSet(reader.numDocs());
    for (java.util.Iterator facetIterator = keySet.iterator(); facetIterator.hasNext();) {
        String key = (String) facetIterator.next();
        Map valueMap = (Map) facetCache.get(key);

        Hashtable constrainedValues = new Hashtable();

        Set valueKeys = valueMap.keySet();
        for (java.util.Iterator valueIterator = valueKeys.iterator(); valueIterator.hasNext();) {
            String value = (String) valueIterator.next();
            BitSet valueBitSet = (BitSet) valueMap.get(value);
            tempBitSet.clear();
            tempBitSet.or(constraintMask);
            tempBitSet.and(valueBitSet);
            int count = tempBitSet.cardinality();
            if (count > 0) {
                constrainedValues.put(value, new Integer(count));
            }
        }

        constrainedFacets.put(key, constrainedValues);
    }

    return constrainedFacets;
}


The constraintMask is a BitSet built by AND'ing all user-supplied constraints. I realize that this is just a bit of code and not standalone functional, but hopefully you can distill what you need from it, or use it as a starting point somehow. BitSets may not be the best long-term data structure to use though, so be wary of memory if you've got a lot of facets.
ze'ev,

It would be perfectly acceptable to e-mail java-user soliciting a contract programmer for such a thing, or java-dev to narrow the scope to the more hard-core types. You could even e-mail the committers directly (their contact info is on the Lucene site) - I doubt any of them would mind being solicited.

Erik
ze'ev - you're right that it won't scale in that kind of situation. In my environment, facets are common across many documents, so there are far fewer facet values than documents.

The sparse filter included here - http://issues.apache.org/jira/browse/LUCENE-328 - might work out for you, though I have no experience with it yet.
> do you consult?

I do, but my schedule is currently very full. Feel free to e-mail me at erik@ehatchersolutions.com to discuss.
> Any other suggestions?

You gave me this bait, so I'll take it....

Change operating systems?!

Seriously though, Windows is notorious for hanging on to files, whereas *nix operating systems have no issues in this regard. I bet if you exited the JVM and then ran another program to delete the files, it'd work just fine. The JVM and Windows somehow hang on to open files - I've heard that numerous times.
You want to delete the entire index, not just documents within it, right?

I've never had such a requirement, so I've not tried it myself. My hunch is that you're running on Windows and you're getting some kind of file-in-use error. You certainly should make sure any reader, searcher, or writer is closed.

One option is to open an IndexWriter on the index with the create flag set to true, and then close it. At the very least this would wipe out the existing index, though it leaves the directory itself in place.
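That is, roughly (Lucene 1.4/2.x constructor signature; the directory and analyzer are placeholders):

// Opening with create=true truncates any existing index in that directory
IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), true);
writer.close();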

Sorry I'm not of more help here, though.

Otis?
I'm not a log4j wiz myself, and have fought with its configuration numerous times in the past. Please refer to the log4j and PDFBox documentation for further assistance with this issue. I don't believe that warning should prevent any indexing from occurring - though maybe there is another error that it is attempting to log that does prevent indexing?
FileIndexer creates the index from scratch each time it runs. Look at the "true" argument to the IndexWriter constructor. Simply changing it to false won't be sufficient - but what you want is for it to be false if the index already exists, and true if it does not.
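One common way to express that (a sketch; IndexReader.indexExists is in the 1.4-era API, and indexDir/analyzer are placeholders):

// Create a new index only when one isn't already there
boolean create = !IndexReader.indexExists(indexDir);
IndexWriter writer = new IndexWriter(indexDir, analyzer, create);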
> Document doc = new Document();
> doc.add(Field.Keyword("indexdate",new Date() ));
>
> but while running the program it gave me the
> following error.
>
> Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.lucene.document.Field.Keyword(Ljava/lang/String;Ljava/util/Date;)Lorg/apache/lucene/document/Field;

I'm not understanding that error. Field.Keyword(String,Date) is a valid method signature in Lucene. Something is amiss. Please post this error to the java-user e-mail list if you haven't solved it. Are you simply running the LIA examples using Ant and getting that error? Or how are you running it?

> Also could u elaborate on using YYYYMMDD format.

There is a lot of elaboration on this in the book, so please reference that discussion for more details. The short explanation is that YYYYMMDD is the recommended way to index dates that do not need times attached to them. Using Field.Keyword(String,Date) indexes down to the millisecond granularity and can cause issues when doing range queries, for example.
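Concretely, the YYYYMMDD approach is just (a sketch; the field name matches the snippet above):

SimpleDateFormat formatter = new SimpleDateFormat("yyyyMMdd");
doc.add(Field.Keyword("indexdate", formatter.format(new Date())));  // e.g. "20060315"

Range queries over such a field then compare terms lexicographically, which works because this format sorts chronologically.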
You need all the other .java files there to compile as well. Read the README, follow along with those instructions, and everything will be compiled nicely for you. Adapting to your environment is really in your hands, though - all the code is there, and it's a matter of doing the right Ant, javac, Eclipse, IDEA, or whatever configuration to get things compiled in your situation. We have the Ant side of things covered.
What errors are you seeing?

Neither Otis nor I used Eclipse during the writing of the book, and the build system was designed for use by Ant at the command-line.

Try typing "ant" from the command-line and ensure it works there. You'll need to work through the Eclipse issues on your own though, I'm afraid - it surely is a configuration issue. I use IDEA and the source code works fine for me.
You e-mailed me this exact question and I replied to you already.

Look at the code for Lucene in Action, specifically how our <index> task from Ant works to index the example .properties files.
Gaston, we cannot support your 3rd party server environment from this forum, unfortunately. My hunch is that you do not have the Lucene JAR file in the WEB-INF/lib directory. Keep in mind that the Lucene web demo is just an example, and a lousy example at that - and that to truly use Lucene you'll want to build a custom application around it rather than use the web demo.

I see you've already brought this topic up on the e-mail list - I'll reply over there from now on.