kwib (6)
#1
Hi all,

When I changed the variable "searchText" from the word "highlighting" to a phrase like "whereby specific words in context of their" and then ran the program, only "whereby specific words context" was highlighted. The stop words like "in" and "their" got filtered out.

So, what change do I need to make so that common stop words are highlighted as well?

Also, other occurrences of "context", "specific" and "word" have been highlighted. How can I specify that only the whole matching sentence should be highlighted?



Thanks,
kwib (6)
#2
Re: Highlight the default stop words (8.4.3 Highlighting with CSS example )
I tried to post the example code, but the forum does not preserve its formatting.
ErikHatcher (211)
#3
Re: Highlight the default stop words (8.4.3 Highlighting with CSS example )
You have to remove stop word filtering from the analyzer you're using with the highlighter.
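
As a rough illustration of why, here is a toy model in plain Java. It does not use Lucene's own classes: the `analyze` and `highlight` helpers and the three-word stop list are made up for this sketch. The point it tries to show is that a term dropped during analysis never becomes a token, so there is nothing left for a highlighter to mark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these helpers are hypothetical and are not Lucene classes.
class StopWordHighlightSketch {

    // Lowercase the text, split on non-letters, and drop anything in the stop set.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase().split("[^a-z]+")) {
            if (!raw.isEmpty() && !stopWords.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    // Wrap a word in <b>..</b> if it survives analysis and is also a query term.
    // The same stop set is applied to both the query and the text.
    static String highlight(String text, String query, Set<String> stopWords) {
        Set<String> queryTerms = new HashSet<>(analyze(query, stopWords));
        StringBuilder out = new StringBuilder();
        for (String word : text.split(" ")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            List<String> analyzed = analyze(word, stopWords);
            boolean hit = !analyzed.isEmpty() && queryTerms.contains(analyzed.get(0));
            out.append(hit ? "<b>" + word + "</b>" : word);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "words in context of their use";
        String query = "in context of their";
        // Assumed stop list for the example, not Lucene's full default list.
        Set<String> someStops = new HashSet<>(Arrays.asList("in", "of", "their"));
        Set<String> noStops = new HashSet<>();
        // prints: words in <b>context</b> of their use
        System.out.println(highlight(text, query, someStops));
        // prints: words <b>in</b> <b>context</b> <b>of</b> <b>their</b> use
        System.out.println(highlight(text, query, noStops));
    }
}
```

With the empty stop set, every query term survives analysis and can be marked. With the stop list in place, only "context" remains.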
kwib (6)
#4
Re: Highlight the default stop words (8.4.3 Highlighting with CSS example )
Thanks for replying.

The example uses StandardAnalyzer.

I tried to use "public StandardAnalyzer(string[] stopWords)" to override the default English stop words with my own (a space, or other valid words), but with no luck. My words only seemed to be added to the existing default stop words.

So how can I "remove stop word filtering from the analyzer"?


Thanks,
ErikHatcher (211)
#5
Re: Highlight the default stop words (8.4.3 Highlighting with CSS example )
That seems right - are you changing the analyzer you're indexing with or the analyzer you're passing to the highlighter? Best to make sure you change both instances.
kwib (6)
#6
Re: Highlight the default stop words (8.4.3 Highlighting with CSS example )
Thanks for the helpful tip.

It turned out I hadn't applied the same analyzer to both QueryParser() and TokenStream() in the example. After I made that change and instantiated an analyzer with an empty stop word list, I got the expected result.

One issue remains, though: not only is the whole sentence highlighted, but tokens that only partially match the sentence are highlighted as well. From the research I've done, it seems that highlighting only the whole matching sentence is impossible to achieve?



Thanks,
kwib (6)
#7
Re: Highlight the default stop words (8.4.3 Highlighting with CSS example )
Is PhraseQuery a valid approach to use with the Highlighter, with an appropriate slop, to filter out the remaining partially matched terms?
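
To illustrate the behaviour I'm after, here is a toy sketch in plain Java. It is not Lucene's PhraseQuery or Highlighter, and `findPhrase` is a made-up helper. It only counts a hit where the query tokens appear as one contiguous run (roughly what I'd expect from a slop of 0), so isolated occurrences of single terms are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: findPhrase is a hypothetical helper, not part of Lucene.
class PhraseMatchSketch {

    // Return the start index of every place where `phrase` occurs as a
    // contiguous run inside `tokens`. Isolated single-term matches are skipped.
    static List<Integer> findPhrase(List<String> tokens, List<String> phrase) {
        List<Integer> starts = new ArrayList<>();
        if (phrase.isEmpty()) {
            return starts;
        }
        for (int i = 0; i + phrase.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + phrase.size()).equals(phrase)) {
                starts.add(i);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // "specific" at index 0 and "context" at index 1 are stray single-term matches.
        List<String> tokens = Arrays.asList(
            "specific", "context", "matters", "whereby",
            "specific", "words", "in", "context");
        List<String> phrase = Arrays.asList("specific", "words", "in", "context");
        // prints: [4]
        System.out.println(findPhrase(tokens, phrase));
    }
}
```

Only the run starting at token 4 would be highlighted here; the earlier "specific" and "context" would be left alone.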



Thanks,
ErikHatcher (211)
#8
Re: Highlight the default stop words (8.4.3 Highlighting with CSS example )
We're going into pure Lucene support here, and it's best if you use the java-user@lucene.apache.org mailing list for your specific highlighting questions.

This forum is purely about supporting the Lucene in Action book.

Thanks for understanding.
kwib (6)
#9
Re: Highlight the default stop words (8.4.3 Highlighting with CSS example )
Sorry, I didn't realize this was a bit off topic.

I've already taken the question to the Lucene user mailing list.

Thanks for the help so far.