Susan Harkins (332) [Avatar] Offline
#1
Please post errors in the published version of Elasticsearch in Action here. We'll compile and publish a comprehensive list for everyone's convenience. Thank you!

Susan Harkins
Errata Editor
Jelmer Kuperus (1) [Avatar] Offline
#2
In chapter 10 the recommendation is to

"Combine filters that use bitsets in a bool filter and filters that don’t in and/or/not filters."

This has not been needed for some time now; see https://github.com/elastic/elasticsearch/issues/7228
radu.gheorghe (54) [Avatar] Offline
#3
Thanks for reporting! I missed that update for some reason. I think at least part of the recommendation stands for users of Elasticsearch 1.x (as in 2.x we only have the bool query), in that and/or/not don't use bitsets. Though one could use the bool filter alone, it also works if you use it as recommended in chapter 10, and I've seen many cases where and/or/not are used as a shortcut.

So while I would definitely write that part differently, I wouldn't add an errata entry for this correction, as it might introduce more confusion.
finleycw1 (1) [Avatar] Offline
#4
In section 5.4.2, Tokenization, I believe some of the samples are incorrect.

For example, the keyword tokenization of 'Hi, there.' should be 'Hi, there.', but the book says the tokens are 'Hi' and 'there'.

Similarly, the whitespace tokenization should be 'Hi,' and 'there.', but the book shows 'Hi' and 'there'.
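
To make the expected output concrete, here is a minimal Python sketch that only imitates the two tokenizers as described above. It does not call Elasticsearch, and the function names are made up for illustration:

```python
def keyword_tokens(text):
    """Imitate the keyword tokenizer: the whole input is a single token."""
    return [text]


def whitespace_tokens(text):
    """Imitate the whitespace tokenizer: split on whitespace only,
    so punctuation stays attached to the neighbouring word."""
    return text.split()


print(keyword_tokens("Hi, there."))     # ['Hi, there.']
print(whitespace_tokens("Hi, there."))  # ['Hi,', 'there.']
```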
ice_lc (23) [Avatar] Offline
#5
On the same page under Lowercase the command should be:
% curl -XPOST 'localhost:9200/_analyze?tokenizer=lowercase' -d 'Hi, there.'
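
For reference, the lowercase tokenizer splits on non-letter characters and lowercases each token, so this command should return hi and there. A rough Python imitation of that behaviour follows; it is a sketch, not Elasticsearch code, and lowercase_tokens is a hypothetical name:

```python
import re


def lowercase_tokens(text):
    """Imitate the lowercase tokenizer: keep runs of letters, lowercased."""
    # [^\W\d_]+ matches one or more letters (word characters minus digits and underscore)
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


print(lowercase_tokens("Hi, there."))  # ['hi', 'there']
```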
ice_lc (23) [Avatar] Offline
#6
Page 133: "The pattern that’s specified should match the spacing characters; for example, if you wanted to split text on any two-digit number, you could create a custom analyzer that breaks tokens at wherever the text .-. occurs, which would look like this:"

Which two digits? The separator in the example is .-., not a two-digit number.
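
The difference between the two patterns is easy to see outside Elasticsearch. The sketch below uses Python's re module as a stand-in for the pattern tokenizer. That is an assumption: Elasticsearch uses Java regular expressions, and the helper name and sample strings are made up for illustration.

```python
import re


def split_on(pattern, text):
    """Split text wherever the pattern matches, dropping empty tokens."""
    return [token for token in re.split(pattern, text) if token]


# Literal ".-." separator (the dots must be escaped)
print(split_on(r"\.-\.", "breaking.-.some.-.text"))  # ['breaking', 'some', 'text']

# Any two-digit number as the separator
print(split_on(r"\d\d", "one12two34three"))  # ['one', 'two', 'three']
```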
radu.gheorghe (54) [Avatar] Offline
#7
Thanks for reporting! We'll add those to the errata.
petera (3) [Avatar] Offline
#8
Install Problem
I have downloaded the zip version and unpacked it onto a machine running 64-bit Windows 7.
However, when I try to run bin\elasticsearch.bat I get "The system cannot find the path specified".
If I type dir bin\elasticsearch.bat, I see the expected response.
If I create my own bat file and place it in the bin directory, it executes fine.
If I rename elasticsearch.bat to, say, test.bat, I again get "The system cannot find the path specified".
If I try saving the elasticsearch.bat file as UTF-8 rather than ANSI, it executes through to the point where it looks for the "elasticsearch.in.bat" file, which produces the same error message as above.
Note that the first time I ran it, elasticsearch.bat executed to the point where it told me that JAVA_HOME was not set.
I then installed Java (jre1.8.0_73), set up JAVA_HOME, and started getting the above behaviour.

Cheers,

Peter Adams
padams@actrix.co.nz
radu.gheorghe (54) [Avatar] Offline
#9
Hi Peter,

That's strange, though I haven't worked with Elasticsearch on Windows 7 in a while, so I can't say much. And I don't have a Windows 7 box to reproduce now.

Can you try running it as a service instead? https://www.elastic.co/guide/en/elasticsearch/reference/2.2/setup-service-win.html It should work with a recent version of Elasticsearch.

Best regards,
Radu
petera (3) [Avatar] Offline
#10
Thank you for your reply. Now running ok.
Complete story.
1. Downloaded and unpacked the 64-bit zip file onto Windows 7.
2. Ran bin\elasticsearch.bat and got the JAVA_HOME not defined message.
3. Installed Java (jre1.8.0_73) from Oracle and defined JAVA_HOME (not done as part of the install). Deleted the elasticsearch folders and unzipped the download file again.
4. Ran bin\elasticsearch and got "The system cannot find the path specified". Dir bin\elasticsearch.bat showed that the file was present.
5. Edited (open and save as) each bat file in the bin folder with Notepad. Saved each file as UTF-8.
6. Ran bin\elasticsearch. The script had errors because Notepad seemed to put extra junk characters at the front of the bat file.
7. Repeated step 5 with Atom editor, and everything ran OK.
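
The junk characters in step 6 look like a UTF-8 byte order mark (BOM), which Notepad on Windows 7 writes at the start of files saved as UTF-8. This is an assumption rather than something confirmed in this thread. A minimal Python sketch for detecting and stripping such a BOM from file contents (the helper names are hypothetical):

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Made-up sample standing in for the first line of a bat file
sample = codecs.BOM_UTF8 + b"@echo off\r\n"
print(has_utf8_bom(sample))    # True
print(strip_utf8_bom(sample))  # b'@echo off\r\n'
```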

My conclusion: Installing Java somehow interfered with the running of bat files.

Cheers,
radu.gheorghe (54) [Avatar] Offline
#11
Thanks for the follow-up! This sounds strange indeed, but I'm glad you managed to sort it out.
franamergerm (6) [Avatar] Offline
#12
Listing 6.18, starting on page 170 in chapter 6, needs to be updated to reflect the change recently made to mapping.json and populate.sh concerning the location_group and location_event fields. Line 34 of the listing needs to be changed from
...
"gauss": {
    "geolocation": {
        "origin": "40.018528, -105.275806",
        "offset": "100m",
        "scale": "2km",
        "decay": 0.5
    }
}
...
to
...
"gauss": {
    "location_event.geolocation": {
        "origin": "40.018528, -105.275806",
        "offset": "100m",
        "scale": "2km",
        "decay": 0.5
    }
}
...


Also, to avoid raising the exception "Missing value for field [reviews]" on line 19
...
"field": "reviews",
...

either a filter needs to be inserted as follows
curl -XPOST "http://localhost:9200/get-together/_search?pretty" -d '{
    "query": {
        "function_score": {
            "query": {
                "match_all":{}
            },
            "filter": {
                "exists": {
                   "field": "location_event.geolocation"
                }
            }, 
...

or the URI needs to be
curl -XPOST "http://localhost:9200/get-together/event/_search?pretty" -d'{

There are probably other ways to accomplish the same goal but the above works in ES 2.2.0.
ice_lc (23) [Avatar] Offline
#13
Duplicated text on pages 94-95: "Additionally, Elasticsearch gives you the ability to manually specify whether a filter should be cached, as well as the ability to manually specify whether a filter should be cached."
ice_lc (23) [Avatar] Offline
#14
Page 95: "Other types of filters aren’t automatically cached if Elasticsearch can tell they’ll never be used again or if the bitsets are trivial to recreate. An example of a query that’s hard to cache is a filter that limits the results to all documents of the last hour. This query changes every second when you execute it and therefore there’s no reason to cache it. Check listing 4.17 to see an example."

Listing 4.17 on page 98 does not contain a filter at all.
507601 (1) [Avatar] Offline
#18
Up to (and including) Chapter 4 I have the following remarks:

Chapter 1.2.6 „Structuring your data in Elasticsearch“
Here something is run together that doesn't fit together (something pasted over the correct content?):
„If you have an SQL background, you might miss the ability to use joins. Unfortunately, they’re not supported, at least in version 1.76 installed. Once that’s in place, you’re typically only a download away from getting Elasticsearch ready to start.“

Chapter 2.2.4 „Distributed indexing and searching“
In Figure 2.8 „Indexing operation is forwarded to the responsible shard and then to its replicas.“ the arrow for indexing a document to shard 1 should start from the box of node 1 and not the box of the shard get-together0.

(the same applies to Figure 2.4 „Documents are indexed to random primary shards and their replicas. Searches run on complete sets of shards, regardless of their status as primaries or replicas.“ of chapter 2.2.1 „Creating a cluster of one or more nodes“)

Chapter 3.5.2 „Implementing concurrency control through versioning“
In Figure 3.4 „Without concurrency control, changes can get lost.“
The sequence should be as in Figure 3.5, i.e. update 2 should finish first, due to the sleep in update 1.

Chapter 4.1.2 „Basic components of a search request“
Listing 4.4 „Limiting the fields from source that you want in the response“
The underscores are outside of the double quotes of the keys (e.g. _“index“ instead of „_index“)

Chapter 4.2.1 „Match query and term filter“
What is the other ability (the same ability is mentioned twice)?

„Additionally, Elasticsearch gives you the ability to manually specify whether a filter should be cached, as well as the ability to manually specify whether a filter should be cached.“

Chapter 4.2.2 „Most used basic queries and filters“
In the section „query_string query“ the following paragraph links to the note about the complexity of Lucene queries, which can lead to errors or unintended results. But the end of the paragraph is about the risk of users breaking the server with searches that return too many results, so I don't see the connection here. Maybe it should be pointed out why the note preceding the paragraph is an example of this.

The paragraph to which this applies:
„One big disadvantage with the query_string query is that it has great power. Giving your website users this power might put your Elasticsearch cluster at risk. If users start entering queries with the wrong format, they’ll get back exceptions; it’s also possible to make combinations that would return the world and that way put your cluster at risk. See the previous note for an example.“

In the succeeding paragraph „simple_query_string“ is spelled with hyphens instead of underscores:
„Suggested replacements for the query_string query include the term, terms, match, or multi_match queries, all of which allow you to search for strings within a field or fields in a document. Another good replacement is the simple-query-string query; this is meant to be a replacement with easy access to a query syntax using +, -, AND, OR. More on these queries in the sections that follow.“

Chapter 4.3.1 „bool query“
Listing 4.20 „Combining queries with a bool query“
The result is not correctly aligned (tabs vs spaces?)

Chapter 4.5.2 „Missing filter“
The function of the field „existence“ is not clear after reading the text and examples.
517501 (1) [Avatar] Offline
#19
In listing 2.2, the field "location" doesn't exist. It should be location_group.
Susan Harkins (332) [Avatar] Offline
#20
An updated errata list is available at https://manning-content.s3.amazonaws.com/download/d/6215d9c-6166-4380-8924-815d2f1d085f/Gheorghe_ElasticSearchIA_Err6.html. Thanks everyone!

Susan Harkins
Errata Editor