John Armstrong
I was familiar with Solr coming into your book, but Elasticsearch was completely new to me. Also, my experience with Solr mostly involved complex Lucene queries of the field:value type and very little with free-form “Google” queries. For both these reasons I found the hands-on orientation of Chapter 3 very useful. I particularly appreciated the canned dataset that made it possible to replicate (or at least nearly so) the results in the book and associated files.

I was therefore frustrated when I moved from Chapter 3 (Space Jam problem) to Chapter 4 (Star Trek problem) and no longer had canned data to work with. Maybe I was just confused, but if not, is there any chance you could put canned data (TMDB records) for Chapter 4 and beyond into your doc/code repository?

Also, I have to say I found the dependency of the book (or at least the downloadable content) on IPython/Jupyter notebooks kind of a pain. First, it’s almost impossible to install without taking the whole of Anaconda. Second, it’s completely at odds with my script-based, test-driven Python workflow, in which capturing and diffing before-change and after-change results plays a central role. My workflow is actually much more aligned with the test-driven approach to relevance engineering that you discuss towards the end of the book than with the spontaneous “play around/try stuff” mode of working that IPython/Jupyter seems to be built for.

I wonder if you could have incorporated at least an elementary level of test-driven development into the book? Whatever the exact implementation, I think it would have been interesting to have a single TMDB dataset used for both the Space Jam and Star Trek problems. That would have set the stage for testing the latest and greatest not only on the current problem but also on previously solved problems, e.g. making sure that the Star Trek solution doesn’t break the previously developed Space Jam solution. This checking could be done informally, but it could be turned into a real test cycle by augmenting the canned data with some very simple judgments of the sort you describe in the testing chapter.
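
To make that concrete, here is a minimal Python sketch of the kind of regression check I have in mind. Everything in it is an assumption made up for illustration: the four-movie dataset, the judgment lists, and the toy `search` function stand in for a real TMDB extract and a real Solr or Elasticsearch query, and none of it comes from the book's code.

```python
# Hypothetical canned data: a tiny stand-in for a shared TMDB extract.
MOVIES = {
    1: {"title": "Space Jam",
        "overview": "Basketball star Michael Jordan teams up with the Looney Tunes against aliens."},
    2: {"title": "Star Trek",
        "overview": "The young crew of the starship Enterprise faces a vengeful Romulan."},
    3: {"title": "Star Trek Into Darkness",
        "overview": "Captain Kirk leads the Enterprise crew on a manhunt."},
    4: {"title": "Alien",
        "overview": "The crew of a commercial starship encounters a deadly alien creature."},
}

# Hypothetical judgments: for each query, the movie ids that must show up in the top k.
JUDGMENTS = {
    "basketball with cartoon aliens": {1},  # the Space Jam problem
    "star trek": {2, 3},                    # the Star Trek problem
}


def search(query, movies=MOVIES, title_boost=2.0):
    """Toy ranker (a stand-in for a real engine query): count exact term
    matches, weighting title matches by title_boost."""
    terms = query.lower().split()
    scored = []
    for movie_id, doc in movies.items():
        title = doc["title"].lower().split()
        overview = doc["overview"].lower().replace(".", "").split()
        score = sum(title_boost * title.count(t) + overview.count(t) for t in terms)
        if score > 0:
            scored.append((score, movie_id))
    # Highest score first; ties broken by id so results are reproducible.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [movie_id for _, movie_id in scored]


def failing_queries(search_fn, judgments=JUDGMENTS, k=2):
    """Return {query: missing ids} for every query whose expected movies
    are not all in the top k results."""
    failures = {}
    for query, expected in judgments.items():
        missing = expected - set(search_fn(query)[:k])
        if missing:
            failures[query] = missing
    return failures


if __name__ == "__main__":
    print("baseline:", failing_queries(search))
    # A "tuning change" that ignores titles: Space Jam still passes, Star Trek breaks.
    print("no title boost:", failing_queries(lambda q: search(q, title_boost=0.0)))
```

Running the same judgments before and after each change is the whole point: the second call shows a tweak that leaves the Space Jam query alone but drops both Star Trek movies out of the top results.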

Questions and gripes aside, I think Relevant Search is a very useful book that strikes a good balance between seeing the challenge of relevance close-up in actual code and seeing it from a broader, strategic, organization-level perspective.

John Armstrong in Cambridge MA
Doug Turnbull
Good feedback, John! Hindsight is always 20/20 and you make a lot of good points. We struggled with the decision of when to introduce test-driven relevancy. Do we make the whole book test-driven and layer in concepts in an existing framework? This might have required us to consistently build up one example application over the book. It would have also required some time to weave those concepts in early and throughout. I think it's fair to say, though, that test-driven relevancy could have been useful for the parts in ch 5-7, as I'm mostly dorking around with different approaches to demonstrate the ideas.

We sort of intentionally decided to break things up into some chapters against one data set (TMDB) and others against different kinds of problems (ch 4 and 9 come to mind) so readers could see a variety of problems being solved. Interestingly our feedback ranged from "do something different each chapter" to "make a single application throughout the book" so we sort of took a middle path. It's actually really hard to stay with one data set and do everything you'd like with it.

Sorry that installing the IPython notebooks was difficult. It was a tool we found conducive to writing, as we could write in an IPython notebook with executable examples. So that's why we went that route. Many readers have said they love it, as they can play with the examples. But I realize it does add some burden for folks not used to IPython notebooks.

Anyway, thanks for the good stuff for a 2nd edition. We'll definitely keep your feedback in mind should they convince us to write a second edition :)