264139 (1)
#1
Hi Alex,

It is really annoying to read in a book about an interesting project and then see that the project never started. I'm talking about your supposed hdfscompact project: after one year, the only thing in the repository is the name of the project.
https://github.com/alexholmes/hdfscompact.

It's good to know that at some time in the past you had the intention of developing that tool, but to be honest I don't buy books to read about an author's plans and dreams. I expected more serious content, and in this kind of technical book I expected the author to have done the dirty work of reviewing the tools and frameworks and to publish only about projects that are mature enough. Trying to promote your projects with an empty project was a waste of my time.
Alex Holmes (47)
#2
I understand your frustration at the project not being complete, and I apologize for the inconvenience. However, I will say that the language used in the book makes it pretty clear that the project is a work in progress:

"I’m writing a simple file compacter that’s compatible with Hadoop 2 at https://github.com/alexholmes/hdfscompact."

If you're in need of compaction right away, then you have a few options available to you:

- Write an identity MapReduce job with N reducers, where N is the number of output files you want to create
- Use Pig or Hive to run a similar identity job
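
To make the first option concrete, here is a minimal plain-Python simulation of what an identity job with N reducers does to a set of small files. It is only a sketch, not Hadoop code: the `compact` function, the sample file names, and the use of CRC32 as the partitioning hash are all assumptions made for illustration, and each record is treated as its own key. In a real job you would set the reducer count on the job itself and let Hadoop's partitioner do the routing.

```python
import zlib
from typing import Dict, List


def compact(small_files: Dict[str, List[str]], num_reducers: int) -> List[List[str]]:
    """Simulate an identity MapReduce job (hypothetical helper, not a Hadoop API).

    Every record passes through unchanged and is routed to one of
    `num_reducers` partitions, so the result has exactly that many "files".
    """
    if num_reducers < 1:
        raise ValueError("num_reducers must be at least 1")

    outputs: List[List[str]] = [[] for _ in range(num_reducers)]
    for name in sorted(small_files):
        for record in small_files[name]:
            # Identity map: the record is emitted as-is.
            # Assumption: CRC32 stands in for the job's hash partitioner.
            partition = zlib.crc32(record.encode("utf-8")) % num_reducers
            outputs[partition].append(record)

    # The shuffle phase sorts records by key within each reducer.
    for part in outputs:
        part.sort()
    return outputs


# Hypothetical input: three small files holding five records in total.
small_files = {
    "part-00000": ["alpha", "bravo"],
    "part-00001": ["charlie"],
    "part-00002": ["delta", "echo"],
}

parts = compact(small_files, num_reducers=2)
print(len(parts), "output files,", sum(len(p) for p in parts), "records")
```

The number of output files depends only on the reducer count, not on how many input files there were, which is why the identity job works as a compactor. Note that record order is not preserved across the original files.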

If I get feedback from others that completing my project would be of help, then I'll re-prioritize my work to make it happen sooner.

Thanks,
Alex