Crimson Reason: MapReduce Book

Monday, January 21, 2013

MapReduce Book

The book "MapReduce Design Patterns" by Donald Miner and Adam Shook is a good intermediate resource on MapReduce.

Each pattern is explained in context, with pitfalls and caveats clearly identified to help avoid common design mistakes when modeling a large data architecture. The book also provides a complete overview of MapReduce that explains its origins and implementations, and why design patterns are so important. All code examples are written for Hadoop. The patterns are grouped into six categories:

Summarization patterns: get a top-level view by summarizing and grouping data
Filtering patterns: view data subsets such as records generated from one user
Data organization patterns: reorganize data to work with other systems, or to make MapReduce analysis easier
Join patterns: analyze different datasets together to discover interesting relationships
Metapatterns: piece together several patterns to solve multi-stage problems, or to perform several analytics in the same job
Input and output patterns: customize the way you use Hadoop to load or store data
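
To make the first category concrete, here is a minimal in-memory sketch of a summarization pattern: counting records per user. It is not taken from the book and does not use Hadoop. The function names (map_record, shuffle, reduce_counts, count_by_user) and the sample records are hypothetical, and the shuffle step only imitates the grouping that a MapReduce framework would perform between the map and reduce phases.

```python
from collections import defaultdict


def map_record(record):
    """Map phase: emit a (user, 1) pair for each input record."""
    user, _text = record
    yield user, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: collapse one key's values into a single count."""
    return key, sum(values)


def count_by_user(records):
    """Run map, shuffle, and reduce in sequence over an in-memory list."""
    pairs = (pair for record in records for pair in map_record(record))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample data: (user, comment text) tuples.
records = [("alice", "first post"), ("bob", "hello"), ("alice", "reply")]
print(count_by_user(records))  # {'alice': 2, 'bob': 1}
```

A filtering pattern would look similar, with a map function that emits only the records matching a condition (for example, a single user) and no reduce step at all.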

This is not a "recipe" book with step-by-step instructions. By avoiding line-by-line breakdowns, it delivers a lot of content in its 436 pages. (There is also a usable summary in 30 or so pages.)
