Would you like to scale data-intensive tasks horizontally? Would you like an open source project that gave you that foundation?
Well, there is: Apache Hadoop. It's a Java software framework for supporting data-intensive distributed applications. The framework was inspired by Google papers on their MapReduce framework and Google File System.
Who uses Hadoop? Here's a short list: Yahoo!, A9.com, LinkedIn, Facebook, ImageShack, eHarmony, Hulu, Last.fm, and The New York Times. The highest profile user, Yahoo!, is also a major contributor to the project. They use it extensively in their web search and advertising divisions.
In this talk, titled "Is there room for another elephant in Tucson?", Andrew Lenards will tell us about Hadoop and describe how it could be applied to several practical problems, even if you aren't as big as Google.
16. MapReduce + RDBMS, not versus Source: "Hadoop: The Definite Guide, Tom White" Traditional RDBMS MapReduce Data Size Gigabytes Petabytes Access Interactive & batch Batch Updates Read and write many times Write once, read many times Structure Static schema Dynamic schema Integrity High Low Scaling Nonlinear Linear
45. http://creativecommons.org/licenses/by-nc-sa/3.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site.