This 65-page research paper contains 7 tables and 83 diagrams that compare the main features of the major Hadoop distributions and present performance results for 4-, 8-, 12-, and 16-node clusters, measured under 7 types of workloads.
Read on to discover:
- How cluster size affects the speed of data processing
- How clusters of different sizes behave under CPU- and disk-bound workloads, such as Bayes, DFSIO, Hive aggregation, PageRank, Sort, TeraSort, and WordCount
- 83 diagrams that illustrate the overall cluster performance and performance per node in each of the seven scenarios
- 5 tables that demonstrate how the amount of data changes during the MapReduce process
- The limitations that may slow down a cluster and how to avoid them
For more benchmarks and research papers, visit www.altoros.com. For the latest updates, follow @altoros.