These are the introductory slides I used (in some form or another) for the Let's Talk Operations! sessions at the 2014 Hadoop Summits. No video for this one!
5. One big grid
• Pros
  • Lower ops overhead
  • One location for all data
• Cons
  • Dev and Prod on one system
Grid per project
• Pros
  • Capacity planning per project
• Cons
  • More headcount to maintain
  • Multiple copies of data
  • Data ingress is a mess
10. Common issues
• Version incompatibilities
• Network bandwidth consumption
Some tricks
• Use WebHDFS
  • All modern versions support it
  • Read and write in both directions
• Create a separate queue with hard limits
• Pull from larger, push from smaller
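
As a rough illustration of the WebHDFS trick, here is a minimal Python sketch that builds WebHDFS REST URLs for reading from one grid and writing to another. The `/webhdfs/v1/<path>?op=...` layout and the `OPEN` and `CREATE` operations come from the WebHDFS REST API; the helper name `webhdfs_url`, the host names, the file path, and port 50070 (the usual NameNode HTTP port in Hadoop 2.x) are assumptions made for the example, not part of the slides.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation.

    Hypothetical helper for illustration; it only formats the URL and
    does not contact any cluster.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # The operation goes in the 'op' query parameter, followed by any extras.
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Assumed NameNode hosts for two grids (placeholders, not real systems).
read_url = webhdfs_url("nn-big.example.com", 50070,
                       "/data/logs/part-00000", "OPEN")
write_url = webhdfs_url("nn-small.example.com", 50070,
                        "/data/logs/part-00000", "CREATE", overwrite="true")

print(read_url)
print(write_url)
```

Because the interface is plain HTTP rather than the native RPC protocol, the same URL shape works in either direction between grids, which is the point of the "read and write in both directions" bullet. A real transfer would also need to follow the redirect that the NameNode issues to a DataNode; that step is left out of this sketch.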