20080115yahoobrickhouse
2. The Facebook Data team
Infrastructure and insight
Jeff Hammerbacher
Engineering Manager
January 16, 2008
3. Agenda
1 Motivations
2 People
3 Products
4 Philosophy
4. Motivations
▪ Source data living on a horizontally partitioned MySQL tier
▪ Intensive historical analysis difficult
▪ Difficult to assess impact of changes to the site
▪ First try: Reporting and Analysis
▪ Second try: Data Infrastructure and Data Insight
▪ Third try: Data
5. People
Infrastructure
▪ Rama Ramasamy
▪ Suresh Antony
▪ Avinash Lakshman
▪ Hao Liu
▪ Joydeep Sen Sarma
▪ Ashish Thusoo
▪ Pete Wyckoff
6. People
Insight
▪ Rupen Parikh
▪ Itamar Rosenn
▪ Ravi Grover
▪ Roddy Lindsay
▪ Cameron Marlow
▪ Danny Ferrante
▪ Ding Zhou
9. Philosophy
▪ Data management as important as data analysis
▪ Know your data center: compute, storage, interconnect
▪ Shared nothing means the interconnect is the bottleneck
▪ Engineers are not analysts
▪ You must write code
▪ Don’t collect data without a purpose
▪ Basic statistics is more useful than advanced machine learning
▪ Visualization is as important as basic statistics
▪ Analysis, not reporting
▪ Developing products might be better than YAS
10. (c) 2008 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved.