Slides from Curtis Huang's talk at the Couchbase Meetup in Mountain View on August 18th. Curtis is a Senior Software Engineer at Facebook working on Machine Learning, with experience in both ad tech and search.
"AI and machine learning have transformed the technology industry for the last decade, creating a foundation for web search, ranking/recommendation, and object/speech recognition. In this talk, I will discuss a collection of machine learning approaches to effectively analyzing and modeling large-scale data. From a hands-on practitioner's perspective, I will talk about the process of building a ML pipeline from idea to production, the challenges, and lessons learned. As an example, I will describe the infrastructure and components of a modern ML ranking system."
2. About Me
• Robotics
• Pricing and Optimization
• Big Data, Hadoop and Spark
• Data Science, ML in Display Advertising
• ML, Relevance in Sponsored Search
• Content Ranking for FB Posts
3. Machine Learning: Why Now?
• Advantages from mining/learning patterns in data
• Cost of Storage and Compute
• Distributed Systems
9. Example of a ML System
• Datastore
• ETL
• Ad-hoc Analysis
• ML Framework
• Distributed KV-Store
• Snapshot / Realtime Features
• Algorithm Service
• Logging Service
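The components on this slide can be pictured with a minimal sketch of the online path: a feature lookup in a KV-store, a scoring call in the algorithm service, and a log record for each score. The slide does not show how the boxes are wired, so everything here is an assumption made for illustration: the class and method names are hypothetical, realtime features are assumed to override snapshot features, and a plain logistic model with made-up weights stands in for whatever the ML framework would produce.

```python
import math
import time
from dataclasses import dataclass, field


@dataclass
class FeatureStore:
    """Hypothetical stand-in for the distributed KV-store."""
    snapshot: dict = field(default_factory=dict)  # batch features, e.g. written by ETL
    realtime: dict = field(default_factory=dict)  # fresher features from the online path

    def get(self, item_id):
        # Assumption: a realtime value overrides the snapshot value of the same name.
        features = dict(self.snapshot.get(item_id, {}))
        features.update(self.realtime.get(item_id, {}))
        return features


@dataclass
class LoggingService:
    """Hypothetical logger; records could later be loaded back into the datastore."""
    records: list = field(default_factory=list)

    def log(self, item_id, features, score):
        self.records.append(
            {"ts": time.time(), "item_id": item_id, "features": features, "score": score}
        )


class AlgorithmService:
    """Scores and ranks items with a logistic model (illustrative weights only)."""

    def __init__(self, store, logger, weights, bias=0.0):
        self.store = store
        self.logger = logger
        self.weights = weights
        self.bias = bias

    def score(self, item_id):
        features = self.store.get(item_id)
        # Linear combination of known features; unknown feature names get weight 0.
        z = self.bias + sum(
            self.weights.get(name, 0.0) * value for name, value in features.items()
        )
        score = 1.0 / (1.0 + math.exp(-z))
        # Log exactly what was served so offline training sees the same features.
        self.logger.log(item_id, features, score)
        return score

    def rank(self, item_ids):
        scored = [(item_id, self.score(item_id)) for item_id in item_ids]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return [item_id for item_id, _ in scored]


if __name__ == "__main__":
    store = FeatureStore(
        snapshot={"post_a": {"ctr": 0.1}, "post_b": {"ctr": 0.5}},
        realtime={"post_a": {"ctr": 0.9}},
    )
    service = AlgorithmService(store, LoggingService(), weights={"ctr": 2.0})
    print(service.rank(["post_a", "post_b"]))
```

Logging the served features together with the score is one common way to keep offline training data aligned with what the online system actually saw, which relates to the online/offline gap listed on the next slide.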
10. Challenges and Lessons Learned
• Ad-hoc Analysis
• Adding and Validating New Features
• Gap between Online/Offline Metrics
• System/Other Issues
4 V’s of Big Data