Notebooks @ Netflix: From analytics to engineering with Jupyter notebooks

Slides from JupyterCon 2018 in NYC on 8/23/2018.

Notebooks have moved beyond a niche solution at Netflix; they are now the critical path for how everyone runs jobs against the company’s data platform. From creating original content to delivering bufferless streaming, Netflix relies on notebooks to inform decisions and fuel experiments across the company. Netflix also uses notebooks to power its machine learning infrastructure and run over 150,000 jobs against its 100 PB cloud-based data warehouse every day. The goal is to deliver a compelling notebooks experience that simplifies end-to-end workflows for every type of user. To enable this, Netflix is investing deeply in notebook infrastructure and open source projects such as nteract.

In this talk, Michelle Ufford and Kyle Kelley share interesting ways Netflix uses data and some of the big bets the company is making on notebooks. Topics will include architecture, kernels, UIs, and Netflix’s open source collaborations with projects such as Jupyter, nteract, pandas, and Spark.

Notebooks @ Netflix: From analytics to engineering with Jupyter notebooks

  1. Notebooks @ Netflix. From analytics to engineering with notebooks. Michelle Ufford @MichelleUfford, Kyle Kelley @rgbkrk
  2. Data @ Netflix
  3. 130 million members
  4. Anywhere in the world.* (*Well, almost anywhere.)
  5. Any device.
  6. Data at scale. • 1 trillion events • 100PB data warehouse • 150,000 Genie jobs
  7. $8 billion on content in 2018
  8. Data Platform @ Netflix
  9. Data driven.
  10. subscriber activity 20180822
  11. [Diagram: subscriber activity 20180822. Labels as extracted: RAW, data pipeline, fast storage, data viz, events data, data storage, Pig, DW, RPT, interactive query, data movement, data access]
  12. [Diagram: DATA, INSIGHTS. Roles: data scientists, data engineers, data viz engineers, quantitative analysts, product managers, research scientists, analytics engineers, executives, software engineers, algorithm engineers, business analysts, technical pgm mgr, ML scientists. Outputs: Data Products, Data Insights, Business Decisions & Product Improvements]
  13. Notebooks @ Netflix
  14. [Diagram: DATA, INSIGHTS, PRODUCTIONALIZATION. Roles: data scientists, data engineers, data viz engineers, quantitative analysts, research scientists, analytics engineers, algorithm engineers, ML scientists, product managers, executives, software engineers, business analysts, technical pgm mgr. Outputs: Data Products, Data Insights, Business Decisions & Product Improvements]
  15. Native support for parameterization.
  16. What’s Next?
  17. Better Scala Support
  18. Scala support. • Kernel stability • Native data viz • Spark integration. Goal: meet/exceed parity with top Scala notebook offerings
  19. More Integration
  20. Integration. • Scheduling • Surfacing logs & error messages • Native. Goal: provide a single, cohesive platform experience
  21. Improved Reliability
  22. Reliability. • Kernel stability • Visibility into kernel state • Automated source control. Goal: build confidence in using notebooks for mission-critical workloads
  23. [Diagram: DATA, INSIGHTS. Roles: data scientists, data engineers, data viz engineers, quantitative analysts, research scientists, analytics engineers, algorithm engineers, ML scientists, product managers, executives, software engineers, business analysts, technical pgm mgr. Outputs: Data Products, Data Insights, Business Decisions & Product Improvements]
  24. Design Principles. 1. Simple 2. Integrated 3. Collaborative
  25. Open source.
  26. Thank you. Michelle Ufford @MichelleUfford, Kyle Kelley @rgbkrk. Netflix Data: @NetflixData. Tech Blog: techblog.netflix.com. Netflix Jobs: jobs.netflix.com
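
Slide 15 mentions native support for parameterization, but the transcript does not show how it works. As a rough illustration only, the sketch below assumes a notebook stored in the Jupyter nbformat v4 JSON layout and a convention in which default values live in a code cell tagged `parameters`. The function `inject_parameters`, the `injected-parameters` tag, and the sample notebook are hypothetical and are not taken from the talk.

```python
import copy
import json


def inject_parameters(notebook, params):
    """Return a copy of `notebook` with an overrides cell inserted.

    The new cell goes right after the first code cell tagged
    "parameters"; if no such cell exists it goes at the top.
    (Assumed convention, not something stated in the slides.)
    """
    nb = copy.deepcopy(notebook)
    # One assignment per parameter; repr() is enough for simple literals.
    source = "\n".join(f"{name} = {value!r}" for name, value in params.items())
    injected = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["injected-parameters"]},
        "outputs": [],
        "source": source,
    }
    position = 0
    for index, cell in enumerate(nb["cells"]):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") == "code" and "parameters" in tags:
            position = index + 1
            break
    nb["cells"].insert(position, injected)
    return nb


# Hypothetical notebook: defaults in a tagged cell, then a cell that uses them.
template = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "execution_count": None,
         "metadata": {"tags": ["parameters"]}, "outputs": [],
         "source": "run_date = '20180101'\nregion = 'us'"},
        {"cell_type": "code", "execution_count": None,
         "metadata": {}, "outputs": [],
         "source": "print(run_date, region)"},
    ],
}

run = inject_parameters(template, {"run_date": "20180822"})
print(json.dumps(run["cells"][1], indent=2))
```

Because the injected cell runs after the defaults cell, its assignments override only the values that were passed in, and the template itself is left unchanged so it can be reused for the next run.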
