Tableau Conference 2018: Binging on Data - Enabling Analytics at Netflix

In this conference session we share how we are using Tableau “out of the box” and also describe how it fits into our overall data environment. In addition, we’ll describe how we expect to use the Data Catalog and Object Model, our explorations of large-scale data stores, and challenges we are working on including governance and data lineage. Video of session can be viewed here: https://youtu.be/Nr24tw3dmZQ

  1. Binging on Data: Enabling Analytics at Netflix. BLAKE IRVINE, TABLEAU CONFERENCE 2018
  2. BLAKE IRVINE | TABLEAU CONFERENCE 2018
  3. World Markets
  4. World Markets
  5. ANALYTICS BIG DATA
  6. Intro & Topics
  7. DATA ENGINEERING + INFRASTRUCTURE
  8. Topics
      ● Binging on Data
      ● Enabling Analytics
      ● Tableau Environment & Challenges
  9. DATA CULTURE
  10. Binging on Data
  11. Data is Ubiquitous
      ● Data and Analytics are embraced across the company
        ○ Engineering, UX, Customer Service, Finance, & more
      ● A/B Testing of almost everything...
        ○ Product, Signup Methods, Payments, Messaging, & more
      ● Algorithms for...
        ○ Recommendations, Content, Marketing, & more
  12. Employees: 5000 employees; 300 in data teams; 200+ in dedicated analytic teams
  13. Analytic Ecosystem
  14. @
  15. (no text on slide)
  16. (no text on slide)
  17. (no text on slide)
  18. Enabling Analytics
  19. How do we Enable Analytics? BIG DATA
  20. Enablement
  21. Big Data Portal
  22. Data Platform
  23. Big Data Portal
  24. (no text on slide)
  25. Why not use Tableau Server?
  26. (no text on slide)
  27. Data Portal - Tables
  28. Data Portal - Tables 2
  29. (no text on slide)
  30. (no text on slide)
  31. (no text on slide)
  32. (no text on slide)
  33. (no text on slide)
  34. (no text on slide)
  35. Enabling Analytics
  36. People
  37. Alignment Process Context
      ● Highly Aligned, Loosely Coupled
  38. Organization
      ● Vertical Teams
      Diagram labels: Content, Marketing, Growth, Tech; Data Engineering; Science & Analytics; Business Teams, Analytic Teams, Engineering Teams; #content-analytics, #marketing-analytics, #growth-analytics, #tech-analytics
  39. Community
      ● Growing user base
      ● We’ve started up:
        ○ A Tableau User Group
        ○ Education tracks
      ● Early days... much more to do here!
        ○ Office Hours
        ○ Tableau Days
        ○ Data Doctor & more
  40. Enabling Analytics
  41. Tableau Environment
  42. Overview: 2000+ Users; 250 Developers; On version 10.4; Q4 → 2018.2
  43. Tableau Servers: 7 servers = > 448 vCPU, > 1.8 TB RAM, > 175 Gigabit IO
  44. Cluster Config
  45. We Love Data Extracts!
      ● The vast majority of our data sources are Extracts
        ○ Very few live connections
      ● Why?
        ○ BIG DATA
        ○ Some direct connections to Presto or MPP
      ● Extracts provide an aggregation and caching layer
  46. Tableau Data Sources: 1500; 50% via Extract API; 50% run on Server
  47. “Best Practice” Pattern
      1. Use Big Data Portal to develop query
      2. Commit query to ETL repository & deploy
      3. Configure ETL workflow so data dependencies are met
      4. Use ETL job to publish TDE to server
      5. Connect to TDE, Develop Viz, Publish to server, Share
  48. “Self-Serve” Pattern
      1. Use Big Data Portal to develop query
      2. Paste the query into Tableau
      3. Develop Viz
      4. Publish, and Schedule data refresh on Tableau Server
  49. Dilemma...
      ● “Best Practice” Pattern is:
        ○ More robust
        ○ But complex
      ● “Self-Serve” Pattern is:
        ○ Easy and convenient
        ○ Less scalable
        ○ Harder to manage
  50. easy Publish & Refresh
  51. BIG DATA PORTAL Enabling Analytics
  52. Challenges at Netflix
  53. Challenges
      ● Data Scale
      ● Data Lineage
      ● Push Reporting
  54. Challenge 1: Data Scale
  55. We have REALLY big data: 1 Trillion New Data Events Daily; 150 Petabyte Warehouse; 300 Terabytes Written Daily; 5 Petabytes Read Daily
  56. Constantly Balancing
      ● Data volume
      ● Level of Detail
      ● Speed of access
      ● Data prep
  57. Development Choices
      ● Choice 1: Data Engine: MPP; Data Size: < 1B rows; Performance: Up to many minutes
      ● Choice 2: Data Engine: Cloud; Data Size: < 10B rows; Performance: Many minutes
      ● Choice 3: Data Engine: TDE; Data Size: < 100M rows; Performance: Up to many seconds
  58. Choice 4...
      ● For REALLY big data use cases
      ● For very fast interactivity
      ● For custom UI/UX/dataviz
      ● Custom Analytic Tools
        ○ Web app built with JavaScript
        ○ Data stored in Druid
  59. Druid
      ● An open source data system for analytic applications
      ● Distributed, horizontally scalable architecture
      ● VERY, VERY fast
      ● Queries are in JSON format to REST endpoint
      Druid white paper: http://static.druid.io/docs/druid.pdf
  60. Tableau ?
      ● Can we connect Tableau to Druid?
        ○ All the performance benefits of Druid...
        ○ Tableau or web apps use same data store…
      ● We are exploring this...
        ○ There is now a Druid SQL layer based on Apache Calcite
        ○ Have done some testing, finding limitations
  61. In the meantime...
      ● TDE -> Hyper with 2018.2 upgrade
        ○ Happening now(ish)
        ○ Expectations: faster for small and medium data (<100M)
      ● Snowflake
        ○ Fast for “large” data stores (1B+)
      ● Data scale is always a challenge!
  62. Challenge 2: Data Lineage
  63. Challenge 2: Data Lineage
      ● Where did this data come from?
      ● Can I trust this data?
      ● Tableau PRO: very easy to pull in data, analyze, and publish
      ● Tableau CON: very easy to pull in data, analyze, and publish
  64. Example
  65. Workbooks, Data Sources, Data Tables
  66. We have Data Lineage...
      ● ...but not about Tableau
  67. Metadata APIs
      ● Can the upcoming Metadata APIs and Object Model help?
      ● Metadata APIs:
        ○ Inventory of workbooks, data sources, and metrics
        ○ Identify similar existing data and workbooks?
      ● Automate building of similar insights, and integrate to our existing data lineage system
  68. Data Model
  69. Data Model
  70. In the meantime...
      ● Better practices across our “vertical” teams
      ● Manual / brute force methods
      ● Potentially evaluate Alation, Unifi, Collibra, AtScale, Dremio
  71. Challenge 3: Push Reporting
  72. Challenge 3: Push Reporting
  73. What we do...
  74. What we’d like
      ● Improved layout & pagination
      ● Export to different formats
      ● Distribution management: what, who, and when
  75. Looking Forward
  76. In 2019 and Beyond easy
  77. Before we wrap up... Thank YOU!
  78. (no text on slide)
  79. Q&A: Blake Irvine, birvine@netflix.com, @blakeirvine, linkedin.com/in/blakeirvine/. Don’t forget the Survey!
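
Slide 59 notes that Druid queries are JSON documents sent to a REST endpoint, and slide 60 mentions a Druid SQL layer. The sketch below is an illustration of what those two request shapes can look like, not something taken from the talk: the broker URL, the `playback_events` data source, and the helper names are all hypothetical, and the requests are only built, never sent. The `/druid/v2` and `/druid/v2/sql` paths are the endpoints Druid documents for native and SQL queries.

```python
import json
import urllib.request

# Hypothetical values for illustration only; none of these come from the talk.
BROKER_URL = "http://localhost:8082"
DATASOURCE = "playback_events"


def build_timeseries_query(datasource, interval, granularity="day"):
    """Return a Druid native timeseries query that counts rows per time bucket."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": granularity,
        "intervals": [interval],  # ISO-8601 interval, "start/end"
        "aggregations": [{"type": "count", "name": "row_count"}],
    }


def build_sql_query(datasource):
    """Return the JSON body for the Druid SQL endpoint: a total row count."""
    return {"query": f'SELECT COUNT(*) AS row_count FROM "{datasource}"'}


def make_request(path, payload):
    """Wrap a payload in an HTTP POST request object without sending it."""
    return urllib.request.Request(
        BROKER_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


native_request = make_request(
    "/druid/v2", build_timeseries_query(DATASOURCE, "2018-10-01/2018-10-08")
)
sql_request = make_request("/druid/v2/sql", build_sql_query(DATASOURCE))

# To actually run a query against a reachable broker (not done here):
#   with urllib.request.urlopen(native_request) as response:
#       rows = json.load(response)
```

The native form spells out the aggregation as structured JSON, while the SQL form wraps an ordinary SQL string in a small JSON envelope. The SQL path is the one a SQL-speaking client such as Tableau would need, which is why slide 60 singles it out.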
