
"Building Data Foundations and Analytics Tools Across The Product" by Crystal Widjaja (GO-JEK)


Crystal is a data nerd, self-taught programmer, and avid non-fiction reader.
Having joined GO-JEK over two years ago, she has first-hand experience of the many challenges involved with scaling data-driven teams at Indonesia’s first unicorn startup. She currently leads the strategy and vision of the Business Intelligence team’s internal products and data culture across the company. Her team aims to produce actionable insights for all of the different verticals on the GO-JEK platform.

These slides were shared at the Tech in Asia Product Development Conference 2017 (PDC'17) on 9-10 August 2017.

Get more insightful updates from TIA by subscribing at techin.asia/updateselalu




  1. Building Data Foundations and Analytics Tools Across the Product
  2. Who am I?
     ● Started at GO-JEK in July 2015 as the first “data” hire
       ○ First day: Creating a Data Dictionary without any reference tables
       ○ Yesterday: Discussions for a more advanced experimentation platform, prototyping Growth ROI formulas, QAing new data marts
  3. Agenda
     ● Infrastructure for Scale
     ● Data Model Foundations
     ● Tools for Business Users
  4. Infrastructure
  5. GO-JEK Data Today
     ● ~27% growth in data volume per month*
     ● > 5000 Metabase cards and Tableau sheets
     ● > 450 average daily business users on internal data tools
     *This is only business metrics data collected by BI
  6. GO-JEK Data Today
     ● 4 full-time data warehouse developers
     ● > 30 BI data analysts
     ● 100s of microservices across GO-JEK
  7. “The choices you made were the right choices given the facts that you had at the time.” - Ajey Gore, CTO at GO-JEK
  8. Storage
  9. Storage
  10. crontabs are fun
  11. Data Modeling
  12. More data to more people
  13. Current Data Architecture
      ● Staging Layer: RAW dataset
      ● Integration Layer: Fact / Dimension dataset
      ● Access Layer: Summary and roll-up data
      ● Datamart Layer: Product-specialized dataset
  14. Current Data Architecture
      ● Staging Layer: RAW dataset
      ● Integration Layer: Fact / Dimension dataset
      ● Access Layer: Summary and roll-up data
      ● Datamart Layer: Product-specialized dataset
      Why?
      1. Transparency
      2. Standardization
  15. “Can I get a list of all full-time drivers? I want to [give them a reward | put them on a beta group | interview them | … ]”
  16. What qualities make a driver a “full-time driver”?
      ● # of days the driver logs into the app in a week
      ● # of minutes a driver spends on a booking
      ● # of bookings a driver does per day on avg in the past X weeks
      ● # of minutes a driver spends logged into the app per day
      ● # of completed bookings a driver does in a particular service
      ● most common hour the driver logs into the app in the past month
      Keep the First Data Layer Factual
  17. ● Star Schema
      ● Advantages
        ○ Clean and structured model
      ● Disadvantages
        ○ Difficult to do data discovery for non-technical users
        ○ Needs a lot of joins, resulting in high computational resource needs

      Merchant Dimension
        id     nama            kategori_merchant
        1      Warung Bu Iis   TRADISIONAL

      Customer Dimension
        id     nama   nomor_telepon
        123    Jo     628112345678

      Driver Dimension
        id     nama   jenis_kelamin
        456    Asep   M
        457    Doni   M
        458    Siti   F

      Order Fact
        id      id_customer   id_driver   id_merchant
        10001   123           458         1

      Item Fact
        id     id_order   nama_item      harga
        101    10001      Nasi Goreng    30000
        102    10001      Es Teh Manis   5000

      Driver Search Fact
        id   id_driver   nama   status
        1    456         Asep   Rejected
        2    457         Doni   Rejected
        3    458         Siti   Accepted
  18. The Data Model
      ● App Login Data
      ● Bid Data
      ● Completed Booking Data
      ● Income Data
      ● Driver Profile Data
      ● Factual Activity Data
      ● Daily Partition of Driver Activity and Profile Data in Denormalized & Nested Form
  19. for each driver_id...
      ● avg_minutes_online_past_3_days
      ● total_minutes_online_past_3_days
      ● avg_minutes_online_past_7_days
      ● total_days_active_past_3_days
      ● avg_minutes_online_past_30_days
      ● total_orders_completed_past_7_days
      ● avg_income_past_3_days
      ● total_orders_completed_past_30_days
      ● avg_income_past_7_days
      ● total_services_completed_past_7_days
      ● total_completed_ride_past_7_days
      ● total_completed_send_past_7_days
      … and +200 other data points
  20. Tools for Scale
  21. Lifecycle of a Data Point
      ● One Week Old
      ● One Month Old
      ● 3 Months Old
  22. Let Analysts Define Events
  23. Sample Events to Save on Costs
      Better sample that data point...
  24. Take Away
      ● Build for the infrastructure you have, not what you think you’ll have
      ● Build simple step-by-step data models with transparency
      ● Build tools that work for all the different stages of the company
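
Slides 17 and 18 contrast a star schema, which needs many joins, with a denormalized and nested form. The sketch below is one way to picture that trade-off, not GO-JEK's actual pipeline: it takes the example rows from slide 17 and resolves one order's foreign keys into a single nested record. The function name `denormalize_order` and the field `total_harga` are hypothetical, introduced only for illustration.

```python
# Dimension and fact rows copied from the slide 17 example tables.
merchants = {1: {"nama": "Warung Bu Iis", "kategori_merchant": "TRADISIONAL"}}
customers = {123: {"nama": "Jo", "nomor_telepon": "628112345678"}}
drivers = {
    456: {"nama": "Asep", "jenis_kelamin": "M"},
    457: {"nama": "Doni", "jenis_kelamin": "M"},
    458: {"nama": "Siti", "jenis_kelamin": "F"},
}
orders = [{"id": 10001, "id_customer": 123, "id_driver": 458, "id_merchant": 1}]
items = [
    {"id": 101, "id_order": 10001, "nama_item": "Nasi Goreng", "harga": 30000},
    {"id": 102, "id_order": 10001, "nama_item": "Es Teh Manis", "harga": 5000},
]


def denormalize_order(order):
    """Resolve every foreign key of one order into a single nested record."""
    order_items = [
        {"nama_item": item["nama_item"], "harga": item["harga"]}
        for item in items
        if item["id_order"] == order["id"]
    ]
    return {
        "id": order["id"],
        "customer": customers[order["id_customer"]],
        "driver": drivers[order["id_driver"]],
        "merchant": merchants[order["id_merchant"]],
        "items": order_items,
        # Hypothetical convenience field, not part of the slide's tables.
        "total_harga": sum(item["harga"] for item in order_items),
    }


record = denormalize_order(orders[0])
print(record["driver"]["nama"], record["total_harga"])  # prints: Siti 35000
```

A reader of `record` needs no joins to answer "who delivered what, for whom, and for how much", which is the data-discovery benefit the slide attributes to the denormalized form. The cost is that dimension values are repeated in every record.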
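
Slide 16 advises keeping the first data layer factual, and slide 19 lists per-driver rolling metrics such as `avg_minutes_online_past_3_days`. A minimal sketch of that idea follows, assuming a hypothetical input of minutes online per calendar day for one driver. The functions `window_features` and `is_full_time`, the 200-minute threshold, and the choice to average over every day in the window (inactive days included) are all assumptions for illustration, not definitions from the talk.

```python
from datetime import date, timedelta


def window_features(daily_minutes, as_of, days):
    """Summarise one driver's minutes online over the `days` days ending at `as_of`."""
    window = [as_of - timedelta(days=offset) for offset in range(days)]
    # Days with no recorded activity count as zero minutes (assumption).
    minutes = [daily_minutes.get(day, 0) for day in window]
    return {
        f"total_minutes_online_past_{days}_days": sum(minutes),
        f"avg_minutes_online_past_{days}_days": sum(minutes) / days,
        f"total_days_active_past_{days}_days": sum(1 for m in minutes if m > 0),
    }


def is_full_time(features, min_avg_minutes=200):
    """One possible business rule layered on top of the factual features."""
    return features["avg_minutes_online_past_3_days"] >= min_avg_minutes


# Hypothetical activity log for a single driver_id.
daily_minutes = {date(2017, 8, 7): 300, date(2017, 8, 9): 420}
features = window_features(daily_minutes, as_of=date(2017, 8, 9), days=3)
print(features["total_minutes_online_past_3_days"])  # prints: 720
print(is_full_time(features))  # prints: True
```

Because the stored features are plain facts, a team with a different notion of "full-time" can swap in its own rule without asking for a new table.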
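
Slide 23 suggests sampling events to save on costs but gives no method. One common approach, shown here purely as an assumption, is deterministic hash-based sampling: the same event key always gets the same keep-or-drop decision, so results are reproducible across reruns. The function `keep_event` and the event ID format are hypothetical.

```python
import hashlib


def keep_event(event_id: str, sample_rate: float) -> bool:
    """Return True for roughly `sample_rate` of all event IDs, deterministically."""
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    # Map the hash to one of 10,000 integer buckets, numbered 0..9999.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < round(sample_rate * 10_000)


# Hypothetical event IDs; a real pipeline would use its own stable key.
event_ids = [f"event-{n}" for n in range(1000)]
sampled = [event_id for event_id in event_ids if keep_event(event_id, 0.1)]
print(len(sampled))  # roughly a tenth of the events
```

Hashing a stable key rather than drawing a random number also means that raising the sample rate later keeps every event that was kept before, which makes historical comparisons simpler.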
