Evolution of
Data at
Nubank
29/01/2019
André Tavares
Product Manager
Nubank
• Biggest fintech
outside Asia
• 5 million+ credit
card customers
• 2.5 million+
NuContas
• 1300 employees
• 60 squads
• 200 microservices and
30 models in production
• 40 TB of data
processed every day
• 550 DAUs on data tools
Team Mission
Provide reliable and efficient
platform, services and stewardship
for Nubank to
make better decisions with data
2013
• Company started in May 2013
• 10 employees by the end of the year
• Mostly engineers, no one directly
working with data
• No product yet
2013
No Product = No Data
• Getting to product-market fit
is priority #1
• You won’t even have that
much data to work with
until you get there
• Early-stage startups are not
the right place to work as a
Data Product Manager
Learning 1
2014
• First credit card transaction in April
2014
• Product launched for friends & family
• Manual credit approval
• From 10 to 35 employees, head of credit
and first 4 analysts hired
• 10,000 customers by the end of the year
2014
Credit is hard!
• Takes a long time for
credit decisions to be
evaluated (in our case,
several months)
• An incorrect policy could
cause the company to go
bankrupt before anyone
notices
Learning 2
2015
• Product goes viral: from 10,000
customers to 400,000 in a single year!
• Surge in number of customers requires
very fast growth of customer service:
from 35 to 250 employees
• There are now 10 business analysts
and data scientists
• Data science squad created
2015
• First policies built to predict how much
customers would spend and how likely
they are to pay back their cards
2015
Data itself is a product
• Do we have all the data we
need? Obtaining it is part of
the problem
• Is it complete? Correct? Of
good quality? Do we need
backfills?
• Need to follow all regulations
Learning 3
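The questions on this slide (is the data complete, is it correct, do we need backfills?) can be turned into automated checks. The snippet below is a minimal sketch, not the checks Nubank actually ran: the row layout, the `check_dataset` helper and the `QualityReport` type are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class QualityReport:
    missing_days: list  # days with no rows at all: candidates for a backfill
    null_rate: float    # share of rows where the required field is missing


def check_dataset(rows, required_field, start, end):
    """Check daily completeness and field-level nulls (start/end inclusive).

    `rows` is assumed to be a list of dicts that each carry a `day` key.
    """
    days_present = {row["day"] for row in rows}
    expected = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    missing = [day for day in expected if day not in days_present]
    nulls = sum(1 for row in rows if row.get(required_field) is None)
    null_rate = nulls / len(rows) if rows else 0.0
    return QualityReport(missing, null_rate)


# Hypothetical sample: one day is absent and one amount is null.
rows = [
    {"day": date(2015, 3, 1), "amount": 120.0},
    {"day": date(2015, 3, 1), "amount": None},
    {"day": date(2015, 3, 3), "amount": 80.0},
]
report = check_dataset(rows, "amount", date(2015, 3, 1), date(2015, 3, 3))
print(report)
```

A report like this says which days would need a backfill and how much of the dataset is unusable for a given field.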
Failure: “We don’t
need SQL”
2015
2016
• Hit a million customers during the year
• Finished the year with 400 employees
• 30 BAs and DSs
• The data science squad is split up; data
people now work in various teams
• Some engineers start specializing in
data pipelines
2016
Centralized BI doesn’t
scale
• A central team can be
effective to establish
standards and best practices,
and to prioritize an
overwhelming number of
requests
• As the company grows, you
need to embed analytics into
each team to stay agile
Learning 4
• Model creation starts to become more
industrialized
• Automating key reports for the central
bank leads us to create our ETL and
our analytical environment
2016
ETL
• Extract: Data is extracted from the production
environment and sent to the analytical
environment
• Transform: Data is refined into cleaner and
easier to use datasets
• Load: Datasets are loaded into databases that
can be accessed by consumers
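The three steps above can be sketched in a few lines. This is an illustration only, assuming an in-memory list as the "production environment" and an in-memory SQLite database as the consumer-facing store; the record fields, the table name `spend_by_customer` and the function names are hypothetical, and a real pipeline at this scale would run on a distributed engine.

```python
import sqlite3

# Hypothetical raw records as they might leave a production service.
production_events = [
    {"customer_id": "c1", "amount_cents": "1250", "status": "settled"},
    {"customer_id": "c2", "amount_cents": "990", "status": "declined"},
    {"customer_id": "c1", "amount_cents": "4000", "status": "settled"},
]


def extract(source):
    """Extract: copy raw records out of the production side, unchanged."""
    return list(source)


def transform(raw):
    """Transform: keep settled purchases and total them per customer."""
    totals = {}
    for record in raw:
        if record["status"] != "settled":
            continue
        customer = record["customer_id"]
        totals[customer] = totals.get(customer, 0) + int(record["amount_cents"])
    return sorted(totals.items())


def load(conn, dataset):
    """Load: write the refined dataset where consumers can query it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spend_by_customer "
        "(customer_id TEXT PRIMARY KEY, total_cents INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO spend_by_customer VALUES (?, ?)", dataset)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(production_events)))
result = conn.execute(
    "SELECT customer_id, total_cents FROM spend_by_customer ORDER BY customer_id"
).fetchall()
print(result)
```

Keeping the three steps as separate functions means the raw extract can be reused by several transforms without touching production again.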
You need an ETL
• High latency, high
throughput
• Horizontally scalable
• High accessibility
• Heterogeneous data
• Pain on write
• Unified, global
Learning 5
2017
• Over 3 million customers
• Launched our next two products:
Rewards and NuConta
• 700 employees
• 50 BAs and DSs
• Squad data infra
2017
• Structuring our data warehouse
• Dimensional modeling
• Batch models running on the ETL
• First BI tool: Metabase
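Dimensional modeling usually means a central fact table of events joined to small dimension tables that describe them. The sketch below shows that shape with SQLite; the tables `fact_purchase`, `dim_customer` and `dim_date` and their columns are invented for the example and are not taken from Nubank's warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimensions: descriptive attributes, one row per entity.
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- Fact: one row per purchase, pointing at its dimensions.
    CREATE TABLE fact_purchase (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        amount_cents INTEGER
    );

    INSERT INTO dim_customer VALUES (1, 'SP'), (2, 'RJ');
    INSERT INTO dim_date VALUES (20170105, 2017, 1), (20170210, 2017, 2);
    INSERT INTO fact_purchase VALUES
        (1, 20170105, 1000), (1, 20170210, 2500), (2, 20170210, 700);
    """
)

# A typical analyst question: spend per state per month.
spend_by_state_month = conn.execute(
    """
    SELECT c.state, d.month, SUM(f.amount_cents)
    FROM fact_purchase AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY c.state, d.month
    ORDER BY c.state, d.month
    """
).fetchall()
print(spend_by_state_month)
```

Queries of this join-and-group form are the kind a point-and-click BI tool can generate for users who do not write SQL.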
2017
First BI tool: Metabase
• Open source, self-hosted
• Allows querying our data
warehouse (ETL results)
• Go-to tool for writing simple
queries and creating simple
dashboards
• Point and click interface
empowers users that don’t know
SQL
2017
Failure: Contribution
Margin Dataset
ETL Jobs
• Anyone in the company can
contribute ETL jobs by opening a PR
in our monorepo

• Teams are responsible for writing
and maintaining their jobs
• Jobs are written in Scala (Spark SQL);
some DSLs are provided
• Use Databricks to iterate on logic
• Peer review to ensure quality and
consistency
• 100 contributors making 400+
contributions per month
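When many teams contribute jobs to one repository, each job has to declare which datasets it reads so the jobs can run in a valid order. The sketch below shows that idea with Python's standard `graphlib`. It is an assumption about how such a registry might look, not a description of Nubank's Scala tooling; the job names and the `run_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: dataset name -> datasets it reads as inputs.
jobs = {
    "raw_purchases": [],
    "raw_customers": [],
    "purchases_clean": ["raw_purchases"],
    "spend_by_customer": ["purchases_clean", "raw_customers"],
}


def run_order(jobs):
    """Return an execution order in which every job runs after its inputs."""
    declared = set(jobs)
    referenced = {dep for deps in jobs.values() for dep in deps}
    unknown = referenced - declared
    if unknown:
        # A check like this could run in review, before a job is merged.
        raise ValueError(f"undeclared inputs: {sorted(unknown)}")
    # static_order() raises graphlib.CycleError if jobs depend on each other.
    return list(TopologicalSorter(jobs).static_order())


order = run_order(jobs)
print(order)
```

Validating declared inputs automatically is one way peer review can concentrate on the job's logic rather than on its wiring.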
Focus on the Platform
Problem: Data team creating
datasets (tables) for the
entire company
• Lack of context
• Hard to prioritize among
various teams
• Becoming a bottleneck
Learning 6
Solution: Empower vertical
teams to own dataset
creation
• Focus on tooling,
training and support
• Remove
interdependencies
Focus on the Platform
Learning 6
2018
• Over 5 million customers
• Launched debit cards
• 1200 employees
• 90 BAs and DSs
• Squad data infra in Berlin office, squad
data access in São Paulo office
2018
• Models starting to pop up in several areas
of the company
2018
Data Services
Trainings: Weekly trainings on SQL,
Python or Scala, new employee
onboarding, new tool rollout
Support: Dedicated Slack support
channels; a community of users who
support each other
Meetings: Forums for sharing data
scientist and analyst work, monthly
meetings to discuss the state of data
Data Analysts: A function focused on
improving data usage in the company
(not SQL slaves!)
Invest in your people
Learning 7
• Training employees is not
only HR’s job
• Proactive investment in
training can avoid reactive
support work
• Sometimes the problem is
behavioral, not technological
Failure: Moving users to a
new BI tool too fast
2018
Building is not enough
• Internal launches are also
launches
• You need training and
support
• Do the benefits of your new
internal product outweigh
the switching costs?
Learning 8
2019 and beyond
• Future: dozens of millions of customers
• Thousands of employees
• Hundreds of analysts, dozens of data
scientists
• Growing data org
2019
and beyond
• Things we’ll work on:
• New data protection law
• Giving employees even more data
ownership
• Data Portal
• New Data Warehouse
• Infra refactors to better support new
product and refactors
2019
and beyond
RECAP
No Product = No Data
Credit is hard!
Data itself is a product
Centralized BI doesn’t
scale
You need an ETL
Focus on the Platform
Invest in your people
Building is not enough
Interested in working
with us?
sou.nu/jobs-at-nubank
Evolution of Data at Nubank - Product.io Meetup 2019-01-29
