what’s in your workflow?
reproducible business analysis at Capital One
Emily Riederer
Sr. Analyst, Capital One
@EmilyRiederer / emily.riederer@capitalone.com
the reproducibility “brand”
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
The replicability crisis empowered the open science movement and turned reproducibility and workflows into hot topics
• "The Myth of Self-Correcting Science", Sarah Estes, The Atlantic, December 20, 2012
• "How science goes wrong", cover story, The Economist, October 21, 2013
• "Psychology's Replication Crisis has a Silver Lining", Paul Bloom, The Atlantic, February 18, 2016
• "When the Revolution Came for Amy Cuddy", Susan Dominus, New York Times Magazine, October 18, 2017
Many organizations emerged to develop and evangelize better practices for scientific transparency and collaboration
• Offer two-day bootcamps on scientific computing and reproducible research
• Target researchers in diverse academic disciplines (e.g. ecology)
• Over 22,000 workshop attendees and 1,000 instructors since 2012
• “Good Enough Practices in Scientific Computing”
Industry embraced advancements in reproducibility for gains to quality and efficiency
• Developed internal R package for tooling and community building among data scientists
• Promotes efficient work and standardized style
• Share analyses on Knowledge Repo (built on GitHub)
Traditional business analysis’ focus on solving a specific case is a barrier to reproducible thinking
[Diagram labels: Analyst, Scientist; Concrete Business Cases, Abstraction]
Reproducibility Taxonomy (Stodden, et al, 2013): Open/Reproducible, Auditable, Confirmable, Replicable, Reviewable
The traditional lens of business analysis is reinforced by the standard tools of the trade
• Spreadsheets expose data, hide computation
• Scripting logs computation, abstracts data
[Diagram labels: Analyst, Scientist; Concrete Business Cases, Abstraction]
The language business analysts use to articulate their problems obscures the link to technology-driven solutions
Technology terms: Reproducibility, Version Control, Peer/Code Review
Business terms: Re-Work, Tribal Knowledge, Team Work, Sanity Checking
the tidycf R package at Capital One
turning business analysis on its head by turning cashflows on their side
Cashflow analysis is integral to many interrelated pieces of business analytics
[Diagram: Decision Making, Validation & Monitoring, Modeling, Scenario Analysis, Documentation & Governance]
Patchwork processes lead to inefficiency, poor documentation, and limited reproducibility
[Diagram: Decision Making, Validation & Monitoring, Modeling, Scenario Analysis, Documentation & Governance, each annotated with the tools it uses]
Tools shown: Database System, BI Visualization Tool, Legacy Statistical Computing Platform, FTP Client, Spreadsheet Software, Word Processor, Presentation Software
• Black box
• Limited capability
• Manual documentation
• Highly manual process
• System-specific knowledge
• Slow iteration
Building an end-to-end R package enabled an efficient and reproducible workflow
[Diagram: Decision Making, Validation & Monitoring, Modeling, Scenario Analysis]
• Accessible code
• Extensible code
• Real-time documentation
• Automated & reproducible
• General versus system-specific knowledge
• Rapid iteration
Nuanced business decisions are driven by a remarkably standard analytical “engine”
[Diagram labels: Business Processes; Workflow (R for Data Science)]
Cashflow statements are a typical representation of valuation models in the world of financial analysis
Time Period 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Total Revenue $4.7 $21.1 $47.1 $34.3 $1.6 $13.2 $7.9 $57.1 $4.1 $14.7 $4.7 $21.1 $47.1 $34.3 $1.6 $13.2 $7.9 $57.1
Interchange Revenue $2.0 $19.5 $45.8 $31.7 ($0.7) $11.0 $6.8 $54.9 $3.0 $13.7 $2.0 $19.5 $45.8 $31.7 ($0.7) $11.0 $6.8 $54.9
Spend $195.4 $1,945.4 $4,583.6 $3,174.0 ($71.5) $1,096.2 $678.6 $5,486.5 $304.0 $1,366.2 $195.4 $1,945.4 $4,583.6 $3,174.0 ($71.5) $1,096.2 $678.6 $5,486.5
Interchange Rate 1.0%
Interest Revenue $1.6 $0.2 $0.2 $1.4 $0.8 $1.5 $0.9 $0.9 $0.7 $0.5 $1.6 $0.2 $0.2 $1.4 $0.8 $1.5 $0.9 $0.9
Fee Revenue $0.9 $0.4 $0.9 $0.0 $0.7 $0.1 $0.2 $0.6 $0.3 $0.5 $0.9 $0.4 $0.9 $0.0 $0.7 $0.1 $0.2 $0.6
Other Revenues $0.3 $1.0 $0.1 $1.1 $0.8 $0.6 $0.1 $0.7 $0.1 $0.1 $0.3 $1.0 $0.1 $1.1 $0.8 $0.6 $0.1 $0.7
Total Expense $14.2 $3.2 $9.9 $2.0 $8.9 $10.9 $5.0 $1.6 $16.0 $19.9 $14.2 $3.2 $9.9 $2.0 $8.9 $10.9 $5.0 $1.6
Operating Expenses $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0
Marketing Expenses $5.0 $2.0 - $1.0 $2.0 $1.0 $2.0 $1.0 $2.0 $1.0 $5.0 $2.0 $2.0 $1.0 $2.0 $1.0 $2.0 $1.0
Credit Losses - $0.2 $0.2 $0.2 $0.2 $0.2 $0.2 $0.2 $0.2 $0.2 $1.0 $0.2 $0.2 $0.2 $0.2 $0.2 $0.2 $0.2
Recoveries & Coll - $0.1 $0.1 $0.1 $0.1 $0.1 $0.1 $0.1 $0.1 $0.1 $2.0 $0.1 $0.1 $0.1 $0.1 $0.1 $0.1 $0.1
Cost of Funds $12.0 $8.1 $0.5 $16.8 $1.3 $1.6 $5.5 $2.6 $1.6 $0.4 $5.2 $11.7 $7.4 $20.6 $4.8 $12.4 $3.5 $5.0
Outstandings $447.3 $346.8 $19.0 $353.9 $297.6 $42.9 $276.3 $358.8 $50.7 $101.7 $433.2 $479.1 $229.6 $430.4 $272.0 $415.2 $123.1 $154.0
Loan Rate 2.7% 2.3% 2.6% 4.7% 0.4% 3.7% 2.0% 0.7% 3.1% 0.4% 1.2% 2.5% 3.2% 4.8% 1.8% 3.0% 2.9% 3.2%
Other Expenses ($3.8) ($8.2) $8.2 ($17.1) $4.3 $7.0 ($3.7) ($3.3) $11.1 $17.3 $0.0 ($11.9) ($0.8) ($20.9) $0.7 ($3.9) ($1.8) ($5.7)
NIBT ($9.5) $17.9 $37.1 $32.4 ($7.3) $2.3 $2.9 $55.5 ($11.9) ($5.2) ($9.5) $17.9 $37.1 $32.4 ($7.3) $2.3 $2.9 $55.5
Tax ($6.2) $11.6 $24.1 $21.1 ($4.7) $1.5 $1.9 $36.1 ($7.7) ($3.4) ($6.2) $11.6 $24.1 $21.1 ($4.7) $1.5 $1.9 $36.1
Tax Rate 36.0%
NIAT ($3.3) $6.3 $13.0 $11.3 ($2.6) $0.8 $1.0 $19.4 ($4.2) ($1.8) ($3.3) $6.3 $13.0 $11.3 ($2.6) $0.8 $1.0 $19.4
Equity Flow ($5.0) $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 ($5.0) $1.0 $1.0 $1.0 $1.0 $1.0 $1.0 $1.0
Cashflow ($8.3) $7.3 $14.0 $12.3 ($1.6) $1.8 $2.0 $20.4 ($3.2) $4.0 ($8.3) $7.3 $14.0 $12.3 ($1.6) $1.8 $2.0 $20.4
Discounted CF ($8.3) $7.2 $13.8 $12.1 ($1.5) $1.7 $1.9 $19.5 ($3.0) $3.8 ($8.3) $7.2 $13.8 $12.1 ($1.5) $1.7 $1.9 $19.5
Lifetime DCF $47.2
TV $10.0
PV $57.2
Fake data is provided for illustrative purposes only and does not represent Capital One performance
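The bottom rows of the statement can be reproduced with a few lines of code. The sketch below is illustrative Python rather than the R used in the talk. It takes the first ten values of the Discounted CF row as printed on the slide, which on this fake data sum to the Lifetime DCF figure, and adds the terminal value (TV) to arrive at PV. The `discount` helper and its `rate` argument are assumptions added for illustration, since the slide does not state a discount rate; the helper leaves period 1 undiscounted, which matches the first column of the statement.

```python
# First ten Discounted CF values as printed on the slide (fake data).
discounted_cf = [-8.3, 7.2, 13.8, 12.1, -1.5, 1.7, 1.9, 19.5, -3.0, 3.8]
terminal_value = 10.0  # the "TV" row


def discount(cashflow, period, rate):
    """Discount one cashflow back to period 1 (hypothetical helper; rate is assumed)."""
    return cashflow / (1 + rate) ** (period - 1)


lifetime_dcf = round(sum(discounted_cf), 1)
present_value = round(lifetime_dcf + terminal_value, 1)

print(lifetime_dcf)   # 47.2, the "Lifetime DCF" row
print(present_value)  # 57.2, the "PV" row
```

Keeping the arithmetic in code rather than in cells means each figure can be traced to the inputs that produced it.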
However, cashflow statements are not optimized for either human or machine readability
• Mix of time series and pointwise fields; data and assumptions
• Variable information contained in formatting: bold, italics, indentations
• Mix of data and calculations, with every cell exposed to potential typos
• Data defined by specific location on sheet so adding a new line-item can perturb downstream calculations
Instead, we used tidy cashflows, applying the principles of tidy data analysis
• “Happy families are all alike; every unhappy family is unhappy in its own way.” – Leo Tolstoy, Anna Karenina
• “Like families, tidy datasets are all alike but every messy dataset is messy in its own way.” – Hadley Wickham, “Tidy Data”
Segment Time Tot_Rev Tot_Exp NIBT Tax NIAT Eq_Flow Cashflow
Super 1 4.69 14.20 -9.50 -6.18 -3.33 -5.00 -8.33
Super 2 21.08 3.19 17.90 11.63 6.26 1.00 7.26
Super 3 47.07 9.94 37.13 24.13 13.00 1.00 14.00
Super 4 34.35 1.96 32.39 21.05 11.34 1.00 12.34
Super 5 1.59 8.89 -7.30 -4.75 -2.56 1.00 -1.56
Prime 1 57.61 47.45 10.16 6.60 3.56 -5.00 -1.44
Prime 2 93.78 5.52 88.26 57.37 30.89 1.00 31.89
Prime 3 17.74 54.17 -36.43 -23.68 -12.75 1.00 -11.75
Prime 4 36.98 3.93 33.05 21.48 11.57 1.00 12.57
Prime 5 78.72 55.98 22.74 14.78 7.96 1.00 8.96
Time 1 2 3 4 5
Superprime
Total Revenue $4.69 $21.08 $47.07 $34.35 $1.59
Total Expense $14.20 $3.19 $9.94 $1.96 $8.89
NIBT -$9.50 $17.90 $37.13 $32.39 -$7.30
Tax -$6.18 $11.63 $24.13 $21.05 -$4.75
NIAT -$3.33 $6.26 $13.00 $11.34 -$2.56
Equity Flow -$5.00 $1.00 $1.00 $1.00 $1.00
Cashflow -$8.33 $7.26 $14.00 $12.34 -$1.56
Prime
Total Revenue $57.61 $93.78 $17.74 $36.98 $78.72
Total Expense $47.45 $5.52 $54.17 $3.93 $55.98
NIBT $10.16 $88.26 -$36.43 $33.05 $22.74
Tax $6.60 $57.37 -$23.68 $21.48 $14.78
NIAT $3.56 $30.89 -$12.75 $11.57 $7.96
Equity Flow -$5.00 $1.00 $1.00 $1.00 $1.00
Cashflow -$1.44 $31.89 -$11.75 $12.57 $8.96
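The move from the statement layout to the tidy layout can be sketched as a simple reshape. This is an illustrative Python example, not tidycf code (the talk's package is written in R and its internals are not shown); the `to_tidy` function name is hypothetical. It turns line items stored as rows of per-period values into one record per segment and time period, using the first two Superprime periods from the slide.

```python
# Wide, statement-style rows: one line item per row, one value per time period.
# Values are the Superprime block from the slide (fake data), periods 1-2 only.
wide = {
    "Total Revenue": [4.69, 21.08],
    "Total Expense": [14.20, 3.19],
    "Cashflow": [-8.33, 7.26],
}


def to_tidy(segment, wide_rows):
    """Return one record per (segment, time) with each line item as a column."""
    n_periods = len(next(iter(wide_rows.values())))
    tidy = []
    for t in range(n_periods):
        record = {"Segment": segment, "Time": t + 1}
        for item, values in wide_rows.items():
            record[item] = values[t]
        tidy.append(record)
    return tidy


tidy_rows = to_tidy("Super", wide)
for row in tidy_rows:
    print(row)
```

In the tidy form every value is addressed by column name and key (segment, time) rather than by cell position, so adding a line item does not shift anything downstream.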
Tidy cashflows streamline the workflow to facilitate advanced analytics like bootstrapping error bars and indifference curves
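As a rough illustration of bootstrapped error bars, the sketch below computes a percentile bootstrap interval for a mean cashflow. It is written in Python for illustration and is not the talk's R implementation. The slide does not say which quantity was bootstrapped or how, so the choice of statistic, the 90% level, and the function name `bootstrap_mean_interval` are all assumptions; resampling periods independently is also a simplification for time-ordered data.

```python
import random
import statistics

# Cashflows for the "Super" segment from the tidy table (fake data).
cashflows = [-8.33, 7.26, 14.00, 12.34, -1.56]


def bootstrap_mean_interval(values, n_resamples=2000, level=0.90, seed=1):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lower_idx = int((1 - level) / 2 * n_resamples)
    upper_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lower_idx], means[upper_idx]


low, high = bootstrap_mean_interval(cashflows)
print(f"mean = {statistics.fmean(cashflows):.2f}, interval = ({low:.2f}, {high:.2f})")
```

Because the data are tidy, the same function could be applied per segment by grouping the records first.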
Organically evolving the tidycf package while addressing business problems led to efficient and empathetic development
Task: Valuations
Business Problems: Process 1: Data Validation; Process 2: Data Exploration; Process 3: Model Building; …; Process n-1: Model Validation; Process n: Model Intuition; Process n+1 … z: Analysis with Model
R Markdown Templates: Framework 1: Data Validation; Framework 2: Data Exploration; Framework 3: Model Building; …; Framework n-1: Model Validation; Framework n: Model Intuition; Framework n+1 … z: Analysis with Model
R functions: calc functions, viz functions, tbl functions, …
RMarkdown templates enable package discoverability, immersion into the broader R language, and contextual knowledge transfer while generating documentation
• Code comments explain syntax and suggest new functions to try
• Text commentary facilitates knowledge transfer of business context and intuition
Functions manipulate tidy data and provide output consistent with their taxonomy and compatible with any tidyverse pipeline
• get functions: DB Conns, Internet Proxies, etc.
• calc functions: data frame
• viz functions: graphic
• tbl functions: pivoted data
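The naming taxonomy can be illustrated with two small functions that chain together. This is a Python sketch, not the tidycf API: the function names `calc_cashflow` and `tbl_by_time` are hypothetical, and the rule that Cashflow equals NIAT plus Eq_Flow is inferred from the numbers in the tidy table rather than stated in the talk. The calc-style step takes tidy records and returns tidy records with a new column; the tbl-style step pivots them for display.

```python
# Tidy cashflow records (fake data from the slide's tidy table, shortened).
rows = [
    {"Segment": "Super", "Time": 1, "NIAT": -3.33, "Eq_Flow": -5.00},
    {"Segment": "Super", "Time": 2, "NIAT": 6.26, "Eq_Flow": 1.00},
    {"Segment": "Prime", "Time": 1, "NIAT": 3.56, "Eq_Flow": -5.00},
    {"Segment": "Prime", "Time": 2, "NIAT": 30.89, "Eq_Flow": 1.00},
]


def calc_cashflow(records):
    """calc-style step: tidy records in, tidy records out, with a new column."""
    return [
        {**r, "Cashflow": round(r["NIAT"] + r["Eq_Flow"], 2)} for r in records
    ]


def tbl_by_time(records, value):
    """tbl-style step: pivot to one row per segment, one column per period."""
    table = {}
    for r in records:
        table.setdefault(r["Segment"], {})[r["Time"]] = r[value]
    return table


pivoted = tbl_by_time(calc_cashflow(rows), "Cashflow")
print(pivoted)
```

Because each calc-style step returns the same shape it receives, steps can be composed in any order before a final viz- or tbl-style step produces output.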
These functions are intuitively related to help users quickly generate standard views
Integration with the tidyverse allows functions to provide both structure and flexibility
tidycf’s R Project template standardizes file structure for better project management
RStudio > File > New Project > New Directory
• /analysis/ : core scripts (.Rmd) and final outputs (.HTML)
• /data/ : raw data (or in our case, pulled from SQL)
• /doc/ : text files providing context and documentation
• /ext/ : external files needed for project
• /output/ : intermediate/final data formats
• /src/ : other helper scripts (e.g. SQL, Python)
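The directory layout above can be scaffolded with a few lines of code. In the talk this is done by an RStudio project template; the Python sketch below is only an illustration of the same idea, and the function name `create_project` and the project name `my_valuation` are hypothetical.

```python
import tempfile
from pathlib import Path

# Directory names as listed on the slide.
PROJECT_DIRS = ["analysis", "data", "doc", "ext", "output", "src"]


def create_project(root):
    """Create the standard sub-directories under `root` and return their paths."""
    root = Path(root)
    created = []
    for name in PROJECT_DIRS:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demonstrate in a throwaway directory so nothing is written to a real project.
with tempfile.TemporaryDirectory() as tmp:
    dirs = create_project(Path(tmp) / "my_valuation")
    layout = sorted(p.name for p in dirs if p.is_dir())
    print(layout)
```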
RMarkdown templates document data transformations and demonstrate relative paths to save intermediate artifacts in the appropriate directory
[Diagram columns: Input Data, R Markdown Templates, Output Data]
Directories: ./data/ (or External Source), ./analysis/, ./output/
Templates: Data Validation, Exploratory Data Analysis, (Multiple Modeling Steps), Model Validation, Model Analytics, …, Model Monitoring
Artifacts: raw, out1, out2, …, out_t, out_t+1, model
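The hand-off between templates can be sketched as one step writing an artifact under output/ and the next step reading it back, with every path built relative to the project root. This Python example is illustrative only (the talk's templates are R Markdown); the helper names `save_artifact` and `load_artifact` and the CSV format are assumptions, while the artifact name `out1` follows the diagram.

```python
import csv
import tempfile
from pathlib import Path


def save_artifact(project_root, name, rows):
    """Write `rows` (list of dicts) to output/<name>.csv under the project root."""
    out_path = Path(project_root) / "output" / f"{name}.csv"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return out_path


def load_artifact(project_root, name):
    """Read an artifact written by an earlier step back in as a list of dicts."""
    with (Path(project_root) / "output" / f"{name}.csv").open(newline="") as f:
        return list(csv.DictReader(f))


with tempfile.TemporaryDirectory() as root:
    validated = [{"Segment": "Super", "Time": "1", "Cashflow": "-8.33"}]
    saved = save_artifact(root, "out1", validated)  # step 1 writes out1
    relative = saved.relative_to(root).as_posix()
    reloaded = load_artifact(root, "out1")          # step 2 reads out1
    print(relative, reloaded == validated)
```

Since no absolute path appears in the code, the project folder can be moved or shared without edits.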
Industry, academia, and the open-source community served as inspirations for our three-pronged training approach
• Individual Resources: Set-Up Guides, Self-Directed Learning, and Prerequisites
• Bootcamps: Three-day bootcamp, conceptual and hands-on
• Community Building: Interaction (forum) and engagement (contribution) opportunities
Our resulting package emphasizes reproducibility while immersing business analysts in R
Packages shown: odbc, DBI, dbplyr, rmarkdown, knitr, httr, devtools, plotly, tidyverse

Contenu connexe

Tendances

Inv 03.statistics review a_macias_in_class_fall2013
Inv 03.statistics review a_macias_in_class_fall2013Inv 03.statistics review a_macias_in_class_fall2013
Inv 03.statistics review a_macias_in_class_fall2013DFitzmorris
 
Goldmoney inc june ir 06.07.16 (1)
Goldmoney inc   june ir 06.07.16 (1)Goldmoney inc   june ir 06.07.16 (1)
Goldmoney inc june ir 06.07.16 (1)BitGold Inc
 
Epic research malaysia daily klse report for 31st december 2015
Epic research malaysia   daily klse report for 31st december 2015Epic research malaysia   daily klse report for 31st december 2015
Epic research malaysia daily klse report for 31st december 2015Epic Research Pte. Ltd.
 
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016Nicole Chan
 
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016Nicole Chan
 

Tendances (6)

Inv 03.statistics review a_macias_in_class_fall2013
Inv 03.statistics review a_macias_in_class_fall2013Inv 03.statistics review a_macias_in_class_fall2013
Inv 03.statistics review a_macias_in_class_fall2013
 
Goldmoney inc june ir 06.07.16 (1)
Goldmoney inc   june ir 06.07.16 (1)Goldmoney inc   june ir 06.07.16 (1)
Goldmoney inc june ir 06.07.16 (1)
 
Weekly Market Analysis
Weekly Market AnalysisWeekly Market Analysis
Weekly Market Analysis
 
Epic research malaysia daily klse report for 31st december 2015
Epic research malaysia   daily klse report for 31st december 2015Epic research malaysia   daily klse report for 31st december 2015
Epic research malaysia daily klse report for 31st december 2015
 
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
 
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
 

Similaire à What's in your workflow? Bringing data science workflows to business analysis at Capital One

ACG_Cup_-_Valuation
ACG_Cup_-_ValuationACG_Cup_-_Valuation
ACG_Cup_-_ValuationIke Ekeh
 
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docxRound 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docxjoellemurphey
 
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docxRound 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docxjoellemurphey
 
WatchOver - Alzheimer's Caregiver Wearable Technology
WatchOver - Alzheimer's Caregiver Wearable TechnologyWatchOver - Alzheimer's Caregiver Wearable Technology
WatchOver - Alzheimer's Caregiver Wearable TechnologyJohn W. Quinn
 
Looptworks case analysis
Looptworks case analysisLooptworks case analysis
Looptworks case analysisFred Wu
 
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPY
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPYBrunswick (BC) Pitch - Jonathan Chang - FINAL COPY
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPYJonathan Chang
 
Champlain College Financial Literacy- Art Woolf Presentation
Champlain College Financial Literacy- Art Woolf PresentationChamplain College Financial Literacy- Art Woolf Presentation
Champlain College Financial Literacy- Art Woolf Presentationmur12
 
Financial Forecasting
Financial ForecastingFinancial Forecasting
Financial ForecastingJamariHodges1
 
RR - Technicals 1.pdf
RR - Technicals 1.pdfRR - Technicals 1.pdf
RR - Technicals 1.pdfmerag76668
 
C3 comp redesign 6-3-13 (2)
C3 comp redesign  6-3-13 (2)C3 comp redesign  6-3-13 (2)
C3 comp redesign 6-3-13 (2)Mark Wolkove
 
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxPage 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxgerardkortney
 
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxPage 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxalfred4lewis58146
 
DiscussionThe utilization of functional communication training
DiscussionThe utilization of functional communication trainingDiscussionThe utilization of functional communication training
DiscussionThe utilization of functional communication trainingDustiBuckner14
 
Pat Von Tersch - Managing Margins in Market Downturns
Pat Von Tersch - Managing Margins in Market DownturnsPat Von Tersch - Managing Margins in Market Downturns
Pat Von Tersch - Managing Margins in Market DownturnsJohn Blue
 
21st Century Compensation Planning
21st Century Compensation Planning21st Century Compensation Planning
21st Century Compensation PlanningRelax, It's Handled
 
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docxRound 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docxdaniely50
 
Excerpts taken from Lucas, S.E. (12th ed.) (2015). The Art .docx
Excerpts taken from  Lucas, S.E. (12th ed.) (2015).  The Art .docxExcerpts taken from  Lucas, S.E. (12th ed.) (2015).  The Art .docx
Excerpts taken from Lucas, S.E. (12th ed.) (2015). The Art .docxcravennichole326
 

Similaire à What's in your workflow? Bringing data science workflows to business analysis at Capital One (20)

ACG Cup - Valuation
ACG Cup - ValuationACG Cup - Valuation
ACG Cup - Valuation
 
ACG_Cup_-_Valuation
ACG_Cup_-_ValuationACG_Cup_-_Valuation
ACG_Cup_-_Valuation
 
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docxRound 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
 
Sample LBO Model Template – 2
Sample LBO Model Template – 2Sample LBO Model Template – 2
Sample LBO Model Template – 2
 
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docxRound 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
 
WatchOver - Alzheimer's Caregiver Wearable Technology
WatchOver - Alzheimer's Caregiver Wearable TechnologyWatchOver - Alzheimer's Caregiver Wearable Technology
WatchOver - Alzheimer's Caregiver Wearable Technology
 
Looptworks case analysis
Looptworks case analysisLooptworks case analysis
Looptworks case analysis
 
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPY
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPYBrunswick (BC) Pitch - Jonathan Chang - FINAL COPY
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPY
 
Champlain College Financial Literacy- Art Woolf Presentation
Champlain College Financial Literacy- Art Woolf PresentationChamplain College Financial Literacy- Art Woolf Presentation
Champlain College Financial Literacy- Art Woolf Presentation
 
Financial Forecasting
Financial ForecastingFinancial Forecasting
Financial Forecasting
 
RR - Technicals 1.pdf
RR - Technicals 1.pdfRR - Technicals 1.pdf
RR - Technicals 1.pdf
 
C3 comp redesign 6-3-13 (2)
C3 comp redesign  6-3-13 (2)C3 comp redesign  6-3-13 (2)
C3 comp redesign 6-3-13 (2)
 
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxPage 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
 
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxPage 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
 
Sample LBO Model Template
Sample LBO Model TemplateSample LBO Model Template
Sample LBO Model Template
 
DiscussionThe utilization of functional communication training
DiscussionThe utilization of functional communication trainingDiscussionThe utilization of functional communication training
DiscussionThe utilization of functional communication training
 
Pat Von Tersch - Managing Margins in Market Downturns
Pat Von Tersch - Managing Margins in Market DownturnsPat Von Tersch - Managing Margins in Market Downturns
Pat Von Tersch - Managing Margins in Market Downturns
 
21st Century Compensation Planning
21st Century Compensation Planning21st Century Compensation Planning
21st Century Compensation Planning
 
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docxRound 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
 
Excerpts taken from Lucas, S.E. (12th ed.) (2015). The Art .docx
Excerpts taken from  Lucas, S.E. (12th ed.) (2015).  The Art .docxExcerpts taken from  Lucas, S.E. (12th ed.) (2015).  The Art .docx
Excerpts taken from Lucas, S.E. (12th ed.) (2015). The Art .docx
 

Plus de Domino Data Lab

The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...Domino Data Lab
 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataDomino Data Lab
 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itDomino Data Lab
 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationDomino Data Lab
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryDomino Data Lab
 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusDomino Data Lab
 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterDomino Data Lab
 
GeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceGeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceDomino Data Lab
 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Domino Data Lab
 
Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Domino Data Lab
 
Leveraged Analytics at Scale
Leveraged Analytics at ScaleLeveraged Analytics at Scale
Leveraged Analytics at ScaleDomino Data Lab
 
How I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataHow I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataDomino Data Lab
 
Software Engineering for Data Scientists
Software Engineering for Data ScientistsSoftware Engineering for Data Scientists
Software Engineering for Data ScientistsDomino Data Lab
 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Domino Data Lab
 
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyDomino Data Lab
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsDomino Data Lab
 
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino Data Lab
 
The Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceThe Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceDomino Data Lab
 
Fuzzy Matching to the Rescue
Fuzzy Matching to the RescueFuzzy Matching to the Rescue
Fuzzy Matching to the RescueDomino Data Lab
 

Plus de Domino Data Lab (20)

The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops data
 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using it
 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentation
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive Industry
 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile Virus
 
Reproducible Dashboards and other great things to do with Jupyter
What's in your workflow? Bringing data science workflows to business analysis at Capital One

  • 1. what’s in your workflow? reproducible business analysis at Capital One. Emily Riederer, Sr. Analyst, Capital One (@EmilyRiederer / emily.riederer@capitalone.com)
  • 3. The replicability crisis empowered the open science movement and turned reproducibility and workflows into hot topics
      • "The Myth of Self-Correcting Science", The Atlantic, Sarah Estes, December 20, 2012
      • "How science goes wrong", The Economist, cover story, October 21, 2013
      • "Psychology's Replication Crisis has a Silver Lining", The Atlantic, Paul Bloom, February 18, 2016
      • "When the Revolution Came for Amy Cuddy", New York Times Magazine, Susan Dominus, October 18, 2017
  • 4. Many organizations emerged to develop and evangelize better practices for scientific transparency and collaboration
      • Offer two-day bootcamps on scientific computing and reproducible research
      • Target researchers in diverse academic disciplines (e.g. ecology)
      • Over 22,000 workshop attendees and 1,000 instructors since 2012
      • “Good Enough Practices in Scientific Computing”
  • 5. Industry embraced advancements in reproducibility for gains to quality and efficiency
      • Developed internal R package for tooling and community building among data scientists
      • Promotes efficient work and standardized style
      • Share analyses on Knowledge Repo (built on GitHub)
  • 6. Traditional business analysis’ focus on solving a specific case is a barrier to reproducible thinking
      • Diagram labels: Analyst, Scientist; Concrete Business Cases, Abstraction
      • Reproducibility Taxonomy (Stodden et al., 2013): Open/Reproducible, Auditable, Confirmable, Replicable, Reviewable
  • 7. The traditional lens of business analysis is reinforced by the standard tools of the trade
      • Spreadsheets expose data, hide computation
      • Scripting logs computation, abstracts data
      • Diagram labels: Analyst, Scientist; Concrete Business Cases, Abstraction
  • 8. The language business analysts use to articulate their problems obscures the link to technology-driven solutions
      • Terms on the slide: Reproducibility, Version Control, Peer/Code Review; Re-Work, Tribal Knowledge, Team Work, Sanity Checking
  • 9. the tidycf R package at Capital One: turning business analysis on its head by turning cashflows on their side
  • 10. Cashflow analysis is integral to many interrelated pieces of business analytics
      • Decision Making
      • Validation & Monitoring
      • Modeling
      • Scenario Analysis
      • Documentation & Governance
  • 11. Patchwork processes lead to inefficiency, poor documentation, and limited reproducibility
      • Stages: Decision Making, Validation & Monitoring, Modeling, Scenario Analysis, Documentation & Governance
      • Tools across the stages: Database System, BI Visualization Tool, Legacy Statistical Computing Platform (shown three times), FTP Client (twice), Spreadsheet Software (three times), Word Processor (twice), Presentation Software
      • Drawbacks: black box; limited capability; manual documentation; highly manual process; system-specific knowledge; slow iteration
  • 12. Building an end-to-end R package enabled an efficient and reproducible workflow
      • Stages: Decision Making, Validation & Monitoring, Modeling, Scenario Analysis
      • Accessible code
      • Extensible code
      • Real-time documentation
      • Automated & reproducible
      • General versus system-specific knowledge
      • Rapid iteration
  • 13. Nuanced business decisions are driven by a remarkably standard analytical “engine”
      • Diagram labels: Business Processes; Workflow (R for Data Science)
  • 14. Cashflow statements are a typical representation of valuation models in the world of financial analysis
      Time Period:  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18
      Total Revenue:  $4.7  $21.1  $47.1  $34.3  $1.6  $13.2  $7.9  $57.1  $4.1  $14.7  $4.7  $21.1  $47.1  $34.3  $1.6  $13.2  $7.9  $57.1
      Interchange Revenue:  $2.0  $19.5  $45.8  $31.7  ($0.7)  $11.0  $6.8  $54.9  $3.0  $13.7  $2.0  $19.5  $45.8  $31.7  ($0.7)  $11.0  $6.8  $54.9
      Spend:  $195.4  $1,945.4  $4,583.6  $3,174.0  ($71.5)  $1,096.2  $678.6  $5,486.5  $304.0  $1,366.2  $195.4  $1,945.4  $4,583.6  $3,174.0  ($71.5)  $1,096.2  $678.6  $5,486.5
      Interchange Rate:  1.0%
      Interest Revenue:  $1.6  $0.2  $0.2  $1.4  $0.8  $1.5  $0.9  $0.9  $0.7  $0.5  $1.6  $0.2  $0.2  $1.4  $0.8  $1.5  $0.9  $0.9
      Fee Revenue:  $0.9  $0.4  $0.9  $0.0  $0.7  $0.1  $0.2  $0.6  $0.3  $0.5  $0.9  $0.4  $0.9  $0.0  $0.7  $0.1  $0.2  $0.6
      Other Revenues:  $0.3  $1.0  $0.1  $1.1  $0.8  $0.6  $0.1  $0.7  $0.1  $0.1  $0.3  $1.0  $0.1  $1.1  $0.8  $0.6  $0.1  $0.7
      Total Expense:  $14.2  $3.2  $9.9  $2.0  $8.9  $10.9  $5.0  $1.6  $16.0  $19.9  $14.2  $3.2  $9.9  $2.0  $8.9  $10.9  $5.0  $1.6
      Operating Expenses:  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0
      Marketing Expenses:  $5.0  $2.0  $-  $1.0  $2.0  $1.0  $2.0  $1.0  $2.0  $1.0  $5.0  $2.0  $2.0  $1.0  $2.0  $1.0  $2.0  $1.0
      Credit Losses:  $-  $0.2  $0.2  $0.2  $0.2  $0.2  $0.2  $0.2  $0.2  $0.2  $1.0  $0.2  $0.2  $0.2  $0.2  $0.2  $0.2  $0.2
      Recoveries & Coll:  $-  $0.1  $0.1  $0.1  $0.1  $0.1  $0.1  $0.1  $0.1  $0.1  $2.0  $0.1  $0.1  $0.1  $0.1  $0.1  $0.1  $0.1
      Cost of Funds:  $12.0  $8.1  $0.5  $16.8  $1.3  $1.6  $5.5  $2.6  $1.6  $0.4  $5.2  $11.7  $7.4  $20.6  $4.8  $12.4  $3.5  $5.0
      Outstandings:  $447.3  $346.8  $19.0  $353.9  $297.6  $42.9  $276.3  $358.8  $50.7  $101.7  $433.2  $479.1  $229.6  $430.4  $272.0  $415.2  $123.1  $154.0
      Loan Rate:  2.7%  2.3%  2.6%  4.7%  0.4%  3.7%  2.0%  0.7%  3.1%  0.4%  1.2%  2.5%  3.2%  4.8%  1.8%  3.0%  2.9%  3.2%
      Other Expenses:  ($3.8)  ($8.2)  $8.2  ($17.1)  $4.3  $7.0  ($3.7)  ($3.3)  $11.1  $17.3  $0.0  ($11.9)  ($0.8)  ($20.9)  $0.7  ($3.9)  ($1.8)  ($5.7)
      NIBT:  ($9.5)  $17.9  $37.1  $32.4  ($7.3)  $2.3  $2.9  $55.5  ($11.9)  ($5.2)  ($9.5)  $17.9  $37.1  $32.4  ($7.3)  $2.3  $2.9  $55.5
      Tax:  ($6.2)  $11.6  $24.1  $21.1  ($4.7)  $1.5  $1.9  $36.1  ($7.7)  ($3.4)  ($6.2)  $11.6  $24.1  $21.1  ($4.7)  $1.5  $1.9  $36.1
      Tax Rate:  36.0%
      NIAT:  ($3.3)  $6.3  $13.0  $11.3  ($2.6)  $0.8  $1.0  $19.4  ($4.2)  ($1.8)  ($3.3)  $6.3  $13.0  $11.3  ($2.6)  $0.8  $1.0  $19.4
      Equity Flow:  ($5.0)  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  ($5.0)  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0  $1.0
      Cashflow:  ($8.3)  $7.3  $14.0  $12.3  ($1.6)  $1.8  $2.0  $20.4  ($3.2)  $4.0  ($8.3)  $7.3  $14.0  $12.3  ($1.6)  $1.8  $2.0  $20.4
      Discounted CF:  ($8.3)  $7.2  $13.8  $12.1  ($1.5)  $1.7  $1.9  $19.5  ($3.0)  $3.8  ($8.3)  $7.2  $13.8  $12.1  ($1.5)  $1.7  $1.9  $19.5
      Lifetime DCF:  $47.2;  TV:  $10.0;  PV:  $57.2
      Fake data is provided for illustrative purposes only and does not represent Capital One performance
  • 15. However, cashflow statements are not optimized for either human or machine readability
      • Mix of time series and pointwise fields; data and assumptions
      • Variable information contained in formatting: bold, italics, indentations
      • Mix of data and calculations, with every cell exposed to potential typos
      • Data defined by specific location on sheet, so adding a new line item can perturb downstream calculations
  • 16. Instead, we used tidy cashflows, applying the principles of tidy data analysis
      • “Happy families are all alike; every unhappy family is unhappy in its own way.” – Leo Tolstoy, Anna Karenina
      • “Like families, tidy datasets are all alike but every messy dataset is messy in its own way.” – Hadley Wickham, “Tidy Data”

      Tidy layout:
      Segment  Time  Tot_Rev  Tot_Exp    NIBT     Tax    NIAT  Eq_Flow  Cashflow
      Super       1     4.69    14.20   -9.50   -6.18   -3.33    -5.00     -8.33
      Super       2    21.08     3.19   17.90   11.63    6.26     1.00      7.26
      Super       3    47.07     9.94   37.13   24.13   13.00     1.00     14.00
      Super       4    34.35     1.96   32.39   21.05   11.34     1.00     12.34
      Super       5     1.59     8.89   -7.30   -4.75   -2.56     1.00     -1.56
      Prime       1    57.61    47.45   10.16    6.60    3.56    -5.00     -1.44
      Prime       2    93.78     5.52   88.26   57.37   30.89     1.00     31.89
      Prime       3    17.74    54.17  -36.43  -23.68  -12.75     1.00    -11.75
      Prime       4    36.98     3.93   33.05   21.48   11.57     1.00     12.57
      Prime       5    78.72    55.98   22.74   14.78    7.96     1.00      8.96

      Statement layout:
      Time                  1        2        3        4        5
      Superprime
        Total Revenue   $4.69   $21.08   $47.07   $34.35    $1.59
        Total Expense  $14.20    $3.19    $9.94    $1.96    $8.89
        NIBT           -$9.50   $17.90   $37.13   $32.39   -$7.30
        Tax            -$6.18   $11.63   $24.13   $21.05   -$4.75
        NIAT           -$3.33    $6.26   $13.00   $11.34   -$2.56
        Equity Flow    -$5.00    $1.00    $1.00    $1.00    $1.00
        Cashflow       -$8.33    $7.26   $14.00   $12.34   -$1.56
      Prime
        Total Revenue  $57.61   $93.78   $17.74   $36.98   $78.72
        Total Expense  $47.45    $5.52   $54.17    $3.93   $55.98
        NIBT           $10.16   $88.26  -$36.43   $33.05   $22.74
        Tax             $6.60   $57.37  -$23.68   $21.48   $14.78
        NIAT            $3.56   $30.89  -$12.75   $11.57    $7.96
        Equity Flow    -$5.00    $1.00    $1.00    $1.00    $1.00
        Cashflow       -$1.44   $31.89  -$11.75   $12.57    $8.96

      Fake data is provided for illustrative purposes only and does not represent Capital One performance
  • 17. Tidy cashflows streamline the workflow to facilitate advanced analytics like bootstrapping error bars and indifference curves
  • 18. Organically evolving the tidycf package while addressing business problems led to efficient and empathetic development
      • Task: Valuations
      • Business Problems: Process 1: Data Validation; Process 2: Data Exploration; Process 3: Model Building; …; Process n-1: Model Validation; Process n: Model Intuition; Process n+1 … z: Analysis with Model
      • R Markdown Templates: Framework 1: Data Validation; Framework 2: Data Exploration; Framework 3: Model Building; …; Framework n-1: Model Validation; Framework n: Model Intuition; Framework n+1 … z: Analysis with Model
      • R functions: calc functions, viz functions, tbl functions, …
  • 19. RMarkdown templates enable package discoverability, immersion into the broader R language, and contextual knowledge transfer while generating documentation
      • Code comments explain syntax and suggest new functions to try
      • Text commentary facilitates knowledge transfer of business context and intuition
  • 20. Functions manipulate tidy data and provide output consistent with their taxonomy and compatible with any tidyverse pipeline
      • Diagram labels: data frame; calc functions, data frame; viz functions, graphic; tbl functions, pivoted data
      • get functions: DB Conns, Internet Proxies, etc.
  • 21. These functions are intuitively related to help users quickly generate standard views
  • 22. Integration with the tidyverse allows functions to provide both structure and flexibility
  • 23. tidycf’s R Project template standardizes file structure for better project management
      RStudio > File > New Project > New Directory
      • /analysis/: core scripts (.Rmd) and final outputs (.HTML)
      • /data/: raw data (or in our case, pulled from SQL)
      • /doc/: text files providing context and documentation
      • /ext/: external files needed for project
      • /output/: intermediate/final data formats
      • /src/: other helper scripts (e.g. SQL, python)
  • 24. RMarkdown templates document data transformations and demonstrate relative paths to save intermediate artifacts in the appropriate directory
      • R Markdown Templates: Data Validation, Exploratory Data Analysis, Model Validation, Model Analytics (Multiple Modeling Steps), …, Model Monitoring
      • Input Data: raw, out1, out_t, out_t+1, model
      • Output Data: model, out1, out2, out_t+1, model
      • Directory: ./data/, ./analysis/, ./output/, external (External Source)
  • 25. Industry, academia, and the open-source community served as inspirations for our three-pronged training approach
      • Individual Resources: Set-Up Guides, Self-Directed Learning, and Prerequisites
      • Bootcamps: Three-day bootcamp, conceptual and hands-on
      • Community Building: Interaction (forum) and engagement (contribution) opportunities
  • 26. Our resulting package emphasizes reproducibility while immersing business analysts in R
      • Packages shown: odbc, DBI, dbplyr, rmarkdown, knitr, httr, devtools, plotly, tidyverse
  • 27. what’s in your workflow? reproducible business analysis at Capital One. Emily Riederer, Sr. Analyst, Capital One (@EmilyRiederer / emily.riederer@capitalone.com)
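
The reshaping described on slide 16, from a statement layout (one row per line item, one column per time period) to tidy rows (one row per segment and time period), can be sketched roughly as follows. This is an illustration in plain Python, not tidycf's own R code, which the slides do not show. The names `WIDE`, `to_tidy`, and `total_by_segment` are hypothetical, and the numbers are a subset of the slide's fake data.

```python
from collections import defaultdict

# Statement ("wide") layout: line item -> values by time period, per segment.
# Values are copied from the fake illustrative data on slide 16.
WIDE = {
    "Super": {
        "Tot_Rev": [4.69, 21.08, 47.07, 34.35, 1.59],
        "Tot_Exp": [14.20, 3.19, 9.94, 1.96, 8.89],
        "Cashflow": [-8.33, 7.26, 14.00, 12.34, -1.56],
    },
    "Prime": {
        "Tot_Rev": [57.61, 93.78, 17.74, 36.98, 78.72],
        "Tot_Exp": [47.45, 5.52, 54.17, 3.93, 55.98],
        "Cashflow": [-1.44, 31.89, -11.75, 12.57, 8.96],
    },
}


def to_tidy(wide):
    """Return one dict per (segment, time) observation, one key per variable."""
    rows = []
    for segment, items in wide.items():
        # Assumes every line item in a segment covers the same periods.
        n_periods = len(next(iter(items.values())))
        for t in range(n_periods):
            row = {"Segment": segment, "Time": t + 1}
            for name, series in items.items():
                row[name] = series[t]
            rows.append(row)
    return rows


def total_by_segment(rows, column):
    """Sum one variable within each segment of the tidy rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["Segment"]] += row[column]
    return dict(totals)


tidy = to_tidy(WIDE)
cashflow_totals = total_by_segment(tidy, "Cashflow")
```

Once every row is a single observation, grouped summaries such as `total_by_segment` no longer depend on where a line item sits on the sheet, which is the fragility slide 15 attributes to the statement layout.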