SlideShare une entreprise Scribd logo
1  sur  27
Télécharger pour lire hors ligne
AGILE DATA
Christopher Bergh
Head Chef,
DataKitchen
O P E N
D A T A
S C I E N C E
C O N F E R E N C E_
BOSTON 2015
@opendatasci
AGENDA
Who Am I?
What Is The Problem?
A Look At Agile Through Data Lens
How To Do Agile Data In Five Shocking Steps
3K I T C H E N
DATA
Algorithm Nerd
Columbia, MIT, NASA-
Ames; ATC Automation
Into In 1990
Fuzzy Logic, Neural
Networks, Constraint
Satisfaction; Unix/C
Software Nerd
CTO, Dir Engineering, VP
Product Management
Into In 2000
Management of
Software Teams &
Startups; PowerPoint
Data Nerd
COO: ETL Engineers,
Analysts & Analytic Tool
Into In 2010
W. Edwards Deming,
Data, Bootstrapping;
Excel Hacking
WHO AM I
AGENDA
Who Am I?
What Is The Problem?
A Look At Agile Through Data Lens
How To Do Agile Data In Five Shocking Steps
SO WHAT IS THE PROBLEM?
In one word ….
LOTSA Technologies in Analytics
LOTSA People In Analytic Teams
DATA SCIENTIST
REPORTING ANALYST
ETL ENGINEER
DATABASE ARCHITECT
DEV OPS ENGINEERData Governance
LOTSA Data & Analysis
ONE OFF
RE
USE
LOTSA Missed Expectations
Analyze
Prepare Data
C
Analyze
Prepare Data
Business Customer Expectation Analyst Reality
Communicate The business does not
think that Analysts are
preparing data
Analysts don’t want to
prepare data
Complexity
Another Field, Software Development, Ran into
the Same Problems With Complexity ...
… They Used Something Called
‘Agile’ To Solve The Problem
AGENDA
Who Am I?
What Is The Problem?
A Look At Agile Through Data Lens
How To Do Agile Data In Five Shocking Steps
AGILEMANIFESTO.ORG
5/31/2015 12
AGILEMANIFESTO.ORG
AGILEMANIFESTO.ORG
13
analytics
s/software/analytics/
PRACTICES THAT ARE EASY TO APPLY
 Development Sprints
 User Stories
 Daily Meetings
 Defined Roles
 Retrospectives
 Pair Programming
 Burn Down Charts
SOME PRACTICES HAVE BEEN DIFFICULT TO APPLY
 Test Driven Development
 Branching And Merging
 Refactoring
 Small Releases
 Frequent Or Continuous Integration
 Experimentation For Learning
 Individual Development Environments
AGILE – WHAT IS UNIQUE TO ANALYTICS?
17
PUT THE
ANALYST AT
THE CENTER
AGILE – WHAT IS UNIQUE TO ANALYTICS?
ANALYICS
PERCIEVED
VALUE DECAY
CURVE
AGENDA
Who Am I?
What Is The Problem?
A Look At Agile Through Data Lens
How To Do Agile Data In Five Shocking Steps
Why? Your work is just code: models, transforms, etc.
Use a source code control system (like GIT) to enable:
Branching
Merging
Diff
5/31/2015 20
1. MANAGE YOUR WORK LIKE CODE
2. TEST AND CONTAIN
1. Create and monitor tests
2. Test on separate data from production
3. Run tests early and often
4. Target 20% of code for tests
5/31/2015 21
Unit Tests & Systems Test … Keep Adding & Improving
1. Break up you work into components
2. Manage the environment for each
component (e.g. Docker, AMI)
3. Practice Environment Version
Control
3. PROVIDE SEPARATE ENVIRONMENTS FOR ANALYSTS
Why?
Analysts need
their data the
data to iterate,
develop &
explore.
5/31/2015 22
4. SUPPORT THREE TYPES OF WORKFLOWS
Small Team
Work directly on production
Feature Branch
Merge back to production branch
Data Governance
3rd party verification before production
merge
5/31/2015 23
Review
Test
Approve
5. GIVE ANALYSTS ABILITY TO EDIT DATABASE SAFELY
5/31/2015 24
Best-in-class companies take 12 days
to integrate new data sources into
their analytical systems; industry
average companies take 60 days;
and, laggards average 143 days
Source: Aberdeen Group: Data Management for BI: Fueling the analytical engine with high-octane information
Figure out how to
do this in
minutes
CONCLUSION
CONCLUSION
AGILE DATA Christopher Bergh
cbergh@datakitchen.io
Questions?
Comments?
BOSTON 2015
@opendatasci

Contenu connexe

Tendances

Data ops in practice - Swedish style
Data ops in practice - Swedish styleData ops in practice - Swedish style
Data ops in practice - Swedish styleLars Albertsson
 
DataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceDataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceHakka Labs
 
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooks
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooksNotebooks @ Netflix: From analytics to engineering with Jupyter notebooks
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooksMichelle Ufford
 
Testing the Data Warehouse—Big Data, Big Problems
Testing the Data Warehouse—Big Data, Big ProblemsTesting the Data Warehouse—Big Data, Big Problems
Testing the Data Warehouse—Big Data, Big ProblemsTechWell
 
Applied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelApplied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelDataiku
 
Operationalizing analytics to scale
Operationalizing analytics to scaleOperationalizing analytics to scale
Operationalizing analytics to scaleLooker
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products Dataiku
 
Dataiku data science studio
Dataiku data science studioDataiku data science studio
Dataiku data science studioNorman Poh
 
Real time analytics @ netflix
Real time analytics @ netflixReal time analytics @ netflix
Real time analytics @ netflixCody Rioux
 
Deliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL TestingDeliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL TestingCognizant
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku
 
Agile development of data science projects | Part 1
Agile development of data science projects | Part 1 Agile development of data science projects | Part 1
Agile development of data science projects | Part 1 Anubhav Dhiman
 
H2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom KraljevicH2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom KraljevicSri Ambati
 
Smarter Analytics: Supporting the Enterprise with Automation
Smarter Analytics: Supporting the Enterprise with AutomationSmarter Analytics: Supporting the Enterprise with Automation
Smarter Analytics: Supporting the Enterprise with AutomationInside Analysis
 
Challenges of Operationalising Data Science in Production
Challenges of Operationalising Data Science in ProductionChallenges of Operationalising Data Science in Production
Challenges of Operationalising Data Science in Productioniguazio
 
Importance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLowImportance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLowDatabricks
 
Bdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenBdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenChristopher Bergh
 
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)dtz001
 
WhereScape, the pioneer in data warehouse automation software
WhereScape, the pioneer in data warehouse automation software WhereScape, the pioneer in data warehouse automation software
WhereScape, the pioneer in data warehouse automation software Patrick Van Renterghem
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsMárton Kodok
 

Tendances (20)

Data ops in practice - Swedish style
Data ops in practice - Swedish styleData ops in practice - Swedish style
Data ops in practice - Swedish style
 
DataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceDataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data Science
 
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooks
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooksNotebooks @ Netflix: From analytics to engineering with Jupyter notebooks
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooks
 
Testing the Data Warehouse—Big Data, Big Problems
Testing the Data Warehouse—Big Data, Big ProblemsTesting the Data Warehouse—Big Data, Big Problems
Testing the Data Warehouse—Big Data, Big Problems
 
Applied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelApplied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML model
 
Operationalizing analytics to scale
Operationalizing analytics to scaleOperationalizing analytics to scale
Operationalizing analytics to scale
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products
 
Dataiku data science studio
Dataiku data science studioDataiku data science studio
Dataiku data science studio
 
Real time analytics @ netflix
Real time analytics @ netflixReal time analytics @ netflix
Real time analytics @ netflix
 
Deliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL TestingDeliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL Testing
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
 
Agile development of data science projects | Part 1
Agile development of data science projects | Part 1 Agile development of data science projects | Part 1
Agile development of data science projects | Part 1
 
H2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom KraljevicH2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom Kraljevic
 
Smarter Analytics: Supporting the Enterprise with Automation
Smarter Analytics: Supporting the Enterprise with AutomationSmarter Analytics: Supporting the Enterprise with Automation
Smarter Analytics: Supporting the Enterprise with Automation
 
Challenges of Operationalising Data Science in Production
Challenges of Operationalising Data Science in ProductionChallenges of Operationalising Data Science in Production
Challenges of Operationalising Data Science in Production
 
Importance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLowImportance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLow
 
Bdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenBdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchen
 
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
 
WhereScape, the pioneer in data warehouse automation software
WhereScape, the pioneer in data warehouse automation software WhereScape, the pioneer in data warehouse automation software
WhereScape, the pioneer in data warehouse automation software
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
 

En vedette

Datalicious - Smart Data Driven Marketing
Datalicious - Smart Data Driven MarketingDatalicious - Smart Data Driven Marketing
Datalicious - Smart Data Driven MarketingDatalicious
 
Bridged Overview by CodeData
Bridged Overview by CodeDataBridged Overview by CodeData
Bridged Overview by CodeDataSam Sur
 
Audax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data ExplosionAudax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data Explosionactifio
 
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things BetterTaking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things BetterMatt Tesauro
 
Chief Data Officer: DataOps - Transformation of the Business Data Environment
Chief Data Officer: DataOps - Transformation of the Business Data EnvironmentChief Data Officer: DataOps - Transformation of the Business Data Environment
Chief Data Officer: DataOps - Transformation of the Business Data EnvironmentCraig Milroy
 
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...DataKitchen
 
The Future of Enterprise IT: DevOps and Data Lifecycle Management
The Future of Enterprise IT: DevOps and Data Lifecycle ManagementThe Future of Enterprise IT: DevOps and Data Lifecycle Management
The Future of Enterprise IT: DevOps and Data Lifecycle Managementactifio
 
5 Invaluable Insights From Top DevOps Thinkers
5 Invaluable Insights From Top DevOps Thinkers5 Invaluable Insights From Top DevOps Thinkers
5 Invaluable Insights From Top DevOps Thinkersactifio
 
The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016 The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016 Dataiku
 

En vedette (10)

FlyData Koichi Fujikawa / FlyData Founder #SVFT
FlyData  Koichi Fujikawa / FlyData Founder #SVFTFlyData  Koichi Fujikawa / FlyData Founder #SVFT
FlyData Koichi Fujikawa / FlyData Founder #SVFT
 
Datalicious - Smart Data Driven Marketing
Datalicious - Smart Data Driven MarketingDatalicious - Smart Data Driven Marketing
Datalicious - Smart Data Driven Marketing
 
Bridged Overview by CodeData
Bridged Overview by CodeDataBridged Overview by CodeData
Bridged Overview by CodeData
 
Audax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data ExplosionAudax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data Explosion
 
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things BetterTaking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
 
Chief Data Officer: DataOps - Transformation of the Business Data Environment
Chief Data Officer: DataOps - Transformation of the Business Data EnvironmentChief Data Officer: DataOps - Transformation of the Business Data Environment
Chief Data Officer: DataOps - Transformation of the Business Data Environment
 
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
 
The Future of Enterprise IT: DevOps and Data Lifecycle Management
The Future of Enterprise IT: DevOps and Data Lifecycle ManagementThe Future of Enterprise IT: DevOps and Data Lifecycle Management
The Future of Enterprise IT: DevOps and Data Lifecycle Management
 
5 Invaluable Insights From Top DevOps Thinkers
5 Invaluable Insights From Top DevOps Thinkers5 Invaluable Insights From Top DevOps Thinkers
5 Invaluable Insights From Top DevOps Thinkers
 
The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016 The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016
 

Similaire à Open Data Science Conference Agile Data

#rstats lessons for #measure
#rstats lessons for #measure#rstats lessons for #measure
#rstats lessons for #measureMark Edmondson
 
Make data simple in the cognitive era
Make data simple in the cognitive eraMake data simple in the cognitive era
Make data simple in the cognitive eraIBM Analytics
 
2014-10 DevOps NFi - Why it's a good idea to deploy 10 times per day v1.0
2014-10 DevOps NFi - Why it's a good idea to deploy 10 times per day v1.02014-10 DevOps NFi - Why it's a good idea to deploy 10 times per day v1.0
2014-10 DevOps NFi - Why it's a good idea to deploy 10 times per day v1.0Joakim Lindbom
 
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...InfluxData
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the tradeFangda Wang
 
Continuous Intelligence Workshop
Continuous Intelligence WorkshopContinuous Intelligence Workshop
Continuous Intelligence WorkshopDavid Tan
 
Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?Massimiliano Di Penta
 
Bridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the CloudBridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the CloudInside Analysis
 
Desmistificando Tecnologias
Desmistificando TecnologiasDesmistificando Tecnologias
Desmistificando TecnologiasJuliano Martins
 
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...Chief Analytics Officer Forum
 
7 Dimensions of Agile Analytics by Ken Collier
7 Dimensions of Agile Analytics by Ken Collier 7 Dimensions of Agile Analytics by Ken Collier
7 Dimensions of Agile Analytics by Ken Collier Thoughtworks
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
 
Washington DC DataOps Meetup -- Nov 2019
Washington DC DataOps Meetup   -- Nov 2019Washington DC DataOps Meetup   -- Nov 2019
Washington DC DataOps Meetup -- Nov 2019DataKitchen
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfneelakandan2001kpm
 
Intro of Key Features of SoftCAAT Pro software
Intro of Key Features of SoftCAAT Pro softwareIntro of Key Features of SoftCAAT Pro software
Intro of Key Features of SoftCAAT Pro softwarerafeq
 
DevOps for the Discouraged
DevOps for the Discouraged DevOps for the Discouraged
DevOps for the Discouraged James Wickett
 
Belladati Meetup Singapore Workshop
Belladati Meetup Singapore WorkshopBelladati Meetup Singapore Workshop
Belladati Meetup Singapore Workshopbelladati
 
Intro of Key Features of SoftCAAT BI Software
Intro of Key Features of SoftCAAT BI SoftwareIntro of Key Features of SoftCAAT BI Software
Intro of Key Features of SoftCAAT BI Softwarerafeq
 
vodQA Pune (2019) - Testing AI,ML applications
vodQA Pune (2019) - Testing AI,ML applicationsvodQA Pune (2019) - Testing AI,ML applications
vodQA Pune (2019) - Testing AI,ML applicationsvodQA
 

Similaire à Open Data Science Conference Agile Data (20)

#rstats lessons for #measure
#rstats lessons for #measure#rstats lessons for #measure
#rstats lessons for #measure
 
Make data simple in the cognitive era
Make data simple in the cognitive eraMake data simple in the cognitive era
Make data simple in the cognitive era
 
2014-10 DevOps NFi - Why it's a good idea to deploy 10 times per day v1.0
2014-10 DevOps NFi - Why it's a good idea to deploy 10 times per day v1.02014-10 DevOps NFi - Why it's a good idea to deploy 10 times per day v1.0
2014-10 DevOps NFi - Why it's a good idea to deploy 10 times per day v1.0
 
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the trade
 
Continuous Intelligence Workshop
Continuous Intelligence WorkshopContinuous Intelligence Workshop
Continuous Intelligence Workshop
 
Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?
 
Bridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the CloudBridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the Cloud
 
Desmistificando Tecnologias
Desmistificando TecnologiasDesmistificando Tecnologias
Desmistificando Tecnologias
 
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
 
7 Dimensions of Agile Analytics by Ken Collier
7 Dimensions of Agile Analytics by Ken Collier 7 Dimensions of Agile Analytics by Ken Collier
7 Dimensions of Agile Analytics by Ken Collier
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in Production
 
Washington DC DataOps Meetup -- Nov 2019
Washington DC DataOps Meetup   -- Nov 2019Washington DC DataOps Meetup   -- Nov 2019
Washington DC DataOps Meetup -- Nov 2019
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
 
Intro of Key Features of SoftCAAT Pro software
Intro of Key Features of SoftCAAT Pro softwareIntro of Key Features of SoftCAAT Pro software
Intro of Key Features of SoftCAAT Pro software
 
DevOps for the Discouraged
DevOps for the Discouraged DevOps for the Discouraged
DevOps for the Discouraged
 
Belladati Meetup Singapore Workshop
Belladati Meetup Singapore WorkshopBelladati Meetup Singapore Workshop
Belladati Meetup Singapore Workshop
 
AE Foyer: Information Management in the Digital Enterprise
AE Foyer: Information Management in the Digital EnterpriseAE Foyer: Information Management in the Digital Enterprise
AE Foyer: Information Management in the Digital Enterprise
 
Intro of Key Features of SoftCAAT BI Software
Intro of Key Features of SoftCAAT BI SoftwareIntro of Key Features of SoftCAAT BI Software
Intro of Key Features of SoftCAAT BI Software
 
vodQA Pune (2019) - Testing AI,ML applications
vodQA Pune (2019) - Testing AI,ML applicationsvodQA Pune (2019) - Testing AI,ML applications
vodQA Pune (2019) - Testing AI,ML applications
 

Plus de DataKitchen

seven steps to dataops @ dataops.rocks conference Oct 2019
seven steps to dataops @ dataops.rocks conference Oct 2019seven steps to dataops @ dataops.rocks conference Oct 2019
seven steps to dataops @ dataops.rocks conference Oct 2019DataKitchen
 
ODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps ManifestoODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps ManifestoDataKitchen
 
Fri benghiat gil-odsc-data-kitchen-data science to dataops
Fri benghiat gil-odsc-data-kitchen-data science to dataopsFri benghiat gil-odsc-data-kitchen-data science to dataops
Fri benghiat gil-odsc-data-kitchen-data science to dataopsDataKitchen
 
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & RedshiftIntroduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & RedshiftDataKitchen
 
Redshift Introduction
Redshift IntroductionRedshift Introduction
Redshift IntroductionDataKitchen
 

Plus de DataKitchen (6)

seven steps to dataops @ dataops.rocks conference Oct 2019
seven steps to dataops @ dataops.rocks conference Oct 2019seven steps to dataops @ dataops.rocks conference Oct 2019
seven steps to dataops @ dataops.rocks conference Oct 2019
 
ODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps ManifestoODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps Manifesto
 
Fri benghiat gil-odsc-data-kitchen-data science to dataops
Fri benghiat gil-odsc-data-kitchen-data science to dataopsFri benghiat gil-odsc-data-kitchen-data science to dataops
Fri benghiat gil-odsc-data-kitchen-data science to dataops
 
Amazon EMR
Amazon EMRAmazon EMR
Amazon EMR
 
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & RedshiftIntroduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & Redshift
 
Redshift Introduction
Redshift IntroductionRedshift Introduction
Redshift Introduction
 

Dernier

Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 

Dernier (20)

Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 

Open Data Science Conference Agile Data

  • 1. AGILE DATA Christopher Bergh Head Chef, DataKitchen O P E N D A T A S C I E N C E C O N F E R E N C E_ BOSTON 2015 @opendatasci
  • 2. AGENDA Who Am I? What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Five Shocking Steps
  • 3. 3K I T C H E N DATA Algorithm Nerd Columbia, MIT, NASA- Ames; ATC Automation Into In 1990 Fuzzy Logic, Neural Networks, Constraint Satisfaction; Unix/C Software Nerd CTO, Dir Engineering, VP Product Management Into In 2000 Management of Software Teams & Startups; PowerPoint Data Nerd COO: ETL Engineers, Analysts & Analytic Tool Into In 2010 W. Edwards Deming, Data, Bootstrapping; Excel Hacking WHO AM I
  • 4. AGENDA Who Am I? What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Five Shocking Steps
  • 5. SO WHAT IS THE PROBLEM? In one word ….
  • 7. LOTSA People In Analytic Teams DATA SCIENTIST REPORTING ANALYST ETL ENGINEER DATABASE ARCHITECT DEV OPS ENGINEERData Governance
  • 8. LOTSA Data & Analysis ONE OFF RE USE
  • 9. LOTSA Missed Expectations Analyze Prepare Data C Analyze Prepare Data Business Customer Expectation Analyst Reality Communicate The business does not think that Analysts are preparing data Analysts don’t want to prepare data
  • 10. Complexity Another Field, Software Development, Ran into the Same Problems With Complexity ... … They Used Something Called ‘Agile’ To Solve The Problem
  • 11. AGENDA Who Am I? What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Five Shocking Steps
  • 15. PRACTICES THAT ARE EASY TO APPLY  Development Sprints  User Stories  Daily Meetings  Defined Roles  Retrospectives  Pair Programming  Burn Down Charts
  • 16. SOME PRACTICES HAVE BEEN DIFFICULT TO APPLY  Test Driven Development  Branching And Merging  Refactoring  Small Releases  Frequent Or Continuous Integration  Experimentation For Learning  Individual Development Environments
  • 17. AGILE – WHAT IS UNIQUE TO ANALYTICS? 17 PUT THE ANALYST AT THE CENTER
  • 18. AGILE – WHAT IS UNIQUE TO ANALYTICS? ANALYICS PERCIEVED VALUE DECAY CURVE
  • 19. AGENDA Who Am I? What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Five Shocking Steps
  • 20. Why? Your work is just code: models, transforms, etc. Use a source code control system (like GIT) to enable: Branching Merging Diff 5/31/2015 20 1. MANAGE YOUR WORK LIKE CODE
  • 21. 2. TEST AND CONTAIN 1. Create and monitor tests 2. Test on separate data from production 3. Run tests early and often 4. Target 20% of code for tests 5/31/2015 21 Unit Tests & Systems Test … Keep Adding & Improving 1. Break up you work into components 2. Manage the environment for each component (e.g. Docker, AMI) 3. Practice Environment Version Control
  • 22. 3. PROVIDE SEPARATE ENVIRONMENTS FOR ANALYSTS Why? Analysts need their data the data to iterate, develop & explore. 5/31/2015 22
  • 23. 4. SUPPORT THREE TYPES OF WORKFLOWS Small Team Work directly on production Feature Branch Merge back to production branch Data Governance 3rd party verification before production merge 5/31/2015 23 Review Test Approve
  • 24. 5. GIVE ANALYSTS ABILITY TO EDIT DATABASE SAFELY 5/31/2015 24 Best-in-class companies take 12 days to integrate new data sources into their analytical systems; industry average companies take 60 days; and, laggards average 143 days Source: Aberdeen Group: Data Management for BI: Fueling the analytical engine with high-octane information Figure out how to do this in minutes
  • 27. AGILE DATA Christopher Bergh cbergh@datakitchen.io Questions? Comments? BOSTON 2015 @opendatasci