SlideShare une entreprise Scribd logo
1  sur  25
Télécharger pour lire hors ligne
2015
Agile Data In Seven Shocking Steps!
Christopher Bergh
@OPENDATASCI
AGENDA
What Is The Problem?
A Look At Agile Through Data Lens
How To Do Agile Data In Seven Shocking Steps
3
K I T C H E N
DATA
Algorithm Nerd
Columbia, MIT, NASA-
Ames; ATC Automation
Into In 1990
Fuzzy Logic, Neural
Networks, Constraint
Satisfaction; Unix/C
Software Nerd
CTO, Dir Engineering, VP
Product Management
Into In 2000
Management of Software
Teams & Startups;
PowerPoint
Data Nerd
COO: ETL Engineers,
Analysts & Analytic Tool
Into In 2010
W. Edwards Deming,
Data, Bootstrapping;
Excel Hacking
WHO AM I
SO WHAT IS THE PROBLEM?
In one word ….
LOTSA People and Roles In
Analytic Teams
DATA SCIENTIST
REPORTING ANALYST
ETL ENGINEER
DATABASE ARCHITECT
DEV OPS ENGINEER
Data Governance
Manager
Graphic Source: EMC Education -- Building Data Science
LOTSA Technologies in Analytics
LOTSA Discovery & Experiments
Rapid expansion of data discovery as an alternative
to traditional, centrally managed BI delivery models
has shifted significant power ”
Gartner
“
Complexity
Another Field, Software Development, Ran into the Same
Problems With Complexity ...
… They Used Something Called ‘Agile’ To
Solve The Problem
AGENDA
What Is The Problem?
A Look At Agile Through Data Lens
How To Do Agile Data In Seven Shocking Steps
Agile … A Solution To That Complexity
9/21/2015 10
analytics
agilemanifesto.org
Some Agile Practices Are Easier To Apply
9/21/2015 11
 Development Sprints
 User Stories
 Daily Meetings
 Defined Roles
 Retrospectives
 Pair Programming
 Burn Down Charts
Some practices have been difficult to apply
 Individual Development Environments
 Test Driven Development
 Branching And Merging
 Refactoring
 Small Releases
 Frequent Or Continuous Integration
 Experimentation For Learning
Agile – What Is Unique To Analytics?
9/21/2015 13
PUT THE
ANALYST
(New Role)
AT THE CENTER
Agile – What Is Unique To Analytics?
9/21/2015 14
ANALYICS
PERCIEVED
VALUE DECAY
CURVE
AGENDA
What Is The Problem?
A Look At Agile Through Data Lens
How To Do Agile Data In Seven Shocking Steps
Seven Key Steps To Agile In Analytics
1. Add Tests
2. Modularize & Containerize
3. Do Branching & Merging
4. Use Multiple Environments
5. Give Your Analyst ‘Knobs’
6. Use Simple Storage
7. Support Three Workflows
9/21/2015 16
analytics
❶ Add tests
Types
1. Error – stop the line
2. Warning – investigate later
3. Info – list of changes
Examples
1. Input file row count way
below a critical threshold
2. Input file row count a little
below a threshold
3. These customers changed
territories
9/21/2015 17
And keep adding them with each feature developed!
❷ Modularize & Containerize
Modularize
1. Do not create on ‘monolith’ of code
2. Break up you work into component modules
Containerize
1. Manage the environment for each component
(e.g. Docker, AMI)
2. Practice Environment Version Control
9/21/2015 18
❸ Branch & Merge
Manage your work like code
• Why? Your work is just code: models, transforms, etc.
• Use a source code control system (like GIT) to enable Branching &
Merging
9/21/2015 19
❹ Use Multiple Environments
Provide a data environment for each branch
• Analysts need a controlled environment for their experiments
• Engineers need a place to develop outside of production
9/21/2015 20
❺ Give Your Analysts Knobs
• Give you analysts and data scientists the ability to edit the Production
Database safely
• Why?
• “Best-in-class companies take 12 days to integrate new data
sources into their analytical systems; industry average companies
take 60 days; and, laggards average 143 days”
Figure out how to do this in minutes
9/21/2015 21
Source: Aberdeen Group: Data Management for BI: Fueling the analytical engine with high-octane information
❻ Use Simple Storage
• Keep copies of all your raw data in simple, cheap
storage (s3, HFDS)
 Data Lake
• Create parameterized variations of your process that
allow you to assemble data for experimentation,
development, and production
 Data Marts
• Be able to back up and restore your databases easily
 “My own database”
9/21/2015 22
❼ Support Three Workflows
9/21/2015 23
Small Team
 Promote directly to production
Feature Branch
 Merge back to production branch
Data Governance
 3rd party verification before
production mergeReview
Test
Approve
CONCLUSION
CONCLUSION

Contenu connexe

Tendances

Analytics-Enabled Experiences: The New Secret Weapon
Analytics-Enabled Experiences: The New Secret WeaponAnalytics-Enabled Experiences: The New Secret Weapon
Analytics-Enabled Experiences: The New Secret Weapon
Databricks
 

Tendances (20)

Understanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application QualityUnderstanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application Quality
 
seven steps to dataops @ dataops.rocks conference Oct 2019
seven steps to dataops @ dataops.rocks conference Oct 2019seven steps to dataops @ dataops.rocks conference Oct 2019
seven steps to dataops @ dataops.rocks conference Oct 2019
 
Hadoop Hadoop & Spark meetup - Altiscale
Hadoop Hadoop & Spark meetup - AltiscaleHadoop Hadoop & Spark meetup - Altiscale
Hadoop Hadoop & Spark meetup - Altiscale
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
How to Optimize Sales Analytics Using 10x the Data at 1/10th the Cost
How to Optimize Sales Analytics Using 10x the Data at 1/10th the CostHow to Optimize Sales Analytics Using 10x the Data at 1/10th the Cost
How to Optimize Sales Analytics Using 10x the Data at 1/10th the Cost
 
Analytics-Enabled Experiences: The New Secret Weapon
Analytics-Enabled Experiences: The New Secret WeaponAnalytics-Enabled Experiences: The New Secret Weapon
Analytics-Enabled Experiences: The New Secret Weapon
 
Dsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovicDsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovic
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine Learning
 
A brief history of data warehousing
A brief history of data warehousingA brief history of data warehousing
A brief history of data warehousing
 
Operationalizing analytics to scale
Operationalizing analytics to scaleOperationalizing analytics to scale
Operationalizing analytics to scale
 
How to Realize an Additional 270% ROI on Snowflake
How to Realize an Additional 270% ROI on SnowflakeHow to Realize an Additional 270% ROI on Snowflake
How to Realize an Additional 270% ROI on Snowflake
 
Scale and Optimize Data Engineering Pipelines with Software Engineering Best ...
Scale and Optimize Data Engineering Pipelines with Software Engineering Best ...Scale and Optimize Data Engineering Pipelines with Software Engineering Best ...
Scale and Optimize Data Engineering Pipelines with Software Engineering Best ...
 
Lambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
Lambda Architecture in the Cloud with Azure Databricks with Andrei VaranovichLambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
Lambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
 
Measuring Data Quality with DataOps
Measuring Data Quality with DataOpsMeasuring Data Quality with DataOps
Measuring Data Quality with DataOps
 
Building the Artificially Intelligent Enterprise
Building the Artificially Intelligent EnterpriseBuilding the Artificially Intelligent Enterprise
Building the Artificially Intelligent Enterprise
 
Accelerating Innovation with Unified Analytics with Ali Ghodsi
Accelerating Innovation with Unified Analytics with Ali GhodsiAccelerating Innovation with Unified Analytics with Ali Ghodsi
Accelerating Innovation with Unified Analytics with Ali Ghodsi
 
Witsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streamingWitsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streaming
 
Transforming Devon’s Data Pipeline with an Open Source Data Hub—Built on Data...
Transforming Devon’s Data Pipeline with an Open Source Data Hub—Built on Data...Transforming Devon’s Data Pipeline with an Open Source Data Hub—Built on Data...
Transforming Devon’s Data Pipeline with an Open Source Data Hub—Built on Data...
 
Big Data – A New Testing Challenge
Big Data – A New Testing ChallengeBig Data – A New Testing Challenge
Big Data – A New Testing Challenge
 
Unleash the Power of Big Data and Machine Learning
Unleash the Power of Big Data and Machine LearningUnleash the Power of Big Data and Machine Learning
Unleash the Power of Big Data and Machine Learning
 

En vedette

En vedette (8)

Open Data Science Conference Agile Data
Open Data Science Conference Agile DataOpen Data Science Conference Agile Data
Open Data Science Conference Agile Data
 
Audax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data ExplosionAudax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data Explosion
 
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things BetterTaking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
 
Chief Data Officer: DataOps - Transformation of the Business Data Environment
Chief Data Officer: DataOps - Transformation of the Business Data EnvironmentChief Data Officer: DataOps - Transformation of the Business Data Environment
Chief Data Officer: DataOps - Transformation of the Business Data Environment
 
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
 
The Future of Enterprise IT: DevOps and Data Lifecycle Management
The Future of Enterprise IT: DevOps and Data Lifecycle ManagementThe Future of Enterprise IT: DevOps and Data Lifecycle Management
The Future of Enterprise IT: DevOps and Data Lifecycle Management
 
5 Invaluable Insights From Top DevOps Thinkers
5 Invaluable Insights From Top DevOps Thinkers5 Invaluable Insights From Top DevOps Thinkers
5 Invaluable Insights From Top DevOps Thinkers
 
The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016 The Rise of the DataOps - Dataiku - J On the Beach 2016
The Rise of the DataOps - Dataiku - J On the Beach 2016
 

Similaire à Data kitchen 7 agile steps - big data fest 9-18-2015

Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
Provectus
 

Similaire à Data kitchen 7 agile steps - big data fest 9-18-2015 (20)

Agile Data
Agile DataAgile Data
Agile Data
 
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
 
Doing Analytics Right - Building the Analytics Environment
Doing Analytics Right - Building the Analytics EnvironmentDoing Analytics Right - Building the Analytics Environment
Doing Analytics Right - Building the Analytics Environment
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in Production
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
 
Alten calsoft labs analytics service offerings
Alten calsoft labs   analytics service offeringsAlten calsoft labs   analytics service offerings
Alten calsoft labs analytics service offerings
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
 
DevOps Spain 2019. Olivier Perard-Oracle
DevOps Spain 2019. Olivier Perard-OracleDevOps Spain 2019. Olivier Perard-Oracle
DevOps Spain 2019. Olivier Perard-Oracle
 
Democratizing Apache Spark for the Enterprise with Jonathan Gole
Democratizing Apache Spark for the Enterprise with Jonathan GoleDemocratizing Apache Spark for the Enterprise with Jonathan Gole
Democratizing Apache Spark for the Enterprise with Jonathan Gole
 
How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)
 
Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for Scale
 
Making Data Science Scalable - 5 Lessons Learned
Making Data Science Scalable - 5 Lessons LearnedMaking Data Science Scalable - 5 Lessons Learned
Making Data Science Scalable - 5 Lessons Learned
 
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
How to make your data scientists happy
How to make your data scientists happy How to make your data scientists happy
How to make your data scientists happy
 
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIAugmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
 
Bdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenBdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchen
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
 
Accelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time AnalyticsAccelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time Analytics
 

Dernier

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 

Dernier (20)

Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 

Data kitchen 7 agile steps - big data fest 9-18-2015

  • 1. 2015 Agile Data In Seven Shocking Steps! Christopher Bergh @OPENDATASCI
  • 2. AGENDA What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Seven Shocking Steps
  • 3. 3 K I T C H E N DATA Algorithm Nerd Columbia, MIT, NASA- Ames; ATC Automation Into In 1990 Fuzzy Logic, Neural Networks, Constraint Satisfaction; Unix/C Software Nerd CTO, Dir Engineering, VP Product Management Into In 2000 Management of Software Teams & Startups; PowerPoint Data Nerd COO: ETL Engineers, Analysts & Analytic Tool Into In 2010 W. Edwards Deming, Data, Bootstrapping; Excel Hacking WHO AM I
  • 4. SO WHAT IS THE PROBLEM? In one word ….
  • 5. LOTSA People and Roles In Analytic Teams DATA SCIENTIST REPORTING ANALYST ETL ENGINEER DATABASE ARCHITECT DEV OPS ENGINEER Data Governance Manager Graphic Source: EMC Education -- Building Data Science
  • 7. LOTSA Discovery & Experiments Rapid expansion of data discovery as an alternative to traditional, centrally managed BI delivery models has shifted significant power ” Gartner “
  • 8. Complexity Another Field, Software Development, Ran into the Same Problems With Complexity ... … They Used Something Called ‘Agile’ To Solve The Problem
  • 9. AGENDA What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Seven Shocking Steps
  • 10. Agile … A Solution To That Complexity 9/21/2015 10 analytics agilemanifesto.org
  • 11. Some Agile Practices Are Easier To Apply 9/21/2015 11  Development Sprints  User Stories  Daily Meetings  Defined Roles  Retrospectives  Pair Programming  Burn Down Charts
  • 12. Some practices have been difficult to apply  Individual Development Environments  Test Driven Development  Branching And Merging  Refactoring  Small Releases  Frequent Or Continuous Integration  Experimentation For Learning
  • 13. Agile – What Is Unique To Analytics? 9/21/2015 13 PUT THE ANALYST (New Role) AT THE CENTER
  • 14. Agile – What Is Unique To Analytics? 9/21/2015 14 ANALYICS PERCIEVED VALUE DECAY CURVE
  • 15. AGENDA What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Seven Shocking Steps
  • 16. Seven Key Steps To Agile In Analytics 1. Add Tests 2. Modularize & Containerize 3. Do Branching & Merging 4. Use Multiple Environments 5. Give Your Analyst ‘Knobs’ 6. Use Simple Storage 7. Support Three Workflows 9/21/2015 16 analytics
  • 17. ❶ Add tests Types 1. Error – stop the line 2. Warning – investigate later 3. Info – list of changes Examples 1. Input file row count way below a critical threshold 2. Input file row count a little below a threshold 3. These customers changed territories 9/21/2015 17 And keep adding them with each feature developed!
  • 18. ❷ Modularize & Containerize Modularize 1. Do not create on ‘monolith’ of code 2. Break up you work into component modules Containerize 1. Manage the environment for each component (e.g. Docker, AMI) 2. Practice Environment Version Control 9/21/2015 18
  • 19. ❸ Branch & Merge Manage your work like code • Why? Your work is just code: models, transforms, etc. • Use a source code control system (like GIT) to enable Branching & Merging 9/21/2015 19
  • 20. ❹ Use Multiple Environments Provide a data environment for each branch • Analysts need a controlled environment for their experiments • Engineers need a place to develop outside of production 9/21/2015 20
  • 21. ❺ Give Your Analysts Knobs • Give you analysts and data scientists the ability to edit the Production Database safely • Why? • “Best-in-class companies take 12 days to integrate new data sources into their analytical systems; industry average companies take 60 days; and, laggards average 143 days” Figure out how to do this in minutes 9/21/2015 21 Source: Aberdeen Group: Data Management for BI: Fueling the analytical engine with high-octane information
  • 22. ❻ Use Simple Storage • Keep copies of all your raw data in simple, cheap storage (s3, HFDS)  Data Lake • Create parameterized variations of your process that allow you to assemble data for experimentation, development, and production  Data Marts • Be able to back up and restore your databases easily  “My own database” 9/21/2015 22
  • 23. ❼ Support Three Workflows 9/21/2015 23 Small Team  Promote directly to production Feature Branch  Merge back to production branch Data Governance  3rd party verification before production mergeReview Test Approve