DataKitchen
Do agile data in just
5 shocking steps!
Copyright © 2015 by DataKitchen, Inc. All Rights Reserved.
by
Gil Benghiat
gil@datakitchen.io
@benghiat
@datakitchen_io
Tuesday, May 19
CIC (Cambridge Innovation Center)
1 Broadway, Cambridge, MA
Agenda
• Gil & DataKitchen
• A look at Agile through Data lenses
• How to do Agile Data
Gil Benghiat – decades working with data
• Network Management Data
• Database Management
• Clinical Trial Data
• Pharmaceutical Sales Data
• Data Liberation
• Data Preparation
6/2/2015 3
Solid Oak Consulting
Data Analysts And Their Teams Are Spending
60-80% Of Their Time
On Data Preparation And Production
This creates an expectation gap
[Figure: time split among Analyze, Prepare Data, and Communicate, comparing the Business Customer's expectation with the Analyst's reality]
The business does not
think that Analysts are
preparing data
Analysts don’t want to
prepare data
DataKitchen is on a mission
to integrate and organize
data to make analysts
super-powered.
• Offering
  • Set-up service
  • Software subscription
  • UI to integrate data
• Benefits
  • Data warehouse
  • Eliminate drudgery of repeated integrations
agilemanifesto.org
Switch the word “software” to “analytics”
and Excel files
s/software/analytics/
The switch
works for the 12
principles too.
Iterate to
improve the
analytics.
Iterate to
improve the
process.
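The s/software/analytics/ switch can be tried out directly. The sketch below is only an illustration: it applies the substitution to one value statement quoted from agilemanifesto.org, and the function name `switch_word` is made up for this example. Unlike a plain sed `s///`, it replaces every occurrence, not just the first.

```python
import re

# One value statement from agilemanifesto.org, quoted as plain text.
MANIFESTO_VALUE = "Working software over comprehensive documentation"


def switch_word(text: str) -> str:
    """Replace every whole-word occurrence of "software" with "analytics"."""
    return re.sub(r"\bsoftware\b", "analytics", text)


print(switch_word(MANIFESTO_VALUE))
```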
Agile methodologies contain a number of practices
that can apply to data
• Sprints
• Stories
• Prioritization
• Daily Meeting
• Defined roles
• Retrospectives
• Pair Programming
• Burn down charts
• etc.
The Data Analyst has the central role as the
bridge between business and data
What do analysts and data scientists want?
Flexibility
&
Speed
You need to
be fast and
produce
trustworthy
data
Some practices have been difficult to apply to data
• Test Driven Development
• Branching and merging
• Refactoring
• Small Releases
• Frequent or Continuous Integration
• Experimentation for learning
Do agile data in just
5 shocking steps!
❶ Add tests
Types
1. Error – stop the line
2. Warning – investigate later
3. Info – list of changes
Examples
1. Input file row count way below a critical threshold
2. Input file row count a little below a threshold
3. These customers changed territories
And keep adding them with each feature developed!
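The three test types and their examples could be sketched roughly as follows. This is a minimal illustration, not DataKitchen's implementation: the threshold values, the names `CheckResult`, `check_row_count`, `check_territory_changes`, and `enforce`, and the sample customers are all assumptions made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    ERROR = "error"      # stop the line
    WARNING = "warning"  # investigate later
    INFO = "info"        # list of changes


@dataclass
class CheckResult:
    severity: Severity
    message: str


# Hypothetical thresholds; real values depend on the input file.
CRITICAL_MIN_ROWS = 1_000
EXPECTED_MIN_ROWS = 9_000


def check_row_count(row_count: int) -> list[CheckResult]:
    """Error when far below the critical threshold, warning when a little low."""
    if row_count < CRITICAL_MIN_ROWS:
        return [CheckResult(Severity.ERROR,
                            f"{row_count} rows, below critical minimum {CRITICAL_MIN_ROWS}")]
    if row_count < EXPECTED_MIN_ROWS:
        return [CheckResult(Severity.WARNING,
                            f"{row_count} rows, below expected minimum {EXPECTED_MIN_ROWS}")]
    return []


def check_territory_changes(old: dict[str, str], new: dict[str, str]) -> list[CheckResult]:
    """Info-level report of customers whose territory changed between loads."""
    return [
        CheckResult(Severity.INFO,
                    f"{customer} moved from {old[customer]} to {new[customer]}")
        for customer in sorted(new)
        if customer in old and old[customer] != new[customer]
    ]


def enforce(results: list[CheckResult]) -> None:
    """Print every result, then stop the line if any error was found."""
    for result in results:
        print(f"[{result.severity.value}] {result.message}")
    if any(result.severity is Severity.ERROR for result in results):
        raise RuntimeError("stop the line: an error-level test failed")


# Sample run with made-up data: one warning and one info result, no error.
results = check_row_count(8_500) + check_territory_changes(
    {"acme": "east", "globex": "west"},
    {"acme": "north", "globex": "west"},
)
enforce(results)
```

Only error-level results halt processing here; warnings and info results are printed so they can be reviewed later.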
❷ Manage your transforms like code
Use a source code control system (like Git) to enable:
• Branching
• Merging
• Diff
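Once transforms are stored as text, a line-level diff becomes possible. The sketch below stands in for what a tool such as `git diff` would show; it uses Python's standard `difflib` rather than Git itself, and the SQL text and file names are invented for illustration.

```python
import difflib

# Two hypothetical versions of the same SQL transform, as they might
# appear on a production branch and on a feature branch.
PRODUCTION = """\
SELECT customer_id, territory
FROM sales
"""
FEATURE = """\
SELECT customer_id, territory, region
FROM sales
WHERE territory IS NOT NULL
"""


def diff_transforms(old: str, new: str) -> list[str]:
    """Return a unified diff between two versions of a transform."""
    return list(difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="production/sales.sql",
        tofile="feature/sales.sql",
        lineterm="",
    ))


for line in diff_transforms(PRODUCTION, FEATURE):
    print(line)
```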
❸ Provide a data environment for each branch
The underlying data is needed to
develop and test the code/transformations
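One way to picture a per-branch data environment is a private copy of the data for each branch. The sketch below assumes SQLite in-memory databases purely for illustration; the names `environment_name` and `create_branch_environment`, the `sales` table, and the branch name are hypothetical, and a real warehouse would use its own cloning or schema mechanism.

```python
import re
import sqlite3


def environment_name(branch: str) -> str:
    """Derive a database-safe environment name from a branch name."""
    return "env_" + re.sub(r"[^a-z0-9]+", "_", branch.lower()).strip("_")


def create_branch_environment(production: sqlite3.Connection) -> sqlite3.Connection:
    """Copy the production data into a fresh in-memory database for one branch."""
    branch_db = sqlite3.connect(":memory:")
    production.backup(branch_db)  # full copy, so the branch cannot touch production
    return branch_db


# Stand-in for the production data (made-up table and row).
production = sqlite3.connect(":memory:")
production.execute("CREATE TABLE sales (customer_id TEXT, amount REAL)")
production.execute("INSERT INTO sales VALUES ('acme', 100.0)")
production.commit()

branch = "feature/add-region"
environments = {environment_name(branch): create_branch_environment(production)}

# Develop and test a schema change on the branch copy only.
env = environments[environment_name(branch)]
env.execute("ALTER TABLE sales ADD COLUMN region TEXT")
env.commit()
```

The branch copy carries the underlying rows, so the changed transform can be tested against real-shaped data while production keeps its original schema.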
❹ Support three types of workflows
• Small Team: promote directly to production
• Feature Branch: merge back to production branch
• Data Governance: 3rd party verification before production merge
Review, Test, Approve
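The difference between the three workflows can be expressed as a small promotion gate. This is a sketch under assumptions: the slide does not say that passing tests is required in every workflow, so that rule, the `Workflow` enum, and the function `can_reach_production` are illustrative choices rather than the author's design.

```python
from enum import Enum


class Workflow(Enum):
    SMALL_TEAM = "small team"            # promote directly to production
    FEATURE_BRANCH = "feature branch"    # merge back to production branch
    DATA_GOVERNANCE = "data governance"  # 3rd party verification before merge


def can_reach_production(workflow: Workflow,
                         tests_passed: bool,
                         third_party_approved: bool = False) -> bool:
    """Decide whether a change may be promoted or merged to production."""
    if not tests_passed:
        # Assumption: failing tests block every workflow.
        return False
    if workflow is Workflow.DATA_GOVERNANCE:
        # Only this workflow waits for an independent approval.
        return third_party_approved
    return True


print(can_reach_production(Workflow.SMALL_TEAM, tests_passed=True))
print(can_reach_production(Workflow.DATA_GOVERNANCE, tests_passed=True))
```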
❺ Give your analysts and data scientists the ability
to edit the data warehouse (DW) safely
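The slide does not spell out what "safely" involves. One possible guard, sketched here with SQLite as an assumed stand-in for the warehouse, is to run an edit inside a transaction and keep it only if a check query finds no violations. The function `apply_edit_safely`, the `customers` table, and the check query are all hypothetical.

```python
import sqlite3


def apply_edit_safely(conn: sqlite3.Connection, edit_sql: str, check_sql: str) -> bool:
    """Run a data-modifying statement; keep it only if check_sql returns no rows."""
    try:
        conn.execute(edit_sql)  # sqlite3 opens a transaction for INSERT/UPDATE/DELETE
        violations = conn.execute(check_sql).fetchall()
        if violations:
            conn.rollback()     # undo the edit, the data stays as it was
            return False
        conn.commit()
        return True
    except sqlite3.Error:
        conn.rollback()
        return False


# Made-up table standing in for a warehouse dimension.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id TEXT, territory TEXT)")
conn.execute("INSERT INTO customers VALUES ('acme', 'east')")
conn.commit()

CHECK = "SELECT customer_id FROM customers WHERE territory IS NULL"

bad = apply_edit_safely(conn, "UPDATE customers SET territory = NULL", CHECK)
good = apply_edit_safely(
    conn, "UPDATE customers SET territory = 'north' WHERE customer_id = 'acme'", CHECK
)
print(bad, good)
```

The first edit would blank out every territory, so the check rejects it and the rollback restores the original row; the second edit passes the check and is committed.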
Best-in-class companies take 12 days to
integrate new data sources into their
analytical systems; industry average
companies take 60 days; and, laggards
average 143 days
Source: Aberdeen Group: Data Management for BI: Fueling the
analytical engine with high-octane information
Figure out how to
do this in
minutes
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 

Do Agile Data in Just 5 Shocking Steps!

  • 1. DataKitchen: Do agile data in just 5 shocking steps! By Gil Benghiat (gil@datakitchen.io, @benghiat, @datakitchen_io). Tuesday, May 19, CIC (Cambridge Innovation Center), 1 Broadway, Cambridge, MA. Copyright © 2015 by DataKitchen, Inc. All Rights Reserved.
  • 2. Agenda: Gil & DataKitchen; a look at Agile through data lenses; how to do Agile data.
  • 3. Gil Benghiat: decades working with data. Network management data; database management; clinical trial data; pharmaceutical sales data; data liberation; data preparation. Solid Oak Consulting. gil@datakitchen.io, @benghiat. 6/2/2015
  • 4. Data analysts and their teams are spending 60-80% of their time on data preparation and production.
  • 5. This creates an expectation gap. [Diagram: Business Customer Expectation versus Analyst Reality, with segments labeled Analyze, Prepare Data, and Communicate.] The business does not think that analysts are preparing data. Analysts don't want to prepare data.
  • 6. DataKitchen is on a mission to integrate and organize data to make analysts super-powered. Offering: set-up service; software subscription; UI to integrate data. Benefits: data warehouse; eliminate drudgery of repeated integrations.
  • 8. agilemanifesto.org, with s/software/analytics/ (and Excel files). The switch works for the 12 principles too. Iterate to improve the analytics. Iterate to improve the process.
  • 9. Agile methodologies contain a number of practices that can apply to data: sprints, stories, prioritization, daily meeting, defined roles, retrospectives, pair programming, burn down charts, etc. The data analyst has the central role as the bridge between business and data.
  • 10. What do analysts and data scientists want? Flexibility and speed. You need to be fast and produce trustworthy data.
  • 11. Some practices have been difficult to apply to data: test-driven development, branching and merging, refactoring, small releases, frequent or continuous integration, experimentation for learning.
  • 12. Do agile data in just 5 shocking steps!
  • 13. ❶ Add tests. Types: (1) Error: stop the line; (2) Warning: investigate later; (3) Info: list of changes. Examples: (1) input file row count way below a critical threshold; (2) input file row count a little below a threshold; (3) these customers changed territories. And keep adding them with each feature developed!
  • 14. ❷ Manage your transforms like code. Use a source code control system (like Git) to enable branching, merging, and diff.
  • 15. ❸ Provide a data environment for each branch. The underlying data is needed to develop and test the code/transformations.
  • 16. ❹ Support three types of workflows. Small team: promote directly to production. Feature branch: merge back to production branch. Data governance: 3rd party verification before production merge. Steps shown: review, test, approve.
  • 17. ❺ Give your analysts and data scientists the ability to edit the DW safely. Best-in-class companies take 12 days to integrate new data sources into their analytical systems; industry-average companies take 60 days; and laggards average 143 days (source: Aberdeen Group, "Data Management for BI: Fueling the analytical engine with high-octane information"). Figure out how to do this in minutes.
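
The three test types from step ❶ (Error, Warning, Info) could be sketched in Python roughly as follows. This is only an illustration of the idea, not DataKitchen's implementation: the names `Severity`, `CheckResult`, `check_row_count`, `check_territory_changes`, and `should_stop_the_line`, as well as both row-count thresholds, are assumptions made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    ERROR = "error"      # stop the line
    WARNING = "warning"  # investigate later
    INFO = "info"        # list of changes


@dataclass
class CheckResult:
    severity: Severity
    message: str


# Hypothetical thresholds; in practice they would be set per input file.
CRITICAL_MIN_ROWS = 1_000
EXPECTED_MIN_ROWS = 9_000


def check_row_count(row_count: int) -> list[CheckResult]:
    """Error if the count is far too low, warning if it is only a little low."""
    if row_count < CRITICAL_MIN_ROWS:
        return [CheckResult(Severity.ERROR,
                            f"row count {row_count} is below critical minimum {CRITICAL_MIN_ROWS}")]
    if row_count < EXPECTED_MIN_ROWS:
        return [CheckResult(Severity.WARNING,
                            f"row count {row_count} is below expected minimum {EXPECTED_MIN_ROWS}")]
    return []


def check_territory_changes(previous: dict[str, str],
                            current: dict[str, str]) -> list[CheckResult]:
    """Info: list the customers whose territory differs from the previous load."""
    changed = sorted(
        customer
        for customer, territory in current.items()
        if customer in previous and previous[customer] != territory
    )
    if not changed:
        return []
    return [CheckResult(Severity.INFO,
                        "customers changed territories: " + ", ".join(changed))]


def should_stop_the_line(results: list[CheckResult]) -> bool:
    """Only an error halts the pipeline; warnings and info are just reported."""
    return any(result.severity is Severity.ERROR for result in results)
```

A pipeline run would collect the results of every check, log the warnings and info messages, and refuse to load the data whenever `should_stop_the_line` returns true. Adding a test with each new feature then means adding one more small check function to that collection.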