SlideShare une entreprise Scribd logo
1  sur  20
Télécharger pour lire hors ligne
Seven Steps to High-
Velocity Data Analytics with
DataOps
2017-03-16 ! San Jose, CA
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
Agenda
DataKitchen Summary
Analytic Landscape
Seven Shocking Steps
Case Study
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
DataKitchen Executive Summary
• Who we serve:
• IT & Analytic Teams
• What we do:
• World’s first company focused on enabling DataOps
• How we do it:
• DataOps Software Platform
• Why DataOps matters? It allows analytic teams to:
• Deliver insight fast
• With high quality
• Using the tools they love
• Reuse what they’ve created
• Continuously show value to their business customers
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
The world changed
February 2005
“I am competing
with Amazon”
-- Director of Analytics
Brand team support
Big Pharma Company
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
Expectations are going higher
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
Your competitors are moving fast
WATERFALL ➡ AGILE ➡ DEVOPS RELEASE FREQUENCY
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
Your customers expect high quality
HAND MADE ➡ MASS PRODUCTION ➡ LEAN MANUFACTURING
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
How do you deliver on high expectations
on speed and quality?
The answer is DataOps
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
Seven Steps to Implement DataOps
1. Add Data and Logic Tests
2. Use a Version Control System
3. Branch and Merge
4. Use Multiple Environments
5. Reuse & Containerize
6. Parameterize Your Processing
7. Use Simple Storage
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
❶ Add Data And Logic Tests
Are you outputs
consistent?
And Save Test
Results!
Access:
Python Code
Transform:
SQL Code,
ETL Code
Model:
R Code
Visualize:
Tableau
Workbook XML
Report:
Tableau Online
Are data inputs
free from
issues?
Is your business logic
still correct?
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
Example Tests
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
At the end of the day, Analytic work is all
just code
Access:
Python Code
Transform:
SQL Code,
ETL Code
Model:
R Code
Visualize:
Tableau
Workbook XML
Report:
Tableau Online
❷ Use a Version Control System
Source Code
Control
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
❸ Branch & Merge
Source Code
Control
Branching & Merging enables people to safely work on their own
tasks
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
Access:
Python Code
Transform:
SQL Code,
ETL Code
Model:
R Code
Visualize:
Tableau
Workbook XML
Report:
Tableau Online
❹ Use Multiple Environments
Analytic Environment
Your Analytic Work Requires Coordinating Tools And Hardware
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
❹ Use Multiple Environments
Provide an Analytic Environment for each branch
• Analysts need a controlled environment for their experiments
• Engineers need a place to develop outside of production
• Update Production only after all tests are run!
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
❺ Reuse & Containerize
Containerize
1. Manage the environment for each
component (e.g. Docker, AMI)
2. Practice Environment Version Control
Reuse
1. Do not create one ‘monolith’ of code
2. Reuse the code and results
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
❻ Parameterize Your Processing
• Parameters and named
sets of parameters will
increase your velocity
• With parameters, you can
vary
• Inputs [you can make a
time machine]
• Outputs
• Steps in the workflow
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
❼ Use Simple Storage
• Data Lake
• Keep copies of all your raw data in simple, cheap
storage (s3, HFDS, file system)
• Data Restore: Be able to back up and restore your
databases easily
• “My Own Database”: Data Marts On Demand
• Create parameterized variations of your process that
allow you to assemble data for experimentation,
development, and productionDMDWDM
Transfor
m
Transfor
m
Transfor
m
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
DataOps case study
• One Data Engineer
• Supports 12 analysts
• Makes 12 schema changes a week without breaking
anything
• One Data Analyst supports hundreds of sales people
• Other Data Analysts make visualization changes and publish
them the next day
Analytics Team supporting Sales and Marketing in a
Pharmaceutical Company
Copyright © 2017 by DataKitchen, Inc. All Rights Reserved.
DataKitchen Executive Summary
• Who we serve:
• IT & Analytic Teams
• What we do:
• World’s first company focused on enabling DataOps
• How we do it:
• DataOps Software Platform
• Why DataOps matters? It allows analytic teams to:
• Deliver insight fast
• With high quality
• Using the tools they love
• Reuse what they’ve created
• Continuously show value to their business customers

Contenu connexe

Tendances

Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a...
Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a...Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a...
Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a...Rehgan Avon
 
Low-tech, Low-cost data management: Six insights from national reporting on f...
Low-tech, Low-cost data management: Six insights from national reporting on f...Low-tech, Low-cost data management: Six insights from national reporting on f...
Low-tech, Low-cost data management: Six insights from national reporting on f...srjbridge
 
90% of Enterprises are Using DataOps. Why Aren’t You?
90% of Enterprises are Using DataOps. Why Aren’t You?90% of Enterprises are Using DataOps. Why Aren’t You?
90% of Enterprises are Using DataOps. Why Aren’t You?Delphix
 
DataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven OrganizationsDataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven OrganizationsEllen Friedman
 
Best Practices in DataOps: How to Create Agile, Automated Data Pipelines
Best Practices in DataOps: How to Create Agile, Automated Data PipelinesBest Practices in DataOps: How to Create Agile, Automated Data Pipelines
Best Practices in DataOps: How to Create Agile, Automated Data PipelinesEric Kavanagh
 
The Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud WorldThe Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud WorldDATAVERSITY
 
The Future of Data Warehousing and Data Integration
The Future of Data Warehousing and Data IntegrationThe Future of Data Warehousing and Data Integration
The Future of Data Warehousing and Data IntegrationEric Kavanagh
 
Will AI Eliminate Reports and Dashboards
Will AI Eliminate Reports and DashboardsWill AI Eliminate Reports and Dashboards
Will AI Eliminate Reports and DashboardsEric Kavanagh
 
Choosing the Right Open Source Database
Choosing the Right Open Source DatabaseChoosing the Right Open Source Database
Choosing the Right Open Source DatabaseAll Things Open
 
Streaming and Visual Data Discovery for the Internet of Things
Streaming and Visual Data Discovery for the Internet of ThingsStreaming and Visual Data Discovery for the Internet of Things
Streaming and Visual Data Discovery for the Internet of ThingsDatawatchCorporation
 
Monitoring data quality by Jos Gheerardyn of Yields.io
Monitoring data quality by Jos Gheerardyn of Yields.ioMonitoring data quality by Jos Gheerardyn of Yields.io
Monitoring data quality by Jos Gheerardyn of Yields.ioDataops Ghent Meetup
 
Best Practices for Scaling Data Science Across the Organization
Best Practices for Scaling Data Science Across the OrganizationBest Practices for Scaling Data Science Across the Organization
Best Practices for Scaling Data Science Across the OrganizationChasity Gibson
 
Enabling DataOps with Unified Data Lineage
Enabling DataOps with Unified Data LineageEnabling DataOps with Unified Data Lineage
Enabling DataOps with Unified Data LineageMANTA
 
Horses for Courses: Database Roundtable
Horses for Courses: Database RoundtableHorses for Courses: Database Roundtable
Horses for Courses: Database RoundtableEric Kavanagh
 
progress_DBBI-infographic_01-01
progress_DBBI-infographic_01-01progress_DBBI-infographic_01-01
progress_DBBI-infographic_01-01Natasha Peterson
 
Dsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovicDsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovicRadovan Baćović
 
How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku) How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku) Dataiku
 
Why Data Science Projects Fail?
Why Data Science Projects Fail?Why Data Science Projects Fail?
Why Data Science Projects Fail?Ethan Ram
 
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...Data Con LA
 
13 2792 big-data_keynote_presentation_finalpass_05_d_v02
13 2792 big-data_keynote_presentation_finalpass_05_d_v0213 2792 big-data_keynote_presentation_finalpass_05_d_v02
13 2792 big-data_keynote_presentation_finalpass_05_d_v02Erin Kerrigan
 

Tendances (20)

Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a...
Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a...Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a...
Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a...
 
Low-tech, Low-cost data management: Six insights from national reporting on f...
Low-tech, Low-cost data management: Six insights from national reporting on f...Low-tech, Low-cost data management: Six insights from national reporting on f...
Low-tech, Low-cost data management: Six insights from national reporting on f...
 
90% of Enterprises are Using DataOps. Why Aren’t You?
90% of Enterprises are Using DataOps. Why Aren’t You?90% of Enterprises are Using DataOps. Why Aren’t You?
90% of Enterprises are Using DataOps. Why Aren’t You?
 
DataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven OrganizationsDataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven Organizations
 
Best Practices in DataOps: How to Create Agile, Automated Data Pipelines
Best Practices in DataOps: How to Create Agile, Automated Data PipelinesBest Practices in DataOps: How to Create Agile, Automated Data Pipelines
Best Practices in DataOps: How to Create Agile, Automated Data Pipelines
 
The Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud WorldThe Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud World
 
The Future of Data Warehousing and Data Integration
The Future of Data Warehousing and Data IntegrationThe Future of Data Warehousing and Data Integration
The Future of Data Warehousing and Data Integration
 
Will AI Eliminate Reports and Dashboards
Will AI Eliminate Reports and DashboardsWill AI Eliminate Reports and Dashboards
Will AI Eliminate Reports and Dashboards
 
Choosing the Right Open Source Database
Choosing the Right Open Source DatabaseChoosing the Right Open Source Database
Choosing the Right Open Source Database
 
Streaming and Visual Data Discovery for the Internet of Things
Streaming and Visual Data Discovery for the Internet of ThingsStreaming and Visual Data Discovery for the Internet of Things
Streaming and Visual Data Discovery for the Internet of Things
 
Monitoring data quality by Jos Gheerardyn of Yields.io
Monitoring data quality by Jos Gheerardyn of Yields.ioMonitoring data quality by Jos Gheerardyn of Yields.io
Monitoring data quality by Jos Gheerardyn of Yields.io
 
Best Practices for Scaling Data Science Across the Organization
Best Practices for Scaling Data Science Across the OrganizationBest Practices for Scaling Data Science Across the Organization
Best Practices for Scaling Data Science Across the Organization
 
Enabling DataOps with Unified Data Lineage
Enabling DataOps with Unified Data LineageEnabling DataOps with Unified Data Lineage
Enabling DataOps with Unified Data Lineage
 
Horses for Courses: Database Roundtable
Horses for Courses: Database RoundtableHorses for Courses: Database Roundtable
Horses for Courses: Database Roundtable
 
progress_DBBI-infographic_01-01
progress_DBBI-infographic_01-01progress_DBBI-infographic_01-01
progress_DBBI-infographic_01-01
 
Dsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovicDsc 2021 presentation_radovan_bacovic
Dsc 2021 presentation_radovan_bacovic
 
How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku) How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
 
Why Data Science Projects Fail?
Why Data Science Projects Fail?Why Data Science Projects Fail?
Why Data Science Projects Fail?
 
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
 
13 2792 big-data_keynote_presentation_finalpass_05_d_v02
13 2792 big-data_keynote_presentation_finalpass_05_d_v0213 2792 big-data_keynote_presentation_finalpass_05_d_v02
13 2792 big-data_keynote_presentation_finalpass_05_d_v02
 

En vedette

3 Things Every Sales Team Needs to Be Thinking About in 2017
3 Things Every Sales Team Needs to Be Thinking About in 20173 Things Every Sales Team Needs to Be Thinking About in 2017
3 Things Every Sales Team Needs to Be Thinking About in 2017Drift
 
How to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheHow to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheLeslie Samuel
 
Bridged Overview by CodeData
Bridged Overview by CodeDataBridged Overview by CodeData
Bridged Overview by CodeDataSam Sur
 
Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!DataKitchen
 
Data kitchen 7 agile steps - big data fest 9-18-2015
Data kitchen   7 agile steps - big data fest 9-18-2015Data kitchen   7 agile steps - big data fest 9-18-2015
Data kitchen 7 agile steps - big data fest 9-18-2015DataKitchen
 
Introduction to Data Virtualization (session 1 from Packed Lunch Webinar Series)
Introduction to Data Virtualization (session 1 from Packed Lunch Webinar Series)Introduction to Data Virtualization (session 1 from Packed Lunch Webinar Series)
Introduction to Data Virtualization (session 1 from Packed Lunch Webinar Series)Denodo
 
Data Virtualization Journey: How to Grow from Single Project and to Enterpris...
Data Virtualization Journey: How to Grow from Single Project and to Enterpris...Data Virtualization Journey: How to Grow from Single Project and to Enterpris...
Data Virtualization Journey: How to Grow from Single Project and to Enterpris...Denodo
 
Open Data Science Conference Agile Data
Open Data Science Conference Agile DataOpen Data Science Conference Agile Data
Open Data Science Conference Agile DataDataKitchen
 
AWS re:Invent 2016: Taking Data to the Extreme (MBL202)
AWS re:Invent 2016: Taking Data to the Extreme (MBL202)AWS re:Invent 2016: Taking Data to the Extreme (MBL202)
AWS re:Invent 2016: Taking Data to the Extreme (MBL202)Amazon Web Services
 
Denodo DataFest 2016: The Role of Data Virtualization in IoT Integration
Denodo DataFest 2016: The Role of Data Virtualization in IoT IntegrationDenodo DataFest 2016: The Role of Data Virtualization in IoT Integration
Denodo DataFest 2016: The Role of Data Virtualization in IoT IntegrationDenodo
 
Results 140712082255-phpapp01-140712094930-phpapp01
Results 140712082255-phpapp01-140712094930-phpapp01Results 140712082255-phpapp01-140712094930-phpapp01
Results 140712082255-phpapp01-140712094930-phpapp01Lider705
 
лекція №8
лекція №8лекція №8
лекція №8shulga_sa
 
Kimetz asier andres
Kimetz asier andresKimetz asier andres
Kimetz asier andresAlmudena73
 
Audax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data ExplosionAudax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data Explosionactifio
 
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things BetterTaking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things BetterMatt Tesauro
 
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESBData Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESBDenodo
 

En vedette (20)

3 Things Every Sales Team Needs to Be Thinking About in 2017
3 Things Every Sales Team Needs to Be Thinking About in 20173 Things Every Sales Team Needs to Be Thinking About in 2017
3 Things Every Sales Team Needs to Be Thinking About in 2017
 
How to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheHow to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your Niche
 
Bridged Overview by CodeData
Bridged Overview by CodeDataBridged Overview by CodeData
Bridged Overview by CodeData
 
Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!
 
Data kitchen 7 agile steps - big data fest 9-18-2015
Data kitchen   7 agile steps - big data fest 9-18-2015Data kitchen   7 agile steps - big data fest 9-18-2015
Data kitchen 7 agile steps - big data fest 9-18-2015
 
Introduction to Data Virtualization (session 1 from Packed Lunch Webinar Series)
Introduction to Data Virtualization (session 1 from Packed Lunch Webinar Series)Introduction to Data Virtualization (session 1 from Packed Lunch Webinar Series)
Introduction to Data Virtualization (session 1 from Packed Lunch Webinar Series)
 
Data Virtualization Journey: How to Grow from Single Project and to Enterpris...
Data Virtualization Journey: How to Grow from Single Project and to Enterpris...Data Virtualization Journey: How to Grow from Single Project and to Enterpris...
Data Virtualization Journey: How to Grow from Single Project and to Enterpris...
 
Open Data Science Conference Agile Data
Open Data Science Conference Agile DataOpen Data Science Conference Agile Data
Open Data Science Conference Agile Data
 
AWS re:Invent 2016: Taking Data to the Extreme (MBL202)
AWS re:Invent 2016: Taking Data to the Extreme (MBL202)AWS re:Invent 2016: Taking Data to the Extreme (MBL202)
AWS re:Invent 2016: Taking Data to the Extreme (MBL202)
 
Denodo DataFest 2016: The Role of Data Virtualization in IoT Integration
Denodo DataFest 2016: The Role of Data Virtualization in IoT IntegrationDenodo DataFest 2016: The Role of Data Virtualization in IoT Integration
Denodo DataFest 2016: The Role of Data Virtualization in IoT Integration
 
Results 140712082255-phpapp01-140712094930-phpapp01
Results 140712082255-phpapp01-140712094930-phpapp01Results 140712082255-phpapp01-140712094930-phpapp01
Results 140712082255-phpapp01-140712094930-phpapp01
 
лекція №8
лекція №8лекція №8
лекція №8
 
Kimetz asier andres
Kimetz asier andresKimetz asier andres
Kimetz asier andres
 
DataOps with Project Amaterasu
DataOps with Project AmaterasuDataOps with Project Amaterasu
DataOps with Project Amaterasu
 
Audax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data ExplosionAudax Group: CIO Perspectives - Managing The Copy Data Explosion
Audax Group: CIO Perspectives - Managing The Copy Data Explosion
 
Oxygen gas analyzer
Oxygen gas analyzerOxygen gas analyzer
Oxygen gas analyzer
 
Proyecto 2017/2018
Proyecto 2017/2018 Proyecto 2017/2018
Proyecto 2017/2018
 
Ingenieria civil..
Ingenieria civil..Ingenieria civil..
Ingenieria civil..
 
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things BetterTaking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
 
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESBData Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
 

Similaire à Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with dataops

Fri benghiat gil-odsc-data-kitchen-data science to dataops
Fri benghiat gil-odsc-data-kitchen-data science to dataopsFri benghiat gil-odsc-data-kitchen-data science to dataops
Fri benghiat gil-odsc-data-kitchen-data science to dataopsDataKitchen
 
Washington DC DataOps Meetup -- Nov 2019
Washington DC DataOps Meetup   -- Nov 2019Washington DC DataOps Meetup   -- Nov 2019
Washington DC DataOps Meetup -- Nov 2019DataKitchen
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyDataWorks Summit
 
Architecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the EnterpriseArchitecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the EnterpriseAmazon Web Services
 
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution AnalyticsRevolution Analytics
 
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 Citrix Moves Data to Amazon Redshift Fast with Matillion ETL Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
Citrix Moves Data to Amazon Redshift Fast with Matillion ETLAmazon Web Services
 
Bdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenBdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenChristopher Bergh
 
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...Amazon Web Services
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataMatt Stubbs
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataMatt Stubbs
 
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Cloudera, Inc.
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsDenodo
 
The new dominant companies are running on data
The new dominant companies are running on data The new dominant companies are running on data
The new dominant companies are running on data SnapLogic
 
TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
 TiVo: How to Scale New Products with a Data Lake on AWS and Qubole TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
TiVo: How to Scale New Products with a Data Lake on AWS and QuboleAmazon Web Services
 
TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
 TiVo: How to Scale New Products with a Data Lake on AWS and Qubole TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
TiVo: How to Scale New Products with a Data Lake on AWS and QuboleAmazon Web Services
 
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...Matt Stubbs
 
IT + Line of Business - Driving Faster, Deeper Insights Together
IT + Line of Business - Driving Faster, Deeper Insights TogetherIT + Line of Business - Driving Faster, Deeper Insights Together
IT + Line of Business - Driving Faster, Deeper Insights TogetherDATAVERSITY
 
Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Group
 

Similaire à Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with dataops (20)

Fri benghiat gil-odsc-data-kitchen-data science to dataops
Fri benghiat gil-odsc-data-kitchen-data science to dataopsFri benghiat gil-odsc-data-kitchen-data science to dataops
Fri benghiat gil-odsc-data-kitchen-data science to dataops
 
Washington DC DataOps Meetup -- Nov 2019
Washington DC DataOps Meetup   -- Nov 2019Washington DC DataOps Meetup   -- Nov 2019
Washington DC DataOps Meetup -- Nov 2019
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata Company
 
Architecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the EnterpriseArchitecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the Enterprise
 
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
 
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 Citrix Moves Data to Amazon Redshift Fast with Matillion ETL Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 
Bdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenBdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchen
 
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
 
ABD217_From Batch to Streaming
ABD217_From Batch to StreamingABD217_From Batch to Streaming
ABD217_From Batch to Streaming
 
The new dominant companies are running on data
The new dominant companies are running on data The new dominant companies are running on data
The new dominant companies are running on data
 
TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
 TiVo: How to Scale New Products with a Data Lake on AWS and Qubole TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
 
TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
 TiVo: How to Scale New Products with a Data Lake on AWS and Qubole TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
 
Migrating database to cloud
Migrating database to cloudMigrating database to cloud
Migrating database to cloud
 
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
 
IT + Line of Business - Driving Faster, Deeper Insights Together
IT + Line of Business - Driving Faster, Deeper Insights TogetherIT + Line of Business - Driving Faster, Deeper Insights Together
IT + Line of Business - Driving Faster, Deeper Insights Together
 
Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Big Data part 2
Skillwise Big Data part 2
 

Dernier

(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 

Dernier (20)

(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 

Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with dataops

  • 1. Seven Steps to High- Velocity Data Analytics with DataOps 2017-03-16 ! San Jose, CA
  • 2. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. Agenda DataKitchen Summary Analytic Landscape Seven Shocking Steps Case Study
  • 3. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. DataKitchen Executive Summary • Who we serve: • IT & Analytic Teams • What we do: • World’s first company focused on enabling DataOps • How we do it: • DataOps Software Platform • Why DataOps matters? It allows analytic teams to: • Deliver insight fast • With high quality • Using the tools they love • Reuse what they’ve created • Continuously show value to their business customers
  • 4. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. The world changed February 2005 “I am competing with Amazon” -- Director of Analytics Brand team support Big Pharma Company
  • 5. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. Expectations are going higher
  • 6. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. Your competitors are moving fast WATERFALL ➡ AGILE ➡ DEVOPS RELEASE FREQUENCY
  • 7. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. Your customers expect high quality HAND MADE ➡ MASS PRODUCTION ➡ LEAN MANUFACTURING
  • 8. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. How do you deliver on high expectations on speed and quality? The answer is DataOps
  • 9. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. Seven Steps to Implement DataOps 1. Add Data and Logic Tests 2. Use a Version Control System 3. Branch and Merge 4. Use Multiple Environments 5. Reuse & Containerize 6. Parameterize Your Processing 7. Use Simple Storage
  • 10. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. ❶ Add Data And Logic Tests Are you outputs consistent? And Save Test Results! Access: Python Code Transform: SQL Code, ETL Code Model: R Code Visualize: Tableau Workbook XML Report: Tableau Online Are data inputs free from issues? Is your business logic still correct?
  • 11. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. Example Tests
  • 12. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. At the end of the day, Analytic work is all just code Access: Python Code Transform: SQL Code, ETL Code Model: R Code Visualize: Tableau Workbook XML Report: Tableau Online ❷ Use a Version Control System Source Code Control
  • 13. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. ❸ Branch & Merge Source Code Control Branching & Merging enables people to safely work on their own tasks
  • 14. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. Access: Python Code Transform: SQL Code, ETL Code Model: R Code Visualize: Tableau Workbook XML Report: Tableau Online ❹ Use Multiple Environments Analytic Environment Your Analytic Work Requires Coordinating Tools And Hardware
  • 15. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. ❹ Use Multiple Environments Provide an Analytic Environment for each branch • Analysts need a controlled environment for their experiments • Engineers need a place to develop outside of production • Update Production only after all tests are run!
  • 16. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. ❺ Reuse & Containerize Containerize 1. Manage the environment for each component (e.g. Docker, AMI) 2. Practice Environment Version Control Reuse 1. Do not create one ‘monolith’ of code 2. Reuse the code and results
  • 17. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. ❻ Parameterize Your Processing • Parameters and named sets of parameters will increase your velocity • With parameters, you can vary • Inputs [you can make a time machine] • Outputs • Steps in the workflow
  • 18. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. ❼ Use Simple Storage • Data Lake • Keep copies of all your raw data in simple, cheap storage (s3, HFDS, file system) • Data Restore: Be able to back up and restore your databases easily • “My Own Database”: Data Marts On Demand • Create parameterized variations of your process that allow you to assemble data for experimentation, development, and productionDMDWDM Transfor m Transfor m Transfor m
  • 19. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. DataOps case study • One Data Engineer • Supports 12 analysts • Makes 12 schema changes a week without breaking anything • One Data Analyst supports hundreds of sales people • Other Data Analysts make visualization changes and publish them the next day Analytics Team supporting Sales and Marketing in a Pharmaceutical Company
  • 20. Copyright © 2017 by DataKitchen, Inc. All Rights Reserved. DataKitchen Executive Summary • Who we serve: • IT & Analytic Teams • What we do: • World’s first company focused on enabling DataOps • How we do it: • DataOps Software Platform • Why DataOps matters? It allows analytic teams to: • Deliver insight fast • With high quality • Using the tools they love • Reuse what they’ve created • Continuously show value to their business customers