Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Suki Dhuphar
April 8th, 2020
Today’s Speaker
Suki Dhuphar
EMEA Lead, Tamr
Agenda
● Introduction
● What is Modern Enterprise Data Engineering? (Why it’s more important than ever before)
● How to adapt your Data Engineering processes to rapid change
● Strategies to keep remote teams aligned and virtually connected
● The importance of using data to drive business decisions
● Why organizations need to modernize data management quickly and effectively (How to accelerate cloud migration)
Data is more important now than ever before!!
Large Orgs Can Be Tempted to Put the AI/ML Cart Before the Data Horse
What is DataOps? = Modern Data Engineering Practice
DataOps is an automated, process-oriented methodology used by analytic and data teams to improve the quality and reduce the cycle time of data analytics.
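To make "automated" and "improve the quality" concrete, here is a minimal sketch of a quality gate that a pipeline could run on every incoming batch. It is an illustration only: the check names, the `run_quality_gate` helper, and the sample records are assumptions, not something taken from the talk or from any specific tool.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_not_null(rows, column):
    """Fail if any row is missing a value in `column`."""
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return CheckResult(f"not_null({column})", missing == 0, f"{missing} missing")


def check_unique(rows, column):
    """Fail if `column` contains duplicate values."""
    values = [row.get(column) for row in rows]
    duplicates = len(values) - len(set(values))
    return CheckResult(f"unique({column})", duplicates == 0, f"{duplicates} duplicates")


def run_quality_gate(rows, checks):
    """Run every check; the batch is publishable only if all of them pass."""
    results = [check(rows) for check in checks]
    return all(r.passed for r in results), results


# Hypothetical batch of customer records arriving from a source system.
batch = [
    {"customer_id": "C1", "country": "UK"},
    {"customer_id": "C2", "country": ""},
    {"customer_id": "C2", "country": "DE"},
]

ok, results = run_quality_gate(
    batch,
    [
        lambda rows: check_not_null(rows, "country"),
        lambda rows: check_unique(rows, "customer_id"),
    ],
)
for r in results:
    print(r.name, "PASS" if r.passed else "FAIL", r.detail)
print("publish" if ok else "hold for remediation")
```

Running checks like these automatically on each batch, rather than by hand, is one way both the quality and the cycle-time goals in the definition can be served at once.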
Data Science vs. Data Engineering
There is significant overlap between data engineers and data scientists when it comes to skills and responsibilities. The main difference is one of focus.
Data engineers focus on building infrastructure and architecture for data generation, preparation, and publishing.
In contrast, data scientists focus on advanced mathematics and statistical analysis of published data.
Some traditional enterprise “data management” professionals will become data engineers.
See the detailed infographic from DataCamp.
[Figure: skill areas compared across Software Engineer, Data Scientist, and Data Engineer roles: Reporting and Visualization; Statistical Modeling & Machine Learning; Data Movement; Data Cleaning, Unification, Alignment; Database Performance Optimization]
Reality of modern enterprise data scientists.
They are constantly and idiosyncratically “fixing” the core data
Source: Data Scientist Survey by Figure Eight
Empowering data consumers is core to success of
industry disruptors
“One of the primary goals of the platform is to enable
other teams to focus on business logic, making
experimentation, implementation, operation of
stream processing jobs easy. By having a platform to
abstract the “hard stuff”, removing complexities away
from users, this would unleash broader team agility
and product innovations.”
Traditional “Methods” for data engineering
All are necessary, but none alone is sufficient to solve the broader problems
● Standardization -- one schema to rule them all
● Aggregation -- tends to create more/bigger silos
● Federation -- always creates significant query performance challenges
● Master Data Management -- “deviations” difficult to handle and too deterministic
● Rationalize Systems -- single vendor = radical compromise
● Throw Bodies at it -- expensive, time-consuming, ineffective & inconsistent
Human/behavioral challenges are often the primary bottleneck
● Afraid to share data
○ Due to data quality (worry about being judged, or about having to take on responsibility for fixing data at consumers’ request)
● Hoarding data
○ A method of organizational control or job preservation
● Obscuring data complexity
○ Failure to embrace the complexity, diversity, and idiosyncrasy of data generated in a large enterprise
● Limiting access to a small number of users
○ A method of control, or a reflection of insecurity about data quality
Traditional companies have significant “legacy drag coefficient”
They manage data from their business systems more as “exhaust” than “asset”, which leads to significant “data debt”
Problem: Thousands of systems generating data every day that were built over decades to support business processes - idiosyncratic to that time/context.
Data is idiosyncratic to each system - creates fundamental “data disconnect” and “data decay”
Data debt from constant change/entropy:
● Restructuring
● Leadership Changes
● Politics
● Dynamic Schema DBs - Mongo et al.
● “Data Hoarding”
● Legacy Burden
● M&A
Result: “Random Data Salad”
Consequences:
1. Too much time spent on data prep vs. analysis / action
2. High failure rate of BI / analytics projects
3. Game-changing initiatives deemed “impossible” and never start
Modern, Open Data Engineering Ecosystem (aka DataOps)
[Figure: ecosystem diagram with the following labels]
● Sources (tabular): Internal databases, Internal apps / systems, External endpoints, Internal files, External files
● Components: Feedback & Usage; Mastering & Quality; Movement & Automation; Storage & Compute; Governance, Privacy & Policy; Catalog & Crawling; Publishing & Versioning
● Consumption endpoints: Analytics, Source Remediation, Source / Cloud Migration, Custom Apps
● Consumers: Citizens, Analysts, Data Scientists, Developers
Modern Enterprise Data Engineering Principles ≈ “DataOps”
Sources: Internal Tabular Data, External Tabular Data
People, Process, Tools:
● Cloud First
● Continuous (assume data will change)
● Agile (deploy quickly and iteratively)
● Highly Automated - automate whenever possible
● Open/Best of Breed (not one platform/vendor)
● Bi-Directional (Feedback)
● Collaborative (Humans at the Core)
● Service Oriented (clear endpoints for data)
● Loosely Coupled (RESTful interfaces, table(s) in/out)
● Both aggregated AND federated storage
● Both batch AND streaming
● Lineage/Provenance is essential
● Scale Out/Distributed
Consumers: Citizens, Analysts, Data Scientists, Developers
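Two of the principles above, "Loosely Coupled (table(s) in/out)" and "Lineage/Provenance is essential", can be sketched together in a few lines. This is a hedged illustration under stated assumptions: the `Table` type, the `step` wrapper, and the sample transforms are hypothetical names invented for the example, not an API from the talk or from any product.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    rows: list
    lineage: list = field(default_factory=list)  # ordered record of how this table was produced


def step(step_name, output_name, transform):
    """Wrap a row-level transform as a table-in/table-out step that records lineage."""
    def run(table):
        new_rows = transform(table.rows)
        entry = f"{step_name}: {table.name} -> {output_name}"
        return Table(output_name, new_rows, table.lineage + [entry])
    return run


def normalize_country(rows):
    """Trim and upper-case the country code on every row."""
    return [{**r, "country": r["country"].strip().upper()} for r in rows]


def drop_missing_ids(rows):
    """Keep only rows that carry a non-empty id."""
    return [r for r in rows if r.get("id")]


# Each step only knows about tables, so steps can be swapped or reordered independently.
pipeline = [
    step("normalize_country", "customers_normalized", normalize_country),
    step("drop_missing_ids", "customers_clean", drop_missing_ids),
]

# Hypothetical source table.
table = Table("crm_export", [{"id": "1", "country": " uk "}, {"id": "", "country": "de"}])
for run in pipeline:
    table = run(table)

print(table.rows)
print(table.lineage)
```

Because every step takes a table and returns a table, any single step could be replaced by a different tool without touching its neighbours, and the lineage list travels with the data so a consumer can see how a published table was produced.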
Reality of Data Ecosystem/Landscape: EXTREME NOISE
Adapting to change - Telecommuting/WFH/Remote working is not a new concept
How many employees work from home?
Regular work-at-home has grown 173% since 2005, 11% faster than the rest of the workforce. [Global Workplace Analytics’ analysis of 2018 ACS data]
How many people could work from home?
● 56% of employees have a job where at least some of what they do could be done remotely [Global Workplace Analytics’ analysis of BLS data, 2017]
● 62% of employees say they could work remotely [Citrix 2019 poll]
● Studies repeatedly show desks are vacant 50-60% of the time.
[Chart: percentage of people who work at home, by industry. Global Workplace Analytics’ special analysis of 2016 ACS data]
Key People/Personas in the Modern Data Ecosystem
[Figure: personas grouped under Data Suppliers, Data Engineering, and Data Consumers]
Personas shown: CIO, Source Owner, DBA, IT Professional, CDO, Data Engineer, Curator, Steward, Business owners and other CxOs, Data Scientist, Data Analyst, Data Citizen, Developer, ELT Professional
Key Roles in next-gen Data Ecosystem
Group | Role | Goals | Tools
Consumers | Citizen | Use data to make business decisions | Viz, CRM, Excel, PowerPoint, Word, Web Search
Consumers | Analyst | Deliver insights to the business, typically through dashboards and reports | Viz, Excel, SSDP, Web Search
Consumers | Scientist | Deliver insights to the business, typically through models and algorithms | R, Python, SAS, SSDP
Consumers | Developer | Build applications which leverage corporate data | Python, Java, JS, SQL, REST
Preparers | Engineer | Deliver and manage data pipelines | ETL, SQL
Preparers | Curator | Ensure consumers have the data they need, in the form they need it | Data mastering tools
Preparers | Steward | Use feedback from consumers to improve data broadly | Data Feedback Tools
Suppliers | Source Owner | Define and manage purpose, processes (data creation, consumption) & users (i.e., access) of the data source | EDW, SQL, ERWin, LDAP, SAP
Useful comparison between CDO and CFO
Table stakes
CHIEF FINANCIAL OFFICER
● What money do we have?
● Where did it come from?
● Where is it going and why?
Long term goal: Return on Assets
CHIEF DATA OFFICER
● What data do we have?
● Where does it come from?
● Who consumes it and why?
Long term goal: Return on Data
How Do I Start?
You may have already started...
● Leverage existing mastered data as ground truth
● Keep the best parts of your MDM, just enhance the Mastering capability
...But if you haven’t…
● Find a data-rich, analytically valuable problem for which fragmented data and
knowledge present a challenge
...Either way, it’s essential to keep an agile...
● ...Mindset - focus on quick wins that have been beyond reach, then build
● ...Skillset - engage the data experts at their current skill level, let machines do the
rest
● ...Toolset - simple, collaborative data curation, optimally in the tools they already
use
What are Transformational Analytic Outcomes?
Question: How many customers do we have?
● Enterprises have hundreds of source systems
● Sources must be combined, consolidated, and classified
● These lists are building blocks for transformational analytics
[Figure: before/after comparison across 212 sources (tables), mostly SAP]
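The "How many customers do we have?" question can be illustrated with a toy consolidation across sources. The sketch below uses a deliberately naive rule-based matching key; the source names, the suffix list, and the `match_key` and `consolidate` helpers are all assumptions made for the example and do not represent the method used by any particular product.

```python
import re
from collections import defaultdict

# Assumed legal-form suffixes to ignore when comparing names.
SUFFIXES = {"inc", "ltd", "llc", "gmbh", "corp", "co", "plc"}


def match_key(name):
    """Normalize a company name into a crude matching key."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def consolidate(sources):
    """Group records from all sources by matching key."""
    clusters = defaultdict(list)
    for source_name, names in sources.items():
        for name in names:
            clusters[match_key(name)].append((source_name, name))
    return clusters


# Hypothetical customer lists from three source systems.
sources = {
    "sap_emea": ["Acme Ltd.", "Globex GmbH"],
    "sap_us": ["ACME, Inc.", "Initech LLC"],
    "crm": ["Globex", "Initech"],
}

before = sum(len(names) for names in sources.values())
clusters = consolidate(sources)
after = len(clusters)
print(f"before: {before} customer records, after: {after} distinct customers")
```

Counting raw records per source overstates the customer base; only after the sources are combined and consolidated does the count become meaningful. Hand-written rules like these tend to break down as the number and variety of sources grows, which is the scaling problem the slide points at.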
What are Transformational Analytic Outcomes?
Question: What is our customer distribution by sales totals?
● Analytics begin with “sell more” and/or “spend less”
● Transformational analytics aren’t new, they are broader
● Business wants speed and up-to-date information
● Data variety skews answers, creating misinformation instead of clarity
[Figure: before/after comparison]
Why now? 7 years ago, we needed data scientists!
But now that we have them - where do they get their data?
[Figure: article headline “Data Scientist: The Sexiest Job of the 21st Century”]
Today: we have data scientists! (and want to do cool AI stuff)
[Chart: Data Scientist jobs as % of all postings, Indeed.com]
Unique Moment in Time: Enterprise Data as an Asset
Latent Opportunity of Enterprise “Data As An Asset”
Enterprise Migration to the Cloud
Generational Change in Enterprise Data Management
Fear of disruption by “Data Natives”
Low Hanging Analytical Fruit
“Competing on Analytics”/Strategic Imperative
Popularity of AI but really of Core Data Quality
Inability of traditional tech to scale
Lack of innovation from old vendors
Maturing Big Data Tech (HDFS/Lakes)
Democratization of analytics
Rise of the CDO
Decades of treating data as operational exhaust
Deeply Fragmented/Siloed Data Environments
Inability to leverage new sources - esp external
“AI Cart” before the “Data Horse”
Significant “lift & shift” opportunity
Potential for behavioral changes
New infra good/secure enough
Now that we’ve established Data Science as a critical component of the enterprise:
It’s time for each enterprise in the Global 2000 to build its data engineering muscles so it can “compete on analytics” over the coming decades.
What NOT to do
● Avoid “boil the ocean”/“waterfall” projects (measured in years/quarters)
○ Build rational long-term infra while delivering real analytic value along the way
● Single “Platform”: Don’t overestimate what a single piece of software can do
○ Focus on a thoughtfully designed ecosystem of loosely coupled, best-of-breed tools
● Single Vendor: Don’t overestimate what a single vendor can do
○ Align vendors with APIs and expectations that they MUST work together
● Don’t underestimate the effort required to make FOSS work
○ Just because Google does it doesn’t mean you can do it
● Don’t underestimate human/behavioral challenges with data
○ Most often, the reasons that projects fail/stall are human/behavioral
● Avoid “Data Engineering/Science Hubris”
○ I Data - therefore I am
Thank You
Questions
Email: suki.dhuphar@tamr.com
For joining this webinar, you’ll receive exclusive access to Tamr’s new guide
33

Contenu connexe

Tendances

Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...Caserta
 
The Emerging Role of the Data Lake
The Emerging Role of the Data LakeThe Emerging Role of the Data Lake
The Emerging Role of the Data LakeCaserta
 
Moving Past Infrastructure Limitations
Moving Past Infrastructure LimitationsMoving Past Infrastructure Limitations
Moving Past Infrastructure LimitationsCaserta
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best PracticesDATAVERSITY
 
The Rise of the CDO in Today's Enterprise
The Rise of the CDO in Today's EnterpriseThe Rise of the CDO in Today's Enterprise
The Rise of the CDO in Today's EnterpriseCaserta
 
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017Caserta
 
The Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud WorldThe Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud WorldDATAVERSITY
 
Virtual Governance in a Time of Crisis Workshop
Virtual Governance in a Time of Crisis WorkshopVirtual Governance in a Time of Crisis Workshop
Virtual Governance in a Time of Crisis WorkshopCCG
 
DAS Slides: Emerging Trends in Data Architecture – What’s the Next Big Thing?
DAS Slides: Emerging Trends in Data Architecture – What’s the Next Big Thing?DAS Slides: Emerging Trends in Data Architecture – What’s the Next Big Thing?
DAS Slides: Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
The Key to Big Data Modeling: Collaboration
The Key to Big Data Modeling: CollaborationThe Key to Big Data Modeling: Collaboration
The Key to Big Data Modeling: CollaborationEmbarcadero Technologies
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDATAVERSITY
 
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...DATAVERSITY
 
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...Caserta
 
Accelerating Fast Data Strategy with Data Virtualization
Accelerating Fast Data Strategy with Data VirtualizationAccelerating Fast Data Strategy with Data Virtualization
Accelerating Fast Data Strategy with Data VirtualizationDenodo
 
Balancing Data Governance and Innovation
Balancing Data Governance and InnovationBalancing Data Governance and Innovation
Balancing Data Governance and InnovationCaserta
 
Balancing Data Governance and Innovation
Balancing Data Governance and InnovationBalancing Data Governance and Innovation
Balancing Data Governance and InnovationCaserta
 
The Heart of Data Modeling: 7 Ways Your Agile Project is Managing Data Wrong
The Heart of Data Modeling: 7 Ways Your Agile Project is Managing Data WrongThe Heart of Data Modeling: 7 Ways Your Agile Project is Managing Data Wrong
The Heart of Data Modeling: 7 Ways Your Agile Project is Managing Data WrongDATAVERSITY
 
ADV Slides: Comparing the Enterprise Analytic Solutions
ADV Slides: Comparing the Enterprise Analytic SolutionsADV Slides: Comparing the Enterprise Analytic Solutions
ADV Slides: Comparing the Enterprise Analytic SolutionsDATAVERSITY
 
MLOps - Getting Machine Learning Into Production
MLOps - Getting Machine Learning Into ProductionMLOps - Getting Machine Learning Into Production
MLOps - Getting Machine Learning Into ProductionMichael Pearce
 

Tendances (20)

Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
 
The Emerging Role of the Data Lake
The Emerging Role of the Data LakeThe Emerging Role of the Data Lake
The Emerging Role of the Data Lake
 
Moving Past Infrastructure Limitations
Moving Past Infrastructure LimitationsMoving Past Infrastructure Limitations
Moving Past Infrastructure Limitations
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
The Rise of the CDO in Today's Enterprise
The Rise of the CDO in Today's EnterpriseThe Rise of the CDO in Today's Enterprise
The Rise of the CDO in Today's Enterprise
 
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017
 
The Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud WorldThe Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud World
 
Virtual Governance in a Time of Crisis Workshop
Virtual Governance in a Time of Crisis WorkshopVirtual Governance in a Time of Crisis Workshop
Virtual Governance in a Time of Crisis Workshop
 
DAS Slides: Emerging Trends in Data Architecture – What’s the Next Big Thing?
DAS Slides: Emerging Trends in Data Architecture – What’s the Next Big Thing?DAS Slides: Emerging Trends in Data Architecture – What’s the Next Big Thing?
DAS Slides: Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
The Key to Big Data Modeling: Collaboration
The Key to Big Data Modeling: CollaborationThe Key to Big Data Modeling: Collaboration
The Key to Big Data Modeling: Collaboration
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data Architecture
 
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
 
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
 
Accelerating Fast Data Strategy with Data Virtualization
Accelerating Fast Data Strategy with Data VirtualizationAccelerating Fast Data Strategy with Data Virtualization
Accelerating Fast Data Strategy with Data Virtualization
 
Balancing Data Governance and Innovation
Balancing Data Governance and InnovationBalancing Data Governance and Innovation
Balancing Data Governance and Innovation
 
Balancing Data Governance and Innovation
Balancing Data Governance and InnovationBalancing Data Governance and Innovation
Balancing Data Governance and Innovation
 
The Heart of Data Modeling: 7 Ways Your Agile Project is Managing Data Wrong
The Heart of Data Modeling: 7 Ways Your Agile Project is Managing Data WrongThe Heart of Data Modeling: 7 Ways Your Agile Project is Managing Data Wrong
The Heart of Data Modeling: 7 Ways Your Agile Project is Managing Data Wrong
 
ADV Slides: Comparing the Enterprise Analytic Solutions
ADV Slides: Comparing the Enterprise Analytic SolutionsADV Slides: Comparing the Enterprise Analytic Solutions
ADV Slides: Comparing the Enterprise Analytic Solutions
 
MLOps - Getting Machine Learning Into Production
MLOps - Getting Machine Learning Into ProductionMLOps - Getting Machine Learning Into Production
MLOps - Getting Machine Learning Into Production
 

Similaire à Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty

Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?DATAVERSITY
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaCloudera, Inc.
 
Tdwi march 2015 presentation
Tdwi march 2015 presentationTdwi march 2015 presentation
Tdwi march 2015 presentationAlison Macfie
 
Data-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling FundamentalsData-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling FundamentalsDATAVERSITY
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Denodo
 
Creating a Successful DataOps Framework for Your Business.pdf
Creating a Successful DataOps Framework for Your Business.pdfCreating a Successful DataOps Framework for Your Business.pdf
Creating a Successful DataOps Framework for Your Business.pdfEnov8
 
Should You Invest In DataOps Services?
Should You Invest In DataOps Services?Should You Invest In DataOps Services?
Should You Invest In DataOps Services?Enov8
 
Data Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AIData Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AIDenodo
 
The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation Caserta
 
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing KeynoteArchitecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing KeynoteCaserta
 
How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)Denodo
 
Transition to a modern data platform
Transition to a modern data platform Transition to a modern data platform
Transition to a modern data platform Michael Ghen
 
Big Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseBig Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseCaserta
 
Trends in Enterprise Advanced Analytics
Trends in Enterprise Advanced AnalyticsTrends in Enterprise Advanced Analytics
Trends in Enterprise Advanced AnalyticsDATAVERSITY
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?Denodo
 
Driving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data AssetsDriving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data AssetsEmbarcadero Technologies
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopInside Analysis
 
Big Data LDN 2018: AGILE DATA MASTERING: THE RIGHT APPROACH FOR DATAOPS
Big Data LDN 2018: AGILE DATA MASTERING: THE RIGHT APPROACH FOR DATAOPSBig Data LDN 2018: AGILE DATA MASTERING: THE RIGHT APPROACH FOR DATAOPS
Big Data LDN 2018: AGILE DATA MASTERING: THE RIGHT APPROACH FOR DATAOPSMatt Stubbs
 
Building the Artificially Intelligent Enterprise
Building the Artificially Intelligent EnterpriseBuilding the Artificially Intelligent Enterprise
Building the Artificially Intelligent EnterpriseDatabricks
 
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)Denodo
 

Similaire à Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty (20)

Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 
Tdwi march 2015 presentation
Tdwi march 2015 presentationTdwi march 2015 presentation
Tdwi march 2015 presentation
 
Data-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling FundamentalsData-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling Fundamentals
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
 
Creating a Successful DataOps Framework for Your Business.pdf



Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty

  • 1. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Suki Dhuphar April 8th, 2020
  • 3. Agenda ● Introduction ● What is Modern Enterprise Data Engineering? (Why it’s more important than ever before) ● How to adapt your Data Engineering processes to rapid change ● Strategies to keep remote teams aligned and virtually connected ● The importance of using data to drive business decisions ● Why organizations need to modernize data management quickly and effectively (How to accelerate cloud migration)
  • 4. Data is more important now than ever before!!
  • 6. Large Orgs Can Be Tempted to Put AI/ML Cart Before Data Horse
  • 7. What is DataOps? = Modern Data Engineering Practice. DataOps is an automated, process-oriented methodology used by analytic and data teams to improve the quality and reduce the cycle time of data analytics.
  • 8. Data Science vs. Data Engineering. There is a significant overlap between data engineers & data scientists when it comes to skills and responsibilities. The main difference is one of focus. Data Engineers are focused on building infrastructure & architecture for data generation, preparation and publishing. In contrast, data scientists are focused on advanced mathematics and statistical analysis on published data. Some traditional enterprise “data management” professionals will become data engineers. [Diagram: skill areas (Reporting and Visualization; Statistical Modeling & Machine Learning; Data Movement; Data Cleaning, Unification, Alignment; Database Performance Optimization) mapped to Software Engineer, Data Engineer and Data Scientist. Detailed infographic from DataCamp.]
  • 9. Reality of modern enterprise data scientists: they are constantly and idiosyncratically “fixing” the core data. [Chart: Data Scientist Survey by Figure Eight]
  • 10. Empowering data consumers is core to success of industry disruptors: “One of the primary goals of the platform is to enable other teams to focus on business logic, making experimentation, implementation, operation of stream processing jobs easy. By having a platform to abstract the “hard stuff”, removing complexities away from users, this would unleash broader team agility and product innovations.”
  • 12. Traditional “Methods” for data engineering. All necessary, but none alone is sufficient to solve broader problems ● Standardization -- one schema to rule them all ● Aggregation -- tends to create more/bigger silos ● Federation -- always creates significant query performance challenges ● Master Data Management -- “deviations” difficult to handle and too deterministic ● Rationalize Systems -- single vendor = radical compromise ● Throw Bodies at it -- expensive, time-consuming, ineffective & inconsistent
  • 13. Human/behavioral challenges often primary bottleneck ● Afraid to share data ○ Due to data quality (worry about being judged or having to take on the responsibility of fixing the data consumers’ requests) ● Hoarding data ○ A method of organizational control or job preservation ● Obscuring data complexity ○ Failure to embrace the complexity, diversity, and idiosyncrasy of data generated in a large enterprise ● Limiting access to a small number of users ○ A method of control or as a reflection of insecurity of data quality
  • 14. Traditional companies have significant “legacy drag coefficient”. They manage data from their business systems more as “exhaust” than “asset”, which leads to “significant data debt”. Problem: Thousands of systems generating data every day that were built over decades to support business processes - idiosyncratic to that time/context. Data is idiosyncratic to each system - creates fundamental “data disconnect” and “data decay”. Data debt from constant change/entropy: ● Restructuring ● Leadership Changes ● Politics ● Dynamic Schema DBs - Mongo et al ● “Data Hoarding” ● Legacy Burden ● M&A. Result: “Random Data Salad”. Consequences: 1. Too much time spent on data prep vs. analysis / action. 2. High failure rate of BI / analytics projects. 3. Game-changing initiatives deemed “impossible” and never start.
  • 15. Modern, Open Data Engineering Ecosystem (aka DataOps). [Diagram: Sources (tabular): Internal databases, Internal apps / systems, External endpoints, Internal files, External files. Ecosystem components: Catalog & Crawling; Movement & Automation; Mastering & Quality; Storage & Compute; Governance, Privacy & Policy; Publishing & Versioning; Feedback & Usage. Consumption endpoints: Analytics, Source Remediation, Source / Cloud Migration, Custom Apps. Consumers: Citizens, Analysts, Data Scientists, Developers.]
  • 16. Modern Enterprise Data Engineering Principles ≈ “DataOps” (People, Process, Tools) ● Cloud First ● Continuous (assume data will change) ● Agile (deploy quickly and iteratively) ● Highly Automated - automate whenever possible ● Open/Best of Breed (not one platform/vendor) ● Bi-Directional (Feedback) ● Collaborative (Humans at the Core) ● Service Oriented (clear endpoints for data) ● Loosely Coupled (RESTful Interfaces, Table(s) In/Out) ● Both aggregated AND federated storage ● Both batch AND streaming ● Lineage/Provenance is essential ● Scale Out/Distributed. [Diagram: Sources (Internal Tabular Data, External Tabular Data) flow to Consumers (Citizens, Analysts, Data Scientists, Developers).]
  • 17. Reality of Data Ecosystem/Landscape: EXTREME NOISE
  • 19. Adapting to change - Telecommuting/WFH/Remote working is not a new concept. How many employees work from home? Regular work-at-home has grown 173% since 2005, 11% faster than the rest of the workforce. [Global Workplace Analytics’ analysis of 2018 ACS data] How many people could work-from-home? ● 56% of employees have a job where at least some of what they do could be done remotely [Global Workplace Analytics analysis of BLS data, 2017] ● 62% of employees say they could work remotely [Citrix 2019 poll] ● Studies repeatedly show desks are vacant 50-60% of the time. [Chart: percentage of people who work-at-home by industry; Global Workplace Analytics’ special analysis of 2016 ACS data]
  • 20. Key People/Personas in the Modern Data Ecosystem. [Diagram labels: Data Suppliers, Data Engineering, Data Consumers; CIO, Source Owner, DBA, IT Professional, CDO, Data Engineer, Curator, Steward, Business owners and other CxOs, Data Scientist, Data Analyst, Data Citizen, Developer, ELT Professional]
  • 21. Key Roles in next-gen Data Ecosystem (each entry lists Role, Goals, Tools; the slide groups the roles under Consumers, Preparers, Suppliers)
      ● Citizen. Goals: Use data to make business decisions. Tools: Viz, CRM, Excel, PowerPoint, Word, Web Search
      ● Analyst. Goals: Deliver insights to the business, typically through dashboards and reports. Tools: Viz, Excel, SSDP, Web Search
      ● Scientist. Goals: Deliver insights to the business, typically through models and algorithms. Tools: R, Python, SAS, SSDP
      ● Developer. Goals: Build applications which leverage corporate data. Tools: Python, Java, JS, SQL, REST
      ● Engineer. Goals: Deliver and manage data pipelines. Tools: ETL, SQL
      ● Curator. Goals: Ensure consumers have the data they need, in the form they need it. Tools: Data mastering tools
      ● Steward. Goals: Use feedback from consumers to improve data broadly. Tools: Data Feedback Tools
      ● Source Owner. Goals: Define and manage purpose, processes (data creation, consumption) & users (i.e., access) of the data source. Tools: EDW, SQL, ERWin, LDAP, SAP
  • 23. Useful comparison between CDO and CFO (table stakes). CHIEF FINANCIAL OFFICER ● What money do we have? ● Where did it come from? ● Where is it going and why? Long term goal: Return on Assets. CHIEF DATA OFFICER ● What data do we have? ● Where does it come from? ● Who consumes it and why? Long term goal: Return on Data.
  • 24. How Do I Start? You may have already started... ● Leverage existing mastered data as ground truth ● Keep the best parts of your MDM, just enhance the Mastering capability ...But if you haven’t… ● Find a data-rich, analytically valuable problem for which fragmented data and knowledge present a challenge ...Either way, it’s essential to keep an agile... ● ...Mindset - focus on quick wins that have been beyond reach, then build ● ...Skillset - engage the data experts at their current skill level, let machines do the rest ● ...Toolset - simple, collaborative data curation, optimally in the tools they already use
  • 26. What are Transformational Analytic Outcomes? Question: How many customers do we have? 212 Sources (tables) - mostly SAP ● Enterprises have hundreds of source systems ● Sources must be combined, consolidated, and classified ● These lists are building blocks for transformational analytics. [Charts: Before / After]
  • 27. What are Transformational Analytic Outcomes? Question: What is our customer distribution by sales totals? ● Analytics begin with sell more and/or spend less ● Transformational analytics aren’t new, they are broader ● Business wants speed and up-to-date information ● Data variety skews answers, creating misinformation instead of clarity. [Charts: Before / After]
  • 29. Why now? 7 years ago, we needed data scientists! But now that we have them - where do they get their data? [Image: “Data Scientist: The Sexiest Job of the 21st Century”]
  • 30. Today: we have data scientists! (and want to do cool AI stuff) [Chart: Data Scientist Jobs, Indeed.com, % of all postings]
  • 31. Unique Moment in Time: Enterprise Data as an Asset. Latent Opportunity of Enterprise “Data As An Asset” ● Enterprise Migration to the Cloud ● Generational Change in Enterprise Data Management ● Fear of disruption by “Data Natives” ● Low Hanging Analytical Fruit ● “Competing on Analytics”/Strategic Imperative ● Popularity of AI but really of Core Data Quality ● Inability of traditional tech to scale ● Lack of innovation from old vendors ● Maturing Big Data Tech (HDFS/Lakes) ● Democratization of analytics ● Rise of the CDO ● Decades of treating data as operational exhaust ● Deeply Fragmented/Siloed Data Environments ● Inability to leverage new sources - esp external ● “AI Cart” before the “Data Horse” ● Significant “lift & shift” opportunity ● Potential for behavioral changes ● New infra good/secure enough. Now that we’ve established Data Science as a critical component of the enterprise, it’s time for each enterprise in the Global 2000 to build their data engineering muscles to enable them to “compete on analytics” over the coming decades.
  • 32. What NOT to do ● Avoid “boil the ocean”/“waterfall” (projects measured in years/quarters) ○ Build rational long term infra while delivering real analytic value along the way ● Single “Platform”: Don’t overestimate what a single piece of software can do ○ Focus on a thoughtfully designed ecosystem of loosely coupled best of breed tools ● Single Vendor: Don’t overestimate what a single vendor can do ○ Align vendors with APIs and expectations that they MUST work together ● Don’t underestimate the effort required to make FOSS work ○ Just because Google does it doesn’t mean you can do it ● Don’t underestimate human/behavioral challenges with data ○ Most often the reasons that projects fail/stall are human/behavioral ● Avoid “Data Engineering/Science Hubris” ○ I Data - therefore I am
  • 33. Thank You For Joining this Webinar. Questions? Email: suki.dhuphar@tamr.com. You’ll receive exclusive access to Tamr’s new guide.
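
Slide 26 asks “How many customers do we have?” and notes that sources must be combined, consolidated, and classified before the answer is reliable. A minimal sketch of that idea follows. It is not from the deck: the records, source names, suffix list and the rule-based `match_key` function are all hypothetical, and a simple normalization rule stands in for whatever mastering approach a team actually uses.

```python
from collections import defaultdict

# Hypothetical customer rows pulled from three source tables (illustrative data only).
RECORDS = [
    {"source": "sap_eu", "name": "Acme Corp.", "country": "DE"},
    {"source": "sap_us", "name": "ACME Corporation", "country": "DE"},
    {"source": "crm", "name": "acme corp", "country": "DE"},
    {"source": "crm", "name": "Globex Inc", "country": "US"},
    {"source": "sap_us", "name": "Globex, Inc.", "country": "US"},
    {"source": "sap_eu", "name": "Initech", "country": "GB"},
]

# Assumed list of legal-form words to ignore when comparing names.
LEGAL_SUFFIXES = {"corp", "corporation", "inc", "ltd", "llc", "gmbh"}


def match_key(record):
    """Build a crude matching key: lowercased name without punctuation
    or legal suffixes, paired with the country code."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in record["name"].lower())
    core = [word for word in cleaned.split() if word not in LEGAL_SUFFIXES]
    return (" ".join(core), record["country"])


def consolidate(records):
    """Group records from all sources that share the same matching key."""
    clusters = defaultdict(list)
    for record in records:
        clusters[match_key(record)].append(record)
    return dict(clusters)


# "Before": counting distinct raw names across sources overstates the total.
naive_count = len({record["name"] for record in RECORDS})

# "After": counting consolidated clusters.
clusters = consolidate(RECORDS)
mastered_count = len(clusters)

print(f"before consolidation: {naive_count} customers")
print(f"after consolidation:  {mastered_count} customers")
```

On this toy data the raw count is 6 and the consolidated count is 3. Real sources are far noisier than a suffix list can handle, which is why the deck stresses keeping humans in the loop and feeding consumer feedback back into curation.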