The First Step in Information Management
www.firstsanfranciscopartners.com
Produced by:
MONTHLY SERIES
Brought to you in partnership with:
June 1, 2017
Data-Centric Development
Welcome, Malcolm Chisholm
• First San Francisco Partners’ Chief Innovation Officer
• More than 25 years of experience in data management
• Areas of expertise: data-centric development methodology, data governance, master/reference data management, metadata engineering, business rules management/execution, data architecture and design
pg 2© 2017 First San Francisco Partners www.firstsanfranciscopartners.com
Polling Questions
• Do you have key data-centric projects (e.g., a Data Lake) that you are implementing or have implemented this year?
− Yes
− No
− Not sure
• Do you employ (or have you employed) Waterfall or Agile methods to manage your data-centric projects?
− Yes
− No
− Not sure
Topics for Today’s Webinar
• Data-Centric Development Defined
• The Focus on Programming in Agile Development
• How to Include a Data Focus in Agile Development
• The Focus on Programming in Big Data
• The Data-Centric Development Life Cycle
• Using Conceptual Data Modeling to Make Development Data-Centric
• Data-Centric Case Study
• Closing Remarks, Resources and Q&A
Data-Centric Development Defined
Data-Centric vs. Process-Centric Projects
[Figure: Types of Computerized Systems. Labels: Computerized Systems; Control Systems; Information Systems; Process-Centric Systems; Data-Centric Systems; Data-centric Projects; Process-centric Projects]
• Distinction between data-centric and process-centric projects
• Process-centric: traditional projects for computerized systems that automate a process, where data is a by-product
• Data-centric: focused on building a system whose purpose is purely to manage data, not to automate a business process
• Some overlap: data-centric projects involve automation (e.g., ETL), and process-centric projects involve data (e.g., data used for measuring process efficiency)
What Is a Data-Centric Development Project?
• Data-Centric Development Project: gets value out of pre-existing data, or curates data so that completely separate areas of the enterprise can derive value from it.
− Example: a Data Warehouse, which starts with pre-existing production data
− Example: a Customer Master Data Management (MDM) hub, which other applications use to get “golden records” for customers
− Example: Big Data projects for analytics
• Process-Centric Development Project: automates some aspect of the enterprise; often a manual process that is automated, or an existing automated process that is upgraded (with no focus on the data).
− Example: Point-of-sale system
− Example: Payroll system
− Example: Medical billing system
• There is always overlap: process comes into data-centric projects, and data always exists in process-centric projects. It is the overall focus that is different.
Fundamental Classes of Data-Centric Projects
• Project types that are fundamentally data-centric:
− Data Warehouses
− Data Marts
− Operational Data Stores (ODSs)
− Data Lakes
− Reference Data Management (RDM)
− Master Data Management (MDM)
Projects Have Been Run the Same Way for Many Years
[Figure: timeline by decade, 1960s to 2010s. Technology: Mainframes, PCs, Package Implementation, Distributed Computing, Internet, Cloud. Business Use Cases: Manual Process Automation, Data Warehouses / BI, MDM, Big Data]
• Over more than 50 years, different technologies have answered different use cases
• General move from process-centricity to data-centricity
• Systems Development Life Cycle (SDLC) is a 1960s-era methodology
• SDLC is still almost universally used, including within Agile
The Focus on Programming in Agile Development
Waterfall SDLC vs. Data-Centric Projects
[Figure: Waterfall Systems Development Life Cycle (SDLC) phases: Requirements, Analysis, Design, Development, Quality Assurance, Production, Post-Production]
Waterfall SDLC is a development project methodology created in the mid-1960s for process-centric projects. It is highly embedded in IT, but not well aligned to data-centric projects.
1. Waterfall presumes there is a process to be automated. But in a data-centric project, the starting point is existing production data.
2. Business Analysts expect users to state requirements. But users never understand the data at the outset.
3. Waterfall is linear. But in data-centric projects, there are true cycles of iteration as understanding of the source data evolves.
4. The Waterfall QA phase tests only functionality, not data. But in data-centric projects, data quality needs to be tested.
• And there are many other mismatches.
Agile
[Figure: three consecutive Sprints, each repeating the SDLC phases (Requirements, Analysis, Design, Development, Quality Assurance, Production, Post-Production). Other labels: Epics, User Stories, Backlog Management, Project Increments]
• Agile is more popular today, but it retains aspects of Waterfall and has no particular data-centric aspects.
What Can Go Wrong with Data-Centric Projects?
[Figure: cartoon of a Data Warehouse development project on a timeline from January to July]
− Business Analyst: “Tell me your requirements!”
− Business User: “They’re kind of like this…”
− Business User: “Hey, this report doesn’t make any sense to me!”
− Business Analyst: “It’s your problem because your requirements were bad.”
[3 Years Later: a data extract from the Data Warehouse]
− Data Scientist 1: “I just don’t understand this data.”
− Data Scientist 2: “I have no clue who to even ask.”
• These problems reflect the way that development projects are managed: the traditional SDLC.
How to Include a Data Focus in Agile Development
Mismatch of Traditional Project Methodology to Data-Centric
# | Agile / Waterfall | What Data-Centric Projects Need
1 | Oriented to automating an unautomated or partially automated process | Oriented to getting value out of existing production data or curating data for other processes to use
2 | Users are able to articulate processes reasonably well for requirements | Users typically do not know the details of the source data and may not even know where it is, or if it exists
3 | Testing focuses on whether the functionality matches requirements | Testing focuses on data quality of source data and data produced by transformations, calculations and derivations
4 | No testing artifacts are carried over into Production | Data quality rules developed in testing are put into Production for continuous data quality monitoring
5 | Knowledge gained during the project is used only for development activities within the project | Knowledge gained during the project is part of what is delivered and is used later for developing reports after Production implementation
6 | Legal questions about processes are rare | Legal, privacy and compliance concerns exist both for the curation and permitted business use of data
7 | Stakeholders are predominantly the business users who will benefit from the functionality | Stakeholders also include representatives from business areas who will use the data outside of the application, or may do so in the future, e.g., data scientists
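The comparison above says that data-centric testing targets data quality, and that the rules developed in testing move into Production for continuous monitoring. The following is a minimal sketch of that idea, not FSFP's method: the rule structure, the helper names (`not_null`, `in_range`, `failure_counts`) and the columns `customer_id` and `order_total` are all hypothetical. The point it illustrates is that one rule set can be run unchanged against a test batch and, later, against production batches.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


@dataclass(frozen=True)
class DataQualityRule:
    """A named check applied to a single record (hypothetical structure)."""
    name: str
    check: Callable[[Record], bool]


def not_null(column: str) -> DataQualityRule:
    """Rule: the column must be present and not None."""
    return DataQualityRule(
        name=f"{column} is not null",
        check=lambda rec: rec.get(column) is not None,
    )


def in_range(column: str, low: float, high: float) -> DataQualityRule:
    """Rule: the column must be numeric and within [low, high]."""
    def check(rec: Record) -> bool:
        value = rec.get(column)
        return isinstance(value, (int, float)) and low <= value <= high
    return DataQualityRule(name=f"{column} in [{low}, {high}]", check=check)


def failure_counts(rules: List[DataQualityRule],
                   records: Iterable[Record]) -> Dict[str, int]:
    """Count, per rule, how many records fail it."""
    counts = {rule.name: 0 for rule in rules}
    for rec in records:
        for rule in rules:
            if not rule.check(rec):
                counts[rule.name] += 1
    return counts


# One rule set, intended for both project testing and production monitoring.
RULES = [not_null("customer_id"), in_range("order_total", 0, 1_000_000)]

# Hypothetical test batch with two deliberate defects.
test_batch = [
    {"customer_id": 1, "order_total": 25.0},
    {"customer_id": None, "order_total": 10.0},   # missing customer_id
    {"customer_id": 3, "order_total": -5.0},      # negative total
]
report = failure_counts(RULES, test_batch)
```

In Production, the same `RULES` list could be applied to each incoming batch and the resulting counts tracked over time; how the counts are stored or alerted on is outside this sketch.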
The Focus on Programming in Big Data
Data Problems in Big Data Environments
• The technology and processes to get data into a Big Data Environment
are relatively simple
• But there are huge challenges with understanding the source data
[Figure: Sources A through E pass through ingestion into a Big Data Environment (Data Lake). Data types shown: Emails, Documents, Web Pages, XML, Relational, Flat Files, Audio, Image, Video]
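One common first step toward understanding unfamiliar source data is to profile it before designing anything downstream. The sketch below is an illustration under assumptions, not part of the webinar: the function `profile_records`, the sample source `source_a` and its columns are hypothetical. It reports, per column, how many values are missing, how many distinct values occur, and which Python types appear.

```python
from collections import Counter
from typing import Dict, Iterable, List


def profile_records(records: Iterable[Dict[str, object]]) -> Dict[str, Dict[str, object]]:
    """Summarize each column: missing count, distinct count, observed types."""
    rows: List[Dict[str, object]] = list(records)
    # Union of all column names, since source records may be ragged.
    columns = sorted({col for row in rows for col in row})
    profile: Dict[str, Dict[str, object]] = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing.
        present = [v for v in values if v is not None and v != ""]
        profile[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "types": dict(Counter(type(v).__name__ for v in present)),
        }
    return profile


# Hypothetical extract from one source, with typical inconsistencies.
source_a = [
    {"cust_id": "1001", "dob": "1968-11-04", "status": "A"},
    {"cust_id": "1002", "dob": "", "status": "A"},
    {"cust_id": 1003, "status": "X"},
]
summary = profile_records(source_a)
```

Here the profile would surface that `cust_id` mixes strings and integers and that `dob` is missing in two of three records, which are the kinds of findings that feed back into information requirements.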
Columnar Databases in Big Data Environments
• Columnar Databases are widely used in Big Data
• Their design has to mirror the queries that will be run against them, and also house the data that comes into them from the sources
• Thus target design and source data analysis are major issues
[Figure: Structure of a record in HBase: rowID, Column Family, Column Qualifier, “Timestamp”, Payload]
Examples of Column Family (each shown with the rowID Doe|1968-11-04|John):
− “CUSTOMER”
− “EMPLOYEE”
− “PURCHASER”
− …and hundreds more…
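The record structure above can be sketched as nested dictionaries keyed by row ID, column family, column qualifier and timestamp. This is a plain-Python model for illustration only and does not use any HBase client library; the helper names and the qualifiers `email` and `department` are hypothetical, while the row ID format and the column family names come from the slide.

```python
from collections import defaultdict
from typing import Dict, Optional

# row ID -> column family -> column qualifier -> timestamp -> payload
Table = Dict[str, Dict[str, Dict[str, Dict[int, bytes]]]]


def new_table() -> Table:
    """Create an empty in-memory table with the four-level key structure."""
    return defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))


def make_row_id(last_name: str, birth_date: str, first_name: str) -> str:
    """Build a composite row ID in the slide's format, e.g. Doe|1968-11-04|John."""
    return "|".join([last_name, birth_date, first_name])


def put(table: Table, row_id: str, family: str, qualifier: str,
        timestamp: int, payload: bytes) -> None:
    """Store one cell version; older versions of the same cell are kept."""
    table[row_id][family][qualifier][timestamp] = payload


def get_latest(table: Table, row_id: str, family: str,
               qualifier: str) -> Optional[bytes]:
    """Return the payload with the highest timestamp, or None if absent."""
    versions = table.get(row_id, {}).get(family, {}).get(qualifier, {})
    if not versions:
        return None
    return versions[max(versions)]


table = new_table()
row_id = make_row_id("Doe", "1968-11-04", "John")
put(table, row_id, "CUSTOMER", "email", 1, b"john@old.example")
put(table, row_id, "CUSTOMER", "email", 2, b"john@new.example")
put(table, row_id, "EMPLOYEE", "department", 1, b"Finance")
latest_email = get_latest(table, row_id, "CUSTOMER", "email")
```

The sketch makes the design pressure visible: every lookup needs the row ID and the column family up front, so both have to be chosen with the expected queries in mind.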
The Data-Centric Development Life Cycle
Introducing the Data-Centric Development Life Cycle (DCLC)
First San Francisco Partners’ DCLC:
• Recognizes the specific activities needed for a data-centric project instead of abstracting them into over-generalizations like “analysis.”
• Provides for real iterations that lead to refinement of information requirements, instead of a single requirements activity.
• Recognizes that some activities can be carried out in parallel, instead of following the linear flow of SDLC and Agile.
Some Elements of DCLC Are Needed for All Projects
[Figure: spectrum from 100% Process-centric to 100% Data-centric]
• The full DCLC is appropriate for projects that are heavily data-centric.
• However, even projects that are overwhelmingly process-centric can benefit from some elements of the DCLC.
• This is because process-centric projects create data that may be used in the future in some analytics environment (which may not even exist yet).
Questions to Ask About Process-Centric Projects
• Will the data be used for analysis outside of the system that is being built?
− e.g., Will it be fed into a Data Warehouse or Data Lake?
− e.g., Will it be sold?
• Are there stakeholders in the data that the system will produce who are not the business sponsors or controlled by the business sponsors?
− e.g., Data Scientists or Marketing
• Are any data feeds needed as inputs?
− Versus only data entry
• Does data quality matter to the business sponsors of the project?
− Are they bringing this concern to the project, rather than mildly agreeing with outside suggestions?
Any significant “yes” responses mean you should consider deploying elements of the DCLC on the project.
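The four questions above can be captured as a small screening record for a project. This is a sketch under assumptions: the class and field names are hypothetical, and it simplifies the slide's “significant yes” to any “yes”, leaving the judgment of significance to the reviewer.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProjectScreening:
    """Answers to the four screening questions (field names are hypothetical)."""
    data_used_outside_system: bool          # e.g., fed to a Data Warehouse or Data Lake, or sold
    outside_data_stakeholders: bool         # e.g., Data Scientists or Marketing
    needs_input_data_feeds: bool            # data feeds, as opposed to data entry only
    sponsors_care_about_data_quality: bool  # a concern the sponsors themselves raise

    def yes_answers(self) -> List[str]:
        """Names of the questions answered 'yes', in declaration order."""
        return [name for name, value in vars(self).items() if value]

    def consider_dclc_elements(self) -> bool:
        """Simplification: any 'yes' is treated as significant."""
        return bool(self.yes_answers())


# Hypothetical process-centric project whose data will be reused elsewhere.
point_of_sale = ProjectScreening(
    data_used_outside_system=True,
    outside_data_stakeholders=True,
    needs_input_data_feeds=False,
    sponsors_care_about_data_quality=False,
)
# Hypothetical project with no outside use of its data.
internal_tool = ProjectScreening(False, False, False, False)
```

Listing which questions came back “yes” (via `yes_answers`) is more useful than the boolean alone, since it points to which DCLC elements may be worth adopting.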
Using Conceptual Data Modeling to Make Development Data-Centric
What is Semantics
• Terms: Terminology, i.e., what words are used in what contexts (subject fields) by what communities. Needs some level of Ontology for contexts. Not done much in the USA.
• Concepts: Definitions and allied metadata. Requires some level of Ontology to make distinctions clear. Often not done in finance for Reference Data (Codes). Often done poorly.
• Classifications, Taxonomies: Groupings of concepts based on common characteristics, or a particular management need. The problem of how to actually do classification is often not addressed.
• Hierarchies: Systems of relations between individuals, not concepts. Problem of mixing different relation types within the hierarchy, e.g., Legal vs. Risk vs. Sales.
• Rules: Calculations, Derivations, Constraints among concepts. These are not Definitions, but are often confused with them.
• Ontologies: A particular view of financial reality, composed of all the other items listed here plus more relations. A model of business information without any thought as to how it will be stored as data.
All this plays a role in the early stages of the Data-Centric Development Life Cycle.
Subject-Area Models
The highest level of conceptual model, but very useful in Data Discovery
Data-Centric Case Study
High-Level Overview
• Shared with enterprise leadership how data management and FSFP’s Data-Centric Development Life Cycle methodology could positively impact a major Data Warehouse project and fill a critical project gap without causing extra work
• Used momentum and resources of project to advance maturity of Data Governance practices
How We Got Traction
• DCLC tied directly into project deliverables
• Momentum coming from project deadlines
• Integration of clear governance goals into tactical deliverables
• Active participation of Data Governance manager
• Inclusion of data analysis in requirements gathering
Results
• Clear project data requirements that will enable re-use of data in the new reporting environment
• Cross-functional agreement on data definitions and concepts
• Business glossary ready for go-live
• Clear business ownership of data
• Data Governance team positioned for success
Closing Remarks, Resources and Q&A
Webinar Takeaways and Resources
• Takeaways
− Identify Data-centric projects vs. Process-centric ones
− Consider taking a Data-centric approach for Data-centric projects
• Suggested resources:
− DCLC articles on the FSFP blog: firstsanfranciscopartners.com/blog
− DCLC two-page overview: firstsanfranciscopartners.com/DCLC-overview
Questions?
MONTHLY SERIES
Thank you!
Please join us Thursday, July 6 for the
“Governing Quality Analytics” webinar.
Malcolm Chisholm @MDChisholm
malcolm@firstsanfranciscopartners.com
John Ladley @jladley
john@firstsanfranciscopartners.com

Data Governance Trends - A Look Backwards and ForwardsDATAVERSITY
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayDATAVERSITY
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best PracticesDATAVERSITY
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?DATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best PracticesDATAVERSITY
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

Plus de DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Dernier

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 

Dernier (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

The First Step in Data Management

  • 1. Data-Centric Development. The First Step in Information Management, Monthly Series, June 1, 2017. www.firstsanfranciscopartners.com
  • 2. Welcome, Malcolm Chisholm
      - First San Francisco Partners’ Chief Innovation Officer
      - More than 25 years of experience in data management
      - Areas of expertise: data-centric development methodology, data governance, master/reference data management, metadata engineering, business rules management/execution, data architecture and design
      pg 2 © 2017 First San Francisco Partners www.firstsanfranciscopartners.com
  • 3. Polling Questions
      - Do you have key data-centric projects (e.g., a Data Lake) that you are implementing or have implemented this year? (Yes / No / Not sure)
      - Do you employ (or have you employed) Waterfall or Agile methods to manage your data-centric projects? (Yes / No / Not sure)
  • 4. Topics for Today’s Webinar
      - Data-Centric Development Defined
      - The Focus on Programming in Agile Development
      - How to Include a Data Focus in Agile Development
      - The Focus on Programming in Big Data
      - The Data-Centric Development Life Cycle
      - Using Conceptual Data Modeling to Make Development Data-Centric
      - Data-Centric Case Study
      - Closing Remarks, Resources and Q&A
  • 6. Data-Centric vs. Process-Centric Projects
      [Diagram: Types of Computerized Systems. Labels: Computerized Systems; Control Systems; Information Systems; Process-Centric Systems; Data-Centric Systems; Process-centric Projects; Data-centric Projects]
      - Distinction between data-centric and process-centric projects
      - Process-centric: traditional projects for computerized systems that automate a process and where data is a by-product
      - Data-centric: focused on building a system whose purpose is purely to manage data, not to automate a business process
      - Some overlap: data-centric projects will involve automation (e.g., ETL) and process-centric projects will involve data (e.g., data used for measuring process efficiency)
  • 7. What is a Data-Centric Development Project
      - Data-Centric Development Project: gets value out of pre-existing data, or curates data so that completely separate areas of the enterprise can derive value from it.
          - Example: Data Warehouse, which starts with pre-existing production data
          - Example: Customer Master Data Management (MDM) hub, which other applications will use to get “golden records” for customers
          - Example: Big Data projects for analytics
      - Process-Centric Development Project: automates some aspect of the enterprise; often a manual process that is automated, or an existing automated process that is upgraded (no focus on the data).
          - Example: Point-of-sale system
          - Example: Payroll system
          - Example: Medical billing system
      - There is always overlap: process comes into data-centric projects, and data always exists in process-centric projects. It is the overall focus that is different.
  • 8. Fundamental Classes of Data-Centric Projects
      - Project types that are fundamentally data-centric:
          - Data Warehouses
          - Data Marts
          - Operational Data Stores (ODSs)
          - Data Lakes
          - Reference Data Management (RDM)
          - Master Data Management (MDM)
  • 9. Projects Have Been Run the Same Way for Many Years
      [Timeline figure, 1960s to 2010s, with rows for Technology and Business Use Cases. Labels: Mainframes; PCs; Package Implementation; Distributed Computing; Internet; Cloud; Manual Process Automation; Data Warehouses / BI; MDM; Big Data]
      - Over more than 50 years, different technology answered different use cases
      - General move from process-centricity to data-centricity
      - Systems Development Life Cycle (SDLC) is a 1960s-era methodology
      - SDLC is still almost universally used, including in Agile
  • 10. The Focus on Programming in Agile Development
  • 11. Waterfall SDLC vs. Data-Centric Projects
      [Diagram: Waterfall Systems Development Life Cycle (SDLC). Phase labels as extracted: Requirements Analysis Design Development Quality Assurance Production Post-Production]
      1. Waterfall presumes there is a process to be automated. But in a data-centric project, the starting point is existing production data.
      2. Business Analysts expect users to state requirements. But users never understand the data at the outset.
      3. Waterfall is linear. But with data-centric projects, there are true cycles of iteration as understanding of the source data evolves.
      4. The Waterfall QA phase only tests functionality, not data. But with data-centric projects, data quality needs to be tested.
      - And there are many other mismatches. Waterfall SDLC is a development project methodology created in the mid-1960s for process-centric projects. It is highly embedded in IT, but not well aligned to data-centric projects.
  • 12. Agile
      [Diagram: three Sprints, each repeating the phase labels Requirements Analysis Design Development Quality Assurance Production Post-Production. Other labels: Epics; User Stories; Backlog Management; Project Increments]
      - Agile is more popular today, but retains aspects of Waterfall and has no particular data-centric aspects.
  • 13. What Can Go Wrong with Data-Centric Projects?
      [Cartoon: Data Warehouse Development Project, timeline January to July]
      - Business Analyst: “Tell me your requirements!”
      - Business User: “They’re kind of like this…”
      - Business User: “Hey, this report doesn’t make any sense to me!”
      - Business Analyst: “It’s your problem because your requirements were bad”
      [Cartoon: 3 Years Later, Data Extract from Data Warehouse]
      - Data Scientist 1: “I just don’t understand this data”
      - Data Scientist 2: “I have no clue who to even ask”
      - These problems reflect the way that development projects are managed: the traditional SDLC.
  • 14. How to Include a Data Focus in Agile Development
  • 15. Mismatch of Traditional Project Methodology to Data-Centric
      1. Agile / Waterfall: Oriented to automating an unautomated or partially automated process
         What data-centric projects need: Oriented to getting value out of existing production data or curating data for other processes to use
      2. Agile / Waterfall: Users are able to articulate processes reasonably well for requirements
         What data-centric projects need: Users typically do not know the details of the source data and may not even know where it is, or if it exists
      3. Agile / Waterfall: Testing focuses on whether the functionality matches requirements
         What data-centric projects need: Testing focuses on data quality of source data and of data produced by transformations, calculations and derivations
      4. Agile / Waterfall: No testing artifacts are carried over into Production
         What data-centric projects need: Data quality rules developed in testing are put into Production for continuous data quality monitoring
      5. Agile / Waterfall: Knowledge gained during the project is used only for development activities within the project
         What data-centric projects need: Knowledge gained during the project is part of what is delivered and is used later for developing reports after Production implementation
      6. Agile / Waterfall: Legal questions about processes are rare
         What data-centric projects need: Legal, privacy and compliance concerns exist both for the curation and the permitted business use of data
      7. Agile / Waterfall: Stakeholders are predominantly the business users who will benefit from the functionality
         What data-centric projects need: Stakeholders also include representatives from business areas who will use the data outside of the application, or may do so in the future, e.g. data scientists
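
Rows 3 and 4 of the comparison above describe data quality rules that are written during testing and then kept running in Production. The sketch below is one possible way to picture that idea, not the method used by FSFP: the rule names, the record fields (`customer_id`, `birth_date`) and the helper functions are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, Iterable, List


def is_iso_date(value: object) -> bool:
    """Return True if value is a string in ISO format (YYYY-MM-DD)."""
    try:
        date.fromisoformat(value)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return False
    return True


@dataclass(frozen=True)
class QualityRule:
    """A named data quality check applied to one record (hypothetical type)."""
    name: str
    check: Callable[[dict], bool]


def run_rules(rows: Iterable[dict], rules: List[QualityRule]) -> Dict[str, int]:
    """Count, per rule, how many records fail the check."""
    failures = {rule.name: 0 for rule in rules}
    for row in rows:
        for rule in rules:
            if not rule.check(row):
                failures[rule.name] += 1
    return failures


# Hypothetical rules: the same list can be used by a test suite during
# development and by a scheduled monitoring job after go-live.
RULES = [
    QualityRule("customer_id_present", lambda r: bool(r.get("customer_id"))),
    QualityRule("birth_date_iso", lambda r: is_iso_date(r.get("birth_date"))),
]

if __name__ == "__main__":
    sample = [
        {"customer_id": "C1", "birth_date": "1968-11-04"},
        {"customer_id": "", "birth_date": "04/11/1968"},
    ]
    print(run_rules(sample, RULES))
```

Keeping the rules in one shared list is the design point: nothing is rewritten when the project moves from testing to monitoring, only the caller changes.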
  • 17. Data Problems in Big Data Environments
      [Diagram: Source A, Source B, Source C, Source D and Source E feed, through Ingestion, a Big Data Environment (Data Lake). Data types shown: Emails; Documents; Web Pages; XML; Relational; Flat Files; Audio; Image; Video]
      - The technology and processes to get data into a Big Data Environment are relatively simple
      - But there are huge challenges with understanding the source data
  • 18. Columnar Databases in Big Data Environments
      - Columnar Databases are used a lot in Big Data
      - They have to be organized around the queries that will be run against them, and to house the data that comes into them from the sources
      - Thus Target design and Source data analysis are huge issues
      [Diagram: Structure of a record in HBase: rowID; Column Family; Column Qualifier; “Timestamp”; Payload. Example rowID: Doe|1968-11-04|John. Examples of Column Family: “CUSTOMER”, “EMPLOYEE”, “PURCHASER” …and hundreds more…]
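
The record structure named on the slide above (rowID, Column Family, Column Qualifier, Timestamp, Payload) can be pictured with a small in-memory model. This is a plain-Python sketch under stated assumptions, not HBase client code: `CellKey`, `put` and `latest` are hypothetical names, and the `email` qualifier and its values are invented for illustration. Only the rowID `Doe|1968-11-04|John` and the column family names come from the slide.

```python
from typing import Dict, NamedTuple, Optional


class CellKey(NamedTuple):
    """Coordinates of one stored value, following the slide's record layout."""
    row_id: str
    column_family: str
    column_qualifier: str
    timestamp: int


Table = Dict[CellKey, bytes]


def put(table: Table, row_id: str, family: str, qualifier: str,
        timestamp: int, payload: bytes) -> None:
    """Store a payload; a new timestamp adds a version instead of overwriting."""
    table[CellKey(row_id, family, qualifier, timestamp)] = payload


def latest(table: Table, row_id: str, family: str,
           qualifier: str) -> Optional[bytes]:
    """Return the payload with the highest timestamp for one column, if any."""
    versions = [
        (key.timestamp, payload)
        for key, payload in table.items()
        if (key.row_id, key.column_family, key.column_qualifier)
        == (row_id, family, qualifier)
    ]
    if not versions:
        return None
    return max(versions, key=lambda version: version[0])[1]


if __name__ == "__main__":
    table: Table = {}
    row = "Doe|1968-11-04|John"  # rowID taken from the slide
    put(table, row, "CUSTOMER", "email", 1, b"old@example.com")  # hypothetical data
    put(table, row, "CUSTOMER", "email", 2, b"new@example.com")
    print(latest(table, row, "CUSTOMER", "email"))
    print(latest(table, row, "EMPLOYEE", "email"))
```

The same person can appear under many column families, and every lookup has to name the family and qualifier up front. That is why the slide treats target design and source data analysis as the hard part.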
  • 20. Introducing the Data-Centric Development Life Cycle (DCLC)
      First San Francisco Partners’ DCLC:
      - Recognizes specific activities needed for a data-centric project instead of abstracting them into overgeneralizations like “analysis.”
      - Provides for real iterations that lead to refinement of information requirements, instead of a single requirements activity.
      - Understands that some activities can be carried out in parallel, instead of the linear flow of SDLC and Agile.
  • 21. Some Elements of DCLC are Needed for All Projects
      [Diagram: spectrum from 100% Process-centric to 100% Data-centric]
      - The full DCLC is appropriate for projects that are heavily data-centric.
      - However, even projects that are overwhelmingly process-centric can benefit from some elements of the DCLC.
      - This is because process-centric projects will be creating data that may be used in the future in some analytics environment (that may not even exist yet).
  • 22. Questions to Ask About Process-Centric Projects
      - Will the data be used for analysis outside of the system that is being built?
          - e.g., Will it be fed into a Data Warehouse or Data Lake?
          - e.g., Will it be sold?
      - Are there stakeholders in the data that the system will produce who are not the business sponsors or controlled by the business sponsors?
          - e.g., Data Scientists or Marketing
      - Are any data feeds needed as inputs?
          - Versus only data entry
      - Does data quality matter to the business sponsors of the project?
          - Are they bringing this concern to the project, rather than mildly agreeing with outside suggestions?
      Any significant “yes” responses mean you should consider deploying elements of the DCLC on the project.
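
The four screening questions above can be recorded as a simple checklist. The sketch below is only an illustration: the class and field names are hypothetical, and it treats any “yes” as a trigger, whereas the slide leaves the judgment of what counts as a significant “yes” to the project team.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ProjectScreening:
    """Answers to the four screening questions (field names are hypothetical)."""
    data_used_outside_system: bool
    external_data_stakeholders: bool
    needs_inbound_data_feeds: bool
    sponsors_care_about_data_quality: bool


def yes_answers(screening: ProjectScreening) -> List[str]:
    """List the questions answered 'yes', in the order they were asked."""
    return [name for name, answer in asdict(screening).items() if answer]


def consider_dclc(screening: ProjectScreening) -> bool:
    """Simplification: any 'yes' suggests considering elements of the DCLC."""
    return bool(yes_answers(screening))


if __name__ == "__main__":
    payroll = ProjectScreening(
        data_used_outside_system=True,
        external_data_stakeholders=False,
        needs_inbound_data_feeds=False,
        sponsors_care_about_data_quality=True,
    )
    print(yes_answers(payroll))
    print(consider_dclc(payroll))
```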
  • 23. Using Conceptual Data Modeling to Make Development Data-Centric
  • 24. What is Semantics
      - Terms: Terminology, i.e. what words are used in what contexts (subject fields) by what communities. Needs some level of Ontology for contexts. Not done much in the USA.
      - Concepts: Definitions and allied metadata. Requires some level of Ontology to make distinctions clear. Often not done in finance for Reference Data (Codes). Often done poorly.
      - Classifications, Taxonomies: Groupings of concepts based on common characteristics, or a particular management need. The problem of how to actually do classification is often not addressed.
      - Hierarchies: Systems of relations between individuals, not concepts. Problem of mixing different relation types within the hierarchy, e.g. Legal vs. Risk vs. Sales.
      - Rules: Calculations, Derivations, Constraints among concepts. These are not Definitions, but are often confused with them.
      - Ontologies: A particular view of financial reality, composed of all the other items listed here plus more relations. A model of business information without any thought as to how it will be stored as data.
      All this plays a role in the early stages of the Data-Centric Development Life Cycle.
  • 25. Subject-Area Models
      - The highest level of conceptual model, but very useful in Data Discovery
  • 27. High-Level Overview
      - Showed enterprise leadership how data management and FSFP’s Data-Centric Development Life Cycle methodology could positively impact a major Data Warehouse project and fill a critical project gap without causing extra work
      - Used the momentum and resources of the project to advance the maturity of Data Governance practices
  • 28. How We Got Traction
      - DCLC tied directly into project deliverables
      - Momentum coming from project deadlines
      - Integration of clear governance goals into tactical deliverables
      - Active participation of the Data Governance manager
      - Inclusion of data analysis in requirements gathering
  • 29. Results
      - Clear project data requirements that will enable re-use of data in the new reporting environment
      - Cross-functional agreement on data definitions and concepts
      - Business glossary ready for go-live
      - Clear business ownership of data
      - Data Governance team positioned for success
  • 31. Webinar Takeaways and Resources
      - Takeaways
          - Identify Data-centric projects vs. Process-centric ones
          - Consider taking a Data-centric approach for Data-centric projects.
      - Suggested resources:
          - DCLC articles on the FSFP blog: firstsanfranciscopartners.com/blog
          - DCLC two-page overview: firstsanfranciscopartners.com/DCLC-overview
  • 32. Questions?
  • 33. Thank you! Please join us Thursday, July 6 for the “Governing Quality Analytics” webinar.
      - Malcolm Chisholm, @MDChisholm, malcolm@firstsanfranciscopartners.com
      - John Ladley, @jladley, john@firstsanfranciscopartners.com