Our architecturally solid stool requires three legs: people, process, and technologies. This webinar looks at the most misunderstood of these three components: technology. While most organizations begin with technologies, it turns out that technologies are the last component that should be considered. This webinar will survey a range of technologies that can be used to increase the productivity of Data Management efforts. The goal is to invest in as little infrastructure as possible while still achieving business/program objectives. This program’s learning objectives include:
• Understanding technology considerations
• Appreciating the overview of data technologies, and then specifically:
– CASE technologies
– Repositories
– Profiling/discovery tools
– Data Quality engineering tools
• Appreciating the complete Data Quality life cycle
DataEd Slides: Leveraging Data Management Technologies
1. Copyright 2020 by Data Blueprint
Leveraging Data Management Technologies
(Unlocking Business Value)
Peter Aiken, PhD
• DAMA International President 2009-2013 / 2018
• DAMA International Achievement Award 2001 (with Dr. E. F. "Ted" Codd)
• DAMA International Community Award 2005
• I've been doing this a long time
• My work is recognized as useful
• Associate Professor of IS (vcu.edu)
• Founder, Data Blueprint (datablueprint.com)
• DAMA International (dama.org)
• CDO Society (iscdo.org)
• 11 books and dozens of articles
• Experienced w/ 500+ data management practices worldwide
• Multi-year immersions
– US DoD (DISA/Army/Marines/DLA)
– Nokia
– Deutsche Bank
– Wells Fargo
– Walmart …
Book: Monetizing Data Management: Unlocking the Value in Your Organization’s Most Important Asset. Peter Aiken with Juanita Billings, foreword by John Bottega.
By the end of this session, you should have a better understanding of data management technologies in terms of:
• Technology Considerations
• Data Technology Architecture
• CASE Tools
• Repositories
• Profiling/Discovery Tools
• Data Quality Engineering Tools
• Data Life Cycle
• Other Technologies:
– Servers, EII Technologies, Portals, Conversion Tools
Approaching Data Management Technologies
Blind Persons and the Elephant
http://www.dailymirror.lk/print/opinion/editorial-we-need-to-become-channels-of-peace/172-27164
It is like a fan!
It is like a snake!
It is like a wall!
It is like a rope!
It is like a tree!
11. Unrefined data management definition
Sources ➜ Data Management ➜ Uses
More refined data management definition
Sources ➜ Data Management ➜ Reuse
12. Better still data management definition
Sources ➜ Use ➜ Reuse
Data Governance
Data Assets/Ethical Framework
Standard data
Data supply
Data literacy
Making a Better Data Sandwich
• Data literacy
• Standard data
• Data supply
Quality engineering/architecture work products do not happen accidentally!
This cannot happen without engineering and architecture!
14. Technologies by themselves are a one-legged stool
Success Requires a 3-Legged Stool
People
Process
Technology
16. Supply/demand for data talent
https://www.logianalytics.com/bi-trends/3-keys-understanding-data/
Growth of Data vs. Growth of Data Analysts
• Stored data accumulating at 28% annual growth rate
• Data analysts in workforce growing at 5.7% growth rate
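To make the gap between the two growth rates concrete, here is a minimal arithmetic sketch. It assumes both rates are annual and stay constant, which the slide does not claim; the function names are illustrative only.

```python
import math

DATA_GROWTH = 0.28      # stored data, annual growth rate (from the slide)
ANALYST_GROWTH = 0.057  # data analysts in workforce (from the slide; assumed annual)


def data_per_analyst_ratio(years: int) -> float:
    """Relative volume of data per analyst after `years`, starting from 1.0."""
    return ((1 + DATA_GROWTH) / (1 + ANALYST_GROWTH)) ** years


def doubling_time_years() -> float:
    """Years until data per analyst doubles, assuming constant rates."""
    annual_factor = (1 + DATA_GROWTH) / (1 + ANALYST_GROWTH)
    return math.log(2) / math.log(annual_factor)


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(years, round(data_per_analyst_ratio(years), 2))
    print("doubling time (years):", round(doubling_time_years(), 1))
```

Under these assumptions the amount of data per analyst grows by about 21% a year and doubles in a little over three and a half years.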
R. Buckminster Fuller
17. https://en.wikipedia.org/wiki/Moore%27s_law#/media/File:Moore%27s_Law_Transistor_Count_1971-2016.png
Postpone technology investments as long as possible
The hardest part of requirements is not doing design
Vendor Hype
• CIOs/CDOs feel pressure
• Vendor/project promise auditing
• No understanding of hype curve
18. Who wrote this … ?
• "In considering any new subject, there is frequently a tendency first to overrate what we find to be already interesting or remarkable, and secondly - by a sort of natural reaction - to undervalue the true state of the case."
– Lady Augusta Ada King, Countess of Lovelace (1815 – 1852)
– (aka) Ada Lovelace, daughter of Lord Byron
– Publisher of the first computer program
19. Gartner Five-phase Hype Cycle
http://www.gartner.com/technology/research/methodologies/hype-cycle.jsp
• Technology Trigger: A potential technology breakthrough kicks things off. Early proof-of-concept stories and media interest trigger significant publicity. Often no usable products exist and commercial viability is unproven.
• Peak of Inflated Expectations: Early publicity produces a number of success stories, often accompanied by scores of failures. Some companies take action; many do not.
• Trough of Disillusionment: Interest wanes as experiments and implementations fail to deliver. Producers of the technology shake out or fail. Investments continue only if the surviving providers improve their products to the satisfaction of early adopters.
• Slope of Enlightenment: More instances of how the technology can benefit the enterprise start to crystallize and become more widely understood. Second- and third-generation products appear from technology providers. More enterprises fund pilots; conservative companies remain cautious.
• Plateau of Productivity: Mainstream adoption starts to take off. Criteria for assessing provider viability are more clearly defined. The technology’s broad market applicability and relevance are clearly paying off.
Hype Cycle for Data Management
20. Hype Cycle for Information Governance and Master Data Management
Hype Cycle for Analytics and Business Intelligence
24. Tools and Methods Are Required!
25. CASE Tools
Computer Aided Software/Systems Engineering Tools
• Scientific application of a set of tools and methods to a software system, which is meant to result in high-quality, defect-free, and maintainable software products
• Refers to methods for the development of information systems together with automated tools that can be used in the software development process
• CASE functions include analysis, design, and programming
Source: http://en.wikipedia.org/wiki/
CASE-based Support
http://www.visible.com
28. CASE Tool: "Taxonomy"
This includes:
• Senders
– flows from the CASE effort that can inform the re-architecting effort.
• Receivers
– flows from the project that can inform the CASE effort.
• Senders and receivers
– some elements, such as restructuring and reengineering, are both senders and receivers.
Changing Model of CASE Tool Usage
• From: everything must "fit" into one CASE technology; CASE tool-specific methods and technologies; limited access from outside the CASE technology environment; limited additional metadata use
• To: a variety of CASE-based methods and technologies can access and update the metadata; metadata integration; additional metadata uses accessible via web, portal, XML, RDBMS
The Biggest Challenges to Data Management Practice
30. One Eighth of the Data Management Spend
Metadata: 12% of time spent (the remaining 88% goes to the rest of data management)
• Metadata management is still a nascent discipline that only represents 12% of the time spent in data management
Repositories have been difficult to "sell"
21 September 1999, Michael Blechar, Lisa Wallace
Management Summary
Most executive and IS managers view an IT metadata repository as an esoteric technology that is not directly related to the business. However, as will be seen, an IT metadata repository can substantially help IS organizations support the applications, which in turn support the business. An IT metadata repository is a pre-built system and reference database where the IS organizations can track and manage the information about the applications and databases they build and maintain; think of it as the inventory and change impact reporting system for IS. These repositories track metadata such as the descriptions of jobs, programs, modules, screens, data and databases, and the interrelationships between them. Metadata differs from the actual data being described. Metadata is information about data. For example, the metadata descriptions in the repository tell one that the field "customer number" appears in Databases A, B and F ...
[From gartner.com]
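The "customer number" example in the excerpt above can be sketched as a where-used lookup. This is a minimal illustration, not a repository product: the catalog contents, database names other than A, B and F, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical repository content: database name -> fields it contains.
CATALOG = {
    "A": ["customer number", "customer name"],
    "B": ["customer number", "order id"],
    "C": ["order id", "ship date"],
    "F": ["customer number", "invoice total"],
}


def build_field_index(catalog):
    """Invert the catalog so each field maps to the databases that hold it."""
    index = defaultdict(set)
    for database, fields in catalog.items():
        for field_name in fields:
            index[field_name].add(database)
    return index


def where_used(index, field_name):
    """Change-impact question: which databases are touched if this field changes?"""
    return sorted(index.get(field_name, set()))


if __name__ == "__main__":
    index = build_field_index(CATALOG)
    print(where_used(index, "customer number"))
```

The inverted index is the "inventory and change impact reporting" idea in miniature: the metadata describes where data lives without holding any of the data itself.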
31. Repository Technologies in Use
What tools do you use? (Number Responding=181)
None: 45%
HomeGrown: 23%
Other: 13%
CA Platinum: 9%
Rochade: 7%
Universal Repository: 2%
DesignBank: 1%
DWGuide: 1%
InfoManager: 1%
Interface Metadata Tool: 1%
• Almost 50% don't use any
• Many build their own
• The "traditional" players have low numbers
Magic Quadrant for Metadata Management Solutions
https://www.gartner.com/document/3894971?ref=solrAll&refval=219836558&qid=de595a5685b6f86db0ec6
32. IBM's AD/Cycle Information Model
https://wiscorp.com/kwf_diagram.html
33. • "The repository" does not have to be an integrated solution
– it must be an easily integrable solution
• Repository functionality does not equal a repository
– metadata must easily evolve to a repository solution
• Multiple repositories are not necessarily bad
– as interim solutions, Excel has been working quite well
• Minimal functionality includes
– ability to create, read, update, delete, and evolve metadata items
• Remember the 1st law of data management
– In order to manage metadata, you need metadata repository functions
Implementing Metadata Repository Functionality
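The minimal functionality listed for this slide (create, read, update, delete, and evolve metadata items) can be sketched in a few lines. This is only an illustration under stated assumptions: the class and method names are hypothetical, and "evolve" is interpreted here as a versioned change that keeps the prior definition, which the slide does not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MetadataItem:
    name: str
    definition: str
    version: int = 1
    history: List[str] = field(default_factory=list)  # earlier definitions


class MetadataRepository:
    """In-memory stand-in for repository functionality (not a product)."""

    def __init__(self) -> None:
        self._items: Dict[str, MetadataItem] = {}

    def create(self, name: str, definition: str) -> MetadataItem:
        if name in self._items:
            raise KeyError(f"{name!r} already exists")
        self._items[name] = MetadataItem(name, definition)
        return self._items[name]

    def read(self, name: str) -> Optional[MetadataItem]:
        return self._items.get(name)

    def update(self, name: str, definition: str) -> MetadataItem:
        """Correct a definition in place; no history is kept."""
        item = self._items[name]
        item.definition = definition
        return item

    def evolve(self, name: str, definition: str) -> MetadataItem:
        """Assumed meaning of 'evolve': keep the old definition, bump the version."""
        item = self._items[name]
        item.history.append(item.definition)
        item.definition = definition
        item.version += 1
        return item

    def delete(self, name: str) -> None:
        del self._items[name]
```

A spreadsheet can cover the same five operations as an interim solution; what matters is that the functions exist, not which tool provides them.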
34. Time Spent by Data Management Teams Across Disciplines
https://www.gartner.com/document/3894971?ref=solrAll&refval=219836558&qid=de595a5685b6f86db0ec6
Data Discovery Technologies
• Data analysis software technologies deliver up to 10X productivity over manual approaches
• Based on a powerful computing technology that allows data engineers to quickly form candidate hypotheses with respect to the existing data structures
• Hypotheses are then presented to the SMEs (both business and technical) who confirm, refine, or deny them
• Allows existing data structures to be inferred at a rate that is an order of magnitude more effective than previous manual approaches
• Pioneers include Evoke->CSI, Metagenix->Ascential->IBM, Sypherlink
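As a rough sketch of how profiling forms candidate hypotheses from existing data, the example below counts missing and distinct values per column and flags possible keys. The sample rows, column names, and function names are hypothetical, and real profiling tools go much further; the results are hypotheses for SMEs to confirm, refine, or deny, not conclusions.

```python
from collections import Counter

# Hypothetical sample extracted from an existing system.
ROWS = [
    {"cust_no": "C001", "state": "VA", "phone": "804-555-0101"},
    {"cust_no": "C002", "state": "VA", "phone": None},
    {"cust_no": "C003", "state": "GA", "phone": "404-555-0199"},
    {"cust_no": "C004", "state": "va", "phone": ""},
]


def profile(rows):
    """Summarize each column: missing count, distinct count, top values, key hypothesis."""
    report = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        report[column] = {
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "top_values": counts.most_common(3),
            # Hypothesis only: fully populated and unique in this sample.
            "candidate_key": len(present) == len(rows) and len(counts) == len(rows),
        }
    return report


def rows_with_value(rows, column, value):
    """Drill down from a value to the rows that contain it."""
    return [row for row in rows if row[column] == value]


if __name__ == "__main__":
    for column, stats in profile(ROWS).items():
        print(column, stats)
```

In this sample the profile would suggest `cust_no` as a candidate key and surface the inconsistent `va` / `VA` coding in `state`, which is exactly the kind of finding to bring to a validation session.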
Profiling
Discovery
Analysis
35. How has this been done in the past?
Old
• Manually
• Brute force
• Repository dependent
• Quality indifferent
• Not repeatable
New
• Semi-automated
• Engineered
• Repository independent
• Integrated quality
• Repeatable
• Currency
• Accuracy
Select an Attribute to get a list of values
Double-click a value to see rows with that value
36. Comparing Weekly Progress
First schedule:
Monday: morning, model preparation; afternoon, model refinement/validation session
Tuesday: morning, model refinement/validation session; afternoon, model refinement/validation session
Wednesday: morning, model preparation; afternoon, model refinement/validation session
Thursday: morning, model refinement/validation session; afternoon, model refinement/validation session
Friday: morning, model preparation; afternoon, model refinement/validation session
Second schedule:
Monday: morning, model preparation; afternoon, model preparation
Tuesday: morning, model preparation; afternoon, model refinement/validation session
Wednesday: morning, model preparation; afternoon, model preparation
Thursday: morning, model preparation; afternoon, model refinement/validation session
Friday: morning, model preparation; afternoon, model preparation
Reactive
Proactive
Trifacta/Data Wrangling
Traditional Data Life Cycle
Data acquisition activities, Data storage, Data usage activities
42. Data Life Cycle Model (diagram): Metadata Creation, Metadata Structuring, Metadata Refinement, Data Creation, Data Manipulation, Data Refinement, Data Utilization, Data Assessment, Data Storage
Phases (table columns): Metadata: Refinement, Creation, Structuring; Data: Creation, Manipulation, Refinement, Utilization, Assessment
Dimension: Focus
• Data Architecture Quality: Data architecture quality is the focus of metadata creation & refinement efforts.
• Data Model Quality: Data model quality is the focus of metadata refinement & structuring efforts.
• Data Value Quality: Data value quality is the focus of the data creation, manipulation, and refinement phases.
• Data Representation Quality: Data representation quality is the focus of the data utilization phase.
Dimensions Related to Phases
• Data architecture quality is the focus of metadata creation and refinement efforts.
• Data model quality is the focus of metadata structuring efforts.
• Data value quality is the focus of the data creation, manipulation, and refinement phases.
• Data architecture and model quality are the focus of metadata refinement efforts.
• Data representation quality is the focus of the data utilization and assessment phases.
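The dimension-to-phase relationships in the bullets above can be written down as a small lookup, which makes it easy to ask which quality dimensions are in focus during a given phase. This is a sketch that follows the bullet list; the dictionary and function names are hypothetical.

```python
# Quality dimension -> life cycle phases where it is the focus (per the bullets above).
FOCUS_BY_DIMENSION = {
    "data architecture quality": {"metadata creation", "metadata refinement"},
    "data model quality": {"metadata structuring", "metadata refinement"},
    "data value quality": {"data creation", "data manipulation", "data refinement"},
    "data representation quality": {"data utilization", "data assessment"},
}


def dimensions_for_phase(phase):
    """Return the quality dimensions that are the focus of the given phase."""
    return sorted(
        dimension
        for dimension, phases in FOCUS_BY_DIMENSION.items()
        if phase in phases
    )


if __name__ == "__main__":
    print(dimensions_for_phase("metadata refinement"))
    print(dimensions_for_phase("data utilization"))
```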
Other Technologies
Data Integration Definition:
• Pulling together and reconciling dispersed data for analytic purposes that organizations have maintained in multiple, heterogeneous systems. Data needs to be accessed and extracted, moved and loaded, validated and cleaned, standardized and transformed.
• Other tools include:
– Servers
– EII technologies
– Portals
– Conversion tools
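The steps named in the data integration definition above (access and extract, validate and clean, standardize and transform, move and load) can be sketched as a tiny pipeline over two heterogeneous sources. Everything here is hypothetical: the source layouts, field names, and validation rules are assumptions chosen for illustration.

```python
# Hypothetical extracts from two systems with different layouts.
CRM_ROWS = [
    {"CustNo": " c001 ", "State": "va"},
    {"CustNo": "", "State": "GA"},
]
BILLING_ROWS = [
    {"customer_number": "C001", "total": "19.90"},
    {"customer_number": "C002", "total": "n/a"},
]


def standardize(source, row):
    """Map source-specific names and formats onto one shared layout."""
    if source == "crm":
        return {
            "customer_number": row["CustNo"].strip().upper(),
            "state": row["State"].strip().upper(),
            "total": None,
        }
    return {
        "customer_number": row["customer_number"].strip().upper(),
        "state": None,
        "total": row["total"],
    }


def is_valid(record):
    """Reject records with no key or a non-numeric total."""
    if not record["customer_number"]:
        return False
    if record["total"] is not None:
        try:
            float(record["total"])
        except ValueError:
            return False
    return True


def integrate(sources):
    """Reconcile all sources into one target keyed by customer number."""
    target, rejected = {}, []
    for source, rows in sources.items():
        for row in rows:
            record = standardize(source, row)
            if not is_valid(record):
                rejected.append((source, row))
                continue
            merged = target.setdefault(
                record["customer_number"], {"state": None, "total": None}
            )
            if record["state"] is not None:
                merged["state"] = record["state"]
            if record["total"] is not None:
                merged["total"] = float(record["total"])
    return target, rejected
```

Keeping the rejected rows, rather than silently dropping them, gives the quality effort something concrete to review.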
Source: http://www.information-management.com
44. Portal Options
[Adapted from Terry Lanham, "Designing Innovative Enterprise Portals and Implementing Them Into Your Content Strategies: Lockheed Martin’s Compelling Case Study," Web Content II: Leveraging Best-of-Breed Content Strategies, San Francisco, CA, 23 January 2001]
Legacy Systems Transformed Into Web-services Accessed Through a Portal
Organizational Portal (mock-up; Saturday, April 6, 2019 - All systems operational!)
• Organizational News: Organizational Early News, Industry News, Press Releases, Newsletters
• Organizational IT: Service Desk, Settings
• Email: 320 new msgs, 14,572 total; send quick email
• Organizational Essentials: Knowledge network, Employee assistance, IT procurement, Organizational media design, Organizational merchandise
• Search
• Stocks: Full Portfolio (XYZ, YYZ, ZZZ); Market Update (50, 29.5, 45.25) as of Saturday, April 6, 2019; Get Quote
• Reporting: Regional (Northeast, Northwest, Southeast, Southwest, Midnorth, Midsouth); State (Alabama, Arkansas, Georgia, Mississippi, Vermont, Virginia)
Legacy applications and the web services behind the portal:
• Legacy Application 1: Web Service 1.1, 1.2, 1.3
• Legacy Application 2: Web Service 2.1, 2.2
• Legacy Application 3: Web Service 3.1, 3.2
• Legacy Application 4: Web Service 4.1, 4.2
• Legacy Application 5: Web Service 5.1, 5.2, 5.3
Top Tier Demo
Portals as a Data Quality Tool
46. Defining Spaces
• ETL: Extract, Transform, Load
– delivers aggregated data to a new database
• EAI: Enterprise Application Integration
– connects applications to other applications in a predictable manner using pre-established connections
• EII: Enterprise Information Integration
– between ETL and EAI: delivers tailored views of information to users at the time that it is required
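To illustrate the EII idea of delivering a tailored view at the time it is required, the sketch below assembles a view from two live sources on each request instead of copying data into a new database as ETL would. The in-memory sources, field names, and functions are hypothetical stand-ins for real systems.

```python
# Hypothetical live sources; in practice these would be separate systems.
CUSTOMERS = {"C001": {"name": "Acme"}}
ORDERS = {
    "C001": [
        {"order_id": 1, "amount": 25.0},
        {"order_id": 2, "amount": 10.0},
    ]
}


def fetch_customer(customer_number):
    return CUSTOMERS.get(customer_number)


def fetch_orders(customer_number):
    return list(ORDERS.get(customer_number, []))


def customer_view(customer_number):
    """Build the tailored view at request time; nothing is persisted."""
    customer = fetch_customer(customer_number)
    if customer is None:
        return None
    orders = fetch_orders(customer_number)
    return {
        "customer_number": customer_number,
        "name": customer["name"],
        "order_count": len(orders),
        "total_spent": sum(order["amount"] for order in orders),
    }


if __name__ == "__main__":
    print(customer_view("C001"))
```

Because the view is computed on demand, a change in either source shows up on the next request, at the cost of depending on the sources being available when the view is asked for.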
Meta-Matrix Virtual-Integration Example
47. Approaching Data Management Technologies
By the end of this session, you should have a better understanding of data management technologies and their use as part of a people, process, and technology 3-legged stool in terms of:
• Technology Considerations
• Data Technology Architecture
• CASE Tools
• Repositories
• Profiling/Discovery Tools
• Data Quality Engineering Tools
• Data Life Cycle
• Other Technologies:
– Servers, EII Technologies, Portals, Conversion Tools
Gartner Key Findings
• Data assets continue to drive strategic cloud service providers’ offerings
• Machine learning is increasingly popular; key uses:
– Data integration tools
– Database management systems
– Data quality tools
– Metadata management solutions
• Increasing use of cloud for production applications, requiring the database to be in the cloud as well
• Organizations applying a combination of data warehouses, data lakes and data hubs can achieve greater flexibility to support a range of use cases compared to those applying only one.
https://www.gartner.com/document/3894971?ref=solrAll&refval=219836558&qid=de595a5685b6f86db0ec6
48. Perceived State of Data (diagram: IT, Business, Data)
Desired To Be State of Data (diagram: IT, Business, Data)
49. The Real State of Data (diagram: IT, Business, Data)
It isn't possible to go digital
50. By just spelling 'data'
It requires more work
51. Lady Augusta Ada King Rule
https://people.well.com/user/adatoole/bio.htm
Recent Technology Realization
Garbage In ➜ Garbage Out!
52. GI➜GO!
Perfect Model, Garbage Data, Garbage Results
Data Warehouse, Machine Learning, Business Intelligence, Block Chain, AI, MDM, Analytics, Technology, Data Governance
GI➜GO!
Perfect Model, Quality Data, Garbage Results
Data Warehouse, Machine Learning, Business Intelligence, Block Chain, AI, MDM, Analytics, Technology, Data Governance
55. Questions?
It’s your turn! Use the chat feature or Twitter (#dataed) to submit your questions now!
Upcoming Events
May Webinar
Data Management Best Practices
May 12, 2020 @ 2:00 PM ET
June Webinar
Approaching Data Governance Strategically
June 9, 2020 @ 2:00 PM ET
Sign up for webinars at:
www.datablueprint.com/webinar-schedule
Brought to you by:
56. 10124 W. Broad Street, Suite C
Glen Allen, Virginia 23060
804.521.4056