Everyone wants to leverage data. The optimal implementation of analytics is an organization-wide set of capabilities. These are called advantageous organizational analytic capabilities because a clear ROI is demonstrable from these efforts. It turns out that advantageous organizational analytics has a number of prerequisites. These include:
Adopting a crawl, walk, run strategy
Understanding current and potential organizational maturity and corresponding capabilities
Achieving an appropriate technology/human capability balance
Implementing useful IT systems development practices
Installing necessary non-IT leadership
This webinar will explore these and other topics using examples drawn from DOD, healthcare researchers, and donation center operations.
Predictive Analytics - How to get stuff out of your Crystal Ball
1. Presented by Steffani Burd, PhD & Peter Aiken, Ph.D.
Predictive Analytics
Getting Stuff from Your Crystal Ball
Protect Your Data | Build Your Business
Copyright 2013 by Data Blueprint
Your Presenters
Steffani Burd
• PhD Columbia/Statistics
• B.A. University of Chicago/Specialization: Neurobiology and Behavioral Science
• InfraGard, Secret Service Electronic Crimes Task Force, NYPD Auxiliary Police Officer
• Founder, Ansec Group
• Ernst & Young Consulting
• Experienced Internationally/Fluent Chinese/Spanish
• Cageless shark diving
Peter Aiken
• 30+ years data mgt.
• Multiple Int. awards/recognition
• Founding Director, Data Blueprint (datablueprint.com)
• Associate Professor of IS (vcu.edu)
• Past President, DAMA International (dama.org)
• 9 books and dozens of articles
• 500+ empirical practice descriptions
• Multi-year immersions w/ organizations as diverse as US DoD, Nokia, Deutsche Bank, Wells Fargo, Walmart, and the Commonwealth of Virginia
2. Ordering Pizza in the Future
Copyright 2016 by Data Blueprint
Data Science
The Sexiest Job of the 21st Century
3. What is a Data Scientist?
8. Customer
Current Customer
Ex-Customer?
Potential Customer
VIP-Customer?
Data Scientist?
Data science is a redundant term, since all science involves data; it's like saying, "book librarian."
Eric Siegel, Ph.D., author of Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die
9. PA in the Analytics World
Descriptive
Ask: What happened? What is happening?
Find: Structured data
Show: Profiles, Bar/Pie charts, Narrative
Predictive
Ask: What will happen? Why will it happen?
Find: Structured/unstructured data
Show: Risk Profiles, Pros/Cons, Care Recs
Prescriptive
Ask: What should I do? Why should I do it?
Find: Unstructured/structured data
Show: Strategic Goals, Support Recs
• Organization-wide
• Volume and Noise
• Utility
• Meaningful scoring
• Actionable recs
• Realistic goals
• Support
• Manage & measure
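The three questions above (what happened, what will happen, what should I do) can be sketched in a few lines. This is only an illustration: the readings, the `ALERT_LEVEL` threshold, and the function names are made up for the example and do not come from the webinar. The predictive step is a plain least-squares trend line, standing in for whatever model a real project would use.

```python
from statistics import mean

# Made-up weekly readings, for illustration only (not data from the webinar).
readings = [100.0, 104.0, 108.0, 112.0, 116.0]
ALERT_LEVEL = 118.0  # hypothetical threshold at which someone should act


def describe(values):
    """Descriptive: what happened? Summarize the data already collected."""
    return {"count": len(values), "mean": mean(values), "latest": values[-1]}


def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def predict_next(values):
    """Predictive: what will happen? Extend the trend line by one period."""
    slope, intercept = fit_trend(values)
    return intercept + slope * len(values)


def prescribe(forecast, alert_level):
    """Prescriptive: what should I do? Turn the forecast into a recommendation."""
    return "intervene now" if forecast > alert_level else "keep monitoring"


if __name__ == "__main__":
    forecast = predict_next(readings)
    print(describe(readings))
    print(forecast, prescribe(forecast, ALERT_LEVEL))
```

Each step consumes the output of the one before it, which is one way to read the slide: a prescriptive recommendation is only as good as the prediction and the description beneath it.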
Four Analytic Problems
Source: Elder Research (www.datamininglabs.com). “The Ten Levels of Analytics”
10. Four Categories of Modeling Technology
Source: Elder Research (www.datamininglabs.com). “The Ten Levels of Analytics”
Getting Stuff from Your Crystal Ball
Based on Tom Davenport’s “A predictive analytics primer” in Predictive Analytics in Practice from Harvard Business Review Insight Center, 2014
11. Maslow's Hierarchy of Needs
Data Management Practices Hierarchy
You can accomplish Advanced Data Practices without becoming proficient in the Foundational Data Management Practices; however, this will:
• Take longer
• Cost more
• Deliver less
• Present greater risk
(with thanks to Tom DeMarco)
Advanced Data Practices
• MDM
• Mining
• Big Data
• Analytics
• Warehousing
• SOA
Foundational Data Management Practices
Data Platform/Architecture
Data Governance
Data Quality
Data Operations
Data Management Strategy
Technologies
Capabilities
12. One concept for process improvement, others include:
• Norton Stage Theory
• TQM
• TQdM
• TDQM
• ISO 9000
and focus on understanding current processes and determining where to make improvements.
DMM Capability Maturity Model Levels
Performed (1): Our DM practices are informal and ad hoc, dependent upon "heroes" and heroic efforts
Managed (2): Our DM practices are defined and documented processes performed at the business unit level
Defined (3): Our DM efforts remain aligned with business strategy using standardized and consistently implemented practices
Measured (4): We manage our data as an asset using advantageous data governance practices/structures
Optimized (5): DM is a strategic organizational capability; most importantly, we have a process for improving our DM capabilities
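The five levels can be treated as an ordered scale, which makes it easy to record a self-assessment and see where to start. The sketch below is illustrative only: the scores are hypothetical, and the rule that overall maturity equals the lowest-scoring area is an assumption made for the example, not something stated in the slides.

```python
from enum import IntEnum


class MaturityLevel(IntEnum):
    """The five levels named on the slide, ordered from lowest to highest."""
    PERFORMED = 1
    MANAGED = 2
    DEFINED = 3
    MEASURED = 4
    OPTIMIZED = 5


# Hypothetical self-assessment; the scores are invented for illustration.
assessment = {
    "Data Management Strategy": MaturityLevel.MANAGED,
    "Data Governance": MaturityLevel.PERFORMED,
    "Platform & Architecture": MaturityLevel.DEFINED,
    "Data Quality": MaturityLevel.MANAGED,
    "Data Operations": MaturityLevel.DEFINED,
}


def weakest_areas(scores):
    """Return the practice areas sitting at the lowest level, sorted by name."""
    lowest = min(scores.values())
    return sorted(area for area, level in scores.items() if level == lowest)


def overall_level(scores):
    """Assumption: the organization is only as mature as its weakest area."""
    return MaturityLevel(min(scores.values()))


if __name__ == "__main__":
    print(overall_level(assessment).name, weakest_areas(assessment))
```

Using `IntEnum` keeps the level names readable while still allowing comparisons such as `level >= MaturityLevel.DEFINED`.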
Data Management Practices Assessment
[Chart: scores on a 0 to 5 scale for Client, Industry Competition, and All Respondents across Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, and Data Support Operations; three areas are marked "Challenge". Other labels on the chart: Development guidance, Data Administration, Support systems, Asset recovery capability, Development training.]
13. Industry Focused Results
• CMU's Software Engineering Institute (SEI) Collaboration
• Results from hundreds of organizations in various industries including:
✓ Public Companies
✓ State Government Agencies
✓ Federal Government
✓ International Organizations
• Defined industry standard
• Steps toward defining data management "state of the practice"
Data Management Strategy
Data Governance
Platform & Architecture
Data Quality
Data Operations
Focus: Implementation and Access
Focus: Guidance and Facilitation
Comparison of DM Maturity 2007-2012
[Chart: 2007 and 2012 maturity levels, on a scale of Initial (I), Managed (II), Defined (III), Measured (IV), Optimized (V), for Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, and Data Support Operations.]
14. “Good” Data
Analytic
Projects
Data Program Coordination
Organizational Data Integration
Data Stewardship
Data Development
Data Support Operation
Initial (I)
Repeatable (II)
Documented (III)
Managed (IV)
Optimizing (V)
Foundational Strategies
Data ROT
DM Practices
Processes
CMM/CMMI
Data-centric
Development Flow
“Appropriate” Statistical Analyses
Regression Techniques
Hypothesis-driven, IVs and DVs, correlations, error
Linear regression, Discrete choice models, Logistic regression, Multinomial logistic regression, Probit regression, Time series models, Survival or duration analysis, CART, Multivariate adaptive regression splines
Machine Learning Techniques
Exploratory, emerging variables, scope and purpose
Neural networks, MLP, Radial basis functions, Support vector machines, k-means cluster, Naïve Bayes, Geospatial predictive modeling
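To make the "exploratory" side concrete, here is a minimal sketch of k-means clustering, one of the machine learning techniques listed above, restricted to one-dimensional data so it fits in a few lines. Unlike a regression, nothing here names a dependent variable in advance; the groups emerge from the data. The sample values and starting centroids are invented for illustration, and a real project would more likely use an established library than hand-rolled code.

```python
def k_means_1d(values, centroids, max_iter=100):
    """Cluster 1-D values around the given starting centroids.

    Repeats two steps until the centroids stop moving:
    assign each value to its nearest centroid, then move each
    centroid to the mean of the values assigned to it.
    """
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Invented sample data with two visibly separate groups.
sample = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

if __name__ == "__main__":
    centers, groups = k_means_1d(sample, centroids=[1.0, 12.0])
    print(centers, groups)
```

The result depends on the starting centroids, which is one reason the "scope and purpose" of an exploratory analysis needs to be stated before the output is trusted.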
15. “Valid” Assumptions
Consider
Future and past
Timeframes
Key variables
Missing data
Consequences
Model
Application
Additional and less
Documented
Don’t let this be you!
The Future of Predictive Analytics
Applications
Industries
Problems
Solutions
Technologies
Automation
Processing
Disruptive
Parallel Evolution?
16. Achieving Your Goals - Checklist
Data
Source (what, when, where, how, why)
Cleaning, Missing data, Outliers, Variables
Generalizability to population
Statistics
Rationale and Implementation
Assumptions
List and description
Implications if not valid (individual, combination)
Conditions that would make assumptions not valid
Variables that could be included/removed
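The data items on this checklist (missing data, outliers) can be checked mechanically before any modeling starts. The sketch below counts missing values and flags outliers with the common 1.5 × IQR rule. The rule, the function name `profile`, and the sample values are all choices made for this illustration and are not taken from the webinar.

```python
from statistics import quantiles


def profile(values):
    """Report missing values and IQR-based outliers for one variable.

    Missing entries are represented as None. A value is flagged as an
    outlier when it lies more than 1.5 interquartile ranges outside
    the first or third quartile.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return {
        "missing": len(values) - len(present),
        "outliers": [v for v in present if v < low or v > high],
        "fences": (low, high),
    }


# Invented sample with one missing entry and one suspicious value.
raw = [10, 12, 11, 13, 12, None, 95, 11]

if __name__ == "__main__":
    print(profile(raw))
```

Flagging a value is not the same as removing it; whether an outlier is an error or a real signal is exactly the kind of assumption the checklist asks you to list and describe.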
Derived from Tom Davenport’s “A predictive analytics primer” in Predictive Analytics in Practice from HBR Insight Center, 2014
Achieving Your Goals (cont’d)
Data Analytic Factors
Implementation Strategies
Repeatable & Scalable Solutions
Organizational Factors
Governance Models
Aligning Data and IT
Chief Data Officer
Success Factors
SLOTS
Last is First
Success? Success!
17. Steffani Burd, Ph.D.
sburd@ansecgroup.com
917.783.8496
Resources
DAMA
KDnuggets
Society for Design and Process Sciences
Presidion
HIMSS – Analytics
Peter Aiken, Ph.D.
paiken@datablueprint.com
804.382.5957
Additional Resources
To Err Is Human (Institute of Medicine, Nov 1999)
The Price of Excess (PwC, 2011)
USA, Inc. (Mary Meeker – KPCB, Feb 2011)
Best Care at Lower Cost (Inst of Medicine, Sept 2012)
Bitter Pill (Steven Brill, Feb 2013)
18. Data Strategy
October 11, 2016 @ 2:00 PM ET/11:00 AM PT
with Micheline Casey
Sign up here:
www.datablueprint.com/webinar-schedule
or www.dataversity.net
Upcoming Events
Questions?
20. “Within three years, there won’t be a Fortune 500 company without a CDO…” Futurist David Houle
The Chief Data Officer
Source: LinkedIn July 2013, Analysis of ten pages
* “Healthcare” companies: Pharmaceuticals, Online Media, IT&S specializing in HC, Health Insurance Plan
Capability Maturity Model
21. Results: It Is Not Always About Money
Solution
Integrate multiple databases into one to create a holistic view of data
Automation of manual process
Results
Safe matches increased from 3 out of 10 to 6 out of 10
Turnaround time for matching patients with potential donors significantly reduced
Data is passed safely and effectively
Inconsistencies, redundancies, corruption reduced
Ability to cross-analyze enhanced
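The safe-match result can be stated two ways, and the difference matters when reporting outcomes. Going from 3 out of 10 to 6 out of 10 is a gain of 30 percentage points in absolute terms and a doubling (100% increase) in relative terms. A short worked calculation, using exact fractions to avoid rounding noise; the function name is just for this example:

```python
from fractions import Fraction


def improvement(before, after):
    """Return (absolute gain, relative gain) between two rates."""
    absolute = after - before
    relative = absolute / before
    return absolute, relative


# Rates taken from the slide: 3 out of 10 before, 6 out of 10 after.
before = Fraction(3, 10)
after = Fraction(6, 10)
absolute_gain, relative_gain = improvement(before, after)

if __name__ == "__main__":
    print(f"absolute gain: {absolute_gain} of all matches")
    print(f"relative gain: {relative_gain} times the original rate")
```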
Diabetes Management
Facilitators
▪ Secure Access with Consent
▪ Direct Secure Messaging (DSM)
▪ State and Federal, DOH
▪ Insurance
Data Inputs
▪ PHR
▪ Home Monitoring
▪ Telehealth
▪ Office Visits
▪ Hospital Visits
▪ Diagnostics
▪ Lab Work
▪ Images/X-Ray Reports
Treatment
▪ Home Healthcare / Long term Care
▪ Medications
▪ Behavioral Changes
Descriptive
Ask: What happened? What is happening?
Find: Structured data
Show: Profiles, Bar/pie charts, Narrative
Predictive
Ask: What will happen? Why will it happen?
Find: Structured/unstructured data
Show: Risk Profiles, Pros/Cons, Care Recs
Prescriptive
Ask: What should I do? Why should I do it?
Find: Unstructured/structured data
Show: Strategic Goals, Support Recs
Diabetic’s Circle of Care
22. Hemophilia Management
Descriptive
Ask: What happened? What is happening?
Find: Structured data
Show: Profiles, Bar/pie charts, Narrative
Predictive
Ask: What will happen? Why will it happen?
Find: Structured/unstructured data
Show: Risk Profiles, Pros/Cons, Care Recs
Prescriptive
Ask: What should I do? Why should I do it?
Find: Unstructured/structured data
Show: Strategic Goals, Support Recs
BioMarin Licenses Factor VIII Gene Therapy Program for Hemophilia
Novel Gene Therapy Approach to Hemophilia B
Sangamo BioSciences Receives $6.4 Million Strategic Partnership Award From California Institute for Regenerative Medicine to Develop ZFP Therapeutic®
Treating Hemophilia in the 2010s
Data Warehousing
Courtesy of: http://www.infosys.com/industries/healthcare/industryofferings/Pages/healthcare-data-warehousing.aspx
23. Big Data
10124 W. Broad Street, Suite C
Glen Allen, Virginia 23060
804.521.4056