Presentation on "Data Virtualization at UMC Utrecht: What, Why and How" by Erik Fransen (connecteddatagroup), at the BI & Data Analytics Summit on June 13th, 2019 in Diegem (Belgium)
Data Virtualization at UMC Utrecht: Don't Collect, Connect! by Erik Fransen (connecteddatagroup), presented at the #BIDASUMMIT
1. THE BI & DATA ANALYTICS SUMMIT
JUNE 13, 2019
*** Don't Collect, Connect! ***
Erik Fransen, Founder, Connected Data Group
@erikfransen
+31615944476
2. Across today’s complex data landscape, are you getting the analytic data you need?
Packaged Apps, RDBMS, Excel Files, Data Warehouse, OLAP Cubes, Big Data, XML Docs, Flat Files, Web Services, IoT Data
Business Intelligence, Data Visualization, Data Science, Transactional Apps / ESBs
Data & Analytics: Opportunity and Challenge
3. Data should not be this Hard!
All we want is ...
• Any dataset we might need
• Up-to-the-minute data
• Data we can trust
• Data from everywhere
• Data that is easy to find
• Easy to understand data
• Data that is easy to use
• One-off datasets
• Data whenever we want
• More time for analyzing data, less time chasing it
4. But Data is Hard!
Here is why ...
• Every request seems urgent
• Wide range of requirements spanning quick-and-dirty to well-engineered
• External data
• New data structures and formats
• Multiple analytics tools to support
• Data security
• Data governance & compliance requirements
• Rigid ETLs and warehouse schemas
• Huge demand for analytic and application datasets
• Broadening of analytics users and skill levels
• Streaming data
• Data in the cloud
• Business self-service
5. Modern Data & Analytics Architecture
Consumer: Business Intelligence, Data Visualization, Data Science, Transactional Apps / ESBs
Broker: “Data-as-a-Service”, “Data Virtualization Layer”, “Data Fabric”, “Logical Data Warehouse”, “Virtual Data Lake”, “Enterprise Data Hub”, “Semantic Data Layer”, “Data Abstraction Layer”, “Data Unification Layer”, “Data Specification Layer”
Supplier: Packaged Apps, RDBMS, Excel Files, Data Warehouse, OLAP Cubes, Big Data, XML Docs, Flat Files, Web Services, IoT Data
6. Modern Data & Analytics Architecture
Data Consumer: Business Intelligence, Data Visualization, Data Science, Transactional Apps / ESBs
Virtual Data Broker: Flexible Data Architecture, Data Abstraction, Technology Abstraction, Data Management & Governance
Data Supplier: Packaged Apps, RDBMS, Excel Files, Data Warehouse, OLAP Cubes, Big Data, XML Docs, Flat Files, Web Services, IoT Data
8. 4 key aspects of the DV broker
• Flexible Data Architecture: collect and connect to any data source, for any use case (mode 1 and 2)
• Data Abstraction: specify, share and personalize for multiple personas
• Technology Abstraction: understand the data and specification over tools & technology
• Data Management & Governance: data quality, data engineering, (meta/master) data life cycle, security
9. Marathon Runner & Sprinter
MODE 1: ETL/ELT, EDW, BIG DATA, CLOUD SOLUTION
COLLECT & STORE & REPLICATE DATA
MODE 2: CONNECT AND VIRTUALIZE DATA,
NO REPLICATION & STORE DATA ONLY WHEN REQUIRED
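The difference between the two modes can be illustrated with a minimal sketch. It assumes an in-memory SQLite database as a stand-in for both the source system and the broker; the table and view names (source_lab, dwh_lab, v_lab) are hypothetical and not taken from the presentation. Mode 1 is mimicked by copying the data into a second table, Mode 2 by defining a view that reads the source at query time.

```python
import sqlite3


def build_demo():
    """Create a toy source table plus a replicated copy and a virtual view."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical source table standing in for an operational system.
    conn.execute("CREATE TABLE source_lab (patient_id INTEGER, value REAL)")
    conn.execute("INSERT INTO source_lab VALUES (1, 4.2)")
    # Mode 1 (collect): replicate the data into a separate stored table.
    conn.execute("CREATE TABLE dwh_lab AS SELECT * FROM source_lab")
    # Mode 2 (connect): define a view; nothing is copied or stored.
    conn.execute("CREATE VIEW v_lab AS SELECT * FROM source_lab")
    return conn


def row_counts(conn):
    """Return (rows in the replicated copy, rows visible through the view)."""
    copied = conn.execute("SELECT COUNT(*) FROM dwh_lab").fetchone()[0]
    virtual = conn.execute("SELECT COUNT(*) FROM v_lab").fetchone()[0]
    return copied, virtual


conn = build_demo()
# A new record arrives in the source after the copy was made.
conn.execute("INSERT INTO source_lab VALUES (2, 5.1)")
print(row_counts(conn))  # (1, 2): the copy is stale, the view is current
```

In this toy setup the replicated table needs a reload before it sees the new row, while the view reflects it immediately. A real deployment would also weigh query load on the source, which is one reason to store data when required.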
14. ADAM: Data & Analytics Lifecycle
1. Ask the right questions
2. Find the data
3. Build the model
4. Test the model
5. Iterate back to step 1, 2, 3 or 4
6. Deploy
16. ADAM: learning from data, embedding in clinical process
[Diagram: numbered steps 1 to 6 around "ADAM predictive models"; labels include "Find the Data" and "Feedback for evaluation of model quality"]
17. ADAM Data & Analytics Architecture
• Data Lake for collecting data and processing analytics
• Data Vault Datawarehouse
• Data management
• Security
• Life cycle
• Metadata
• Standardisation of data models
• Data abstraction
• SAS, R & Python for ML
• Apache Spark for push-down of ML
• Data Virtualization as a Virtual Data Broker
Virtual Data processing
SAS, R, Python, Excel, Office 365, Hospital apps, Analytics, Any device
Output
Hospital registry
Data Lake
Sources
Patient Apps, Open data, Sensors/devices, Research data, External data, External patient data, Employee data
Data Vault DWH
Data stream
Virtual Staging layer
Business Data Model
Common Business Rules
Publication to users
Virtual Data stream
Security
Data services and Web services
(Virtual) Data stream
Microservices
Sources Virtual Data Broker
18. Data & Analytics projects on the ADAM platform:
• Uveitis (eye inflammation)
• Sjögren (autoimmune disease)
• Early Warning & Prevention (intensive care)
• Gaston (medication advice)
Clinical Decision Support
21. Clinical Decision Support
• More complex data streams
• Real time & batch
• Diversity in sources to connect to (applications, devices, data lake, data warehouse, external data)
• Almost no standardisation of data and models
• Algorithms and decision models are dynamic in nature
• Embedding in the clinical process
• HiX integration, HL7, mobile devices
• Logical/virtual integration, abstraction and standardisation with Data Virtualization to support Data Science
22. Hospital Data Model for CDS & ADAM
Logical, integrated data model abstracted from HiX technical tables
Re-usable and recognizable by users
Used as primer for the Business Data Model in Data Virtualization
• SuperNova
• Dimensional
• ELM
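Abstracting a logical model from technical tables can be sketched with plain SQL views. The example below is only an illustration under stated assumptions: it uses an in-memory SQLite database, and the technical table and column names (t_pat_0001, t_lab_res, pat_nr, gebdat, tst_cd, val) are invented for the example and are not actual HiX structures. The views expose names a user would recognize, and consumers query only those.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical technical tables; names and columns are invented,
    -- not actual HiX structures.
    CREATE TABLE t_pat_0001 (pat_nr INTEGER PRIMARY KEY, gebdat TEXT);
    CREATE TABLE t_lab_res (res_id INTEGER PRIMARY KEY, pat_nr INTEGER,
                            tst_cd TEXT, val REAL);
    INSERT INTO t_pat_0001 VALUES (1, '1980-05-17');
    INSERT INTO t_lab_res VALUES (10, 1, 'CRP', 12.5);

    -- Logical layer: recognizable entity and attribute names.
    CREATE VIEW patient AS
        SELECT pat_nr AS patient_id, gebdat AS date_of_birth
        FROM t_pat_0001;
    CREATE VIEW lab_result AS
        SELECT res_id AS lab_result_id, pat_nr AS patient_id,
               tst_cd AS test_code, val AS result_value
        FROM t_lab_res;
""")

# Consumers join the logical entities and never see the technical names.
rows = conn.execute("""
    SELECT p.patient_id, p.date_of_birth, l.test_code, l.result_value
    FROM patient AS p
    JOIN lab_result AS l ON l.patient_id = p.patient_id
""").fetchall()
print(rows)  # [(1, '1980-05-17', 'CRP', 12.5)]
```

If a technical table is renamed or restructured, only the view definition changes in this sketch; queries against patient and lab_result stay the same.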
24. “Data-as-a-Service”
DV broker as a core component
ML Models
input
output
REST/SOAP APIs
EDW DWH on HiX
TIBCO DATA VIRTUALIZATION
DATA MADE TO MEASURE
HOSPITAL DATA MODEL
INTROSPECTION
DATA ABSTRACTION
Patient
Lab request
Lab result
Measurement, Surgery
Patient
Patient EWP
Surgery
Uveitis result
Physician, Procedure
Physician
Physician & specialist
Medication
Questionnaire
Medication, Measurement
Surgery
Result
Questionnaire
Administration
Gaston result
Sjögren result
Private Cloud
Data Lake (MapR)
Cardiac Risk Mortality
Respiratory failure
HL7 message to GLIMS
Connects to HiX viewer (Hospital Information System)
Uveitis, Sjögren, Early Warning, Gaston
SQL SQL SQL
REST/SOAP APIs
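The input/output pattern on this slide can be sketched as follows: a model reads its input from a logical view and writes its result back, and a consumer then retrieves that result as JSON, the kind of payload a REST API would return. This is a sketch under assumptions, not the UMC Utrecht implementation. SQLite stands in for the broker, all names (measurement, model_input, model_output) are hypothetical, and the score function is a placeholder rule with an arbitrary cutoff, not a trained model or a clinical threshold.

```python
import json
import sqlite3


def make_broker():
    """Build a toy broker: one source table, one input view, one output table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE measurement (patient_id INTEGER, heart_rate INTEGER);
        INSERT INTO measurement VALUES (1, 72), (2, 131);
        -- Logical view that the model reads as its input.
        CREATE VIEW model_input AS
            SELECT patient_id, heart_rate FROM measurement;
        -- Table where the model writes its output.
        CREATE TABLE model_output (patient_id INTEGER, warning INTEGER);
    """)
    return conn


def score(heart_rate):
    """Placeholder for a trained model; the cutoff of 120 is arbitrary."""
    return 1 if heart_rate > 120 else 0


def run_model(conn):
    """Read input through the view and write one output row per patient."""
    rows = conn.execute(
        "SELECT patient_id, heart_rate FROM model_input"
    ).fetchall()
    conn.executemany(
        "INSERT INTO model_output VALUES (?, ?)",
        [(pid, score(hr)) for pid, hr in rows],
    )


def get_output_as_json(conn):
    """Return the model output as the JSON a data service could serve."""
    rows = conn.execute(
        "SELECT patient_id, warning FROM model_output ORDER BY patient_id"
    ).fetchall()
    return json.dumps([{"patient_id": p, "warning": w} for p, w in rows])


conn = make_broker()
run_model(conn)
print(get_output_as_json(conn))
# [{"patient_id": 1, "warning": 0}, {"patient_id": 2, "warning": 1}]
```

Keeping both the model input and the model output behind the same layer means the model and the consuming application each depend only on the logical names, not on each other or on the source systems.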
26. Our Data Virtualization approach
• Data Strategy & Data Architecture: connect and/or collect, consume
• Business case, ROI & use case development
• Proofs of concept / pilots: on-premise / hybrid / private cloud
• DV training (management, designers, developers, administrators)
• Data Modelling training for DV
• Project execution (A to Z) using an incremental approach
• Our team consists of project leaders, data architects, modellers and developers
• European TIBCO Data Virtualization Partner 2018