Watch this webinar to learn about the benefits of using semantic and graph database technology to create a Data Catalog of all of an enterprise's data, regardless of source or format, as part of a modern IT or data management stack and an important step toward building an Enterprise Data Fabric.
2. Today's speakers
Greg West
Senior Presales Engineer
• Over 10 years experience with ETL, Semantics and Graph Databases
• Technical Consulting and Pre-Sales Architecture and Vision
• Cisco Systems and Cambridge Semantics
Barbara Petrocelli
VP Strategy and Presales Field Operations
• 25 years experience in ETL, BI, Analytics and Semantics
• Strategic Marketing and Go-To-Market Planning and Execution
• DataStage (IBM), Netezza (IBM), and Podium Data (Qlik)
3. “Unprecedented levels of data scale and distribution are making it almost impossible for organizations to effectively exploit their data assets”
Source: How To Use Semantics to Drive the Business Value of Your Data, Gartner Group, Guido De Simoni, 27 Nov. 2018
ENTERPRISE DATA MANAGEMENT TODAY REQUIRES A MODERN DATA DISCOVERY AND INTEGRATION LAYER
4. Modern Catalogs in the Data Fabric Stack
Catalogs provide the discovery and integration layers of the Data Fabric.
FORRESTER RESEARCH DATA FABRIC REFERENCE ARCHITECTURE
• Data management: metadata/catalog, data security, data governance, data processing, data quality, data lineage, policies
• Global data access: global distributed platform, in-memory, embedded, self-service, APIs
• Data discovery: data modeling, preparation, curation, graph engine
• Data orchestration: transformation, integration, cleansing
• Data processing and persistence: Hadoop, NoSQL, Spark, data platform (processing), data lake, EDW/BDW
• Data ingestion and streaming: ingestion, streaming, data movement
• Data sources: cloud, on-premises
7. A modern data discovery and integration platform for your enterprise data fabric.
Anzo lets business users find, connect, and blend enterprise data into analytic-ready datasets.
• Map and Explore Enterprise Data
• Build Blended Analytic-Ready Datasets
• Apply Enterprise-Ready Data Management
8. 3 Big Ideas
1. Anzo maps the physical/logical layer to the business layer in a data collection. This makes data in the catalog understandable in business terms.
2. Anzo goes beyond data cataloging to allow you to apply integration and data quality in a single data management process.
3. Anzo collects and connects. It collects metadata that documents the data. It also connects to the data itself so you can immediately use the data you find through the catalog.
9. Anzo makes data understandable by connecting the logical layer with the business layer using semantics.
Diagram labels: Logical Layer, Business Layer
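As a rough illustration of connecting a logical layer to a business layer, the sketch below relabels source columns with business terms. The business terms come from the claims example later in the deck; the column names (`claim_id`, `process_dt`, `subscriber_id`) and the mapping structure are hypothetical and are not Anzo's API.

```python
# Hypothetical mapping from (table, column) in the logical layer to a
# business-layer term. Column names are invented for illustration.
LOGICAL_TO_BUSINESS = {
    ("claims", "claim_id"): "Claim ID",
    ("claims", "process_dt"): "Process Date",
    ("claims", "subscriber_id"): "Subscriber ID",
}


def to_business_terms(table, row):
    """Relabel a source row's columns with business-layer terms.

    Columns without a mapping keep their logical name.
    """
    return {
        LOGICAL_TO_BUSINESS.get((table, column), column): value
        for column, value in row.items()
    }


record = to_business_terms(
    "claims",
    {"claim_id": "44223", "process_dt": "10/3/2015", "subscriber_id": "ID-BA213"},
)
print(record)
```

Keeping the mapping as data rather than code means business users could review or extend it without touching the source systems.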
10. “We are witnessing that data catalogs are an important source of technical and active metadata. Active metadata is best utilized when organizations can share it with data integration and data quality tools to inform and, in some cases, even automate integration design.”
Source: Modern Data and Analytics Requirements Demand a Convergence of Data Management Capabilities, Gartner Group, Sept 11, 2019
“Changing requirements are driving demand for data quality tools, data catalogs, metadata management solutions and data integration tools in one comprehensive solution.”
11. How it works: Data onboarding
Graph data models flexibly connect and transform new data sources.

Claims
Claim ID    Process Date    Subscriber ID
44223       10/3/2015       ID-BA213
44224       10/7/2015       ID-234I2
…           …               …

Electronic Health Records
Patient ID    Condition           Drug Name
BA213         Sleep Apnea         Narcoleptol
CS289         Type II Diabetes    Insulin
…             …

On Site Doctor Note
On July 3, 2016, Patient BA213 experiencing headache and nausea following 500mg dosage of sleep aid therapeutic, Narcoleptol.

Graph view of the connected data (labels and values from the diagram):
• Patient Record: PATIENT ID BA213, CONDITION Sleep Apnea
• Drug (PRESCRIBED): BRAND NAME Narcoleptol, DOSAGE 500mg
• Note (ABOUT): WHEN 3/7/2016, EVENT headache and nausea, SENTIMENT SCORE -.05
• Claim (ABOUT): CLAIM ID 44223, PROCESS DATE 10/3/2015
• Subscriber: SUBSCRIBER ID ID-BA213
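The onboarding idea on this slide, turning tabular rows into connected graph facts, can be sketched roughly as follows. The rows are the sample values from the slide. The predicate names (`processDate`, `about`, and so on) and the rule that a subscriber ID is "ID-" plus the patient ID are assumptions made for illustration, not how Anzo necessarily does it.

```python
# Sample rows copied from the slide.
claims = [
    {"claim_id": "44223", "process_date": "10/3/2015", "subscriber_id": "ID-BA213"},
    {"claim_id": "44224", "process_date": "10/7/2015", "subscriber_id": "ID-234I2"},
]
health_records = [
    {"patient_id": "BA213", "condition": "Sleep Apnea", "drug_name": "Narcoleptol"},
    {"patient_id": "CS289", "condition": "Type II Diabetes", "drug_name": "Insulin"},
]


def build_graph(claims, health_records):
    """Return a set of (subject, predicate, object) triples."""
    triples = set()
    for rec in health_records:
        patient = "patient:" + rec["patient_id"]
        triples.add((patient, "condition", rec["condition"]))
        triples.add((patient, "prescribed", rec["drug_name"]))

    known_patients = {rec["patient_id"] for rec in health_records}
    for claim in claims:
        node = "claim:" + claim["claim_id"]
        subscriber = claim["subscriber_id"]
        triples.add((node, "processDate", claim["process_date"]))
        triples.add((node, "subscriberId", subscriber))
        # Assumed linking rule: subscriber ID is "ID-" followed by the patient ID.
        patient_id = subscriber[3:] if subscriber.startswith("ID-") else subscriber
        if patient_id in known_patients:
            triples.add((node, "about", "patient:" + patient_id))
    return triples


graph = build_graph(claims, health_records)
for triple in sorted(graph):
    print(triple)
```

Because every fact is a separate triple, a new source only adds triples and links; existing facts do not need to be restructured.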
12. How it works: Business models
Semantic data models to capture and navigate data relationships
• Patients
• Encounters
• Providers
• Medications
• Costs
• Care Plans
• Claims
• Etc.
Diagram: linked models labeled care plans, canonical, electronic medical records, and claims.
Entities shown: Providers, Care Plans, Patients, Costs, Inpatient Claims, Carrier Claims, Outpatient Claims, Prescription Drug_Events, Beneficiary Summary, BestPractiseLinks, careprog2, careprog1, Medications, Patient, Encounters, Observations, Conditions, Allergies, Patients, Procedures, Imaging Studies, Immunizations, Care Plans
13. How it works: Data lineage
Metadata documents the end-to-end data journey
Source System -> Source Metadata -> Mapping: Source to Semantic Model -> Semantic Model -> Graph Representation -> Graph Data Set -> In-Memory Graph Data Blending -> Analytics and Access
PHASE 1: METADATA ONLY
Initial steps build up metadata to describe the data sources and their connections, and optionally materialize the semantic model.
PHASE 2: DATA AND METADATA
These steps use data to materialize the semantic model and enable further data blending, enhancement and delivery.
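One way to picture lineage metadata is as a record of what each stage was derived from, so any stage can be traced back to its source. The sketch below uses the stage names from this slide and assumes they form a single linear chain in the order shown; `trace_lineage` is a hypothetical helper, not part of any product.

```python
# Stage names from the slide, assumed to form one linear chain.
STAGES = [
    "Source System",
    "Source Metadata",
    "Mapping: Source to Semantic Model",
    "Semantic Model",
    "Graph Representation",
    "Graph Data Set",
    "In-Memory Graph Data Blending",
    "Analytics and Access",
]

# Lineage metadata: each stage points at the stage it was derived from.
DERIVED_FROM = {later: earlier for earlier, later in zip(STAGES, STAGES[1:])}


def trace_lineage(stage):
    """Walk the lineage metadata from a stage back to the original source."""
    path = [stage]
    while path[-1] in DERIVED_FROM:
        path.append(DERIVED_FROM[path[-1]])
    return path


for step in trace_lineage("Graph Data Set"):
    print(step)
```

A real catalog would likely store more per step (timestamps, mapping versions, owners), but the upstream walk is the same idea.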
14. How it works: User experience
The catalog of metadata linked to the semantic model becomes a platform to explore, select, prepare, and use data in a variety of analytic tools, applications and algorithms.
Step 1: Discover and explore linked data in a semantic model. Select the data you want to use from among all the data in the model.
Step 2: Automatically generate code to query the data model, select the data you want, and subset it out.
Step 3: The newly selected subset of the data is made available for use in various analytic and data visualization tools.
Export: convert from graph to rectangular
Framework of Data Governance, Data Security, and Metadata
PREFIX : <http://cambridgesemantics.com/ontologies/Customer_360_Ontological_Model#>
INSERT {
  GRAPH ${targetGraph} {
    # Create the connection between the individual and their credit report
    ?individual :p_has_credit_report ?doc .
  }
}
${usingSources}
WHERE {
  # Get every individual and their SSN
  ?individual a :Individual .
  ….
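The "Export: convert from graph to rectangular" step on this slide can be sketched as a pivot from triples to rows. This is a minimal illustration, assuming each subject has at most one value per property (a later value would overwrite an earlier one). The predicate names are hypothetical; the values are the claims sample from the onboarding slide.

```python
from collections import defaultdict

# Hypothetical triples describing two claims.
triples = [
    ("claim:44223", "processDate", "10/3/2015"),
    ("claim:44223", "subscriberId", "ID-BA213"),
    ("claim:44224", "processDate", "10/7/2015"),
    ("claim:44224", "subscriberId", "ID-234I2"),
]


def to_rows(triples, columns):
    """Pivot (subject, predicate, object) triples into one row per subject."""
    by_subject = defaultdict(dict)
    for subject, predicate, obj in triples:
        if predicate in columns:
            by_subject[subject][predicate] = obj
    # Missing properties become None so every row has the same columns.
    return [
        {"id": subject, **{column: props.get(column) for column in columns}}
        for subject, props in sorted(by_subject.items())
    ]


rows = to_rows(triples, ["processDate", "subscriberId"])
for row in rows:
    print(row)
```

Rows shaped like this can be handed to tools that expect tables, such as spreadsheets or BI dashboards.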
15. Generate an Infinite Set of Blended Data Sets from the Catalog
• Ingestion of raw data
• Apply rules and relationships to link, conform and harmonize
• Surface business-ready datasets for analysis and machine learning