SlideShare une entreprise Scribd logo
1  sur  12
Télécharger pour lire hors ligne
Sean Martin - Founder & CTO
Ben Szekely - SVP and Co-founder, Head of Field Operations
July 2020
Knowledge Graph Discussion:
Foundational capability for data fabric, data
integration and analytics.
©2020 Cambridge Semantics Inc. All rights reserved.
“Finding relationships in combinations of diverse data, using graph techniques at scale,
will form the foundation of modern data and analytics.
This has led to a rise in demand for data fabric architectures that utilize active metadata,
semantics, AI/ML algorithms and knowledge graphs to drive augmented
data integration and data management.”
Rising Demand
for Graph and Data Fabric
heading into 2020
Source: Data Fabrics Add Augmented Intelligence to Modernize Your Data Integration, Gartner Group, Dec. 2019
©2020 Cambridge Semantics Inc. All rights
A Knowledge Graph Definition
What is it?
A Knowledge Graph is a connected graph of data and associated metadata applied
to model, integrate and access an organization’s information assets.
What does it do?
A Knowledge Graph represents real-world entities, concepts and events as well as all
the relationships between them yielding a more accurate representation of a
business’s data.
©2020 Cambridge Semantics Inc. All rights
RDBMS/OLTP Big Data / Hadoop Document Repositories
Flat Files
Third Party
Legacy
Mart
Mart
ETL
Data Warehouse
Data Lake
XML • JSON • PDF
DOC * WEB
Traditional BI Cloud
CUSTOMERS
PRODUCTS
CLAIMS
COMPOUNDS
Knowledge Graphs and
The Enterprise Data Fabric
OVERLAY vs. RIP & REPLACE
©2020 Cambridge Semantics Inc. All rights reserved.
Our Clients Using Knowledge Graphs
Modernizing clinical data standards management
to accelerate drug development
Data Fabric and Knowledge Graph Supporting
Enterprise-wide Digital Twins
Modernizing management of vocabularies,
ontologies and data assets for regulatory reporting
Improving the accuracy and agility of data integration to
accelerate data analytics and insight generation
Accelerate drug development through an
enterprise data fabric
Provide evaluators with a single access place to
easily access and review information relevant to
drug approvals and submissions.
Building an integrated metadata repository to
better understand their enterprise wide data estate
Accelerate value from enterprise data lake
investments across multiple business units
Building a buying engine to give employees an
Amazon-like shopping experience.
Knowledge Graphs and Data Complexity
Problem
● Deliver a final set of data derived from historical clinical trials and describe
the methods used to create it
Source Data
● 88,000 SAS files for 2,792 studies
● Reference data and mapping definition
Data Metrics
● 2.6M variables identified
● 678K unique subject ids extracted
● 162M records onboarded
● 15B RDF Triples in the Knowledge Graph
Approach
● 2 weeks - Metadata analytics, automated mapping, and
in-memory transformation rules to unify data to the SDTM
model
©2020 Cambridge Semantics Inc. All rights©2018 Cambridge Semantics Inc. All rights reserved.
The Knowledge Graph Landscape
Graph
Algorithms
Transactions
(GOLTP)
Data Integration
& GOLAP
Data Integration
& transformations
(ELT and Feature
Engineering)
Real-time
Complex Analytics
queries on Small
and Large Data
Set Volumes
Increasing Data Volumes and Complexity
Real-time
Transactions.
Many users.
.
Custom
Graph
Algorithms
Real-time
Analytics on
Small Data
Set Volumes
Query
Scalability of
short, fast,
targeted
queries
.
Graph
Algorithms on
Large Data
Volumes
Data Integration
& GOLAP
Data Integration & data
transformations (ELT and
Feature Engineering)
Real-time Complex
Analytics queries on
Small and Large Data Set
Volumes
©2020 Cambridge Semantics Inc. All rights
DIY
(build your own)
Knowledge Graphs on the Big Stage
©2020 Cambridge Semantics Inc. All rights
Why Knowledge Graphs and Ontologies?
Simplifies access to complex data to address
unanticipated questions
Quickly profiles, connects and harmonizes data
from multiple sources, including unstructured
Presents tailored views, services and experiences
to different personas with conceptual models
Flexibly accommodates new data sources
and use cases on the fly, with minimal impact
Scales horizontally to accommodate enterprise
data fabric scale - Cloud agnostic
Automated Deployment and Operations with Kubernetes
Storage and Compute Integration
MODEL
Graph Data Model
• Lift Data into
Data Fabric
• Design Ontologies
• Connect Data
Models
ONBOARD
Ingest & Map
• Automated ETL
• Collaborative
Mapping
• Metadata
Capture
Enterprise
Data Sources
Machine
Learning and AI
Enterprise
Search
“Last Mile”
Analytics Tools
Metadata Catalog
Semantic-based Metadata Management, Governance, and Lineage
Cloud or On-Prem Data Storage Infrastructure
Data Storage Layer
Ingest
BLEND
GraphMarts
• Combine and Align
Related Data Sets
• In-memory MPP
OLAP Query Engine
• Data Layers
ACCESS
Hi-Res Analytics
• Analyze All
Data Together
• Fast, Iterative Queries
Ad Hoc, What if
• Code-Free or API
Graphical Application Interface
©2020 Cambridge Semantics Inc. All rights reserved.
Anzo’s Graphmarts to offer three ways to
work with Knowledge Graphs in-memory
1. Load onboarded RDF graph data
from disk
2. Load data into memory directly from
sources, APIs, streams
3. Load data at query-time through
virtualized views
Roadmap: Towards Virtual Knowledge Graphs

Contenu connexe

Tendances

Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Cambridge Semantics
 
Big Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data DemocratizationBig Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data DemocratizationCambridge Semantics
 
Modern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail BankingModern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail BankingCambridge Semantics
 
Going Beyond Rows and Columns with Graph Analytics
Going Beyond Rows and Columns with Graph AnalyticsGoing Beyond Rows and Columns with Graph Analytics
Going Beyond Rows and Columns with Graph AnalyticsCambridge Semantics
 
From Data Lakes to the Data Fabric: Our Vision for Digital Strategy
From Data Lakes to the Data Fabric: Our Vision for Digital StrategyFrom Data Lakes to the Data Fabric: Our Vision for Digital Strategy
From Data Lakes to the Data Fabric: Our Vision for Digital StrategyCambridge Semantics
 
Sustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive AnalyticsSustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive AnalyticsCambridge Semantics
 
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...Cambridge Semantics
 
Accelerating Insight - Smart Data Lake Customer Success Stories
Accelerating Insight - Smart Data Lake Customer Success StoriesAccelerating Insight - Smart Data Lake Customer Success Stories
Accelerating Insight - Smart Data Lake Customer Success StoriesCambridge Semantics
 
Introduction to Anzo Unstructured
Introduction to Anzo UnstructuredIntroduction to Anzo Unstructured
Introduction to Anzo UnstructuredCambridge Semantics
 
Scalable, Fast Analytics with Graph - Why and How
Scalable, Fast Analytics with Graph - Why and HowScalable, Fast Analytics with Graph - Why and How
Scalable, Fast Analytics with Graph - Why and HowCambridge Semantics
 
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Cambridge Semantics
 
How to Build a Smart Data Lake Using Semantics
How to Build a Smart Data Lake Using SemanticsHow to Build a Smart Data Lake Using Semantics
How to Build a Smart Data Lake Using SemanticsCambridge Semantics
 
Graph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise ScaleGraph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise ScaleCambridge Semantics
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceCambridge Semantics
 
Scaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analyticsScaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analyticsConnected Data World
 
Supporting GDPR Compliance through effectively governing Data Lineage and Dat...
Supporting GDPR Compliance through effectively governing Data Lineage and Dat...Supporting GDPR Compliance through effectively governing Data Lineage and Dat...
Supporting GDPR Compliance through effectively governing Data Lineage and Dat...Connected Data World
 
Data Mining and Data Warehousing
Data Mining and Data WarehousingData Mining and Data Warehousing
Data Mining and Data WarehousingAmdocs
 
Building A Self Service Analytics Platform on Hadoop
Building A Self Service Analytics Platform on HadoopBuilding A Self Service Analytics Platform on Hadoop
Building A Self Service Analytics Platform on HadoopCraig Warman
 

Tendances (20)

Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?
 
Big Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data DemocratizationBig Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data Democratization
 
Modern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail BankingModern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail Banking
 
Going Beyond Rows and Columns with Graph Analytics
Going Beyond Rows and Columns with Graph AnalyticsGoing Beyond Rows and Columns with Graph Analytics
Going Beyond Rows and Columns with Graph Analytics
 
From Data Lakes to the Data Fabric: Our Vision for Digital Strategy
From Data Lakes to the Data Fabric: Our Vision for Digital StrategyFrom Data Lakes to the Data Fabric: Our Vision for Digital Strategy
From Data Lakes to the Data Fabric: Our Vision for Digital Strategy
 
The Year of the Graph
The Year of the GraphThe Year of the Graph
The Year of the Graph
 
Sustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive AnalyticsSustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive Analytics
 
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
 
Accelerating Insight - Smart Data Lake Customer Success Stories
Accelerating Insight - Smart Data Lake Customer Success StoriesAccelerating Insight - Smart Data Lake Customer Success Stories
Accelerating Insight - Smart Data Lake Customer Success Stories
 
Introduction to Anzo Unstructured
Introduction to Anzo UnstructuredIntroduction to Anzo Unstructured
Introduction to Anzo Unstructured
 
Scalable, Fast Analytics with Graph - Why and How
Scalable, Fast Analytics with Graph - Why and HowScalable, Fast Analytics with Graph - Why and How
Scalable, Fast Analytics with Graph - Why and How
 
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
 
How to Build a Smart Data Lake Using Semantics
How to Build a Smart Data Lake Using SemanticsHow to Build a Smart Data Lake Using Semantics
How to Build a Smart Data Lake Using Semantics
 
Graph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise ScaleGraph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise Scale
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data Science
 
Scaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analyticsScaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analytics
 
Solution architecture for big data projects
Solution architecture for big data projectsSolution architecture for big data projects
Solution architecture for big data projects
 
Supporting GDPR Compliance through effectively governing Data Lineage and Dat...
Supporting GDPR Compliance through effectively governing Data Lineage and Dat...Supporting GDPR Compliance through effectively governing Data Lineage and Dat...
Supporting GDPR Compliance through effectively governing Data Lineage and Dat...
 
Data Mining and Data Warehousing
Data Mining and Data WarehousingData Mining and Data Warehousing
Data Mining and Data Warehousing
 
Building A Self Service Analytics Platform on Hadoop
Building A Self Service Analytics Platform on HadoopBuilding A Self Service Analytics Platform on Hadoop
Building A Self Service Analytics Platform on Hadoop
 

Similaire à Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Integration and Analytics

Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesAshraf Uddin
 
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...Amazon Web Services
 
Graph Databases – Benefits and Risks
Graph Databases – Benefits and RisksGraph Databases – Benefits and Risks
Graph Databases – Benefits and RisksDATAVERSITY
 
Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend Jean-Michel Franco
 
Data and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the CloudData and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the Cloudredmondpulver
 
Choosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChoosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChicago Hadoop Users Group
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
DAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
DAMA Webinar: Turn Grand Designs into a Reality with Data VirtualizationDAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
DAMA Webinar: Turn Grand Designs into a Reality with Data VirtualizationDenodo
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunitiesBigdata Meetup Kochi
 
SendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data WarehousingSendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data WarehousingAmazon Web Services
 
Why an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business SuccessWhy an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business SuccessInformatica
 
Powering Real-Time Analytics with Data Virtualization on AWS (ASEAN & ANZ)
Powering Real-Time Analytics with Data Virtualization on AWS (ASEAN & ANZ)Powering Real-Time Analytics with Data Virtualization on AWS (ASEAN & ANZ)
Powering Real-Time Analytics with Data Virtualization on AWS (ASEAN & ANZ)Denodo
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsStreamsets Inc.
 
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing KeynoteArchitecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing KeynoteCaserta
 
Accelerate Digital Transformation Through AI-powered Cloud Analytics Moderniz...
Accelerate Digital Transformation Through AI-powered Cloud Analytics Moderniz...Accelerate Digital Transformation Through AI-powered Cloud Analytics Moderniz...
Accelerate Digital Transformation Through AI-powered Cloud Analytics Moderniz...Amazon Web Services
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefitsRicky Barron
 
Building a Big Data Solution
Building a Big Data SolutionBuilding a Big Data Solution
Building a Big Data SolutionJames Serra
 
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...Big Data Week
 

Similaire à Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Integration and Analytics (20)

Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
 
Graph Databases – Benefits and Risks
Graph Databases – Benefits and RisksGraph Databases – Benefits and Risks
Graph Databases – Benefits and Risks
 
Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend
 
Data and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the CloudData and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the Cloud
 
Choosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChoosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your Business
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
DAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
DAMA Webinar: Turn Grand Designs into a Reality with Data VirtualizationDAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
DAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
 
IBM Cloud pak for data brochure
IBM Cloud pak for data   brochureIBM Cloud pak for data   brochure
IBM Cloud pak for data brochure
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunities
 
SendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data WarehousingSendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data Warehousing
 
Why an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business SuccessWhy an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business Success
 
Powering Real-Time Analytics with Data Virtualization on AWS (ASEAN & ANZ)
Powering Real-Time Analytics with Data Virtualization on AWS (ASEAN & ANZ)Powering Real-Time Analytics with Data Virtualization on AWS (ASEAN & ANZ)
Powering Real-Time Analytics with Data Virtualization on AWS (ASEAN & ANZ)
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing KeynoteArchitecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
 
Accelerate Digital Transformation Through AI-powered Cloud Analytics Moderniz...
Accelerate Digital Transformation Through AI-powered Cloud Analytics Moderniz...Accelerate Digital Transformation Through AI-powered Cloud Analytics Moderniz...
Accelerate Digital Transformation Through AI-powered Cloud Analytics Moderniz...
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
 
Big data and oracle
Big data and oracleBig data and oracle
Big data and oracle
 
Building a Big Data Solution
Building a Big Data SolutionBuilding a Big Data Solution
Building a Big Data Solution
 
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
 

Plus de Cambridge Semantics

Using Machine Teaching in Text Analysis: Case Study on Using Machine Teaching...
Using Machine Teaching in Text Analysis: Case Study on Using Machine Teaching...Using Machine Teaching in Text Analysis: Case Study on Using Machine Teaching...
Using Machine Teaching in Text Analysis: Case Study on Using Machine Teaching...Cambridge Semantics
 
The Business Case for Semantic Web Ontology & Knowledge Graph
The Business Case for Semantic Web Ontology & Knowledge GraphThe Business Case for Semantic Web Ontology & Knowledge Graph
The Business Case for Semantic Web Ontology & Knowledge GraphCambridge Semantics
 
Healthcare and Life Sciences: Two Industries Separated by Common Data
Healthcare and Life Sciences: Two Industries Separated by Common DataHealthcare and Life Sciences: Two Industries Separated by Common Data
Healthcare and Life Sciences: Two Industries Separated by Common DataCambridge Semantics
 
Accelerate Pharma R&D with Cross-Study Analytics
Accelerate Pharma R&D with Cross-Study AnalyticsAccelerate Pharma R&D with Cross-Study Analytics
Accelerate Pharma R&D with Cross-Study AnalyticsCambridge Semantics
 
Large Scale Graph Analytics with RDF and LPG Parallel Processing
Large Scale Graph Analytics with RDF and LPG Parallel ProcessingLarge Scale Graph Analytics with RDF and LPG Parallel Processing
Large Scale Graph Analytics with RDF and LPG Parallel ProcessingCambridge Semantics
 
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...Cambridge Semantics
 
Semantic Graph Databases: The Evolution of Relational Databases
Semantic Graph Databases: The Evolution of Relational DatabasesSemantic Graph Databases: The Evolution of Relational Databases
Semantic Graph Databases: The Evolution of Relational DatabasesCambridge Semantics
 

Plus de Cambridge Semantics (9)

Using Machine Teaching in Text Analysis: Case Study on Using Machine Teaching...
Using Machine Teaching in Text Analysis: Case Study on Using Machine Teaching...Using Machine Teaching in Text Analysis: Case Study on Using Machine Teaching...
Using Machine Teaching in Text Analysis: Case Study on Using Machine Teaching...
 
The Business Case for Semantic Web Ontology & Knowledge Graph
The Business Case for Semantic Web Ontology & Knowledge GraphThe Business Case for Semantic Web Ontology & Knowledge Graph
The Business Case for Semantic Web Ontology & Knowledge Graph
 
Introduction to RDF*
Introduction to RDF*Introduction to RDF*
Introduction to RDF*
 
AnzoGraph DB - SPARQL 101
AnzoGraph DB - SPARQL 101AnzoGraph DB - SPARQL 101
AnzoGraph DB - SPARQL 101
 
Healthcare and Life Sciences: Two Industries Separated by Common Data
Healthcare and Life Sciences: Two Industries Separated by Common DataHealthcare and Life Sciences: Two Industries Separated by Common Data
Healthcare and Life Sciences: Two Industries Separated by Common Data
 
Accelerate Pharma R&D with Cross-Study Analytics
Accelerate Pharma R&D with Cross-Study AnalyticsAccelerate Pharma R&D with Cross-Study Analytics
Accelerate Pharma R&D with Cross-Study Analytics
 
Large Scale Graph Analytics with RDF and LPG Parallel Processing
Large Scale Graph Analytics with RDF and LPG Parallel ProcessingLarge Scale Graph Analytics with RDF and LPG Parallel Processing
Large Scale Graph Analytics with RDF and LPG Parallel Processing
 
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
 
Semantic Graph Databases: The Evolution of Relational Databases
Semantic Graph Databases: The Evolution of Relational DatabasesSemantic Graph Databases: The Evolution of Relational Databases
Semantic Graph Databases: The Evolution of Relational Databases
 

Dernier

Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Šabatka
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
 
Rock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxRock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxFinatron037
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
 
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxCCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxdhiyaneswaranv1
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
 
Optimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsOptimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsThinkInnovation
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
 

Dernier (16)

Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
 
Rock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxRock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptx
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
 
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxCCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 
Optimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsOptimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in Logistics
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 

Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Integration and Analytics

  • 1. Sean Martin - Founder & CTO Ben Szekely - SVP and Co-founder, Head of Field Operations July 2020 Knowledge Graph Discussion: Foundational capability for data fabric, data integration and analytics.
  • 2. ©2020 Cambridge Semantics Inc. All rights reserved. “Finding relationships in combinations of diverse data, using graph techniques at scale, will form the foundation of modern data and analytics. This has led to a rise in demand for data fabric architectures that utilize active metadata, semantics, AI/ML algorithms and knowledge graphs to drive augmented data integration and data management.” Rising Demand for Graph and Data Fabric heading into 2020 Source: Data Fabrics Add Augmented Intelligence to Modernize Your Data Integration, Gartner Group, Dec. 2019
  • 3. ©2020 Cambridge Semantics Inc. All rights A Knowledge Graph Definition What is it? A Knowledge Graph is a connected graph of data and associated metadata applied to model, integrate and access an organization’s information assets. What does it do? A Knowledge Graph represents real-world entities, concepts and events as well as all the relationships between them yielding a more accurate representation of a business’s data.
  • 4. ©2020 Cambridge Semantics Inc. All rights RDBMS/OLTP Big Data / Hadoop Document Repositories Flat Files Third Party Legacy Mart Mart ETL Data Warehouse Data Lake XML • JSON • PDF DOC * WEB Traditional BI Cloud CUSTOMERS PRODUCTS CLAIMS COMPOUNDS Knowledge Graphs and The Enterprise Data Fabric OVERLAY vs. RIP & REPLACE
  • 5. ©2020 Cambridge Semantics Inc. All rights reserved. Our Clients Using Knowledge Graphs Modernizing clinical data standards management to accelerate drug development Data Fabric and Knowledge Graph Supporting Enterprise-wide Digital Twins Modernizing management of vocabularies, ontologies and data assets for regulatory reporting Improving the accuracy and agility of data integration to accelerate data analytics and insight generation Accelerate drug development through an enterprise data fabric Provide evaluators with a single access place to easily access and review information relevant to drug approvals and submissions. Building an integrated metadata repository to better understand their enterprise wide data estate Accelerate value from enterprise data lake investments across multiple business units Building a buying engine to give employees an Amazon-like shopping experience.
  • 6. Knowledge Graphs and Data Complexity Problem ● Deliver a final set of data derived from historical clinical trials and describe the methods used to create it Source Data ● 88,000 SAS files for 2,792 studies ● Reference data and mapping definition Data Metrics ● 2.6M variables identified ● 678K unique subject ids extracted ● 162M records onboarded ● 15B RDF Triples in the Knowledge Graph Approach ● 2 weeks - Metadata analytics, automated mapping, and in-memory transformation rules to unify data to the SDTM model
  • 7. ©2020 Cambridge Semantics Inc. All rights©2018 Cambridge Semantics Inc. All rights reserved. The Knowledge Graph Landscape Graph Algorithms Transactions (GOLTP) Data Integration & GOLAP Data Integration & transformations (ELT and Feature Engineering) Real-time Complex Analytics queries on Small and Large Data Set Volumes Increasing Data Volumes and Complexity Real-time Transactions. Many users. . Custom Graph Algorithms Real-time Analytics on Small Data Set Volumes Query Scalability of short, fast, targeted queries . Graph Algorithms on Large Data Volumes Data Integration & GOLAP Data Integration & data transformations (ELT and Feature Engineering) Real-time Complex Analytics queries on Small and Large Data Set Volumes
  • 8. ©2020 Cambridge Semantics Inc. All rights DIY (build your own) Knowledge Graphs on the Big Stage
  • 9. ©2020 Cambridge Semantics Inc. All rights
  • 10. Why Knowledge Graphs and Ontologies? Simplifies access to complex data to address unanticipated questions Quickly profiles, connects and harmonizes data from multiple sources, including unstructured Presents tailored views, services and experiences to different personas with conceptual models Flexibly accommodates new data sources and use cases on the fly, with minimal impact Scales horizontally to accommodate enterprise data fabric scale - Cloud agnostic
  • 11. Automated Deployment and Operations with Kubernetes Storage and Compute Integration MODEL Graph Data Model • Lift Data into Data Fabric • Design Ontologies • Connect Data Models ONBOARD Ingest & Map • Automated ETL • Collaborative Mapping • Metadata Capture Enterprise Data Sources Machine Learning and AI Enterprise Search “Last Mile” Analytics Tools Metadata Catalog Semantic-based Metadata Management, Governance, and Lineage Cloud or On-Prem Data Storage Infrastructure Data Storage Layer Ingest BLEND GraphMarts • Combine and Align Related Data Sets • In-memory MPP OLAP Query Engine • Data Layers ACCESS Hi-Res Analytics • Analyze All Data Together • Fast, Iterative Queries Ad Hoc, What if • Code-Free or API Graphical Application Interface ©2020 Cambridge Semantics Inc. All rights reserved.
  • 12. Anzo’s Graphmarts to offer three ways to work with Knowledge Graphs in-memory 1. Load onboarded RDF graph data from disk 2. Load data into memory directly from sources, APIs, streams 3. Load data at query-time through virtualized views Roadmap: Towards Virtual Knowledge Graphs