Thomas Cook, director of sales, Cambridge Semantics, offers a primer on graph database technology and the rapid growth of knowledge graphs at Data Summit 2020 in his presentation titled "AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Connected World".
AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Connected World
1. AnzoGraph.com
Driving AI and Machine Insights with
Knowledge Graphs in a Connected
World
Thomas Cook
Sales Director, AnzoGraph DB
2. Data Continues to Grow
AI and ML Demand Increases
Complexity of Data Ecosystem Grows
• Need to Build on Existing Analytics Capabilities with:
• Automated Data Preparation & Better Understanding
• Explainable AI & ML with Provenance
• Improved Algorithms & Analytics
• Cost Efficient Operations
Context
Knowledge Graphs & Graph Analytics
3. The Data Preparation Problem
Data Access
● Manual ETL coding
● Practicalities limit the number of sources and types of data
Data Processing
● Laborious discovery, profiling and selection
● Use of rules and coding for harmonization & cleansing
Feature Engineering
● Manual coding to transform data
● Manual feature engineering & selection
70-80% of time is spent in Data Preparation & Feature Engineering
Viewed as the “least enjoyable” part of work by 76% of data scientists [1]
[1] Cleaning Big Data, Forbes Magazine
Structured Data
4. Traditional approaches to connecting siloed data
Data Warehouse
• Rigid data model
• Relationships more difficult
• Expensive
• Does not adapt well to change
• Concurrency & Performance
Data Lake
• Raw operational data dumps become unwieldy, difficult to consume and manage
• Referred to as the “Data Swamp”
• Data engineering efforts are costly, complex, lack lineage and are often not repeatable
• Heavy-volume Spark clusters are difficult to manage and tune properly
5. “Graph analytics will grow in the next few years due to the need to ask complex questions across complex data, which is not always practical or even possible at scale using SQL queries”
Gartner, Top 10 Data and Analytics Technology Trends for 2019
7. Why Graph?
Graph’s Flexible Data Model
Rich insights on relationships, not just entities
Leveraging Industry Data Models
Process & analyze growing amounts of diverse data
Structured and Unstructured Data
Natural Language
Sparse Data
Data you know you need to analyze
Data you don’t know you need to analyze
AI and ML
Traditional Analytics
Unique and Insightful Analytics
Feature Engineering
Can evolve as data changes
Refactoring often not necessary as data/needs evolve
8. What is a Knowledge Graph?
Data Architect View
One method to integrate data from multiple data sets, structured or unstructured, and to leverage standard industry ontologies to enhance analytics.
Executive View
Common understanding of all disparate data.
Ontologist View
The best way to represent knowledge and meaning and provide linkage and relationship information in a data analytics platform. Ontologies are at the center, providing a way to standardize and enhance the conceptual model. Inferencing provides semantic reasoning for better understanding.
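To make the inferencing idea concrete, here is a minimal sketch of subclass reasoning over triples in plain Python. It is not AnzoGraph DB code; the class hierarchy and the `ex:` names are hypothetical, and a real deployment would rely on the database's own inferencing over an ontology.

```python
# Tiny forward-chaining reasoner over (subject, predicate, object) triples.
# All ex: names below are made up for illustration.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer_types(triples):
    """Return the input triples plus every rdf:type triple implied by rdfs:subClassOf."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current facts so we can add to the set while looping.
        subclass_edges = [(s, o) for s, p, o in inferred if p == SUBCLASS]
        typed = [(s, o) for s, p, o in inferred if p == TYPE]
        for instance, cls in typed:
            for sub, sup in subclass_edges:
                if cls == sub and (instance, TYPE, sup) not in inferred:
                    inferred.add((instance, TYPE, sup))
                    changed = True
    return inferred


facts = {
    ("ex:Bank", SUBCLASS, "ex:FinancialInstitution"),
    ("ex:FinancialInstitution", SUBCLASS, "ex:Organization"),
    ("ex:Acme", TYPE, "ex:Bank"),
}
closure = infer_types(facts)
new_facts = sorted(t for t in closure if t not in facts)
print(new_facts)
```

The only asserted type is that `ex:Acme` is a bank; the reasoner derives that it is also a financial institution and an organization, so a query for all organizations would find it without anyone recording that fact explicitly.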
14. AnzoGraph Benchmark Results
Analytical Benchmarks
217X: AnzoGraph DB compared to Neo4j on an industry-standard TPC-H benchmark
113X: AnzoGraph’s LUBM benchmark performance over the previous fastest result
10-300X: AnzoGraph’s performance on graph algorithms over Spark SQL and Spark with GraphFrames
16. Traditional & Graph Analytics
Schema-less Data Model
Standards
Customizable Algorithms
Open Platform
Use Cases
AnzoGraph DB
• Data Harmonization & Analytics
• Enterprise Knowledge Graphs
• Scientific Data Discovery
• Customer 360
• Supply Chain
• IoT
• Fraud Detection
• Financial Research
• Network Optimization
• Anti-Money Laundering
17. Parabole and AnzoGraph Cognitive Analytics - alphaESG
• Extract text and relationships from massive amounts of documents and news feeds
• Use AnzoGraph DB to:
  • Create cognitive models
  • Contextualize news, filings & reports
  • Harmonize data from SASB and various data sources
  • Provide customized outputs and signals
  • Execute analytics
23. AnzoGraph™ DB
What it is:
● Fast, Scalable Graph Database
○ In-Memory Massively Parallel Processing (MPP) ACID-Compliant Graph Database
○ Supports RDF & Labelled Property Graphs
What it does:
○ Fast Data Loading
○ Fast Query
○ Rich Analytics
■ Graph Algorithms
■ BI/DW Analytics
■ Inferencing
■ Data Science/Feature Engineering Algorithms
■ Define-Your-Own Analytics
○ Linear Database Scaling
○ Persist data on cheap storage
Based on Open Standards
• Built on RDF & SPARQL 1.1 standards
• LPG with RDF*/SPARQL*
• LPG with Cypher (in 2020)
Deploy on-prem or cloud
• Kubernetes/Helm on-demand cloud deployment
• AWS, Google and Azure
26. Labelled Property Graphs Facilitate Analytics
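[Diagram: labelled property graph of two people and a place]
Person: Jack (isA: <Man>, birthday: 09/17/1975)
Person: Jill (isA: <Woman>, birthday: 4/23/1979)
Place: The Hill (isA: <Place>, has: Water, has: Trees, partOf: <TheMountain>)
Edge friendOf between Jack and Jill (metAt=<TheHill>, metDate=07/04/2018)
Edge WentUp from Jack to The Hill (Date=07/04/2018)
Edge WentUp from Jill to The Hill (Date=07/04/2018)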
Today with RDF* and SPARQL*
• Relationships can be described as clearly as in any LPG database
• RDF*/SPARQL* extensions to the standard make W3C open standards databases even more capable
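As a rough illustration of how RDF* attaches properties to a relationship, the sketch below turns the Jack and Jill edges into Turtle-style statements that use the `<< subject predicate object >>` embedded-triple syntax. The helper name `to_rdf_star`, the empty `:` prefix, the direction of the `friendOf` edge and the choice to keep dates as plain string literals are all assumptions made for this example, not details from the slide.

```python
# Each edge: (subject, predicate, object, {edge property: already-formatted RDF term}).
# The friendOf direction (Jack -> Jill) is assumed for illustration.
edges = [
    ("Jack", "friendOf", "Jill", {"metAt": ":TheHill", "metDate": '"07/04/2018"'}),
    ("Jack", "WentUp", "TheHill", {"Date": '"07/04/2018"'}),
    ("Jill", "WentUp", "TheHill", {"Date": '"07/04/2018"'}),
]


def to_rdf_star(subject, predicate, obj, properties):
    """Return the plain triple plus one RDF* statement per edge property."""
    base = f":{subject} :{predicate} :{obj}"
    lines = [base + " ."]
    for key, value in properties.items():
        # The embedded triple << ... >> is the subject of the annotation.
        lines.append(f"<< {base} >> :{key} {value} .")
    return lines


for edge in edges:
    for line in to_rdf_star(*edge):
        print(line)
```

The first edge yields `:Jack :friendOf :Jill .` followed by `<< :Jack :friendOf :Jill >> :metAt :TheHill .`, which is the RDF* counterpart of putting `metAt` on the `friendOf` edge in a labelled property graph.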
27. User-Defined Extensions (UDXs)
Allows users to extend AnzoGraph DB functionality for custom usage
User-Defined Functions (UDF): Create and register custom analytic functions, such as functions that concatenate values or convert integers to alternate currencies.
User-Defined Aggregates (UDA): Create and register aggregate functions, such as functions that compute the arithmetic mean or calculate the average number from a list of maximum and minimum values.
User-Defined Services (UDS): Create and register services that create local SPARQL endpoints.
User-Defined Tables (UDT): Create and register a function that is repeatedly invoked within a query to generate the rows of a table on-the-fly.
Data Science Functions
User-Defined Functions (UDX)
Functions you can build in Java or C++
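The difference between the function, aggregate and table extension types can be sketched conceptually in Python. This is not the AnzoGraph DB extension API (the slide states UDXs are built in Java or C++); every name below, including `accumulate` and `finalize`, is hypothetical and only shows the shape of each kind of extension.

```python
from typing import Iterator, Tuple


# UDF-style: one value in, one value out (the slide's currency-conversion example).
def convert_currency(amount: int, rate: float) -> float:
    return amount * rate


# UDA-style: many rows in, one value out (the slide's arithmetic-mean example).
class MeanAggregate:
    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def accumulate(self, value: float) -> None:
        """Called once per input row."""
        self.total += value
        self.count += 1

    def finalize(self) -> float:
        """Called once after the last row to produce the result."""
        return self.total / self.count if self.count else float("nan")


# UDT-style: invoked to generate the rows of a table on the fly.
def squares_table(start: int, stop: int) -> Iterator[Tuple[int, int]]:
    for n in range(start, stop):
        yield (n, n * n)


prices = [100, 250, 400]
converted = [convert_currency(p, 0.5) for p in prices]

mean = MeanAggregate()
for value in converted:
    mean.accumulate(value)

print(converted, mean.finalize(), list(squares_table(1, 4)))
```

A scalar function is applied row by row, an aggregate keeps state across rows and returns a single value, and a table function produces rows rather than consuming them.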
35. Automated Deployment and Operations
Storage and Compute Integration
MODEL: Graph Data Model
• Lift Data into Data Fabric
• Design Ontologies
• Connect Data Models
ON-BOARD: Ingest & Map
• Automated ETL
• Collaborative Mapping
• Metadata Capture
Enterprise Data Sources
Machine Learning and AI
Enterprise Search
“Last Mile” Analytics Tools
Metadata Catalog: Semantic-based Metadata Management, Governance and Lineage
Cloud or On-Prem Data Storage Infrastructure
Data Storage Layer
Ingest
BLEND: GraphMarts
• Combine and Align Related Data Sets
• In-memory MPP OLAP Query Engine
• Data Layers
ACCESS: Hi-Res Analytics
• Analyze All Data Together
• Fast, Iterative Queries: Ad Hoc, What-if
• Code Free or API
Graphical Application Interface
Anzo - The Modern Data Discovery and Integration Layer for the Enterprise Data Fabric