How a distributed graph
analytics platform uses
Apache Kafka for data
ingestion in real time
Rayees Pasha & Duc Le
Kafka Summit US - Sep 2021
© 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
Agenda
● Overview of Graph analytics and TigerGraph
● Overview of Data ingestion into TigerGraph
● Use of Kafka Connect Framework and Benefits
● TigerGraph Data Ingestion Deep dive
● Demo - Data Ingestion using Kafka on TG Cloud
Session Presenters
Rayees Pasha
Product Lead, TigerGraph
● Responsible for TigerGraph Database Engine, Language and Platform areas of the product.
● Previously held lead PM and engineering positions at Workday, Hitachi and HP
● Expertise in Database Management and Big Data Technologies
Duc Le
Engineering Manager, TigerGraph
● Lead Developer for TigerGraph Cloud
● Master's in Management Information Systems from Carnegie Mellon University
● Areas of specialty: Full-stack Development, Cloud, Containers and Connectors
Overview of Graph
Analytics and TigerGraph
Importance of Graph in Today’s World
Why Graph; Why Now?
● Businesses want to ask business logic questions of their data
● Blending data from multiple sources, multiple business units, and increasingly external data
● Larger and more varied datasets mean more variables to analyze and connections to explore and test
Who is TigerGraph?
We provide advanced analytics and machine learning on connected data
○ The only scalable graph database for the enterprise: 40-300x faster than the competition
○ Foundational for AI and ML solutions
○ Designed for efficient concurrent OLTP and OLAP workloads
○ SQL-like query language (GSQL) accelerates time to solution
○ Available on-premise & on: Google GCP, Microsoft Azure,
Our customers include:
○ The largest companies in financial services, healthcare, telecom, media, utilities and innovative startups in cybersecurity, ecommerce and retail
Founded in 2012, HQ in Redwood City, California
Corporate Overview Video
Advanced Analytics and Machine Learning on Connected Data
Distributed Graph DB: CONNECT ALL DATASETS AND PIPELINES
● Friction-free scale up from GB to TB to Petabyte with lowest cost of ownership
● Fortune 50 Retailer: Item 360 for eCommerce across 100+ datasets
● Leading Healthcare Provider: Customer 360 connecting 200+ datasets and pipelines
Advanced Analytics: ANALYZE CONNECTED DATA
● 10-100X faster than current solutions
● 7 out of top 10 global banks: Real-time fraud detection and credit risk assessment
● Automotive Manufacturer: Supply chain planning accelerated from 3 weeks to 45 minutes
In-Database Machine Learning: LEARN FROM CONNECTED DATA
● Leading FinTech Company: AI-based Customer 360 for entity resolution, recommendation engine, fraud detection
Overview of Data Ingestion
into TigerGraph
TigerGraph Architecture
Modes of Data Ingestion supported
Bulk Data
● Bulk data loads using the native File loader or the Kafka loader
Low-latency
● JDBC Type 4 driver for Java, Python
● Spark can be used for parallel loads
Real-time
● Streaming Data Applications
● High-frequency Data Apps
Data Ingestion Into TigerGraph Using Kafka loader
Step 1
Loaders take in user source data.
● Bulk load of data files or a Kafka stream in CSV or JSON format
● HTTP POSTs via REST services (JSON)
● GSQL Insert commands
Step 2
Dispatcher takes in the data ingestion requests in the form of updates to the database.
1. Query IDS to get internal IDs
2. Convert data to internal format
3. Send data to one or more corresponding GPEs
Step 3
Each GPE consumes the partial data updates, processes them and puts them on disk.
Loading Jobs and POST use UPSERT semantics:
● If vertex/edge doesn't yet exist, create it.
● If vertex/edge already exists, update it.
● Idempotent
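As a rough illustration of the three steps on this slide, the sketch below models the flow in plain Python. It is not TigerGraph code: the IdService and Gpe classes, the three-column CSV layout, and routing by internal ID modulo the number of GPEs are all assumptions made for the example. What it is meant to show is why UPSERT semantics make a load idempotent: replaying the same input leaves the stored state unchanged.

```python
import csv
import io


class IdService:
    """Toy stand-in for the ID service: external ID -> internal integer ID."""

    def __init__(self):
        self._ids = {}

    def internal_id(self, external_id):
        # Assign the next integer the first time an external ID is seen.
        return self._ids.setdefault(external_id, len(self._ids))


class Gpe:
    """Toy GPE holding one partition of vertices keyed by internal ID."""

    def __init__(self):
        self.vertices = {}

    def upsert(self, internal_id, attrs):
        # UPSERT: create the vertex if it is absent, otherwise update it.
        self.vertices.setdefault(internal_id, {}).update(attrs)


def load(csv_text, ids, gpes):
    # Step 1: the loader reads the user's source rows (CSV here).
    for external_id, name, age in csv.reader(io.StringIO(csv_text)):
        # Step 2: the dispatcher resolves the internal ID, converts the row,
        # and picks a GPE (modulo routing is an assumption of this sketch).
        vid = ids.internal_id(external_id)
        attrs = {"name": name, "age": int(age)}
        target = gpes[vid % len(gpes)]
        # Step 3: the chosen GPE applies the update to its partition.
        target.upsert(vid, attrs)
```

Calling load twice with the same CSV text produces the same vertices as calling it once, while a later row with the same ID and a changed attribute overwrites the earlier value.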
Use of Kafka Connect
Framework and Benefits
TigerGraph Connector Framework Using Kafka Connect
[Diagram labels: Data Source 1, Data Source 2, Data Source 3; Kafka Connect; Kafka (can be customer-hosted); Loader (Available 2021Q4); TigerGraph Cluster]
TigerGraph Connector Framework - Benefits
● Full control of data ingestion process
○ Throttle intake based on capacity
○ Pause as needed
○ Resume and restart data ingestion jobs as needed.
● Flexibility of system deployment
○ Works with natively deployed Kafka in the TigerGraph cluster
○ Allows customers to use their existing TigerGraph deployment with drop-in integration with an external Kafka cluster
● Push down ETL capabilities
○ Users can transform data during loading through the loader's support for user-defined functions (UDFs)
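For context on the pause, resume and restart controls, Kafka Connect itself exposes these operations through its REST interface. The deck does not say how TigerGraph surfaces them, so the sketch below only shows the standard Kafka Connect endpoints. The worker address and the connector name are hypothetical, and the function builds the request without sending it.

```python
from urllib.request import Request

# Assumed address of a Kafka Connect worker (8083 is the default REST port).
CONNECT_URL = "http://localhost:8083"

# Standard Kafka Connect REST operations on a single connector.
ROUTES = {
    "pause": ("PUT", "pause"),
    "resume": ("PUT", "resume"),
    "restart": ("POST", "restart"),
    "status": ("GET", "status"),
}


def control_request(connector, action):
    """Build (but do not send) the REST request for one connector action."""
    method, path = ROUTES[action]
    url = f"{CONNECT_URL}/connectors/{connector}/{path}"
    return Request(url, method=method)
```

To actually issue the call, the returned object can be passed to urllib.request.urlopen, for example urlopen(control_request("s3-to-tigergraph", "pause")), where "s3-to-tigergraph" is a made-up connector name.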
Easy integration of data sources
Kafka Loader: Kafka Connect + data source connector
Current Data Ingestion
Architecture Deep Dive
Current Use of TigerGraph Connector Framework
[Diagram labels: AWS S3; Kafka Connect; Kafka; Kafka Stream; TigerGraph Cluster; Language Server; User Input via GraphStudio (browser) or GSQL CLI]
S3 Loading Job through GSQL
Define the Data Source:
● CREATE DATA_SOURCE S3 s = "/path/to/s3.config"
● s3.config
{
  "file.reader.settings.fs.s3a.access.key": "AKIAJ****4YGHQ",
  "file.reader.settings.fs.s3a.secret.key": "R8bli****p+dT4"
}
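One way to avoid typing credentials into s3.config by hand is to generate the file from environment variables. The sketch below is only an illustration: the two property names are the ones shown on the slide, while the helper names and the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY variable names are assumptions.

```python
import json
import os


def s3_config(access_key, secret_key):
    """Return s3.config text using the two property names from the slide."""
    return json.dumps(
        {
            "file.reader.settings.fs.s3a.access.key": access_key,
            "file.reader.settings.fs.s3a.secret.key": secret_key,
        },
        indent=2,
    )


def write_s3_config(path, env=os.environ):
    """Write s3.config from environment variables (names are an assumption)."""
    text = s3_config(env["AWS_ACCESS_KEY_ID"], env["AWS_SECRET_ACCESS_KEY"])
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text + "\n")
    return path
```

The resulting path is what the CREATE DATA_SOURCE statement would point at.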
S3 Loading Job through GSQL
Create a Loading Job
● files.config
{
  "file.uris": "s3://my-bucket/data.csv"
}
● loading_job.gsql
CREATE LOADING JOB job1 FOR GRAPH my_graph {
  DEFINE FILENAME f = "$s:/path/to/files.config";
  LOAD f TO VERTEX v1 VALUES ($0, $1, $2);
}
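The VALUES ($0, $1, $2) clause maps source columns by position: $0 is the first column of each row, $1 the second, and so on, and for a vertex the first value supplies the primary ID. The Python sketch below mimics that mapping for a CSV file. It assumes v1 has a primary ID followed by two attributes, and the attribute names used in the example call are hypothetical.

```python
import csv
import io


def map_row(row, attr_names):
    """Mimic VALUES ($0, $1, ...): $0 is the vertex ID, the rest are attributes."""
    vertex_id, *values = row
    return vertex_id, dict(zip(attr_names, values))


def load_vertices(csv_text, attr_names):
    """Apply the positional mapping to every row of the CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return [map_row(row, attr_names) for row in reader]
```

For example, load_vertices("p1,Ann,30\n", ["name", "age"]) yields one vertex with ID "p1" and the attributes name and age taken from the second and third columns, as strings.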
S3 Loading Job through GSQL
Run the Loading Job
● RUN LOADING JOB job1
S3 Loading Job through GraphStudio
● Define the Data Source
● Map Data Files to Vertex type or Edge type
● Map Data columns to Vertex or Edge attributes
● Run the Loading Job
Demo using TigerGraph
GraphStudio Application
Thanks
26

Contenu connexe

Tendances

Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Flink Forward
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Flink Forward
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorFlink Forward
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for ExperimentationGleb Kanterov
 
네이버클라우드플랫폼이 제안하는 멀티클라우드(박기은 CTO) - IBM 스토리지 세미나
네이버클라우드플랫폼이 제안하는 멀티클라우드(박기은 CTO) - IBM 스토리지 세미나네이버클라우드플랫폼이 제안하는 멀티클라우드(박기은 CTO) - IBM 스토리지 세미나
네이버클라우드플랫폼이 제안하는 멀티클라우드(박기은 CTO) - IBM 스토리지 세미나NAVER CLOUD PLATFORMㅣ네이버 클라우드 플랫폼
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
 
Introducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumIntroducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumChengKuan Gan
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitFlink Forward
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleFlink Forward
 
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Flink Forward
 
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022HostedbyConfluent
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkDataWorks Summit
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache icebergAlluxio, Inc.
 
Serverless and Design Patterns In GCP
Serverless and Design Patterns In GCPServerless and Design Patterns In GCP
Serverless and Design Patterns In GCPOliver Fierro
 
Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Flink Forward
 

Tendances (20)

Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
네이버클라우드플랫폼이 제안하는 멀티클라우드(박기은 CTO) - IBM 스토리지 세미나
네이버클라우드플랫폼이 제안하는 멀티클라우드(박기은 CTO) - IBM 스토리지 세미나네이버클라우드플랫폼이 제안하는 멀티클라우드(박기은 CTO) - IBM 스토리지 세미나
네이버클라우드플랫폼이 제안하는 멀티클라우드(박기은 CTO) - IBM 스토리지 세미나
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
 
Introducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumIntroducing Change Data Capture with Debezium
Introducing Change Data Capture with Debezium
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and Profit
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scale
 
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
 
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache iceberg
 
Serverless and Design Patterns In GCP
Serverless and Design Patterns In GCPServerless and Design Patterns In GCP
Serverless and Design Patterns In GCP
 
Airflow Intro-1.pdf
Airflow Intro-1.pdfAirflow Intro-1.pdf
Airflow Intro-1.pdf
 
Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!
 
Airflow introduction
Airflow introductionAirflow introduction
Airflow introduction
 

Similaire à How a distributed graph analytics platform uses Apache Kafka for data ingestion in real time | Duc Le and Rayees Pasha, TigerGraph

Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...TigerGraph
 
Oracle GoldenGate Cloud Service Overview
Oracle GoldenGate Cloud Service OverviewOracle GoldenGate Cloud Service Overview
Oracle GoldenGate Cloud Service OverviewJinyu Wang
 
Hybrid data lake on google cloud with alluxio and dataproc
Hybrid data lake on google cloud  with alluxio and dataprocHybrid data lake on google cloud  with alluxio and dataproc
Hybrid data lake on google cloud with alluxio and dataprocAlluxio, Inc.
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsVMware Tanzu
 
Google Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data editionGoogle Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data editionDaniel Zivkovic
 
Oracle GoldenGate Roadmap Oracle OpenWorld 2020
Oracle GoldenGate Roadmap Oracle OpenWorld 2020 Oracle GoldenGate Roadmap Oracle OpenWorld 2020
Oracle GoldenGate Roadmap Oracle OpenWorld 2020 Oracle
 
Portworx 201 Customer Deck.pptx
Portworx 201 Customer Deck.pptxPortworx 201 Customer Deck.pptx
Portworx 201 Customer Deck.pptxssuser1490e8
 
Gimel and PayPal Notebooks @ TDWI Leadership Summit Orlando
Gimel and PayPal Notebooks @ TDWI Leadership Summit OrlandoGimel and PayPal Notebooks @ TDWI Leadership Summit Orlando
Gimel and PayPal Notebooks @ TDWI Leadership Summit OrlandoRomit Mehta
 
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHostedbyConfluent
 
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHostedbyConfluent
 
Neo4j & AWS Bedrock workshop at GraphSummit London 14 Nov 2023.pptx
Neo4j & AWS Bedrock workshop at GraphSummit London 14 Nov 2023.pptxNeo4j & AWS Bedrock workshop at GraphSummit London 14 Nov 2023.pptx
Neo4j & AWS Bedrock workshop at GraphSummit London 14 Nov 2023.pptxNeo4j
 
Peek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapPeek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapNeo4j
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionTimothy Spann
 
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not YearsReplatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not YearsVMware Tanzu
 
Challenges In Modern Application
Challenges In Modern ApplicationChallenges In Modern Application
Challenges In Modern ApplicationRahul Kumar Gupta
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryMárton Kodok
 
QCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic PlatformQCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic PlatformDeepak Chandramouli
 
Data & Analytics ReInvent Recap [AWS Basel Meetup - Jan 2023].pdf
Data & Analytics ReInvent Recap [AWS Basel Meetup - Jan 2023].pdfData & Analytics ReInvent Recap [AWS Basel Meetup - Jan 2023].pdf
Data & Analytics ReInvent Recap [AWS Basel Meetup - Jan 2023].pdfChris Bingham
 
The Current And Future State Of Service Mesh
The Current And Future State Of Service MeshThe Current And Future State Of Service Mesh
The Current And Future State Of Service MeshRam Vennam
 

Similaire à How a distributed graph analytics platform uses Apache Kafka for data ingestion in real time | Duc Le and Rayees Pasha, TigerGraph (20)

Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...
 
Oracle GoldenGate Cloud Service Overview
Oracle GoldenGate Cloud Service OverviewOracle GoldenGate Cloud Service Overview
Oracle GoldenGate Cloud Service Overview
 
Hybrid data lake on google cloud with alluxio and dataproc
Hybrid data lake on google cloud  with alluxio and dataprocHybrid data lake on google cloud  with alluxio and dataproc
Hybrid data lake on google cloud with alluxio and dataproc
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive Applications
 
Google Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data editionGoogle Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data edition
 
Oracle GoldenGate Roadmap Oracle OpenWorld 2020
Oracle GoldenGate Roadmap Oracle OpenWorld 2020 Oracle GoldenGate Roadmap Oracle OpenWorld 2020
Oracle GoldenGate Roadmap Oracle OpenWorld 2020
 
Portworx 201 Customer Deck.pptx
Portworx 201 Customer Deck.pptxPortworx 201 Customer Deck.pptx
Portworx 201 Customer Deck.pptx
 
Gimel and PayPal Notebooks @ TDWI Leadership Summit Orlando
Gimel and PayPal Notebooks @ TDWI Leadership Summit OrlandoGimel and PayPal Notebooks @ TDWI Leadership Summit Orlando
Gimel and PayPal Notebooks @ TDWI Leadership Summit Orlando
 
Data Platform on GCP
Data Platform on GCPData Platform on GCP
Data Platform on GCP
 
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
 
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google
 
Neo4j & AWS Bedrock workshop at GraphSummit London 14 Nov 2023.pptx
Neo4j & AWS Bedrock workshop at GraphSummit London 14 Nov 2023.pptxNeo4j & AWS Bedrock workshop at GraphSummit London 14 Nov 2023.pptx
Neo4j & AWS Bedrock workshop at GraphSummit London 14 Nov 2023.pptx
 
Peek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapPeek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and Roadmap
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC Solution
 
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not YearsReplatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
Replatform your Teradata to a Next-Gen Cloud Data Platform in Weeks, Not Years
 
Challenges In Modern Application
Challenges In Modern ApplicationChallenges In Modern Application
Challenges In Modern Application
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
 
QCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic PlatformQCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic Platform
 
Data & Analytics ReInvent Recap [AWS Basel Meetup - Jan 2023].pdf
Data & Analytics ReInvent Recap [AWS Basel Meetup - Jan 2023].pdfData & Analytics ReInvent Recap [AWS Basel Meetup - Jan 2023].pdf
Data & Analytics ReInvent Recap [AWS Basel Meetup - Jan 2023].pdf
 
The Current And Future State Of Service Mesh
The Current And Future State Of Service MeshThe Current And Future State Of Service Mesh
The Current And Future State Of Service Mesh
 

Plus de HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 

How a distributed graph analytics platform uses Apache Kafka for data ingestion in real time | Duc Le and Rayees Pasha, TigerGraph

  • 1. How a distributed graph analytics platform uses Apache Kafka for data ingestion in real time. Rayees Pasha & Duc Le, Kafka Summit US, Sep 2021
  • 2. © 2021. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION | Agenda
      ● Overview of Graph analytics and TigerGraph
      ● Overview of Data ingestion into TigerGraph
      ● Use of Kafka Connect Framework and Benefits
      ● TigerGraph Data Ingestion Deep dive
      ● Demo: Data Ingestion using Kafka on TG Cloud
  • 3. Session Presenters
      Rayees Pasha, Product Lead, TigerGraph
      ● Responsible for TigerGraph Database Engine, Language and Platform areas of the product
      ● Prior Lead PM and ENG positions at Workday, Hitachi and HP
      ● Expertise in Database Management and Big Data Technologies
      Duc Le, Engineering Manager, TigerGraph
      ● Lead Developer for TigerGraph Cloud
      ● Master in Management Information Systems from Carnegie Mellon University
      ● Areas of specialty: Full-stack Development, Cloud, Containers and Connectors
  • 4. Overview of Graph Analytics and TigerGraph
  • 5. Importance of Graph in Today’s World: Why Graph; Why Now?
      ● Businesses want to ask business logic questions of their data
      ● Blending data from multiple sources, multiple business units, and increasingly external data
      ● Larger and more varied datasets mean more variables to analyze and connections to explore and test
  • 6. Who is TigerGraph?
      We provide advanced analytics and machine learning on connected data
      ○ The only scalable graph database for the enterprise: 40-300x faster than competition
      ○ Foundational for AI and ML solutions
      ○ Designed for efficient concurrent OLTP and OLAP workloads
      ○ SQL-like query language (GSQL) accelerates time to solution
      ○ Available on-premise & on: Google GCP, Microsoft Azure,
      Our customers include:
      ○ The largest companies in financial services, healthcare, telecom, media, utilities and innovative startups in cybersecurity, ecommerce and retail
      Founded in 2012, HQ in Redwood City, California
      Corporate Overview Video
  • 7. Advanced Analytics and Machine Learning on Connected Data Advanced Analytics LEARN FROM CONNECTED DATA AI-based Customer 360 for entity resolution, recommendation engine, fraud detection In-Database Machine Learning Distributed Graph DB Friction-free scale up from GB to TB to Petabyte with lowest cost of ownership. CONNECT ALL DATASETS AND PIPELINES Customer 360 connecting 200+ datasets and pipelines Item 360 for eCommerce across 100+ datasets Fortune 50 Retailer 7 out of top 10 global banks Real-time fraud detection and credit risk assessment 10-100X faster than current solutions ANALYZE CONNECTED DATA Automotive Manufacturer Supply chain planning accelerated from 3 weeks to 45 minutes Leading Healthcare Provider Leading FinTech Company
  • 8. Overview of Data Ingestion into TigerGraph
  • 9. TigerGraph Architecture
  • 10. Modes of Data Ingestion supported
      Bulk Data
      ● Bulk data loads using the native File loader or the Kafka loader
      Low-latency
      ● JDBC Type 4 driver for Java, Python
      ● Spark can be used for parallel loads
      Real-time
      ● Streaming Data Applications
      ● High-frequency Data Apps
  • 11. Data Ingestion Into TigerGraph Using Kafka loader
      Step 1: Loaders take in user source data.
      ● Bulk load of data files or a Kafka stream in CSV or JSON format
      ● HTTP POSTs via REST services (JSON)
      ● GSQL Insert commands
      Step 2: Dispatcher takes in the data ingestion requests in the form of updates to the database.
      1. Query IDS to get internal IDs
      2. Convert data to internal format
      3. Send data to one or more corresponding GPEs
      Step 3: Each GPE consumes the partial data updates, processes them and puts them on disk.
      Loading Jobs and POST use UPSERT semantics:
      ● If vertex/edge doesn't yet exist, create it.
      ● If vertex/edge already exists, update it.
      ● Idempotent
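
The UPSERT semantics on this slide (create if missing, update if present, idempotent) can be illustrated with a small Python sketch. This is a toy in-memory model, not TigerGraph's implementation: the `GraphStore` class, its `upsert_vertex` method, and the sample `Person` record are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GraphStore:
    """Toy in-memory vertex store keyed by (vertex_type, vertex_id). Hypothetical."""

    vertices: dict = field(default_factory=dict)

    def upsert_vertex(self, vertex_type: str, vertex_id: str, attrs: dict) -> str:
        """Create the vertex if it does not exist yet, otherwise update it."""
        key = (vertex_type, vertex_id)
        if key not in self.vertices:
            self.vertices[key] = dict(attrs)
            return "created"
        self.vertices[key].update(attrs)
        return "updated"


store = GraphStore()
record = ("Person", "p1", {"name": "Ada", "age": 36})

first = store.upsert_vertex(*record)    # vertex is missing, so it is created
snapshot = dict(store.vertices[("Person", "p1")])
second = store.upsert_vertex(*record)   # same record again, so it is updated in place

# Replaying the same record leaves the stored state unchanged (idempotent).
unchanged = store.vertices[("Person", "p1")] == snapshot
```

Because replaying a record does not change the stored state, a loader built on these semantics can in principle re-deliver a batch after a failure without producing duplicate vertices.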
  • 12. Use of Kafka Connect Framework and Benefits
  • 13. TigerGraph Connector Framework Using Kafka Connect. Diagram labels: Data Source 1, Data Source 2, Data Source 3, TigerGraph Cluster, Kafka Connect, Kafka (Can be customer-hosted), Loader (Available 2021Q4)
  • 14. TigerGraph Connector Framework: Benefits
      ● Full control of data ingestion process
        ○ Throttle intake based on capacity
        ○ Pause as needed
        ○ Resume and restart data ingestion jobs as needed
      ● Flexibility of system deployment
        ○ Works with natively deployed Kafka in the TigerGraph cluster
        ○ Allows customers to leverage existing TigerGraph with drop-in integration with an external Kafka cluster
      ● Push down ETL capabilities
        ○ Users can apply data transformations through loader support for UDFs
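
The slides do not say how pause, resume and restart are exposed in TigerGraph. As a minimal sketch, assuming the connector runs on a standard Kafka Connect worker, the stock Kafka Connect REST interface offers these controls through `PUT /connectors/{name}/pause`, `PUT /connectors/{name}/resume` and `POST /connectors/{name}/restart`. The worker URL and the connector name `tg-s3-source` below are hypothetical, and the code only builds the requests without sending them. Throttling is not covered here because it has no equivalent endpoint.

```python
from urllib.request import Request

CONNECT_URL = "http://localhost:8083"  # assumption: Connect worker on its default REST port
CONNECTOR = "tg-s3-source"             # hypothetical connector name

# HTTP method used by each Kafka Connect lifecycle endpoint.
_METHODS = {"pause": "PUT", "resume": "PUT", "restart": "POST"}


def control_request(action: str, connector: str = CONNECTOR,
                    base_url: str = CONNECT_URL) -> Request:
    """Build (but do not send) a Kafka Connect REST request for a lifecycle action."""
    if action not in _METHODS:
        raise ValueError(f"unsupported action: {action!r}")
    url = f"{base_url}/connectors/{connector}/{action}"
    return Request(url, method=_METHODS[action])


pause = control_request("pause")
resume = control_request("resume")
restart = control_request("restart")
```

Passing one of these objects to `urllib.request.urlopen` would send it to a running Connect worker.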
  • 15. Easy integration of data sources. Diagram labels: Kafka Loader, Kafka Connect + Data source connector
  • 17. Current Use of TigerGraph Connector Framework. Diagram labels: AWS S3, TigerGraph Cluster, Kafka Connect, Kafka, User Input, Language Server, GraphStudio (browser), Kafka Stream, GSQL CLI
  • 18. S3 Loading Job through GSQL: Define the Data Source
      ● CREATE DATA_SOURCE S3 s = "/path/to/s3.config"
      ● s3.config:
        {
          "file.reader.settings.fs.s3a.access.key": "AKIAJ****4YGHQ",
          "file.reader.settings.fs.s3a.secret.key": "R8bli****p+dT4"
        }
  • 19. S3 Loading Job through GSQL: Create a Loading Job
      ● files.config:
        { "file.uris": "s3://my-bucket/data.csv" }
      ● loading_job.gsql:
        CREATE LOADING JOB job1 FOR GRAPH my_graph {
          DEFINE FILENAME f = "$s:/path/to/files.config";
          LOAD f TO VERTEX v1 VALUES ($0, $1, $2);
        }
  • 20. S3 Loading Job through GSQL: Run the Loading Job
      ● RUN LOADING JOB job1
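
The two JSON files used by the GSQL S3 loading job (`s3.config` for credentials and `files.config` for the file URI) could be generated rather than hand-written. The sketch below is one way to do that in Python. The JSON keys are the ones shown on the slides. The helper names, the environment variable names and the placeholder fallbacks are assumptions made for illustration.

```python
import json
import os


def s3_data_source_config(access_key: str, secret_key: str) -> str:
    """Render the s3.config body referenced by CREATE DATA_SOURCE."""
    return json.dumps(
        {
            "file.reader.settings.fs.s3a.access.key": access_key,
            "file.reader.settings.fs.s3a.secret.key": secret_key,
        },
        indent=2,
    )


def files_config(uri: str) -> str:
    """Render the files.config body referenced by DEFINE FILENAME."""
    if not uri.startswith("s3://"):
        raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return json.dumps({"file.uris": uri})


# Assumption: credentials come from the environment instead of being hard-coded.
access_key = os.environ.get("AWS_ACCESS_KEY_ID", "<access-key>")
secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY", "<secret-key>")

s3_config_text = s3_data_source_config(access_key, secret_key)
files_config_text = files_config("s3://my-bucket/data.csv")
```

Writing `s3_config_text` and `files_config_text` to the paths named in the `CREATE DATA_SOURCE` and `DEFINE FILENAME` statements would reproduce the files from the slides while keeping secrets out of version control.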
  • 21. S3 Loading Job through GraphStudio: Define the Data Source
  • 22. S3 Loading Job through GraphStudio: Map Data Files to Vertex type or Edge type
  • 23. S3 Loading Job through GraphStudio: Map Data columns to Vertex or Edge attributes
  • 24. S3 Loading Job through GraphStudio: Run the Loading Job