SlideShare une entreprise Scribd logo
1  sur  18
Télécharger pour lire hors ligne
Build DMP on top of
GCP
VMFive - Randy Huang
Agenda
• Migrated Pipeline to GCP
• Cost Comparison
• Business Use Case
• Fluentd Demo
ELK + AWS EMR
Kinesis Lambda
Pros & Cons
• Pros :
• Well Support.
• Well docs.
• Easy to find Reference.
• Cons :
• High Cost.
• Not open source.
• Have to set the scale at first.
Pipeline on GCP
Dataflow
BigQuery
Machine Learning
Data Visualization
Compute Engine
Global Load Balancing
Datastudio
The Products and Services logos may be used to accurately reference Google's technology and tools, for instance in architecture diagrams. 7
Batch
BI Analysis
Storage

Cloud Storage
Processing

Cloud DataflowStreaming
Time Series Streaming

Cloud Pub/Sub
Storage

BigQuery
The Products and Services logos may be used to accurately reference Google's technology and tools, for instance in architecture diagrams. 8
Targeting Engines
Data Sources
Machine Learning
Applications
API Backend

Compute Engine
Spark MLlib

Cloud Dataproc
App Engine
Transform Data
Hosted Models

Cloud Machine Learning
Real-Time

Prediction API
Device Related

Cloud Pub/Sub
Behavior Related

Cloud Pub/Sub
3rd Party Data

Cloud Pub/Sub
Redis

Compute Engine
Pros & Cons
• Pros :
• Cost-effective.
• Operation-effective.
• Google got your back.
• Cons :
• API/SDK changes everyday.
• Some still in beta mode.
• Docs everywhere.
Workflow Monitoring
• Digdag <Airflow/Oozie/Luigi>
• Native support Python & Ruby
• Multi-Cloud
• Modular
• Workflow as code
• Docker Support
• Altering to Slack
Digdag Sample
Digdag
Cost Comparison
• $2000 on AWS per month
• about $200 on GCP production
• about another $200 for dev
• 50M events per month
Business Use Case
• Digital Ads Targeting
• User Behavior Tagging
• BI
• GEO Reporting
• KPI Reporting
• User Demographic
Some Tips
• BigQuery
• https://status.cloud.google.com/incident/bigquery/
18022
• Solved by Fluentd’s Retry and HA
• Dataflow’s SDK & docs is not sync
• Dataflow Sideinput has a bug with Streaming mode
• Compute Engine SLB - TCP/UDP setup for forwarding
Flunetd Update
• Release note for v0.14
• sub second event flush
• New Plugin APIS
support formatting configurations dynamically
(e.g., path /my/dest/${tag}/mydata.%Y-%m-%d.log)
• Secure Forward
Demo
• Nginx -> Fluentd -> BigQuery -> DataStudio
• MySQL -> Fluentd -> BigQuery

Contenu connexe

Tendances

10 Things Learned Releasing Databricks Enterprise Wide
10 Things Learned Releasing Databricks Enterprise Wide10 Things Learned Releasing Databricks Enterprise Wide
10 Things Learned Releasing Databricks Enterprise WideDatabricks
 
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...HostedbyConfluent
 
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...Brittany Ingram
 
Introduction to Modern DevOps Technologies
Introduction to  Modern DevOps TechnologiesIntroduction to  Modern DevOps Technologies
Introduction to Modern DevOps TechnologiesKriangkrai Chaonithi
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow ArchitectureGerard Toonstra
 
Presto Summit 2018 - 03 - Starburst CBO
Presto Summit 2018  - 03 - Starburst CBOPresto Summit 2018  - 03 - Starburst CBO
Presto Summit 2018 - 03 - Starburst CBOkbajda
 
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...InfluxData
 
Elastic Stack Basic - All The Capabilities in 6.3!
Elastic Stack Basic - All The Capabilities in 6.3!Elastic Stack Basic - All The Capabilities in 6.3!
Elastic Stack Basic - All The Capabilities in 6.3!brad_quarry
 
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...HostedbyConfluent
 
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...HostedbyConfluent
 
Logging in The World of DevOps
Logging in The World of DevOps Logging in The World of DevOps
Logging in The World of DevOps DevOps Indonesia
 
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...HostedbyConfluent
 
Real-Time Vote Platform Benchmark
Real-Time Vote Platform BenchmarkReal-Time Vote Platform Benchmark
Real-Time Vote Platform BenchmarkLahav Savir
 
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...Databricks
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containerskbajda
 
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...HostedbyConfluent
 
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...InfluxData
 
Elastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using ConfluentElastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using Confluentconfluent
 

Tendances (20)

10 Things Learned Releasing Databricks Enterprise Wide
10 Things Learned Releasing Databricks Enterprise Wide10 Things Learned Releasing Databricks Enterprise Wide
10 Things Learned Releasing Databricks Enterprise Wide
 
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet...
 
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
 
Introduction to knime
Introduction to knimeIntroduction to knime
Introduction to knime
 
Google Cloud Platform
Google Cloud PlatformGoogle Cloud Platform
Google Cloud Platform
 
Introduction to Modern DevOps Technologies
Introduction to  Modern DevOps TechnologiesIntroduction to  Modern DevOps Technologies
Introduction to Modern DevOps Technologies
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow Architecture
 
Presto Summit 2018 - 03 - Starburst CBO
Presto Summit 2018  - 03 - Starburst CBOPresto Summit 2018  - 03 - Starburst CBO
Presto Summit 2018 - 03 - Starburst CBO
 
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
Aengus Rooney [Grafana] | What's New with Grafana and InfluxDB | InfluxDays E...
 
Elastic Stack Basic - All The Capabilities in 6.3!
Elastic Stack Basic - All The Capabilities in 6.3!Elastic Stack Basic - All The Capabilities in 6.3!
Elastic Stack Basic - All The Capabilities in 6.3!
 
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
 
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
 
Logging in The World of DevOps
Logging in The World of DevOps Logging in The World of DevOps
Logging in The World of DevOps
 
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai...
 
Real-Time Vote Platform Benchmark
Real-Time Vote Platform BenchmarkReal-Time Vote Platform Benchmark
Real-Time Vote Platform Benchmark
 
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containers
 
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
 
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
Martin Moucka [Red Hat] | How Red Hat Uses gNMI, Telegraf and InfluxDB to Gai...
 
Elastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using ConfluentElastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using Confluent
 

En vedette

A Microservice Architecture for Big Data Pipelines
A Microservice Architecture for Big Data PipelinesA Microservice Architecture for Big Data Pipelines
A Microservice Architecture for Big Data PipelinesDaniel Mescheder
 
Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data PipelineJesus Rodriguez
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solidLars Albertsson
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataDataWorks Summit/Hadoop Summit
 
Airflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data PipelinesAirflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data PipelinesDataWorks Summit
 
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & DataductAmazon Web Services
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesPetteriTeikariPhD
 
Building a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakBuilding a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakHakka Labs
 

En vedette (9)

A Microservice Architecture for Big Data Pipelines
A Microservice Architecture for Big Data PipelinesA Microservice Architecture for Big Data Pipelines
A Microservice Architecture for Big Data Pipelines
 
Building data pipelines
Building data pipelinesBuilding data pipelines
Building data pipelines
 
Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data Pipeline
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solid
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
Airflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data PipelinesAirflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data Pipelines
 
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and Kubernetes
 
Building a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakBuilding a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe Crobak
 

Similaire à The journey of Moving from AWS ELK to GCP Data Pipeline

GCP Meetup #3 - Approaches to Cloud Native Architectures
GCP Meetup #3 - Approaches to Cloud Native ArchitecturesGCP Meetup #3 - Approaches to Cloud Native Architectures
GCP Meetup #3 - Approaches to Cloud Native Architecturesnine
 
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...Openbar
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryMárton Kodok
 
Introduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsIntroduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsNilanchal
 
DevOps KPIs as a Service: Daimler’s Solution
DevOps KPIs as a Service: Daimler’s SolutionDevOps KPIs as a Service: Daimler’s Solution
DevOps KPIs as a Service: Daimler’s SolutionVMware Tanzu
 
Serverless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBMServerless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBMRightScale
 
Code first in the cloud: going serverless with Azure
Code first in the cloud: going serverless with AzureCode first in the cloud: going serverless with Azure
Code first in the cloud: going serverless with AzureJeremy Likness
 
Building real-time data analytics on Google Cloud
Building real-time data analytics on Google CloudBuilding real-time data analytics on Google Cloud
Building real-time data analytics on Google CloudJonny Daenen
 
A Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROIA Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROIRightScale
 
Best practices for developing your Magento Commerce on Cloud
Best practices for developing your Magento Commerce on CloudBest practices for developing your Magento Commerce on Cloud
Best practices for developing your Magento Commerce on CloudOleg Posyniak
 
Gcp intro-20160721
Gcp intro-20160721Gcp intro-20160721
Gcp intro-20160721Haeseung Lee
 
Accessing Google Cloud APIs
Accessing Google Cloud APIsAccessing Google Cloud APIs
Accessing Google Cloud APIswesley chun
 
Exploring Google APIs with Python
Exploring Google APIs with PythonExploring Google APIs with Python
Exploring Google APIs with Pythonwesley chun
 
Session 4 GCCP.pptx
Session 4 GCCP.pptxSession 4 GCCP.pptx
Session 4 GCCP.pptxDSCIITPatna
 
Kube journey 2017-04-19
Kube journey   2017-04-19Kube journey   2017-04-19
Kube journey 2017-04-19Doug Davis
 
Azure and web sites hackaton deck
Azure and web sites hackaton deckAzure and web sites hackaton deck
Azure and web sites hackaton deckAlexey Bokov
 
Powerful Google Cloud tools for your hack
Powerful Google Cloud tools for your hackPowerful Google Cloud tools for your hack
Powerful Google Cloud tools for your hackwesley chun
 
Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3Simon Su
 
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送Google Cloud Platform - Japan
 
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptxGangTingFan
 

Similaire à The journey of Moving from AWS ELK to GCP Data Pipeline (20)

GCP Meetup #3 - Approaches to Cloud Native Architectures
GCP Meetup #3 - Approaches to Cloud Native ArchitecturesGCP Meetup #3 - Approaches to Cloud Native Architectures
GCP Meetup #3 - Approaches to Cloud Native Architectures
 
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
 
Introduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsIntroduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / Platforms
 
DevOps KPIs as a Service: Daimler’s Solution
DevOps KPIs as a Service: Daimler’s SolutionDevOps KPIs as a Service: Daimler’s Solution
DevOps KPIs as a Service: Daimler’s Solution
 
Serverless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBMServerless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBM
 
Code first in the cloud: going serverless with Azure
Code first in the cloud: going serverless with AzureCode first in the cloud: going serverless with Azure
Code first in the cloud: going serverless with Azure
 
Building real-time data analytics on Google Cloud
Building real-time data analytics on Google CloudBuilding real-time data analytics on Google Cloud
Building real-time data analytics on Google Cloud
 
A Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROIA Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROI
 
Best practices for developing your Magento Commerce on Cloud
Best practices for developing your Magento Commerce on CloudBest practices for developing your Magento Commerce on Cloud
Best practices for developing your Magento Commerce on Cloud
 
Gcp intro-20160721
Gcp intro-20160721Gcp intro-20160721
Gcp intro-20160721
 
Accessing Google Cloud APIs
Accessing Google Cloud APIsAccessing Google Cloud APIs
Accessing Google Cloud APIs
 
Exploring Google APIs with Python
Exploring Google APIs with PythonExploring Google APIs with Python
Exploring Google APIs with Python
 
Session 4 GCCP.pptx
Session 4 GCCP.pptxSession 4 GCCP.pptx
Session 4 GCCP.pptx
 
Kube journey 2017-04-19
Kube journey   2017-04-19Kube journey   2017-04-19
Kube journey 2017-04-19
 
Azure and web sites hackaton deck
Azure and web sites hackaton deckAzure and web sites hackaton deck
Azure and web sites hackaton deck
 
Powerful Google Cloud tools for your hack
Powerful Google Cloud tools for your hackPowerful Google Cloud tools for your hack
Powerful Google Cloud tools for your hack
 
Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3
 
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
[Cloud OnAir] Talks by DevRel Vol.4 データ管理とデータ ベース 2020年8月27日 放送
 
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
!GDSC NYUST Infrastructure and Application Modernization with Google Cloud .pptx
 

Dernier

Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756dollysharma2066
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfRagavanV2
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduitsrknatarajan
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . pptDineshKumar4165
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLManishPatel169454
 

Dernier (20)

Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
 
NFPA 5000 2024 standard .
NFPA 5000 2024 standard                                  .NFPA 5000 2024 standard                                  .
NFPA 5000 2024 standard .
 

The journey of Moving from AWS ELK to GCP Data Pipeline

  • 1. Build DMP on top of GCP VMFive - Randy Huang
  • 2. Agenda • Migrated Pipeline to GCP • Cost Comparison • Business Use Case • Fluentd Demo
  • 3. ELK + AWS EMR Kinesis Lambda
  • 4. Pros & Cons • Pros : • Well Support. • Well docs. • Easy to find Reference. • Cons : • High Cost. • Not open source. • Have to set the scale at first.
  • 5. Pipeline on GCP Dataflow BigQuery Machine Learning Data Visualization Compute Engine Global Load Balancing
  • 7. The Products and Services logos may be used to accurately reference Google's technology and tools, for instance in architecture diagrams. 7 Batch BI Analysis Storage
 Cloud Storage Processing
 Cloud DataflowStreaming Time Series Streaming
 Cloud Pub/Sub Storage
 BigQuery
  • 8. The Products and Services logos may be used to accurately reference Google's technology and tools, for instance in architecture diagrams. 8 Targeting Engines Data Sources Machine Learning Applications API Backend
 Compute Engine Spark MLlib
 Cloud Dataproc App Engine Transform Data Hosted Models
 Cloud Machine Learning Real-Time
 Prediction API Device Related
 Cloud Pub/Sub Behavior Related
 Cloud Pub/Sub 3rd Party Data
 Cloud Pub/Sub Redis
 Compute Engine
  • 9. Pros & Cons • Pros : • Cost-effective. • Operation-effective. • Google got your back. • Cons : • API/SDK changes everyday. • Some still in beta mode. • Docs everywhere.
  • 10. Workflow Monitoring • Digdag <Airflow/Oozie/Luigi> • Native support Python & Ruby • Multi-Cloud • Modular • Workflow as code • Docker Support • Altering to Slack
  • 13.
  • 14. Cost Comparison • $2000 on AWS per month • about $200 on GCP production • about another $200 for dev • 50M events per month
  • 15. Business Use Case • Digital Ads Targeting • User Behavior Tagging • BI • GEO Reporting • KPI Reporting • User Demographic
  • 16. Some Tips • BigQuery • https://status.cloud.google.com/incident/bigquery/ 18022 • Solved by Fluentd’s Retry and HA • Dataflow’s SDK & docs is not sync • Dataflow Sideinput has a bug with Streaming mode • Compute Engine SLB - TCP/UDP setup for forwarding
  • 17. Flunetd Update • Release note for v0.14 • sub second event flush • New Plugin APIS support formatting configurations dynamically (e.g., path /my/dest/${tag}/mydata.%Y-%m-%d.log) • Secure Forward
  • 18. Demo • Nginx -> Fluentd -> BigQuery -> DataStudio • MySQL -> Fluentd -> BigQuery