SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
Open-source vs.
public cloud.
Friends or Foes?
Adam Kawa, GetInData
First Decade
of Big Data
2006 - 2016.
Some of dilemmas
● Hive MR/Tez/LLAP, Impala, Presto, Spark SQL, Drill
● Oozie, Azkaban, Luigi, Airflow
● Kafka, Flume
● Storm, Spark Streaming, Samza, Flink
● Cloudera, Hortonworks, MapR
● GCP, AWS, Azure
With or without cloud
● Amazon Elastic MapReduce launched in April 2009
● Most of the companies started on-premise
○ Y!, FB, Twitter, LI, Allegro, ING
● Several companies were happy with the cloud
○ Netflix, Base CRM, Wikia
● A few companies went back and forth
○ Spotify, Foursquare, Synerise
A breaking news in 2015
Open-source
vs. public cloud
2016 - 2019.
Dark clouds over open-source
1. Provides proprietary & vendor lock-in products
○ Kinesis, PubSub, Redshift, BigQuery
Dark clouds over open-source
1. Provides proprietary & vendor lock-in products
○ Kinesis, PubSub, Redshift, BigQuery
2. Strip-mines open-source projects
○ Amazon Managed Streaming for Kafka (MSK)
○ Amazon Elasticsearch Service
○ Cloud Composer (Airflow), Cloud Datalab (Jupyter)
Dark clouds over open-source
1. Provides proprietary & vendor lock-in products
○ Kinesis, PubSub, Redshift, BigQuery
2. Strip-mines open-source projects
○ Amazon Managed Streaming for Kafka (MSK)
○ Amazon Elasticsearch Service
○ Cloud Composer (Airflow), Cloud Datalab (Jupyter)
3. Releases open-source projects to be run on the cloud
○ Beam, Tensorflow, Kubernetes
○ API compatibility (Dataflow, BigTable)
Run software in
containers on
premises and in the
cloud
Open-source responses
● Self-defense licenses
○ Confluent (Kafka), MongoDB, Redis
● Own cloud-based products
○ Databricks (Spark), Confluent (Kafka), Cloudera
(Hadoop)
● Cloud-first approach
○ That code is released first to their cloud offering
Current trends
1. More companies start using or evaluating the
cloud
2. Companies usually avoid or sometimes accept
vendor lock-in
3. Open-source projects become cloud-first and
cloud-ready
4. Kubernetes provides a portability layer
Open-source
& public cloud
Mid 2019.
Native open-source
services on Google
Cloud Platform
● a seamless sign-up
experience
● an integrated billing
● the 1st
line support
provided directly by
Google Cloud
Win! Win! Win!
Open-source
communities
and vendors
Cloud providers Customers
Open
-source
cloud
Three
Case Studies
Cloud-ready
data
analytics
platform
Mix of
open-source
and cloud
● Kafka
● Google Cloud Storage +
DataProc (Spark)
● Hive, Presto & BigQuery
● Tableau
Hybrid
environment
(test and prod)
● Real-time user journey
analytics and interactions
○ Kafka, Flink, Hive,
Docker
● Development in the cloud
● Production on=premise
Open-source vs. public cloud in the Big Data landscape. Friends or Foes?

Contenu connexe

Tendances

Season 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data ScientistsSeason 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data Scientists
aspyker
 

Tendances (20)

Serverless Swift for Mobile Developers
Serverless Swift for Mobile DevelopersServerless Swift for Mobile Developers
Serverless Swift for Mobile Developers
 
Tim Hall [InfluxData] | InfluxDB Roadmap | InfluxDays Virtual Experience Lond...
Tim Hall [InfluxData] | InfluxDB Roadmap | InfluxDays Virtual Experience Lond...Tim Hall [InfluxData] | InfluxDB Roadmap | InfluxDays Virtual Experience Lond...
Tim Hall [InfluxData] | InfluxDB Roadmap | InfluxDays Virtual Experience Lond...
 
Serverless stream processing of Debezium data change events with Knative | De...
Serverless stream processing of Debezium data change events with Knative | De...Serverless stream processing of Debezium data change events with Knative | De...
Serverless stream processing of Debezium data change events with Knative | De...
 
Advanced dev ops governance with terraform
Advanced dev ops governance with terraformAdvanced dev ops governance with terraform
Advanced dev ops governance with terraform
 
Measure and Increase Developer Productivity with Help of Serverless at JCON 2...
Measure and Increase Developer Productivity with Help of Serverless at JCON 2...Measure and Increase Developer Productivity with Help of Serverless at JCON 2...
Measure and Increase Developer Productivity with Help of Serverless at JCON 2...
 
Datadog- Monitoring In Motion
Datadog- Monitoring In Motion Datadog- Monitoring In Motion
Datadog- Monitoring In Motion
 
Time Series Tech Stack for the IoT Edge
Time Series Tech Stack for the IoT EdgeTime Series Tech Stack for the IoT Edge
Time Series Tech Stack for the IoT Edge
 
Season 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data ScientistsSeason 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data Scientists
 
Maximilian Michels - Flink and Beam
Maximilian Michels - Flink and BeamMaximilian Michels - Flink and Beam
Maximilian Michels - Flink and Beam
 
CS80A Foothill College Open Source Talk
CS80A Foothill College Open Source TalkCS80A Foothill College Open Source Talk
CS80A Foothill College Open Source Talk
 
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkA Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
 
Persist your data in an ephemeral k8 ecosystem
Persist your data in an ephemeral k8 ecosystemPersist your data in an ephemeral k8 ecosystem
Persist your data in an ephemeral k8 ecosystem
 
Dok Talks #111 - Scheduled Scaling with Dask and Argo Workflows
Dok Talks #111 - Scheduled Scaling with Dask and Argo WorkflowsDok Talks #111 - Scheduled Scaling with Dask and Argo Workflows
Dok Talks #111 - Scheduled Scaling with Dask and Argo Workflows
 
NetflixOSS and ZeroToDocker Talk
NetflixOSS and ZeroToDocker TalkNetflixOSS and ZeroToDocker Talk
NetflixOSS and ZeroToDocker Talk
 
The service mesh management plane
The service mesh management planeThe service mesh management plane
The service mesh management plane
 
NetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, KayentaNetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, Kayenta
 
Using the FLaNK Stack for edge ai (apache mxnet, apache flink, apache nifi, a...
Using the FLaNK Stack for edge ai (apache mxnet, apache flink, apache nifi, a...Using the FLaNK Stack for edge ai (apache mxnet, apache flink, apache nifi, a...
Using the FLaNK Stack for edge ai (apache mxnet, apache flink, apache nifi, a...
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Source
 
Container Monitoring Best Practices Using AWS and InfluxData by Gunnar Aasen
Container Monitoring Best Practices Using AWS and InfluxData by Gunnar AasenContainer Monitoring Best Practices Using AWS and InfluxData by Gunnar Aasen
Container Monitoring Best Practices Using AWS and InfluxData by Gunnar Aasen
 
Cloud native policy enforcement with Open Policy Agent
Cloud native policy enforcement with Open Policy AgentCloud native policy enforcement with Open Policy Agent
Cloud native policy enforcement with Open Policy Agent
 

Similaire à Open-source vs. public cloud in the Big Data landscape. Friends or Foes?

LinuxFest NW 2013: Hitchhiker's Guide to Open Source Cloud Computing
LinuxFest NW 2013: Hitchhiker's Guide to Open Source Cloud ComputingLinuxFest NW 2013: Hitchhiker's Guide to Open Source Cloud Computing
LinuxFest NW 2013: Hitchhiker's Guide to Open Source Cloud Computing
Mark Hinkle
 
Data Con LA 2022-Open Source or Open Core in Your Data Layer? What Needs to B...
Data Con LA 2022-Open Source or Open Core in Your Data Layer? What Needs to B...Data Con LA 2022-Open Source or Open Core in Your Data Layer? What Needs to B...
Data Con LA 2022-Open Source or Open Core in Your Data Layer? What Needs to B...
Data Con LA
 
Cf intro for spring devs
Cf intro for spring devsCf intro for spring devs
Cf intro for spring devs
Eric Bottard
 
Software Strategy
Software StrategySoftware Strategy
Software Strategy
Alec Shutze
 
Cloud Foundry Introduction and Overview
Cloud Foundry Introduction and OverviewCloud Foundry Introduction and Overview
Cloud Foundry Introduction and Overview
Andy Piper
 

Similaire à Open-source vs. public cloud in the Big Data landscape. Friends or Foes? (20)

Cloud computing
Cloud computingCloud computing
Cloud computing
 
LinuxFest NW 2013: Hitchhiker's Guide to Open Source Cloud Computing
LinuxFest NW 2013: Hitchhiker's Guide to Open Source Cloud ComputingLinuxFest NW 2013: Hitchhiker's Guide to Open Source Cloud Computing
LinuxFest NW 2013: Hitchhiker's Guide to Open Source Cloud Computing
 
Data Con LA 2022-Open Source or Open Core in Your Data Layer? What Needs to B...
Data Con LA 2022-Open Source or Open Core in Your Data Layer? What Needs to B...Data Con LA 2022-Open Source or Open Core in Your Data Layer? What Needs to B...
Data Con LA 2022-Open Source or Open Core in Your Data Layer? What Needs to B...
 
Cf intro for spring devs
Cf intro for spring devsCf intro for spring devs
Cf intro for spring devs
 
Software Strategy
Software StrategySoftware Strategy
Software Strategy
 
Self-Service Supercomputing
Self-Service SupercomputingSelf-Service Supercomputing
Self-Service Supercomputing
 
Linux Foundation Collaboration Summit: Hitchhiker's Guide to the Cloud
Linux Foundation Collaboration Summit: Hitchhiker's Guide to the CloudLinux Foundation Collaboration Summit: Hitchhiker's Guide to the Cloud
Linux Foundation Collaboration Summit: Hitchhiker's Guide to the Cloud
 
Bridge to Cloud: Using Apache Kafka to Migrate to AWS
Bridge to Cloud: Using Apache Kafka to Migrate to AWSBridge to Cloud: Using Apache Kafka to Migrate to AWS
Bridge to Cloud: Using Apache Kafka to Migrate to AWS
 
Python on AWS Lambda
Python on AWS Lambda Python on AWS Lambda
Python on AWS Lambda
 
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...
 
Avoiding cloud lock-in
Avoiding cloud lock-inAvoiding cloud lock-in
Avoiding cloud lock-in
 
Cloud Foundry Introduction and Overview
Cloud Foundry Introduction and OverviewCloud Foundry Introduction and Overview
Cloud Foundry Introduction and Overview
 
Cloud computing overview
Cloud computing overviewCloud computing overview
Cloud computing overview
 
Keeping your options open
Keeping your options openKeeping your options open
Keeping your options open
 
Hybrid Cloud Service on the ThinkFree Mobile Android Platform
Hybrid Cloud Service on the ThinkFree Mobile Android PlatformHybrid Cloud Service on the ThinkFree Mobile Android Platform
Hybrid Cloud Service on the ThinkFree Mobile Android Platform
 
Future of Open Source in a Cloudy World
Future of Open Source in a Cloudy WorldFuture of Open Source in a Cloudy World
Future of Open Source in a Cloudy World
 
20091012 Command Your Data
20091012 Command Your Data20091012 Command Your Data
20091012 Command Your Data
 
Cloud computing.pptx
Cloud computing.pptxCloud computing.pptx
Cloud computing.pptx
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
 
Internet6: A Digital Game Changer
Internet6: A Digital Game ChangerInternet6: A Digital Game Changer
Internet6: A Digital Game Changer
 

Plus de GetInData

How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...
GetInData
 
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr MenclewiczData-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
GetInData
 
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
GetInData
 
Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...
GetInData
 
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
GetInData
 
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
GetInData
 

Plus de GetInData (20)

How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...
 
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr MenclewiczData-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
 
How NOT to win a Kaggle competition
How NOT to win a Kaggle competitionHow NOT to win a Kaggle competition
How NOT to win a Kaggle competition
 
How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team? How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team?
 
OpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easierOpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easier
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML Platform
 
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...
 
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInDataFeast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
 
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
 
Big data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInDataBig data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInData
 
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
 
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
 
Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...
 
Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...
 
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
 
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInDataStrategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
 
Monitoring environment based on satellite data with Python and PySpark - Albe...
Monitoring environment based on satellite data with Python and PySpark - Albe...Monitoring environment based on satellite data with Python and PySpark - Albe...
Monitoring environment based on satellite data with Python and PySpark - Albe...
 
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
 

Dernier

Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
gajnagarg
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
ahmedjiabur940
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
ranjankumarbehera14
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
chadhar227
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
Health
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
gajnagarg
 

Dernier (20)

Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about them
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 

Open-source vs. public cloud in the Big Data landscape. Friends or Foes?

  • 1. Open-source vs. public cloud. Friends or Foes? Adam Kawa, GetInData
  • 2. First Decade of Big Data 2006 - 2016.
  • 3.
  • 4. Some of dilemmas ● Hive MR/Tez/LLAP, Impala, Presto, Spark SQL, Drill ● Oozie, Azkaban, Luigi, Airflow ● Kafka, Flume ● Storm, Spark Streaming, Samza, Flink ● Cloudera, Hortonworks, MapR ● GCP, AWS, Azure
  • 5. With or without cloud ● Amazon Elastic MapReduce launched in April 2009 ● Most of the companies started on-premise ○ Y!, FB, Twitter, LI, Allegro, ING ● Several companies were happy with the cloud ○ Netflix, Base CRM, Wikia ● A few companies went back and forth ○ Spotify, Foursquare, Synerise
  • 6. A breaking news in 2015
  • 7.
  • 9. Dark clouds over open-source 1. Provides proprietary & vendor lock-in products ○ Kinesis, PubSub, Redshift, BigQuery
  • 10. Dark clouds over open-source 1. Provides proprietary & vendor lock-in products ○ Kinesis, PubSub, Redshift, BigQuery 2. Strip-mines open-source projects ○ Amazon Managed Streaming for Kafka (MSK) ○ Amazon Elasticsearch Service ○ Cloud Composer (Airflow), Cloud Datalab (Jupyter)
  • 11. Dark clouds over open-source 1. Provides proprietary & vendor lock-in products ○ Kinesis, PubSub, Redshift, BigQuery 2. Strip-mines open-source projects ○ Amazon Managed Streaming for Kafka (MSK) ○ Amazon Elasticsearch Service ○ Cloud Composer (Airflow), Cloud Datalab (Jupyter) 3. Releases open-source projects to be run on the cloud ○ Beam, Tensorflow, Kubernetes ○ API compatibility (Dataflow, BigTable)
  • 12. Run software in containers on premises and in the cloud
  • 13. Open-source responses ● Self-defense licenses ○ Confluent (Kafka), MongoDB, Redis ● Own cloud-based products ○ Databricks (Spark), Confluent (Kafka), Cloudera (Hadoop) ● Cloud-first approach ○ That code is released first to their cloud offering
  • 14. Current trends 1. More companies start using or evaluating the cloud 2. Companies usually avoid or sometimes accept vendor lock-in 3. Open-source projects become cloud-first and cloud-ready 4. Kubernetes provides a portability layer
  • 16. Native open-source services on Google Cloud Platform ● a seamless sign-up experience ● an integrated billing ● the 1st line support provided directly by Google Cloud
  • 17. Win! Win! Win! Open-source communities and vendors Cloud providers Customers
  • 21. Mix of open-source and cloud ● Kafka ● Google Cloud Storage + DataProc (Spark) ● Hive, Presto & BigQuery ● Tableau
  • 22. Hybrid environment (test and prod) ● Real-time user journey analytics and interactions ○ Kafka, Flink, Hive, Docker ● Development in the cloud ● Production on=premise