SlideShare une entreprise Scribd logo
1  sur  16
Télécharger pour lire hors ligne
Cassandra+Spark+ELK
Dmitriy Kalyada @ 2015
What is Spark?
• Master: Driver program
• Workers: Executors
• High Availability
• Standby Masters with
ZooKeeper
• Single-Node Recovery with
Local File System
Under the hood
• Resilient Distributed Dataset
(RDD)
• Scala + Akka Framework
• Java, Scala, Python API
• Spark SQL, MLib, Spark
Streaming, GraphX
Our particular case
Devices
Cassandra
Spark
ELK
Data flow
Fetcher Transformer Saver
Input Source(s)
x-RDD x-RDD
Output Source
Spark Cassandra Connector
• Represents Cassandra tables as Spark RDDs
• Write Spark RDDs to Cassandra tables
• Execute CQL queries in Spark applications
https://github.com/datastax/spark-cassandra-connector
CassandraRDD settings
• Connection params
• Fetching params
1. input.split.size: C* partitions in a Spark
Partition.
2. input.page.row.size: number of CQL rows
fetched per roundtrip.
Fetching essentials
…
…
…
…
-968391295277638458 … -893783532241185833
-968391295277638458, -893783532241185833
-7378580094811526501, -7340240117176401239
6426215139012569257, 6428979455828914106
-6094480671546553265, -6016282219056649738
-7259249675596554667, -7237838231745167324
-6734336817058726139, -6684208157211348972
-3891103372671105499, -3822513456325086923
4453206019575747361,4462441725813855391
7855385326468991461,7906589648045207141
-129433796439502583,-101280166181350027
-2233788032218452383,-2066644620711092198
3248662132571799756,3396129453515776704
7744134136205124749,7812918342246679728
-1408208314239486033,-1403736406052004344
• Support
Murmur3Partitioner
and
RandomPartitioner
• Retrieve token ranges
from Cassandra
• Prediction on base of 16
random token ranges
Data to RDD
…
Tokens Per RDD
[input.split.size]
Token Range #N
Slurp amount
[input.page.row.size]
Token range vs rows number
What to do?
• Change read strategy
• Split data on a smaller pieces
• Increase cluster strength
• Reorganize Cassandra schema
Elastic Search
Elastic Search & Kibana
• Index initialization: TransportClient
• Create/Delete Index
• Setup Mappings
• Indexing: ScalaEsRDD
• Data presentation: Kibana
Kibana
Deployment
• Build package: Spark Job + Dependent Jars +
Configs
• Upload to the Spark Master Node
• Start job submit script
Thank you
dkaliada@exadel.com
Dmitriy Kalyada @ 2015

Contenu connexe

Tendances

Tendances (20)

Cassandra & Spark for IoT
Cassandra & Spark for IoTCassandra & Spark for IoT
Cassandra & Spark for IoT
 
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
 
Reactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkReactive dashboard’s using apache spark
Reactive dashboard’s using apache spark
 
Big Data Day LA 2015 - Sparking up your Cassandra Cluster- Analytics made Awe...
Big Data Day LA 2015 - Sparking up your Cassandra Cluster- Analytics made Awe...Big Data Day LA 2015 - Sparking up your Cassandra Cluster- Analytics made Awe...
Big Data Day LA 2015 - Sparking up your Cassandra Cluster- Analytics made Awe...
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and Cassandra
 
DataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL WorkshopDataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL Workshop
 
Real time data pipeline with spark streaming and cassandra with mesos
Real time data pipeline with spark streaming and cassandra with mesosReal time data pipeline with spark streaming and cassandra with mesos
Real time data pipeline with spark streaming and cassandra with mesos
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache Spark
 
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
 
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformHow We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
 
The How and Why of Fast Data Analytics with Apache Spark
The How and Why of Fast Data Analytics with Apache SparkThe How and Why of Fast Data Analytics with Apache Spark
The How and Why of Fast Data Analytics with Apache Spark
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
 
Spark Sql for Training
Spark Sql for TrainingSpark Sql for Training
Spark Sql for Training
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
 
Spark And Cassandra: 2 Fast, 2 Furious
Spark And Cassandra: 2 Fast, 2 FuriousSpark And Cassandra: 2 Fast, 2 Furious
Spark And Cassandra: 2 Fast, 2 Furious
 
Building a Lambda Architecture with Elasticsearch at Yieldbot
Building a Lambda Architecture with Elasticsearch at YieldbotBuilding a Lambda Architecture with Elasticsearch at Yieldbot
Building a Lambda Architecture with Elasticsearch at Yieldbot
 
Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...
Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...
Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...
 
Developing a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and SprayDeveloping a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and Spray
 
Fast NoSQL from HDDs?
Fast NoSQL from HDDs? Fast NoSQL from HDDs?
Fast NoSQL from HDDs?
 
Kindling: Getting Started with Spark and Cassandra
Kindling: Getting Started with Spark and CassandraKindling: Getting Started with Spark and Cassandra
Kindling: Getting Started with Spark and Cassandra
 

En vedette

Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
DataWorks Summit
 
Real Time Business Intelligence with Cassandra, Kafka and Hadoop - A Real Sto...
Real Time Business Intelligence with Cassandra, Kafka and Hadoop - A Real Sto...Real Time Business Intelligence with Cassandra, Kafka and Hadoop - A Real Sto...
Real Time Business Intelligence with Cassandra, Kafka and Hadoop - A Real Sto...
DataStax
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 

En vedette (20)

SMARTSTUDY Django 오픈 세션 2012-08
SMARTSTUDY Django 오픈 세션 2012-08SMARTSTUDY Django 오픈 세션 2012-08
SMARTSTUDY Django 오픈 세션 2012-08
 
BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache Cassandra
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache Spark
 
An Introduction to Distributed Search with Cassandra and Solr
An Introduction to Distributed Search with Cassandra and SolrAn Introduction to Distributed Search with Cassandra and Solr
An Introduction to Distributed Search with Cassandra and Solr
 
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
 
Spark to Production @Windward
Spark to Production @WindwardSpark to Production @Windward
Spark to Production @Windward
 
Cassandra vs. MongoDB
Cassandra vs. MongoDBCassandra vs. MongoDB
Cassandra vs. MongoDB
 
Solr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax EnterpriseSolr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax Enterprise
 
Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...
Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...
Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...
 
Storm and Cassandra
Storm and Cassandra Storm and Cassandra
Storm and Cassandra
 
Continuous Deployment: Beyond Continuous Delivery
Continuous Deployment: Beyond Continuous DeliveryContinuous Deployment: Beyond Continuous Delivery
Continuous Deployment: Beyond Continuous Delivery
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
 
Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
 
Real Time Business Intelligence with Cassandra, Kafka and Hadoop - A Real Sto...
Real Time Business Intelligence with Cassandra, Kafka and Hadoop - A Real Sto...Real Time Business Intelligence with Cassandra, Kafka and Hadoop - A Real Sto...
Real Time Business Intelligence with Cassandra, Kafka and Hadoop - A Real Sto...
 
Using Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with CassandraUsing Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with Cassandra
 
SASI, Cassandra on the full text search ride - DuyHai Doan - Codemotion Amste...
SASI, Cassandra on the full text search ride - DuyHai Doan - Codemotion Amste...SASI, Cassandra on the full text search ride - DuyHai Doan - Codemotion Amste...
SASI, Cassandra on the full text search ride - DuyHai Doan - Codemotion Amste...
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
 
Real time analytics using Hadoop and Elasticsearch
Real time analytics using Hadoop and ElasticsearchReal time analytics using Hadoop and Elasticsearch
Real time analytics using Hadoop and Elasticsearch
 

Similaire à Cassandra + Spark + Elk

Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks
Databricks
 
Jumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on DatabricksJumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on Databricks
Databricks
 
Spark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit EU talk by Miklos Christine paddling up the streamSpark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit
 

Similaire à Cassandra + Spark + Elk (20)

Migrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in PinterestMigrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
 
Apache Spark and DataStax Enablement
Apache Spark and DataStax EnablementApache Spark and DataStax Enablement
Apache Spark and DataStax Enablement
 
5 Ways to Use Spark to Enrich your Cassandra Environment
5 Ways to Use Spark to Enrich your Cassandra Environment5 Ways to Use Spark to Enrich your Cassandra Environment
5 Ways to Use Spark to Enrich your Cassandra Environment
 
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
 
Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks
 
Jumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on DatabricksJumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on Databricks
 
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
 
Apache Spark on HDinsight Training
Apache Spark on HDinsight TrainingApache Spark on HDinsight Training
Apache Spark on HDinsight Training
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Cassandra and Spark SQL
Cassandra and Spark SQLCassandra and Spark SQL
Cassandra and Spark SQL
 
Spark Introduction
Spark IntroductionSpark Introduction
Spark Introduction
 
Spark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit EU talk by Miklos Christine paddling up the streamSpark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit EU talk by Miklos Christine paddling up the stream
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fire
 
Homologous Apache Spark Clusters Using Nomad with Alex Dadgar
Homologous Apache Spark Clusters Using Nomad with Alex DadgarHomologous Apache Spark Clusters Using Nomad with Alex Dadgar
Homologous Apache Spark Clusters Using Nomad with Alex Dadgar
 
Introduction to Spark - DataFactZ
Introduction to Spark - DataFactZIntroduction to Spark - DataFactZ
Introduction to Spark - DataFactZ
 
11. From Hadoop to Spark 2/2
11. From Hadoop to Spark 2/211. From Hadoop to Spark 2/2
11. From Hadoop to Spark 2/2
 
Spark Summit East 2015 Advanced Devops Student Slides
Spark Summit East 2015 Advanced Devops Student SlidesSpark Summit East 2015 Advanced Devops Student Slides
Spark Summit East 2015 Advanced Devops Student Slides
 

Plus de Vasil Remeniuk

Spark Intro by Adform Research
Spark Intro by Adform ResearchSpark Intro by Adform Research
Spark Intro by Adform Research
Vasil Remeniuk
 
Types by Adform Research, Saulius Valatka
Types by Adform Research, Saulius ValatkaTypes by Adform Research, Saulius Valatka
Types by Adform Research, Saulius Valatka
Vasil Remeniuk
 
Types by Adform Research
Types by Adform ResearchTypes by Adform Research
Types by Adform Research
Vasil Remeniuk
 
Scalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex GryzlovScalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex Gryzlov
Vasil Remeniuk
 
Scalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex GryzlovScalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex Gryzlov
Vasil Remeniuk
 
Spark by Adform Research, Paulius
Spark by Adform Research, PauliusSpark by Adform Research, Paulius
Spark by Adform Research, Paulius
Vasil Remeniuk
 
Scala Style by Adform Research (Saulius Valatka)
Scala Style by Adform Research (Saulius Valatka)Scala Style by Adform Research (Saulius Valatka)
Scala Style by Adform Research (Saulius Valatka)
Vasil Remeniuk
 
Spark intro by Adform Research
Spark intro by Adform ResearchSpark intro by Adform Research
Spark intro by Adform Research
Vasil Remeniuk
 
SBT by Aform Research, Saulius Valatka
SBT by Aform Research, Saulius ValatkaSBT by Aform Research, Saulius Valatka
SBT by Aform Research, Saulius Valatka
Vasil Remeniuk
 

Plus de Vasil Remeniuk (20)

Product Minsk - РТБ и Программатик
Product Minsk - РТБ и ПрограмматикProduct Minsk - РТБ и Программатик
Product Minsk - РТБ и Программатик
 
Работа с Akka Сluster, @afiskon, scalaby#14
Работа с Akka Сluster, @afiskon, scalaby#14Работа с Akka Сluster, @afiskon, scalaby#14
Работа с Akka Сluster, @afiskon, scalaby#14
 
Cake pattern. Presentation by Alex Famin at scalaby#14
Cake pattern. Presentation by Alex Famin at scalaby#14Cake pattern. Presentation by Alex Famin at scalaby#14
Cake pattern. Presentation by Alex Famin at scalaby#14
 
Scala laboratory: Globus. iteration #3
Scala laboratory: Globus. iteration #3Scala laboratory: Globus. iteration #3
Scala laboratory: Globus. iteration #3
 
Testing in Scala by Adform research
Testing in Scala by Adform researchTesting in Scala by Adform research
Testing in Scala by Adform research
 
Spark Intro by Adform Research
Spark Intro by Adform ResearchSpark Intro by Adform Research
Spark Intro by Adform Research
 
Types by Adform Research, Saulius Valatka
Types by Adform Research, Saulius ValatkaTypes by Adform Research, Saulius Valatka
Types by Adform Research, Saulius Valatka
 
Types by Adform Research
Types by Adform ResearchTypes by Adform Research
Types by Adform Research
 
Scalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex GryzlovScalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex Gryzlov
 
Scalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex GryzlovScalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex Gryzlov
 
Spark by Adform Research, Paulius
Spark by Adform Research, PauliusSpark by Adform Research, Paulius
Spark by Adform Research, Paulius
 
Scala Style by Adform Research (Saulius Valatka)
Scala Style by Adform Research (Saulius Valatka)Scala Style by Adform Research (Saulius Valatka)
Scala Style by Adform Research (Saulius Valatka)
 
Spark intro by Adform Research
Spark intro by Adform ResearchSpark intro by Adform Research
Spark intro by Adform Research
 
SBT by Aform Research, Saulius Valatka
SBT by Aform Research, Saulius ValatkaSBT by Aform Research, Saulius Valatka
SBT by Aform Research, Saulius Valatka
 
Scala laboratory: Globus. iteration #2
Scala laboratory: Globus. iteration #2Scala laboratory: Globus. iteration #2
Scala laboratory: Globus. iteration #2
 
Testing in Scala. Adform Research
Testing in Scala. Adform ResearchTesting in Scala. Adform Research
Testing in Scala. Adform Research
 
Scala laboratory. Globus. iteration #1
Scala laboratory. Globus. iteration #1Scala laboratory. Globus. iteration #1
Scala laboratory. Globus. iteration #1
 
Опыт использования Spark, Основано на реальных событиях
Опыт использования Spark, Основано на реальных событияхОпыт использования Spark, Основано на реальных событиях
Опыт использования Spark, Основано на реальных событиях
 
ETL со Spark
ETL со SparkETL со Spark
ETL со Spark
 
Funtional Reactive Programming with Examples in Scala + GWT
Funtional Reactive Programming with Examples in Scala + GWTFuntional Reactive Programming with Examples in Scala + GWT
Funtional Reactive Programming with Examples in Scala + GWT
 

Dernier

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Dernier (20)

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Cassandra + Spark + Elk

  • 2. What is Spark? • Master: Driver program • Workers: Executors • High Availability • Standby Masters with ZooKeeper • Single-Node Recovery with Local File System
  • 3. Under the hood • Resilient Distributed Dataset (RDD) • Scala + Akka Framework • Java, Scala, Python API • Spark SQL, MLib, Spark Streaming, GraphX
  • 5. Data flow Fetcher Transformer Saver Input Source(s) x-RDD x-RDD Output Source
  • 6. Spark Cassandra Connector • Represents Cassandra tables as Spark RDDs • Write Spark RDDs to Cassandra tables • Execute CQL queries in Spark applications https://github.com/datastax/spark-cassandra-connector
  • 7. CassandraRDD settings • Connection params • Fetching params 1. input.split.size: C* partitions in a Spark Partition. 2. input.page.row.size: number of CQL rows fetched per roundtrip.
  • 8. Fetching essentials … … … … -968391295277638458 … -893783532241185833 -968391295277638458, -893783532241185833 -7378580094811526501, -7340240117176401239 6426215139012569257, 6428979455828914106 -6094480671546553265, -6016282219056649738 -7259249675596554667, -7237838231745167324 -6734336817058726139, -6684208157211348972 -3891103372671105499, -3822513456325086923 4453206019575747361,4462441725813855391 7855385326468991461,7906589648045207141 -129433796439502583,-101280166181350027 -2233788032218452383,-2066644620711092198 3248662132571799756,3396129453515776704 7744134136205124749,7812918342246679728 -1408208314239486033,-1403736406052004344 • Support Murmur3Partitioner and RandomPartitioner • Retrieve token ranges from Cassandra • Prediction on base of 16 random token ranges
  • 9. Data to RDD … Tokens Per RDD [input.split.size] Token Range #N Slurp amount [input.page.row.size]
  • 10. Token range vs rows number
  • 11. What to do? • Change read strategy • Split data on a smaller pieces • Increase cluster strength • Reorganize Cassandra schema
  • 13. Elastic Search & Kibana • Index initialization: TransportClient • Create/Delete Index • Setup Mappings • Indexing: ScalaEsRDD • Data presentation: Kibana
  • 15. Deployment • Build package: Spark Job + Dependent Jars + Configs • Upload to the Spark Master Node • Start job submit script