Real-Time User Attribute Estimation with Spark Streaming
Yoshiyasu SAEKI
Presentation slides from Spark Meetup December 2015 (http://connpass.com/event/23159/)
Data & Analytics
Real-Time User Attribute Estimation with Spark Streaming
1.
Real-Time User Attribute Estimation with Spark Streaming
@laclefyoshi <ysaeki@r.recruit.co.jp>
2.
• Spark Streaming
• Spark Streaming Tips
3.
SAEKI Yoshiyasu
• IT
• Web, R&D: Hadoop, Kafka, Storm, Spark, Druid
• RICOH Theta + Google Cardboard
4.
Spark Streaming
http://spark.apache.org/docs/1.5.2/streaming-programming-guide.html
5.
6.
http://www.recruit.jp/company/about/structure.html
7.
• OS etc.
8.
1. Web (JavaScript)
2. fluentd → Kafka
9.
fluentd → Kafka
• fluent-plugin-kafka
• https://github.com/htgc/fluent-plugin-kafka
• output type = kafka_buffered (on file)
• Kafka 0.8.2.2
• 0.9.0
• ACL
10.
11.
Suro
• Netflix
• https://github.com/Netflix/suro
• Input: Kafka Consumer API, Thrift API
• Output:
• HDFS
• AWS S3
• Kafka Producer
• Elasticsearch
• LinkedIn Gobblin
12.
Hadoop
• HDFS
• MLlib
• Streaming linear regression (Classification)
• Streaming k-means (Clustering)
13.
Spark Streaming
14.
Kafka
• Direct Approach (>= Spark 1.3)
• Exactly-once
• Kafka Simple Consumer API
[Figure: Direct Approach]
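As a rough illustration of the Direct Approach named on this slide, the sketch below creates a direct Kafka stream with the Spark 1.x `KafkaUtils.createDirectStream` API. It assumes the `spark-streaming-kafka` artifact matching Spark 1.5.x is on the classpath; the broker address `localhost:9092`, the topic name `access_log`, and the helper `kafkaParams` are hypothetical and are not taken from the slides.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectKafkaSketch {
  // Hypothetical helper: the direct stream talks to the brokers directly,
  // so it is configured with a broker list rather than a ZooKeeper quorum.
  def kafkaParams(brokers: String): Map[String, String] =
    Map("metadata.broker.list" -> brokers)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("direct-kafka-sketch")
    val ssc = new StreamingContext(conf, Seconds(5)) // 5-second micro-batches (assumed)

    // Each element is a (key, message) pair read from the topic.
    val stream: InputDStream[(String, String)] =
      KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
        ssc, kafkaParams("localhost:9092"), Set("access_log"))

    // Print the number of messages received in each micro-batch.
    stream.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With this approach Spark Streaming tracks the Kafka offsets itself. End-to-end exactly-once results still require the output operation to be idempotent or transactional.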
15.
Spark Streaming 1
http://spark.apache.org/docs/1.5.2/streaming-programming-guide.html
[Figure: a DStream as a sequence of RDDs: RDD @ time1, RDD @ time2, RDD @ time3, RDD @ time4]
16.
Spark Streaming 2
http://spark.apache.org/docs/1.5.2/streaming-programming-guide.html
17.
Micro-batch
• 1 micro-batch (Cookie)
18.
Window-based micro-batch
[Figure labels: 1 micro-batch, 1 micro-batch]
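The difference between processing a single micro-batch and processing a window of micro-batches can be modelled in plain Scala, without Spark. The sketch below is only an illustration of the general idea: the names `MicroBatch`, `windows`, and `eventsPerCookie`, and the page paths used as events, are hypothetical. In Spark Streaming itself the corresponding operation is `DStream.window(windowDuration, slideDuration)`.

```scala
object WindowSketch {
  // One micro-batch: the (cookie, event) pairs received during one batch interval.
  type MicroBatch = Seq[(String, String)]

  /** Groups consecutive micro-batches into windows.
    * windowLength and slide are counted in micro-batches (both >= 1). */
  def windows(batches: Seq[MicroBatch], windowLength: Int, slide: Int): Seq[MicroBatch] =
    batches.sliding(windowLength, slide).map(_.flatten).toList

  /** Collects the events of each cookie inside one micro-batch or window. */
  def eventsPerCookie(window: MicroBatch): Map[String, Seq[String]] =
    window.groupBy(_._1).map { case (cookie, pairs) => cookie -> pairs.map(_._2) }

  def main(args: Array[String]): Unit = {
    val batches: Seq[MicroBatch] = Seq(
      Seq("c1" -> "/top"),
      Seq("c1" -> "/search", "c2" -> "/top"),
      Seq("c2" -> "/detail"))

    // Per micro-batch: each cookie only sees the events of a single interval.
    batches.map(eventsPerCookie).foreach(println)

    // Window of 2 micro-batches sliding by 1: events of neighbouring intervals are combined.
    windows(batches, windowLength = 2, slide = 1).map(eventsPerCookie).foreach(println)
  }
}
```

A window trades latency and memory for more events per cookie, which generally gives an estimator more evidence per user than a single micro-batch does.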
19.
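Micro-batch
• Writing an RDD to HBase

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.io.Text
import org.apache.spark.rdd.PairRDDFunctions

dstream.foreachRDD { rdd =>
  // createHbaseConfiguration() is the author's own helper (not shown on the slide)
  val hbaseConf = createHbaseConfiguration()
  val jobConf = new Configuration(hbaseConf)
  jobConf.set("mapreduce.job.output.key.class", classOf[Text].getName)
  jobConf.set("mapreduce.job.output.value.class", classOf[Text].getName)
  jobConf.set("mapreduce.outputformat.class", classOf[TableOutputFormat[Text]].getName)
  new PairRDDFunctions(rdd.map(hbaseConvert)).saveAsNewAPIHadoopDataset(jobConf)
}

// Converts RDD[(String, Map[K,V])] to RDD[(String, Put)]
def hbaseConvert(t: (String, Map[String, String])) = {
  val p = new Put(Bytes.toBytes(t._1)) // row key
  // one column per map entry, all in column family "seg"
  t._2.toSeq.foreach(m =>
    p.addColumn(Bytes.toBytes("seg"), Bytes.toBytes(m._1), Bytes.toBytes(m._2))
  )
  (t._1, p)
}

0.5 1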
20.
21.
Spark Streaming
• DStream, RDD
• Spark, Spark Streaming
http://spark.apache.org/docs/1.5.2/streaming-programming-guide.html
22.
Spark Streaming
• Fault Tolerance
• Micro-batch
• YARN
• YARN Dynamic Resource Allocation
23.
Spark Streaming
• RDD → RDD, DStream → DStream
• 1 micro-batch

// RDD → RDD
val input: RDD[String] = sparkContext.makeRDD(Seq("a", "b", "c"))
// DStream → DStream
val queue = scala.collection.mutable.Queue(input)
val dstream: DStream[String] = sparkStreamingContext.queueStream(queue)
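To show how a `queueStream` input can drive DStream → DStream logic end to end, here is a minimal sketch that runs a local `StreamingContext`, feeds each queued RDD as one micro-batch, and collects the output on the driver. The function `upperCase`, the helper `run`, the 1-second batch interval, and the timeout are assumptions for illustration only, not the author's code.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream
import scala.collection.mutable

object QueueStreamSketch {
  // Hypothetical logic under test: a DStream → DStream function.
  def upperCase(in: DStream[String]): DStream[String] = in.map(_.toUpperCase)

  /** Feeds each element of `batches` as one micro-batch and returns
    * whatever the logic emitted before the timeout. */
  def run(batches: Seq[Seq[String]]): Seq[String] = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("queue-stream-sketch")
    val ssc = new StreamingContext(conf, Seconds(1))
    try {
      // queueStream takes one RDD from the queue per batch interval by default.
      val queue = mutable.Queue[RDD[String]]()
      batches.foreach(b => queue += ssc.sparkContext.makeRDD(b))

      val collected = mutable.ArrayBuffer.empty[String]
      upperCase(ssc.queueStream(queue)).foreachRDD { rdd =>
        val values = rdd.collect() // runs on the driver, once per micro-batch
        collected.synchronized { collected ++= values }
        ()
      }

      ssc.start()
      // Wait a little longer than the number of queued micro-batches.
      ssc.awaitTerminationOrTimeout(1000L * (batches.size + 2))
      collected.synchronized(collected.toList)
    } finally {
      ssc.stop(stopSparkContext = true, stopGracefully = false)
    }
  }

  def main(args: Array[String]): Unit =
    println(run(Seq(Seq("a", "b", "c"), Seq("d"))))
}
```

Because the result depends on wall-clock timing, a test built this way can be flaky on a slow machine; a test harness that controls the batch clock avoids that.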
24.
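Spark Streaming
• spark-testing-base
• https://github.com/holdenk/spark-testing-base

import com.holdenkarau.spark.testing.StreamingSuiteBase

class JsonElementCountTest extends StreamingSuiteBase {
  test("simple") {
    // one inner List per micro-batch
    val input = List(List("aa"), List("bb"))
    val expected = List(List("AA"), List("BB"))
    // converterMethod is the DStream[String] => DStream[String] logic under test (not shown on the slide)
    testOperation[String, String](
      input, converterMethod _, expected, useSet = true)
  }
}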
25.
Spark Streaming
• Window-based micro-batch
• o.a.spark.streaming.util.ManualClock
• private class, Scala
• http://mkuthan.github.io/blog/2015/03/01/spark-unit-testing/
26.
Spark Streaming
• Scala, Java
• Spark Streaming, Kafka, HBase, Scala
• Java

// api/java/JavaRDD.scala
object JavaRDD {
  implicit def fromRDD[T: ClassTag](rdd: RDD[T]): JavaRDD[T] = new JavaRDD[T](rdd)
  implicit def toRDD[T](rdd: JavaRDD[T]): RDD[T] = rdd.rdd
}
27.
• Spark Streaming
• MLlib
• GraphX