SlideShare une entreprise Scribd logo
1  sur  21
Hadoop

gumi




           1
:
Twitter:@CkReal
gumi 2




        AWS S3,EMR


                     2
1.

2. Elastic MapReduce EMR

3.

4. EMR




                      3
4
2

             ( 1   )




fluentd




         5
1                        1,000      /
      AP
     APAP
       AP                                           DB
d         fluentd   fluentd   mongos               mongod(PRIMARY)

                                                    DB
                            config
                                               mongod(SECONDARY)

                                                    DB
                   fluentd   mongos
                                               mongod(SECONDARY)

                            config            ReplicaSets & Sharding
    NFS



                              6
EMR




      7
8.5GB             1.4GB /



                                                       ID



Nov 1 23:59:59 hogehoge-ap1 hogehoge ADD_MONEY 12345
[BeforeMoney] 67979 [AfterMoney] 68024 [Money] 45

Nov 1 23:59:59 hogehoge-ap2 hogehoge CONSUME_POWER 12345
[BeforePower] 25   [AfterPower] 20    [ConsumePower] 5


                         8
NFS

NFS




      9
MongoDB
     MongoDB                   (           )

     "app" : "hogehoge",

     "userid" : "12345",
                                                          ID
     "dateint" : 20111101,

     "hourint" : 23,

     "actions" : [

          "CONSUME_POWER",                     MongoDB   Sharding
          "ADD_MONEY"

     ],

     "records" : [

                "action" : "ADD_MONEY",

                "timeint" : 235959,

     ]

                                          10
EMR
 Hive Pig        Hadoop Streaming




                             Hadoop
                            Streaming
                             (Python)




            11
m2.4xlarge × 1         4.9GB           85

EMR(m2.xlarge) × 5       4.9GB           44

  m2.4xlarge × 1         7.2GB   138

EMR(m2.xlarge) × 5       7.2GB           69

         (Macbook Air)   3.6GB         30     …


                          12
EMR        CPU




      13
14
( )

NFS   Amazon S3                   EMR

S3

                                 EMR




                          S3             EMR

                                               config
                       MongoDB                 mongos

                  15
S3
     boto

S3
     c3cmd               S3

EMR
     Mapper,Reducer,Python2.7

MongoDB
     pymongo                  MongoDB

EMR
     Client Tool(Ruby)             EMR

                                        16
EMR




      17
EMR
S3
 EC2⇔S3            20MB/sec



 Hadoop



 HadoopStreaming

EMR



                      18
GB/




      19
20
21

Contenu connexe

Tendances

Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...
Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...
Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...Ontico
 
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)Ontico
 
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종NAVER D2
 
Как PostgreSQL работает с диском
Как PostgreSQL работает с дискомКак PostgreSQL работает с диском
Как PostgreSQL работает с дискомPostgreSQL-Consulting
 
To Hire, or to train, that is the question (Percona Live 2014)
To Hire, or to train, that is the question (Percona Live 2014)To Hire, or to train, that is the question (Percona Live 2014)
To Hire, or to train, that is the question (Percona Live 2014)Geoffrey Anderson
 
XtraDB 5.7: key performance algorithms
XtraDB 5.7: key performance algorithmsXtraDB 5.7: key performance algorithms
XtraDB 5.7: key performance algorithmsLaurynas Biveinis
 
Как понять, что происходит на сервере? / Александр Крижановский (NatSys Lab.,...
Как понять, что происходит на сервере? / Александр Крижановский (NatSys Lab.,...Как понять, что происходит на сервере? / Александр Крижановский (NatSys Lab.,...
Как понять, что происходит на сервере? / Александр Крижановский (NatSys Lab.,...Ontico
 
Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...
Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...
Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...Ontico
 
PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016
PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016
PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016Tomas Vondra
 
Embulk and Machine Learning infrastructure
Embulk and Machine Learning infrastructureEmbulk and Machine Learning infrastructure
Embulk and Machine Learning infrastructureHiroshi Toyama
 
PostgreSQL performance archaeology
PostgreSQL performance archaeologyPostgreSQL performance archaeology
PostgreSQL performance archaeologyTomas Vondra
 
PostgreSQL is the new NoSQL - at Devoxx 2018
PostgreSQL is the new NoSQL  - at Devoxx 2018PostgreSQL is the new NoSQL  - at Devoxx 2018
PostgreSQL is the new NoSQL - at Devoxx 2018Quentin Adam
 
Sun jdk 1.6 gc english version
Sun jdk 1.6 gc english versionSun jdk 1.6 gc english version
Sun jdk 1.6 gc english versionbluedavy lin
 
Ops Jumpstart: Admin 101
Ops Jumpstart: Admin 101Ops Jumpstart: Admin 101
Ops Jumpstart: Admin 101MongoDB
 
Advanced backup methods (Postgres@CERN)
Advanced backup methods (Postgres@CERN)Advanced backup methods (Postgres@CERN)
Advanced backup methods (Postgres@CERN)Anastasia Lubennikova
 
GlusterFS As an Object Storage
GlusterFS As an Object StorageGlusterFS As an Object Storage
GlusterFS As an Object StorageKeisuke Takahashi
 

Tendances (20)

Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...
Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...
Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...
 
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
 
MesosCon 2018
MesosCon 2018MesosCon 2018
MesosCon 2018
 
Mongodb
MongodbMongodb
Mongodb
 
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
 
CuPy v4 and v5 roadmap
CuPy v4 and v5 roadmapCuPy v4 and v5 roadmap
CuPy v4 and v5 roadmap
 
Как PostgreSQL работает с диском
Как PostgreSQL работает с дискомКак PostgreSQL работает с диском
Как PostgreSQL работает с диском
 
To Hire, or to train, that is the question (Percona Live 2014)
To Hire, or to train, that is the question (Percona Live 2014)To Hire, or to train, that is the question (Percona Live 2014)
To Hire, or to train, that is the question (Percona Live 2014)
 
XtraDB 5.7: key performance algorithms
XtraDB 5.7: key performance algorithmsXtraDB 5.7: key performance algorithms
XtraDB 5.7: key performance algorithms
 
Как понять, что происходит на сервере? / Александр Крижановский (NatSys Lab.,...
Как понять, что происходит на сервере? / Александр Крижановский (NatSys Lab.,...Как понять, что происходит на сервере? / Александр Крижановский (NatSys Lab.,...
Как понять, что происходит на сервере? / Александр Крижановский (NatSys Lab.,...
 
Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...
Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...
Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...
 
PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016
PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016
PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016
 
Embulk and Machine Learning infrastructure
Embulk and Machine Learning infrastructureEmbulk and Machine Learning infrastructure
Embulk and Machine Learning infrastructure
 
PostgreSQL performance archaeology
PostgreSQL performance archaeologyPostgreSQL performance archaeology
PostgreSQL performance archaeology
 
PostgreSQL is the new NoSQL - at Devoxx 2018
PostgreSQL is the new NoSQL  - at Devoxx 2018PostgreSQL is the new NoSQL  - at Devoxx 2018
PostgreSQL is the new NoSQL - at Devoxx 2018
 
Chainer v4 and v5
Chainer v4 and v5Chainer v4 and v5
Chainer v4 and v5
 
Sun jdk 1.6 gc english version
Sun jdk 1.6 gc english versionSun jdk 1.6 gc english version
Sun jdk 1.6 gc english version
 
Ops Jumpstart: Admin 101
Ops Jumpstart: Admin 101Ops Jumpstart: Admin 101
Ops Jumpstart: Admin 101
 
Advanced backup methods (Postgres@CERN)
Advanced backup methods (Postgres@CERN)Advanced backup methods (Postgres@CERN)
Advanced backup methods (Postgres@CERN)
 
GlusterFS As an Object Storage
GlusterFS As an Object StorageGlusterFS As an Object Storage
GlusterFS As an Object Storage
 

Similaire à ソーシャルゲームログ解析基盤のHadoop活用事例

Berkeley Performance Tuning
Berkeley Performance TuningBerkeley Performance Tuning
Berkeley Performance TuningGeorge Ang
 
Parallel Computing for Econometricians with Amazon Web Services
Parallel Computing for Econometricians with Amazon Web ServicesParallel Computing for Econometricians with Amazon Web Services
Parallel Computing for Econometricians with Amazon Web Servicesstephenjbarr
 
Deploying with JRuby
Deploying with JRubyDeploying with JRuby
Deploying with JRubyJoe Kutner
 
3rd meetup - Intro to Amazon EMR
3rd meetup - Intro to Amazon EMR3rd meetup - Intro to Amazon EMR
3rd meetup - Intro to Amazon EMRFaizan Javed
 
Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2Sujee Maniyam
 
料理を楽しくする画像配信システム
料理を楽しくする画像配信システム料理を楽しくする画像配信システム
料理を楽しくする画像配信システムIssei Naruta
 
BedCon 2013 - Java Persistenz-Frameworks für MongoDB
BedCon 2013 - Java Persistenz-Frameworks für MongoDBBedCon 2013 - Java Persistenz-Frameworks für MongoDB
BedCon 2013 - Java Persistenz-Frameworks für MongoDBTobias Trelle
 
Putting Kafka Together with the Best of Google Cloud Platform
Putting Kafka Together with the Best of Google Cloud Platform Putting Kafka Together with the Best of Google Cloud Platform
Putting Kafka Together with the Best of Google Cloud Platform confluent
 
BDT201 AWS Data Pipeline - AWS re: Invent 2012
BDT201 AWS Data Pipeline - AWS re: Invent 2012BDT201 AWS Data Pipeline - AWS re: Invent 2012
BDT201 AWS Data Pipeline - AWS re: Invent 2012Amazon Web Services
 
Lessons learned scaling big data in cloud
Lessons learned   scaling big data in cloudLessons learned   scaling big data in cloud
Lessons learned scaling big data in cloudVijay Rayapati
 
Antonios Giannopoulos Percona 2016 WiredTiger Configuration Variables
Antonios Giannopoulos Percona 2016 WiredTiger Configuration VariablesAntonios Giannopoulos Percona 2016 WiredTiger Configuration Variables
Antonios Giannopoulos Percona 2016 WiredTiger Configuration VariablesAntonios Giannopoulos
 
Flume-Cassandra Log Processor
Flume-Cassandra Log ProcessorFlume-Cassandra Log Processor
Flume-Cassandra Log ProcessorCLOUDIAN KK
 
COOKPADでのHadoop利用
COOKPADでのHadoop利用COOKPADでのHadoop利用
COOKPADでのHadoop利用Tatsuya Sasaki
 
Introduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUGIntroduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUGAdam Kawa
 
R Jobs on the Cloud
R Jobs on the CloudR Jobs on the Cloud
R Jobs on the CloudJohn Doxaras
 
KVSの性能、RDBMSのインデックス、更にMapReduceを併せ持つAll-in-One NoSQL: MongoDB
KVSの性能、RDBMSのインデックス、更にMapReduceを併せ持つAll-in-One NoSQL: MongoDB KVSの性能、RDBMSのインデックス、更にMapReduceを併せ持つAll-in-One NoSQL: MongoDB
KVSの性能、RDBMSのインデックス、更にMapReduceを併せ持つAll-in-One NoSQL: MongoDB Rakuten Group, Inc.
 
(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices | AWS re:Inven...
(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices | AWS re:Inven...(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices | AWS re:Inven...
(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices | AWS re:Inven...Amazon Web Services
 
Cloud Computing BP-Study 20090319
Cloud Computing BP-Study 20090319Cloud Computing BP-Study 20090319
Cloud Computing BP-Study 20090319Yukio Andoh
 

Similaire à ソーシャルゲームログ解析基盤のHadoop活用事例 (20)

Apache Nemo
Apache NemoApache Nemo
Apache Nemo
 
Berkeley Performance Tuning
Berkeley Performance TuningBerkeley Performance Tuning
Berkeley Performance Tuning
 
Parallel Computing for Econometricians with Amazon Web Services
Parallel Computing for Econometricians with Amazon Web ServicesParallel Computing for Econometricians with Amazon Web Services
Parallel Computing for Econometricians with Amazon Web Services
 
Deploying with JRuby
Deploying with JRubyDeploying with JRuby
Deploying with JRuby
 
3rd meetup - Intro to Amazon EMR
3rd meetup - Intro to Amazon EMR3rd meetup - Intro to Amazon EMR
3rd meetup - Intro to Amazon EMR
 
Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2
 
料理を楽しくする画像配信システム
料理を楽しくする画像配信システム料理を楽しくする画像配信システム
料理を楽しくする画像配信システム
 
BedCon 2013 - Java Persistenz-Frameworks für MongoDB
BedCon 2013 - Java Persistenz-Frameworks für MongoDBBedCon 2013 - Java Persistenz-Frameworks für MongoDB
BedCon 2013 - Java Persistenz-Frameworks für MongoDB
 
Putting Kafka Together with the Best of Google Cloud Platform
Putting Kafka Together with the Best of Google Cloud Platform Putting Kafka Together with the Best of Google Cloud Platform
Putting Kafka Together with the Best of Google Cloud Platform
 
mongodb tutorial
mongodb tutorialmongodb tutorial
mongodb tutorial
 
BDT201 AWS Data Pipeline - AWS re: Invent 2012
BDT201 AWS Data Pipeline - AWS re: Invent 2012BDT201 AWS Data Pipeline - AWS re: Invent 2012
BDT201 AWS Data Pipeline - AWS re: Invent 2012
 
Lessons learned scaling big data in cloud
Lessons learned   scaling big data in cloudLessons learned   scaling big data in cloud
Lessons learned scaling big data in cloud
 
Antonios Giannopoulos Percona 2016 WiredTiger Configuration Variables
Antonios Giannopoulos Percona 2016 WiredTiger Configuration VariablesAntonios Giannopoulos Percona 2016 WiredTiger Configuration Variables
Antonios Giannopoulos Percona 2016 WiredTiger Configuration Variables
 
Flume-Cassandra Log Processor
Flume-Cassandra Log ProcessorFlume-Cassandra Log Processor
Flume-Cassandra Log Processor
 
COOKPADでのHadoop利用
COOKPADでのHadoop利用COOKPADでのHadoop利用
COOKPADでのHadoop利用
 
Introduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUGIntroduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUG
 
R Jobs on the Cloud
R Jobs on the CloudR Jobs on the Cloud
R Jobs on the Cloud
 
KVSの性能、RDBMSのインデックス、更にMapReduceを併せ持つAll-in-One NoSQL: MongoDB
KVSの性能、RDBMSのインデックス、更にMapReduceを併せ持つAll-in-One NoSQL: MongoDB KVSの性能、RDBMSのインデックス、更にMapReduceを併せ持つAll-in-One NoSQL: MongoDB
KVSの性能、RDBMSのインデックス、更にMapReduceを併せ持つAll-in-One NoSQL: MongoDB
 
(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices | AWS re:Inven...
(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices | AWS re:Inven...(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices | AWS re:Inven...
(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices | AWS re:Inven...
 
Cloud Computing BP-Study 20090319
Cloud Computing BP-Study 20090319Cloud Computing BP-Study 20090319
Cloud Computing BP-Study 20090319
 

Plus de 知教 本間

gumiにおける、海外支社とのAtlassian製品利用事例
gumiにおける、海外支社とのAtlassian製品利用事例gumiにおける、海外支社とのAtlassian製品利用事例
gumiにおける、海外支社とのAtlassian製品利用事例知教 本間
 
GitHubEnterpriseからBitbucket(Stash) への移行事例
GitHubEnterpriseからBitbucket(Stash) への移行事例GitHubEnterpriseからBitbucket(Stash) への移行事例
GitHubEnterpriseからBitbucket(Stash) への移行事例知教 本間
 
AWSアカウント開設からインスタンスを立ち上げるまでの作業自動化について
AWSアカウント開設からインスタンスを立ち上げるまでの作業自動化についてAWSアカウント開設からインスタンスを立ち上げるまでの作業自動化について
AWSアカウント開設からインスタンスを立ち上げるまでの作業自動化について知教 本間
 
Use case for using the ElastiCache for Redis in production
Use case for using the ElastiCache for Redis in productionUse case for using the ElastiCache for Redis in production
Use case for using the ElastiCache for Redis in production知教 本間
 
チームでChef serverを運用するには
チームでChef serverを運用するにはチームでChef serverを運用するには
チームでChef serverを運用するには知教 本間
 
Redisへと至る、gumiデータストアの歴史
Redisへと至る、gumiデータストアの歴史Redisへと至る、gumiデータストアの歴史
Redisへと至る、gumiデータストアの歴史知教 本間
 
ソーシャルゲームのEMR活用事例
ソーシャルゲームのEMR活用事例ソーシャルゲームのEMR活用事例
ソーシャルゲームのEMR活用事例知教 本間
 
MongoDBざっくり解説
MongoDBざっくり解説MongoDBざっくり解説
MongoDBざっくり解説知教 本間
 
ソーシャルゲームログ解析基盤のMongoDB活用事例
ソーシャルゲームログ解析基盤のMongoDB活用事例ソーシャルゲームログ解析基盤のMongoDB活用事例
ソーシャルゲームログ解析基盤のMongoDB活用事例知教 本間
 

Plus de 知教 本間 (9)

gumiにおける、海外支社とのAtlassian製品利用事例
gumiにおける、海外支社とのAtlassian製品利用事例gumiにおける、海外支社とのAtlassian製品利用事例
gumiにおける、海外支社とのAtlassian製品利用事例
 
GitHubEnterpriseからBitbucket(Stash) への移行事例
GitHubEnterpriseからBitbucket(Stash) への移行事例GitHubEnterpriseからBitbucket(Stash) への移行事例
GitHubEnterpriseからBitbucket(Stash) への移行事例
 
AWSアカウント開設からインスタンスを立ち上げるまでの作業自動化について
AWSアカウント開設からインスタンスを立ち上げるまでの作業自動化についてAWSアカウント開設からインスタンスを立ち上げるまでの作業自動化について
AWSアカウント開設からインスタンスを立ち上げるまでの作業自動化について
 
Use case for using the ElastiCache for Redis in production
Use case for using the ElastiCache for Redis in productionUse case for using the ElastiCache for Redis in production
Use case for using the ElastiCache for Redis in production
 
チームでChef serverを運用するには
チームでChef serverを運用するにはチームでChef serverを運用するには
チームでChef serverを運用するには
 
Redisへと至る、gumiデータストアの歴史
Redisへと至る、gumiデータストアの歴史Redisへと至る、gumiデータストアの歴史
Redisへと至る、gumiデータストアの歴史
 
ソーシャルゲームのEMR活用事例
ソーシャルゲームのEMR活用事例ソーシャルゲームのEMR活用事例
ソーシャルゲームのEMR活用事例
 
MongoDBざっくり解説
MongoDBざっくり解説MongoDBざっくり解説
MongoDBざっくり解説
 
ソーシャルゲームログ解析基盤のMongoDB活用事例
ソーシャルゲームログ解析基盤のMongoDB活用事例ソーシャルゲームログ解析基盤のMongoDB活用事例
ソーシャルゲームログ解析基盤のMongoDB活用事例
 

Dernier

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 

Dernier (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 

ソーシャルゲームログ解析基盤のHadoop活用事例

  • 3. 1. 2. Elastic MapReduce EMR 3. 4. EMR 3
  • 4. 4
  • 5. 2 ( 1 ) fluentd 5
  • 6. 1 1,000 / AP APAP AP DB d fluentd fluentd mongos mongod(PRIMARY) DB config mongod(SECONDARY) DB fluentd mongos mongod(SECONDARY) config ReplicaSets & Sharding NFS 6
  • 7. EMR 7
  • 8. 8.5GB 1.4GB / ID Nov 1 23:59:59 hogehoge-ap1 hogehoge ADD_MONEY 12345 [BeforeMoney] 67979 [AfterMoney] 68024 [Money] 45 Nov 1 23:59:59 hogehoge-ap2 hogehoge CONSUME_POWER 12345 [BeforePower] 25 [AfterPower] 20 [ConsumePower] 5 8
  • 10. MongoDB MongoDB ( ) "app" : "hogehoge", "userid" : "12345", ID "dateint" : 20111101, "hourint" : 23, "actions" : [ "CONSUME_POWER", MongoDB Sharding "ADD_MONEY" ], "records" : [ "action" : "ADD_MONEY", "timeint" : 235959, ] 10
  • 11. EMR Hive Pig Hadoop Streaming Hadoop Streaming (Python) 11
  • 12. m2.4xlarge × 1 4.9GB 85 EMR(m2.xlarge) × 5 4.9GB 44 m2.4xlarge × 1 7.2GB 138 EMR(m2.xlarge) × 5 7.2GB 69 (Macbook Air) 3.6GB 30 … 12
  • 13. EMR CPU 13
  • 14. 14
  • 15. ( ) NFS Amazon S3 EMR S3 EMR S3 EMR config MongoDB mongos 15
  • 16. S3 boto S3 c3cmd S3 EMR Mapper,Reducer,Python2.7 MongoDB pymongo MongoDB EMR Client Tool(Ruby) EMR 16
  • 17. EMR 17
  • 18. EMR S3 EC2⇔S3 20MB/sec Hadoop HadoopStreaming EMR 18
  • 19. GB/ 19
  • 20. 20
  • 21. 21

Notes de l'éditeur

  1. \n
  2. \n
  3. \n
  4. \n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n