taotao.li@datayes.com
03/11/2016
Introducing Spark
Copyright © 2014 DataYes. All rights reserved
Agenda
1 Spark ! When, What, Why ?
2 Basic Concepts in Spark
3 Programming Model in Spark
4 Demo & Next
5 Q & A
Spark ! When, What, Why ?
● 2009~2010 : Spark is born at AMPLab@UCB (2009) and open-sourced (2010)
● 2013 : enters the Apache Incubator
● 2014 : becomes a top-level Apache project
● New stage : more than an open source project
Spark ! When, What, Why ?
From the official site: Apache Spark™ is a fast and general engine for large-scale data processing.
Key Points:
● A framework
● Built for large-scale data processing
● A generalized programming model for data processing [ more than MapReduce ]
● Provides high-level APIs : Scala, Python, R, Java
● Armed to the teeth : SQL, Streaming, Machine Learning, GraphX
● Compatible with the existing ecosystem : Hadoop, Mesos, HDFS, Cassandra, HBase, S3 …
Spark ! When, What, Why ?
● General
● Fast to develop with
○ Interactive exploration in the REPL
○ RDD operations
○ Less code
● Fast at processing
● Compatible
● Built-in packages and third-party packages
● Memory keeps getting cheaper
● Companies that have adopted Spark
[ Figure : DDR4-3000 288-pin DIMM 4x4GB price trend ]
Basic Concepts in Spark
● Driver, Master, Worker, Executor
● Application
● SparkContext, i.e. sc
● RDD
● Transformations & actions on RDDs
Need more ? See : 『 Spark 』2. spark 基本概念解析 (an analysis of Spark's basic concepts)
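The split between transformations and actions can be sketched in plain Python. This is a toy model only, not the Spark API: ToyRDD and its internals are hypothetical names invented for illustration, and the comments mention the PySpark methods they roughly mirror. The point it shows is that transformations only record work, while an action triggers it.

```python
class ToyRDD:
    """Toy stand-in for an RDD: transformations are recorded, actions run them."""

    def __init__(self, source, steps=()):
        self._source = source        # any re-iterable collection
        self._steps = tuple(steps)   # pending transformations, not yet applied

    # Transformations (compare rdd.map / rdd.filter): return a new ToyRDD, compute nothing.
    def map(self, fn):
        return ToyRDD(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._source, self._steps + (("filter", pred),))

    def _run(self):
        # Chain the recorded steps into one lazy pipeline over the source.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions (compare rdd.collect / rdd.count): force the computation.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


calls = []  # records every time the mapped function actually runs

def square(x):
    calls.append(x)
    return x * x

evens = ToyRDD(range(1, 6)).map(square).filter(lambda v: v % 2 == 0)
pending_before_action = len(calls)  # still 0: only transformations so far
result = evens.collect()            # the action runs the whole pipeline
```

In real Spark the recorded steps form a lineage graph that is also used for scheduling and fault recovery; the toy keeps only the laziness.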
Programming Model in Spark
Three basic steps to build a Spark application
● Load the dataset
○ static dataset
○ dynamic dataset
● Process it
○ RDD operations
○ UDFs
○ Cache
● Output / display the result
○ collect
○ store in a database, file system ...
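The three steps above can be walked through with a tiny word count. This sketch is plain Python with no Spark dependency, so it runs anywhere; the sample lines and the tokenize helper are made up for illustration, and the comments name the PySpark calls that would usually fill each role.

```python
from collections import Counter

# Step 1: load a static dataset.
# (In PySpark this would typically be sc.textFile(path) or sc.parallelize(lines).)
lines = ["spark is fast", "spark is general", "memory is cheap"]

# Step 2: process. tokenize() plays the role of a user-defined function (UDF).
def tokenize(line):
    return line.lower().split()

# Flatten every line into words (compare rdd.flatMap(tokenize)).
words = [word for line in lines for word in tokenize(line)]

# "Cache": materialise the intermediate result once so later steps reuse it
# instead of recomputing it (compare rdd.cache()).
cached_words = tuple(words)

# Count occurrences per word (compare map to (word, 1) pairs, then reduceByKey).
word_counts = Counter(cached_words)

# Step 3: output. Bring results back to the driver (compare rdd.collect()),
# or write them to a database or file system instead.
top_two = word_counts.most_common(2)
distinct_words = len(set(cached_words))
print(top_two, distinct_words)
```

Caching pays off here because cached_words feeds two separate outputs; without it, each output would repeat the tokenizing work.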
Demo & Next
● Wrap Spark for Uqer use cases
● Try Tungsten
● DataFrames & Datasets
● SQL & MLlib & Streaming
● Third-party package wrappers [ sklearn, pandas, numpy, etc. ]
Demo & Next
● Monte Carlo in Spark
● Spark in finance : index similarity calculation
● Spark in finance : distributed strategy backtesting
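The slides do not include the demo code, so the following is only a guess at what a Monte Carlo demo might look like: the classic estimate of π, written in plain Python in a map / reduce shape. The function name, partition count and sample size are all assumptions. With PySpark, the same shape would usually be sc.parallelize(range(NUM_PARTITIONS)).map(...).reduce(...).

```python
import random
from functools import reduce

def hits_in_partition(seed, samples):
    """Count random points in the unit square that land inside the quarter circle."""
    rng = random.Random(seed)  # one independent, reproducible generator per partition
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

NUM_PARTITIONS = 8              # assumed; on a cluster, roughly the parallelism
SAMPLES_PER_PARTITION = 50_000  # assumed sample size per partition

# "map": each partition simulates independently, so the work can be spread out.
partial_hits = map(
    lambda seed: hits_in_partition(seed, SAMPLES_PER_PARTITION),
    range(NUM_PARTITIONS),
)

# "reduce": combine the per-partition counts into one total.
total_hits = reduce(lambda a, b: a + b, partial_hits)

# The quarter circle covers pi/4 of the unit square.
pi_estimate = 4.0 * total_hits / (NUM_PARTITIONS * SAMPLES_PER_PARTITION)
print(pi_estimate)
```

Monte Carlo suits this model because the partitions never need to talk to each other; only the final counts are combined.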
Demo, Demo, Demo
Q & A
Thank you
