Introducing Spark
Taotao Li
A brief introduction to Apache Spark
1.
2.
Introducing Spark
taotao.li@datayes.com
03/11/2016
3.
Copyright © 2014 DataYes. All rights reserved
Agenda
1 Spark! When, What, Why?
2 Basic Concepts in Spark
3 Programming Model in Spark
4 Demo & Next
5 Q & A
4.
Spark! When, What, Why?
● 2009~2010: Spark is born at AMPLab@UCB (2009) and open sourced (2010)
● 2013: enters the Apache Incubator
● 2014: becomes a top-level Apache project
● New stage: more than an open source project
5.
Spark! When, What, Why?
From the official site: Apache Spark™ is a fast and general engine for large-scale data processing.
Key points:
● A framework
● Born for large-scale data processing
● Generalizes the programming model for data processing [more than MapReduce]
● Provides high-level APIs: Scala, Python, R, Java
● Armed to the teeth: SQL, Streaming, Machine Learning, GraphX
● Compatible with the existing ecosystem: Hadoop, Mesos, HDFS, Cassandra, HBase, S3 …
6.
Spark! When, What, Why?
● General
● Fast to develop with
○ REPL exploration
○ RDD operations
○ Less code
● Fast at processing
● Compatible
● Built-in packages and third-party packages
● Memory keeps getting cheaper
● Companies that have adopted Spark
7.
Spark! When, What, Why?
8.
Spark! When, What, Why?
[Figure: DDR4-3000 288-pin DIMM 4x4GB price trend]
9.
Basic Concepts in Spark
10.
Basic Concepts in Spark
● Driver, Master, Worker, Executor
● Application
● SparkContext, i.e. sc
● RDD
● Transformations & actions on RDDs
Need more? See: 『 Spark 』2. spark 基本概念解析 (an explanation of Spark's basic concepts)
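To make the transformation/action distinction concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: `ToyRDD` and `square` are hypothetical names invented for illustration, and everything runs in a single process. The idea it illustrates is that transformations such as `map` and `filter` only record work, while actions such as `collect` and `reduce` actually run it.

```python
from functools import reduce


class ToyRDD:
    """Toy stand-in for an RDD: records transformations, runs them only on an action."""

    def __init__(self, source, steps=()):
        self._source = source        # any re-iterable collection
        self._steps = tuple(steps)   # pending transformations, in order

    # Transformations: return a new ToyRDD and compute nothing yet.
    def map(self, fn):
        return ToyRDD(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._source, self._steps + (("filter", pred),))

    def _iterate(self):
        # Chain the recorded steps over the source as lazy iterators.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions: trigger the computation and return a plain Python value.
    def collect(self):
        return list(self._iterate())

    def reduce(self, fn):
        return reduce(fn, self._iterate())


calls = []  # records every time the mapped function really runs


def square(x):
    calls.append(x)
    return x * x


rdd = ToyRDD(range(1, 6))
squares = rdd.map(square).filter(lambda x: x % 2 == 1)
calls_before_action = len(calls)            # still 0: no action has run yet
odd_squares = squares.collect()             # first action: runs the pipeline
total = squares.reduce(lambda a, b: a + b)  # second action: runs it again
```

Because the toy keeps no cached result, the second action recomputes the whole chain, which mirrors why an RDD that feeds several actions is usually worth caching.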
11.
Programming Model in Spark
12.
Programming Model in Spark
Three basic steps to build a Spark application:
● Load the dataset
○ Static dataset
○ Dynamic dataset
● Process it
○ RDD operations
○ UDFs
○ Cache
● Output and display
○ collect
○ Store in a database, file system ...
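The three steps above can be sketched as a tiny word count. This is plain Python rather than Spark, so it runs without a cluster; the dataset, the `tokenize` function and the output file name are all assumptions made up for illustration. In PySpark the corresponding calls would typically be `sc.textFile` or `sc.parallelize` for loading, `flatMap`/`reduceByKey`/`cache` for processing, and `collect` or `saveAsTextFile` for output.

```python
import tempfile
from collections import Counter
from functools import lru_cache
from pathlib import Path

# Step 1: load a dataset (a small static, in-memory one here).
LINES = (
    "spark is fast",
    "spark is general",
    "memory is cheaper",
)


def tokenize(line):
    """User-defined function applied to every record."""
    return line.lower().split()


# Step 2: process. lru_cache plays the role of caching: the counts are
# computed once and reused by both outputs below.
@lru_cache(maxsize=None)
def word_counts():
    counts = Counter()
    for line in LINES:
        counts.update(tokenize(line))
    return counts


# Step 3a: output by collecting the results back to the caller.
def collect():
    return sorted(word_counts().items(), key=lambda kv: (-kv[1], kv[0]))


# Step 3b: output by storing the results in the file system.
def store(path):
    text = "\n".join(f"{word}\t{count}" for word, count in collect())
    Path(path).write_text(text + "\n", encoding="utf-8")


top = collect()
out_file = Path(tempfile.mkdtemp()) / "word_counts.tsv"
store(out_file)
```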
13.
Demo & Next
● Wrap Spark for Uqer use cases
● Try Tungsten
● DataFrames & Datasets
● SQL & MLlib & Streaming
● Third-party package wrappers [sklearn, pandas, numpy, etc.]
14.
Demo & Next
15.
Demo & Next
● Monte Carlo in Spark
● Spark in finance: index similarity calculation
● Spark in finance: distributed strategy backtesting
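The slides do not include the demo code, so the following is only a sketch of the shape a Monte Carlo job usually takes: many independent tasks (the map step) whose partial results are summed (the reduce step). It estimates π in plain Python; `count_hits`, `estimate_pi` and the task sizes are hypothetical choices for illustration. In PySpark the same structure would typically be written as `sc.parallelize(tasks).map(count_hits).reduce(...)`.

```python
import random
from functools import reduce


def count_hits(args):
    """Map step: one independent task throws `samples` darts at the unit square."""
    seed, samples = args
    rng = random.Random(seed)  # per-task seed keeps tasks independent and repeatable
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # dart landed inside the quarter circle
            hits += 1
    return hits


def estimate_pi(num_tasks=8, samples_per_task=50_000):
    tasks = [(seed, samples_per_task) for seed in range(num_tasks)]
    partial_hits = map(count_hits, tasks)                  # map: one count per task
    total_hits = reduce(lambda a, b: a + b, partial_hits)  # reduce: sum the counts
    # Quarter-circle area over unit-square area is pi / 4.
    return 4.0 * total_hits / (num_tasks * samples_per_task)


pi_estimate = estimate_pi()
```

Each task carries its own seed, so the tasks could run on different workers in any order and still produce the same total.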
16.
Demo, Demo, Demo
Q & A
17.
Thank you