Getting Apache Spark Customers to Production

© Cloudera, Inc. All rights reserved.
Getting Spark Customers to Production
Kostas Sakellis

Me
• Software Engineer at Cloudera
• Contributor to Apache Spark
• Before that, contributed to Cloudera Manager

Our customers
• Various degrees of sophistication with Spark
• In all stages of development
• From POC to production deployments
• 95% use Spark on YARN*
• Biweekly analysis of tickets

WARNING: This is biased!

Building a proof of concept!
Courtesy of: http://www.nefloridadesign.com/mbimages/6.jpg

“Why is my job failing?”

“Why is my job slow?”

Misconfiguration accounts for 20% of job failures
Courtesy of: http://blog.sdrock.com/pastors/files/2013/06/time-clock.jpg

Resource Declaration
• Not easy knowing what you need and how to specify it
• Compute:
• --num-executors vs. --executor-cores
• Memory
• --executor-memory
• Includes JVM overhead
• Need to do the math yourself
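A minimal sketch of "the math" for one job, assuming the overhead rule that appears later in this deck (7% of executor memory, 10% in Spark 1.4) plus the 384 MB floor that Spark on YARN applied in those releases. The object and method names are hypothetical, and YARN may round the request up further to its own allocation increment.

```scala
object ExecutorMath {
  // Assumption: Spark 1.x on YARN pads each executor with
  // max(384 MB, factor * executor memory); factor is 0.07 up to 1.3 and 0.10 in 1.4.
  val MinOverheadMb = 384

  def overheadMb(executorMemoryMb: Int, factor: Double): Int =
    math.max(MinOverheadMb, (factor * executorMemoryMb).toInt)

  // Memory YARN has to grant for a single executor container.
  def containerMb(executorMemoryMb: Int, factor: Double): Int =
    executorMemoryMb + overheadMb(executorMemoryMb, factor)

  // Totals for the whole job, given --num-executors and --executor-cores.
  def totalMemoryMb(numExecutors: Int, executorMemoryMb: Int, factor: Double): Long =
    numExecutors.toLong * containerMb(executorMemoryMb, factor)

  def totalCores(numExecutors: Int, executorCores: Int): Int =
    numExecutors * executorCores
}
```

For example, `ExecutorMath.containerMb(8192, 0.10)` is the per-container request for `--executor-memory 8g` under the 1.4 rule, which is noticeably more than the 8 GB typed on the command line.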

Dynamic Allocation
• Let Spark do the work for you
• Available since Spark 1.2*
• No need to specify compute a priori
• Limitation: Still required to specify cores
• In future:
• Allow specification of “task size”
• Dynamically allocate cores

YARN Configuration mismatch
• Compute:
• yarn.nodemanager.resource.cpu-vcores
• yarn.scheduler.maximum-allocation-vcores
• Memory:
• yarn.nodemanager.resource.memory-mb
• yarn.scheduler.maximum-allocation-mb

YARN Configuration mismatch
• Common to ask for more resources than allowed
• Future work:
• Exposing relevant YARN configurations in Spark UI
• Requires changes to YARN itself
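Until the limits are visible in the Spark UI, a pre-flight check along these lines can catch an oversized request before submission. This is only a sketch: `YarnLimits` and `problems` are hypothetical names, and the limit values have to be read from the cluster's YARN configuration by hand.

```scala
object YarnLimitCheck {
  // Hypothetical holder for the two scheduler limits named on the slide.
  final case class YarnLimits(maxAllocationMb: Int, maxAllocationVcores: Int)

  // containerMb should already include the YARN memory overhead.
  // Returns one message per violated limit; an empty result means the request fits.
  def problems(containerMb: Int, executorCores: Int, limits: YarnLimits): Seq[String] = {
    val memory =
      if (containerMb > limits.maxAllocationMb)
        Seq(s"memory: $containerMb MB exceeds yarn.scheduler.maximum-allocation-mb (${limits.maxAllocationMb} MB)")
      else Seq.empty[String]
    val cores =
      if (executorCores > limits.maxAllocationVcores)
        Seq(s"cores: $executorCores exceeds yarn.scheduler.maximum-allocation-vcores (${limits.maxAllocationVcores})")
      else Seq.empty[String]
    memory ++ cores
  }
}
```

With a hypothetical 8192 MB scheduler limit, `YarnLimitCheck.problems(9011, 4, YarnLimitCheck.YarnLimits(8192, 8))` reports the memory violation only.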

Another YARN goodie…
Container [pid=63375,containerID=container_1388158490598_0001_01_000003] is running beyond physical memory limits. Current usage: 2.1 GB of 2 GB physical memory used; 2.8 GB of 4.2 GB virtual memory used. Killing container.
[...]

Memory allocation
• yarn.nodemanager.resource.memory-mb
• Executor Container
• spark.yarn.executor.memoryOverhead (7%) (10% in 1.4)
• spark.executor.memory
• spark.shuffle.memoryFraction (0.4)
• spark.storage.memoryFraction (0.6)
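A rough sketch of how the executor heap splits into the two regions named above. The fractions are passed in as parameters; the values in the usage note are simply the ones printed on the slide and should be confirmed against the configuration documentation for the Spark version in use. The names are hypothetical, and any extra safety margins Spark applies internally are ignored here.

```scala
object HeapBreakdown {
  // Sizes in MB of the storage (cache) and shuffle regions of one executor heap.
  final case class Regions(storageMb: Int, shuffleMb: Int)

  // executorMemoryMb corresponds to spark.executor.memory; the two fractions
  // correspond to spark.storage.memoryFraction and spark.shuffle.memoryFraction.
  def regions(executorMemoryMb: Int, storageFraction: Double, shuffleFraction: Double): Regions = {
    require(storageFraction >= 0 && shuffleFraction >= 0, "fractions must be non-negative")
    Regions(
      (executorMemoryMb * storageFraction).toInt,
      (executorMemoryMb * shuffleFraction).toInt
    )
  }
}
```

For a 4 GB heap and the slide's fractions, call `HeapBreakdown.regions(4096, 0.6, 0.4)`. The YARN overhead sits outside the heap and is added on top of this.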

YARN Overhead
• Future work:
• Better understanding of off heap allocations
• Improve memory usage visibility

Run program through all our data
Courtesy of: https://conniehallscott.files.wordpress.com/2013/01/411748_538971446114753_1125606225_o.jpg

Data dependent tuning
• As data rates change, re-tuning Spark is usually necessary
• Spark is sensitive to shuffle spills
• The most common knob we modify is…

Partitions, Partitions, Partitions!

GC Stalls

Partitions
• Smaller is often better
• Parameterized partition size
• reduceByKey(…, nPartitions)
• Parameterize application
• Future work:
• Dynamically determine # of partitions (SPARK-4630)
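One way to "parameterize the application" is to read the partition count from the command line, so it can be changed without recompiling. A minimal sketch, assuming the count arrives as the first argument; `PartitionArgs` and the default of 200 are hypothetical.

```scala
object PartitionArgs {
  // Hypothetical fallback used when no valid argument is supplied.
  val DefaultPartitions = 200

  // Reads a positive partition count from the first command-line argument,
  // falling back to the default when it is missing, non-numeric, or not positive.
  def numPartitions(args: Array[String]): Int =
    args.headOption
      .flatMap(a => scala.util.Try(a.trim.toInt).toOption)
      .filter(_ > 0)
      .getOrElse(DefaultPartitions)
}
```

Inside the job the value would then be passed as the second argument of the shuffle operation, for example `reduceByKey(_ + _, PartitionArgs.numPartitions(args))`.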

But for now?
• Easy answer:
• Keep multiplying by 1.5 and see what works
• Harder answer:

Shuffle less!
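Before turning to shuffles, the "easy answer" of multiplying by 1.5 can be sketched as a list of candidate partition counts to try in turn. The helper name is hypothetical and rounding up is an arbitrary choice.

```scala
object PartitionSweep {
  // Candidate partition counts: the starting value, then each previous value
  // multiplied by 1.5 and rounded up, for the requested number of steps.
  def candidates(start: Int, steps: Int): List[Int] =
    Iterator.iterate(start)(n => math.ceil(n * 1.5).toInt).take(steps).toList
}
```

Each candidate would be fed to the job in a separate run until spills and GC stalls stop being the bottleneck.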

Shuffles
Wide Dependency / Narrow Dependencies

ReduceByKey when Possible
• reduceByKey allows a map-side combine
parsed
  .map { line => (line.level, 1) }
  .reduceByKey { (a, b) => a + b }
  .collect()
• groupByKey transfers all the data
parsed
  .map { line => (line.level, 1) }
  .groupByKey()
  .map { case (level, counts) => (level, counts.sum) }
  .collect()

ReduceByKey when Possible
•ReduceByKey
•GroupByKey
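The difference can be illustrated without a cluster. The sketch below uses plain Scala collections to model two map-side partitions and counts how many records would cross the shuffle under each strategy. It is a toy model, not Spark's implementation, and the sample data and names are hypothetical.

```scala
object MapSideCombineSketch {
  // Hypothetical input: each inner Seq is one map-side partition of (level, 1) pairs.
  val partitions: Seq[Seq[(String, Int)]] = Seq(
    Seq("INFO" -> 1, "INFO" -> 1, "WARN" -> 1),
    Seq("INFO" -> 1, "ERROR" -> 1, "ERROR" -> 1)
  )

  // groupByKey-style: every record crosses the shuffle unchanged.
  def shuffledWithoutCombine(parts: Seq[Seq[(String, Int)]]): Seq[(String, Int)] =
    parts.flatten

  // reduceByKey-style: each partition pre-aggregates per key before the shuffle.
  def shuffledWithCombine(parts: Seq[Seq[(String, Int)]]): Seq[(String, Int)] =
    parts.flatMap(p => p.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum })

  // Reduce side: the final per-key sum, identical for both strategies.
  def reduceSide(records: Seq[(String, Int)]): Map[String, Int] =
    records.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }
}
```

Both paths produce the same counts, but the combined path ships at most one record per key per partition, which is where the savings come from on real data with many repeated keys.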

Security, now it’s getting serious.
Courtesy of: https://www.iti.illinois.edu/sites/default/files/Cybersecurity_image.jpg

Authentication
• Kerberos – the necessary evil
• Ubiquitous amongst other services
• YARN, HDFS, Hive, HBase, etc.
• Spark utilizes delegation tokens

Encryption
• Control plane: SPARK-6028 (replace with Netty)
• File distribution: replace with Netty
• Block Manager: Spark 1.4
• User UI / REST API: SPARK-2750 (SSL)
• Data-at-rest (shuffle files): SPARK-5682

Authorization
• Enterprises have sensitive data
• Beyond HDFS file permissions
• Partial access to data
• Column level granularity
• Apache Sentry
• HDFS-Sentry synchronization plugin

Customers often have shared infrastructure
Courtesy of: https://radioglobalistic.files.wordpress.com/2011/02/lagos-traffic.jpg

Multi-tenancy
• Cluster utilization is top metric
• Target: 70-80% utilization
• Mixed workloads from mixed customers
• We recommend YARN
• Built in resource manager

Underutilized Clusters
Courtesy of: http://media.nbclosangeles.com/images/1200*675/60-freeway-repair-dec16-2-empty.JPG

Dynamic Allocation
• Allows jobs to scale to size according to load
• Knobs to control min, max and initial size
• Future Work:
• Target: Dynamic allocation enabled by default
• Data locality & Caching
• Open question with Streaming

Thank you
We’re Hiring!
