6. Lessons learned
• partial in-memory file storage bug
• journal file on HDFS -> backup of the local master disk
• HDFS API
• RawTable in Shark
• persist(OFF_HEAP) for temporary storage
• RDD.persist(): OFF_HEAP outperforms MEMORY_AND_DISK_SER
• native API: getInStream(CACHE|NO_CACHE) -> local workers
• do not evict blocks when streaming to Tachyon/HDFS
• Tachyon > Spark JVM cache for long-running jobs
• Kryo/defaultCodec/SequenceFile format to minimize the memory footprint
• 25 million emails/month (2TB); 3-45 nodes; 120-170GB of RAM for Tachyon
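The journal-on-HDFS lesson above amounts to pointing the Tachyon master's journal at HDFS instead of the local disk, so losing the master node does not lose filesystem metadata. A minimal configuration sketch, assuming Tachyon-era property names; the namenode address is hypothetical:

```sh
# tachyon-env.sh (sketch): keep the master journal and under-filesystem on HDFS.
# Property names per Tachyon-era docs; hdfs://namenode:9000 is a placeholder.
export TACHYON_JAVA_OPTS="
  -Dtachyon.master.journal.folder=hdfs://namenode:9000/tachyon/journal
  -Dtachyon.underfs.address=hdfs://namenode:9000
"
```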
The physical architecture diagram for our largest customer deployment demonstrates the enterprise-grade attributes of the platform: scalability, high availability, performance, resilience and manageability, while providing means for geo-failover (warehouse), geo-replication (real-time DB), data and system monitoring, instrumentation, and backup & restore.
Cassandra rings are DC-replicated across the EC2 east- and west-coast regions; data between geo-replicas is synchronized in real time through an IPsec tunnel (VPC-to-VPC).
Serving the geo-replicated APIs behind an AWS Route 53 DNS service (latency-based resource record sets) and ELBs ensures that user requests are served from the closest geographic location. The failure of an entire region (which happened to us during a big conference!) does not affect our availability or SLAs.
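The latency-based routing described above corresponds to a pair of Route 53 resource record sets, one per region, sharing a name but distinguished by SetIdentifier; Route 53 answers each query with the record for the lowest-latency region. A sketch of the change batch, with hypothetical domain and ELB names:

```json
{
  "Comment": "Latency-based records for the geo-replicated API (sketch)",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "api.example.com.",
        "Type": "CNAME",
        "SetIdentifier": "east",
        "Region": "us-east-1",
        "TTL": 60,
        "ResourceRecords": [{ "Value": "api-east-elb.us-east-1.elb.amazonaws.com" }]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "api.example.com.",
        "Type": "CNAME",
        "SetIdentifier": "west",
        "Region": "us-west-1",
        "TTL": 60,
        "ResourceRecords": [{ "Value": "api-west-elb.us-west-1.elb.amazonaws.com" }]
      }
    }
  ]
}
```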
User-facing dashboards are served from Cassandra (the real-time store), with data exported from a data warehouse (Shark/Hive) built on top of a Mesos-managed Spark/Hadoop cluster.
Export jobs are instrumented and provide a throttling mechanism to control throughput.
Export jobs run on the east coast only; data is synchronized in real time with the west-coast ring. Generated APIs are automatically instrumented (Graphite) and monitored (Nagios).
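The throttling mechanism mentioned above can be sketched as a simple rate cap: between batches, the exporter sleeps just long enough to keep the observed write rate under a configured rows-per-second limit. A minimal sketch in plain Scala; the object and parameter names are hypothetical, not the actual xPatterns API:

```scala
// Sketch of an export throttle: given how many rows have been written and how
// much wall-clock time has elapsed, compute how long to sleep so the overall
// rate stays at or below maxRowsPerSec (protecting the Cassandra ring).
object ExportThrottle {
  def throttleDelayMs(rowsWritten: Long, elapsedMs: Long, maxRowsPerSec: Long): Long = {
    // Minimum elapsed time required for rowsWritten at the allowed rate.
    val minElapsedMs = (rowsWritten * 1000L) / maxRowsPerSec
    math.max(0L, minElapsedMs - elapsedMs)
  }
}
```

An export loop would call `throttleDelayMs` after each batch and `Thread.sleep` for the returned duration; a delay of 0 means the job is already under the cap.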
Referral Provider Network (RPN): one of the six applications we built for our healthcare customer using the xPatterns APIs and tools on the new beyond-Hadoop infrastructure: the ELT pipeline and the export-to-NoSQL API. The dashboard for the RPN application was built with D3.js and Angular against the generic API published by the export tool.
The application allows building a graph of downstream and upstream referred and referring providers, grouped by specialty and annotated with computed aggregates such as patient counts, claim counts and total charged amounts. RPN is used both for fraud detection and for aiding a clinic-buying decision by following the busiest graph paths.
The dataset behind the app consists of 8 billion medical records, from which we extracted 1.7 million providers (Shark warehouse) and built 53 million relationships in the graph (persisted in Cassandra).
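The per-specialty aggregates described above can be sketched as a roll-up over referral edges. A minimal sketch in plain Scala (the `Referral` case class and field names are hypothetical illustrations, not the actual warehouse schema); in production this shape of aggregation ran over the Shark warehouse before the results were persisted to Cassandra:

```scala
// One edge in the referral graph: a referring provider, a referred-to
// provider, the target's specialty, and the aggregates attached to the edge.
case class Referral(from: String, to: String, specialty: String,
                    patients: Long, claims: Long, charged: Double)

object Rpn {
  // Roll up patient counts, claim counts and total charged amounts
  // per referred-to specialty, as shown on the RPN dashboard.
  def bySpecialty(edges: Seq[Referral]): Map[String, (Long, Long, Double)] =
    edges.groupBy(_.specialty).map { case (spec, es) =>
      spec -> ((es.map(_.patients).sum, es.map(_.claims).sum, es.map(_.charged).sum))
    }
}
```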