SlideShare une entreprise Scribd logo
1  sur  26
Télécharger pour lire hors ligne
@iconara
CHASING THE ELEPHANT
Theo / @iconara
chief architect at BURT
big data analytics with JRuby
RUBY
RUBY
RUBY JRUBY
JRUBY IS AWESOME
BECAUSE RUBY IS GREAT,
AND THE JVM IS GREAT
hot_bunnies, eurydice, multimeter, mikka, msgpack-jruby
HADOOP
JRUBY SUPERCOMPUTING
40 TiB data, 120 EC2 cc2.8xlarge, 1920 cores, 7260 GiB RAM
JAVA ALL THE WAY DOWN
HADOOP STREAMING
Wukong, Dumbo
RUBYDOOP
rubydoop.org
+ =
main()
Class.forName("...")
Class.forName("...")
module WordCount
class Mapper
def map(key, value, context)
value.to_s.downcase.split.each do |word|
key = Hadoop::Io::Text.new(word)
value = Hadoop::Io::IntWritable.new(1)
context.write(key, value)
end
end
end
end
RUBYDOOP IS LOW LEVEL
I would love to see someone write something
like Scalding or Cascading on top of it
RUBYDOOP
rubydoop.org
RUBYDOOP
rubydoop.org
v1.1.0
KTHXBAI
@iconara
github.com/iconara
architecturalatrocities.com
burtcorp.com

Contenu connexe

En vedette

Standing on the shoulders of giants with JRuby
Standing on the shoulders of giants with JRubyStanding on the shoulders of giants with JRuby
Standing on the shoulders of giants with JRubyTheo Hultberg
 
Shortcuts around the mistakes I've made scaling MongoDB
Shortcuts around the mistakes I've made scaling MongoDB Shortcuts around the mistakes I've made scaling MongoDB
Shortcuts around the mistakes I've made scaling MongoDB Theo Hultberg
 
Evaluation question 2
Evaluation question 2Evaluation question 2
Evaluation question 2Connor27
 
Karl murdock POL
Karl murdock POLKarl murdock POL
Karl murdock POLmurdockka
 
Learning to build distributed systems the hard way
Learning to build distributed systems the hard wayLearning to build distributed systems the hard way
Learning to build distributed systems the hard wayTheo Hultberg
 
La republic dominicana
La republic dominicanaLa republic dominicana
La republic dominicanacielonicole
 

En vedette (7)

Standing on the shoulders of giants with JRuby
Standing on the shoulders of giants with JRubyStanding on the shoulders of giants with JRuby
Standing on the shoulders of giants with JRuby
 
Shortcuts around the mistakes I've made scaling MongoDB
Shortcuts around the mistakes I've made scaling MongoDB Shortcuts around the mistakes I've made scaling MongoDB
Shortcuts around the mistakes I've made scaling MongoDB
 
Oplag3
Oplag3Oplag3
Oplag3
 
Evaluation question 2
Evaluation question 2Evaluation question 2
Evaluation question 2
 
Karl murdock POL
Karl murdock POLKarl murdock POL
Karl murdock POL
 
Learning to build distributed systems the hard way
Learning to build distributed systems the hard wayLearning to build distributed systems the hard way
Learning to build distributed systems the hard way
 
La republic dominicana
La republic dominicanaLa republic dominicana
La republic dominicana
 

Similaire à Chasing the Elephant with JRuby and Big Data

Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...IndicThreads
 
Mac ruby to the max - Brendan G. Lim
Mac ruby to the max - Brendan G. LimMac ruby to the max - Brendan G. Lim
Mac ruby to the max - Brendan G. LimThoughtWorks
 
MacRuby to The Max
MacRuby to The MaxMacRuby to The Max
MacRuby to The MaxBrendan Lim
 
Macruby& Hotcocoa presentation by Rich Kilmer
Macruby& Hotcocoa presentation by Rich KilmerMacruby& Hotcocoa presentation by Rich Kilmer
Macruby& Hotcocoa presentation by Rich KilmerMatt Aimonetti
 
The Fundamentals Guide to HDP and HDInsight
The Fundamentals Guide to HDP and HDInsightThe Fundamentals Guide to HDP and HDInsight
The Fundamentals Guide to HDP and HDInsightGert Drapers
 
GC in Ruby. RubyC, Kiev, 2014.
GC in Ruby. RubyC, Kiev, 2014.GC in Ruby. RubyC, Kiev, 2014.
GC in Ruby. RubyC, Kiev, 2014.Timothy Tsvetkov
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKSkills Matter
 
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...Databricks
 
Scalding Presentation
Scalding PresentationScalding Presentation
Scalding PresentationLandoop Ltd
 
PySpark with Juypter
PySpark with JuypterPySpark with Juypter
PySpark with JuypterLi Ming Tsai
 
NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and HowBigBlueHat
 
CouchDB and Rails on the Cloud
CouchDB and Rails on the CloudCouchDB and Rails on the Cloud
CouchDB and Rails on the Cloudrockyjaiswal
 
Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018Holden Karau
 
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...AMD Developer Central
 
Developing cross platform desktop application with Ruby
Developing cross platform desktop application with RubyDeveloping cross platform desktop application with Ruby
Developing cross platform desktop application with RubyAnis Ahmad
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourPeter Friese
 
An Overview of Apache Spark
An Overview of Apache SparkAn Overview of Apache Spark
An Overview of Apache SparkYasoda Jayaweera
 

Similaire à Chasing the Elephant with JRuby and Big Data (20)

Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
 
Hadoop Jungle
Hadoop JungleHadoop Jungle
Hadoop Jungle
 
Mac ruby to the max - Brendan G. Lim
Mac ruby to the max - Brendan G. LimMac ruby to the max - Brendan G. Lim
Mac ruby to the max - Brendan G. Lim
 
MacRuby to The Max
MacRuby to The MaxMacRuby to The Max
MacRuby to The Max
 
Macruby& Hotcocoa presentation by Rich Kilmer
Macruby& Hotcocoa presentation by Rich KilmerMacruby& Hotcocoa presentation by Rich Kilmer
Macruby& Hotcocoa presentation by Rich Kilmer
 
The Fundamentals Guide to HDP and HDInsight
The Fundamentals Guide to HDP and HDInsightThe Fundamentals Guide to HDP and HDInsight
The Fundamentals Guide to HDP and HDInsight
 
GC in Ruby. RubyC, Kiev, 2014.
GC in Ruby. RubyC, Kiev, 2014.GC in Ruby. RubyC, Kiev, 2014.
GC in Ruby. RubyC, Kiev, 2014.
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
 
Scala+data
Scala+dataScala+data
Scala+data
 
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
 
Scalding Presentation
Scalding PresentationScalding Presentation
Scalding Presentation
 
PySpark with Juypter
PySpark with JuypterPySpark with Juypter
PySpark with Juypter
 
NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and How
 
Up and running with pyspark
Up and running with pysparkUp and running with pyspark
Up and running with pyspark
 
CouchDB and Rails on the Cloud
CouchDB and Rails on the CloudCouchDB and Rails on the Cloud
CouchDB and Rails on the Cloud
 
Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018
 
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
 
Developing cross platform desktop application with Ruby
Developing cross platform desktop application with RubyDeveloping cross platform desktop application with Ruby
Developing cross platform desktop application with Ruby
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 Hour
 
An Overview of Apache Spark
An Overview of Apache SparkAn Overview of Apache Spark
An Overview of Apache Spark
 

Plus de Theo Hultberg

AWS Cost Optimization
AWS Cost OptimizationAWS Cost Optimization
AWS Cost OptimizationTheo Hultberg
 
Cassandra for all the Things
Cassandra for all the ThingsCassandra for all the Things
Cassandra for all the ThingsTheo Hultberg
 
Building a CQL driver
Building a CQL driverBuilding a CQL driver
Building a CQL driverTheo Hultberg
 
Learning to build distributed systems the hard way
Learning to build distributed systems the hard wayLearning to build distributed systems the hard way
Learning to build distributed systems the hard wayTheo Hultberg
 
Learning to Build Distributed Systems the Hard Way
Learning to Build Distributed Systems the Hard WayLearning to Build Distributed Systems the Hard Way
Learning to Build Distributed Systems the Hard WayTheo Hultberg
 
A Guide to the Post Relational Revolution
A Guide to the Post Relational RevolutionA Guide to the Post Relational Revolution
A Guide to the Post Relational RevolutionTheo Hultberg
 
Concurrency and Distributed Systems Using JRuby
Concurrency and Distributed Systems Using JRubyConcurrency and Distributed Systems Using JRuby
Concurrency and Distributed Systems Using JRubyTheo Hultberg
 

Plus de Theo Hultberg (7)

AWS Cost Optimization
AWS Cost OptimizationAWS Cost Optimization
AWS Cost Optimization
 
Cassandra for all the Things
Cassandra for all the ThingsCassandra for all the Things
Cassandra for all the Things
 
Building a CQL driver
Building a CQL driverBuilding a CQL driver
Building a CQL driver
 
Learning to build distributed systems the hard way
Learning to build distributed systems the hard wayLearning to build distributed systems the hard way
Learning to build distributed systems the hard way
 
Learning to Build Distributed Systems the Hard Way
Learning to Build Distributed Systems the Hard WayLearning to Build Distributed Systems the Hard Way
Learning to Build Distributed Systems the Hard Way
 
A Guide to the Post Relational Revolution
A Guide to the Post Relational RevolutionA Guide to the Post Relational Revolution
A Guide to the Post Relational Revolution
 
Concurrency and Distributed Systems Using JRuby
Concurrency and Distributed Systems Using JRubyConcurrency and Distributed Systems Using JRuby
Concurrency and Distributed Systems Using JRuby
 

Dernier

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Dernier (20)

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Chasing the Elephant with JRuby and Big Data