SlideShare une entreprise Scribd logo
1  sur  46
Télécharger pour lire hors ligne
Data Science With R
~ for ~
Java Developers
@Sander_Mak
Agenda
Data Science
The R language
Gimme some Java!
1
1
1
1
11
1
1
0
0
0
0
0
0
90% of the world’s data was
produced in the last 2 years
- SINTEF/ScienceDaily June 2013!!!!!!!!
We need more than
just CRUD
Stand back.
I know Data Science!
Software
Engineering
Domain
Expertise
Math &
Statistics
Data
Science
Machine
Learning
Operations
Research
Danger!
Perl ahead!
Software
Engineering
Domain
Expertise
Math &
Statistics
Data
Science
Machine
Learning
Operations
Research
Danger!
Perl ahead!
Data Science:
Achievement Unlocked
R, R-Studio
Today
Data Science:
Achievement Unlocked
Agenda
Data Science
The R language
Gimme some Java!
1
1
1 1
11
1
1
0
0
0
0
0
0
Language
Designers
Statisticians
Language
Designers?
Statisticians?
The best thing about R is that it was developed by statisticians. The
worst thing about R is that... it was developed by statisticians.
- Bo Cowgill, Google
Why R, then?
Open Source
De-facto standard (in statistical research)
“It’s a DSL posing as general purpose language”
Interactive data exploration
Why not R, then?
Slow
Memory Bound
(Did I mention it’s a quirky language?)
Try googling for R...
Why not R, then?
‘If you are using R and you think
you’re in hell, this is a map for you.’
- The R Inferno
Slow
Memory Bound
(Did I mention it’s a quirky language?)
Try googling for R...
Apparently, statisticians aren’t designers, either...
VS
Dynamic (eval)
Interpreted
Static types
Compiled
Functional/OO/Procedural OO
Factor Enum
numeric
character String
Integer/Double/...
Factor Enum
numeric
character String
vector
list
dataframe
Integer/Double/...
1-based 0-based1
2
3
4
0
1
2
3
1-based 0-based1
2
3
4
0
1
2
3
for-loops
higher-order functions
sapply(vec, function(elm) {
elm + 1;
})
eager evalutationlazy evaluation
eager evalutationlazy evaluation
pass-by-value
(copy-on-write)
pass-by-reference
Function F
Value A Value A
Value A’
call F(A) modify
Studio
Central
Comprehensive
R
Archive
Network
Studio
Coding time!
Titanic Competition:
Machine Learning from Disaster
Titanic Competition:
Machine Learning from Disaster
Titanic Competition:
Machine Learning from Disaster
Sex == Female
Decision Tree
Age > 50Age > 16
Fare > 100
T FT T F
Titanic Competition:
Machine Learning from Disaster
Sex == Female
Decision Tree
Age > 50Age > 16
Random Forest
Fare > 100
T FT T F
T
FT T FT
FT T F
T
FT T FT
FT T F
Demo time!
...
...
Agenda
Data Science
The R language
Gimme some Java!
1
1
1 1
11
1
1
0
0
0
0
0
0
Bridging R and Java
Integrate
Assimilate
Replace
rJava & Java/R interface
Integrate
Two way native interface
- JNI: libjri
- or TCP to RServe
Rengine re = new Rengine(new String[] {}, false, null);
// wait until engine is ready
if (!re.waitForR()) {
throw new IllegalStateException(“Can’t load R engine”);
}
re.eval("data(cars)", false);
REXP cars = re.eval("cars");
RVector carsVector = cars.asVector();
// dissect carsVector...
Assimilate
Reimplementation of R on JVM
Fast & lean
Parallelized
Just-another-lib
... not production ready yet...
Assimilate
// create a script engine manager
ScriptEngineManager factory =
new ScriptEngineManager();
// create an R engine
ScriptEngine engine =
factory.getEngineByName("Renjin");
// load package from classpath
engine.eval(“library(survey)");
// evaluate R code from String
engine.eval("print('Hello from R')");
Reimplementation of R on JVM
Fast & lean
Parallelized
Just-another-lib
... not production ready yet...
Reimplementation of R on JVM
Share data:
Integer[] data = {1, 2, 3};
engine.put("data", data);
engine.eval("print(sum(data))");
Assimilate
Reimplementation of R on JVM
Share data:
import(com.foo.User)
# instantiate Java beans
tim <- User$new(name='Tim', age=23)
tom <- User$new(name='Tom', age=45)
# invoke setter
tim$name <- "Timmy"
Use Java from Renjin:
Integer[] data = {1, 2, 3};
engine.put("data", data);
engine.eval("print(sum(data))");
Assimilate
Big Data?
Replace
JVM Libraries/platforms
Replace
Scalable R distributions
(non-JVM)
Revolution Analytics
Oracle Enterprise R
Wrap-up
Data Science
The R language
Gimme some Java!
1
1
1 1
11
1
1
0
0
0
0
0
0
Sanitize
Explore
Model Predict
Scale
Next steps
Computing for Data Analysis
starts Sept. 23rd
Install R Read
Questions?
Data Science
The R language
Gimme some Java!
1
1
1 1
1
1
110
0
0
0
0
0
@Sander_Mak
branchandbound.net

Contenu connexe

Plus de Sander Mak (@Sander_Mak)

TypeScript: coding JavaScript without the pain
TypeScript: coding JavaScript without the painTypeScript: coding JavaScript without the pain
TypeScript: coding JavaScript without the painSander Mak (@Sander_Mak)
 
The Ultimate Dependency Manager Shootout (QCon NY 2014)
The Ultimate Dependency Manager Shootout (QCon NY 2014)The Ultimate Dependency Manager Shootout (QCon NY 2014)
The Ultimate Dependency Manager Shootout (QCon NY 2014)Sander Mak (@Sander_Mak)
 
Cross-Build Injection attacks: how safe is your Java build?
Cross-Build Injection attacks: how safe is your Java build?Cross-Build Injection attacks: how safe is your Java build?
Cross-Build Injection attacks: how safe is your Java build?Sander Mak (@Sander_Mak)
 
Hibernate Performance Tuning (JEEConf 2012)
Hibernate Performance Tuning (JEEConf 2012)Hibernate Performance Tuning (JEEConf 2012)
Hibernate Performance Tuning (JEEConf 2012)Sander Mak (@Sander_Mak)
 
Java 7: Fork/Join, Invokedynamic and the future
Java 7: Fork/Join, Invokedynamic and the futureJava 7: Fork/Join, Invokedynamic and the future
Java 7: Fork/Join, Invokedynamic and the futureSander Mak (@Sander_Mak)
 

Plus de Sander Mak (@Sander_Mak) (20)

Modules or microservices?
Modules or microservices?Modules or microservices?
Modules or microservices?
 
Migrating to Java 9 Modules
Migrating to Java 9 ModulesMigrating to Java 9 Modules
Migrating to Java 9 Modules
 
Java 9 Modularity in Action
Java 9 Modularity in ActionJava 9 Modularity in Action
Java 9 Modularity in Action
 
Java modularity: life after Java 9
Java modularity: life after Java 9Java modularity: life after Java 9
Java modularity: life after Java 9
 
Provisioning the IoT
Provisioning the IoTProvisioning the IoT
Provisioning the IoT
 
Event-sourced architectures with Akka
Event-sourced architectures with AkkaEvent-sourced architectures with Akka
Event-sourced architectures with Akka
 
TypeScript: coding JavaScript without the pain
TypeScript: coding JavaScript without the painTypeScript: coding JavaScript without the pain
TypeScript: coding JavaScript without the pain
 
The Ultimate Dependency Manager Shootout (QCon NY 2014)
The Ultimate Dependency Manager Shootout (QCon NY 2014)The Ultimate Dependency Manager Shootout (QCon NY 2014)
The Ultimate Dependency Manager Shootout (QCon NY 2014)
 
Modular JavaScript
Modular JavaScriptModular JavaScript
Modular JavaScript
 
Modularity in the Cloud
Modularity in the CloudModularity in the Cloud
Modularity in the Cloud
 
Cross-Build Injection attacks: how safe is your Java build?
Cross-Build Injection attacks: how safe is your Java build?Cross-Build Injection attacks: how safe is your Java build?
Cross-Build Injection attacks: how safe is your Java build?
 
Scala & Lift (JEEConf 2012)
Scala & Lift (JEEConf 2012)Scala & Lift (JEEConf 2012)
Scala & Lift (JEEConf 2012)
 
Hibernate Performance Tuning (JEEConf 2012)
Hibernate Performance Tuning (JEEConf 2012)Hibernate Performance Tuning (JEEConf 2012)
Hibernate Performance Tuning (JEEConf 2012)
 
Akka (BeJUG)
Akka (BeJUG)Akka (BeJUG)
Akka (BeJUG)
 
Fork Join (BeJUG 2012)
Fork Join (BeJUG 2012)Fork Join (BeJUG 2012)
Fork Join (BeJUG 2012)
 
Fork/Join for Fun and Profit!
Fork/Join for Fun and Profit!Fork/Join for Fun and Profit!
Fork/Join for Fun and Profit!
 
Kscope11 recap
Kscope11 recapKscope11 recap
Kscope11 recap
 
Java 7: Fork/Join, Invokedynamic and the future
Java 7: Fork/Join, Invokedynamic and the futureJava 7: Fork/Join, Invokedynamic and the future
Java 7: Fork/Join, Invokedynamic and the future
 
Scala and Lift
Scala and LiftScala and Lift
Scala and Lift
 
Elevate your webapps with Scala and Lift
Elevate your webapps with Scala and LiftElevate your webapps with Scala and Lift
Elevate your webapps with Scala and Lift
 

Dernier

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Dernier (20)

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Data Science with R for Java developers