INTRODUCTION TO R
AGENDA
• History and evolution of R
• Principle and software paradigm
• Description of R interface
• Advantages of R
• Drawbacks of R
• So why use R?
• References for learning R
HISTORY AND EVOLUTION OF R
Origin at Bell Labs in the 1970s
R developed from the S language
S Version 1, S Version 2, S Version 3, S Version 4
Developed 30 years ago for research applied to the high-tech industry
The regular development of R
1990s: R developed concurrently with S
1993: R made public
Acceleration of R development
• R-help and R-devel mailing lists
• Creation of the R Core Group
Source: R Journal Vol 1/2
Growing number of packages
2000: R version 1.0.1
2001: ~100 packages
2009: over 2,000 packages
Today: R version 2.14
Source: R Journal Vol 1/2
Explosion of R popularity in the last decade
• Object-oriented, growing user base, scripting features
• Free and open-source
• Irrational reasons: R seen as “cool”
Comparison of Mailing Lists
Figure: evolution of traffic on the main mailing list of each software package. Source: R.A. Muenchen, r4stats.com
Popularity amongst programming languages
Source: KDnuggets 2012 survey
Number of Blogs
Software    Number of blogs
R           365
SAS         40
Stata       8
Others      0-3
Data as of March 2012
PRINCIPLE AND SOFTWARE PARADIGM
R is not really a (statistical) software package
• R is more of a programming language
• Limited user-friendly interfaces for data analysis
• Object-oriented and almost entirely non-declarative
• Similar to programming languages like Fortran, C, Java, Python
R has limited Graphical User Interface (GUI) options
Recent endeavours to enhance R user-friendliness
Several GUIs in development
• R Commander
• RKWard
• Rattle
Screenshots: R Commander (Rcmdr), RKWard, Rattle
Inherent limitations of pervasive Excel-like spreadsheets (R vs. Excel)
Sophisticated but costly SAS (R vs. SAS)
Screenshot of SAS Enterprise Miner 7.1. Source: sas.com
DESCRIPTION OF R INTERFACE
R console
• R desktop shortcut
• RGui: R basic interface
• R command line (space to write instructions)
Using the command line in R console
• First, an incorrect sentence followed by R’s error message
• Second, a correct sentence
• Declaration and printing of the sentence as an R object
• Simple math computations
• Basic information about the R object containing the sentence
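The console steps described on this slide can be sketched as follows. This is a minimal reconstruction, not the content of the original screenshot: the sentence "Hello world" and the object name `sentence` are assumptions chosen for illustration.

```r
# An unquoted sentence is not valid R code, so R reports a syntax error.
# try() catches the error so that the rest of the script keeps running.
result <- try(eval(parse(text = "Hello world")), silent = TRUE)
inherits(result, "try-error")   # TRUE: the bare sentence could not be parsed

# Quoting the sentence turns it into a valid character string.
"Hello world"

# Declare the sentence as an R object, then print it.
sentence <- "Hello world"
print(sentence)

# Simple math computations.
2 + 3
sqrt(16) * 2

# Basic information about the object containing the sentence.
class(sentence)    # type of the object
length(sentence)   # number of elements (one string)
nchar(sentence)    # number of characters in the string
str(sentence)      # compact summary of the object
```

Typing each line at the `>` prompt of the R console shows the result immediately, which is the interactive style the slide illustrates.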
RGui menu: File tab
File tab: usual basic and general operations
RGui menu: Edit tab
Edit tab: basic and general editing
Data editor: entering the object’s name
Results of the data editor
RGui menu: View tab
View tab: showing the toolbar and/or status bar
RGui menu: Misc tab
Misc tab: miscellaneous operations
RGui menu: Packages tab
Packages tab: adding functions to the base R installation
RGui menu: Windows tab
Windows tab: usual options to arrange the windows
RGui menu: Help tab
Help tab: very important links to help
ADVANTAGES OF R
R “philosophy”
• Open source code
• You can access the code of the software
• In-depth understanding of what R does
• Modify the code
Example: “mgcv” package webpage
• Address of the “mgcv” package
• Link to the package sources (.tar.gz file)
Screenshot of the CRAN webpage of the “mgcv” package. Source: CRAN
R access to source code
Example of source code of the “mgcv” package
1. Unzipping the mgcv_1.7-13.tar.gz file (with 7-Zip)
2. List of directories in the “mgcv” package
3. List of functions (i.e. open code) in the “src” (i.e. source code) directory of the “mgcv” package
Screenshot of unzipping the “mgcv” package and browsing through the package’s files.
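Besides browsing a package archive, source code can also be inspected directly from the console. The sketch below uses `sd` and `sum`, two functions that ship with R; they are picked here only for illustration and do not appear in the slides.

```r
# Typing a function's name without parentheses prints its R source.
sd

# The same source can be captured as text for closer inspection.
src <- deparse(sd)
cat(src, sep = "\n")

# Functions implemented in compiled code show only a short stub here;
# their C or Fortran source lives in the sources of R or of the package.
sum
is.primitive(sum)
```

For functions written in R this is often the quickest way to see what a call actually does; for compiled code, the "src" directory of the source archive remains the place to look.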
R is free
Software           Academics       Demo            Commercial (basic)   Commercial (full)
R                  Free            Free            Free                 Free
SAS                Free to $100s   Not available   $1,000s              $10,000s
Statistica         $100s           30 days limit   ~$1,000              $10,000
Excel (Microsoft)  Free to $10s    Limited         ~$100                $100s
SPSS (IBM)         $100s           14 days limit   ~$2,000              $1,000s
Interface with other languages and scripting capabilities
Interfaces with virtually any other programming language
• Fortran, C, C++, Python…
• Tailor or rewrite your old code in R
R as a scripting language
• R scripts can launch or be launched by other languages
Screenshot of the file “mgcv.c” of the “mgcv” package open in WordPad: the file is written in typical C
R visualization capabilities
R role in academia
• R: a tool used by the finest researchers
• Top-notch analytics capabilities
Screenshot of a user’s Facebook map. Source: Paul Butler/Facebook, DG Rossiter, spatialanalysis.co.uk
To summarize
Free open source philosophy
• R websites with many examples
• Free books
• Free online open courses
• Twitter accounts
Online help and discussion
• Mailing lists
• Very active and diverse forums
• Communities of developers and helpers
DRAWBACKS OF R
Average memory performance
Poor management of large datasets
• Avoid nested loops
• Prefer R’s advanced constructs for data structures (e.g. vectorized operations)
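The advice to avoid loops inside loops can be illustrated with a small comparison. The matrix `m` and the sum-of-squares task are hypothetical, chosen only to keep the sketch short.

```r
# Hypothetical data: a small numeric matrix (not from the slides).
m <- matrix(1:6, nrow = 2)

# Nested loops: add up the squares of all elements one by one.
total_loop <- 0
for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    total_loop <- total_loop + m[i, j]^2
  }
}

# Vectorized equivalent: operate on the whole matrix at once.
total_vec <- sum(m^2)

# Both approaches give the same result.
total_loop == total_vec
```

On large datasets the vectorized form is usually much faster, because the looping happens in compiled code rather than in the R interpreter.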
Complicated structure of packages in R
• Dozens of packages
• To be loaded into memory every time
R packages to better manage memory
• RHadoop (inspired by Google’s MapReduce)
• ff
• bigmemory
Average computing performance
No default parallel execution
• R packages to use several cores
• Top skills needed for high-performance computing
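As one possible illustration of using several cores, the sketch below relies on the `parallel` package, which has shipped with R since version 2.14. The function `slow_square` and the choice of 2 workers are assumptions made for the example, not recommendations from the slides.

```r
library(parallel)  # bundled with R since version 2.14

# Hypothetical task: square a number (stands in for a costly computation).
slow_square <- function(x) x^2

# Start a small cluster of 2 worker processes (an arbitrary choice here).
cl <- makeCluster(2)

# parLapply() splits the input across the workers and collects the results.
res <- parLapply(cl, 1:8, slow_square)

# Always release the workers when done.
stopCluster(cl)

unlist(res)
```

Nothing runs in parallel unless the code asks for it explicitly like this, which is the point the slide makes about the lack of default parallel execution.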
A high-level programming language
• Abstract and modern (like Python…)
• More productive coding
• But further from “machine language”…
• … which can make code up to roughly 100 times slower than C
Difficult data visualization and management
Difficult to inspect data sets
Screenshot of the R data editor and the “Viewtable” tab in SAS 9.3
Difficult architecture management
Problems for large organizations
• R is made of several thousand independent packages
• No deployment plan for complex organizations
• No installation support
Lack of code accountability
• Thousands of individual independent R developers
• Nobody responsible for the quality of the code
Potentially high hidden costs with R
• Total cost may favour commercial solutions for complex computations made in large corporations
Relatively difficult to learn
Steep learning curve
• R code is far from what is taught in undergraduate computer science courses
• Very complex data structures (useful if mastered)
• Is R’s syntax not logical?
Still, not more difficult to learn than SAS
• Both SAS and R are more abstract than basic programming languages (Fortran, C…)
• Difficult to learn = more rewarding professionally!
SO WHY LEARN R?
More positive than negative points
No language is perfect!
• Contradictory objectives to meet
• Strengths and weaknesses of each language
Different needs imply different tools
• Large corporations + defined procedures → SAS-like
• Fewer financial resources + quick proof of concept → R
Effect of legacy and the culture of the organization
• Use existing solutions (system architecture, BA tools…)
• Habits in business analytics
Very appealing solution
Popularity of business analytics software (green = very popular, red = unpopular). Source: Rexer Analytics
Columns: Overall, Corporate, Consultants, Academics, NGO/Gov't
Rows: R, SAS, IBM SPSS, STATISTICA, Own code
REFERENCES FOR LEARNING R
Books
Many books available: choose the one that fits you!
• Style, pedagogy, theory vs practice
• Browse several books at a local library or store
Springer’s Use R! series (http://www.springer.com/series/6991)
• Recent, concise, good quality, affordable, diverse
Pure rookies: “A Beginner’s Guide to R”, “R by Example”
One step forward: “Business Analytics for Managers”
Intensive Excel users: “R through Excel”
O’Reilly R series (for programmers)
“R Cookbook”, “R in a Nutshell”
Websites
R official websites
• The R Project for Statistical Computing (www.r-project.org)
• Mailing lists (“R-help”, Special Interest Groups) and The R Journal
• Official (austere) manuals (“An Introduction to R”)
Other websites
• UCLA online R resources (http://www.ats.ucla.edu/stat/r/)
• R blogs aggregator (www.r-bloggers.com)
• Social networks: LinkedIn groups (The R Project for Statistical Computing), Twitter accounts (@RevolutionR, @inside_R), job boards (Analytical Bridge…)
Conferences
Growing number of conferences about R
Official international useR! conference
• Held annually over a few days in a new venue (Google it!)
• Lots of materials about many topics
Other conferences or venues
• Conferences about business analytics (data mining, specialized topics…) with sessions involving R
• Find (or even start!) an R user group close to your location (R Wiki geographical list, map of groups on “meetup.com”)
• Events and news from the R-bloggers blog

Contenu connexe

Tendances

R Programming Language
R Programming LanguageR Programming Language
R Programming LanguageNareshKarela1
 
Introduction to R and R Studio
Introduction to R and R StudioIntroduction to R and R Studio
Introduction to R and R StudioRupak Roy
 
Introduction to NumPy (PyData SV 2013)
Introduction to NumPy (PyData SV 2013)Introduction to NumPy (PyData SV 2013)
Introduction to NumPy (PyData SV 2013)PyData
 
Data Visualization in Python
Data Visualization in PythonData Visualization in Python
Data Visualization in PythonJagriti Goswami
 
Introduction to data analysis using R
Introduction to data analysis using RIntroduction to data analysis using R
Introduction to data analysis using RVictoria López
 
Data Types and Structures in R
Data Types and Structures in RData Types and Structures in R
Data Types and Structures in RRupak Roy
 
Descriptive Statistics with R
Descriptive Statistics with RDescriptive Statistics with R
Descriptive Statistics with RKazuki Yoshida
 
Introduction to R programming
Introduction to R programmingIntroduction to R programming
Introduction to R programmingVictor Ordu
 
Workshop presentation hands on r programming
Workshop presentation hands on r programmingWorkshop presentation hands on r programming
Workshop presentation hands on r programmingNimrita Koul
 
Introduction to pandas
Introduction to pandasIntroduction to pandas
Introduction to pandasPiyush rai
 
R programming Fundamentals
R programming  FundamentalsR programming  Fundamentals
R programming FundamentalsRagia Ibrahim
 
Data visualization using R
Data visualization using RData visualization using R
Data visualization using RUmmiya Mohammedi
 
2 R Tutorial Programming
2 R Tutorial Programming2 R Tutorial Programming
2 R Tutorial ProgrammingSakthi Dasans
 
Data Structures in Python
Data Structures in PythonData Structures in Python
Data Structures in PythonDevashish Kumar
 

Tendances (20)

R Programming Language
R Programming LanguageR Programming Language
R Programming Language
 
Introduction to R and R Studio
Introduction to R and R StudioIntroduction to R and R Studio
Introduction to R and R Studio
 
Introduction to NumPy (PyData SV 2013)
Introduction to NumPy (PyData SV 2013)Introduction to NumPy (PyData SV 2013)
Introduction to NumPy (PyData SV 2013)
 
Data Visualization in Python
Data Visualization in PythonData Visualization in Python
Data Visualization in Python
 
Introduction to data analysis using R
Introduction to data analysis using RIntroduction to data analysis using R
Introduction to data analysis using R
 
Data Types and Structures in R
Data Types and Structures in RData Types and Structures in R
Data Types and Structures in R
 
Descriptive Statistics with R
Descriptive Statistics with RDescriptive Statistics with R
Descriptive Statistics with R
 
Introduction to R programming
Introduction to R programmingIntroduction to R programming
Introduction to R programming
 
R programming
R programmingR programming
R programming
 
Workshop presentation hands on r programming
Workshop presentation hands on r programmingWorkshop presentation hands on r programming
Workshop presentation hands on r programming
 
Getting Started with R
Getting Started with RGetting Started with R
Getting Started with R
 
Introduction to pandas
Introduction to pandasIntroduction to pandas
Introduction to pandas
 
R programming Fundamentals
R programming  FundamentalsR programming  Fundamentals
R programming Fundamentals
 
Data visualization using R
Data visualization using RData visualization using R
Data visualization using R
 
An introduction to R
An introduction to RAn introduction to R
An introduction to R
 
2 R Tutorial Programming
2 R Tutorial Programming2 R Tutorial Programming
2 R Tutorial Programming
 
Data Management in R
Data Management in RData Management in R
Data Management in R
 
Data Structures in Python
Data Structures in PythonData Structures in Python
Data Structures in Python
 
Data Mining with R programming
Data Mining with R programmingData Mining with R programming
Data Mining with R programming
 
R programming
R programmingR programming
R programming
 

Similaire à Class ppt intro to r

R as supporting tool for analytics and simulation
R as supporting tool for analytics and simulationR as supporting tool for analytics and simulation
R as supporting tool for analytics and simulationAlvaro Gil
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programminghemasri56
 
Business Analytics with R
Business Analytics with RBusiness Analytics with R
Business Analytics with REdureka!
 
Business Analytics with R
Business Analytics with RBusiness Analytics with R
Business Analytics with REdureka!
 
R programming language
R programming languageR programming language
R programming languageKeerti Verma
 
Introduction to R and Installation.pptx
Introduction  to R and Installation.pptxIntroduction  to R and Installation.pptx
Introduction to R and Installation.pptxDhanshyam Mahavadi
 
R meet up slides.pptx
R meet up slides.pptxR meet up slides.pptx
R meet up slides.pptxCorey Sparks
 
BUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIASTBUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIASTHaritikaChhatwal1
 
An introduction to R is a document useful
An introduction to R is a document usefulAn introduction to R is a document useful
An introduction to R is a document usefulssuser3c3f88
 
The History and Use of R
The History and Use of RThe History and Use of R
The History and Use of RAnalyticsWeek
 
How to get started with R programming
How to get started with R programmingHow to get started with R programming
How to get started with R programmingRamon Salazar
 

Similaire à Class ppt intro to r (20)

R programming
R programmingR programming
R programming
 
Reason To learn & use r
Reason To learn & use rReason To learn & use r
Reason To learn & use r
 
R as supporting tool for analytics and simulation
R as supporting tool for analytics and simulationR as supporting tool for analytics and simulation
R as supporting tool for analytics and simulation
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programming
 
Business Analytics with R
Business Analytics with RBusiness Analytics with R
Business Analytics with R
 
R introduction
R introductionR introduction
R introduction
 
Business Analytics with R
Business Analytics with RBusiness Analytics with R
Business Analytics with R
 
R programming language
R programming languageR programming language
R programming language
 
Introduction to R and Installation.pptx
Introduction  to R and Installation.pptxIntroduction  to R and Installation.pptx
Introduction to R and Installation.pptx
 
R program
R programR program
R program
 
R crash course
R crash courseR crash course
R crash course
 
Introtor
IntrotorIntrotor
Introtor
 
R meet up slides.pptx
R meet up slides.pptxR meet up slides.pptx
R meet up slides.pptx
 
BUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIASTBUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIAST
 
Garishma xcs
Garishma xcsGarishma xcs
Garishma xcs
 
An introduction to R is a document useful
An introduction to R is a document usefulAn introduction to R is a document useful
An introduction to R is a document useful
 
R for data analytics
R for data analyticsR for data analytics
R for data analytics
 
The History and Use of R
The History and Use of RThe History and Use of R
The History and Use of R
 
R presentation
R presentationR presentation
R presentation
 
How to get started with R programming
How to get started with R programmingHow to get started with R programming
How to get started with R programming
 

Plus de JigsawAcademy2014 (11)

WebC2 t1 t2-t3
WebC2 t1 t2-t3WebC2 t1 t2-t3
WebC2 t1 t2-t3
 
C1 t1,t2,t3,t4 complete
C1 t1,t2,t3,t4 completeC1 t1,t2,t3,t4 complete
C1 t1,t2,t3,t4 complete
 
Bd class 2 complete
Bd class 2 completeBd class 2 complete
Bd class 2 complete
 
Big data gaurav
Big data gauravBig data gaurav
Big data gaurav
 
Class ppt intro to-sas
Class ppt   intro to-sasClass ppt   intro to-sas
Class ppt intro to-sas
 
Analytics overview class-ppt
Analytics overview  class-pptAnalytics overview  class-ppt
Analytics overview class-ppt
 
Class ppt overview of analytics
Class ppt overview of analyticsClass ppt overview of analytics
Class ppt overview of analytics
 
About us ver2 ppt
About us ver2 pptAbout us ver2 ppt
About us ver2 ppt
 
About us ppt
About us pptAbout us ppt
About us ppt
 
Testimonials ppt
Testimonials pptTestimonials ppt
Testimonials ppt
 
Newspaper clips
Newspaper clipsNewspaper clips
Newspaper clips
 

Dernier

Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxPoojaSen20
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinojohnmickonozaleda
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 

Dernier (20)

YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipino
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 

Class ppt intro to r

  • 2. AGENDA • History and evolution of R • Principle and software paradigm • Description of R interface • Advantages of R • Drawbacks of R • So why use R? • References for learning R
  • 3. Origin in the Bell Labs in the 1970’s HISTORY AND EVOLUTION OF R
  • 4. R has developed from the S language HISTORY AND EVOLUTION OF R SVersion 1 SVersion 4 SVersion 3 SVersion 2 Developed 30 years ago for research applied to the high-tech industry
  • 5. 1990’s: R developed concurrently with S 1993: R made public The regular development of R HISTORY AND EVOLUTION OF R Acceleration of R development  R-Help and R-Devl mailing-lists  Creation of the R Core Group Source: R Journal Vol 1/2
  • 6. Growing number of packages HISTORY AND EVOLUTION OF R 2001: ~100 packages 2009: Over 2000 packages Source: R Journal Vol 1/2 2000: R version 1.0.1 Today: R version 2.14
  • 7. Explosion of R popularity in the last decade HISTORY AND EVOLUTION OF R  Object-oriented, growing user base, scripting features  Free and open-source  Irrational reasons: R seen as « cool »
  • 8. Comparison of Mailing Lists HISTORY AND EVOLUTION OF R Evolution of the traffic on software main mailing-lists. Source: R.A. Muenchen, r4stats.com
  • 9. Popularity amongst programming languages HISTORY AND EVOLUTION OF R KD Nuggets 2012 survey
  • 10. Number of Blogs HISTORY AND EVOLUTION OF R Software Number of Blogs R 365 SAS 40 Stata 8 Others 0-3 Data as on Mar 2012
  • 11. AGENDA • History and evolution of R • Principle and software paradigm • Description of R interface • Advantages of R • Drawbacks of R • So why using R? • References for learning R
  • 12.  R is rather a programming language  Limited user-friendly interfaces for data analysis  Is object oriented and almost non declarative  Similar to programming languages like Fortran, C, Java, Python R is not really a (statistical) software PRINCIPLE AND SOFTWARE PARADIGM
  • 13. Recent endeavours to enhance R user-friendliness R has limited Graphical User Interface (GUI) options PRINCIPLE AND SOFTWARE PARADIGM Several GUIs in development R-commander RKWard Rattle
  • 14. R Commander (RCmdr) PRINCIPLE AND SOFTWARE PARADIGM
  • 17. Inherent limitations of pervasive Excel-like spreadsheets PRINCIPLE AND SOFTWARE PARADIGM VS.
  • 18. Sophisticated but costly SAS PRINCIPLE AND SOFTWARE PARADIGM VS. Screenshot of SAS enteprise Miner 7.1. Source: sas.com
  • 19. AGENDA • History and evolution of R • Principle and software paradigm • Description of R interface • Advantages of R • Drawbacks of R • So why using R? • References for learning R
  • 20. R console DESCRIPTION OF R INTERFACE R desktop shortcut RGui: R basic interface R command line (space to write instructions)
  • 21. Using the command line in R console DESCRIPTION OF R INTERFACE First false sentence followed by R’s error message Second correct sentence Declaration and printing of the sentence as a R object Simple math computations Basic information about the R object containing the sentence
  • 22. RGui menu: File tab DESCRIPTION OF R INTERFACE File tab: Usual basic and general operations
  • 23. RGui menu: Edit tab DESCRIPTION OF R INTERFACE Edit tab: basic and general editing Results of the data editor Data editor: entering the object’s name
  • 24. RGui menu: View tab DESCRIPTION OF R INTERFACE View tab: viewing Toolbar and/or Status bar
  • 25. RGui menu: Misc tab DESCRIPTION OF R INTERFACE Misc tab: diverse operations
  • 26. RGui menu: Packages tabs DESCRIPTION OF R INTERFACE Packages tab: adding functions to R foundation
  • 27. RGui menu: Windows tab DESCRIPTION OF R INTERFACE Windows tab: usual options to arrange the tiles
  • 28. RGui menu: Help tab DESCRIPTION OF R INTERFACE Help tab: very important links to help
  • 29. AGENDA • History and evolution of R • Principle and software paradigm • Description of R interface • Advantages of R • Drawbacks of R • So why using R? • References for learning R
  • 30.  Open source code  You can access the code of the software  In-depth understanding of what R does  Modify the code R “philosophy” ADVANTAGES OF R Screenshot of the CRAN webpage of the « mgcv » package. Source: CRAN Adress of the « mgcv » package Link with Package sources (.tar.gz file) Example “mgcv” package webpage
  • 31. Example of source code of the “mgcv” package R access to source code ADVANTAGES OF R Screenshot of unzipping the « mgcv » package and browsing through the package’s files. Unzipping mgcv_1.7-13.tar.gz file (with 7zip) List of directories in the « mgcv » package List of functions (i.e open code) in the « src » (i.e code sources) directory the « mgcv » package1 2 3
  • 32. R is free ADVANTAGES OF R Software Academics Demo Commercial (basic) Commercial (full) R Free Free Free Free SAS Free to $100s Not available $1 000s $10 000s Statistica $100s 30 days limit ~$1 000 $10 000 Excel (Microsoft) Free to $10s Limited ~$100 $100s SPSS (IBM) $100s 14 days limit ~$2 000 $1 000s
  • 33. Interface with other languages and scripting capabilities ADVANTAGES OF R
      Interfaces with virtually any other programming language
        - Fortran, C, C++, Python…
        - Tailor or rewrite your old code in R
      R as a scripting language
        - R scripts can launch or be launched by other languages
      Screenshot of the file “mgcv.c” of the “mgcv” package open in WordPad: the file is written in typical C
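The scripting point can be illustrated with base R's `system2()`, which starts an external program and captures its output. In this minimal sketch the child process is simply another R interpreter (`Rscript`), an assumption made so that the example is self-contained; any executable on the machine could be launched the same way.

```r
# Path to the Rscript executable that ships with the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Launch a child process from R and capture what it writes to standard output.
out <- system2(rscript,
               args = c("--vanilla", "-e", shQuote("cat(1 + 1)")),
               stdout = TRUE)
print(out)

# The reverse direction works the same way: a shell script or any language
# able to start a process can run an R script with
#   Rscript analysis.R
# where analysis.R is a hypothetical file name.
```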
  • 37. R’s role in academia ADVANTAGES OF R
      - R: a tool used by the finest researchers
      - Top-notch analytics capabilities
      Screenshot of a user’s Facebook map. Source: Paul Butler/Facebook, DG Rossiter, spatialanalysis.co.uk
  • 38. To summarize ADVANTAGES OF R
      Free open source philosophy
        - R websites with many examples
        - Free books
        - Free online open courses
        - Twitter accounts
      Online help and discussion
        - Mailing lists
        - Very active and diverse forums
        - Communities of developers and helpers
  • 39. AGENDA • History and evolution of R • Principle and software paradigm • Description of R interface • Advantages of R • Drawbacks of R • So why use R? • References for learning R
  • 40. Average memory performance DRAWBACKS OF R
      Poor management of large datasets
        - Avoid nested loops
        - Prefer R’s advanced language features for data structures
      Complicated structure of packages in R
        - Dozens of packages
        - To be loaded into memory every time
      R packages to better manage memory
        - RHadoop (inspired by Google)
        - ff
        - bigmemory
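The advice to avoid nested loops can be illustrated with a small multiplication table. This is a minimal sketch with an arbitrary size `n`; it only shows that a single vectorized call can replace an explicit double loop, and makes no claim about timings on any particular machine.

```r
n <- 200
x <- seq_len(n)

# Nested loops: one assignment per cell, n * n iterations in total.
slow <- matrix(0, nrow = n, ncol = n)
for (i in x) {
  for (j in x) {
    slow[i, j] <- i * j
  }
}

# Vectorized alternative: outer() builds the whole table in one call.
fast <- outer(x, x)

# Both approaches produce the same matrix.
same <- isTRUE(all.equal(slow, fast))
print(same)
```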
  • 41. Average computing performance DRAWBACKS OF R
      No default parallel execution
        - R packages exist to use several cores
        - Top skills needed for high-performance computing
      A high-level programming language
        - Abstract and modern (like Python…)
        - More productive coding
        - But further from “machine language”…
        - … which can make code up to 100 times slower than C
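As one example of a package that uses several cores, the `parallel` package has shipped with R since version 2.14. The sketch below assumes two workers and a toy task (squaring numbers); `mclapply()` relies on forking, which is unavailable on Windows, so the sketch falls back to a single worker there.

```r
library(parallel)

# Number of cores reported by the operating system (may be NA on some systems).
n_cores <- detectCores()
print(n_cores)

# Forking is not available on Windows, where mc.cores must stay at 1.
workers <- if (.Platform$OS.type == "windows") 1L else 2L

# Apply the function to each element, spreading the calls over the workers.
squares <- mclapply(1:8, function(i) i^2, mc.cores = workers)
result <- unlist(squares)
print(result)
```

Results come back in the same order as the input, so the call can usually replace `lapply()` without other changes.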
  • 42. Difficult data visualization and management DRAWBACKS OF R Difficult to inspect data sets. Screenshot of the R data editor and the “Viewtable” tab in SAS 9.3
  • 43. Difficult architecture management DRAWBACKS OF R
      Problems for large organizations
        - R is made of several thousand independent packages
        - No deployment plan for complex organizations
        - No installation support
      Lack of code accountability
        - Thousands of independent individual R developers
        - Nobody responsible for the quality of the code
      Potentially high hidden costs with R
        - Total cost may favour commercial solutions for complex computations in large corporations
  • 44. Relatively difficult to learn DRAWBACKS OF R
      Steep learning curve
        - R code is far from what undergraduate computer science courses teach
        - Very complex data structures (useful once mastered)
        - Is R’s syntax illogical?
      Still, not more difficult to learn than SAS
        - Both SAS and R are more abstract than basic programming languages (Fortran, C…)
        - Difficult to learn = more rewarding professionally!
  • 45. AGENDA • History and evolution of R • Principle and software paradigm • Description of R interface • Advantages of R • Drawbacks of R • So why use R? • References for learning R
  • 46. More positive than negative points SO WHY LEARN R?
      No language is perfect!
        - Contradictory objectives to meet
        - Strengths and weaknesses of each language
      Different needs imply different tools
        - Large corporations + defined procedures → SAS-like
        - Fewer financial resources + quick proof of concept → R
      Effect of legacy and the culture of the organization
        - Use existing solutions (system architecture, BA tools…)
        - Habits in business analytics
  • 47. Very appealing solution SO WHY LEARN R? Figure: popularity of business analytics software (green = very popular, red = unpopular). Source: Rexer Analytics. Columns: Overall, Corporate, Consultants, Academics, NGO/Gov't. Rows: R, SAS, IBM SPSS, STATISTICA, own code.
  • 48. AGENDA • History and evolution of R • Principle and software paradigm • Description of R interface • Advantages of R • Drawbacks of R • So why use R? • References for learning R
  • 49. Books REFERENCES FOR LEARNING R
      Many books available: choose the one that fits you!
        - Style, pedagogy, theory vs. practice
        - Browse several books at a local library or store
      Springer’s Use R! series (http://www.springer.com/series/6991)
        - Recent, concise, good quality, affordable, diverse
        - Pure rookies: “A Beginner’s Guide to R”, “R by Example”
        - One step forward: “Business Analytics for Managers”
        - Intensive Excel users: “R through Excel”
      O’Reilly R series (for programmers)
        - “R Cookbook”, “R in a Nutshell”
  • 50. Websites REFERENCES FOR LEARNING R
      Official R websites
        - The R Project for Statistical Computing (www.r-project.org)
        - Mailing lists (“R-help”, Special Interest Groups) and The R Journal
        - Official (austere) manuals (“An Introduction to R”)
      Other websites
        - UCLA online R resources (http://www.ats.ucla.edu/stat/r/)
        - R blogs aggregator (www.r-bloggers.com)
        - Social networks: LinkedIn groups (The R Project for Statistical Computing), Twitter accounts (@RevolutionR, @inside_R), job boards (Analytical Bridge…)
  • 51. Conferences REFERENCES FOR LEARNING R
      Growing number of conferences about R
      Official international R conference, useR!
        - Held annually over a few days, at a new venue each time (Google it!)
        - Lots of materials about many topics
      Other conferences or venues
        - Conferences about business analytics (data mining, specialized topics…) with sessions involving R
        - Find (or even start!) an R user group close to your location (R Wiki geographical list, map of groups on “meetup.com”)
        - Events and news from the R-bloggers blog