Bob Muenchen, Author R for SAS and SPSS Users,
Co-Author R for Stata Users
muenchen.bob@gmail.com, http://r4stats.com

• What is R?
• R’s Advantages
• R’s Disadvantages
• Installing and Maintaining R
• Ways of Running R
• An Example Program
• Where to Learn More




Copyright © 2010, 2011, Robert A Muenchen. All rights reserved.




• “The most powerful statistical computing language on the planet.” -Norman Nie, Developer of SPSS
• Language + package + environment for graphics and data analysis
• Free and open source
• Created by Ross Ihaka & Robert Gentleman in 1996 & extended by many more
• An implementation of the S language by John Chambers and others
• R has 4,950 add-ons, or nearly 100,000 procs


[Figures omitted. Source: http://r4stats.com/popularity]




1. Data input & management (data step)
2. Analytics & graphics procedures (proc step)
3. Macro language
4. Matrix language
5. Output management systems (ODS/OMS)

R integrates these all seamlessly.

* SAS Approach;
DATA A; SET A;
  logX = log(X);
PROC REG;
  MODEL Y = logX;

# R Approach
lm( Y ~ log(X) )
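As a minimal sketch of the R approach, the following fits the same kind of model on simulated data. The data frame `d`, the seed, and the true coefficients (2 and 3) are assumptions made up for illustration; they do not come from the slides.

```r
# Simulated data (an assumption for illustration): Y depends linearly on log(X).
set.seed(42)
d <- data.frame(X = runif(100, min = 1, max = 50))
d$Y <- 2 + 3 * log(d$X) + rnorm(100, sd = 0.5)

# The transformation is written inside the model formula,
# so no separate data step is needed to create logX.
fit <- lm(Y ~ log(X), data = d)

# Intercept and slope estimates; with this setup they should land near 2 and 3.
print(coef(fit))
```

Because the transformation lives in the formula, changing it (for example to `sqrt(X)`) means editing one line rather than a data step plus a procedure call.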
• Vast selection of analytics & graphics
• New methods are available sooner
• Many packages can run R (SAS, SPSS, Excel…)
• Its object orientation “does the right thing”
• Its language is powerful & fully integrated
• Procedures you write are on an equal footing
• It is the universal language of data analysis
• It runs on any computer
• Being open source, you can study and modify it
• It is free
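One way to read the object-orientation point in the list above: generic functions such as `summary()` choose their behaviour from the class of the object they receive. The sketch below uses made-up values; the variable names are hypothetical.

```r
# summary() is a generic function: it picks a method based on the class of its argument.
scores <- c(72, 85, 90, 66, 78)                # numeric vector (made-up values)
group  <- factor(c("a", "b", "a", "b", "a"))   # factor (made-up values)
fit    <- lm(scores ~ group)                   # fitted linear model

print(summary(scores))   # quartiles and mean for a numeric vector
print(summary(group))    # counts per level for a factor
print(summary(fit))      # coefficient table and fit statistics for a model

# The model summary is itself an object with its own class.
print(class(summary(fit)))
```

The same call works on all three objects, which is why the example program later in the deck can use `summary()` on both a data frame and a model.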




* Using SAS;
PROC TTEST DATA=classroom;
CLASS gender;
VAR score;

# In R
t.test(score ~ gender, data=classroom)

# Paired test: the x, y form of t.test() does not use data=, so with() supplies the columns
with(classroom, t.test(posttest, pretest, paired=TRUE))

• Language is somewhat harder to learn
• Help files are sparse & complex
• Must find R and its add-ons yourself
• Graphical user interfaces not as polished
• Most R functions hold data in main memory
   - Rule-of-thumb: 10 million values per gigabyte
   - SAS/SPSS: billions of records
   - Several efforts underway to break R’s memory limit, including Revolution Analytics’ distribution
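To try the two t-tests without the original file, here is a small runnable sketch. The `classroom` data frame and all of its values are invented for illustration, and `score` is simply set equal to `posttest` as an assumption.

```r
# Hypothetical classroom data, invented for illustration.
classroom <- data.frame(
  gender   = factor(c("f", "f", "f", "f", "m", "m", "m", "m")),
  pretest  = c(70, 65, 80, 75, 60, 72, 68, 77),
  posttest = c(78, 70, 85, 79, 66, 75, 74, 80)
)
classroom$score <- classroom$posttest   # assumption: score is the posttest result

# Independent-samples t-test: the formula interface accepts data=.
independent <- t.test(score ~ gender, data = classroom)

# Paired t-test: the x, y interface does not use data=,
# so with() looks the columns up inside the data frame.
paired <- with(classroom, t.test(posttest, pretest, paired = TRUE))

print(independent)
print(paired)
```

Both calls return an object of class `htest`, so components such as `paired$p.value` or `paired$conf.int` can be extracted for further use.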
• Base R plus Recommended Packages like:
   - Base SAS, SAS/STAT, SAS/GRAPH, SAS/IML Studio
   - SPSS Stat. Base, SPSS Stat. Advanced, Regression
• Tested via extensive validation programs
• But add-on packages written by…
   - Professor who invented the method?
   - A student interpreting the method?

• Email support is free, quick, 24-hours:
   - www.r-project.org/mail.html
   - Stackoverflow.com
   - Quora.com
   - Crossvalidated stats.stackexchange.com/questions/tagged/r
• Phone support available commercially




1. Go to cran.r-project.org,
   the Comprehensive R Archive Network
2. Download binaries for Base & run
3. Add-ons:
   install.packages("myPackage")
4. To update: update.packages()

• Comprehensive R Archive Network
• Crantastic.com
• Inside-R.org
• R4Stats.com
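The add-on installation step can be wrapped in a small check so that only missing packages are requested. This is a sketch, not an official workflow: `missing_packages` is a hypothetical helper name, and `"myPackage"` is the slide's placeholder rather than a real package.

```r
# Return the names in `pkgs` that are not installed in the current library paths.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

wanted <- c("stats", "myPackage")   # "myPackage" is the slide's placeholder name
to_install <- missing_packages(wanted)
print(to_install)

# Uncomment to download and install from CRAN (requires network access):
# install.packages(to_install)

# Uncomment to update all installed packages without prompting:
# update.packages(ask = FALSE)
```

The network calls are left commented out so the check itself can be run offline.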
• Run code interactively
• Submit code from Excel, SAS, SPSS,…
• Point-n-click using Graphical User Interfaces (GUIs)
• Batch mode
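For the batch-mode option in the list above, a script can be run non-interactively with the `Rscript` front end (or `R CMD BATCH`). The sketch below is one possible layout; the file name `batch_summary.R` and the function `summarise_csv` are hypothetical.

```r
# batch_summary.R (hypothetical file name)
# Run from a shell with:  Rscript batch_summary.R mydata.csv

# Read a CSV file and return summary statistics for every column.
summarise_csv <- function(path) {
  summary(read.csv(path))
}

# In batch mode, take the file name from the command line if one was given.
args <- commandArgs(trailingOnly = TRUE)
if (!interactive() && length(args) >= 1 && file.exists(args[1])) {
  print(summarise_csv(args[1]))
}
```

Keeping the work in a function means the same file can also be loaded with `source()` in an interactive session and called by hand.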








* From SAS/IML Studio;
run ExportDataSetToR("mydata");
submit/r;
   mydata$workshop <-
     factor(mydata$workshop)
   summary(mydata)
endsubmit;

* From SPSS.
GET FILE='mydata.sav'.
BEGIN PROGRAM R.
mydata <- spssdata.GetDataFromSPSS(
  variables = c("workshop gender q1 to q4"),
  missingValueToNA = TRUE,
  row.label = "id" )
summary(mydata)
END PROGRAM.




• Revolution Analytics: a company focused on R development & support
• Run by SPSS founder Norman Nie
• Their enhanced distribution of R: Revolution R Enterprise
• Free for colleges and universities, including for outside consulting
mydata <- read.csv("mydata.csv")
print(mydata)
mydata$workshop <- factor(mydata$workshop)
summary(mydata)
plot( mydata$q1, mydata$q4 )
myModel <- lm( q4~q1+q2+q3, data=mydata )
summary( myModel )
anova( myModel )
plot( myModel )

> mydata <- read.csv("mydata.csv")
> print(mydata)
  workshop gender q1 q2 q3 q4
1        1      f  1  1  5  1
2        2      f  2  1  4  1
3        1      f  2  2  4  3
4        2   <NA>  3  1 NA  3
5        1      m  4  5  2  4
6        2      m  5  4  5  5
7        1      m  5  3  4  4
8        2      m  4  5  5  5
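The example program reads `mydata.csv`, which is not included here. As a sketch for readers who want to run it anyway, the eight rows printed above can be typed in directly; everything else follows the program as shown.

```r
# Rebuild the eight-row practice data set from the printed output,
# so the program can run without mydata.csv.
mydata <- data.frame(
  workshop = c(1, 2, 1, 2, 1, 2, 1, 2),
  gender   = factor(c("f", "f", "f", NA, "m", "m", "m", "m")),
  q1 = c(1, 2, 2, 3, 4, 5, 5, 4),
  q2 = c(1, 1, 2, 1, 5, 4, 3, 5),
  q3 = c(5, 4, 4, NA, 2, 5, 4, 5),
  q4 = c(1, 1, 3, 3, 4, 5, 4, 5)
)

mydata$workshop <- factor(mydata$workshop)   # treat workshop as categorical
print(summary(mydata))

# With R's default settings, lm() drops the row with a missing q3,
# leaving 7 observations and 3 residual degrees of freedom.
myModel <- lm(q4 ~ q1 + q2 + q3, data = mydata)
print(summary(myModel))
```

The residual degrees of freedom explain why observation 4 is absent from the residuals in the model output.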




> mydata$workshop <-factor(mydata$workshop)
> summary(mydata)
 workshop       gender
 1:4        f      :3
 2:4        m      :4
            NA's:1
q1                  q2             q3              q4
Min.   :1.00        Min.   :1.00   Min.   :2.000   Min.   :1.00
1st Qu.:2.00        1st Qu.:1.00   1st Qu.:4.000   1st Qu.:2.50
Median :3.50        Median :2.50   Median :4.000   Median :3.50
Mean   :3.25        Mean   :2.75   Mean   :4.143   Mean   :3.25
3rd Qu.:4.25        3rd Qu.:4.25   3rd Qu.:5.000   3rd Qu.:4.25
Max.   :5.00        Max.   :5.00   Max.   :5.000   Max.   :5.00
                                   NA's   :1.000
> myModel <- lm(q4 ~ q1+q2+q3, data=mydata)
> summary(myModel)

Call:
lm(formula = q4 ~ q1 + q2 + q3, data = mydata)
Residuals:
      1       2       3       5       6        7      8
-0.3113 -0.4261 0.9428 -0.1797 0.0765 0.0225 -0.1246
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.3243      1.2877 -1.028     0.379
q1            0.4297     0.2623   1.638    0.200
q2            0.6310     0.2503   2.521    0.086
q3            0.3150     0.2557   1.232    0.306
Multiple R-squared: 0.9299,     Adjusted R-squared: 0.8598
F-statistic: 13.27 on 3 and 3 DF, p-value: 0.03084






• R for SAS and SPSS Users, Muenchen
• R for Stata Users, Muenchen & Hilbe
• R Through Excel: A Spreadsheet Interface for Statistics, Data Analysis, and Graphics, Heiberger & Neuwirth
• Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery, Williams




• R is powerful, extensible, free
• Download it from CRAN
• Academics download Revolution R Enterprise for free at www.revolutionanalytics.com
• You run it many ways & from many packages
• Several graphical user interfaces are available
• R's programming language is the way to access its full power

muenchen@utk.edu
Slides: r4stats.com/misc/webinar
Presentation: bit.ly/R-sas-spss

Contenu connexe

Tendances

DeployR: Revolution R Enterprise with Business Intelligence Applications
DeployR: Revolution R Enterprise with Business Intelligence ApplicationsDeployR: Revolution R Enterprise with Business Intelligence Applications
DeployR: Revolution R Enterprise with Business Intelligence ApplicationsRevolution Analytics
 
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
Accelerating R analytics with Spark and  Microsoft R Server  for HadoopAccelerating R analytics with Spark and  Microsoft R Server  for Hadoop
Accelerating R analytics with Spark and Microsoft R Server for HadoopWilly Marroquin (WillyDevNET)
 
Introduction to Microsoft R Services
Introduction to Microsoft R ServicesIntroduction to Microsoft R Services
Introduction to Microsoft R ServicesGregg Barrett
 
Basics of Digital Design and Verilog
Basics of Digital Design and VerilogBasics of Digital Design and Verilog
Basics of Digital Design and VerilogGanesan Narayanasamy
 
Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013Revolution Analytics
 
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Revolution Analytics
 
The network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalThe network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalRevolution Analytics
 
Microsoft R Server for Data Sciencea
Microsoft R Server for Data ScienceaMicrosoft R Server for Data Sciencea
Microsoft R Server for Data ScienceaData Science Thailand
 
Predicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondPredicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondRevolution Analytics
 
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integration
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integrationIndexing 3-dimensional trajectories: Apache Spark and Cassandra integration
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integrationCesare Cugnasco
 

Tendances (20)

DeployR: Revolution R Enterprise with Business Intelligence Applications
DeployR: Revolution R Enterprise with Business Intelligence ApplicationsDeployR: Revolution R Enterprise with Business Intelligence Applications
DeployR: Revolution R Enterprise with Business Intelligence Applications
 
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
Accelerating R analytics with Spark and  Microsoft R Server  for HadoopAccelerating R analytics with Spark and  Microsoft R Server  for Hadoop
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
 
Introduction to Microsoft R Services
Introduction to Microsoft R ServicesIntroduction to Microsoft R Services
Introduction to Microsoft R Services
 
Basics of Digital Design and Verilog
Basics of Digital Design and VerilogBasics of Digital Design and Verilog
Basics of Digital Design and Verilog
 
Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013
 
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
 
Big data business case
Big data   business caseBig data   business case
Big data business case
 
The network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalThe network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 final
 
Big data analytics using R
Big data analytics using RBig data analytics using R
Big data analytics using R
 
Microsoft R Server for Data Sciencea
Microsoft R Server for Data ScienceaMicrosoft R Server for Data Sciencea
Microsoft R Server for Data Sciencea
 
R at Microsoft (useR! 2016)
R at Microsoft (useR! 2016)R at Microsoft (useR! 2016)
R at Microsoft (useR! 2016)
 
R for data analytics
R for data analyticsR for data analytics
R for data analytics
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
Data Science At Zillow
Data Science At ZillowData Science At Zillow
Data Science At Zillow
 
Predicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondPredicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per Second
 
R and-hadoop
R and-hadoopR and-hadoop
R and-hadoop
 
R at Microsoft
R at MicrosoftR at Microsoft
R at Microsoft
 
Meetup Oracle Database BCN: 2.1 Data Management Trends
Meetup Oracle Database BCN: 2.1 Data Management TrendsMeetup Oracle Database BCN: 2.1 Data Management Trends
Meetup Oracle Database BCN: 2.1 Data Management Trends
 
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integration
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integrationIndexing 3-dimensional trajectories: Apache Spark and Cassandra integration
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integration
 
Data Analytics Domain
Data Analytics DomainData Analytics Domain
Data Analytics Domain
 

En vedette

Retail Business Software
Retail Business SoftwareRetail Business Software
Retail Business Softwarejsmith786
 
Supply Chain Analytic Solution
Supply Chain Analytic SolutionSupply Chain Analytic Solution
Supply Chain Analytic Solutionjsmith786
 
Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)gdusbabek
 
Topic 4 intro spss_stata
Topic 4 intro spss_stataTopic 4 intro spss_stata
Topic 4 intro spss_stataSizwan Ahammed
 
Introduction to SAS
Introduction to SASIntroduction to SAS
Introduction to SASizahn
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache HadoopHortonworks
 
SAS - Hortonworks: Creating the Omnichannel Experience in Retail webinar marc...
SAS - Hortonworks: Creating the Omnichannel Experience in Retail webinar marc...SAS - Hortonworks: Creating the Omnichannel Experience in Retail webinar marc...
SAS - Hortonworks: Creating the Omnichannel Experience in Retail webinar marc...Hortonworks
 
Spss lecture notes
Spss lecture notesSpss lecture notes
Spss lecture notesDavid mbwiga
 

En vedette (15)

NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Retail Business Software
Retail Business SoftwareRetail Business Software
Retail Business Software
 
Supply Chain Analytic Solution
Supply Chain Analytic SolutionSupply Chain Analytic Solution
Supply Chain Analytic Solution
 
R-Excel Integration
R-Excel IntegrationR-Excel Integration
R-Excel Integration
 
Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)
 
INTRODUCTION TO SAS
INTRODUCTION TO SASINTRODUCTION TO SAS
INTRODUCTION TO SAS
 
Topic 4 intro spss_stata
Topic 4 intro spss_stataTopic 4 intro spss_stata
Topic 4 intro spss_stata
 
Introduction to SAS
Introduction to SASIntroduction to SAS
Introduction to SAS
 
Sas demo
Sas demoSas demo
Sas demo
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
 
Introduction to EpiData
Introduction to EpiDataIntroduction to EpiData
Introduction to EpiData
 
SAS - Hortonworks: Creating the Omnichannel Experience in Retail webinar marc...
SAS - Hortonworks: Creating the Omnichannel Experience in Retail webinar marc...SAS - Hortonworks: Creating the Omnichannel Experience in Retail webinar marc...
SAS - Hortonworks: Creating the Omnichannel Experience in Retail webinar marc...
 
Spss lecture notes
Spss lecture notesSpss lecture notes
Spss lecture notes
 
Data analysis using spss
Data analysis using spssData analysis using spss
Data analysis using spss
 
Introduction to spss
Introduction to spssIntroduction to spss
Introduction to spss
 

Similaire à Intro to R for SAS and SPSS User Webinar

R vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical LanguageR vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical LanguageStat Analytica
 
R programming Language , Rahul Singh
R programming Language , Rahul SinghR programming Language , Rahul Singh
R programming Language , Rahul SinghRavi Basil
 
R programming language
R programming languageR programming language
R programming languageKeerti Verma
 
The Statistical Significance of &quot;R&quot;
The Statistical Significance of &quot;R&quot;The Statistical Significance of &quot;R&quot;
The Statistical Significance of &quot;R&quot;ppvora
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programminghemasri56
 
Introduction to R and R Studio
Introduction to R and R StudioIntroduction to R and R Studio
Introduction to R and R StudioRupak Roy
 
Big Data - Analytics with R
Big Data - Analytics with RBig Data - Analytics with R
Big Data - Analytics with RTechsparks
 
R as supporting tool for analytics and simulation
R as supporting tool for analytics and simulationR as supporting tool for analytics and simulation
R as supporting tool for analytics and simulationAlvaro Gil
 
Study of R Programming
Study of R ProgrammingStudy of R Programming
Study of R ProgrammingIRJET Journal
 
2 it unit-1 start learning r
2 it   unit-1 start learning r2 it   unit-1 start learning r
2 it unit-1 start learning rNetaji Gandi
 
Data Science - Part II - Working with R & R studio
Data Science - Part II -  Working with R & R studioData Science - Part II -  Working with R & R studio
Data Science - Part II - Working with R & R studioDerek Kane
 
1_Introduction.pptx
1_Introduction.pptx1_Introduction.pptx
1_Introduction.pptxranapoonam1
 

Similaire à Intro to R for SAS and SPSS User Webinar (20)

R vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical LanguageR vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical Language
 
Reason To learn & use r
Reason To learn & use rReason To learn & use r
Reason To learn & use r
 
R programming Language , Rahul Singh
R programming Language , Rahul SinghR programming Language , Rahul Singh
R programming Language , Rahul Singh
 
R_L1-Aug-2022.pptx
R_L1-Aug-2022.pptxR_L1-Aug-2022.pptx
R_L1-Aug-2022.pptx
 
R programming language
R programming languageR programming language
R programming language
 
The Statistical Significance of &quot;R&quot;
The Statistical Significance of &quot;R&quot;The Statistical Significance of &quot;R&quot;
The Statistical Significance of &quot;R&quot;
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programming
 
Introduction to R and R Studio
Introduction to R and R StudioIntroduction to R and R Studio
Introduction to R and R Studio
 
R programming
R programmingR programming
R programming
 
Big Data - Analytics with R
Big Data - Analytics with RBig Data - Analytics with R
Big Data - Analytics with R
 
R as supporting tool for analytics and simulation
R as supporting tool for analytics and simulationR as supporting tool for analytics and simulation
R as supporting tool for analytics and simulation
 
Study of R Programming
Study of R ProgrammingStudy of R Programming
Study of R Programming
 
R programming
R programmingR programming
R programming
 
Introtor
IntrotorIntrotor
Introtor
 
R Course Online
R Course OnlineR Course Online
R Course Online
 
2 it unit-1 start learning r
2 it   unit-1 start learning r2 it   unit-1 start learning r
2 it unit-1 start learning r
 
UNIT-1 Start Learning R.pdf
UNIT-1 Start Learning R.pdfUNIT-1 Start Learning R.pdf
UNIT-1 Start Learning R.pdf
 
Data Science - Part II - Working with R & R studio
Data Science - Part II -  Working with R & R studioData Science - Part II -  Working with R & R studio
Data Science - Part II - Working with R & R studio
 
1_Introduction.pptx
1_Introduction.pptx1_Introduction.pptx
1_Introduction.pptx
 
R program
R programR program
R program
 

Plus de Revolution Analytics

Speeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudSpeeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudRevolution Analytics
 
Migrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureMigrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureRevolution Analytics
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudRevolution Analytics
 
The Value of Open Source Communities
The Value of Open Source CommunitiesThe Value of Open Source Communities
The Value of Open Source CommunitiesRevolution Analytics
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with RRevolution Analytics
 
The Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data ScienceThe Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data ScienceRevolution Analytics
 
The Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorThe Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorRevolution Analytics
 
Simple Reproducibility with the checkpoint package
Simple Reproducibilitywith the checkpoint packageSimple Reproducibilitywith the checkpoint package
Simple Reproducibility with the checkpoint packageRevolution Analytics
 
Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15
Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15
Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15Revolution Analytics
 
Warranty Predictive Analytics solution
Warranty Predictive Analytics solutionWarranty Predictive Analytics solution
Warranty Predictive Analytics solutionRevolution Analytics
 
Reproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R ConferenceReproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R ConferenceRevolution Analytics
 
Reproducibility with Revolution R Open and the Checkpoint Package
Reproducibility with Revolution R Open and the Checkpoint PackageReproducibility with Revolution R Open and the Checkpoint Package
Reproducibility with Revolution R Open and the Checkpoint PackageRevolution Analytics
 
Reproducibility with Revolution R Open
Reproducibility with Revolution R OpenReproducibility with Revolution R Open
Reproducibility with Revolution R OpenRevolution Analytics
 
In-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and RevolutionIn-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and RevolutionRevolution Analytics
 

Plus de Revolution Analytics (20)

Speeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudSpeeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the Cloud
 
Migrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureMigrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to Azure
 
R in Minecraft
R in Minecraft R in Minecraft
R in Minecraft
 
The case for R for AI developers
The case for R for AI developersThe case for R for AI developers
The case for R for AI developers
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the Cloud
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
R Then and Now
R Then and NowR Then and Now
R Then and Now
 
Reproducible Data Science with R
Reproducible Data Science with RReproducible Data Science with R
Reproducible Data Science with R
 
The Value of Open Source Communities
The Value of Open Source CommunitiesThe Value of Open Source Communities
The Value of Open Source Communities
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with R
 
Intro to R for SAS and SPSS User Webinar

  • 1. Bob Muenchen, Author, R for SAS and SPSS Users; Co-Author, R for Stata Users
       muenchen.bob@gmail.com, http://r4stats.com
       Copyright © 2010, 2011, Robert A Muenchen. All rights reserved.

       Outline:
         - What is R?
         - R’s Advantages
         - R’s Disadvantages
         - Installing and Maintaining R
         - Ways of Running R
         - An Example Program
         - Where to Learn More

       What is R?
         - “The most powerful statistical computing language on the planet.” (Norman Nie, Developer of SPSS)
         - Language + package + environment for graphics and data analysis
         - Free and open source
         - Created by Ross Ihaka & Robert Gentleman (1996) & extended by many more
         - An implementation of the S language by John Chambers and others
         - R has 4,950 add-ons, or nearly 100,000 procs
  • 2. (Figures; source: http://r4stats.com/popularity)

       1. Data input & management (data step)
       2. Analytics & graphics procedures (proc step)
       3. Macro language
       4. Matrix language
       5. Output management systems (ODS/OMS)
       R integrates these all seamlessly.

           * SAS Approach;
           DATA A; SET A;
             logX = log(X);
           PROC REG;
             MODEL Y = logX;

           # R Approach
           lm( Y ~ log(X) )
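The contrast between the two approaches can be tried on simulated data. The following is a minimal sketch, not taken from the slides: the data frame `d`, its columns `X` and `Y`, and the simulated coefficients are assumptions made only for illustration. It shows that R applies the log transformation inside the model formula, so the separate data step is optional.

```r
# Hypothetical data: d, X and Y are illustrative names, not from the slides.
set.seed(1)
d <- data.frame(X = runif(50, min = 1, max = 100))
d$Y <- 2 + 3 * log(d$X) + rnorm(50, sd = 0.5)

# The transformation is written directly in the model formula.
fit_inline <- lm(Y ~ log(X), data = d)

# Equivalent two-step version, mirroring the SAS data step followed by PROC REG.
d$logX <- log(d$X)
fit_twostep <- lm(Y ~ logX, data = d)

# Both fits give the same estimates; only the coefficient names differ.
print(coef(fit_inline))
print(coef(fit_twostep))
```

The one-line form keeps the original data frame untouched, which is convenient when several transformations are being compared.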
  • 3. R’s Advantages
         - Vast selection of analytics & graphics
         - New methods are available sooner
         - Many packages can run R (SAS, SPSS, Excel…)
         - Its object orientation “does the right thing”
         - Its language is powerful & fully integrated
         - Procedures you write are on an equal footing
         - It is the universal language of data analysis
         - It runs on any computer
         - Being open source, you can study and modify it
         - It is free

           * Using SAS;
           PROC TTEST DATA=classroom;
             CLASS gender;
             VAR score;

           # In R
           t.test(score ~ gender, data=classroom)
           # Paired test: the two-vector form of t.test() has no data= argument
           with(classroom, t.test(posttest, pretest, paired=TRUE))

       R’s Disadvantages
         - Language is somewhat harder to learn
         - Help files are sparse & complex
         - Must find R and its add-ons yourself
         - Graphical user interfaces not as polished
         - Most R functions hold data in main memory
             - Rule-of-thumb: 10 million values per gigabyte
             - SAS/SPSS: billions of records
         - Several efforts underway to break R’s memory limit, including Revolution Analytics’ distribution
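The memory rule of thumb can be checked with simple arithmetic. The sketch below is an illustration, not part of the slides; it assumes numeric data stored as 8-byte doubles, R's default storage for real numbers. By that count 10 million values occupy about 80 MB, so the rule of thumb leaves generous headroom, presumably for the working copies R makes during analysis (that interpretation is an assumption).

```r
# Raw storage for 10 million double-precision values.
n_values <- 1e7
bytes_per_double <- 8
raw_mb <- n_values * bytes_per_double / 1e6
cat("Raw storage for 1e7 doubles:", raw_mb, "MB\n")

# Measure an actual vector: one million doubles plus a small header.
x <- numeric(1e6)
size_bytes <- as.numeric(object.size(x))
cat("Measured size of 1e6 doubles:", size_bytes, "bytes\n")
```

Calling `object.size()` on a real data frame gives a quick estimate of whether a data set will fit comfortably in memory.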
  • 4. Base R plus Recommended Packages like:
         - Base SAS, SAS/STAT, SAS/GRAPH, SAS/IML Studio
         - SPSS Stat. Base, SPSS Stat. Advanced, Regression
       Tested via extensive validation programs
       But add-on packages written by…
         - Professor who invented the method?
         - A student interpreting the method?

       Email support is free, quick, 24-hours:
         - www.r-project.org/mail.html
         - Stackoverflow.com
         - Quora.com
         - Crossvalidated: stats.stackexchange.com/questions/tagged/r
       Phone support available commercially

       1. Go to cran.r-project.org, the Comprehensive R Archive Network
       2. Download binaries for Base & run
       3. Add-ons: install.packages("myPackage")
       4. To update: update.packages()

       - Comprehensive R Archive Network
       - Crantastic.com
       - Inside-R.org
       - R4Stats.com
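The install and update steps above can be wrapped so that a package is downloaded only when it is missing. This is a minimal sketch under stated assumptions: `ensure_package` is a hypothetical helper name, not something from the slides or from R itself, and the example deliberately checks `stats`, which ships with R, so nothing is downloaded when it runs.

```r
# Hypothetical helper (not from the slides): install a package only if absent.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from the configured CRAN mirror
  }
  # Report whether the package is now available.
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" is part of every R installation, so this triggers no download.
ok <- ensure_package("stats")
print(ok)

# To refresh all installed add-ons without being prompted for each one:
# update.packages(ask = FALSE)
```

Replacing `"stats"` with the name of an add-on package would install it on first use and skip the download on later runs.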
  • 6. Ways of Running R
         - Run code interactively
         - Submit code from Excel, SAS, SPSS,…
         - Point-n-click using Graphical User Interfaces (GUIs)
         - Batch mode
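The difference between interactive and batch use can be seen from inside a script. The sketch below is illustrative only: the file name `summarize.R` is a hypothetical choice. Saved under that name, the script can be run in batch mode with R's standard command-line front end, `Rscript summarize.R`, or pasted into an interactive session.

```r
# Save as summarize.R (hypothetical file name), then run: Rscript summarize.R
# Any extra command-line arguments are collected here.
args <- commandArgs(trailingOnly = TRUE)

# interactive() is TRUE at the R console and FALSE under Rscript or R CMD BATCH.
mode <- if (interactive()) "interactive" else "batch"

cat("Running in", mode, "mode with", length(args), "argument(s)\n")
```

Batch scripts typically print or write results to files explicitly, because no one is watching the console.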
  • 7. From SAS:

           run ExportDataSetToR("mydata");
           submit/r;
             mydata$workshop <- factor(mydata$workshop)
             summary(mydata)
           endsubmit;

       From SPSS:

           GET FILE='mydata.sav'.
           BEGIN PROGRAM R.
           mydata <- spssdata.GetDataFromSPSS(
             variables = c("workshop gender q1 to q4"),
             missingValueToNA = TRUE,
             row.label = "id" )
           summary(mydata)
           END PROGRAM.
  • 9. Revolution Analytics
         - A company focused on R development & support
         - Run by SPSS founder Norman Nie
         - Their enhanced distribution of R: Revolution R Enterprise
         - Free for colleges and universities, including for outside consulting
  • 12. An Example Program

           mydata <- read.csv("mydata.csv")
           print(mydata)
           mydata$workshop <- factor(mydata$workshop)
           summary(mydata)
           plot( mydata$q1, mydata$q4 )
           myModel <- lm( q4~q1+q2+q3, data=mydata )
           summary( myModel )
           anova( myModel )
           plot( myModel )

        Output:

           > mydata <- read.csv("mydata.csv")
           > print(mydata)
             workshop gender q1 q2 q3 q4
           1        1      f  1  1  5  1
           2        2      f  2  1  4  1
           3        1      f  2  2  4  3
           4        2   <NA>  3  1 NA  3
           5        1      m  4  5  2  4
           6        2      m  5  4  5  5
           7        1      m  5  3  4  4
           8        2      m  4  5  5  5

           > mydata$workshop <- factor(mydata$workshop)
           > summary(mydata)
            workshop  gender
            1:4       f   :3
            2:4       m   :4
                      NA's:1
                  q1             q2             q3              q4
            Min.   :1.00   Min.   :1.00   Min.   :2.000   Min.   :1.00
            1st Qu.:2.00   1st Qu.:1.00   1st Qu.:4.000   1st Qu.:2.50
            Median :3.50   Median :2.50   Median :4.000   Median :3.50
            Mean   :3.25   Mean   :2.75   Mean   :4.143   Mean   :3.25
            3rd Qu.:4.25   3rd Qu.:4.25   3rd Qu.:5.000   3rd Qu.:4.25
            Max.   :5.00   Max.   :5.00   Max.   :5.000   Max.   :5.00
                                          NA's   :1.000
  • 13.     > myModel <- lm(q4 ~ q1+q2+q3, data=mydata)
           > summary(myModel)

           Call:
           lm(formula = q4 ~ q1 + q2 + q3, data = mydata)

           Residuals:
                 1       2       3       5       6       7       8
           -0.3113 -0.4261  0.9428 -0.1797  0.0765  0.0225 -0.1246

           Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
           (Intercept)  -1.3243     1.2877  -1.028    0.379
           q1            0.4297     0.2623   1.638    0.200
           q2            0.6310     0.2503   2.521    0.086
           q3            0.3150     0.2557   1.232    0.306

           Multiple R-squared: 0.9299, Adjusted R-squared: 0.8598
           F-statistic: 13.27 on 3 and 3 DF, p-value: 0.03084
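The example program can be run without the file mydata.csv by typing the data in from the printed listing. The following is a sketch under that assumption: the values are copied from the `print(mydata)` output in the transcript, and building the data frame inline instead of calling `read.csv()` is the only intended change. The two `plot()` calls are left out so the sketch runs without a graphics device.

```r
# Data copied from the printed listing; row 4 has a missing gender and q3.
mydata <- data.frame(
  workshop = c(1, 2, 1, 2, 1, 2, 1, 2),
  gender   = factor(c("f", "f", "f", NA, "m", "m", "m", "m")),
  q1 = c(1, 2, 2, 3, 4, 5, 5, 4),
  q2 = c(1, 1, 2, 1, 5, 4, 3, 5),
  q3 = c(5, 4, 4, NA, 2, 5, 4, 5),
  q4 = c(1, 1, 3, 3, 4, 5, 4, 5)
)

# Treat workshop as a categorical variable, as in the slides.
mydata$workshop <- factor(mydata$workshop)
print(summary(mydata))

# lm() drops the row with the missing q3 value, leaving 7 observations.
myModel <- lm(q4 ~ q1 + q2 + q3, data = mydata)
print(summary(myModel))
print(anova(myModel))
```

With 7 complete rows and 4 estimated coefficients, the model has 3 residual degrees of freedom, matching the "3 and 3 DF" line in the printed output.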
  • 14. Where to Learn More
         - R for SAS and SPSS Users, Muenchen
         - R for Stata Users, Muenchen & Hilbe
         - R Through Excel: A Spreadsheet Interface for Statistics, Data Analysis, and Graphics, Heiberger & Neuwirth
         - Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery, Williams

        Summary
         - R is powerful, extensible, free
         - Download it from CRAN
         - Academics download Revolution R Enterprise for free at www.revolutionanalytics.com
         - You run it many ways & from many packages
         - Several graphical user interfaces are available
         - R's programming language is the way to access its full power

        muenchen@utk.edu
        Slides: r4stats.com/misc/webinar
        Presentation: bit.ly/R-sas-spss