
Introduction to R programming

Quantitative Data Analysis, Part I: Introduction to R programming

Published in: Technology

  1. Quantitative Data Analysis: Working with R
  2. Working with R
     What is R: a computer language oriented toward statistical applications.
     Advantages:
       - Completely free; just download it from the Internet
       - Many add-on packages for specialized uses
       - Open source
  3. Getting Started: Installing R
     - Have an Internet connection
     - Go to http://cran.r-project.org/
     - On the R for Windows screen, click “base”
     - Find and click on the download R link
     - Click Run, OK, or Next on all screens
     - You end up with an R icon on the desktop
  4. At http://cran.r-project.org/
  5. Downloading Base R
     - Click on Windows
     - On the next screen, click on “base”
     - Then click through the screens with Run, OK, or Next
     - Finally, “Finish” puts the R icon on the desktop
  6. Rgui and R Console, ending with the R prompt (>)
  7. The R prompt (>)
     ">" is the “R prompt.” It says R is ready to take your command.
     Enter these after the prompt and observe the output:
       > 2+3
       > 2^3+(5)
       > 6/2+(8+5)
       > 2 ^ 3 + (5)
  8. Installing Packages and Libraries
     install.packages("akima")
     install.packages("chron")
     install.packages("lme4")
     install.packages("mcmc")
     install.packages("odesolve")
     install.packages("spdep")
     install.packages("spatstat")
     install.packages("tree")
     install.packages("lattice")
  9. Installing Packages and Libraries
  10. Installing Packages and Libraries
     R.version
     installed.packages()
     update.packages()
     setRepositories()
  11. Help
     help(mean)
     ?mean
     help() will not find a function in a package unless you install the package and load it with library().
     help.search("aspline") will find functions in packages that are installed but not loaded.
     apropos("lm")
  12. Help
     For help on a whole package:
     help(package=akima)
     To list the objects in a loaded package:
     library("akima")
     objects(grep("akima",search()))
     my.packages <- search()
     aki <- grep("akima",my.packages)
     my.objects <- objects(aki)
  13. Help
     example(mean)
     demo()
     demo(package = .packages(all.available = TRUE))
     demo(graphics)
     vignette(all=TRUE)
     V <- vignette("sp")
     print(V)
     edit(V)
  14. Maintenance
     ls() / objects()
     search()
     class(a)
     rm(a,b,c)
     rm(list=ls())
  15. Maintenance
     getwd()
     setwd()
     source("myprogram.R")
     save(list = ls(all=TRUE), file= "all.Rdata")
     load("all.Rdata")
     save.image()
     savehistory()
  16. To cite use of R
     To cite the use of R for statistical work, the R documentation recommends the following:
     R Development Core Team (2010). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/.
     Get the latest citation by typing citation() at the prompt.
  17. Email Support Lists
     - See http://r-project.org under "mailing lists"
     - r-help is the most general one
     - Before posting, read: http://www.R-project.org/posting-guide.html
     - Send the smallest possible example of your problem (generated data is handy)
     - sessionInfo() will list your computer and R details to cut and paste into your question
  18. Quantitative Data Analysis: Programming with R
  19. Basic concepts
     Code, Commands, Programs, Objects, Types, Functions, Operators
  20. Assignment
     a <- 1
     assign("b", 2)
  21. Mathematical operators
     + - * / ^          arithmetic
     > >= < <= == !=    relational
     ! & |              logical
     $                  list indexing (the ‘element name’ operator)
     :                  create a sequence
     ~                  model formulae
  22. Logical operators
     !           logical NOT
     &           logical AND
     |           logical OR
     <           less than
     <=          less than or equal to
     >           greater than
     >=          greater than or equal to
     ==          logical equals (double =)
     !=          not equal
     &&          AND for use with if
     ||          OR for use with if
     xor(x,y)    exclusive OR
     isTRUE(x)   an abbreviation of identical(TRUE,x)
     all(x)
     any(x)
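The difference between the element-wise operators (& and |) and the single-value forms meant for if conditions (&& and ||) is easy to miss in a table. The following is a minimal sketch; the values of x, y and n are made up for illustration and do not come from the slides.

```r
# Illustrative values (assumptions, not from the slides)
x <- c(TRUE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE)

elementwise_and <- x & y      # compares element by element
elementwise_or  <- x | y
exclusive_or    <- xor(x, y)  # TRUE where exactly one of x, y is TRUE

# && and || take single values and short-circuit,
# which is why they are the forms used inside if()
n <- 5
if (n > 0 && n < 10) in_range <- TRUE else in_range <- FALSE

any_true <- any(x)   # is at least one element TRUE?
all_true <- all(x)   # are all elements TRUE?
```

Here elementwise_and, elementwise_or and exclusive_or are vectors of length three, while in_range, any_true and all_true are single logical values.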
  23. Mathematical functions
     log(x)         log to base e of x
     exp(x)         antilog of x (e^x)
     log(x,n)       log to base n of x
     log10(x)       log to base 10 of x
     sqrt(x)        square root of x
     factorial(x)   x!
     choose(n,x)    binomial coefficients n!/(x!(n−x)!)
     gamma(x)       Γ(x) for real x; (x−1)! for integer x
     lgamma(x)      natural log of Γ(x)
  24. Mathematical functions
     floor(x)              greatest integer ≤ x
     ceiling(x)            smallest integer ≥ x
     trunc(x)              integer part of x (truncates toward 0)
     round(x, digits=0)    round the value of x to an integer
     abs(x)                the absolute value of x, ignoring the minus sign if there is one
     signif(x, digits=6)   give x to 6 significant digits
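The rounding functions differ mainly on negative numbers and on halves, so a small comparison may help. This is a sketch with made-up input values; the outputs in the comments follow from R's documented behaviour (round() sends exact halves to the even neighbour).

```r
# Illustrative input (an assumption, not from the slides)
x <- c(-1.5, -0.4, 0.4, 1.5, 2.5)

fl <- floor(x)     # largest integer not above x:   -2 -1  0  1  2
ce <- ceiling(x)   # smallest integer not below x:  -1  0  1  2  3
tr <- trunc(x)     # fractional part dropped:       -1  0  0  1  2
ro <- round(x)     # halves go to the even integer: -2  0  0  2  2

sg <- signif(123456.789, digits = 3)   # 123000
```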
  25. Trigonometrical functions
     cos(x)   cosine of x in radians
     sin(x)   sine of x in radians
     tan(x)   tangent of x in radians
     acos(x), asin(x), atan(x)      inverse trigonometric transformations of real or complex numbers
     acosh(x), asinh(x), atanh(x)   inverse hyperbolic trigonometric transformations of real or complex numbers
  26. Infinity and Things that Are Not a Number
     Inf (is.finite, is.infinite)
       3/0
       2 / Inf
       exp(-Inf)
       (0:3)^Inf
     NaN (is.nan)
       0/0
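As a rough illustration of what the expressions on this slide evaluate to, the sketch below stores each result and then applies the test functions to a small made-up vector (vals is an assumption; is.na() is included only for contrast with is.nan()).

```r
a <- 3 / 0        # Inf
b <- 2 / Inf      # 0
d <- exp(-Inf)    # 0
p <- (0:3)^Inf    # 0 1 Inf Inf
z <- 0 / 0        # NaN

# Illustrative vector mixing ordinary, infinite and missing values
vals <- c(1, Inf, -Inf, NaN, NA)
is.finite(vals)    # TRUE FALSE FALSE FALSE FALSE
is.infinite(vals)  # FALSE TRUE TRUE FALSE FALSE
is.nan(vals)       # FALSE FALSE FALSE TRUE FALSE
is.na(vals)        # FALSE FALSE FALSE TRUE TRUE
```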
  27. Vectors
     a <- c(1,2,3,4,5)
     a <- 1:5
     a <- scan()
     a <- seq(1,10,2)
     b <- 1:4
     a <- seq(1,10,along=b)
     x <- runif(10)
     which(a == 2)
  28. Plotting functions
     x <- seq(-10,10,0.1)
     y <- x^3
     plot(x,y,type="l")
  29. Vector functions
     max(x)     maximum value in x
     min(x)     minimum value in x
     sum(x)     total of all the values in x
     sort(x)    a sorted version of x
     rank(x)    vector of the ranks of the values in x
     order(x)   an integer vector containing the permutation to sort x into ascending order
     range(x)   vector of min(x) and max(x)
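sort(), rank() and order() are often confused, so here is a small sketch on a made-up vector (the values are an assumption chosen so the three results differ).

```r
# Illustrative vector (not from the slides)
x <- c(30, 10, 20, 40)

s <- sort(x)    # the values in ascending order:         10 20 30 40
r <- rank(x)    # the rank of each value in place:        3  1  2  4
o <- order(x)   # the positions that would sort x:        2  3  1  4
rg <- range(x)  # smallest and largest value:            10 40

same <- identical(x[o], s)  # indexing by order(x) reproduces sort(x)
```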
  30. More functions
     cumsum(x)     vector containing the sum of all of the elements up to that point
     cumprod(x)    vector containing the product of all of the elements up to that point
     cummax(x)     vector of non-decreasing numbers which are the cumulative maxima of the values in x up to that point
     cummin(x)     vector of non-increasing numbers which are the cumulative minima of the values in x up to that point
     pmax(x,y,z)   vector, of length equal to the longest of x, y or z, containing the maximum of x, y or z for the ith position in each
     pmin(x,y,z)   vector, of length equal to the longest of x, y or z, containing the minimum of x, y or z for the ith position in each
     rowSums(x)    row totals of dataframe or matrix x
     colSums(x)    column totals of dataframe or matrix x
  31. Functions
     Geometric mean (p.49)
     geometric <- function(x) exp(mean(log(x)))
     Harmonic mean (p.51)
     harmonic <- function(x) 1/mean(1/x)
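To check that the two definitions behave as expected, they can be tried on small inputs whose answers are easy to verify by hand. The input vectors below are assumptions chosen for that purpose.

```r
# Definitions repeated from the slide so the block runs on its own
geometric <- function(x) exp(mean(log(x)))
harmonic  <- function(x) 1 / mean(1 / x)

# Geometric mean of 1, 10, 100: the cube root of 1000, i.e. 10
g <- geometric(c(1, 10, 100))

# Harmonic mean of 1, 2, 4: 3 / (1 + 1/2 + 1/4) = 12/7
h <- harmonic(c(1, 2, 4))
```

Because both results come from floating-point arithmetic, compare them with all.equal() rather than ==.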
  32. Exercises
     Find the value in a vector that is closest to a specified value:
     closest <- function(xv,sv) {
       xv[which(abs(xv-sv)==min(abs(xv-sv)))]
     }
     Calculate a trimmed mean of x which ignores both the smallest and largest values:
     trimmed.mean <- function(x) {
       mean(x[-c(which(x==min(x)),which(x==max(x)))])
     }
  33. Sets
     union(x,y)
     intersect(x,y)
     setdiff(x,y)
     setequal(x,y)
     is.element(el,set)
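A short sketch of the set functions on two made-up vectors (x and y are assumptions); note that setdiff() is not symmetric.

```r
# Illustrative sets (not from the slides)
x <- c(1, 2, 3, 4)
y <- c(3, 4, 5)

u  <- union(x, y)        # 1 2 3 4 5
i  <- intersect(x, y)    # 3 4
d1 <- setdiff(x, y)      # in x but not in y: 1 2
d2 <- setdiff(y, x)      # in y but not in x: 5

eq <- setequal(x, c(4, 3, 2, 1))  # TRUE: order does not matter
el <- is.element(3, y)            # TRUE
```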
  34. Matrices
     X <- matrix(c(1,0,0,0,1,0,0,0,1),nrow=3)
     dim(X)
     is.matrix(X)
     vector <- c(1,2,3,4,4,3,2,1)
     V <- matrix(vector,byrow=TRUE,nrow=2)
     dim(vector) <- c(2,4)
  35. Matrices
     X <- rbind(X,apply(X,2,mean))
     X <- cbind(X,apply(X,1,var))
  36. sweep
     matdata <- read.table("datasweepdata.txt")
     cols <- apply(matdata,2,mean)
     sweep(matdata,2,cols)
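Since the data file used on the sweep slide is not available here, the same idea can be sketched on a small made-up matrix: compute the column means with apply() and subtract them column by column with sweep(), whose default operation is subtraction. The matrix m is an assumption for illustration.

```r
# Illustrative matrix standing in for the data file (not from the slides)
m <- matrix(1:6, nrow = 3)          # columns are 1,2,3 and 4,5,6

col_means <- apply(m, 2, mean)      # 2 and 5
centered  <- sweep(m, 2, col_means) # subtract each column's mean from that column

# After centering, every column has mean zero
check <- colMeans(centered)
```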
  37. Lists
     person <- list()
     person$name <- "Alberto"
     person$age <- 37
     person$nationality <- "Spain"
     > class(person)
     [1] "list"
     > person
     $name
     [1] "Alberto"
     $age
     [1] 37
     $nationality
     [1] "Spain"
     > names(person)
     [1] "name"        "age"         "nationality"
  38. Strings
     phrase <- "the quick brown fox jumps over the lazy dog"
     letras <- table(strsplit(phrase,split=character(0)))
     numwords <- 1+table(strsplit(phrase,split=character(0)))[1]
     words <- unlist(strsplit(phrase,split=" "))
     words[grep("o",words)]
     "fox" %in% unlist(strsplit(phrase,split=" "))
     unlist(strsplit(phrase,split=" ")) %in% c("fox","dog")
  39. Strings
     nchar(words)
     paste(words[1],words[2])
     toupper(words)
  40. Regular expressions
     grep("^t", words)
     words[grep("^t", words)]
     words[grep("s$", words)]
     gsub("o","O",words)
     regexpr()
  41. Dataframes
     lista <- data.frame()
     lista[1,1] = "Alberto"
     lista[1,2] = 37
     lista[2,1] = "Ana"
     lista[2,2] = 23
     names(lista) <- c("Nombre", "Edad")
  42. Missing values
     NA (is.na)
     x <- c(1:8,NA)
     mean(x)
     mean(x,na.rm=TRUE)
     which(is.na(x))
     as.vector(na.omit(x))
     x[!is.na(x)]
  43. Dates and Times in R
     date()
     date <- as.POSIXlt(Sys.time())
     unlist(unclass(date))
     difftime()
     excel.dates <- c("27/02/2004", "27/02/2005", "14/01/2003", "28/06/2005", "01/01/1999")
     strptime(excel.dates,format="%d/%m/%Y")
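The slide lists difftime() without arguments, so here is one way the parsed dates might be used with it. This is a sketch: the tz = "UTC" argument is an added assumption that keeps the result independent of the local time zone and daylight saving changes.

```r
excel.dates <- c("27/02/2004", "27/02/2005", "14/01/2003",
                 "28/06/2005", "01/01/1999")

# Parse day/month/year strings; tz = "UTC" is an assumption for reproducibility
parsed <- strptime(excel.dates, format = "%d/%m/%Y", tz = "UTC")

# Reformat in ISO style
iso <- format(parsed, "%Y-%m-%d")

# Days between the first two dates; 2004 is a leap year, so this is 366
gap <- difftime(parsed[2], parsed[1], units = "days")
```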
  44. Testing and Coercing in R
  45. if
     if (y > 0) print(1) else print(-1)
     z <- ifelse(y < 0, -1, 1)
  46. Loops and Repeats
     for (i in 1:10) print(i^2)
     i <- 1
     while (i <= 10) {
       print(i^2)
       i <- i + 1
     }
     i <- 1
     repeat {
       if (i > 10) break
       print(i^2)
       i <- i + 1
     }
  47. Exercise
     Compute the Fibonacci series 1, 1, 2, 3, 5, 8
     fibonacci <- function(n) {
       a <- 1
       b <- 0
       while (n > 0) {
         swap <- a
         a <- a + b
         b <- swap
         n <- n - 1
       }
       b
     }
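The function returns a single term, so producing the series from the exercise takes one more step. A possible way, sketched here with sapply() (the choice of sapply is an assumption, not something the slide prescribes):

```r
# Function repeated from the slide so the block runs on its own
fibonacci <- function(n) {
  a <- 1
  b <- 0
  while (n > 0) {
    swap <- a      # remember the current term
    a <- a + b     # next term is the sum of the last two
    b <- swap
    n <- n - 1
  }
  b
}

# Apply the function to n = 1..6 to get the first six terms
series <- sapply(1:6, fibonacci)   # 1 1 2 3 5 8
```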
  48. Avoid loops
     x <- runif(10000000)
     system.time(max(x))
     pc <- proc.time()
     cmax <- x[1]
     for (i in 2:length(x)) {
       if (x[i] > cmax) cmax <- x[i]
     }
     proc.time() - pc
  49. switch
     central <- function(y, measure) {
       switch(measure,
         Mean = mean(y),
         Geometric = exp(mean(log(y))),
         Harmonic = 1/mean(1/y),
         Median = median(y),
         stop("Measure not included"))
     }
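A brief usage sketch for central(): the measure name selects the branch, and the final unnamed argument of switch() acts as the default, so an unknown name raises the error. The input vector and the "Mode" label are made up for illustration.

```r
# Function repeated from the slide so the block runs on its own
central <- function(y, measure) {
  switch(measure,
         Mean      = mean(y),
         Geometric = exp(mean(log(y))),
         Harmonic  = 1 / mean(1 / y),
         Median    = median(y),
         stop("Measure not included"))   # default branch
}

y <- c(1, 10, 100)                 # illustrative data
m_mean   <- central(y, "Mean")     # 37
m_median <- central(y, "Median")   # 10
m_geo    <- central(y, "Geometric")

# An unknown measure falls through to stop(); tryCatch captures the error
bad <- tryCatch(central(y, "Mode"), error = function(e) conditionMessage(e))
```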
  50. Quantitative Data Analysis: Working with datasets
  51. Help for Datasets
     To list built-in datasets:
     data()
     data(package = .packages(all.available = TRUE))
     data(swiss)
     For help on a dataset:
     help(swiss)
     “Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.”
  52. The attach Command
     To access individual variables, do this:
     > attach(swiss)
     Now try:
     > mean(Fertility)
     > detach(swiss)
  53. Using R Functions: Simple Stuff
     rownames(swiss)
     colnames(swiss)
     summary(swiss)
     Applying functions:
     mean(swiss$Fertility)
     sd(swiss$Fertility)
     apply(swiss,2,max)
  54. Factors
     class(Detergent)
     nlevels(Detergent)
     levels(Detergent)
     as.factor()
  55. Working with your dataset
     fix(swiss)
     hist(Agriculture)
     plot(Catholic,Fertility)
  56. Working with your own datasets
     write.table(swiss, "swiss.txt")
     swiss2 <- read.table("swiss.txt")
     data <- read.table(file.choose(),header=TRUE)
     readLines()
  57. Reading data from files
     read.table(file)
       reads a file in table format and creates a data frame from it; the default separator sep="" is any whitespace; use header=TRUE to read the first line as a header of column names; use as.is=TRUE to prevent character vectors from being converted to factors; use comment.char="" to prevent "#" from being interpreted as a comment; use skip=n to skip n lines before reading data; see the help for options on row naming, NA treatment, and others
     read.csv("filename", header=TRUE)
       the same, but with defaults set for reading comma-delimited files
     read.delim("filename", header=TRUE)
       the same, but with defaults set for reading tab-delimited files
     read.fwf(file,widths)
       reads a table of fixed-width formatted data into a data.frame; widths is an integer vector giving the widths of the fixed-width fields
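A round trip through a file is a simple way to try these readers. The sketch below writes the built-in swiss data to a temporary CSV file and reads it back; the use of tempfile() and the row.names = 1 argument are choices made for this illustration.

```r
# Write the built-in swiss data to a temporary comma-delimited file
tmp <- tempfile(fileext = ".csv")
write.csv(swiss, tmp)              # row names are written as the first column

# Read it back; row.names = 1 turns that first column into row names again
swiss2 <- read.csv(tmp, header = TRUE, row.names = 1)

dim(swiss2)      # same number of rows and columns as swiss
names(swiss2)    # same column names as swiss

unlink(tmp)      # remove the temporary file
```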
  58. Example
     data <- read.table(".datadaphnia.txt",header=TRUE)
     names(data)
     attach(data)
     table(Detergent)
     tapply(Growth.rate,Detergent,mean)
     aggregate(Growth.rate,list(Detergent), mean)
     tapply(Growth.rate,list(Water,Daphnia),median)
     with(data,boxplot(Growth.rate ~ Detergent))
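The daphnia file is not available here, so the same grouped summaries can be sketched on a stand-in: the built-in warpbreaks dataset, which has one numeric column (breaks) and two factors (wool, tension). Using warpbreaks is an assumption made purely for illustration; the calls mirror the ones on the slide.

```r
# Built-in stand-in for the daphnia data: breaks (numeric), wool and tension (factors)
counts <- table(warpbreaks$wool)    # observations per level of wool

# Mean of breaks for each level of wool
means <- with(warpbreaks, tapply(breaks, wool, mean))

# The same summary returned as a data frame
agg <- aggregate(breaks ~ wool, data = warpbreaks, FUN = mean)

# Median of breaks for each wool x tension combination (a 2 x 3 table)
medians <- with(warpbreaks, tapply(breaks, list(wool, tension), median))
```

with() evaluates the expression inside the data frame without attach(), so no detach() is needed afterwards.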