Introduction to R and R Studio

  1. Introduction to R and R Studio, by Rupak Roy
  2. What is the R language? R is a language and a platform for statistical computing and graphics. It is a GNU project similar to the S language, which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and his colleagues. R provides a wide variety of statistical and graphical techniques and is highly extensible. R is available as free software, in source code form, under the terms of the Free Software Foundation’s GNU General Public License.
  3. Introduction to the R language. R includes an effective data handling and storage facility. It is one of the most comprehensive statistical analysis packages available: it incorporates the standard statistical tests, models, and analyses, and provides a full language for managing and manipulating data. Everyone is welcome to contribute code enhancements, fix bugs, and add new packages, and the wealth of quality packages available for R is a testament to this approach to software development and sharing. R has over 4,800 packages available from multiple repositories, specializing in topics such as econometrics, data mining, spatial analysis, and bioinformatics. R can read many types of data, including CSV, SAS, SPSS, Excel, MySQL, SQL Server, and Oracle, and can even be integrated with Hadoop for big data analysis.
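As a minimal sketch of the data handling described above, the snippet below writes a small CSV file to a temporary location and reads it back with base R's `read.csv`. The file contents and the column names (`name`, `score`) are made up for illustration. Formats such as SAS, SPSS, or Excel need add-on packages and are not shown here.

```r
# Create a small, made-up CSV file in a temporary location (illustrative data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "a,10", "b,20", "c,30"), csv_path)

# Read the file into a data frame; keep text columns as character vectors.
scores <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the structure and compute a simple summary statistic.
str(scores)
mean_score <- mean(scores$score)
print(mean_score)
```

Writing to `tempfile()` keeps the example self-contained; in practice `csv_path` would point at an existing file.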
  4. Introduction to the R language. R was listed among the top analytical tools of 2016, behind SAS, which is commercially licensed software. By 2019 R had taken the lead among analytical tools, thanks to its robustness and versatility.
  5. Introduction to R Studio. R Studio is a free and open source integrated development environment (IDE) for R, the programming language for statistical computing and graphics. R Studio was founded by JJ Allaire. It is available in two editions: R Studio Desktop, which runs locally as a regular desktop application, and R Studio Server, which allows R Studio to be accessed remotely through a web browser.
  6. Difference between R and R Studio. R and R Studio are two different things, not two versions of the same program. R is a programming language for statistical computing, and R Studio is an integrated development environment (IDE) that adds a graphical interface to make analytics easier. We can use R without R Studio, but we cannot use R Studio without R. In other words, R Studio is a front-end IDE for R.
  7. What is CRAN? CRAN (the Comprehensive R Archive Network) is a network of FTP and web servers around the world that store identical, up-to-date versions of the code and documentation for R.
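Packages hosted on CRAN are normally fetched with `install.packages`. The sketch below wraps that call in a helper, `install_if_missing`, which is a hypothetical name introduced here and not part of R. The mirror URL `https://cloud.r-project.org` is one possible choice of CRAN mirror, used as an assumption.

```r
# Hypothetical helper (not part of R): install a package from a CRAN mirror
# only when it is not already available locally.
install_if_missing <- function(pkg, repo = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repo)
  }
  # TRUE when the package can be loaded after the (possible) install.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with every R installation, so this call needs no download.
ok <- install_if_missing("stats")
print(ok)
```

Calling the helper with the name of a package that is not yet installed would trigger a download from the chosen mirror.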
  8. What is Big Data? Big data refers to extremely large data sets that are analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior or machines. Such data sets can range from terabytes to petabytes and consist of millions to trillions of rows and columns. R was not built for big data analytics, but it can integrate with big data technologies such as Hadoop. One big advantage of this pairing is that Hadoop on its own is designed for programmers, whereas with R, data scientists, analysts, and anyone without a programming background can spend their time analyzing data rather than programming. R sends instructions to Hadoop; Hadoop processes them and returns the results to R. R can also extract multiple samples from Hadoop, which statistical modeling requires. On its own, R can handle only as much data as fits in the system's available memory (RAM).
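The memory limit and the idea of working from several samples can be illustrated entirely in memory. The sketch below does not connect to Hadoop; the data frame `big` is made-up data standing in for a large table, and the sample sizes are arbitrary.

```r
set.seed(42)  # make the random draws reproducible

# Made-up data frame standing in for a large table (illustrative only).
big <- data.frame(id = 1:100000, value = rnorm(100000))

# Report how much RAM the object occupies, in a human-readable unit.
print(format(object.size(big), units = "auto"))

# Draw three independent random samples of 1,000 rows each.
samples <- lapply(1:3, function(i) big[sample(nrow(big), 1000), ])

# Compare the mean of `value` across the samples.
sample_means <- sapply(samples, function(s) mean(s$value))
print(sample_means)
```

Working on samples like these keeps each object small enough to fit in RAM even when the full source table would not.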
  9. R Studio Environment. Source editor: a text editor where multiple lines of code can be entered and saved to disk as a script file. Console: where all the interactive work in R is performed, such as creating objects, running analyses, and filtering data.
  10. R Studio Environment. Packages: where a user can view the list of installed packages. A package is a self-contained set of code for performing a specific task, similar to an add-in in Excel. Help: where we can browse the built-in help system for any R-related topic. Files: where users can browse the files on their computer. Plots: where R displays its visual output, such as histograms, bar charts, and boxplots. Workspace/History: the workspace is our current R working environment and includes any user-defined objects (vectors, matrices, data frames, lists, functions). At the end of an R session, the user can save an image of the current workspace, which is automatically reloaded the next time R is started.
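Each pane described above has a rough equivalent in plain R commands. The following is a sketch, not a description of how R Studio works internally; the vector `x` and the temporary file path are made up for illustration. `save(list = ls(), ...)` is used here as a stand-in for the usual `save.image()` shortcut.

```r
# Console: create an object interactively (illustrative values).
x <- c(2, 4, 6, 8)

# Packages: list the names of the installed packages.
pkgs <- rownames(installed.packages())
print(head(pkgs))

# Help and Plots: only run when working interactively (for example in R Studio).
if (interactive()) {
  help("mean")                      # opens the built-in documentation
  hist(x, main = "Histogram of x")  # draws in the plotting pane or device
}

# Workspace: list the current objects, save them, remove one, and reload.
print(ls())
image_path <- tempfile(fileext = ".RData")
save(list = ls(), file = image_path)
rm(x)
load(image_path)
print(x)
```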
  11. Installing R and R Studio. To install R first, follow these steps. Visit https://cran.r-project.org/ and select the link for your operating system; in this case we choose ‘Download R for Windows’. On the next page, click ‘install R for the first time’ in the base category, then download R 3.3.3 for Windows. Run the R setup file, choose the options that suit your needs (we will keep the default settings for this course), and finish the installation. Open the RGui, which should look something like this.
  12. Installing R and R Studio (screenshot of the R GUI after installation).
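Once R is installed, a few commands typed at the R prompt can confirm that the installation works. This is a minimal sketch; the comparison against version 3.3.3 simply reuses the release named in the installation steps.

```r
# Print the version string of the running R installation.
print(R.version.string)

# Version object that supports comparisons against a version string.
r_ver <- getRversion()
print(r_ver >= "3.3.3")

# A quick calculation to confirm the interpreter evaluates code.
total <- sum(1:10)
print(total)
```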
  13. Installing R and R Studio. Now let’s install R Studio. Go to https://www.rstudio.com/products/rstudio/ and download R Studio Desktop: select the installation file for your system and run it. Settings can be changed later under Tools -> Global Options.
  14. Next: Data types and their structure in R.