3. History of R programming
• R is a programming language and free software
environment for statistical computing and
graphics.
• R was created by Ross Ihaka and Robert
Gentleman at the University of Auckland, New
Zealand, and further developed by the R
Development Core Team.
• R is named partly after the first names of its two
original authors (Ross and Robert) and partly as a
play on the name of the S language.
• The project was conceived in 1992, with an initial
version released in 1995 and the first stable
version (v1.0) on 29 February 2000.
4. Programming features
• R is an interpreted language; users typically
access it through a command-line interpreter.
• R's data structures include vectors, matrices,
arrays, data frames and lists.
• R supports procedural programming with
functions and, for some functions, object-oriented
programming with generic functions.
• Although used mainly by statisticians requiring an
environment for statistical computation and
software development, R can also operate as a
general calculation toolbox – with performance
benchmarks comparable to MATLAB.
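A minimal sketch of the features above: it builds each of the listed data structures, then shows one ordinary function and one generic function with two methods. The names `mean_score`, `area`, `circle`, and `square` are made up for this illustration and are not part of R.

```r
# The five data structures named on the slide
v  <- c(1.5, 2.5, 3.5)                      # numeric vector
m  <- matrix(1:6, nrow = 2, ncol = 3)       # 2 x 3 matrix
a  <- array(1:24, dim = c(2, 3, 4))         # 3-dimensional array
df <- data.frame(id = 1:3, score = v)       # data frame: columns of equal length
l  <- list(name = "R", year = 1995, v = v)  # list: elements of mixed type

# Procedural style: an ordinary function (hypothetical name)
mean_score <- function(d) mean(d$score)

# Object-oriented style with a generic function (S3 dispatch).
# `area`, `circle`, and `square` are hypothetical names for this sketch.
area <- function(shape) UseMethod("area")
area.circle <- function(shape) pi * shape$r^2
area.square <- function(shape) shape$side^2

circle <- structure(list(r = 1), class = "circle")
square <- structure(list(side = 2), class = "square")

print(mean_score(df))   # prints [1] 2.5
print(area(square))     # prints [1] 4
```

Calling `area()` on an object picks the method that matches the object's class, which is what "object-oriented programming with generic functions" refers to.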
5. Statistical features
• R and its libraries implement a wide variety of
statistical and graphical techniques, including linear
and nonlinear modeling, classical statistical tests, time-
series analysis, classification, clustering, and others.
• R is easily extensible through user-written functions
and packages, and the R community is noted for its
active contributions in terms of packages.
• Many of R's standard functions are written in R itself,
which makes it easy for users to follow the algorithmic
choices made.
• Another strength of R is static graphics, which can
produce publication-quality graphs, including
mathematical symbols. Dynamic and interactive
graphics are available through additional packages.
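The points above can be tried in a few lines. This sketch uses only base R and the built-in `mtcars` data set; the choice of variables and the plot title are arbitrary picks for illustration.

```r
# Linear modeling: fuel economy as a function of car weight
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# A classical statistical test: Welch two-sample t-test,
# comparing mpg for automatic vs. manual transmission (column `am`)
tt <- t.test(mpg ~ am, data = mtcars)
print(tt$p.value)

# Many standard functions are written in R itself,
# so printing the function shows its source code
print(sd)

# Static graphics with mathematical symbols (plotmath), written to a PDF file
out <- tempfile(fileext = ".pdf")
pdf(out)
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = expression(hat(y) == beta[0] + beta[1] * x))
abline(fit)              # add the fitted regression line
invisible(dev.off())     # close the device so the file is written
```

The PDF path comes from `tempfile()`, so nothing is written into the working directory.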
6. Packages
• The capabilities of R are extended through user-created
packages, which provide specialized statistical
techniques, graphical devices, import/export
capabilities, and reporting tools.
• The R packaging system is also used by researchers to
create and organize research data, code and report
files in a systematic way for sharing and public
archiving.
• A core set of packages is included with the installation
of R, with more than 15,000 additional packages
available at the Comprehensive R Archive Network
(CRAN), Bioconductor, Omegahat, GitHub, and other
repositories.
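A short sketch of working with packages from the R console. The install step needs network access, so it is left commented out; `ggplot2` is only an example choice of an add-on package.

```r
# Core packages that ship with every R installation
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Attach one of them; `parallel` is bundled with R but not attached by default
library(parallel)

# Installing an additional package from CRAN needs network access,
# so it is commented out here. "ggplot2" is just an example choice.
# install.packages("ggplot2")
# library(ggplot2)

# Check whether a package is available without attaching it
has_stats <- requireNamespace("stats", quietly = TRUE)
print(has_stats)
```

`install.packages()` downloads a package once, while `library()` is needed in each new session to attach it.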
7. CRAN
Comprehensive R Archive Network
• CRAN is a network of FTP and web servers
around the world that store identical,
up-to-date versions of code and documentation
for R.
• Please use the CRAN mirror nearest to you to
minimize network load.
• What is a CRAN mirror? It is one of these servers;
every mirror holds the same copy of the archive.
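One way to choose a mirror for the current session is through the `repos` option. This is a sketch, not the only approach; the URL below is the CRAN "cloud" address, which redirects to a nearby server automatically.

```r
# Point this session at a specific CRAN mirror
options(repos = c(CRAN = "https://cloud.r-project.org"))
print(getOption("repos"))

# To pick a mirror interactively from the full list (needs network access):
# chooseCRANmirror()

# To look at the list of mirrors as a data frame (needs network access):
# head(getCRANmirrors()[, c("Name", "Country", "URL")])
```

Later calls to `install.packages()` in the same session download from the mirror set here.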
9. ANACONDA
• Anaconda describes itself as the birthplace of
Python data science.
• Anaconda is a free and open-source distribution
of the Python and R programming languages for
scientific computing.
• It aims to simplify package management and
deployment.
• The distribution includes data-science packages
suitable for Windows, Linux, and macOS.
10. Why adopt R?
• R can be integrated with other programming
languages such as C, C++, and Python.
• R has more than 15,000 packages available
through CRAN and other repositories.
• R is supported by a worldwide community of
developers.
• R offers an easy interface for data handling and
visualization.
11. Companies using ‘R’eal time
• Google:
– Calculating ROI on advertising campaigns
– Economic forecasting
– Big-data statistical modeling
• Facebook:
– User behavior analysis related to status updates
and profile pictures.
– Exploratory data analysis, Big-data visualization.
12. Companies using ‘R’eal time
• Twitter:
– Used for semantic clustering and data visualization.
– Anomaly and breakout detection to improve the customer
experience.
• John Deere:
– Used to forecast crop yields.
– Used to optimize the build order on production lines.
• ANZ Bank:
– Used for credit risk analysis.
– Used to fit models for mortgage loss.