R Language for Statistics and Big Data Analysis
1. R Language
INTRODUCTION:
R is an open-source programming language and software environment for statistical computing
and graphics. It is widely used among statisticians and data miners for developing
statistical software and for data analysis. Here, we are going to integrate R with Hadoop in order to
manage big data efficiently.
BASIC COMMANDS:
Get working directory:
getwd() [Returns the current working directory. R is case sensitive, so Getwd() will not work]
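As a small sketch of the case-sensitivity remark: the correctly spelled call returns a path, while a capitalised variant (Getwd, a name used here only for illustration and assumed not to be defined in the session) raises an error that can be caught.

```r
# Print the current working directory
wd <- getwd()
print(wd)

# Getwd (capital G) is a different name from getwd and is assumed undefined,
# so calling it raises an error; tryCatch captures the error message as text
result <- tryCatch(Getwd(), error = function(e) conditionMessage(e))
print(result)
```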
Assignment:
Single value:
s <- 3 or s = 3 [The value of s is 3; the operator is written <- with no space inside]
Multiple values:
s <- c(1, 2, 3) [The value of s is 1, 2, 3]
Or
s <- c(1:3) [The value of s is 1, 2, 3; 1:3 alone gives the same result]
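The assignment forms can be tried out as follows. This is a minimal sketch; the names s2, v and w are introduced here only for illustration.

```r
# Single value: <- and = both assign at the top level
s <- 3
s2 = 3

# Multiple values: c() combines its arguments into a vector
v <- c(1, 2, 3)   # stored as double
w <- 1:3          # the colon operator produces integers

print(s == s2)    # compares the two single values
print(v == w)     # element-wise comparison of the two vectors
print(typeof(v))
print(typeof(w))
```

The two vectors hold the same values but differ in storage type, which matters only for strict comparisons such as identical().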
Mean:
mean(x) [With x <- c(1, 2, 3), the mean of x is 2]
Variance:
var(x) [With x <- c(1, 2, 3), the sample variance of x is 1]
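To see where these numbers come from, the sketch below recomputes the variance by hand. var() uses the sample formula with n - 1 in the denominator; the names m, v_manual and v_builtin are chosen here for illustration.

```r
x <- c(1, 2, 3)

# Mean: (1 + 2 + 3) / 3
m <- mean(x)

# Sample variance by hand: squared deviations summed, divided by n - 1,
# i.e. ((1-2)^2 + (2-2)^2 + (3-2)^2) / 2
v_manual <- sum((x - m)^2) / (length(x) - 1)

# Built-in sample variance for comparison
v_builtin <- var(x)

print(c(mean = m, manual = v_manual, builtin = v_builtin))
```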
Linear Model:
lm_1 <- lm(y ~ x) [A linear model of y on x is fitted and stored in lm_1; typing lm_1 prints its coefficients]
Graphical Representation:
plot(lm_1) [Diagnostic plots of the fitted model, such as residuals against fitted values, will be drawn]
Summary:
summary(lm_1) [The summary of the linear model will be shown: coefficients, standard errors and R-squared]
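The three linear-model commands can be run end to end on a small data set. The x and y values below are made up for illustration (roughly y = 2x) and are not from any real source.

```r
# Made-up data, roughly following y = 2x
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Fit the linear model y = a + b * x
lm_1 <- lm(y ~ x)

# Intercept and slope of the fitted line
print(coef(lm_1))

# Full summary: coefficients, standard errors, R-squared
print(summary(lm_1))

# Draw the data with the fitted line; only done in an interactive session
if (interactive()) {
  plot(x, y)
  abline(lm_1)
}
```

Using plot(x, y) followed by abline(lm_1) shows the fitted line itself, whereas plot(lm_1) shows diagnostic plots.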
List of variables:
ls() [The list of all variables in the current workspace will be shown]
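A brief sketch of listing variables, assuming two throwaway variables a and b created only for this illustration; rm() removes a variable again.

```r
# Create two example variables
a <- 1
b <- "two"

# ls() returns the variable names as a character vector
vars <- ls()
print(vars)

# After removing a, it no longer appears in the listing
rm(a)
vars_after <- ls()
print(vars_after)
```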
Reading files:
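read.table(file = "sample.csv", header = TRUE, sep = ",")
[Files can be read in this way, mostly csv files. read.table splits on whitespace by default, so sep = "," is
needed for csv; header = TRUE assumes the first row holds column names. read.csv("sample.csv") is a shorthand]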
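Reading a csv file can be sketched in a self-contained way by first writing a tiny file to a temporary location. The file contents and the names path, d1 and d2 are assumptions made for this illustration.

```r
# Write a small csv file to a temporary path so the example needs no external file
path <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2", "2,4", "3,6"), path)

# read.table needs the separator and header stated explicitly for csv input
d1 <- read.table(file = path, header = TRUE, sep = ",")

# read.csv has comma separation and a header row as its defaults
d2 <- read.csv(path)

# Inspect the structure of the resulting data frame
str(d1)

# Clean up the temporary file
unlink(path)
```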
CONCLUSION:
Some basic commands have been noted down above. They may help in gaining some primary knowledge
of R. The integration of R with Hadoop is still in progress.