Talk given at R Rosetta Stone meetup in NYC on 1/7/2010 about MATLAB and R.
Co-authored with Harlan Harris.
Video of the talk available at:
http://www.vcasmo.com/video/drewconway/7211
Matlab/R Dictionary
1. MATLAB/R Dictionary
R meetup NYC, January 7, 2010
Harlan Harris, harlan@harris.name, @HarlanH
Marck Vaisman, marck@vaisman.us, @wahalulu
MATLAB and the MATLAB logo are registered trademarks of The Mathworks.
2. About MATLAB
What is MATLAB
• Commercial numerical programming language, simulation and visualization
• One million users (engineers, scientists, academics)
• MATrix LABoratory – specializes in matrix operations
• Mathworks - base & add-ons
• Open-source Octave project
MATLAB History
• Developed by Cleve Moler (Math/CS Prof at UNM) in the 1970's as a higher-level numerical programming language (vs. Fortran LINPACK)
• Adopted by engineers for signal processing, control modeling
• Multipurpose programming language
3. Notes
• Today's focus: Compare MATLAB & R for data analysis, contrast as programming languages
• MATLAB is Base plus many toolboxes
– Base includes: descriptive stats, covariance and correlation, linear and nonlinear regression
– Statistics toolbox adds: dataset and category (like data.frames and factors) arrays, more visualizations, distributions, ANOVA, multivariate regression, hypothesis tests
4. MATLAB -> R
• Interactive programming: Scripts and Read-Evaluate-Print Loop
• Similar representations of data
– Both use vectors/arrays as the primary data structures
• MATLAB is based on 2-D matrices; R is based on 1-D vectors
– Both prefer vectorized functions to for loops
– Variables are declared dynamically
• Can do most MATLAB functionality in R; can do most R functionality in MATLAB.
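The points about 1-D vectors and vectorized functions can be sketched in a few lines of R. This is only an illustration; the variable names (x, squares_loop, squares_vec, m) are made up for this example and do not come from the talk.

```r
# Illustrative names only; not from the original slides.
x <- c(1, 2, 3, 4)            # variables are created on first assignment

# Loop version: square each element one at a time.
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorized version: the operator is applied to the whole vector at once.
squares_vec <- x^2

# An R matrix is a 1-D vector carrying a dim attribute.
m <- x
dim(m) <- c(2, 2)             # elements fill the matrix column by column

print(identical(squares_loop, squares_vec))
print(m)
```

Both versions give the same result; the vectorized form is the one both languages encourage.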
5. The basics: vectors, matrices and indexing
Task: Create a row vector
  MATLAB: v = [1 2 3 4]
  R: v <- c(1,2,3,4)
Task: Create a column vector
  MATLAB: v = [1;2;3;4] or v = [1 2 3 4]'
  R: v <- c(1,2,3,4)
  Note: R does not distinguish between row and column vectors
Task: Enter a matrix A
  MATLAB: A = [1 2 3; 4 5 6]
  R: Enter values by row: A <- matrix(c(1,2,3,4,5,6), nrow=2, byrow=TRUE)
  R: Enter values by column: A <- matrix(c(1,4,2,5,3,6), nrow=2)
Task: Access third element of vector v
  MATLAB: v(3)
  R: v[3] or v[[3]]
Task: Access element of matrix A
  MATLAB: A(2,3)
  R: A[2,3]
Task: "Glue" two matrices a1 and a2, same number of rows, side by side
  MATLAB: A = [a1 a2]
  R: A <- cbind(a1,a2)
Task: "Stack" two matrices a1 and a2, same number of columns
  MATLAB: A = [a1;a2]
  R: A <- rbind(a1,a2)
Task: Reshape* matrix A, making it an m x n matrix with elements taken columnwise from A
  MATLAB: A = reshape(A,m,n)
  R: dim(A) <- c(m,n)
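The R column of this slide can be tried end to end with the sketch below. The names by_col, third, elem, side_by_side, stacked and reshaped are illustrative additions; the calls themselves are the ones listed on the slide.

```r
# Illustrative variable names; the functions are those named on the slide.
v <- c(1, 2, 3, 4)                               # no row/column orientation

A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)
by_col <- matrix(c(1, 4, 2, 5, 3, 6), nrow = 2)  # same matrix, entered by column

third <- v[3]                 # third element of the vector
elem <- A[2, 3]               # row 2, column 3

side_by_side <- cbind(A, A)   # "glue": 2 x 6
stacked <- rbind(A, A)        # "stack": 4 x 3

reshaped <- A
dim(reshaped) <- c(3, 2)      # elements are taken columnwise from A

print(identical(A, by_col))
print(reshaped)
```

Because R stores matrices column by column, assigning to dim() reshapes in the same order as MATLAB's reshape.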
6. Operators
Task: Assignment
  MATLAB: =
  R: <- or =
Task: Whole matrix operations
  Multiplication: MATLAB: A*B    R: A %*% B
  Square the matrix: MATLAB: A^2    R: A %*% A
  Raise to power k: MATLAB: A^k    R: A %*% A %*% A …
Task: Element-by-element operations
  MATLAB: A.*B    R: A*B
  MATLAB: A./B    R: A/B
  MATLAB: A.^k    R: A^k
Task: Compute A^-1 B
  MATLAB: A\B
  R: solve(A, B)
Task: Sums
  Columns of matrix: MATLAB: sum(A)    R: colSums(A)
  Rows of matrix: MATLAB: sum(A,2)    R: rowSums(A)
Task: Logical operators (element-by-element on vectors/matrices)
  MATLAB: a < b, a > b, a <= b, a >= b, a == b, a ~= b
  R: a < b, a > b, a <= b, a >= b, a == b, a != b
  AND: MATLAB: a && b    R: a && b (short-circuit), a & b (element-wise)
  OR: MATLAB: a || b    R: a || b, a | b
  XOR: MATLAB: xor(a,b)    R: xor(a,b)
  NOT: MATLAB: ~a    R: !a
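As a rough check of the R operators above, the sketch below multiplies, solves and sums two small matrices. The matrices and the helper mat_pow are hypothetical and were chosen only for illustration. Base R has no matrix-power operator, so the helper simply repeats %*%.

```r
# Small example matrices (made up for illustration).
A <- matrix(c(2, 1, 1, 3), nrow = 2)
B <- matrix(c(1, 0, 2, 1), nrow = 2)

prod_mat <- A %*% B           # whole-matrix multiplication
prod_elem <- A * B            # element-by-element multiplication

# A^-1 B: solve the system A x = B rather than forming the inverse.
x <- solve(A, B)

# Hypothetical helper: raise a square matrix to a non-negative integer power.
mat_pow <- function(M, k) {
  result <- diag(nrow(M))     # identity matrix of matching size
  for (i in seq_len(k)) {
    result <- result %*% M
  }
  result
}
A3 <- mat_pow(A, 3)

col_totals <- colSums(A)
row_totals <- rowSums(A)

print(all.equal(A %*% x, B))
```

Using solve(A, B) avoids computing the inverse explicitly, which is also what MATLAB's backslash operator does.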
7. Working with data structures
Task: Build a structure v of length n, capable of containing different data types in different elements. MATLAB: cell array. R: list.
  MATLAB: v = cell(1,n)
    In general, cell(m,n) makes an m × n cell array. Then you can do e.g.:
    v{1} = 12
    v{2} = 'hi there'
    v{3} = rand(3)
  R: v <- vector('list', n)
    Then you can do e.g.:
    v[[1]] <- 12
    v[[2]] <- 'hi there'
    v[[3]] <- matrix(runif(9), 3)
Task: Create a matrix-like object with different named columns. MATLAB: struct array. R: data.frame.
  MATLAB:
    avals = 2*ones(1,6);
    yyvals = 6:-1:1;
    v = [1 5 3 2 3 7];
    d = struct('a', avals, 'yy', yyvals, 'fac', v);
  R:
    v <- c(1,5,3,2,3,7)
    d <- data.frame(cbind(a=2, yy=6:1), v)
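A short R sketch of both structures follows. It departs from the slide in one place: the data frame is built by passing named columns directly to data.frame() rather than going through cbind(), and the column name fac is borrowed from the MATLAB side. The name element_classes is illustrative.

```r
# A list can hold a different type in each element.
n <- 3
v <- vector("list", n)
v[[1]] <- 12
v[[2]] <- "hi there"
v[[3]] <- matrix(runif(9), 3)

# First class of each element, to show the mix of types.
element_classes <- vapply(v, function(e) class(e)[1], character(1))

# A data frame with named columns; the scalar 2 is recycled to six rows.
d <- data.frame(a = 2, yy = 6:1, fac = c(1, 5, 3, 2, 3, 7))

print(element_classes)
print(d)
```

Columns of a data frame can then be reached by name, for example d$yy.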
8. Conditionals, control structures, loops
Task: for loops over values in vector v
  MATLAB:
    for i=v
      command1
      command2
    end
  R: If only one command:
    for (i in v)
      command
  R: If multiple commands:
    for (i in v) {
      command1
      command2
    }
Task: If/else statement
  MATLAB:
    if cond
      command1
      command2
    else
      command3
      command4
    end
    MATLAB also has the elseif statement.
  R:
    if (cond) {
      command1
      command2
    } else {
      command3
      command4
    }
    R uses chained "else if" statements.
    ifelse() function:
    > print(ifelse(c(T,F), 2, 3))
    [1] 2 3
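The R forms above can be combined into one runnable sketch. The function describe and the variables labels and signs are hypothetical names used only for this illustration.

```r
# Hypothetical helper showing a chained "else if".
describe <- function(x) {
  if (x < 0) {
    "negative"
  } else if (x == 0) {
    "zero"
  } else {
    "positive"
  }
}

# for loop over the values of a vector.
v <- c(-2, 0, 5)
labels <- character(0)
for (i in v) {
  labels <- c(labels, describe(i))
}

# ifelse() is the vectorized counterpart: one test per element, no loop.
signs <- ifelse(v >= 0, "non-negative", "negative")

print(labels)
print(signs)
```

Note that if () tests a single condition, while ifelse() works on a whole vector of conditions.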
9. Help!
Task: Get help on a function
  MATLAB: help fminsearch
  R: help(pmin) or ?pmin
Task: Search the help for a word
  MATLAB: lookfor inverse
  R: ??inverse
Task: Describe a variable
  MATLAB: class(a)
  R: class(a), str(a)
Task: Show variables in environment
  MATLAB: who
  R: ls()
Task: Underlying type of variable
  MATLAB: whos('a')
  R: typeof(a)
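The difference between class() and typeof() in R is easiest to see on a data frame, as in this small sketch. The variables a and df are illustrative.

```r
# Illustrative variables; not from the original slides.
a <- c(1.5, 2.5)
df <- data.frame(x = 1:3)

a_class <- class(a)      # how R treats the object: "numeric"
a_type <- typeof(a)      # how it is stored: "double"

df_class <- class(df)    # "data.frame"
df_type <- typeof(df)    # a data frame is stored as a list

str(df)                  # compact description of the structure
```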
10. Example: k-means clustering of Fisher Iris data
Fisher Iris Dataset
sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
…
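The code for this example is not part of the transcript. A minimal R sketch, assuming the built-in iris data set (whose column names differ from the CSV header above) and the kmeans() function from the stats package, might look like this. The choice of three clusters, the seed and nstart = 10 are assumptions made for the illustration.

```r
# Assumption: R's built-in copy of the Fisher iris data (datasets package).
measurements <- datasets::iris[, 1:4]   # the four numeric columns
species <- datasets::iris$Species

set.seed(42)                            # arbitrary seed, for repeatability
fit <- kmeans(measurements, centers = 3, nstart = 10)

# Cross-tabulate cluster assignments against the known species.
print(table(cluster = fit$cluster, species = species))
```

The cluster numbers are arbitrary labels, so the table is read by looking at which species dominates each row.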
11. MATLAB and R as programming languages
MATLAB:
• Scripting, real-time analysis
• File-based environments
• Imperative programming style
• Statically scoped
• Functions with multiple return values
• Evolving OOP system
• Can be compiled
• Large library of functions
• Professionally developed, costs money
• Can embed (in) many other languages
R:
• Scripting, real-time analysis
• Files unimportant
• Functional programming style (impure)
• Dynamically scoped
• Functions with named arguments, lazy evaluation
• Multiple competing OOP systems
• Cannot be compiled
• Large library of functions
• Varying quality and support
• Can embed (in) many other languages
12. Functions
MATLAB:
  function [a, b] = minmax(z)
    % one function per .m file!
    % assign to formal return names
    a = min(z)
    b = max(z)
  end

  % if minmax.m in path
  [smallest, largest] = ...
    minmax([1 30 3])
R:
  minmax <- function(z, opt=12) {
    # functions are assigned to
    # variables
    ret <- list(min = min(z),
                max = max(z))
    ret  # last statement is
         # return value
  }

  # if minmax was created in current
  # environment
  x <- minmax(c(1, 30, 3))
  smallest <- x$min
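The R side can be extended slightly to show named arguments and lazy evaluation as well. In this sketch the na.rm parameter and the function lazy are hypothetical additions for illustration; only minmax and its list return value come from the slide.

```r
# minmax returns several values bundled in a named list.
# The na.rm argument is an illustrative addition with a default value.
minmax <- function(z, na.rm = FALSE) {
  list(min = min(z, na.rm = na.rm),
       max = max(z, na.rm = na.rm))
}

x <- minmax(c(1, 30, 3))
smallest <- x$min
largest <- x$max

# Named arguments may be given in any order.
y <- minmax(na.rm = TRUE, z = c(4, NA, 9))

# Lazy evaluation: an argument that is never used is never evaluated,
# so the stop() call below does not raise an error.
lazy <- function(used, unused) {
  used
}
result <- lazy(1, stop("this argument is never evaluated"))
```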
13. Object-Oriented Programming
MATLAB:
• Formerly: objects were defined by a directory tree, with one method per file
• As of 2008: new classdef syntax resembles other languages
R:
• S3 classes: attributes + syntax
– class(object)
– plot.lm()
• S4 classes: definitions + methods
• R.oo, proto, etc…
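The "attributes + syntax" description of S3 can be made concrete with a small sketch. The class name temperature and the generic describe are hypothetical. The mechanism is the standard one: a class attribute on the object, and methods named generic.class.

```r
# Constructor: an S3 object is ordinary data with a class attribute.
temperature <- function(celsius) {
  structure(list(celsius = celsius), class = "temperature")
}

# A generic function dispatches on the class attribute.
describe <- function(x, ...) {
  UseMethod("describe")
}

# Methods follow the generic.class naming convention, like plot.lm().
describe.temperature <- function(x, ...) {
  paste0(x$celsius, " degrees C")
}

describe.default <- function(x, ...) {
  "unknown object"
}

t1 <- temperature(21.5)
print(class(t1))
print(describe(t1))
print(describe(42))
```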
14. Other notes
• R.matlab package
• Graphics
– MATLAB has much better 3-d/interactive graphics support
– R has ggplot2 and much better statistical graphics
15. Additional Resources
• Will Dwinell, Data Mining in MATLAB
• Computerworld article on Cleve Moler
• Mathworks
• Matlabcentral
• Comparison of Data Analysis packages (http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/)
• R.matlab package
• stackoverflow
16. References used for this talk
• David Hiebeler MATLAB/R Reference document: http://www.math.umaine.edu/~hiebeler/comp/matlabR.html
• http://www.cyclismo.org/tutorial/R/index.html
• http://www.stat.berkeley.edu/~spector/R.pdf
• MATLAB documentation
• http://www.r-cookbook.com/node/23