DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetup talk.
1. An e-learning platform for Data Analysis based on R
Jonathan Cornelissen, Dieter De Mesmaeker, Albert Jorissen, Martijn Theuwissen
24/5/2013, RBelgium meetup
FEB, KU Leuven
Welcome!
2. 1. Motivation: Why e-learning with and for R?
2. Learner experience
3. Technical overview
4. Course creators experience on DataMind
5. Submission Correctness Tests (examples)
6. Questions and answers?
3. Why e-learning with and for R?
Need for scalable tools to learn R and Data Analysis…
4. Because of exponentially growing R user base
More than 2 million R users, growing at 40-60% yearly
Source: http://r4stats.com/articles/popularity/ and http://prezi.com/s1qrgfm9ko4i/the-r-ecosystem/
5. Keyword | Competition | Global monthly searches
r tutorial | 0 | 6600
introduction to r | 0 | 1600
online statistics course | 0.98 | 1600
ggplot2 tutorial | 0 | 880
statistics course | 0.85 | 880
an introduction to r | 0.01 | 880
r book | 0.06 | 590
learning statistics | 0.38 | 590
r tutorials | 0 | 590
r introduction | 0.01 | 480
statistics courses | 0.84 | 480
statistics introduction | 0.1 | 480
online statistics courses | 0.99 | 320
r course | 0.04 | 260
r training | 0.17 | 260
free online statistics course | 0.56 | 260
statistics training | 0.62 | 210
online statistics class | 0.98 | 170
statistics class online | 0.98 | 140
data analysis tutorial | 0.5 | 110
Analysis of r-project.org Analysis of Google keywords
Compare to:
SAS tutorial: 4400
Eviews tutorial: 390
Stata tutorial: 1900
Matlab tutorial: 22200
Hadoop tutorial: 12100
Source: Analysis based on http://cran.r-project.org/report_cran.html
Source: Analysis based on http://adwords.google.com/select/keywordtoolexternal
That needs to learn the basics and the specifics of R
• Number of downloads per month for:
  • Introduction to R pdfs: 140.000
  • Summary pdfs: 50.000
• Some of the “top” packages (reliability/stability of numbers below?):
kernlab.pdf 349,780
party.pdf 167,396
igraph.pdf 59,969
VennDiagram.pdf 30,889
mclust.pdf 19,347
KnitR.pdf 10,697
twitteR.pdf 7,507
randomForest.pdf 6,824
ggplot2 5,924
raster.pdf 5,326
6. Because of the exponentially growing functionality
6,275 R packages at all major repositories, 4,315 of which were at CRAN
Across a broad spectrum of domains: Financial engineering, biostatistics, data mining, …
Source: http://r4stats.com/articles/popularity/
8. Why e-learning with and for R?
Learners: Students, Professionals, Researchers, Employees
• Great books, tutorials, … on R
• But coding is learned by doing
• No online learning interface for R
• Documentation made by experts for experts, not for beginners or intermediate users
9. Why e-learning with and for R?
Learners: Students, Professionals, Researchers, Employees
• Great books, tutorials, … on R
• But coding is learned by doing
• No online learning interface for R
• Documentation made by experts for experts, not for beginners or intermediate users
Teachers: Data Analysis Professors, Consultants, Researchers, Book authors
• Often give the same or similar feedback to students in exercise sessions
• Manually correct assignments
• Static content
• Hard to get feedback
10. Two pillars of learning experience on DataMind
Interactive training: Learning by doing
Gamification: In a compelling way
11. Benefits for students of learning R online
1. Everything in one place: Assignments, sample code, R-console, …
2. Lowering the barrier: Start right away with R, no installation or version problems, since R runs in the background on our servers
3. Automated correction and feedback through Submission Correctness Tests (SCT)
4. More fun through gamification of the learning process
13. Exercises versus Challenges
Challenges:
1. Read challenge
2. Type code to solve the challenge
3. Get result on certain metric
4. Get ranked on the leaderboard
5. Possibility to improve your code
6. Learn from others’ solutions
Exercises:
1. Read exercise description
2. Read instructions
3. Type code to solve the exercise
4. Get personalized feedback on the correctness of your solution
• For example (challenges):
  • Forecast R usage in next month. Metric = accuracy of forecast
  • Find most efficient way to calculate certain parameter of a model. Metric = time to compute
  • …
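As a rough illustration of how a challenge could be scored and ranked, here is a minimal R sketch. The student names, forecasts, actual value, and the choice of absolute error as the accuracy metric are all made up for the example; the slides do not say how DataMind computes its leaderboard.

```r
# Hypothetical data: each student forecasts next month's value of some series.
actual <- 105
forecasts <- c(alice = 100, bob = 104, carol = 120)

# Assumed metric: absolute forecast error (smaller is better).
abs_error <- abs(forecasts - actual)

# Leaderboard: rank students by error, best first.
leaderboard <- data.frame(
  student = names(abs_error),
  abs_error = unname(abs_error),
  rank = unname(rank(abs_error, ties.method = "min"))
)
leaderboard <- leaderboard[order(leaderboard$rank), ]
print(leaderboard)
```

A "time to compute" metric could be ranked the same way, for instance by replacing abs_error with elapsed times measured by system.time().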
15. DataMind leverages state-of-the-art open-source frameworks in the cloud
R: Open-source statistical language
• Scaling
• Automated
• Affordable
16. DataMind leverages state-of-the-art open-source frameworks in the cloud
• Scalable
• Plug & Play
• Easy
Rserve
Ruby on Rails: High productivity web application framework
Node.js: Platform for real-time scalable network applications
R: Open-source statistical language
17. DataMind leverages state-of-the-art open-source frameworks in the cloud
WebSockets
AJAX requests
Rserve
Ruby on Rails: High productivity web application framework
Node.js: Platform for real-time scalable network applications
RESTful API
R: Open-source statistical language
Angular.js: MVC JavaScript framework for single-page applications, maintained by Google
18. Rserve: Communication with R
• Package of Simon Urbanek
• Manages sessions and workspaces
• Binary communication
• Emulate console with capture.output()
• Detect incomplete statements with parse()
• Catch and print errors
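The last three bullets can be sketched in plain R, without Rserve itself. The helper eval_console() below is hypothetical and is not taken from the DataMind code base; it only shows how parse(), capture.output() and tryCatch() might be combined. It assumes an English locale, because it recognises an unfinished statement by the parser's "unexpected end of input" message.

```r
# Hypothetical helper (not part of Rserve): evaluate submitted code like a console would.
eval_console <- function(code, envir = new.env(parent = globalenv())) {
  # Step 1: parse only. An unfinished statement fails here, before anything is run.
  parsed <- tryCatch(parse(text = code), error = function(e) e)
  if (inherits(parsed, "error")) {
    msg <- conditionMessage(parsed)
    # Assumption: English locale, where incomplete input yields this message.
    incomplete <- grepl("unexpected end of input", msg, fixed = TRUE)
    return(list(status = if (incomplete) "incomplete" else "syntax_error",
                output = msg))
  }
  # Step 2: evaluate each expression and capture what a console would print.
  err <- NULL
  out <- capture.output(invisible(tryCatch(
    for (expr in parsed) {
      res <- withVisible(eval(expr, envir))
      if (res$visible) print(res$value)
    },
    error = function(e) err <<- conditionMessage(e)
  )))
  # Step 3: report run-time errors as text instead of stopping the server process.
  if (!is.null(err)) {
    return(list(status = "error", output = c(out, paste("Error:", err))))
  }
  list(status = "ok", output = out)
}

eval_console("x <- 1:3\nsum(x)")$output
eval_console("x <- (1 +")$status
eval_console("stop('boom')")$output
```

Parsing before evaluating means a half-typed statement can be answered with a continuation prompt rather than an error, which is how an interactive console behaves.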
19. RAppArmor: Security
• Evaluation of external code → Huge security risk
• Solution:
  • Limited access to OS
  • RAppArmor
    • Package of Jeroen Ooms
    • R-interface to OS Security
    • Limit CPU, Memory, Spawned processes
21. Benefits for course creation
1. Save Time!
   1. Automated correction of student exercises
   2. Efficient way to get feedback from course takers
   3. Scalable distribution of course content
2. Visibility for your package / courses
3. Insights in your course
4. Per student tracking
   1. Number of attempts per exercise
   2. Use of “hint” and “solution”
   3. Time to complete per exercise
5. Possibility to use courses/exercises from other creators
22. How to create courses
We want your feedback!
1. Write the Assignment
23. How to create courses
We want your feedback!
2. Provide instructions to student
24. How to create courses
We want your feedback!
3. Provide sample code to help student getting started
25. How to create courses
We want your feedback!
4. Pre-exercise code is run in the background to pre-load a dataset, graphs, etc.
26. How to create courses
We want your feedback!
5. Provide sample solution
27. How to create courses
We want your feedback!
6. Write Submission Correctness Test written in R that checks the input of the student and returns feedback
29. Submission Correctness Tests (SCT)
A Submission Correctness Test checks the input from a student and returns (i) whether the student’s input was correct and (ii) feedback to the student.
• These tests are written in R
• Should be easy for a course creator -> started developing an R package
DataMind package to aid course creators to write simple tests*
*https://github.com/jonathancornelissen/DM
"Mistakes are not errors but partially correct solutions with underlying logic."
30. Simple Submission Correctness Tests (SCT)
1. Assignment to student: x should be 5
2. Student types:
x <- 4
3. Submission Correctness Test:
if (x == 5) {
  DM.result <- list(TRUE, "Well done, you genius!")
} else {
  DM.result <- list(FALSE, "Please assign 5 to x")
}
4. Output to student:
"Please assign 5 to x"
31. Simple Submission Correctness Tests (SCT)
1. Assignment to student: x should be 5
2. Student types:
x <- 5
3. Submission Correctness Test:
if (x == 5) {
  DM.result <- list(TRUE, "Well done, you genius!")
} else {
  DM.result <- list(FALSE, "Please assign 5 to x")
}
4. Output to student:
"Well done, you genius!"
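To make the two cases above concrete, the following sketch runs the same SCT against both submissions. The grade() function and the idea of evaluating student code and SCT in one fresh environment are assumptions for illustration; the slides do not show how the platform actually wires these together.

```r
# The SCT from the slides, kept as text so it can be run after any submission.
sct <- '
if (x == 5) {
  DM.result <- list(TRUE, "Well done, you genius!")
} else {
  DM.result <- list(FALSE, "Please assign 5 to x")
}'

# Hypothetical grader: run the student code, then the SCT, in one fresh workspace.
grade <- function(student_code, sct_code) {
  workspace <- new.env()
  eval(parse(text = student_code), envir = workspace)
  eval(parse(text = sct_code), envir = workspace)
  get("DM.result", envir = workspace)
}

wrong <- grade("x <- 4", sct)
right <- grade("x <- 5", sct)
print(wrong[[2]])
print(right[[2]])
```

Using a new environment per submission keeps one student's variables from leaking into the next check.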
32. Automated exercise correction with SCT
INPUT
• Everything in the student’s workspace
• DM.user.code: all code written by student
• DM.console.output: everything printed to user console
• DM.errors: errors generated when running student's code
Assignment to the student: Print a matrix with 3 rows containing the numbers 1 up to 9
If the student does this correctly, then DM.console.output contains:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
33. Automated exercise correction with SCT
INPUT
• Everything in the student’s workspace
• DM.user.code: all code written by student
• DM.console.output: everything printed to user console
• DM.errors: errors generated when running student's code
Submission Correctness Test written by course creator (potentially using DM package)
Assignment to the student: Print a matrix with 3 rows containing the numbers 1 up to 9
If the student does this correctly, then DM.console.output contains:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
DM.result <- DM.outputContains("matrix(1:9, byrow=TRUE, nrow=3)")
34. Automated exercise correction with SCT
INPUT
• Everything in the student’s workspace
• DM.user.code: all code written by student
• DM.console.output: everything printed to user console
• DM.errors: errors generated when running student's code
Submission Correctness Test written by course creator (potentially using DM package)
OUTPUT
• Assigned to variable DM.result
• List with two elements
  1. TRUE / FALSE
  2. Message to provide to student with feedback
Assignment to the student: Print a matrix with 3 rows containing the numbers 1 up to 9
If the student does this correctly, then DM.console.output contains:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
DM.result <- DM.outputContains("matrix(1:9, byrow=TRUE, nrow=3)")
DM.result is shown to student
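One way such an output check might work is sketched below. The function output_contains() is a hypothetical stand-in and is not the DM package's DM.outputContains(); its feedback messages are invented. It evaluates the expected expression, captures what printing it produces, and looks for that text in the captured console output, ignoring whitespace differences.

```r
# Hypothetical stand-in for an output check; NOT the DM package implementation.
output_contains <- function(expected_code, console_output) {
  # What would the console show if the expected expression were printed?
  expected <- capture.output(print(eval(parse(text = expected_code))))
  # Collapse all runs of whitespace so alignment differences do not matter.
  squash <- function(lines) gsub("\\s+", " ", trimws(paste(lines, collapse = "\n")))
  found <- grepl(squash(expected), squash(console_output), fixed = TRUE)
  if (found) {
    list(TRUE, "Well done!")
  } else {
    list(FALSE, "The expected output was not printed.")
  }
}

# Simulated INPUT: what the student's code printed to the console.
DM.console.output <- capture.output(print(matrix(1:9, byrow = TRUE, nrow = 3)))

# OUTPUT: a list with TRUE/FALSE and a feedback message.
DM.result <- output_contains("matrix(1:9, byrow=TRUE, nrow=3)", DM.console.output)
print(DM.result)
```

Comparing printed output rather than variable names lets the student choose any variable name or construction, as long as the right matrix appears on the console.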
35. SCTs enable a wide variety of checks
• Has the student estimated a certain model correctly?
• Generated a transformed time series that fulfills certain conditions?
• Generated a certain type of graph?
• Forecasted a metric of interest within certain bounds?
• …
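As an illustration of the first bullet, the sketch below checks whether a student fitted a particular linear model. Everything here is an assumption made for the example: the exercise (regress dist on speed in R's built-in cars data), the variable name fit, the check_model() helper, and the feedback texts.

```r
# Hypothetical SCT: did the student estimate dist ~ speed on the built-in cars data?
check_model <- function(workspace, name = "fit", tol = 1e-8) {
  if (!exists(name, envir = workspace, inherits = FALSE)) {
    return(list(FALSE, sprintf("Please store your model in '%s'.", name)))
  }
  student <- get(name, envir = workspace)
  if (!inherits(student, "lm")) {
    return(list(FALSE, "Use lm() to estimate the model."))
  }
  # Compare coefficient names and values against a reference fit.
  reference <- lm(dist ~ speed, data = cars)
  same <- isTRUE(all.equal(coef(student), coef(reference), tolerance = tol))
  if (same) {
    list(TRUE, "Correct model!")
  } else {
    list(FALSE, "The coefficients differ from the expected model.")
  }
}

# Simulated student workspace with a correct submission.
ws <- new.env()
evalq(fit <- lm(dist ~ speed, data = cars), ws)
DM.result <- check_model(ws)
print(DM.result)
```

Checking coefficients instead of the literal code accepts any equivalent way of writing the model call.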
36. Albert Jorissen
Martijn Theuwissen
Dieter De Mesmaeker
Jonathan Cornelissen
Want to help us build a community for learning and teaching R online?
Contact us!!
Jonathan@datamind.org
Dieter@datamind.org
Albert@datamind.org
Martijn@datamind.org
38. Survey on R and education to verify interest of community
Filled out by 286 academics, professionals and students from around the globe.
Majority of respondents interested in free interactive courses
Most package authors willing to create free interactive tutorials
Full data set of the survey and discussion of results at www.datamind.org/survey
Editor's notes
University project, early stage, in heavy development, we are looking forward to your feedback....
For point 2: we do promotion + we recommend good lessons to course takers. For point 3: What type of users?