2. Schedule
8.30 Intro to the tutorial
Linked data and its potential in learning analytics scenarios
Basics of manipulating linked data
10.30 Coffee break
11.00 Using Linked Data in Analytics Tools
Evaluation of the Linked Data applications
12.30 Lunch
13.30 Introduction to the LAK Data Challenge
Presentations from the LAK Data Challenge participants
15.30 Tea break
16.30 Current state of Linked Data in Learning Analytics
Results of the challenge
Wrap up
17.30 Finished
3. Using Linked Data?
Tools for Analytics?
We cannot, of course, cover every possible use of Linked Data, nor every
tool that can be used to process data for analytics.
We therefore focus here on three basic tools/scenarios (to show that it is
simple):
1. Load the results of a SPARQL select query about UK-based schools into Open
Refine, and use it for some facet-based exploration of the data (tool:
https://github.com/OpenRefine)
2. Send a SPARQL select query for information on courses at the Open
University from R, and display charts showing this information (tool:
http://www.r-project.org/)
3. Build a visualisation of the network of co-authors from the University of
Southampton in Gephi, using a SPARQL construct query (tool:
https://gephi.org/)
4. Open Refine
/* formerly Google Refine */
A powerful data cleaning / manipulation / exploration tool.
Originally developed at Metaweb (the company behind Freebase), then taken over by Google.
Can import from and export to many different formats (e.g. import CSV, export RDF
through an extension)
Import from SPARQL?
https://github.com/OpenRefine
5. SPARQL proxy to the rescue!
http://data-gov.tw.rpi.edu/ws/sparqlproxy.php
11. Going further
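[Diagram: results from a SPARQL endpoint pass through the SPARQL proxy, which returns CSV for use in Open Refine, Excel, or some other tool. Labels: SPARQL endpoint, SPARQL results, SPARQL proxy, CSV, RDF, Open Refine, Excel, some other tool.]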
12. R
Free statistical computing environment and language
Very popular for visualisation, data science, etc.
http://www.r-project.org/
And there is a SPARQL import library!
13. R SPARQL Library
Getting Started: https://code.google.com/p/r-sparql/
In R console:
install.packages("rJava")
install.packages("SPARQL")
Or from the R menu: Packages > Install Package(s) > Choose a mirror > Choose the package
Then load the library:
library(SPARQL)
Note: there might be errors about other required packages (e.g. XML). If that happens,
install them, e.g. with install.packages("XML")
Run a query (in R console):
results <- SPARQL("endpoint", "query")
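As a minimal sketch of the "run query" step, long queries can be assembled from short lines with paste() and passed to SPARQL() through a small wrapper. The helper name run_select and the toy query are assumptions made for illustration; they are not part of the SPARQL package or of the original slides. The actual call is left commented out because it needs network access.

```r
# Hypothetical helper (not part of the SPARQL package): run a select query
# and return only the data frame of results.
run_select <- function(endpoint, query) {
  if (!requireNamespace("SPARQL", quietly = TRUE)) {
    stop("Package 'SPARQL' is not installed")
  }
  response <- SPARQL::SPARQL(endpoint, query)
  # SPARQL() returns a list; the rows are in its 'results' element
  response$results
}

# Build the query from short lines; the text of each line stays intact,
# so no variable name gets split by line wrapping.
query <- paste(
  "select ?s ?p ?o",
  "where { ?s ?p ?o }",
  "limit 5",
  sep = "\n"
)

# Needs network access, so it is left commented out here:
# rows <- run_select("http://data.open.ac.uk/query", query)
# print(rows)
```

Keeping the query in its own variable also makes it easy to print and check before sending it to the endpoint.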
14. Let’s try
Query:
select distinct ?subjectlabel ( count(distinct ?course) as ?nbcourse )
( avg(?creds) as ?avgcredits) ( avg(?price) as ?avgprice) where {
?course a <http://courseware.rkbexplorer.com/ontologies/courseware#Course>.
?course <http://purl.org/dc/terms/subject> ?subject.
<http://data.open.ac.uk/topic>
<http://www.w3.org/2004/02/skos/core#hasTopConcept> ?subject.
?subject <http://www.w3.org/2000/01/rdf-schema#label> ?subjectlabel.
?course <http://data.open.ac.uk/saou/ontology#eu-number-of-credits> ?creds.
?course <http://purl.org/net/mlo/specifies> ?presentation.
?offer <http://purl.org/goodrelations/v1#includes> ?course.
?offer <http://purl.org/goodrelations/v1#availableAtOrFrom>
<http://sws.geonames.org/2802361/>.
?offer <http://purl.org/goodrelations/v1#hasPriceSpecification> ?pricespec.
?pricespec <http://purl.org/goodrelations/v1#hasCurrencyValue> ?price.
} group by ?subjectlabel
(number of courses, average number of credits, and average price of courses per top-level topic of the
Open University)
Endpoint: http://data.open.ac.uk/query
15. In R
> library("SPARQL")
> endpoint <- "http://data.open.ac.uk/query"
> query <- "select distinct ?subjectlabel ( count(distinct ?course) as ?nbcourse )
( avg(?creds) as ?avgcredits) ( avg(?price) as ?avgprice) where {
?course a <http://courseware.rkbexplorer.com/ontologies/courseware#Course>.
?course <http://purl.org/dc/terms/subject> ?subject.
<http://data.open.ac.uk/topic>
<http://www.w3.org/2004/02/skos/core#hasTopConcept> ?subject.
?subject <http://www.w3.org/2000/01/rdf-schema#label> ?subjectlabel.
?course <http://data.open.ac.uk/saou/ontology#eu-number-of-credits> ?creds.
?course <http://purl.org/net/mlo/specifies> ?presentation.
?offer <http://purl.org/goodrelations/v1#includes> ?course.
?offer <http://purl.org/goodrelations/v1#availableAtOrFrom>
<http://sws.geonames.org/2802361/>.
?offer <http://purl.org/goodrelations/v1#hasPriceSpecification> ?pricespec.
?pricespec <http://purl.org/goodrelations/v1#hasCurrencyValue> ?price.
} group by ?subjectlabel"
> results <- SPARQL(endpoint, query)
> print(results)
16. Results
$results
subjectlabel nbcourse avgcredits avgprice
1 "Mathematics and Statistics"@en 32 14.85714 1255.857
2 "Education"@en 26 28.26087 2149.681
3 "Business and Management"@en 40 14.11017 1506.797
4 "Environment, Development and International Studies"@en 41 24.20000 2197.533
5 "Childhood and Youth"@en 27 28.02632 2235.342
6 "Law"@en 17 16.95652 1625.652
7 "Health and Social Care"@en 45 22.05263 1852.579
8 "Science"@en 81 13.33333 1115.463
9 "Engineering and Technology"@en 44 14.67949 1598.192
10 "Computing and ICT"@en 44 14.04762 1330.488
11 "Languages"@en 24 20.20408 1712.143
12 "Social Sciences"@en 25 21.04478 1779.522
13 "Arts and Humanities"@en 51 27.26667 2302.173
14 "Psychology"@en 13 17.94643 1519.393
$namespaces
NULL
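The labels in the results carry their quotes and language tag (for example "Law"@en), which clutters chart labels. A minimal sketch of one way to clean them, assuming the labels always have the quoted-string-plus-tag form shown above; the helper name strip_lang_tag is hypothetical, and the small data frame only copies three rows from the results so that the example runs without the endpoint.

```r
# Three rows copied from the results above, so the example runs offline
restable <- data.frame(
  subjectlabel = c("\"Mathematics and Statistics\"@en",
                   "\"Education\"@en",
                   "\"Law\"@en"),
  nbcourse = c(32, 26, 17),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop the surrounding quotes and the trailing
# language tag; labels that do not match the pattern are left unchanged.
strip_lang_tag <- function(labels) {
  sub("^\"(.*)\"@[A-Za-z-]+$", "\\1", labels)
}

restable$subjectlabel <- strip_lang_tag(restable$subjectlabel)
print(restable$subjectlabel)
```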
17. Draw some charts
> restable <- results[[1]]
> pie(restable$nbcourse)
> pie(restable$nbcourse, restable$subjectlabel,
col=rainbow(length(restable$nbcourse)))
Distribution of the number of courses across topics
18. Draw some charts
> barplot(sort(restable$avgprice),
col=rainbow(length(restable$avgprice)))
Distribution of the average price of a course across topics
19. Bar chart
> pricepercredit <- restable$avgprice / restable$avgcredits
> barplot(pricepercredit, horiz=TRUE, legend=restable$subjectlabel,
col=rainbow(length(pricepercredit)))
Price by credit for different high-level topics
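Calling sort() on a single column returns the values alone, so the bars can no longer be matched to their topics. A minimal sketch of an alternative, assuming the goal is a sorted chart that keeps its labels: reorder the whole table with order() and pass the labels through names.arg. The three rows are copied from the results shown earlier (with the language tags dropped) so that the example runs offline; the column name pricepercredit and the variable ordered are choices made for this sketch.

```r
# Three rows copied from the results shown earlier (language tags dropped)
restable <- data.frame(
  subjectlabel = c("Science", "Education", "Law"),
  avgcredits = c(13.33333, 28.26087, 16.95652),
  avgprice = c(1115.463, 2149.681, 1625.652),
  stringsAsFactors = FALSE
)

# Price per credit, kept as a column so it stays attached to its topic
restable$pricepercredit <- restable$avgprice / restable$avgcredits

# order() returns row indices, so whole rows (labels included) are reordered
ordered <- restable[order(restable$pricepercredit), ]
print(ordered[, c("subjectlabel", "pricepercredit")])

# Draw the chart only in an interactive session (e.g. the R console)
if (interactive()) {
  barplot(ordered$pricepercredit, names.arg = ordered$subjectlabel,
          horiz = TRUE, las = 1, col = rainbow(nrow(ordered)))
}
```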
LinkedUp – Author Name 9. April 2013 19
20. Gephi
Network visualisation and analysis tool.
Very popular for its powerful rendering engine, its ability to deal with
reasonably large networks, and the analysis tools it provides.
https://gephi.org/
And it has a “Semantic Web Import” plugin to import networks from SPARQL.
21. Install the plugin
In Gephi:
Menu: Tools > Plugins > Available plugins > choose “Semantic Web Import”
23. Needs a construct query: the resulting graph will be the one visualised
24. Let’s try!
Endpoint: http://sparql.data.southampton.ac.uk/
Query (the <http://gephi.org/label> properties are special properties that set the labels of nodes):
construct {
?author1 <http://myonto.com/coauthor> ?author2.
?author1 <http://gephi.org/label> ?name1.
?author2 <http://gephi.org/label> ?name2.
}
where {
?pub <http://purl.org/dc/terms/creator> ?author1.
?pub <http://purl.org/dc/terms/creator> ?author2.
?author1 <http://xmlns.com/foaf/0.1/name> ?name1.
?author2 <http://xmlns.com/foaf/0.1/name> ?name2.
filter ( ?author1 != ?author2 )
}
limit 15000