Adding More Semantics to the Social Web
1. SIG-SWO Invited Lecture, NII
Adding more semantics to the
Social Web: structure and content
John Breslin, Leader, Unit for Social Software, Insight, NUI Galway
John Breslin, Senior Lecturer, National University of Ireland Galway
10th July 2015
Slides http://bit.ly/ジョン・ブレスリン
3. 1986: SW radio
Before the Web and satellite phones,
the world’s news was communicated
and shared by shortwave radio
• I loved it!
Stations “crowdsourced” SINPO data
collection, and “gamified” this process
by returning QSL cards
• SINPO = signal-interference-noise-propagation-overall report
4. 1990: MAP.COM
• My first “social software” program from 25 years ago
• Fusing data from user logins on a VAX/VMS mainframe with
computer locations in terminal rooms
• Terminal numbers were my unique identifiers
5. 1998: Set up a gaming forum
2000: Co-founded boards.ie from this
• Ireland’s largest discussion forum site
• 2.5 million visitors/month (~40% of Irish population)
• Irish people seeking information, or just chatting about sports, TV,
politics, health, whatever
• Spin-off site adverts.ie
• Classified ads
• Bought by Schibsted Media Group yesterday
6. 2004: Joined DERI at NUI Galway, founded
the SIOC project
• Semantically Interlinked
Online Communities
• Enables interoperability and
exchange of social content:
• Blogs, forums, wikis
• More later…
7. 2011: Co-founder of StreamGlider, Inc.
• Real-time streaming
newsreader for the iPad
• With Nova Spivack and the
late Bill McDaniel
• Supports social, multimedia,
news
• Can be used as an enterprise
dashboard or event display
8. 2013-2015: Startup ecosystem activation
• Co-founder of NUI Galway
Entrepreneurship Society
• Co-founder of Startup
Galway, Galway City
Innovation District and the
PorterShed
• Advisor to Irish startups
including AYLIEN, BirdLeaf,
BuilderEngine and Pocket
Anatomy
• Author of yearly series on
“Talented 38 Tech Women”
9. Big fan of Japanese culture!
• Set up the mangatoanime.com
discussion forum in 2001
• First anime seen in 1978: Battle of
the Planets 科学忍者隊ガッチャマン
• Fave manga: Battle Angel Alita 銃夢
• Established the first Isao Tomita
冨田勲 website in 1996
• Synthesizer musician known for
classical reworkings, film / TV works
• Interviewed Tomita in 1999
• Also like 喜多郎, 坂本龍一, YMO,
ピタゴラスイッチ, 本田
10. National University of Ireland Galway
アイルランド国立大学ゴールウェイ校
• Galway is a small city in the
West of Ireland
• NUI Galway was established
in 1845:
• One of Ireland’s seven
universities
• 105 hectares (260 acres)
• 120 links with universities
around the world
• 17,300 students:
• 12,500 undergraduates,
3,600 postgraduates, 1,200
other
• 2,541 staff:
• 1,078 academics, 1,015 admin
and support, 448 research
• 90,000 alumni in over a
hundred countries
11. Notable people and interesting
connections to NUI Galway
• Alice Perry, first female graduate engineer in the world, 1906
• Michael O’Shaughnessy, chief engineer of San Francisco who
commissioned / named the Golden Gate Bridge, graduated in 1884
• Also, two Galway men founded Menlo Park in Silicon Valley in 1854
• JRR Tolkien J・R・R・トールキン was an external examiner in 1949
• John Ryan, Macrovision inventor, studied here in the 1960s
• Current Irish President Higgins and Taoiseach Kenny are graduates
• Honorary degrees given to Nelson Mandela ネルソン・マンデラ,
Hillary Clinton ヒラリー・クリントン, Enya エンヤ
• Actor Martin Sheen マーティン・シーン studied here in 2006
• The JK Rowling J・K・ローリング (ハリー・ポッター) charity Lumos
partnered with NUI Galway last week to help orphaned children
worldwide stay with their families
12. “Electron”
[I am a lecturer in Electronic Engineering]
• The term “electron” was coined
by George Johnstone Stoney
• Professor of Natural
Philosophy QCG (NUI Galway)
from 1852-1857
• Calculated charge (1874)
• Proposed the term
“electron” (1891)
• Electronic: 1,170,000,000
Google results
• Email: 7,510,000,000 Google
results
13. Insight Centre for Data Analytics
インサイト
• Ireland’s largest multi-
institution ICT research
institute, funded by SFI
• 200 researchers
• 8 institutions
• 30 partners
• €88M in funding
Subsumes DERI (NUI Galway),
Clarity (UCD / DCU), Clique (UCD /
NUI Galway), TRIL (UCD), 4C (UCC)
14. Unit for Social Software
(USS)
• Established in 2005 as a sub-cluster of Semantic Web at DERI
• 7 PhDs in progress (including 2 industry-based PhDs, 1 part-
time PhD), plus 1 PhD and 1 MSc submitting
• 9 PhD alumni, 3 MSc alumni
• 6 postdoc alumni, 2 RA alumni, 15 visiting researcher alumni
(including Yuki Matsuoka PhD from NII)
21. A two-way street: the Semantic Web can
help the Social Web, and vice versa
• Can use the Semantic Web
to describe people, content
objects and the connections
that bind them all together
so that social sites can
interoperate via semantics
• In the other direction,
object-centered social
websites can serve as rich
social data sources for
semantic applications
“I think we could...have
both Semantic Web
technology supporting
online communities, but at
the same time also online
communities can also
support Semantic Web data
by being the sources of
people voluntarily
connecting things
together.” – Tim Berners-
Lee ティム・バーナーズ=リー
Image from tinyurl.com/highway2
22. Object-centred sociality (AKA social
objects)
• Users are connected via a common object:
• Their job, university, hobbies, interests, a date…
• “According to this theory, people don’t just connect to each
other. They connect through a shared object. […] Good
services allow people to create social objects that add
value.” – Jyri Engestrom
• Flickr or Instagram = photos
• YouTube or Vimeo = videos
• WordPress or Tumblr = posts
• etc.
23. The social objects that connect us to
others can be represented by semantics
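As a minimal sketch of what this can look like, the Python snippet below assembles a few Turtle triples that describe a forum post as a social object, connecting a person, their account, the post and its forum. The example.org URIs, the names and the helper function are hypothetical; only the vocabulary terms (SIOC, FOAF, Dublin Core) come from the published vocabularies. It does no string escaping, so it is an illustration rather than a serializer.

```python
# Namespaces of the vocabularies used; the data below is made up.
PREFIXES = {
    "sioc": "http://rdfs.org/sioc/ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def post_to_turtle(post_uri, title, account_uri, person_uri, person_name, forum_uri):
    """Return Turtle text linking a person, their account, a post and its forum.

    Hypothetical helper: literals are not escaped, so pass simple strings only.
    """
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines += [
        "",
        f"<{person_uri}> a foaf:Person ;",
        f'    foaf:name "{person_name}" .',
        "",
        f"<{account_uri}> a sioc:UserAccount ;",
        f"    sioc:account_of <{person_uri}> .",
        "",
        f"<{post_uri}> a sioc:Post ;",
        f'    dcterms:title "{title}" ;',
        f"    sioc:has_creator <{account_uri}> ;",
        f"    sioc:has_container <{forum_uri}> .",
        "",
        f"<{forum_uri}> a sioc:Forum .",
    ]
    return "\n".join(lines)


turtle = post_to_turtle(
    post_uri="http://example.org/forum/post/42",
    title="Best shortwave receivers?",
    account_uri="http://example.org/user/alice",
    person_uri="http://example.org/people/alice#me",
    person_name="Alice Example",
    forum_uri="http://example.org/forum/radio",
)
print(turtle)
```

The post is the shared object here: anyone who replies to it or tags it becomes connected to its creator through it.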
25. What is the Social Semantic Web (SSW)?
ソーシャル・セマンティック・ウェブ
26. Some SSW vocabularies
• FOAF
• SIOC
• Created at NUI Galway
• Online Presence Ontology
[OPO]
• Co-created at NUI Galway
• Semantic Cloud of Tags
[SCOT]
• Created at NUI Galway
• Meaning of a Tag [MOAT]
• Facebook OGP
• Contributions to RDF version
from NUI Galway
• schema.org
• RDF version at NUI Galway
30. Creating an ontology needs more than just
a spec page: community, evangelism, etc.
31. A range of SIOC modules were created to
extend SIOC Core while avoiding clutter
• SIOC Access (sioce)
• SIOC Actions (sioca)
• SIOC Argumentation (siocr)
• SIOC Chat (siocc)
• SIOC Mining (siocm)
• SIOC Quotes (siocq)
• SIOC Services (siocs)
• SIOC Types (sioct)
• SWAN/SIOC (swansioc)
32. The foundations are there, so what have
we been aiming for since 2008?
1. Continue dissemination of
Social Semantic Web
ontologies to increase the
level and quality of social
semantic data
2. Transition social semantic
data (existing and future)
into knowledge
35. Impact: RDFa in Drupal 7
• Drupal has a 6-7% market share of content management
systems
• Drupal 7 release has Semantic Web support built-in:
• NUI Galway hosted and sponsored the Semantic Drupal “hackathon”
that introduced this RDFa support
• Used on energy.gov, london.gov.uk, www.iq.harvard.edu,
software.intel.com…
• RDFa (SIOC, FOAF, Dublin Core, SKOS) data used for blog
posts, forums, etc.
• Efforts are currently underway in Drupal 8 to replace some
of these terms with types from the schema.org vocabulary
(recommended by four major search engines)
Image from tinyurl.com/drupaper
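To give a rough feel for how such RDFa annotations sit in a page, the sketch below collects the `typeof`, `property` and `rel` attribute values from a small HTML fragment using Python's standard `html.parser`. The markup is illustrative only (loosely modelled on a blog post page, not copied from a real Drupal 7 site), the class name is hypothetical, and this is not a full RDFa processor: it ignores prefixes, subjects and nesting.

```python
from html.parser import HTMLParser

# Illustrative markup only; the paths and values are made up.
SAMPLE_HTML = """
<div about="/node/1" typeof="sioc:Item foaf:Document">
  <h2 property="dc:title">Hello Semantic Web</h2>
  <span rel="sioc:has_creator" resource="/user/7">alice</span>
</div>
"""


class RDFaAttributeCollector(HTMLParser):
    """Collects typeof/property/rel values in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.types = []
        self.properties = []
        self.relations = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Attribute values can be None when an attribute has no value.
        self.types += (attr_map.get("typeof") or "").split()
        self.properties += (attr_map.get("property") or "").split()
        self.relations += (attr_map.get("rel") or "").split()


collector = RDFaAttributeCollector()
collector.feed(SAMPLE_HTML)
print("types:", collector.types)
print("properties:", collector.properties)
print("relations:", collector.relations)
```

A real consumer would use a dedicated RDFa parser to obtain full triples; the point here is only that the semantics travel inside ordinary HTML attributes.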
37. How much SIOC data is out there?
Images (this one and later backgrounds) from publicdomainpictures.net
38. Sindice 2012: classes
• Total instances of SIOC classes: 7.7M
• Up 200k in three months
• Most occurrences: sioc:Item (2.2M)
• Followed by: UserAccount (1.6M), MicroblogPost (1.3M), Post (800k),
User (700k), Comment (400k)…
• Note: 1 billion foaf:Person instances!!!
• Used on most [distinct] sites:
• Item (7k), UserAccount (7k), Post (3k)…
• Consistent with findings by Mika and Potter in 2012: Item (20k),
UserAccount (15k), Post (5k), BlogPost (3k) and Comment (3k) in that
order
39. Sindice 2012: predicates
• Total instances of SIOC predicates: 22.5M
• Up 400k in three months
• Most occurrences: sioc:follows (4.6M)
• Followed by: topic (4M), account_of (3.5M), has_creator (2.7M),
links_to (1.5M), has_discussion (1.3M)...
• Used on most [distinct] sites:
• has_creator (8k), num_replies (7k), name (2k), account_of (1.5k),
reply_of (1.5k)...
40. Sindice 2012: namespaces
• SIOC data is being generated from 10k distinct domains (2k
SLDs) (plus 2k domains for the SIOC Types module)
• Increasing by about 100 domains a month
• No doubt helped by Drupal!
• FOAF data is being generated from 3M distinct domains
(100k SLDs)
• Increasing by over 1000 domains a month
41. Web Data Commons: RDFa data sets from
December 2014
1. foaf:Image (143,818,149 Entities)
2. og:"article" (65,233,945 Entities)
3. gd:Breadcrumb (56,755,178 Entities)
4. foaf:Document (35,991,377 Entities)
5. sioc:Item (34,880,432 Entities)
6. skos:Concept (26,315,007 Entities)
7. og:"website" (23,429,568 Entities)
8. sioc:Post (19,457,818 Entities)
9. sioc:Comment (18,946,600 Entities)
10. gd:Review-aggregate (14,970,496 Entities)
11. sioc:UserAccount (14,846,680 Entities)
• Bizer et al., 2012-2014
• 2.01 billion web pages
• 20.48 billion RDF triples
• SIOC available from 6-7% of
the PLDs (pay-level
domains) with RDFa
• Top RDFa classes shown on
the right
• Lots of SSW terms still used
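As a quick arithmetic check on how prominent SIOC is among these top classes, the sketch below sums the entity counts listed on this slide and computes the share belonging to SIOC classes. The counts are copied from the slide; the share is relative to the eleven listed classes only, not to the whole Web Data Commons corpus.

```python
# Entity counts for the top RDFa classes, copied from the slide (December 2014).
TOP_RDFA_CLASSES = {
    "foaf:Image": 143_818_149,
    'og:"article"': 65_233_945,
    "gd:Breadcrumb": 56_755_178,
    "foaf:Document": 35_991_377,
    "sioc:Item": 34_880_432,
    "skos:Concept": 26_315_007,
    'og:"website"': 23_429_568,
    "sioc:Post": 19_457_818,
    "sioc:Comment": 18_946_600,
    "gd:Review-aggregate": 14_970_496,
    "sioc:UserAccount": 14_846_680,
}

# Sum the SIOC classes and compare against all listed classes.
sioc_total = sum(
    count for name, count in TOP_RDFA_CLASSES.items() if name.startswith("sioc:")
)
listed_total = sum(TOP_RDFA_CLASSES.values())
sioc_share = sioc_total / listed_total

print(f"SIOC entities among listed classes: {sioc_total:,}")
print(f"Share of listed entities: {sioc_share:.1%}")
```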
42. 2008-2010: Online Presence Ontology
• OPO aims to unify presence information and status
notification processes across different services:
• Twitter, Facebook, Foursquare, etc.
• Help solve the information overload issue by providing a
means to identify to whom / which community presence
information should be directed: “sharing spaces”
• Collaborative effort between Université Paris-Sud XI, Orsay,
University of Belgrade, NUI Galway and Université Paris-
Sorbonne
• Leveraged in NUI Galway’s collaboration with Cisco
48. Structure and display opinions and
arguments to support their (re)use
[Diagram labels: Original Discussion; Ontology; Semantic Enrichment; Semantically Enriched RDFa; Querying; Queryable User Interface With Barchart]
Schneider, Web Science 2010, ACM SAC 2011, CSCW 2013
50. Augmenting social media items with
metadata using related web content
Tags? Topic? Location?
Kinsella, ECIR 2011, ESWC 2011, Web Science 2010
Example post: “Last night I saw Connacht
play at The Sportsground.
The match started well for
Connacht with a great try
but after half time the
opposition closed the gap.
Finally we managed to hold
out for the win. It was a
great game from both sides.
Here's a clip of the first try.”
51. Tag prediction, geolocation, topic classification
[Diagram: the example post (“Last night I saw Connacht play at The Sportsground…”) connected by links (href) to related content]
• Related posts: “...didn’t see the match but here’s a summary from John..” and “...This review of the Connacht match shows that they are getting back in form!...”
• YouTube video: Title: Fionn Carr try; Category: Sport; Tags: rugby, try, carr, connacht
• Status from JohnSmith (John Smith): “I’m at the Galway Sportsground”
Enhanced topics and tags on items (that
can propagate to a user’s profile)
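The idea of borrowing metadata from linked content can be sketched very naively as below: tags and a category are copied from the resources a post links to. This is not the method from the cited papers, only a simplified illustration of the enrichment step. The URL, the dictionary layout and the function name are hypothetical; the video metadata values are the ones shown on the slide.

```python
from collections import Counter

# Hypothetical metadata store for linked resources, keyed by URL.
LINKED_METADATA = {
    "http://video.example/fionn-carr-try": {
        "title": "Fionn Carr try",
        "category": "Sport",
        "tags": ["rugby", "try", "carr", "connacht"],
    },
}


def enrich_post(post, metadata_by_url):
    """Return a copy of the post with tags and a topic taken from linked resources."""
    tag_counts = Counter()
    category_counts = Counter()
    for url in post["links"]:
        meta = metadata_by_url.get(url)
        if meta is None:
            continue  # no metadata known for this link
        tag_counts.update(tag.lower() for tag in meta["tags"])
        category_counts[meta["category"]] += 1

    enriched = dict(post)
    enriched["tags"] = sorted(tag_counts)
    # Use the most frequent category among linked resources as the topic.
    enriched["topic"] = (
        category_counts.most_common(1)[0][0] if category_counts else None
    )
    return enriched


post = {
    "text": "Last night I saw Connacht play at The Sportsground.",
    "links": ["http://video.example/fionn-carr-try"],
}
enriched = enrich_post(post, LINKED_METADATA)
print(enriched["tags"], enriched["topic"])
```

The same aggregation could then be repeated over all of a user's posts to propagate tags to their profile.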
52. Aggregated, interoperable and multi-domain
SSW user interest profiles
Orlandi, IEEE WI 2013, I-SEMANTICS 2012, SWJ 2011; S2E Gift Funding from Cisco
Foundation
57. Privacy Preference Manager (PPM)
• PPM provides two main tasks:
• A user creates his or her privacy preferences
• A requester logs into the other user’s PPM, which in turn gives
back a faceted profile, filtered based on the privacy preferences
[Diagram labels: User A; User B; Requester; WebID; Privacy Preference Manager; Privacy Preferences; Private Social Semantic Data]
Sacco, 2nd Prize Award in I-Semantics 2012 Demo Track; Led to collaboration with the
late George Thomas’ team in the US Department of Health and Human Services
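The filtering step can be illustrated with a small sketch: given a profile and per-attribute privacy preferences, return only the attributes a given requester is allowed to see. This is an assumption-laden simplification. The actual PPM works over private social semantic data and uses WebID for identifying users, while the plain dictionaries, WebID-like strings and function name here are hypothetical.

```python
def filter_profile(profile, preferences, requester):
    """Return the faceted view of a profile that the requester may see.

    preferences maps an attribute name to the set of requesters allowed to
    see it; "*" means anyone. Attributes without a preference are withheld
    (default deny).
    """
    visible = {}
    for attribute, value in profile.items():
        allowed = preferences.get(attribute, set())
        if "*" in allowed or requester in allowed:
            visible[attribute] = value
    return visible


# Hypothetical data for User A.
profile_a = {
    "name": "User A",
    "email": "a@example.org",
    "interests": ["rugby", "semantic web"],
    "phone": "000-0000",
}
preferences_a = {
    "name": {"*"},
    "interests": {"http://example.org/people/b#me"},
    "email": {"http://example.org/people/c#me"},
    # no entry for "phone": never shared
}

view_for_b = filter_profile(profile_a, preferences_a, "http://example.org/people/b#me")
print(view_for_b)
```

Defaulting to deny means a newly added attribute stays private until its owner creates a preference for it.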
58. New SSW topics of interest
• Semantically Enabled Social Hub to Control Personal Data
and Ownership
• Scalable Topic-Level Sentiment Analysis on Streaming Feeds
(with AYLIEN)
• Social Semantic User Modeling in Online Social Networks for
Recommendation (using SAP HANA) [next slide]
• High-Level Cross-Medium Open-Set Authorship
Identification (with AYLIEN)
• Semantic Crisis Management Framework
59. “Who’s learning what in MOOCs?” …and
where are they learning it
• Social profile data (currently
from LinkedIn search) and
MOOC data (currently from
the Coursera API) combined
with geographic LOD
• Semantic model based on a
newly proposed resume
ontology along with
GeoNames and a proposed
schema.org extension for
online courses
[Figure labels (terms): manager, scientist, software, analyst, analytics, senior, data, engineer, developer, student, research]
[Figure labels (courses): The Data Scientist’s Toolbox; Intro to Data Science; Data Analysis; Machine Learning; Computing for Data Analysis; R Programming]
Piao, work in progress
61. Join the Social [Web] working group and
interest group at the W3C
• www.w3.org/Social/WG
• Social data syntax
• Social API
• Federation protocol
• www.w3.org/Social/IG
• Use cases to drive social
standards for both businesses
and consumers
• Social architecture report
• Social vocabularies
62. Stay tuned for future activities organised
by IFIP WG 12.7
• International Federation
for Information Processing
Working Group on Social
Networking Semantics and
Collective Intelligence
• John Breslin is vice-chair
and a co-founding member
63. The Social Semantic Web
ソーシャル・セマンティック・ウェブ
• Read our first book on this
topic, published by Springer
in 2009
• Recommended reading for
seminars and courses offered
by the Utrecht Graduate
School of Humanities, Anna
University Chennai, and the
Technical University of Munich
• Authors from NUI Galway
(Breslin, Passant, Decker)
64. Social Semantic Web Mining
ソーシャル・セマンティック・ウェブ・マイニング
• A follow-up book published
by Morgan & Claypool in
2015
• Combines the structures put
in place by the Social
Semantic Web with
knowledge derived from the
content of those structures
• Co-authors from University
of Southampton and the
University of Chile (Omitola,
Ríos, Breslin)
66. Interested in collaborating and/or being a
visiting researcher at NUI Galway?
• SFI ISCA Japan Bilateral
Short-Term Visits
• International Strategic Cooperation Award, Japan
• Funding available for
Japanese researchers to visit
NUI Galway
• Researchers of all levels are
eligible to apply
• http://bit.ly/iscajapan
• JSPS Bilateral Programs
• “Open partnership”
• Joint research projects
• Joint seminars
• Deadline 8 September 2015
• JSPS Long-Term Awards
• Postdoctoral fellowships
[Deadline 4 September 2015]
• Talented researchers abroad
(lecturers, professors)
[probably May 2016]
67. Thank you very much!
Any questions?
• Ask me now…
• Or email me later at
john@bresl.in
• Thanks to Science
Foundation Ireland’s ISCA
Japan for funding my visit
to NII!