8. Bush’s camera on the head
Digital Enterprise Research Institute www.deri.ie
Enabling Networked Knowledge
9. Memex
Posited by Vannevar Bush in “As We May Think”
The Atlantic Monthly, July 1945
“A memex is a device
in which an individual stores
all his books, records, and communications,
and which is mechanized so that it may be consulted
with exceeding speed and flexibility”
10. Sketch of memex
11. oNLine System- NLS, 1968
(Doug Engelbart, SRI)
“By ‘augmenting human intellect’ we mean increasing the capability of a man
to approach a complex problem situation, to gain comprehension to suit his
particular needs, and to derive solutions to problems.”
The Mouse;
Word Processing;
Data Sharing;
Hypertext;
12. ARPANET (1969)
(Jon Postel, Steve Crocker, Vint Cerf)
13. Xanadu (Ted Nelson, ~1960-???)
14. World Wide Web
(Tim Berners-Lee 1989)
WWW (Tim Berners-Lee)
“There was a second part of the dream […] we could then use computers to
help us analyse it, make sense of what we’re doing, where we individually
fit in, and how we can better work together.”
15. Making Progress…
Memex (Vannevar Bush)
A memex is “a device in which an individual
stores all his books, records, and
communications.”
Augmenting Human Intellect
(Doug Engelbart)
“By ‘augmenting human intellect’ we mean
increasing the capability of a man to approach a
complex problem situation, to gain
comprehension to suit his particular needs, and to
derive solutions to problems.”
WWW (Tim Berners-Lee)
“There was a second part of the dream […] we
could then use computers to help us analyse it,
make sense of what we’re doing, where we
individually fit in, and how we can better work
together.”
16. A Network of Data and Knowledge?
Interconnected
Universal
All encompassing
assists humans,
organisations and systems
with problem solving
enabling innovation and
increased productivity
18. What enabled the Web?
1. Scalability: no growth-related scalability problem (e.g., no back links from HTML pages)
2. No censorship: no lengthy permission or review process
3. Positive feedback loop: exploit Metcalfe’s Law
19. Metcalfe’s law:
The value of a network is proportional to the square of the number of connected members
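The quadratic growth stated above can be sketched in a few lines. This is only an illustration: the function names and the constant `k` are hypothetical, and counting distinct pairs of members is one common way to motivate the n-squared term, not something the slide specifies.

```python
def possible_links(n: int) -> int:
    """Number of distinct undirected pairs among n connected members."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


def metcalfe_value(n: int, k: float = 1.0) -> float:
    """Network value under Metcalfe's law, V = k * n**2.

    k is an arbitrary proportionality constant (an assumption of this sketch).
    """
    return k * n * n


# Doubling the membership roughly quadruples the value.
for members in (10, 20, 40):
    print(members, possible_links(members), metcalfe_value(members))
```

The pair count grows as n(n-1)/2, which is proportional to n squared for large n, so both functions show the same trend.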
20. Metcalfe’s Law 1: Links
23. Requirements for a Data Web
1. Scalability. No centralized
infrastructure (e.g., a central object
repository) required.
2. No censorship. It must be possible
to publish data without having to ask
for prior permission.
3. Positive feedback loop. Capitalize on
Metcalfe’s Law.
24. Enabling Metcalfe’s Law
1. Global Object Identity.
2. Composability: The value of data can be increased if it can be combined with other data. Composability has a number of consequences:
   1. Schema-less. (Combined data originating from different sources is unlikely to conform to a single schema.)
   2. Self-describing.
   3. “Object-centric”. In order to integrate information about different entities, data must be related to these entities.
   4. Graph-based. The composition of multiple object-centric data sources results in a graph in the general case.
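As a rough sketch of why global identifiers and a graph shape make data composable, the snippet below merges two independently produced sets of (subject, predicate, object) statements by plain set union. All identifiers under `http://example.org/` and the helper names `merge` and `describe` are hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

EX = "http://example.org/"  # hypothetical namespace, not a real data set

# Two sources created independently, sharing only global identifiers.
source_a: Set[Triple] = {
    (EX + "id/alice", EX + "vocab/name", "Alice"),
    (EX + "id/alice", EX + "vocab/worksFor", EX + "id/deri"),
}
source_b: Set[Triple] = {
    (EX + "id/deri", EX + "vocab/locatedIn", "Galway"),
    (EX + "id/alice", EX + "vocab/name", "Alice"),  # duplicate statement
}


def merge(*graphs: Set[Triple]) -> Set[Triple]:
    """Compose graphs by set union; no shared schema is required."""
    merged: Set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph: Set[Triple], node: str):
    """All (predicate, object) pairs stated about one node."""
    return sorted((p, o) for s, p, o in graph if s == node)


combined = merge(source_a, source_b)
print(describe(combined, EX + "id/alice"))
print(describe(combined, EX + "id/deri"))
```

Because both sources use the same identifier for the organisation, the merged graph links the person to the organisation's location without any prior agreement on a schema.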
25. Observations
• The relational model does not fulfill these requirements (not
composable, no global object id)
• XML is not object centric and not composable.
• Graph based data formats are composable
• RDF fulfills these requirements.
• Claim: Any data format that fulfills the requirements is “more
or less” isomorphic to RDF.
26. The usual two ingredients
1. RDF – Resource Description Framework
   Graph-based data – nodes and arcs
   Identifies objects (URIs)
   Interlinks information (relationships)
2. Vocabularies (Ontologies)
   provide shared understanding of a domain
   organise knowledge in a machine-comprehensible way
   give an exploitable meaning to the data
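To make the two ingredients concrete, here is a minimal sketch in plain Python (no RDF library): data as triples over URIs, plus a tiny vocabulary whose `rdfs:subClassOf` statements let a program derive additional types. The `rdf:type` and `rdfs:subClassOf` URIs are the standard ones; everything under `http://example.org/` and the function names are hypothetical, and the rule shown is only one small part of RDFS.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace

# Ingredient 1: graph data, each statement is (subject, predicate, object).
data = {(EX + "moby", RDF_TYPE, EX + "Whale")}

# Ingredient 2: a vocabulary that gives the class names a shared meaning.
vocabulary = {
    (EX + "Whale", RDFS_SUBCLASS_OF, EX + "Mammal"),
    (EX + "Mammal", RDFS_SUBCLASS_OF, EX + "Animal"),
}


def infer_types(graph):
    """Repeat 'x type C, C subClassOf D => x type D' until nothing changes."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        subclass = [(s, o) for s, p, o in inferred if p == RDFS_SUBCLASS_OF]
        typed = [(s, o) for s, p, o in inferred if p == RDF_TYPE]
        for node, cls in typed:
            for sub, sup in subclass:
                if cls == sub and (node, RDF_TYPE, sup) not in inferred:
                    inferred.add((node, RDF_TYPE, sup))
                    changed = True
    return inferred


def types_of(graph, node):
    """Sorted list of classes a node is stated or inferred to belong to."""
    return sorted(o for s, p, o in graph if s == node and p == RDF_TYPE)


closed = infer_types(data | vocabulary)
print(types_of(closed, EX + "moby"))
```

The data alone only says the node is a whale; combined with the vocabulary, a program can also treat it as a mammal and an animal.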
27. Linked Open Data cloud - domains
http://lod-cloud.net/
[Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.]
Domains: Media, User-generated, Government, Publications, Cross-domain, Geo, Life sciences
Highlighted data sets: BestBuy, Overstock.com, Facebook, US government, UK government, BBC, New York Times, LinkedGeoData
Over 200 open data sets with more than 25 billion facts, interlinked by 400 million typed links, doubling every 10 months!
29. Issues with Ontologies
Differentiation between Classes and Instances is difficult: no single way to abstract the world (observe Upper Ontology wars… ahem… discussions!)
Choices between Instances and Classes made at design time cause usability issues (different treatment in applications) (animal-mammal-whale)
Ontologies cement power structures (prevent information sharing)
Sharing is only top-down
30. How did classes/instances happen?
Predecessor: Frame Representation Systems
“Prototypes: KRL, RLL, and JOSIE employ prototype
frames to represent information about a typical
instance of a class as opposed to the class itself and
as opposed to actual instances of the class.” [Karp,
1993]
AFAIK: Classes as subsets and instances as
elements [Hayes, 1979].
Formalization of Frame Systems (Description Logic)
picked up on [Hayes, 1979] and left out alternatives
31. Note: How did DL & Ontologies/Classes happen in the Semantic Web?
Stefan Decker, Dieter Fensel, Frank van Harmelen,
Ian Horrocks, Sergey Melnik, Michel C. A. Klein,
Jeen Broekstra: Knowledge Representation on the
Web. Description Logics 2000: 89-97
OIL -> DAML+OIL -> OWL -> OWL 2.0
33. Examples
> JavaScript,
> Self,
> NewtonScript,
> Omega, Cecil,
From: A. Lienhard, O. Nierstrasz: Prototype based programming
http://www.slidefinder.net/0/03prototypes/03prototypes/10603817
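For readers unfamiliar with the prototype style used by the languages listed above, the following is a minimal sketch of delegation-based lookup, written in Python purely for illustration. The `Proto` class, its methods, and the whale example are hypothetical and are not taken from the cited slides.

```python
class Proto:
    """A minimal prototype object: named slots plus an optional parent."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def get(self, name):
        """Look up a slot locally, then delegate along the parent chain."""
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise KeyError(name)

    def clone(self, **overrides):
        """Create a new object that delegates to this one."""
        return Proto(parent=self, **overrides)


# A "typical whale" acts as the prototype; no class/instance split is needed.
typical_whale = Proto(habitat="sea", legs=0)
moby = typical_whale.clone(colour="white")

print(moby.get("colour"), moby.get("habitat"))
```

Any object can serve as the starting point for another, so a description can be reused and specialised by whoever finds it, without first agreeing on a class hierarchy.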
34. How it could look: (Horizontal Information Sharing)
36. Research Agenda
Knowledge Representation Constructs
(Specialisation)
Logic based Formalisation of Prototypes
Reasoning (e.g., with Rules)
Complexity
Large Scale Storage, Querying
Collaboration facilities
37. Linked Data Vocabularies
as Social Constructs
38. Neologism
http://vocab.deri.ie/
Neologism is a simple, Drupal-based RDFS vocabulary editor and publishing system that allows for:
• Collaboratively creating and maintaining RDFS vocabularies
• Making the vocabulary available for humans (HTML, graph) and machines (RDF/XML, Turtle)
• Importing external vocabularies
• Working with external namespaces, such as via PURL.org, etc.
• More at http://neologism.deri.ie/
39. Linked Open Data cloud - domains
http://lod-cloud.net/
[Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.]
Domains: Media, User-generated, Government, Publications, Cross-domain, Geo, Life sciences
Highlighted data sets: BestBuy, Overstock.com, Facebook, US government, UK government, BBC, New York Times, LinkedGeoData
Over 200 open data sets with more than 25 billion facts, interlinked by 400 million typed links, doubling every 10 months!
40. [Figure: layered stack; axis labels: Action, Information]
Visualisation, Collaboration, Exploitation
Abstraction, Reasoning, Analytics
Networked Data Management
41. User Role Analysis
• Who are the influencers?
• Who are the initiators?
• Who tends to answer questions?
• What fraction of the network is non-social?
• How stable are these roles?
Work by Vaclav Belak, Conor Hayes et al., DERI.
43. User Role Analysis:
Orthogonal Features
■ Persistence
  ■ Mean/Std. Dev. posts per thread
■ Initialisation
  ■ % initiated threads
■ Popularity
  ■ % in-degree
  ■ % posts that receive reply
■ Reciprocity
  ■ % bi-directional neighbours
  ■ % bi-directional threads
[Figure: example network of users A, B, C and D linked by Post and Response edges]
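A few of the features listed above can be sketched on a toy reply log. The slide only names the features, so the exact definitions used here (fractions per user, self-replies ignored), the post tuple layout, and the function name `role_features` are assumptions of this sketch, and the data is invented.

```python
from collections import defaultdict

# Hypothetical log: (post_id, author, thread_id, replied_to_post_id or None).
posts = [
    (1, "A", "t1", None),
    (2, "B", "t1", 1),
    (3, "A", "t1", 2),
    (4, "C", "t1", 1),
    (5, "B", "t2", None),
    (6, "D", "t2", 5),
]


def role_features(posts):
    """Per-user fractions for three of the listed features (assumed definitions)."""
    author_of = {pid: author for pid, author, _, _ in posts}
    posts_with_reply = {target for _, _, _, target in posts if target is not None}
    threads_of = defaultdict(set)   # threads each user posted in
    initiated = defaultdict(set)    # threads each user started
    own_posts = defaultdict(list)
    repliers = defaultdict(set)     # users who replied to this user
    replied = defaultdict(set)      # users this user replied to

    for pid, author, thread, target in posts:
        threads_of[author].add(thread)
        own_posts[author].append(pid)
        if target is None:
            initiated[author].add(thread)
        elif author_of[target] != author:
            repliers[author_of[target]].add(author)
            replied[author].add(author_of[target])

    features = {}
    for author, pids in own_posts.items():
        neighbours = repliers[author] | replied[author]
        mutual = repliers[author] & replied[author]
        features[author] = {
            "frac_initiated_threads": len(initiated[author]) / len(threads_of[author]),
            "frac_posts_with_reply": sum(p in posts_with_reply for p in pids) / len(pids),
            "frac_bidirectional_neighbours": len(mutual) / len(neighbours) if neighbours else 0.0,
        }
    return features


features = role_features(posts)
for user in sorted(features):
    print(user, features[user])
```

In this toy log, user A starts the only thread they take part in and exchanges replies with B, while C and D only respond once, which is the kind of contrast the role analysis is meant to capture.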
44. Role Analysis
Features: Reciprocity, Persistence, Popularity, Initiation
Popular Initiator: High; High; Very High
Popular Participant: High; High; Low
Supporter: Medium; Medium; Low
Elitist: Low; L-M neighborhood; Hi thread response
Grunt: Low to Medium; Low to Medium
Taciturn: Very Low; Low to Medium
45. Example
[Figure: example network of users A, B, C and D linked by Post and Response edges]
Role                 User  Reciprocity  Persistence  Popularity  Initiation
Popular initiator    A     2/3          1/7          2/7         1
Popular participant  B     2/3          2/7          2/7         0
Grunt                C     1/3          3/7          0           0
Taciturn             D     0            1/7          0           0
46. Analysis of Forums
Boards.ie data from 01/07/2006 to 31/12/2006
Personal Issues
Christianity
Weather
Windows
Development
Humanities
Politics
47. Personal Issues Forum
Mostly taciturns
Not a lot of dialog
48. Christianity vs Weather
Christianity: some popular initiators; some lengthy discussions
Weather: popular initiators; large portion of grunts; not as much discussion
49. Windows, Development, Politics
Less social (technical)
No popular initiators
Lots of grunts
51. The Evolution of
Communities
Work by Vaclav Belak, Conor Hayes et al, DERI.
52. Motivation
Kuhn claimed the development of scientific knowledge proceeds in discrete steps:
1. Pre-paradigm period
2. Paradigm period (normal science): paradigm articulation
3. Crisis
4. Reaction to the crisis: paradigm shift
53. Cross-Community Effects
[Figure: co-citation networks of the Semantic Web community, annotated with “community shift” and “community specialization”]
54. Methodology Pipeline
Publications from major conferences
selected from DBLP
Community shifts and specializations
55. Community & Topic
Detection
Communities identified using:
  Infomap
Reasons:
  publicly available implementations
  weighted directed networks
Communities traced from one snapshot to the next according to the highest Jaccard coefficient
Ancestors and descendants obtained by a modification of the Jaccard coefficient
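The tracing step can be sketched as follows. This is a minimal illustration of matching by highest Jaccard coefficient only; the modified coefficient used for ancestors and descendants is not specified on the slide and is not reproduced here. The function names, community ids, and member sets are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two member collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def trace(previous, current):
    """Map each community in `previous` to its best match in `current`.

    Both arguments map a community id to a set of members. The result maps
    each previous id to (best_current_id, score), or (None, 0.0) when no
    community in the next snapshot shares any member.
    """
    mapping = {}
    for prev_id, members in previous.items():
        best_id, best_score = None, 0.0
        for cur_id, cur_members in current.items():
            score = jaccard(members, cur_members)
            if score > best_score:
                best_id, best_score = cur_id, score
        mapping[prev_id] = (best_id, best_score)
    return mapping


# Invented snapshots: author c moves from community 15 into the successor of 26.
snapshot_1 = {15: {"a", "b", "c"}, 26: {"d", "e", "f", "g"}}
snapshot_2 = {"x": {"a", "b"}, "y": {"c", "d", "e", "f", "g"}}

mapping = trace(snapshot_1, snapshot_2)
print(mapping)
```

Ties are resolved in favour of the first candidate encountered, which is an arbitrary choice made for this sketch.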
57. Topics of Louvain Community 26
58. A Network of
Knowledge
Interconnected
Universal
All encompassing
•Search
•Collaboration
•Text Mining
•Science
•Commercialization
Linked Data
assists humans,
organisations and systems
with problem solving
enabling innovation and
increased productivity
Editor's notes
The Mouse; Word Processing; Data Sharing; voice & video 4 minutes
In order to create a global data network, the data format needs to be able to identify entities. Therefore it needs a worldwide accepted way to identify entities. Object identity does not impose name uniqueness: because information is created independently, different object identifiers may be used for the same object.
Work by Conor Hayes, Vaclav Belak et al., DERI.
Agglomerative hierarchical clustering
Using principal component analysis to analyse the features, we found that the largest principal component accounted for more than 95% of the variance in the features, and the size of the ego-centric networks was the dominant feature in the largest component. Hence, we use the size of the ego-centric networks as our feature to partition the users into three bands. We discard the lowest band, which consists of one-post users, and the middle band, which does not have enough neighbours to have an accurate power law exponent fit. Using agglomerative hierarchical clustering, we cluster the feature profile data of the remaining top band users from all forums. To determine the optimal number of clusters, we used five different validation techniques: Rand, Silhouette, RS, Root mean square and DB Index (Handl, Knowles, and Kell 2005). We found that the optimal number of clusters was either 8, 13, 15 or 21. After manual inspection, we selected 8 and 15 as the best numbers of clusters. Each cluster approximately corresponds to one user role type. The average value of the nine features and the number of users in each cluster are used to build a quantitative description of the clusters/user role types.
For example, the taciturn role makes up 95% of all users in the Personal Issues forum (grouping 1). This suggests that, despite its name, there is little dialogue happening.
Strong component of popular initiators, suggesting that a few users regularly initiate threads that subsequently generate discussion (large percentage of popular participants and supporters).
Instead of paradigm shift, we were looking for community shift. Instead of paradigm articulation, we were looking for community specialization. We call it ‘community shift’ because we reveal less dramatic changes in the scientific discourse. Very significant and important shifts may eventually turn out to be ‘paradigm shifts’. A similar argument applies to the notion of ‘community specialization’. Paradigm articulation: scientists successfully apply the methods within the paradigm to new problems until they eventually reach the limits of the paradigm by finding problems or questions that the methods cannot solve or answer, which leads to a crisis of the paradigm and a call for radical change in the scientific field.
Publications from major IR and SW conferences obtained from DBLP for 2000–2009 (ISWC, ESWC, SIGIR, …). A co-citation network of 5772 authors and 817642 edges over all years was extracted. 3-year time-steps with 2-year overlap: 2000–2002, 2001–2003, 2002–2004, . . . The total number of articles was 39314, for which we were able to scrape 22975 abstracts and 3740 full texts. Nearly 70% coverage by content; 10% coverage by author-provided keywords.
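The overlapping time-steps described in this note can be generated with a short helper. This is a sketch under the stated parameters (3-year windows advancing by one year over 2000–2009); the function name and signature are hypothetical.

```python
def sliding_windows(first_year, last_year, length=3, step=1):
    """Inclusive (start, end) year windows of `length` years, advancing by `step`.

    With length=3 and step=1, consecutive windows overlap by two years.
    """
    return [
        (start, start + length - 1)
        for start in range(first_year, last_year - length + 2, step)
    ]


windows = sliding_windows(2000, 2009)
print(windows)
```

Each window would then be used to build one snapshot of the co-citation network.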
We used non-overlapping community detection algorithms because, at that time, there was no suitable implementation of an overlapping community detection algorithm. We currently work with overlapping communities.
In 2007 there was a strong inflow from community 15 “semantic web and IR” into community 26, which caused a change of topics towards “semantic web”.