This deck summarizes the semantic annotation work of BBC News Labs. It covers:
1. The history of the News Juicer project, which extracts concepts from BBC News articles and annotates the articles with them.
2. The current state of the work: over 84,000 news articles have been tagged, and 4 semantic APIs expose the annotated data.
3. An overview of the 4 APIs: retrieving annotated articles by concept, full-text article search, finding concepts that co-occur, and concept search.
2. BBC News Labs
• Matt Shearer – Lead, BBC News Labs
@completedespair
• Jeremy Tarling – Data Architect, BBC News
@jeremytarling
• Paul Wilton – Tech Architect, Ontoba Ltd
@pwilton
• AND Viktor Tron, Matt Haynes, Mark Ransby
3. BBC News Labs
1. History
2. Now
3. What’s possible
4. APIs
(no On-Stage, LIVE coding. sorry)
4. History
• The News Juicer is 1 year old
• 2012 : "All we have is a bunch of articles...
We need a semantic prototyping platform!"
The News Juicer
1. Grab BBC News & Sport articles
2. Extract concepts
3. Match to DBpedia
4. Annotate article
5. Push to triplestore
6. Expose via API
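The middle of the News Juicer pipeline (extract concepts, match to DBpedia, annotate, push to a triplestore) can be illustrated with a toy sketch. It is not the Juicer's implementation: the gazetteer lookup, the `Article` class, the article URI and the `mentions` predicate are all assumptions made up for this example, since the slides do not say how extraction or annotation is done.

```python
from dataclasses import dataclass, field

# Hypothetical gazetteer: surface form -> DBpedia URI.
# Stands in for "extract concepts" and "match to DBpedia".
GAZETTEER = {
    "David Cameron": "http://dbpedia.org/resource/David_Cameron",
    "Tesco": "http://dbpedia.org/resource/Tesco",
}

# Hypothetical predicate; the real annotation vocabulary is not given in the slides.
MENTIONS = "http://example.org/ontology/mentions"


@dataclass
class Article:
    uri: str
    text: str
    concepts: list = field(default_factory=list)


def annotate(article):
    """Attach the DBpedia URI of every known surface form found in the text."""
    article.concepts = sorted(
        uri for name, uri in GAZETTEER.items() if name in article.text
    )
    return article


def to_ntriples(article):
    """Serialise the annotations as N-Triples lines, ready to load into a triplestore."""
    return [f"<{article.uri}> <{MENTIONS}> <{c}> ." for c in article.concepts]


if __name__ == "__main__":
    art = annotate(Article("http://example.org/articles/1",
                           "David Cameron visited a Tesco store."))
    for line in to_ntriples(art):
        print(line)
```

A real concept extractor would need disambiguation (for example, which "Chester" is meant); plain substring matching is used here only to keep the sketch short.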
8. APIs
Jeremy Tarling - Data Architect, BBC News
@jeremytarling
The Juicer API has 4 endpoints:
1. article semantic annotation
2. article full text search
3. concept co-occurrence
4. concept search
9. API 1 - article semantic annotation
• GET a list of BBC News articles by concept
• support for SPARQL queries
• explore the DBpedia graph
• examples
o "articles about Conservative politicians"
o "articles about places within 25 miles of Chester"
o "articles about companies in the aerospace
industry"
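A query such as "articles about Conservative politicians" might be phrased in SPARQL roughly as below. This is a sketch under assumptions: the predicate linking an article to its tagged concept (`TAG_PREDICATE`) is hypothetical, and it assumes the politicians' DBpedia resources carry a `dbo:party` link to the party resource. The function only builds the query string; where it is sent is not covered by the slides.

```python
# Hypothetical predicate linking an article to a concept it is tagged with.
TAG_PREDICATE = "http://example.org/ontology/mentions"


def articles_by_party_query(party_uri, limit=10):
    """Build a SPARQL query for articles tagged with people who belong to a party."""
    return f"""PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?article ?person WHERE {{
  ?article <{TAG_PREDICATE}> ?person .
  ?person dbo:party <{party_uri}> .
}}
LIMIT {int(limit)}"""


if __name__ == "__main__":
    print(articles_by_party_query(
        "http://dbpedia.org/resource/Conservative_Party_(UK)", limit=5))
```

The second triple pattern is what "explore the DBpedia graph" amounts to: the article is tagged only with the person, and the party membership comes from DBpedia.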
10. API 2 - article full text search
• perform full text search of BBC News articles
• filter by section: "politics", "business"
• specify date range, limit + offset
• example: "find 5 articles from the UK section
containing the words 'horsemeat' and 'Tesco',
published since Jan 1st 2012"
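The example search could be expressed as a GET request along these lines. The base URL and every parameter name (`q`, `section`, `since`, `limit`, `offset`) are assumptions for illustration; the slides name the filters but not the actual request format.

```python
from datetime import date
from urllib.parse import urlencode

# Hypothetical endpoint; the real Juicer URL is not given in the slides.
BASE_URL = "http://example.org/juicer/articles"


def build_search_url(words, section=None, since=None, limit=10, offset=0):
    """Build a full-text search URL with optional section and start-date filters."""
    params = {"q": " ".join(words), "limit": limit, "offset": offset}
    if section:
        params["section"] = section
    if since:
        params["since"] = since.isoformat()  # e.g. 2012-01-01
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url(["horsemeat", "Tesco"], section="uk",
                           since=date(2012, 1, 1), limit=5))
```

`limit` and `offset` together give simple paging: keep `limit` fixed and raise `offset` by `limit` to fetch the next page.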
11. API 3 - concept co-occurrence
• select a DBpedia concept:
<http://dbpedia.org/resource/David_Cameron>
• specify the type:
<http://dbpedia.org/ontology/Person>
• returns an ordered list of people that also
appear in BBC news articles alongside David
Cameron, and their frequencies
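The co-occurrence idea can be shown on in-memory data: count how often concepts of a given type are tagged on the same articles as the selected concept, then order by frequency. The article tag sets and the type table below are invented toy data, and the function is a sketch of the technique, not the endpoint's actual implementation.

```python
from collections import Counter

DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"


def co_occurring(articles, concept, concept_type, types):
    """Return (concept, count) pairs for concepts of `concept_type` that share
    an article with `concept`, most frequent first (ties broken by URI)."""
    counts = Counter()
    for tags in articles:
        if concept in tags:
            counts.update(
                t for t in tags if t != concept and types.get(t) == concept_type
            )
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Toy data: each set holds the concepts tagged on one article.
    articles = [
        {DBR + "David_Cameron", DBR + "Nick_Clegg", DBR + "Tesco"},
        {DBR + "David_Cameron", DBR + "Nick_Clegg", DBR + "George_Osborne"},
        {DBR + "Nick_Clegg", DBR + "George_Osborne"},
    ]
    types = {
        DBR + "David_Cameron": DBO + "Person",
        DBR + "Nick_Clegg": DBO + "Person",
        DBR + "George_Osborne": DBO + "Person",
        DBR + "Tesco": DBO + "Company",
    }
    for uri, n in co_occurring(articles, DBR + "David_Cameron", DBO + "Person", types):
        print(n, uri)
```

Changing the type argument to `DBO + "Company"` would return organisations mentioned alongside the selected person instead of other people.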
12. API 4 - find concepts
• full text search for concepts that are tagged on
articles
• specify a search term or phrase
• returns OpenSearch Suggestions JSON for the
semantic concepts
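OpenSearch Suggestions JSON is a plain array: the search term, a list of completions, and optionally parallel lists of descriptions and URLs. A client could unpack it as sketched below. The sample payload is invented, and the assumption that the fourth array holds DBpedia URIs is ours; the slides only state the response format.

```python
import json


def parse_suggestions(payload):
    """Unpack an OpenSearch Suggestions response into (query, list of dicts)."""
    data = json.loads(payload)
    query, completions = data[0], data[1]
    # The descriptions and URLs arrays are optional in the format.
    descriptions = data[2] if len(data) > 2 else []
    urls = data[3] if len(data) > 3 else []
    results = []
    for i, label in enumerate(completions):
        results.append({
            "label": label,
            "description": descriptions[i] if i < len(descriptions) else None,
            "uri": urls[i] if i < len(urls) else None,
        })
    return query, results


if __name__ == "__main__":
    # Invented sample response for the search term "camer".
    sample = json.dumps([
        "camer",
        ["David Cameron", "Cameroon"],
        ["Person", "Place"],
        ["http://dbpedia.org/resource/David_Cameron",
         "http://dbpedia.org/resource/Cameroon"],
    ])
    query, results = parse_suggestions(sample)
    for r in results:
        print(query, "->", r["label"], r["uri"])
```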