TSO Organograms - Moving Linked Data into production and reaping the benefits - SemTechBiz 2012

Moving linked data into production
- and reaping the benefits
Richard Goodwin

SemTechBiz
September 2012
Part of the Williams Lea Group

What does TSO do?

Semantic discoverability solutions

Linked Open Data

Breaking data out of silos

Dedicated semantic team with a variety of experience & backgrounds

Part of the Williams Lea Group

Why Semantic discoverability?

Aggregate relevant data
Extract important information from it
Enrich the content
Add value to the data
Allow linking, re-use and repurposing


Organograms

Good use case for Linked Open Data
Making government information accessible and open
Automated dissemination process


Government Transparency initiative

• Announced by No.10 in May 2010

• RDF and visualisation

• Live in June 2011


What was involved?

• Using Excel source data
– convert from CSV to RDF using PHP and XL Wrap
• Preview and publish through linked data API
• Creating custom organogram Visualisation
• Supporting distributed publication by owner organisations


Achievements

• Publishing in RDF right across government
– now = 200
– soon = 400+
• New data published every 6 months
• Humans use Visualisation for information about government
• Machines can pull regularly updated info to create other resources


Challenges

• WCM unable to handle RDF documents
• Officials struggling to get sign-off from ministers
• Upload / validation usability issues
• Minimise errors that departments are expected to remedy
• Reduce bootstrapping
• Exploit value in data set - see changes over time

Challenges – improve the user experience

• CSV and validation usability issues
– Apparently inconsistent validation
– Silent errors
– Department uploading ≠ department featured
– Departments see all files clearly marked as senior CSV, junior CSV and RDF

• Sign-off from ministers
– Senior management, short of time


Challenges – improve the user experience

• Organisational quirks
– e.g. some Ministry of Defence (MoD) civil servants report to minister, others of same grade report elsewhere

• Grades need to be more flexible
– e.g. 'equivalent to grade X' or accept those parts which are correct and flag the others

• Duplicate uploads need flagging

• Improve the speed of preview function


Solutions – WCM and reliability

• Replaced the XL Wrap with CSV2TTL
– a Python-based implementation of CSV to RDF
– this supports efficient and reliable publishing of RDF triples from CSV

• Early validation takes place in spreadsheet template

• Data owners upload the spreadsheet to the preview server for signing-off
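The CSV to RDF step can be illustrated with a minimal sketch. This is not the CSV2TTL code, which the slides do not show; the column names (post_id, job_title, reports_to), the example.org namespace and the ex:reportsTo property are all hypothetical, and post ids are assumed to be plain alphanumeric values.

```python
import csv
import io

# Hypothetical namespace, for illustration only.
BASE = "http://example.org/organogram/"


def escape_literal(value):
    """Escape a string for use inside a double-quoted Turtle literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_turtle(csv_text):
    """Convert CSV rows to Turtle triples.

    Assumes hypothetical columns 'post_id', 'job_title' and 'reports_to'.
    """
    lines = [
        "@prefix ex: <%s> ." % BASE,
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = "ex:post-%s" % row["post_id"].strip()
        label = escape_literal(row["job_title"].strip())
        lines.append('%s rdfs:label "%s" .' % (subject, label))
        manager = row["reports_to"].strip()
        if manager:  # the most senior post has no manager
            lines.append("%s ex:reportsTo ex:post-%s ." % (subject, manager))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "post_id,job_title,reports_to\n1,Permanent Secretary,\n2,Director,1\n"
    print(csv_to_turtle(sample))
```

Emitting one triple per line keeps the output easy to compare between uploads, which suits a preview-then-sign-off workflow.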


Solutions – Usability
The main constraint on our action is the use of the templates from within the Government Secure Intranet: VBA code is inappropriate

• Strip out lengthy formulae (hard to maintain)
– Net result no change to file size despite extra features

• Provide per cell rather than per row feedback to users

• Hide extraneous cells and improve validation rules

• Use single-cell lookup point for web application to ascertain validity
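The difference between per-row and per-cell feedback, and the idea of a single validity flag, can be sketched as follows. The slides describe validation built into the spreadsheet template without VBA; the Python below only illustrates the same idea on the server side, and the column names, grade values and function names are hypothetical.

```python
def validate_rows(rows, valid_grades):
    """Return one (row_number, column, message) tuple per failing cell.

    'rows' is a list of dicts with hypothetical 'post_id' and 'grade' keys.
    Row numbers start at 2 because row 1 is assumed to hold the headers.
    """
    errors = []
    seen_ids = set()
    for number, row in enumerate(rows, start=2):
        post_id = row.get("post_id", "").strip()
        if not post_id:
            errors.append((number, "post_id", "missing post id"))
        elif post_id in seen_ids:
            errors.append((number, "post_id", "duplicate post id"))
        else:
            seen_ids.add(post_id)

        grade = row.get("grade", "").strip()
        if grade not in valid_grades:
            errors.append((number, "grade", "unknown grade %r" % grade))
    return errors


def sheet_is_valid(errors):
    """Single lookup point: the sheet is valid only if no cell failed."""
    return not errors


if __name__ == "__main__":
    sample = [
        {"post_id": "1", "grade": "SCS2"},
        {"post_id": "1", "grade": "XX"},
    ]
    problems = validate_rows(sample, {"SCS1", "SCS2"})
    for row_number, column, message in problems:
        print("row %d, column %s: %s" % (row_number, column, message))
    print("valid:", sheet_is_valid(problems))
```

Reporting the column alongside the row lets a user go straight to the failing cell, while the single boolean gives the web application one place to check before accepting an upload.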

Linked data - increasing value over time

Enables user to
• View the change in the shape of government over time
• Use a slider on Visualisation to show changes

Solution
• Serves all datasets from same iteration into single Knowledge Base (KB) with each different iteration in separate KBs
• Data registry maintains the mapping between <iteration>, <department> and knowledge base, <graph>
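A data registry of this kind can be sketched as a lookup from (iteration, department) to (knowledge base, graph). The class and method names, the "kb-" naming scheme and the example identifiers below are assumptions made for illustration; the slides do not describe the actual registry implementation.

```python
from typing import Dict, List, NamedTuple, Tuple


class Location(NamedTuple):
    knowledge_base: str
    graph: str


class DataRegistry:
    """Maps (iteration, department) to the knowledge base and graph holding the data."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], Location] = {}

    def register(self, iteration: str, department: str, graph: str) -> None:
        # All datasets from the same iteration share one knowledge base.
        knowledge_base = "kb-" + iteration  # hypothetical naming scheme
        self._entries[(iteration, department)] = Location(knowledge_base, graph)

    def locate(self, iteration: str, department: str) -> Location:
        return self._entries[(iteration, department)]

    def iterations_for(self, department: str) -> List[str]:
        # ISO dates sort chronologically as strings, so a slider can step through them.
        return sorted(it for (it, dep) in self._entries if dep == department)


if __name__ == "__main__":
    registry = DataRegistry()
    registry.register("2011-09-30", "dept-a", "http://example.org/graph/dept-a/2011-09-30")
    registry.register("2011-03-31", "dept-a", "http://example.org/graph/dept-a/2011-03-31")
    for iteration in registry.iterations_for("dept-a"):
        print(iteration, registry.locate(iteration, "dept-a"))
```

A visualisation slider would then call iterations_for once to get its positions and locate on each move to find which knowledge base and graph to query.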

OpenUp® Platform

Harvest: Aggregation of data from web, APIs, databases and files
Enrich: Extracting useful data and converting to re-usable formats
Store: Highly scalable database storage and query engine
Publish: Websites and APIs to reach data users

Automated processes that deliver reliable data


Questions?

See http://openup.tso.co.uk
Follow @TSOTechnology
Meet TSO Semantic team on our stand
Test our new release of Flint online SPARQL editor launching today…


Disclaimer

Confidentiality statement
The contents of this document together with all other information, data, materials, specifications or other related documents provided by Williams Lea
(“WL”) (together “materials”) shall be treated at all times by the recipient as the confidential and proprietary information of WL. The recipient shall not
disclose any such materials to any third parties without the express, prior written approval of WL. Where such express approval is granted by WL, the
recipient shall ensure that all third parties to whom disclosure is made shall keep any such materials confidential and shall not disclose them or any part of
them to any other person. All intellectual property rights in the materials shall remain the property of WL, or its third party licensors, and are protected by
copyright.

© 2012 Williams Lea Group

Disclaimer
This document may be incomplete without reference to any oral briefing provided by WL, reflects current conditions and WL's views as of this date and is
subject to correction or change at any time. Although the information contained in this document is believed to be accurate in all material respects, neither
WL nor any of WL's advisers, agents, officers or employees accepts responsibility or liability for or makes any promise, representation, statement or
expression of opinion or warranty, express or implied, with respect to the accuracy or completeness of the content of this document (to the extent
permissible by law) unless and save to the extent that such promise, representation, statement or expression of opinion or warranty is later expressly
incorporated into a legally binding contract.

