Integrating Government Data using Semantic Web technology  Dean Allemang Chief Scientist, TopQuadrant Inc.  Prepared for ISWC 2009
Government Data Sources
“Objets trouvés” (found objects): Lal Hitchcock Sculptures
“Found data”
Formats for “Found Data” in government
Format | Examples | Notes
Spreadsheets | Data.gov, USASpending.gov, DOI | Flexibility makes it popular, but makes work at re-use time
XML | Data.gov | Not really a single format, but can be parsed uniformly
RSS | USASpending.gov, USGS | Syntax wars largely irrelevant now. Easy to read, dynamic
RDFa | <none?> | New kid on the block, supported by Google, Yahoo!, Drupal
SPARQL Endpoint | Dbpedia.org | Most flexible of all, dynamic
RDF/N3/SKOS | OEGov, Tetherless World | Flexible, relatively static. Great for vocabularies etc.
Quality Considerations of Found Data
A few species of Found Data
Integration strategy using RDF
Import data into RDF
<Person id="3">
  <name>Irene Polikoff</name>
  <employer>TopQuadrant</employer>
  <position>CEO</position>
</Person>
Name | Address | Company | Title
Dean Allemang | 10 Downing St. | TopQuadrant | Chief Scientist
Michael Brodie | 14 Wysteria Lane | Verizon | Chief Scientist
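The Person record and the contact table above can both be turned into subject/predicate/object triples. The following is a minimal sketch of that idea using only the Python standard library. The `http://example.org/` namespace, the row-number subjects, and the function names are assumptions made for illustration; they do not come from the slides.

```python
import csv
import io
import xml.etree.ElementTree as ET

BASE = "http://example.org/"  # hypothetical namespace, not from the slides

XML_SOURCE = """<Person id="3">
  <name>Irene Polikoff</name>
  <employer>TopQuadrant</employer>
  <position>CEO</position>
</Person>"""

TABLE_SOURCE = """Name,Address,Company,Title
Dean Allemang,10 Downing St.,TopQuadrant,Chief Scientist
Michael Brodie,14 Wysteria Lane,Verizon,Chief Scientist
"""


def xml_to_triples(xml_text):
    """The element with an id becomes the subject; each child is one triple."""
    root = ET.fromstring(xml_text)
    subject = f"{BASE}{root.tag}/{root.get('id')}"
    return [(subject, BASE + child.tag, child.text) for child in root]


def table_to_triples(csv_text):
    """Each row becomes a subject; each non-empty cell is one triple."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        subject = f"{BASE}row/{row_number}"  # assumed subject naming scheme
        for column, value in row.items():
            if value:
                triples.append((subject, BASE + column, value))
    return triples


xml_triples = xml_to_triples(XML_SOURCE)
table_triples = table_to_triples(TABLE_SOURCE)
for triple in xml_triples + table_triples:
    print(triple)
```

Note that after import the two sources still use different predicate names (`employer` versus `Company`); reconciling them is a separate mapping step.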
Import Data into RDF
Tree: Canus > Dog > Collie, Beagle, Terrier; Canus > Wolf > Lone, Steppen
Genus | Species | Sub-species
Canus | Dog | Collie
Canus | Dog | Beagle
Canus | Dog | Terrier
Canus | Wolf | Steppen
Canus | Wolf | Lone
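The flat Genus/Species/Sub-species table and the tree carry the same information. One way to see this is to derive parent/child edges from each row, as in the sketch below. The `broader` predicate name is an assumption standing in for a hierarchy property such as `skos:broader`; the slides do not say which property is used.

```python
# Rows copied from the table above.
ROWS = [
    ("Canus", "Dog", "Collie"),
    ("Canus", "Dog", "Beagle"),
    ("Canus", "Dog", "Terrier"),
    ("Canus", "Wolf", "Steppen"),
    ("Canus", "Wolf", "Lone"),
]

BROADER = "broader"  # assumed predicate name, standing in for e.g. skos:broader


def rows_to_hierarchy(rows):
    """Turn each (genus, species, subspecies) row into child -> parent edges.

    A set is used so that the repeated genus/species pairs collapse to one edge.
    """
    edges = set()
    for genus, species, subspecies in rows:
        edges.add((species, BROADER, genus))
        edges.add((subspecies, BROADER, species))
    return edges


hierarchy = rows_to_hierarchy(ROWS)
for child, _, parent in sorted(hierarchy):
    print(f"{child} -> {parent}")
```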
Data Quality and Controlled Vocabularies
Data Quality and Controlled Vocabularies (cont)
Unstructured data and Controlled Vocabularies
Merging Data
Data mapping Style 1: Schema Mapping Examples
<Person id="3">
  <name>Irene Polikoff</name>
  <employer>TopQuadrant</employer>
  <position>CEO</position>
</Person>
Name | Address | Company | Title
Dean Allemang | 10 Downing St. | TopQuadrant | Chief Scientist
Michael Brodie | 14 Wysteria Lane | Verizon | Chief Scientist
Name=name, but Company=employer, Title=position
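The correspondences Name=name, Company=employer, and Title=position can be written down as a lookup table and applied to each row, so that records from both sources share one set of field names. This is only a sketch: the dictionary-based record format and the function name are assumptions, and the Address column is left unmapped because the XML record has no counterpart for it.

```python
# Column-to-element correspondences taken from the slide.
COLUMN_TO_ELEMENT = {
    "Name": "name",
    "Company": "employer",
    "Title": "position",
}


def normalise_row(row):
    """Rename mapped columns; keep unmapped ones (e.g. Address) unchanged."""
    return {COLUMN_TO_ELEMENT.get(column, column): value
            for column, value in row.items()}


table_row = {
    "Name": "Dean Allemang",
    "Address": "10 Downing St.",
    "Company": "TopQuadrant",
    "Title": "Chief Scientist",
}
xml_record = {
    "name": "Irene Polikoff",
    "employer": "TopQuadrant",
    "position": "CEO",
}

merged = [normalise_row(table_row), xml_record]

# After mapping, one field name answers the question for both sources.
employers = {record["employer"] for record in merged}
print(employers)
```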
Schema Mapping Examples (cont)
<rss:item ID="3">
  <wgs:lat>39.945345</wgs:lat>
  <wgs:long>-79.34524</wgs:long>
</rss:item>
<image src="doggie.jpg">
  <wgs:Point>
    <wgs:lat>39.945345</wgs:lat>
    <wgs:long>-79.34524</wgs:long>
  </wgs:Point>
</image>
<Entry>
  <position>39.945345,-79.34524</position>
</Entry>
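The three snippets above encode the same point in three shapes: direct `wgs:lat`/`wgs:long` children, the same pair nested in a `wgs:Point`, and a single comma-separated `position` string. A rough sketch of normalising all three to one `(lat, long)` pair follows. The slides do not declare the `rss:` and `wgs:` prefixes, so the namespace URIs below are assumptions (the commonly used W3C WGS84 and RSS 1.0 namespaces), as are the wrapper element and the function names.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the slide snippets leave the prefixes undeclared.
WGS = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RSS = "http://purl.org/rss/1.0/"
NS = {"wgs": WGS}

SNIPPETS = [
    '<rss:item ID="3"><wgs:lat>39.945345</wgs:lat>'
    '<wgs:long>-79.34524</wgs:long></rss:item>',
    '<image src="doggie.jpg"><wgs:Point><wgs:lat>39.945345</wgs:lat>'
    '<wgs:long>-79.34524</wgs:long></wgs:Point></image>',
    '<Entry><position>39.945345,-79.34524</position></Entry>',
]


def wrap(snippet):
    """Wrap a fragment in a root element that declares the prefixes."""
    return f'<root xmlns:wgs="{WGS}" xmlns:rss="{RSS}">{snippet}</root>'


def extract_point(snippet):
    """Return (lat, long) as floats, whichever of the three shapes is used."""
    root = ET.fromstring(wrap(snippet))
    lat = root.find(".//wgs:lat", NS)
    lon = root.find(".//wgs:long", NS)
    if lat is not None and lon is not None:
        return float(lat.text), float(lon.text)
    position = root.find(".//position")
    if position is not None:
        lat_text, lon_text = position.text.split(",")
        return float(lat_text), float(lon_text)
    return None


points = [extract_point(snippet) for snippet in SNIPPETS]
print(points)
```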
Schema mapping solutions:
Schema mapping solutions (cont)
Role of Standards in the Mapping
Data Mapping Style 2: Tagging or Sorting
<Bookmark href="http://www.topquadrant.com">
  <tag>Semantic Web</tag>
</Bookmark>
<System name="Central Bookkeeping">
  <Evaluation>
    <PerformanceMeasure>Quality</PerformanceMeasure>
    <Result>Fair</Result>
  </Evaluation>
</System>
Callouts: “That’s an FEA reference!” “Where does this come from?”
Role of Standards in the Mapping
Analysis and Display
Tags as Amalgamation
Diagram labels: FEA, DOI, GSA
If two sources use the same controlled vocabulary, they can be amalgamated along that dimension.
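Amalgamating along a shared vocabulary amounts to grouping items from different sources by the term they are tagged with. The sketch below illustrates this. All of the data in it is hypothetical: the item identifiers and the terms "Quality" and "Cost" are placeholders, not actual DOI, GSA, or FEA content.

```python
from collections import defaultdict

# Hypothetical tagged items; identifiers and terms are placeholders only.
DOI_ITEMS = [("doi-item-1", "Quality"), ("doi-item-2", "Cost")]
GSA_ITEMS = [("gsa-item-1", "Quality")]


def amalgamate(*sources):
    """Group (source, item) pairs by the shared vocabulary term they carry."""
    by_term = defaultdict(list)
    for source_name, items in sources:
        for item, term in items:
            by_term[term].append((source_name, item))
    return dict(by_term)


amalgamated = amalgamate(("DOI", DOI_ITEMS), ("GSA", GSA_ITEMS))
for term, members in amalgamated.items():
    print(term, members)
```

The grouping only works because both sources spell the term identically, which is what a controlled vocabulary is meant to guarantee.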
Mapping Columns
Model-driven displays
SELECT ?lat ?long
WHERE {
  ?item a :DisplayLocation .
  ?item geo:lat ?lat .
  ?item geo:long ?long .
}
Name | latitude | longitude
Slausen | -171.3 | 38.4
Union | -171.4 | 38.2
Vine | -170.9 | 37.9
McArthur | -170.4 | 38.1
Anaheim | -171.3 | 38.2
Chinatown | -171.1 | 38.5
Beverly | -171.3 | 38.1
Diagram (labels only): nodes Station, latitude, longitude, geo:lat, geo:long, :DisplayLocation; edges labeled domain, subPropertyOf, subClassOf.
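The query above asks only for `:DisplayLocation` items with `geo:lat` and `geo:long`, while the station table uses its own column names. A model can bridge the two. The sketch below assumes a particular reading of the diagram, which the surviving labels do not confirm: `latitude` and `longitude` are sub-properties of `geo:lat` and `geo:long`, and `Station` is a subclass of `:DisplayLocation`. It applies one round of RDFS-style inference in plain Python and then evaluates the equivalent of the query. Function names are illustrative.

```python
# Assumed model edges (one reading of the diagram, not confirmed by the slide).
SUBPROPERTY_OF = {"latitude": "geo:lat", "longitude": "geo:long"}
SUBCLASS_OF = {"Station": ":DisplayLocation"}

# First three rows of the table, values as given on the slide.
STATIONS = [
    ("Slausen", -171.3, 38.4),
    ("Union", -171.4, 38.2),
    ("Vine", -170.9, 37.9),
]


def rows_to_triples(rows):
    """State each row using the table's own class and column names."""
    triples = set()
    for name, latitude, longitude in rows:
        triples.add((name, "a", "Station"))
        triples.add((name, "latitude", latitude))
        triples.add((name, "longitude", longitude))
    return triples


def infer(triples):
    """Add the triples implied by the subPropertyOf and subClassOf edges."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in SUBPROPERTY_OF:
            inferred.add((subject, SUBPROPERTY_OF[predicate], obj))
        if predicate == "a" and obj in SUBCLASS_OF:
            inferred.add((subject, "a", SUBCLASS_OF[obj]))
    return inferred


def display_locations(triples):
    """Plain-Python equivalent of the SELECT ?lat ?long query."""
    items = sorted(s for s, p, o in triples
                   if p == "a" and o == ":DisplayLocation")
    results = []
    for item in items:
        lats = [o for s, p, o in triples if s == item and p == "geo:lat"]
        longs = [o for s, p, o in triples if s == item and p == "geo:long"]
        results.extend((lat, lon) for lat in lats for lon in longs)
    return results


raw = rows_to_triples(STATIONS)
print(display_locations(raw))         # nothing matches before inference
print(display_locations(infer(raw)))  # stations match once the model is applied
```

The display code never mentions stations; adding another kind of locatable data would mean adding model edges rather than changing the query.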
Exercises
