Converts' rally, Evangelistic Committee of New York City, Carnegie Hall, Sept.14, 1908
Linked Open Data
Six Ingredients
The missing ★
Mix ‘n Mash
Contextualize!
Choose your Grain Size
Lower the Threshold
Repeatable Transformation
1
The missing ★
http://give.everything/a/URI
HTTPs URIs only please! (or resolver + URN)
Version information
Version agnostic
Guessable
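The three properties on this slide (version information, a version-agnostic form, guessability) can be illustrated with a small URI-minting helper. This is only a sketch: the base URI `https://data.example.org`, the `/id/<kind>/<local_id>` path layout and the function name `mint_uris` are assumptions made for the example, not something the slides prescribe.

```python
from urllib.parse import quote


def mint_uris(base, kind, local_id, version):
    """Return a (version-agnostic, versioned) pair of HTTP URIs.

    The path layout /id/<kind>/<local_id>[/<version>] is a hypothetical
    convention chosen for this example.
    """
    # Percent-encode each path segment so the pattern stays predictable.
    stem = "/".join(
        [base.rstrip("/"), "id", quote(kind, safe=""), quote(local_id, safe="")]
    )
    return stem, stem + "/" + quote(version, safe="")


agnostic, versioned = mint_uris(
    "https://data.example.org", "dataset", "air-quality", "2014-06-01"
)
print(agnostic)   # https://data.example.org/id/dataset/air-quality
print(versioned)  # https://data.example.org/id/dataset/air-quality/2014-06-01
```

The version-agnostic URI can then point to whichever version is current, while the versioned URI keeps identifying one fixed state of the data.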
2
Repeatable Transformation
Transformation should be part of routine ...
... manageable and scalable...
... repeatable ...
Linked Data will not be the official source anytime soon
http://www.w3.org/TR/prov-overview/
Provenance is key
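A repeatable transformation can be sketched as a deterministic function plus a small provenance record of what went in and what came out. The example below is a minimal illustration under stated assumptions: the CSV layout, the `https://data.example.org` namespace and the record fields are hypothetical, and literal values are not escaped, so it is only adequate for simple strings.

```python
import csv
import hashlib
import io


def transform(csv_text, base="https://data.example.org/id/row/"):
    """Convert CSV rows to N-Triples lines in a deterministic order."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base}{row['id']}>"
        # Sorting the columns keeps the output stable between runs.
        for column in sorted(k for k in row if k != "id"):
            predicate = f"<https://data.example.org/def/{column}>"
            lines.append(f'{subject} {predicate} "{row[column]}" .')
    return "\n".join(lines) + "\n"


def run_with_provenance(csv_text):
    """Run the transformation and record hashes of its input and output."""
    output = transform(csv_text)
    record = {
        "input_sha256": hashlib.sha256(csv_text.encode("utf-8")).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
        "transformation": "transform",
    }
    return output, record


source = "id,name\n1,Amsterdam\n2,Utrecht\n"
first = run_with_provenance(source)
second = run_with_provenance(source)
print(first[1] == second[1])  # True
```

Because the same input always yields the same output hash, a consumer can check a published Linked Data version against the official source it was derived from.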
3
Choose your Grain Size
• The document is the traditional grain size (Dublin Core)
• Linked data allows for deep links into data
• Cost versus usefulness
• Are you the right party to provide detailed descriptions?
http://creatingandeducating.blogspot.nl/2011/11/blog-post.html
4
Mix ‘n Mash
• Multiple vocabularies won’t bite
• Multiple identifiers won’t bite!
• Choose what’s useful for you...
• ... then map to others!
Image © David Sykes 2009 All rights reserved
Good News: the bulk has already been done for you!
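Mapping your own identifiers to those of others can be as simple as publishing a set of link triples. The sketch below emits `owl:sameAs` links in N-Triples; the local URI under `https://data.example.org` is hypothetical, and `owl:sameAs` is just one possible mapping property (a weaker one such as `skos:closeMatch` may fit better when the two identifiers are not strictly interchangeable).

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical local identifiers mapped to identifiers others already use.
MAPPINGS = {
    "https://data.example.org/id/city/amsterdam": [
        "http://dbpedia.org/resource/Amsterdam",
    ],
}


def mapping_triples(mappings):
    """Yield one N-Triples line per (local, external) identifier pair."""
    for local, externals in sorted(mappings.items()):
        for external in sorted(externals):
            yield f"<{local}> <{OWL_SAME_AS}> <{external}> ."


for line in mapping_triples(MAPPINGS):
    print(line)
```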
5
• Information is not always compatible
• Make explicit in which context the information holds ...
• ... and who stated the information, why and how.
Contextualize!
Flat Earth and Square Earth idea courtesy of Szymon Klarman
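One way to make context explicit is to store each statement together with the context in which it holds, and to record who stated each context. The sketch below uses plain Python tuples in the spirit of RDF named graphs; the `ex:` names and the two sources are hypothetical, and the flat and square Earth statements only echo the slide's illustration.

```python
from collections import namedtuple

Quad = namedtuple("Quad", "subject predicate obj context")

# Two incompatible statements, each kept in its own context.
QUADS = [
    Quad("ex:Earth", "ex:shape", "flat", "ex:contextA"),
    Quad("ex:Earth", "ex:shape", "square", "ex:contextB"),
]

# Who stated the content of each context (hypothetical sources).
STATED_BY = {"ex:contextA": "ex:sourceA", "ex:contextB": "ex:sourceB"}


def holds_in(quads, context):
    """Return the (subject, predicate, object) statements of one context."""
    return [(q.subject, q.predicate, q.obj) for q in quads if q.context == context]


for context, source in sorted(STATED_BY.items()):
    print(context, "stated by", source, "->", holds_in(QUADS, context))
```

Both statements can coexist in the same store because neither is asserted without its context.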
Data2Semantics: From Data to Semantics for Scientific Data Publishers
Photo by Filip Dujardin, http://www.filipdujardin.be
Provenance and Reuse of Open Data
Rinke Hoekstra

VU University Amsterdam/University of Amsterdam
rinke.hoekstra@vu.nl
Definition (Oxford English Dictionary)
• The fact of coming from some particular source or quarter;
origin, derivation;
• the history or pedigree of a work of art, manuscript, rare
book, etc.;
• concretely, a record of the passage of an item through its
various owners.
Making trust judgements
Liability, trust and privacy in open government data
Compliance and auditing of business processes
Licensing and attribution of combined information
Curt Tilmes, Peter Fox, Xiaogang Ma, Deborah L. McGuinness, Ana Pinheiro Privette, Aaron Smith, Anne Waple,
Stephan Zednik, Jinguang Zheng: Provenance Representation for the National Climate Assessment in the Global
Change Information System. IEEE T. Geoscience and Remote Sensing 51(11): 5160-5168 (2013)
Integrated & Summarized Data
Transparency and Trust
“Provenance is the number one
issue that we face when publishing
government data in data.gov.uk”
John Sheridan, UK National Archives, data.gov.uk
Provenance?
• Provenance = Metadata?
Provenance can be seen as metadata, but not all metadata is provenance
• Provenance = Trust?
Provenance provides a substrate for deriving different trust metrics
• Provenance = Authentication?
Provenance records can be used to verify and authenticate amongst users
Three Dimensions
• Content
Capturing and representing provenance information (recording, annotating, workflows)
• Management
Storing, querying, and accessing provenance information (scalability, interoperability)
• Use
Interpreting and understanding provenance in practice (trust, accountability, compliance, explanation, debugging)
Standardization
W3C PROV Standard
Provenance is a record that describes the people, institutions, entities, and activities, involved in producing, influencing, or delivering a piece of data or a thing.
http://www.w3.org/TR/prov-overview
Luc Moreau & Paul Groth
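The definition above maps onto the core PROV-O classes (`prov:Entity`, `prov:Activity`, `prov:Agent`) and the relations between them. The sketch below builds such a record as plain triples and prints them as N-Triples. The `https://data.example.org/id/` namespace and the four resource names are hypothetical; the `prov:` terms are taken from the W3C PROV-O vocabulary.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://data.example.org/id/"  # hypothetical namespace


def prov_record(dataset, source, activity, agent):
    """Return PROV-O triples linking a dataset to its source, activity and agent."""
    d, s, a, g = (EX + name for name in (dataset, source, activity, agent))
    return [
        (d, RDF_TYPE, PROV + "Entity"),
        (s, RDF_TYPE, PROV + "Entity"),
        (a, RDF_TYPE, PROV + "Activity"),
        (g, RDF_TYPE, PROV + "Agent"),
        (a, PROV + "used", s),               # the activity consumed the source
        (d, PROV + "wasGeneratedBy", a),     # and produced the dataset
        (d, PROV + "wasDerivedFrom", s),
        (a, PROV + "wasAssociatedWith", g),  # the agent was responsible
        (d, PROV + "wasAttributedTo", g),
    ]


for triple in prov_record("linked-data", "spreadsheet", "conversion", "publisher"):
    print("<{}> <{}> <{}> .".format(*triple))
```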
http://doc.metalex.eu
http://yasgui.data2semantics.org
Interpretation
Naive Approaches
InProv: Visualizing Provenance Graphs with Radial Layouts and Time-Based Hierarchical Grouping

Madelaine D. Boyd - http://www.seas.harvard.edu/sites/default/files/files/archived/Boyd.pdf
Orbiter has several limitations. It does not have capabilities for query subgraph
highlighting, regular expression filters, process grouping, annotations, or programmable views[16].
Furthermore, the structure of each summary node, where child nodes are grouped within
parents and are hidden until the parent is expanded, benefits queries earlier in the
dependency chain. Initial overviews often correspond with system bootup, and appear very similar
across different traces (time slices of system activity).
Figure 10: In these screenshots of Orbiter, the presence of edges overwhelms the visibility of
nodes. By relying on a node-link graph layout and using spatial location to encode object
relationships, Orbiter’s graph layout algorithm must draw many long edges to
communicate node connections. Without edge bundling or opacity variation, the meanings of these
relationships are obscured.
Another one of Orbiter’s weaknesses is its node-link diagram layout. As a result, each
node’s position in the X-Y plane and the length and angle of connecting lines are wasted
attributes. The chosen graph layout algorithm (dot by default) arranges nodes to minimize
Figure 11: (Top): A screenshot of the portion of the graph generated by GraphViz for a
trace of the third provenance challenge. (Bottom): A zoomed-in view of the same graph.
The horizontal black bars across the images are dense collections of edges.
Effective large graph visualizations present the user with a summary view that can be
explored, filtered, and expanded interactively.
2.5 Tree Visualization
While trees are a subcategory of graphs, because of their hierarchical composition, tree
visualization forms its own subfield of research. A survey of over two-hundred tree visualizations
is given at Hans-Jörg Schulz’s treevis.net. Visitors can narrow down by dimensionality
(2D, 3D, or mixed), representation (explicit node-link diagram, implicit treemap, or
combination), alignment (XY plot, radial layout, or free diagram)[55]. These categories are shown
Figure 12: Left: Pajek uses various summary node-link and matrix-based representations
depending on the structure of the supplied data set. Pictured is a main core subgraph
extracted from routing data on the Internet. Right: TopoLayout optimizes the choice of
visualization display depending on the underlying graph structure. The right column is
TopoLayout’s output, while the left and middle columns are the outputs of the GRIP and
FM graph layout algorithms.
Figure 13: treevis.net defines different categories for tree maps. Tree maps can be
categorized by dimensionality (2D, 3D, or mixed), representation (explicit, implicit, or mixed),
or alignment (XY, radial, or spring).
Tree visualizations are either explicit or implicit. Explicit representations resemble
node-link diagrams. An example of an implicit representation is a tree map, a diagram where the
entire tree is inscribed in a rectangle representing the root node. This root is subdivided
hierarchically into more rectangles, which represent child nodes, and each child node is
subdivided into more child nodes. Treemaps are excellent for displaying hierarchical or
categorical data[57]. One famous example, shown in Figure 14, is the “Map of the Market”
from SmartMoney.com, which displays in red and green the changes in market value of
publicly-traded companies, grouped by market sector, with cell size proportional to market
capitalization[64].
TreePlus is an example of a tree-inspired graph visualization tool (Figure 15). It uses
the guiding metaphor of “plant a seed to watch it grow” to summarize navigation of its tree-
Width of activities and entities is based on information flow
Activities and entities are extracted from an ego graph
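An ego graph is the part of a graph that lies within a fixed number of hops of one chosen node. The slides do not say how the extraction is implemented, so the sketch below is a generic breadth-first version; the function name, the `radius` parameter and the choice to ignore edge direction when measuring distance are all assumptions.

```python
from collections import deque


def ego_graph(edges, center, radius=1):
    """Return the edges among nodes within `radius` hops of `center`.

    Edge direction is ignored when measuring distance.
    """
    neighbours = {}
    for source, target in edges:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)

    # Breadth-first search that stops expanding at the radius.
    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue
        for other in neighbours.get(node, ()):
            if other not in distance:
                distance[other] = distance[node] + 1
                queue.append(other)

    # Keep only edges whose endpoints are both inside the neighbourhood.
    return [(s, t) for s, t in edges if s in distance and t in distance]


edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(ego_graph(edges, "b", radius=1))  # [('a', 'b'), ('b', 'c')]
```

Restricting a provenance graph to the neighbourhood of one activity or entity keeps the visualization small enough to read.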
Capturing
We need an intuitive REST-like API to integrate Open
Government data. Dealing with all these different formats
and identifiers is really taking too much time.
I have all this data, and I want to make (part of) it
available for the general public, but haven't a clue how!
Civil Servant
wants to publish data
Application Developers
want to consume data
Apps and applications
Visual interactions with Open Data.
Application specific logics (e.g. 'danger')
CitySDK API
HTTP API to the CitySDK
Returns JSON, Turtle, etc.
(includes the Linked Data API of CitySDK)
SPARQL API
SPARQL Endpoint to the Linked Data storage of the ODE
Partial Synchronisation
CitySDK Datastores Linked Data Triplestore
Feed into
Query
Orchestrator
Amsterdam Open Data Exchange
HTTP API to `canned queries' across multiple datasets.
Returns JSON-LD, Turtle
Data Integrator
ODE Best Practices
Best practices for publishing Open Data
CitySDK Ingestion Plugins
"Standard" adapters part of CitySDK
ODE Ingestion Adapters
Ingestion adapters developed within ODE
Municipal Legacy Systems Excel Files
Amsterdam Open Data CKAN
Amsterdam Open Data Catalog
Will point to datasets in the ODE
May provide a direct query interface on top of ODE
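For an application developer, the SPARQL API in the architecture above would be reached through the standard SPARQL 1.1 Protocol: an HTTP GET with a `query` parameter and an `Accept` header naming the desired result format. The sketch below only builds such a request and does not send it. The endpoint URL is hypothetical, and the query assumes datasets are described with DCAT and Dublin Core terms, which the slides do not state.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://ode.example.org/sparql"  # hypothetical endpoint

# Assumes datasets are typed dcat:Dataset and titled with dcterms:title.
QUERY = """
SELECT ?dataset ?title WHERE {
  ?dataset a <http://www.w3.org/ns/dcat#Dataset> ;
           <http://purl.org/dc/terms/title> ?title .
} LIMIT 10
"""


def build_request(endpoint, query, accept="application/sparql-results+json"):
    """Build (but do not send) a SPARQL 1.1 Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


request = build_request(ENDPOINT, QUERY)
print(request.get_method())  # GET
```

Passing the request to `urllib.request.urlopen` would execute it against a real endpoint; a `canned query' API such as the one in the diagram could wrap exactly this kind of call behind a simpler URL.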
Wrapper-based
Workflow-based
Tom de Nies (Ghent University)
Sara Magliacane (VU University Amsterdam)
Integrated
The Big Future of Data

2 October 2014
Enrich Publish Analyze
Semantic Publication of Data
Publish directly from the cloud to the cloud
On-the-fly analysis and tag suggestion
Interactive Data Construction via Instrumented IPython Notebook
Integration in popular tool
No “green field”
Visual Exploration of Big Data
Virtualisation
Discover patterns
Interactive visualisation
Sparse and heterogeneous
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)

Contenu connexe

Tendances

Knowledge Graph Maintenance
Knowledge Graph MaintenanceKnowledge Graph Maintenance
Knowledge Graph MaintenancePaul Groth
 
Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Paul Groth
 
From Data Search to Data Showcasing
From Data Search to Data ShowcasingFrom Data Search to Data Showcasing
From Data Search to Data ShowcasingPaul Groth
 
Thinking About the Making of Data
Thinking About the Making of DataThinking About the Making of Data
Thinking About the Making of DataPaul Groth
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflowsSSSW
 
End-to-End Learning for Answering Structured Queries Directly over Text
End-to-End Learning for  Answering Structured Queries Directly over Text End-to-End Learning for  Answering Structured Queries Directly over Text
End-to-End Learning for Answering Structured Queries Directly over Text Paul Groth
 
Thoughts on Knowledge Graphs & Deeper Provenance
Thoughts on Knowledge Graphs  & Deeper ProvenanceThoughts on Knowledge Graphs  & Deeper Provenance
Thoughts on Knowledge Graphs & Deeper ProvenancePaul Groth
 
Oop principles a good book
Oop principles a good bookOop principles a good book
Oop principles a good booklahorisher
 
From Text to Data to the World: The Future of Knowledge Graphs
From Text to Data to the World: The Future of Knowledge GraphsFrom Text to Data to the World: The Future of Knowledge Graphs
From Text to Data to the World: The Future of Knowledge GraphsPaul Groth
 
Knowledge Graph Futures
Knowledge Graph FuturesKnowledge Graph Futures
Knowledge Graph FuturesPaul Groth
 
More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?Paul Groth
 
Describing Scholarly Contributions semantically with the Open Research Knowle...
Describing Scholarly Contributions semantically with the Open Research Knowle...Describing Scholarly Contributions semantically with the Open Research Knowle...
Describing Scholarly Contributions semantically with the Open Research Knowle...Sören Auer
 
The need for a transparent data supply chain
The need for a transparent data supply chainThe need for a transparent data supply chain
The need for a transparent data supply chainPaul Groth
 
The Challenge of Deeper Knowledge Graphs for Science
The Challenge of Deeper Knowledge Graphs for ScienceThe Challenge of Deeper Knowledge Graphs for Science
The Challenge of Deeper Knowledge Graphs for SciencePaul Groth
 
Knowledge graphs ilaria maresi the hyve 23apr2020
Knowledge graphs   ilaria maresi the hyve 23apr2020Knowledge graphs   ilaria maresi the hyve 23apr2020
Knowledge graphs ilaria maresi the hyve 23apr2020Pistoia Alliance
 
SemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic WebSemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic WebAdrian Paschke
 
Sources of Change in Modern Knowledge Organization Systems
Sources of Change in Modern Knowledge Organization SystemsSources of Change in Modern Knowledge Organization Systems
Sources of Change in Modern Knowledge Organization SystemsPaul Groth
 
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge GraphsCombining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge GraphsPaul Groth
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET Journal
 

Tendances (20)

Knowledge Graph Maintenance
Knowledge Graph MaintenanceKnowledge Graph Maintenance
Knowledge Graph Maintenance
 
Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.
 
From Data Search to Data Showcasing
From Data Search to Data ShowcasingFrom Data Search to Data Showcasing
From Data Search to Data Showcasing
 
Thinking About the Making of Data
Thinking About the Making of DataThinking About the Making of Data
Thinking About the Making of Data
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflows
 
End-to-End Learning for Answering Structured Queries Directly over Text
End-to-End Learning for  Answering Structured Queries Directly over Text End-to-End Learning for  Answering Structured Queries Directly over Text
End-to-End Learning for Answering Structured Queries Directly over Text
 
Thoughts on Knowledge Graphs & Deeper Provenance
Thoughts on Knowledge Graphs  & Deeper ProvenanceThoughts on Knowledge Graphs  & Deeper Provenance
Thoughts on Knowledge Graphs & Deeper Provenance
 
Cognitive data
Cognitive dataCognitive data
Cognitive data
 
Oop principles a good book
Oop principles a good bookOop principles a good book
Oop principles a good book
 
From Text to Data to the World: The Future of Knowledge Graphs
From Text to Data to the World: The Future of Knowledge GraphsFrom Text to Data to the World: The Future of Knowledge Graphs
From Text to Data to the World: The Future of Knowledge Graphs
 
Knowledge Graph Futures
Knowledge Graph FuturesKnowledge Graph Futures
Knowledge Graph Futures
 
More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?
 
Describing Scholarly Contributions semantically with the Open Research Knowle...
Describing Scholarly Contributions semantically with the Open Research Knowle...Describing Scholarly Contributions semantically with the Open Research Knowle...
Describing Scholarly Contributions semantically with the Open Research Knowle...
 
The need for a transparent data supply chain
The need for a transparent data supply chainThe need for a transparent data supply chain
The need for a transparent data supply chain
 
The Challenge of Deeper Knowledge Graphs for Science
The Challenge of Deeper Knowledge Graphs for ScienceThe Challenge of Deeper Knowledge Graphs for Science
The Challenge of Deeper Knowledge Graphs for Science
 
Knowledge graphs ilaria maresi the hyve 23apr2020
Knowledge graphs   ilaria maresi the hyve 23apr2020Knowledge graphs   ilaria maresi the hyve 23apr2020
Knowledge graphs ilaria maresi the hyve 23apr2020
 
SemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic WebSemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic Web
 
Sources of Change in Modern Knowledge Organization Systems
Sources of Change in Modern Knowledge Organization SystemsSources of Change in Modern Knowledge Organization Systems
Sources of Change in Modern Knowledge Organization Systems
 
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge GraphsCombining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
 

En vedette

Provenance Information in the Web of Data
Provenance Information in the Web of DataProvenance Information in the Web of Data
Provenance Information in the Web of DataOlaf Hartig
 
The Structured Data Hub in 2019
The Structured Data Hub in 2019The Structured Data Hub in 2019
The Structured Data Hub in 2019Richard Zijdeman
 
Advancing the comparability of occupational data through Linked Open Data
Advancing the comparability of occupational data through Linked Open DataAdvancing the comparability of occupational data through Linked Open Data
Advancing the comparability of occupational data through Linked Open DataRichard Zijdeman
 
Historical occupational classification and occupational stratification schemes
Historical occupational classification and occupational stratification schemesHistorical occupational classification and occupational stratification schemes
Historical occupational classification and occupational stratification schemesRichard Zijdeman
 
Introduction into R for historians (part 4: data manipulation)
Introduction into R for historians (part 4: data manipulation)Introduction into R for historians (part 4: data manipulation)
Introduction into R for historians (part 4: data manipulation)Richard Zijdeman
 
Labour force participation of married women, US 1860-2010
Labour force participation of married women, US 1860-2010Labour force participation of married women, US 1860-2010
Labour force participation of married women, US 1860-2010Richard Zijdeman
 
Using Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality AssessmentUsing Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality AssessmentOlaf Hartig
 
QBer - Connect your data to the cloud
QBer - Connect your data to the cloudQBer - Connect your data to the cloud
QBer - Connect your data to the cloudRinke Hoekstra
 

En vedette (9)

Provenance Information in the Web of Data
Provenance Information in the Web of DataProvenance Information in the Web of Data
Provenance Information in the Web of Data
 
The Structured Data Hub in 2019
The Structured Data Hub in 2019The Structured Data Hub in 2019
The Structured Data Hub in 2019
 
Csdh sbg clariah_intr01
Csdh sbg clariah_intr01Csdh sbg clariah_intr01
Csdh sbg clariah_intr01
 
Advancing the comparability of occupational data through Linked Open Data
Advancing the comparability of occupational data through Linked Open DataAdvancing the comparability of occupational data through Linked Open Data
Advancing the comparability of occupational data through Linked Open Data
 
Historical occupational classification and occupational stratification schemes
Historical occupational classification and occupational stratification schemesHistorical occupational classification and occupational stratification schemes
Historical occupational classification and occupational stratification schemes
 
Introduction into R for historians (part 4: data manipulation)
Introduction into R for historians (part 4: data manipulation)Introduction into R for historians (part 4: data manipulation)
Introduction into R for historians (part 4: data manipulation)
 
Labour force participation of married women, US 1860-2010
Labour force participation of married women, US 1860-2010Labour force participation of married women, US 1860-2010
Labour force participation of married women, US 1860-2010
 
Using Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality AssessmentUsing Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality Assessment
 
QBer - Connect your data to the cloud
QBer - Connect your data to the cloudQBer - Connect your data to the cloud
QBer - Connect your data to the cloud
 

Similaire à Provenance and Reuse of Open Data (PILOD 2.0 June 2014)

Back to Basics - Firmware in NFV security
Back to Basics - Firmware in NFV securityBack to Basics - Firmware in NFV security
Back to Basics - Firmware in NFV securityLilminow
 
Big data visualization state of the art
Big data visualization state of the artBig data visualization state of the art
Big data visualization state of the artsoria musa
 
BigData Visualization and Usecase@TDGA-Stelligence-11july2019-share
BigData Visualization and Usecase@TDGA-Stelligence-11july2019-shareBigData Visualization and Usecase@TDGA-Stelligence-11july2019-share
BigData Visualization and Usecase@TDGA-Stelligence-11july2019-sharestelligence
 
Final Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_SharmilaFinal Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_SharmilaNithin Kakkireni
 
Implementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataImplementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataeSAT Publishing House
 
Safeguarding Abila through Multiple Data Perspectives
Safeguarding Abila through Multiple Data PerspectivesSafeguarding Abila through Multiple Data Perspectives
Safeguarding Abila through Multiple Data PerspectivesParang Saraf
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningEditor IJCATR
 
Components of gis
Components of gisComponents of gis
Components of gisPramoda Raj
 
Introduction to RAGLD
Introduction to RAGLDIntroduction to RAGLD
Introduction to RAGLDragld
 
TCS_DATA_ANALYSIS_REPORT_ADITYA
TCS_DATA_ANALYSIS_REPORT_ADITYATCS_DATA_ANALYSIS_REPORT_ADITYA
TCS_DATA_ANALYSIS_REPORT_ADITYAAditya Srinivasan
 
I0324053055
I0324053055I0324053055
I0324053055theijes
 
Vision Based Deep Web data Extraction on Nested Query Result Records
Vision Based Deep Web data Extraction on Nested Query Result RecordsVision Based Deep Web data Extraction on Nested Query Result Records
Vision Based Deep Web data Extraction on Nested Query Result RecordsIJMER
 
Cross Domain Data Fusion
Cross Domain Data FusionCross Domain Data Fusion
Cross Domain Data FusionIRJET Journal
 
Big data - what, why, where, when and how
Big data - what, why, where, when and howBig data - what, why, where, when and how
Big data - what, why, where, when and howbobosenthil
 
Optimizing Your Supply Chain with Neo4j
Optimizing Your Supply Chain with Neo4jOptimizing Your Supply Chain with Neo4j
Optimizing Your Supply Chain with Neo4jNeo4j
 
Graph Databases and Graph Data Science in Neo4j
Graph Databases and Graph Data Science in Neo4jGraph Databases and Graph Data Science in Neo4j
Graph Databases and Graph Data Science in Neo4jijtsrd
 

Similaire à Provenance and Reuse of Open Data (PILOD 2.0 June 2014) (20)

Back to Basics - Firmware in NFV security
Back to Basics - Firmware in NFV securityBack to Basics - Firmware in NFV security
Back to Basics - Firmware in NFV security
 
Big data visualization state of the art
Big data visualization state of the artBig data visualization state of the art
Big data visualization state of the art
 
Cal Essay
Cal EssayCal Essay
Cal Essay
 
BigData Visualization and Usecase@TDGA-Stelligence-11july2019-share
BigData Visualization and Usecase@TDGA-Stelligence-11july2019-shareBigData Visualization and Usecase@TDGA-Stelligence-11july2019-share
BigData Visualization and Usecase@TDGA-Stelligence-11july2019-share
 
Final Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_SharmilaFinal Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_Sharmila
 
Implementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataImplementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big data
 
Big Data
Big DataBig Data
Big Data
 
Data visualization
Data visualizationData visualization
Data visualization
 
Safeguarding Abila through Multiple Data Perspectives
Safeguarding Abila through Multiple Data PerspectivesSafeguarding Abila through Multiple Data Perspectives
Safeguarding Abila through Multiple Data Perspectives
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data Mining
 
Components of gis
Components of gisComponents of gis
Components of gis
 
Introduction to RAGLD
Introduction to RAGLDIntroduction to RAGLD
Introduction to RAGLD
 
TCS_DATA_ANALYSIS_REPORT_ADITYA
TCS_DATA_ANALYSIS_REPORT_ADITYATCS_DATA_ANALYSIS_REPORT_ADITYA
TCS_DATA_ANALYSIS_REPORT_ADITYA
 
Open Data Convergence
Open Data ConvergenceOpen Data Convergence
Open Data Convergence
 
I0324053055
I0324053055I0324053055
I0324053055
 
Vision Based Deep Web data Extraction on Nested Query Result Records
Vision Based Deep Web data Extraction on Nested Query Result RecordsVision Based Deep Web data Extraction on Nested Query Result Records
Vision Based Deep Web data Extraction on Nested Query Result Records
 
Cross Domain Data Fusion
Cross Domain Data FusionCross Domain Data Fusion
Cross Domain Data Fusion
 
Big data - what, why, where, when and how
Big data - what, why, where, when and howBig data - what, why, where, when and how
Big data - what, why, where, when and how
 
Optimizing Your Supply Chain with Neo4j
Optimizing Your Supply Chain with Neo4jOptimizing Your Supply Chain with Neo4j
Optimizing Your Supply Chain with Neo4j
 
Graph Databases and Graph Data Science in Neo4j
Graph Databases and Graph Data Science in Neo4jGraph Databases and Graph Data Science in Neo4j
Graph Databases and Graph Data Science in Neo4j
 

Plus de Rinke Hoekstra

Jurix 2014 welcome presentation
Jurix 2014 welcome presentationJurix 2014 welcome presentation
Jurix 2014 welcome presentationRinke Hoekstra
 
Linkitup: Link Discovery for Research Data
Linkitup: Link Discovery for Research DataLinkitup: Link Discovery for Research Data
Linkitup: Link Discovery for Research DataRinke Hoekstra
 
A Network Analysis of Dutch Regulations - Using the Metalex Document Server
A Network Analysis of Dutch Regulations - Using the Metalex Document ServerA Network Analysis of Dutch Regulations - Using the Metalex Document Server
A Network Analysis of Dutch Regulations - Using the Metalex Document ServerRinke Hoekstra
 
Linked (Open) Data - But what does it buy me?
Linked (Open) Data - But what does it buy me?Linked (Open) Data - But what does it buy me?
Linked (Open) Data - But what does it buy me?Rinke Hoekstra
 
Linked Science - Building a Web of Research Data
Linked Science - Building a Web of Research DataLinked Science - Building a Web of Research Data
Linked Science - Building a Web of Research DataRinke Hoekstra
 
Semantic Representations for Research
Semantic Representations for ResearchSemantic Representations for Research
Semantic Representations for ResearchRinke Hoekstra
 
A Slightly Different Web of Data
A Slightly Different Web of DataA Slightly Different Web of Data
A Slightly Different Web of DataRinke Hoekstra
 
The Knowledge Reengineering Bottleneck
The Knowledge Reengineering BottleneckThe Knowledge Reengineering Bottleneck
The Knowledge Reengineering BottleneckRinke Hoekstra
 
Concept- en Definitie Extractie
Concept- en Definitie ExtractieConcept- en Definitie Extractie
Concept- en Definitie ExtractieRinke Hoekstra
 
SIKS 2011 Semantic Web Languages
SIKS 2011 Semantic Web LanguagesSIKS 2011 Semantic Web Languages
SIKS 2011 Semantic Web LanguagesRinke Hoekstra
 
The MetaLex Document Server - Legal Documents as Versioned Linked Data
The MetaLex Document Server - Legal Documents as Versioned Linked DataThe MetaLex Document Server - Legal Documents as Versioned Linked Data
The MetaLex Document Server - Legal Documents as Versioned Linked DataRinke Hoekstra
 
Querying the Web of Data
Querying the Web of DataQuerying the Web of Data
Querying the Web of DataRinke Hoekstra
 
History of Knowledge Representation (SIKS Course 2010)
History of Knowledge Representation (SIKS Course 2010)History of Knowledge Representation (SIKS Course 2010)
History of Knowledge Representation (SIKS Course 2010)Rinke Hoekstra
 
Making Sense of Design Patterns
Making Sense of Design PatternsMaking Sense of Design Patterns
Making Sense of Design PatternsRinke Hoekstra
 
Publicatie van Linked Open Overheids Data
Publicatie van Linked Open Overheids DataPublicatie van Linked Open Overheids Data
Publicatie van Linked Open Overheids DataRinke Hoekstra
 
ODaF 2010 Linked Data in the Netherlands
ODaF 2010 Linked Data in the NetherlandsODaF 2010 Linked Data in the Netherlands
ODaF 2010 Linked Data in the NetherlandsRinke Hoekstra
 
Overzicht BEST Project - NWO Site Visit
Overzicht BEST Project - NWO Site VisitOverzicht BEST Project - NWO Site Visit
Overzicht BEST Project - NWO Site VisitRinke Hoekstra
 
Semantic Modelling using Semantic Web Technology
Semantic Modelling using Semantic Web TechnologySemantic Modelling using Semantic Web Technology
Semantic Modelling using Semantic Web TechnologyRinke Hoekstra
 

Provenance and Reuse of Open Data (PILOD 2.0 June 2014)

  • 1. Converts' rally, Evangelistic Committee of New York City, Carnegie Hall, Sept.14, 1908
  • 3. Linked Open Data: Six Ingredients. The missing ★; Mix ‘n Mash; Contextualize!; Choose your Grain Size; Lower the Threshold; Repeatable Transformation
  • 4. 1. The missing ★: http://give.everything/a/URI HTTPs URIs only please! (or resolver + URN). Version information; version agnostic; guessable
  • 5. 2. Repeatable Transformation: Transformation should be part of routine ... manageable and scalable ... repeatable ... Linked Data will not be the official source anytime soon. Provenance is key: http://www.w3.org/TR/prov-overview/
  • 6. 3. Choose your Grain Size: • The document is the traditional grain size (Dublin Core) • Linked data allows for deep links into data • Cost versus usefulness • Are you the right party to provide detailed descriptions? http://creatingandeducating.blogspot.nl/2011/11/blog-post.html
  • 7. 4. Mix ‘n Mash: • Multiple vocabularies won’t bite • Multiple identifiers won’t bite! • Choose what’s useful for you... • ... then map to others! Good News: the bulk has already been done for you! Image © David Sykes 2009 All rights reserved
  • 8. 5. Contextualize!: • Information is not always compatible • Make explicit in which context the information holds ... • ... and who stated the information, why and how. Flat Earth and Square Earth idea courtesy of Szymon Klarman
  • 11. Data2Semantics: From Data to Semantics for Scientific Data Publishers
  • 13. Photo by Philip Dujardin, http://www.filipdujardin.be
  • 22. Provenance and Reuse of Open Data (Herkomst en Hergebruik van Open Data). Rinke Hoekstra, VU University Amsterdam/University of Amsterdam, rinke.hoekstra@vu.nl. Photo by Philip Dujardin, http://www.filipdujardin.be
  • 24. Definition (Oxford English Dictionary): • The fact of coming from some particular source or quarter; origin, derivation; • the history or pedigree of a work of art, manuscript, rare book, etc.; • concretely, a record of the passage of an item through its various owners.
  • 25. • Making trust judgements • Liability, trust and privacy in open government data • Compliance and auditing of business processes • Licensing and attribution of combined information
  • 26. Curt Tilmes, Peter Fox, Xiaogang Ma, Deborah L. McGuinness, Ana Pinheiro Privette, Aaron Smith, Anne Waple, Stephan Zednik, Jinguang Zheng: Provenance Representation for the National Climate Assessment in the Global Change Information System. IEEE T. Geoscience and Remote Sensing 51(11): 5160-5168 (2013).
 Integrated & Summarized Data. Transparency and Trust.
 “Provenance is the number one issue that we face when publishing government data in data.gov.uk” (John Sheridan, UK National Archives, data.gov.uk)
  • 27. Provenance? • Provenance = Metadata?
 Provenance can be seen as metadata, but not all metadata is provenance • Provenance = Trust?
 Provenance provides a substrate for deriving different trust metrics • Provenance = Authentication?
 Provenance records can be used to verify and authenticate amongst users
  • 30. Three Dimensions • Content
 Capturing and representing provenance information (recording, annotating, workflows) • Management
 Storing, querying, and accessing provenance information (scalability, interoperability) • Use
 Interpreting and understanding provenance in practice (trust, accountability, compliance, explanation, debugging)
  • 35. W3C PROV Standard: Provenance is a record that describes the people, institutions, entities, and activities involved in producing, influencing, or delivering a piece of data or a thing. http://www.w3.org/TR/prov-overview
  • 36. Luc Moreau & Paul Groth
  • 41. Naive Approaches. InProv: Visualizing Provenance Graphs with Radial Layouts and Time-Based Hierarchical Grouping, Madelaine D. Boyd, http://www.seas.harvard.edu/sites/default/files/files/archived/Boyd.pdf
 Orbiter has several limitations. It does not have capabilities for query subgraph highlighting, regular expression filters, process grouping, annotations, or programmable views[16]. Furthermore, the structure of each summary node, where child nodes are grouped within parents and are hidden until the parent is expanded, benefits queries earlier in the dependency chain. Initial overviews often correspond with system bootup, and appear very similar across different traces (time slices of system activity).
 Figure 10: In these screenshots of Orbiter, the presence of edges overwhelms the visibility of nodes. By relying on a node-link graph layout and using spatial location to encode object relationships, Orbiter’s graph layout algorithm must draw many long edges to communicate node connections. Without edge bundling or opacity variation, the meanings of these relationships are obscured.
 Another one of Orbiter’s weaknesses is its node-link diagram layout. As a result, each node’s position in the X-Y plane and the length and angle of connecting lines are wasted attributes. The chosen graph layout algorithm (dot by default) arranges nodes to minimize
 Figure 11: (Top): A screenshot of the portion of the graph generated by GraphViz for a trace of the third provenance challenge. (Bottom): A zoomed-in view of the same graph. The horizontal black bars across the images are dense collections of edges.
 Effective large graph visualizations present the user with a summary view that can be explored, filtered, and expanded interactively.
 2.5 Tree Visualization
 While trees are a subcategory of graphs, because of their hierarchical composition, tree visualization forms its own subfield of research. A survey of over two-hundred tree visualizations is given at Hans-Jörg Schulz’s treevis.net. Visitors can narrow down by dimensionality (2D, 3D, or mixed), representation (explicit node-link diagram, implicit treemap, or combination), alignment (XY plot, radial layout, or free diagram)[55]. These categories are shown
 Figure 12: Left: Pajek uses various summary node-link and matrix-based representations depending on the structure of the supplied data set. Pictured is a main core subgraph extracted from routing data on the Internet. Right: TopoLayout optimizes the choice of visualization display depending on the underlying graph structure. The right column is TopoLayout’s output, while the left and middle columns are the outputs of the GRIP and FM graph layout algorithms.
 Figure 13: treevis.net defines different categories for tree maps. Tree maps can be categorized by dimensionality (2D, 3D, or mixed), representation (explicit, implicit, or mixed), or alignment (XY, radial, or spring).
 Tree visualizations are either explicit or implicit. Explicit representations resemble node-link diagrams. An example of an implicit representation is a tree map, a diagram where the entire tree is inscribed in a rectangle representing the root node. This root is subdivided hierarchically into more rectangles, which represent child nodes, and each child node is subdivided into more child nodes. Treemaps are excellent for displaying hierarchical or categorical data[57]. One famous example, shown in Figure 14, is the “Map of the Market” from SmartMoney.com, which displays in red and green the changes in market value of publicly-traded companies, grouped by market sector, with cell size proportional to market capitalization[64].
 TreePlus is an example of a tree-inspired graph visualization tool (Figure 15). It uses the guiding metaphor of “plant a seed to watch it grow” to summarize navigation of its tree-
  • 43. Width of activities and entities is based on information flow. Activities and entities are extracted from an ego graph.
  • 47. Amsterdam Open Data Exchange (ODE)
 Application Developers want to consume data: “We need an intuitive REST-like API to integrate Open Government data. Dealing with all these different formats and identifiers is really taking too much time.”
 Civil Servant wants to publish data: “I have all this data, and I want to make (part of) it available for the general public, but haven't a clue how!”
 • Apps and applications: Visual interactions with Open Data. Application specific logics (e.g. 'danger')
 • CitySDK API: HTTP API to the CitySDK. Returns JSON, Turtle, etc. (includes the Linked Data API of CitySDK)
 • SPARQL API: SPARQL Endpoint to the Linked Data storage of the ODE
 • CitySDK Datastores and Linked Data Triplestore: Partial Synchronisation; Feed into
 • Query Orchestrator: HTTP API to `canned queries' across multiple datasets. Returns JSON-LD, Turtle
 • Data Integrator
 • ODE Best Practices: Best practices for publishing Open Data
 • CitySDK Ingestion Plugins: "Standard" adapters part of CitySDK
 • ODE Ingestion Adapters: Ingestion adapters developed within ODE
 • Sources: Municipal Legacy Systems; Excel Files; Amsterdam Open Data
 • CKAN: Amsterdam Open Data Catalog. Will point to datasets in the ODE. May provide a direct query interface on top of ODE
 • Wrapper-based
  • 53. Data2Semantics: From Data to Semantics for Scientific Data Publishers. The Big Future of Data, 2 October 2014
  • 55. Semantic Publication of Data: • Publish directly from the cloud to the cloud • On-the-fly analysis and tag suggestion
  • 56. Interactive Data Construction via Instrumented IPython Notebook: • Integration in popular tool • No “green field”
  • 57. Visual Exploration of Big Data: • Virtualisation • Discover patterns • Interactive visualisation • Sparse and heterogeneous
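
Slides 35 and 36 quote the W3C PROV definition of provenance as a record of the entities, activities, and agents involved in producing a piece of data, and slide 5 argues that a repeatable transformation should carry such a record. A minimal sketch of what that record could look like is given here in Python, emitting N-Triples with standard PROV-O terms. The function name `describe_transformation` and all `example.org` URIs are hypothetical and chosen only for illustration; they do not come from the slides.

```python
# Minimal sketch: describe one data transformation with W3C PROV-O terms.
# All example.org URIs below are hypothetical placeholders.

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triple(subject, predicate, obj):
    """Format one N-Triples statement whose object is a URI."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def describe_transformation(source, output, activity, agent):
    """Return PROV-O triples stating that `agent` ran `activity`,
    which used `source` and generated `output`."""
    return [
        # What kinds of things are involved.
        triple(source, RDF_TYPE, PROV + "Entity"),
        triple(output, RDF_TYPE, PROV + "Entity"),
        triple(activity, RDF_TYPE, PROV + "Activity"),
        triple(agent, RDF_TYPE, PROV + "Agent"),
        # How they relate to each other.
        triple(activity, PROV + "used", source),
        triple(output, PROV + "wasGeneratedBy", activity),
        triple(output, PROV + "wasDerivedFrom", source),
        triple(activity, PROV + "wasAssociatedWith", agent),
        triple(output, PROV + "wasAttributedTo", agent),
    ]


if __name__ == "__main__":
    # Version-specific URIs (cf. slide 4) keep each run distinguishable.
    for line in describe_transformation(
        source="http://example.org/data/budget/2014-06-01.xls",
        output="http://example.org/data/budget/2014-06-01.ttl",
        activity="http://example.org/activity/convert/2014-06-01",
        agent="http://example.org/agent/conversion-script",
    ):
        print(line)
```

Emitting these triples every time the conversion runs is one way to make the transformation repeatable and auditable. A real deployment would more likely use an RDF library and add timestamps, which this sketch omits.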