Generating Illustrative Snippets for Open Data on the Web
1. Generating Illustrative Snippets for Open Data on the Web
Gong Cheng, Cheng Jin, Wentao Ding, Danyun Xu, Yuzhong Qu
Websoft Research Group
National Key Laboratory for Novel Software Technology
Nanjing University, China
6. We propose to also serve an illustrative snippet,
• Dataset: a set of entity-property-value triples
• Snippet: a size-limited subset of triples
• Snippet generation: from dataset to snippet
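As a rough illustration of these two definitions, a dataset can be modeled as a set of (entity, property, value) tuples, and a snippet as any subset with at most k triples. The `Triple` type, the helper `is_valid_snippet`, and all entity names below are hypothetical and chosen only for this sketch.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    entity: str
    prop: str
    value: str


# Toy dataset; every identifier here is made up for illustration.
dataset = {
    Triple("ex:NanjingUniversity", "rdf:type", "ex:University"),
    Triple("ex:NanjingUniversity", "ex:locatedIn", "ex:Nanjing"),
    Triple("ex:Nanjing", "rdf:type", "ex:City"),
    Triple("ex:Nanjing", "ex:country", "ex:China"),
}


def is_valid_snippet(snippet, dataset, k):
    """A snippet is a subset of the dataset containing at most k triples."""
    return snippet <= dataset and len(snippet) <= k


# One possible snippet: the two triples describing a single entity.
snippet = {t for t in dataset if t.entity == "ex:NanjingUniversity"}
```

Which subset to pick among the many valid ones is the actual snippet generation problem.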
7. and to serve a high-quality snippet.
• Coverage
To cover the most important entity types and properties.
• Familiarity
To contain entities familiar to average users.
• Cohesion
To describe a set of related entities.
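The three criteria can be made concrete with a small sketch. This is a simplified reading, not the measures used in the paper: coverage is an unweighted fraction of types and properties (the slides ask for the most important ones, which would need importance weights), familiarity sums a hypothetical per-entity `popularity` score, and cohesion is checked as graph connectivity over the described entities. All function names are assumptions.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def covered_schema(triples):
    """Entity types and properties that appear in a collection of triples."""
    types = {v for (_, p, v) in triples if p == RDF_TYPE}
    props = {p for (_, p, _) in triples if p != RDF_TYPE}
    return types | props


def coverage(snippet, dataset):
    """Fraction of the dataset's types and properties shown by the snippet."""
    total = covered_schema(dataset)
    return len(covered_schema(snippet) & total) / len(total) if total else 0.0


def familiarity(snippet, popularity):
    """Sum of an assumed popularity score over the entities described."""
    entities = {e for (e, _, _) in snippet}
    return sum(popularity.get(e, 0.0) for e in entities)


def is_cohesive(snippet):
    """True if the described entities form one connected component."""
    subjects = {e for (e, _, _) in snippet}
    if not subjects:
        return True
    neighbors = defaultdict(set)
    for e, _, v in snippet:
        if v in subjects:  # the value is itself a described entity
            neighbors[e].add(v)
            neighbors[v].add(e)
    seen, stack = set(), [next(iter(subjects))]
    while stack:  # depth-first traversal from an arbitrary entity
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbors[node] - seen)
    return seen == subjects
```

A snippet that lists two entities with no triple linking them would score on coverage but fail the cohesion check.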
8. To this end, we formulate and solve a new combinatorial optimization problem:
• Maximum-weight-and-coverage connected graph problem (MwcCG)
9. MwcCG captures all three aspects of the quality of a snippet:
• Coverage
• Familiarity
• Cohesion
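To give a feel for the shape of such a problem, the sketch below grows a connected set of triples greedily, rewarding newly covered types and properties and the weight of newly added nodes. It is a naive heuristic written for illustration only, with no approximation guarantee, and it is not the algorithm from the paper. The `weight` map, the gain function, and all names are assumptions.

```python
RDF_TYPE = "rdf:type"


def schema_item(triple):
    """The type (for rdf:type triples) or the property a triple exhibits."""
    _, p, v = triple
    return ("type", v) if p == RDF_TYPE else ("prop", p)


def nodes_of(triple):
    """Graph nodes a triple touches; a type triple touches only its entity."""
    e, p, v = triple
    return {e} if p == RDF_TYPE else {e, v}


def greedy_snippet(triples, weight, k):
    """Greedily grow a connected snippet of at most k triples."""
    chosen, nodes, covered = [], set(), set()
    remaining = list(triples)

    def gain(t):
        # +1 for a type or property not yet covered, plus new node weights.
        new_cover = 0 if schema_item(t) in covered else 1
        new_weight = sum(weight.get(n, 0.0) for n in nodes_of(t) - nodes)
        return new_cover + new_weight

    while remaining and len(chosen) < k:
        # The first triple is unconstrained; later ones must touch the snippet.
        candidates = [t for t in remaining if not chosen or nodes_of(t) & nodes]
        if not candidates:
            break
        best = max(candidates, key=gain)
        chosen.append(best)
        nodes |= nodes_of(best)
        covered.add(schema_item(best))
        remaining.remove(best)
    return chosen
```

Restricting candidates to triples that share a node with the current snippet is what keeps the result connected; dropping that filter would turn the sketch into a plain weighted coverage heuristic.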
11. Summary
• Motivation
• To help people quickly grasp the contents of a large dataset
• Our contribution
• We propose to automatically extract an optimal illustrative snippet pursuing coverage, familiarity, and cohesion.
• We formulate a new combinatorial optimization problem: maximize coverage and weight, subject to graph connectivity.
• We solve the problem using an approximation algorithm.
• Paper
• Gong Cheng, Cheng Jin, Wentao Ding, Danyun Xu, Yuzhong Qu. Generating Illustrative Snippets for Open Data on the Web. In Proc. WSDM ’17.