Motivation
Data on the Web
09/07/13 ICWE 2013, Aalborg, Denmark
Some eye-catching opener illustrating growth and/or diversity of web data
Summaries on the fly:
Query-based Extraction of Structured Knowledge
from Web Documents
ICWE 2013: International Conference on Web Engineering
8-12 July 2013, Aalborg , Denmark
Besnik Fetahu, Bernardo Pereira Nunes, Stefan Dietze
(L3S Research Center, DE)
Outline
– Introduction
– Related Work
– Focused Knowledge Extraction
• Pre-Processing & Query Expansion
• Pattern Generation
• Contextual Structure
– Evaluation
– Results
– Conclusions
Introduction
• Motivation
– Large amounts of textual Web documents
– Efficient techniques for querying relevant information
– Extraction of chunks of text: relations, named entities, etc.
– Summaries as a means of highlighting the most important chunks of text
• Issues:
– Summaries as non-structured text
– Weak relationship between user interests and the importance of specific chunks of text in a corpus
Prominent Text Summarisation Approaches
• Heuristics for relation extraction
• Extraction of information based on predefined templates
• Sentence selection based on the presence of specific terms
• Latent Semantic Analysis (LSA) for measuring importance of specific terms
• Tree Kernels encoding relevant information for event detection
• Latent Dirichlet Allocation (LDA) for topic modelling
• Populating ontologies based on extracted information from text
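Research areas covered: Information Extraction (IE), Information Retrieval (IR), Machine Learning (ML), Semantic Web (SW)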
Focused Knowledge Extraction
Overview
• Structured Summary Generation Components:
– Query Expansion and Reformulation
– Named Entity Definition and Co-Reference Resolution
– Pattern Generation
– Contextual Structure of Summaries
Focused Knowledge Extraction
Pipeline
[Figure: pipeline] user query ("Stem Cell") → query typing and expansion (Stem Cell, Anatomical structure, Biotechnology, Cloning, Cell biology, Developmental Biology) → corpus filtered by OR/AND of expanded query terms → annotation of filtered documents (NER, POS) → patterns → structured summary (Entities, Actions)
Example structured summary: Democrats → applauded → Mr. Spitzer Eliot (Gov) calls → insure → 500 000 children → lack → health insurance → enroll → 900 000 adults → are → eligible Medicaid → enrolled → issue debt → pay → stem cell research.
Focused Knowledge Extraction
Query Expansion
• Query (“Stem Cell”) → NER → http://dbpedia.org/page/Stem_cell
• Query Typing & Expansion
– DBpedia SPARQL Query Expansion:
• Query: “Stem Cell” is processed into:
– Typed Query:
• http://dbpedia.org/page/Stem_cell
– Expanded Query:
• http://dbpedia.org/page/Biotechnology
• http://dbpedia.org/page/Cloning
• http://dbpedia.org/page/Cell_biology
• http://dbpedia.org/page/Developmental_biology
– Conjunction/Disjunction of expanded query terms
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?o ?label WHERE {
  <http://dbpedia.org/resource/Stem_cell> ?p ?o .
  ?o rdfs:label ?label .
}
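The expansion step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (build_expansion_query, boolean_query, matches) are hypothetical, the English-label filter is an added assumption, and documents are matched by plain case-insensitive substring search. The query is only built as a string; sending it to a SPARQL endpoint is left out.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def build_expansion_query(resource_uri, lang="en"):
    """Build a SPARQL query listing labelled neighbours of the typed query resource.

    The language filter is an assumption added for illustration.
    """
    return (
        f"PREFIX rdfs: <{RDFS}>\n"
        "SELECT ?o ?label WHERE {\n"
        f"  <{resource_uri}> ?p ?o .\n"
        "  ?o rdfs:label ?label .\n"
        f'  FILTER (lang(?label) = "{lang}")\n'
        "}"
    )


def boolean_query(terms, operator="OR"):
    """Join expanded query terms into a conjunction (AND) or disjunction (OR)."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return f" {operator} ".join(f'"{term}"' for term in terms)


def matches(text, terms, operator="OR"):
    """Check a document against the expanded terms (case-insensitive substring match)."""
    lowered = text.lower()
    hits = [term.lower() in lowered for term in terms]
    return all(hits) if operator == "AND" else any(hits)


if __name__ == "__main__":
    print(build_expansion_query("http://dbpedia.org/resource/Stem_cell"))
    expanded = ["Stem cell", "Biotechnology", "Cloning", "Cell biology"]
    print(boolean_query(expanded, "OR"))
```

A disjunction keeps more documents in the filtered corpus, while a conjunction keeps only documents that mention every expanded term.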
Focused Knowledge Extraction -
Named Entity Definitions & Co-Reference Resolution
• Entities recognised using NER and NED tools (Stanford’s NLP toolkit)
• Construct a co-occurrence matrix of proper nouns appearing consecutively
• Sample entities: “Chicago Bears”, “playoff games”
• Co-reference resolution crucial for accurate knowledge extraction

entity_{Misc}[i] = \sum_{i=1}^{k} \text{co-occurr}(term_i, term_{i+1})
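Counting consecutive proper nouns can be sketched as below. This is an illustrative sketch only: the input is assumed to be already POS-tagged as (word, tag) pairs with Penn Treebank tags, and the merge threshold min_count is a hypothetical parameter that the slides do not specify.

```python
from collections import Counter


def consecutive_proper_noun_counts(tagged_sentences):
    """Count how often two proper nouns appear directly next to each other."""
    counts = Counter()
    for sentence in tagged_sentences:
        for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
            if t1.startswith("NNP") and t2.startswith("NNP"):
                counts[(w1, w2)] += 1
    return counts


def merge_entities(sentence, counts, min_count=2):
    """Merge adjacent proper nouns whose pair count reaches min_count (assumed threshold)."""
    merged = []  # entries are (surface text, tag, last word of the entry)
    for word, tag in sentence:
        if (merged
                and tag.startswith("NNP")
                and merged[-1][1].startswith("NNP")
                and counts[(merged[-1][2], word)] >= min_count):
            text, prev_tag, _ = merged[-1]
            merged[-1] = (text + " " + word, prev_tag, word)
        else:
            merged.append((word, tag, word))
    return [(text, tag) for text, tag, _ in merged]


if __name__ == "__main__":
    sentences = [
        [("Chicago", "NNP"), ("Bears", "NNP"), ("won", "VBD")],
        [("The", "DT"), ("Chicago", "NNP"), ("Bears", "NNP"), ("lost", "VBD")],
    ]
    counts = consecutive_proper_noun_counts(sentences)
    print(merge_entities(sentences[0], counts))
```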
Focused Knowledge Extraction
Pattern Generation
• Determine topic terms (LDA) from the
underlying filtered corpus
• Annotate topic terms using POS taggers
• Pattern items:
– POS tags from topic terms
– Query terms (incl. terms after expansion)
Topic terms (LDA): police found women men dr death people drug medical officers man problems study killed heart hospital test sex patients evidence dead drugs officer…
POS-annotated topic terms: police_NN found_VBD women_NNS men_NNS dr_VBP death_NN people_NNS drug_NN medical_JJ officers_NNS man_NN problems_NNS study_NN killed_VBD heart_NN hospital_NN test_NN sex_NN patients_NNS evidence_NN dead_NN drugs_NNS officer_NN
POS tag sequence: NN → VBD → NNS → VBP → NN…
Query terms: Stem Cell → Anatomical structure → Biotechnology → Cloning → Cell Biology → Developmental Biology
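Deriving pattern items from the annotated topic terms might look like the sketch below. It assumes tokens in the word_TAG form shown on the slide and collapses runs of identical tags, which matches the example sequence (women_NNS men_NNS contributes a single NNS). The function names are hypothetical.

```python
from itertools import groupby


def split_tagged(token):
    """Split 'police_NN' into ('police', 'NN'), using the last underscore."""
    word, _, tag = token.rpartition("_")
    return word, tag


def pos_sequence(tagged_text):
    """Return the POS tag sequence with consecutive duplicate tags collapsed."""
    tags = [split_tagged(token)[1] for token in tagged_text.split()]
    return [tag for tag, _ in groupby(tags)]


def pattern_items(tagged_text, query_terms):
    """Pattern items: distinct POS tags of the topic terms plus the query terms."""
    items = list(dict.fromkeys(pos_sequence(tagged_text)))
    items.extend(term for term in query_terms if term not in items)
    return items


if __name__ == "__main__":
    tagged = "police_NN found_VBD women_NNS men_NNS dr_VBP death_NN"
    print(" → ".join(pos_sequence(tagged)))
    print(pattern_items(tagged, ["Stem Cell", "Cloning"]))
```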
Focused Knowledge Extraction
Pattern Generation (I)
• Construct co-occurrence matrix of pattern items (POS tags, Query terms)
• Generate automatically emerging patterns reflecting syntactical relevance
of chunks of text
• Patterns as a sequence of co-occurring items, modelled as directed tree
graphs
• For each pattern item generate a directed tree graph, considering it as a
root node
• The pattern score conveys the importance of a pattern for a given corpus and query
Generated Patterns                   Pattern Score ψscore
NN → JJ → VB → RB                    0.28571429
NN → VB → JJ → RB                    0.19949495
Stem Cell → NN → VB → RB → JJ        0.17361111
JJ → RB → VB → NN → Stem Cell        0.17347462
RB → JJ → NN → Stem Cell             0.16466599
NN → Stem Cell → RB → VB → JJ        0.16155811
RB → VB → Stem Cell → NN → JJ        0.16129665
Focused Knowledge Extraction
Pattern Generation (II)
Automatically generated patterns show the sequence of important syntactical items that should appear in a sentence.
• Scoring mechanism: pattern score as the marginal probability of co-occurring pattern items, based on the filtered corpus
• Prior probability of a pattern item, as the head node of the directed tree graph
• Conditional probability of two consecutive pattern items
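One way to combine a prior and conditional probabilities into a pattern score is sketched below. The exact formula is not given in the slide text, so this is an assumption: the score is taken as the prior of the head item multiplied by the conditional probabilities of each consecutive pair, with all probabilities estimated from simple counts over item sequences. The names fit_counts and pattern_score are hypothetical.

```python
from collections import Counter


def fit_counts(sequences):
    """Count single pattern items and consecutive item pairs in the filtered corpus."""
    unigrams, bigrams = Counter(), Counter()
    for seq in sequences:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))
    return unigrams, bigrams


def pattern_score(pattern, unigrams, bigrams):
    """Assumed score: P(head) * product of P(next | current) along the pattern."""
    total = sum(unigrams.values())
    if not pattern or total == 0:
        return 0.0
    score = unigrams[pattern[0]] / total  # prior of the head (root) item
    for current, nxt in zip(pattern, pattern[1:]):
        if unigrams[current] == 0:
            return 0.0
        # conditional probability estimated as count(current, next) / count(current)
        score *= bigrams[(current, nxt)] / unigrams[current]
    return score


if __name__ == "__main__":
    corpus = [["NN", "VB", "JJ"], ["NN", "JJ", "VB"]]
    unigrams, bigrams = fit_counts(corpus)
    print(pattern_score(["NN", "VB"], unigrams, bigrams))
```

Patterns containing an item pair never seen in the corpus score zero under this estimate; a smoothed estimate would be a natural variation.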
Focused Knowledge Extraction
Contextual Structure of Summaries
• Summaries generated as structured knowledge
• Decomposition of summaries into two structures:
– global (Entities, Actions) for entire corpus
– local (entity-context, action-context) for particular document
• Multiple summary perspectives based on generated context
• Enrichment with additional information from reference datasets (DBpedia)
Focused Knowledge Extraction
Contextual Structure of Summaries
Contextual Structure of Summaries with global and local structures enabling multiple summary perspectives:
“The kinds of stem cell therapies being researched for the most part do not involve the politically sensitive use of embryonic stem cells.”
• Global structure: Stem cell, Therapies (entities); researched, involve (actions)
• Local structure:
– Stem Cell: Embryonic, sensitive
– researched: Stem cell therapies ↔ most part
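The split into a corpus-wide global structure and per-document local structures could be held in a data structure like the one sketched below. The class and field names are hypothetical and chosen for illustration; the slides do not describe a concrete representation.

```python
from dataclasses import dataclass, field


@dataclass
class LocalStructure:
    """Per-document contexts: entity -> context terms, action -> linked chunk pairs."""
    entity_context: dict = field(default_factory=dict)
    action_context: dict = field(default_factory=dict)


@dataclass
class StructuredSummary:
    """Global entities and actions for the corpus plus one local structure per document."""
    entities: set = field(default_factory=set)
    actions: set = field(default_factory=set)
    documents: dict = field(default_factory=dict)

    def add(self, doc_id, entity_context, action_context):
        local = self.documents.setdefault(doc_id, LocalStructure())
        for entity, context in entity_context.items():
            self.entities.add(entity)
            local.entity_context.setdefault(entity, []).extend(context)
        for action, links in action_context.items():
            self.actions.add(action)
            local.action_context.setdefault(action, []).extend(links)

    def perspective(self, entity):
        """One summary perspective: the contexts of an entity across documents."""
        return {
            doc_id: local.entity_context[entity]
            for doc_id, local in self.documents.items()
            if entity in local.entity_context
        }


if __name__ == "__main__":
    summary = StructuredSummary()
    summary.add(
        "doc-1",
        {"Stem Cell": ["embryonic", "sensitive"]},
        {"researched": [("stem cell therapies", "most part")]},
    )
    print(summary.perspective("Stem Cell"))
```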
Evaluation Setup
• Dataset: New York Times, year 2007
• 40,000 articles with manually generated summaries
• Summary relevance w.r.t the generated context (query)
• Coverage of the manually generated NYT summaries
• ROUGE-n metric to measure coverage of structured vs. manually generated summaries:
\text{ROUGE-}n = \frac{\text{matching } n\text{-grams from structured and manually generated summaries}}{\text{total } n\text{-grams}}
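An n-gram overlap measure in the spirit of ROUGE-n can be sketched as follows. This is a simplified sketch, not the official ROUGE toolkit: it assumes lowercased whitespace tokenization and reports precision, recall and F1 from clipped n-gram matches between a generated summary and a manually written one.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) of n-gram overlap between two texts."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # clipped count of matching n-grams
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    generated = "stem cell research funded"
    manual = "congress funded stem cell research today"
    print(rouge_n(generated, manual, n=1))
    print(rouge_n(generated, manual, n=2))
```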
Results
• 10 queries used for evaluation (prominent events of 2007 from Time magazine [1])
• Human evaluation for summary relevance: 76% correctly generated
• 17 evaluators, with an average of 20 summaries evaluated each
[1] http://www.time.com/time/specials/2007/0,28757,1686204,00.html
Query              #Q. Terms  #Doc.  #Summ.
European Union     7          157    129
Super Bowl         13         370    325
US Congress        17         13     19
Virginia Tech      28         12     11
Stem Cell          5          105    86
Protest            2          129    103
Harry Potter       22         10     7
Global Warming     5          198    170
National Security  0          250    207
Terrorist Attacks  0          57     52
Generated structured summaries for the different queries.
Results
• ROUGE-1 evaluation results for the 10 queries
• Best-performing results for ROUGE-1: 25% precision and 32% recall
[Figure: P/R/F1 measures based on the ROUGE-1 metric for the 10 queries used for evaluation]
Results
Sample Generated Summaries
Query: “Stem Cell”
Democrats → applauded → Mr. Spitzer Eliot (Gov) calls → insure → 500,000 children → lack → health insurance → enrol → 900,000 adults → are → eligible Medicaid → enrolled → issue debt → pay → stem cell research.
Congress’s Shift in Power → revives → Medicare Debate House Democrats → try to rush → legislation → requiring → government → negotiate → lower drug prices for Medicare beneficiaries → overturning → President Bush’s restrictions on embryonic stem cell research.
The nation → welcome → ambitious agenda → being offered → today by the new Congress Democratic majority → raising → minimum wage → advancing → stem cell research → restoring → oversight of the executive branch.
New study → suggesting → useful stem cells → be derived → amniotic fluid without → destroying → embryos.
Swarns, Rachel L → announced → 9 Aug. federal government → pays → studies on stem cell colonies, lines → created before → that date, government → does not encourage → destruction of additional embryos.
Stem cell research → has not produced → a single medical treatment → is morally wrong → to create human life → to destroy → for research.
The measure → allow → scientists → receiving → federal funds → use → embryonic stem cells from surplus embryos → generated → fertility clinics, after cell lines → had been derived → by others → using → nonfederal funds.
Conclusions
• Query-based generated summaries
• Contextualised Structured Summaries
– Typing and expanding of queries using reference datasets
– Automated pattern generation
• Incorporated user interests and syntactical relevance of chunks of text
• Multiple summary perspectives
• Overall good accuracy of generated summaries (76% judged correct in the human evaluation)
• Infer new knowledge by interlinking summaries of different/same contexts
Thank you!
Questions?

The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Dernier (20)

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Summaries on the fly: Query-based Extraction of Structured Knowledge from Web Documents

  • 1. Motivation: Data on the Web
      09/07/13 ICWE 2013, Aalborg, Denmark
      Some eye-catching opener illustrating growth and/or diversity of web data
      Summaries on the fly: Query-based Extraction of Structured Knowledge from Web Documents
      ICWE 2013: International Conference on Web Engineering, 8-12 July 2013, Aalborg, Denmark
      Besnik Fetahu, Bernardo Pereira Nunes, Stefan Dietze (L3S Research Center, DE)
  • 2. Outline
      – Introduction
      – Related Work
      – Focused Knowledge Extraction
          • Pre-Processing & Query Expansion
          • Pattern Generation
          • Contextual Structure
      – Evaluation
      – Results
      – Conclusions
  • 3. Introduction
      • Motivation
          – Large amounts of textual Web documents
          – Efficient techniques for querying relevant information
          – Extraction of chunks of text: relations, named entities, etc.
          – Summaries as a means of highlighting the most important chunks of text
      • Issues:
          – Summaries are non-structured text
          – Weak relationship between user interests and the importance of specific chunks of text in a corpus
  • 4. Prominent Text Summarisation Approaches
      • Heuristics for relation extraction
      • Extraction of information based on predefined templates
      • Sentence inclusion based on inclusion of specific terms
      • Latent Semantic Analysis (LSA) for measuring importance of specific terms
      • Tree Kernels encoding relevant information for event detection
      • Latent Dirichlet Allocation (LDA) for topic modelling
      • Populating ontologies based on extracted information from text
      Research areas shown on the slide: IE, IR, ML, SW
  • 5. Focused Knowledge Extraction: Overview
      • Structured Summary Generation Components:
          – Query Expansion and Reformulation
          – Named Entity Definition and Co-Reference Resolution
          – Pattern Generation
          – Contextual Structure of Summaries
  • 6. Focused Knowledge Extraction: Pipeline
      – User query: "Stem Cell"
      – Query typing and expansion: Stem Cell → Anatomical structure, Biotechnology, Cloning, Cell biology, Developmental Biology
      – Corpus filtering: OR/AND of expanded query terms
      – Annotation of filtered documents: NER, POS
      – Patterns
      – Structured summary (Entities, Actions), for example:
        Democrats → applauded → Mr. Spitzer Eliot (Gov) calls → insure → 500,000 children → lack → health insurance → enroll → 900,000 adults → are → eligible Medicaid → enrolled → issue debt → pay → stem cell research.
  • 7. Focused Knowledge Extraction: Query Expansion
      • Query (“Stem Cell”) → NER → http://dbpedia.org/page/Stem_cell
      • Query Typing & Expansion
          – DBpedia SPARQL Query Expansion:
              • Query: “Stem Cell” is processed into:
                  – Typed Query:
                      • http://dbpedia.org/page/Stem_cell
                  – Expanded Query:
                      • http://dbpedia.org/page/Biotechnology
                      • http://dbpedia.org/page/Cloning
                      • http://dbpedia.org/page/Cell_biology
                      • http://dbpedia.org/page/Developmental_biology
          – Conjunction/Disjunction of expanded query terms
      SPARQL query used for expansion:
          PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
          SELECT ?o ?label WHERE {
            <http://dbpedia.org/resource/Stem_cell> ?p ?o .
            ?o rdfs:label ?label
          }
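The query expansion and corpus filtering steps can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `build_expansion_query`, `matches` and `filter_corpus` are hypothetical, term matching is reduced to case-insensitive substring search, and the query is only built as a string (sending it to a SPARQL endpoint is left out).

```python
from string import Template

# SPARQL template mirroring the query shown on the slide; the PREFIX line is
# added so the query is self-contained.
SPARQL_TEMPLATE = Template("""\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?o ?label WHERE {
  <$resource> ?p ?o .
  ?o rdfs:label ?label
}""")


def build_expansion_query(resource_uri: str) -> str:
    """Return the expansion query for one typed query resource."""
    return SPARQL_TEMPLATE.substitute(resource=resource_uri)


def matches(text: str, terms: list, mode: str = "OR") -> bool:
    """Check a document against the query terms.

    mode="OR" is the disjunction, mode="AND" the conjunction of the terms.
    Substring matching is an assumption made for this sketch.
    """
    lowered = text.lower()
    hits = [term.lower() in lowered for term in terms]
    return all(hits) if mode == "AND" else any(hits)


def filter_corpus(docs: list, terms: list, mode: str = "OR") -> list:
    """Keep only the documents that satisfy the boolean term filter."""
    return [doc for doc in docs if matches(doc, terms, mode)]


if __name__ == "__main__":
    print(build_expansion_query("http://dbpedia.org/resource/Stem_cell"))
    corpus = [
        "New funding for stem cell research was announced.",
        "Cloning and stem cell therapies were debated.",
        "The Super Bowl drew a record audience.",
    ]
    print(filter_corpus(corpus, ["stem cell", "cloning"], mode="OR"))
    print(filter_corpus(corpus, ["stem cell", "cloning"], mode="AND"))
```

Including the original query term in the term list alongside the expanded terms keeps the filter meaningful even when expansion returns nothing.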
  • 8. Focused Knowledge Extraction: Named Entity Definitions & Co-Reference Resolution
      • Entities recognised using NER & NED tools (Stanford’s NLP toolkit)
      • Construct a co-occurrence matrix of proper nouns appearing consecutively
      • Sample entities: “Chicago Bears”, “playoff games”
      • Co-reference resolution crucial for accurate knowledge extraction
      $$\mathrm{entity}_{\mathrm{Misc}}[i] = \sum_{i=1}^{k} \text{co-occurr}(\mathrm{term}_i, \mathrm{term}_{i+1})$$
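Counting consecutive proper nouns can be illustrated with a short Python sketch. It assumes the text has already been POS-tagged with Penn Treebank tags (for example by an external tagger) and that a pair counts as a candidate entity once it reaches a threshold. The function names and the `min_count` threshold are hypothetical and not taken from the slides.

```python
from collections import Counter


def consecutive_proper_noun_counts(tagged_sentences):
    """Count how often two proper nouns appear next to each other.

    tagged_sentences: list of sentences, each a list of (word, POS tag) pairs.
    Tags starting with "NNP" (NNP, NNPS) are treated as proper nouns.
    """
    counts = Counter()
    for sentence in tagged_sentences:
        for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
            if t1.startswith("NNP") and t2.startswith("NNP"):
                counts[(w1, w2)] += 1
    return counts


def candidate_entities(counts, min_count=2):
    """Return word pairs seen together at least min_count times (assumed threshold)."""
    return sorted(" ".join(pair) for pair, c in counts.items() if c >= min_count)


if __name__ == "__main__":
    sentences = [
        [("The", "DT"), ("Chicago", "NNP"), ("Bears", "NNPS"), ("won", "VBD")],
        [("Chicago", "NNP"), ("Bears", "NNPS"), ("lost", "VBD"), ("today", "NN")],
    ]
    counts = consecutive_proper_noun_counts(sentences)
    print(candidate_entities(counts))
```

The sparse `Counter` keyed by word pairs plays the role of the co-occurrence matrix; a dense matrix would only be needed for a fixed, small vocabulary.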
  • 9. Focused Knowledge Extraction: Pattern Generation
      • Determine topic terms (LDA) from the underlying filtered corpus
      • Annotate topic terms using POS taggers
      • Pattern items:
          – POS tags from topic terms
          – Query terms (incl. terms after expansion)
      Topic terms (LDA): police found women men dr death people drug medical officers man problems study killed heart hospital test sex patients evidence dead drugs officer…
      POS-tagged topic terms: police_NN found_VBD women_NNS men_NNS dr_VBP death_NN people_NNS drug_NN medical_JJ officers_NNS man_NN problems_NNS study_NN killed_VBD heart_NN hospital_NN test_NN sex_NN patients_NNS evidence_NN dead_NN drugs_NNS officer_NN
      POS tag sequence: NN → VBD → NNS → VBP → NN…
      Query terms: Stem Cell → Anatomical structure → Biotechnology → Cloning → Cell Biology → Developmental Biology
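Deriving pattern items from POS-tagged topic terms can be sketched as follows. The sketch assumes the `word_TAG` notation used on the slide and collapses runs of identical consecutive tags, which is inferred from the slide's example sequence (NN → VBD → NNS → VBP → NN) rather than stated by the authors. All function names are hypothetical.

```python
from itertools import groupby


def split_tagged(tagged_text):
    """Split 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in tagged_text.split():
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


def tag_sequence(tagged_text):
    """Return the POS tag sequence with consecutive duplicates collapsed."""
    tags = [tag for _, tag in split_tagged(tagged_text)]
    return [tag for tag, _ in groupby(tags)]


def pattern_items(tagged_text, query_terms):
    """Collect the distinct pattern items: POS tags first, then query terms."""
    items = []
    for tag in tag_sequence(tagged_text):
        if tag not in items:
            items.append(tag)
    items.extend(term for term in query_terms if term not in items)
    return items


if __name__ == "__main__":
    tagged = "police_NN found_VBD women_NNS men_NNS dr_VBP death_NN"
    print(" → ".join(tag_sequence(tagged)))
    print(pattern_items(tagged, ["Stem Cell", "Cloning"]))
```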
  • 10. Focused Knowledge Extraction: Pattern Generation (I)
      • Construct a co-occurrence matrix of pattern items (POS tags, query terms)
      • Automatically generate emerging patterns that reflect the syntactical relevance of chunks of text
      • Patterns are sequences of co-occurring items, modelled as directed tree graphs
      • For each pattern item, generate a directed tree graph with that item as the root node
      • The pattern score conveys importance for a given corpus and query
  • 11. Focused Knowledge Extraction: Pattern Generation (II)
      • Automatically generated patterns showing the sequence of important syntactical items to appear in a sentence
      • Scoring mechanism of patterns as the marginal probability of co-occurring pattern items based on the filtered corpus:
          – Prior probability of a pattern item, as the head node of the directed tree graph
          – Conditional probability of two consecutive pattern items
      Generated Patterns:
          Pattern                              Score (ψscore)
          NN → JJ → VB → RB                    0.28571429
          NN → VB → JJ → RB                    0.19949495
          Stem Cell → NN → VB → RB → JJ        0.17361111
          JJ → RB → VB → NN → Stem Cell        0.17347462
          RB → JJ → NN → Stem Cell             0.16466599
          NN → Stem Cell → RB → VB → JJ        0.16155811
          RB → VB → Stem Cell → NN → JJ        0.16129665
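One way to read the scoring description is as a chain of probabilities: the prior of the head item multiplied by the conditional probability of each consecutive pair. The Python sketch below implements that reading under loud assumptions: the slide's actual formula is not recoverable from the transcript, the probabilities here are plain relative frequencies over observed item sequences, and `estimate_counts` and `pattern_score` are hypothetical names. It will not reproduce the ψscore values in the table.

```python
from collections import Counter


def estimate_counts(sequences):
    """Count single items and consecutive item pairs over all sequences."""
    unigram = Counter()
    bigram = Counter()
    for seq in sequences:
        unigram.update(seq)
        bigram.update(zip(seq, seq[1:]))
    return unigram, bigram


def pattern_score(pattern, unigram, bigram):
    """Score = P(head) * product of P(next | previous) along the pattern.

    Assumed estimates: P(item) = count(item) / total items,
    P(next | prev) = count(prev, next) / count(prev).
    """
    total = sum(unigram.values())
    if not pattern or total == 0:
        return 0.0
    score = unigram[pattern[0]] / total
    for prev, nxt in zip(pattern, pattern[1:]):
        if unigram[prev] == 0:
            return 0.0
        score *= bigram[(prev, nxt)] / unigram[prev]
    return score


if __name__ == "__main__":
    observed = [
        ["NN", "VB", "JJ"],
        ["NN", "VB", "RB"],
        ["JJ", "NN", "VB"],
    ]
    uni, bi = estimate_counts(observed)
    for candidate in (["NN", "VB", "JJ"], ["RB", "NN"]):
        print(" → ".join(candidate), pattern_score(candidate, uni, bi))
```

Ranking candidate patterns by this score gives an ordering analogous to the table, with frequent heads and frequent transitions pushed to the top.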
  • 12. Focused Knowledge Extraction: Contextual Structure of Summaries
      • Summaries generated as structured knowledge
      • Decomposition of summaries into two structures:
          – global (Entities, Actions) for the entire corpus
          – local (entity-context, action-context) for a particular document
      • Multiple summary perspectives based on generated context
      • Enrichment with additional information from reference datasets (DBpedia)
  • 13. Focused Knowledge Extraction: Contextual Structure of Summaries
      Contextual structure of summaries with global and local structures enabling multiple summary perspectives.
      Example sentence: “The kinds of stem cell therapies being researched for the most part do not involve the politically sensitive use of embryonic stem cells.”
      Global structure: Stem cell Therapies, researched, involve
      Local structure:
          – Stem Cell: Embryonic, sensitive
          – researched: Stem cell therapies ↔ most part
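The global/local decomposition can be represented with a small data structure. The following Python sketch is an illustration only: the class `StructuredSummary`, its field names and the `perspective` method are hypothetical, and the way the example sentence is split into entities and actions is one possible reading of the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StructuredSummary:
    # Global structure: entities and actions.
    entities: List[str]
    actions: List[str]
    # Local structure: context attached to individual entities and actions.
    entity_context: Dict[str, List[str]] = field(default_factory=dict)
    action_context: Dict[str, List[str]] = field(default_factory=dict)

    def perspective(self, item: str) -> List[str]:
        """Return the local context of an entity or action (empty if unknown)."""
        return self.entity_context.get(item) or self.action_context.get(item, [])


if __name__ == "__main__":
    # Values taken from the example sentence on the slide; the split is assumed.
    summary = StructuredSummary(
        entities=["Stem cell", "Therapies"],
        actions=["researched", "involve"],
        entity_context={"Stem cell": ["Embryonic", "sensitive"]},
        action_context={"researched": ["Stem cell therapies", "most part"]},
    )
    print(summary.perspective("Stem cell"))
    print(summary.perspective("researched"))
```

Keeping the contexts in separate dictionaries lets the same global structure be viewed from either an entity or an action, which is one way to obtain multiple summary perspectives.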
  • 14. Evaluation Setup
      • Dataset: New York Times, year 2007
      • 40,000 articles with manually generated summaries
      • Summary relevance w.r.t. the generated context (query)
      • Coverage of the manually generated NYT summaries
      • ROUGE-n metric to measure coverage of structured vs. manually generated summaries
      ROUGE-n: matching n-grams from structured and manually generated summaries, relative to the total n-grams.
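An n-gram overlap measure in the spirit of ROUGE-n can be sketched in a few lines of Python. The sketch assumes whitespace tokenisation and lowercasing, uses clipped n-gram counts, and reports precision, recall and F1. It is a simplified stand-in for an official ROUGE implementation, and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) of n-gram overlap.

    candidate: generated (structured) summary text
    reference: manually generated summary text
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    generated = "stem cell research advances"
    manual = "congress funds stem cell research"
    print(rouge_n(generated, manual, n=1))
    print(rouge_n(generated, manual, n=2))
```

For the unigram case in the usage example, three of the four generated tokens appear in the manual summary of five tokens, giving a precision of 0.75 and a recall of 0.6.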
  • 15. Results
      • 10 queries used for evaluation (2007’s prominent events from Time magazine [1])
      • Human evaluation for summary relevance: 76% correctly generated
      • 17 evaluators with an average of 20 summaries evaluated
      [1] http://www.time.com/time/specials/2007/0,28757,1686204,00.html
      Generated structured summaries for the different queries:
          Query                #Q. Terms   #Doc.   #Summ.
          European Union               7     157      129
          Super Bowl                  13     370      325
          US Congress                 17      13       19
          Virginia Tech               28      12       11
          Stem Cell                    5     105       86
          Protest                      2     129      103
          Harry Potter                22      10        7
          Global Warming               5     198      170
          National Security            0     250      207
          Terrorist Attacks            0      57       52
  • 16. Results
      • ROUGE-1 evaluation results for the 10 queries
      • 25% precision and 32% recall as best performing results for ROUGE-1
      Figure: P/R/F1 measures based on the ROUGE-1 metric for the 10 queries used for evaluation
  • 17. Results: Sample Generated Summaries
      Query: “Stem Cell”
      – Democrats → applauded → Mr. Spitzer Eliot (Gov) calls → insure → 500,000 children → lack → health insurance → enrol → 900,000 adults → are → eligible Medicaid → enrolled → issue debt → pay → stem cell research.
      – Congress’s Shift in Power → revives → Medicare Debate House Democrats → try to rush → legislation → requiring → government → negotiate → lower drug prices for Medicare beneficiaries → overturning → President Bush’s restrictions on embryonic stem cell research.
      – The nation → welcome → ambitious agenda → being offered → today by the new Congress Democratic majority → raising → minimum wage → advancing → stem cell research → restoring → oversight of the executive branch.
      – New study → suggesting → useful stem cells → be derived → amniotic fluid without → destroying → embryos.
      – Swarns, Rachel L → announced → 9 Aug. federal government → pays → studies on stem cell colonies, lines → created before → that date, government → does not encourage → destruction of additional embryos.
      – Stem cell research → has not produced → a single medical treatment → is morally wrong → to create human life → to destroy → for research.
      – The measure → allow → scientists → receiving → federal funds → use → embryonic stem cells from surplus embryos → generated → fertility clinics, after cell lines → had been derived → by others → using → nonfederal funds.
  • 18. Conclusions
      • Query-based generated summaries
      • Contextualised Structured Summaries
          – Typing and expanding of queries using reference datasets
          – Automated pattern generation
      • Incorporated user interests and syntactical relevance of chunks of text
      • Multiple summary perspectives
      • Overall good accuracy of generated summaries
      • Infer new knowledge by interlinking summaries of different/same contexts
  • 19. Thank you! Questions?