Enterprise Application Development Group
University of Applied Sciences Zittau/Görlitz
The NXTM Project
Development of a technology for live analysis of data
streams with regard to semantics and cross-linked data
structures
Adam Bartusiak M.Sc.
University of Applied Sciences Zittau/Görlitz
January 7, 2015
SEMANTiCS’16 - 13.09.2016
Adam Bartusiak M.Sc.
University of Applied Sciences Zittau/Görlitz
Semantic Processing for the Conversion of
Unstructured Documents into Structured
Information in the Enterprise Context
The NXTM research project
Agenda
• Motivation
• The NXTM Project
• Data analysis
• Search Engine
• Representation Layer
• Use case
Adam Bartusiak M.Sc. : The NXTM research project 2/10
Motivation
• unstructured data overload (an estimated 80-90% of digital data)
• unstructured data is intended primarily for human consumption
• it holds useful knowledge that can be utilized for:
  • trend analytics
  • decision support
  • problem solving
  • discovering new facts and relations
• it can improve knowledge management within an enterprise
• it helps SMEs gain a sustainable competitive advantage in the market
The NXTM Project
• cooperation project between HSZG and an IT company from Dresden
• lifetime: January 2015 - October 2016
Goal:
Improving SMEs’ processes for extracting valuable business information from unstructured data (UD)
• extraction of structured data from unstructured data from multiple sources:
  • emails and text messages
  • MS Office and PDF documents
  • XML and HTML files
• flexible and intuitive graphical user interface enabling easy access to the analyzed data
• dynamic recognition and representation of linked information in documents
Data analysis
Architecture (diagram labels): Data Input Interface; NXTM Data and Text Analysis Engine (Metadata Analysis, Text Extraction, Segmentation, Morphology, Semantic Analysis, Similarity Analysis); Clustering Engine; Data Persistence Layer; DB; Mapper; Linked Open Data; Knowledge Integrator
1. import of documents as Java objects from the input pipeline
2. language identification, MIME type and metadata analysis
3. natural language (NL) processing in chained analysis engines and annotation of semantic information
4. similarity calculation and document clustering
5. storing extracted data in the DB, updating the search index
6. mapping annotated entities to Linked Open Data (LOD) resources
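The processing steps above can be read as a chain of analysis engines that each add annotations to a shared document object. The following is a minimal Python sketch of that idea, not the NXTM implementation (which the slides describe as Java-based). The `Document` class, the engine functions and the annotation keys are all hypothetical, and the toy engines only stand in for real metadata analysis, segmentation and morphology components.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container passed along the analysis chain."""
    doc_id: str
    text: str
    annotations: Dict[str, object] = field(default_factory=dict)


# An engine reads the document and adds its own annotations in place.
Engine = Callable[[Document], None]


def metadata_engine(doc: Document) -> None:
    # Stand-in for language/MIME type detection: records only the text length.
    doc.annotations["length"] = len(doc.text)


def segmentation_engine(doc: Document) -> None:
    # Toy segmentation: split on whitespace and strip basic punctuation.
    tokens = [t.strip(".,;:!?") for t in doc.text.split()]
    doc.annotations["tokens"] = [t for t in tokens if t]


def morphology_engine(doc: Document) -> None:
    # Toy normalisation: lower-casing instead of real lemmatisation.
    doc.annotations["lemmas"] = [t.lower() for t in doc.annotations["tokens"]]


def run_chain(doc: Document, engines: List[Engine]) -> Document:
    """Apply the engines in order; later engines may use earlier annotations."""
    for engine in engines:
        engine(doc)
    return doc


doc = run_chain(
    Document("doc-1", "Live analysis of data streams."),
    [metadata_engine, segmentation_engine, morphology_engine],
)
print(doc.annotations["lemmas"])
```

Keeping each engine as a small function with the same signature makes the order of the chain explicit and lets a later engine (here the morphology step) depend on annotations written by an earlier one.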
NXTM Item
• ID
• Type (DOC, ENT)
• Attribute []
• …
Attribute
• Predicate
• Value (NXTM_Item_ID; String)
• Provenance (NXTM_Item_ID)
• Confidence
• Access policy
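The item/attribute model listed above could be expressed roughly as follows. This is a sketch under assumptions, written in Python for brevity: the class and field names are derived from the slide labels, while the `ItemRef` wrapper, the confidence range, the string form of the access policy and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Union


class ItemType(Enum):
    DOC = "DOC"  # a document
    ENT = "ENT"  # an extracted entity


@dataclass(frozen=True)
class ItemRef:
    """Reference to another NXTM item by its ID (hypothetical wrapper)."""
    item_id: int


@dataclass
class Attribute:
    predicate: str
    value: Union[ItemRef, str]            # another item or a plain string
    provenance: Optional[ItemRef] = None  # item the statement was derived from
    confidence: float = 1.0               # assumed range 0.0 to 1.0
    access_policy: str = "public"         # assumed representation


@dataclass
class NxtmItem:
    item_id: int
    item_type: ItemType
    attributes: List[Attribute] = field(default_factory=list)

    def values(self, predicate: str) -> List[Union[ItemRef, str]]:
        """Return all values stored under the given predicate."""
        return [a.value for a in self.attributes if a.predicate == predicate]


# Illustrative data only: an entity whose name was found in document 1.
document = NxtmItem(1, ItemType.DOC)
entity = NxtmItem(
    301,
    ItemType.ENT,
    [
        Attribute("type", "PLACE"),
        Attribute("name", "NY", provenance=ItemRef(document.item_id), confidence=0.9),
    ],
)
print(entity.values("name"))
```

Storing provenance and confidence on each attribute, rather than on the item, means every single statement can be traced back to the document it came from.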
Search Engine
• querying the DB directly to retrieve the analysed data is an inefficient way of searching for information
• a semantic search machine can effectively search hierarchical data
• the search engine is still a subject of research
Index fields (NXTM Search Layer):
Field; Value
ID; NXTM_Item_ID
Content; LuceneAnalyzer
Semantic; SIREnAnalyzer
Architecture (diagram labels): Search query…; NXTM Search Layer; Semantic Search Machine; Clustering Engine; Data Persistence Layer; Results…
Representation Layer
• search results are represented as an interactive graph with nodes and edges
• real-time browsing of the graph enables the user to discover other relevant sources of information and their dependencies
• d3js.org JavaScript library
Architecture (diagram labels): NXTM Representation Layer; Standalone Frontend; Plugins & Apps
Document
Abstract: Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor…
Updated: 03.01.2003
Entity
Type: Person
Name: John Smith
Author of: XYZ
Title: XYZ
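To illustrate how search results might be handed to a graph front end, the sketch below converts a list of relations into a nodes/links structure, the general shape that D3 force-directed layouts commonly consume. It is an assumption about the data exchange, not the project's actual format: the triple layout, the `to_graph` helper and the `group` and `kind` keys are hypothetical.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical search result: (source, target, relation kind).
Triple = Tuple[str, str, str]


def to_graph(triples: List[Triple]) -> Dict[str, list]:
    """Collect unique nodes and one link per triple."""
    nodes: Dict[str, dict] = {}
    links: List[dict] = []
    for source, target, kind in triples:
        for node_id in (source, target):
            # Node group taken from the ID prefix, e.g. "DOC" or "ENT".
            nodes.setdefault(node_id, {"id": node_id, "group": node_id.split("#")[0]})
        links.append({"source": source, "target": target, "kind": kind})
    return {"nodes": list(nodes.values()), "links": links}


graph = to_graph([
    ("DOC#1", "DOC#2", "DOC-DOC"),
    ("DOC#1", "ENT#301", "DOC-ENT"),
    ("DOC#2", "ENT#301", "DOC-ENT"),
])
print(json.dumps(graph, indent=2))
```

On the JavaScript side, such a structure would typically be bound to a force simulation whose link force resolves `source` and `target` by node `id`.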
Use case
Document #1: Lorem ipsum dolor sit amet, consetetur NY elitr, sed diam nonumy eirmod tempor invidunt ut labore et NY dolore magna aliquyam erat, NY sed diam voluptua. At vero eos et accusam et justo duo dolores NY et ea rebum. Stet
Document #2: ipsum dolor sit amet. Lorem NY ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et
NXTM System
Entity
• ID #301
• type PLACE
• name NY (#1)
• name NY (#2)
Metadata
• createdIn NY
NXTM DB
NXTM Item
• ID #1
• Type DOC
• Attribute [] (Metadata)
NXTM Item
• ID #2
• Type DOC
• Attribute [] (Metadata)
NXTM Item
• ID #301
• Type ENT
• Attribute [] (Metadata)
Use case cont.
Query: New York
NXTM Results
ResultItem
• NXTM_ITEM_ID #1
• Score
• Attribute []
ResultItem
• NXTM_ITEM_ID #2
• Score
• Attribute []
ResultItem
• NXTM_ITEM_ID #301
• Score
• Attribute []
Result Triples
Source; Target; Distance
ResultItem#1; ResultItem#2; DOC-DOC
ResultItem#1; ResultItem#3; DOC-ENT
ResultItem#2; ResultItem#3; DOC-ENT
• DOC-DOC -> f(TF*IDF Similarity, Lucene score)
• DOC-ENT -> f(Confidence score, Lucene score)
Result graph (diagram labels): ENT #301; DOC #1; DOC #2
Node details (diagram labels): Adam Bartusiak … (Person); DOC#45; Metadata; Keywords
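The slides leave the combining function f unspecified. As one possible reading, the sketch below computes a TF*IDF cosine similarity between two tokenised documents and combines it with a search score through a weighted mean. The weighted mean, the `alpha` parameter, the smoothed IDF formula and the assumption that both scores are already scaled to [0, 1] (raw Lucene scores are not) are illustrative choices, not the project's method.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term by term frequency times smoothed inverse document frequency."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def edge_weight(relation_score: float, search_score: float, alpha: float = 0.5) -> float:
    """Assumed form of f: weighted mean of two scores already scaled to [0, 1]."""
    return alpha * relation_score + (1.0 - alpha) * search_score


docs = [
    "lorem ny ipsum dolor".split(),
    "ipsum ny dolor amet".split(),
]
vec_1, vec_2 = tfidf_vectors(docs)
similarity = cosine(vec_1, vec_2)
print(round(edge_weight(similarity, search_score=0.8), 3))
```

For a DOC-ENT edge, the same `edge_weight` helper could take the attribute's confidence score in place of the similarity value.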
Questions
Partners/Cooperations
a.bartusiak@hszg.de | ead.hszg.de

More Related Content

What's hot

Cloud Machine Learning with Google Cloud Platform
Cloud Machine Learning with Google Cloud PlatformCloud Machine Learning with Google Cloud Platform
Cloud Machine Learning with Google Cloud PlatformMichal Brys
 
BDE SC6.2 Workshop-05/12/16 - CESSDA
BDE SC6.2 Workshop-05/12/16 - CESSDABDE SC6.2 Workshop-05/12/16 - CESSDA
BDE SC6.2 Workshop-05/12/16 - CESSDABigData_Europe
 
BDE SC6-ws-05/12/2016 technology part - SWC
BDE SC6-ws-05/12/2016 technology part - SWCBDE SC6-ws-05/12/2016 technology part - SWC
BDE SC6-ws-05/12/2016 technology part - SWCBigData_Europe
 
College Van Trends Tot Innovatie
College Van Trends Tot InnovatieCollege Van Trends Tot Innovatie
College Van Trends Tot InnovatieWouter Meys
 
Enabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and ReuseEnabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and ReuseMarin Dimitrov
 
Memory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business InnovationMemory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business InnovationVoltDB
 
Big data Europe: concept, platform and pilots
Big data Europe: concept, platform and pilotsBig data Europe: concept, platform and pilots
Big data Europe: concept, platform and pilotsBigData_Europe
 
Amidst demo (BNAIC 2015)
Amidst demo (BNAIC 2015)Amidst demo (BNAIC 2015)
Amidst demo (BNAIC 2015)AMIDST Toolbox
 
Searching for reliable business information: free versus fee
Searching for reliable business information: free versus feeSearching for reliable business information: free versus fee
Searching for reliable business information: free versus feevoginip
 
ALLDATA 2015 - RDF Based Linked Data Management as a DaaS Platform
ALLDATA 2015 - RDF Based Linked Data Management as a DaaS PlatformALLDATA 2015 - RDF Based Linked Data Management as a DaaS Platform
ALLDATA 2015 - RDF Based Linked Data Management as a DaaS PlatformSeonho Kim
 
Redlink, The Data Linking API
Redlink, The Data Linking APIRedlink, The Data Linking API
Redlink, The Data Linking APISergio Fernández
 
Linked open data, its realization
Linked open data, its realizationLinked open data, its realization
Linked open data, its realizationSeonho Kim
 
Text Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-ServiceText Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-ServiceMarin Dimitrov
 
The changing landscape of search for business information
The changing landscape of search for business informationThe changing landscape of search for business information
The changing landscape of search for business informationvoginip
 
ComputableFacts: a Secure System to Store Documents and Graphs
ComputableFacts: a Secure System to Store Documents and GraphsComputableFacts: a Secure System to Store Documents and Graphs
ComputableFacts: a Secure System to Store Documents and GraphsAccumulo Summit
 
OpenStreetMap as base layer in a linked open data distribution platform - Ber...
OpenStreetMap as base layer in a linked open data distribution platform - Ber...OpenStreetMap as base layer in a linked open data distribution platform - Ber...
OpenStreetMap as base layer in a linked open data distribution platform - Ber...OSMFstateofthemap
 
Enabling Data Analytics from Knowledge Graphs @ ISWC 2017 Doctoral Consortium
Enabling Data Analytics from Knowledge Graphs @ ISWC 2017 Doctoral ConsortiumEnabling Data Analytics from Knowledge Graphs @ ISWC 2017 Doctoral Consortium
Enabling Data Analytics from Knowledge Graphs @ ISWC 2017 Doctoral ConsortiumHenrique O. Santos
 
How to Create the Google for Earth Data (XLDB 2015, Stanford)
How to Create the Google for Earth Data (XLDB 2015, Stanford)How to Create the Google for Earth Data (XLDB 2015, Stanford)
How to Create the Google for Earth Data (XLDB 2015, Stanford)Rainer Sternfeld
 
The Vienna History Wiki – a Collaborative Knowledge Platform for the City of...
The Vienna History Wiki –  a Collaborative Knowledge Platform for the City of...The Vienna History Wiki –  a Collaborative Knowledge Platform for the City of...
The Vienna History Wiki – a Collaborative Knowledge Platform for the City of...Bernhard Krabina
 

What's hot (20)

Cloud Machine Learning with Google Cloud Platform
Cloud Machine Learning with Google Cloud PlatformCloud Machine Learning with Google Cloud Platform
Cloud Machine Learning with Google Cloud Platform
 
BDE SC6.2 Workshop-05/12/16 - CESSDA
BDE SC6.2 Workshop-05/12/16 - CESSDABDE SC6.2 Workshop-05/12/16 - CESSDA
BDE SC6.2 Workshop-05/12/16 - CESSDA
 
BDE SC6-ws-05/12/2016 technology part - SWC
BDE SC6-ws-05/12/2016 technology part - SWCBDE SC6-ws-05/12/2016 technology part - SWC
BDE SC6-ws-05/12/2016 technology part - SWC
 
College Van Trends Tot Innovatie
College Van Trends Tot InnovatieCollege Van Trends Tot Innovatie
College Van Trends Tot Innovatie
 
Enabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and ReuseEnabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and Reuse
 
DMA Ignite Night - Status DMA
DMA Ignite Night - Status DMADMA Ignite Night - Status DMA
DMA Ignite Night - Status DMA
 
Memory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business InnovationMemory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business Innovation
 
Big data Europe: concept, platform and pilots
Big data Europe: concept, platform and pilotsBig data Europe: concept, platform and pilots
Big data Europe: concept, platform and pilots
 
Amidst demo (BNAIC 2015)
Amidst demo (BNAIC 2015)Amidst demo (BNAIC 2015)
Amidst demo (BNAIC 2015)
 
Searching for reliable business information: free versus fee
Searching for reliable business information: free versus feeSearching for reliable business information: free versus fee
Searching for reliable business information: free versus fee
 
ALLDATA 2015 - RDF Based Linked Data Management as a DaaS Platform
ALLDATA 2015 - RDF Based Linked Data Management as a DaaS PlatformALLDATA 2015 - RDF Based Linked Data Management as a DaaS Platform
ALLDATA 2015 - RDF Based Linked Data Management as a DaaS Platform
 
Redlink, The Data Linking API
Redlink, The Data Linking APIRedlink, The Data Linking API
Redlink, The Data Linking API
 
Linked open data, its realization
Linked open data, its realizationLinked open data, its realization
Linked open data, its realization
 
Text Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-ServiceText Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-Service
 
The changing landscape of search for business information
The changing landscape of search for business informationThe changing landscape of search for business information
The changing landscape of search for business information
 
ComputableFacts: a Secure System to Store Documents and Graphs
ComputableFacts: a Secure System to Store Documents and GraphsComputableFacts: a Secure System to Store Documents and Graphs
ComputableFacts: a Secure System to Store Documents and Graphs
 
OpenStreetMap as base layer in a linked open data distribution platform - Ber...
OpenStreetMap as base layer in a linked open data distribution platform - Ber...OpenStreetMap as base layer in a linked open data distribution platform - Ber...
OpenStreetMap as base layer in a linked open data distribution platform - Ber...
 
Enabling Data Analytics from Knowledge Graphs @ ISWC 2017 Doctoral Consortium
Enabling Data Analytics from Knowledge Graphs @ ISWC 2017 Doctoral ConsortiumEnabling Data Analytics from Knowledge Graphs @ ISWC 2017 Doctoral Consortium
Enabling Data Analytics from Knowledge Graphs @ ISWC 2017 Doctoral Consortium
 
How to Create the Google for Earth Data (XLDB 2015, Stanford)
How to Create the Google for Earth Data (XLDB 2015, Stanford)How to Create the Google for Earth Data (XLDB 2015, Stanford)
How to Create the Google for Earth Data (XLDB 2015, Stanford)
 
The Vienna History Wiki – a Collaborative Knowledge Platform for the City of...
The Vienna History Wiki –  a Collaborative Knowledge Platform for the City of...The Vienna History Wiki –  a Collaborative Knowledge Platform for the City of...
The Vienna History Wiki – a Collaborative Knowledge Platform for the City of...
 

Viewers also liked

“Semantic PDF Processing & Document Representation”
“Semantic PDF Processing & Document Representation”“Semantic PDF Processing & Document Representation”
“Semantic PDF Processing & Document Representation”diannepatricia
 
Ontologies for Mental Health and Disease
Ontologies for Mental Health and DiseaseOntologies for Mental Health and Disease
Ontologies for Mental Health and DiseaseJanna Hastings
 
Ontology-based Data Integration
Ontology-based Data IntegrationOntology-based Data Integration
Ontology-based Data IntegrationJanna Hastings
 
Entity-Relationship Extraction from Wikipedia Unstructured Text - Overview
Entity-Relationship Extraction from Wikipedia Unstructured Text - OverviewEntity-Relationship Extraction from Wikipedia Unstructured Text - Overview
Entity-Relationship Extraction from Wikipedia Unstructured Text - OverviewRadityo Eko Prasojo
 
Using AI to Make Sense of Customer Feedback
Using AI to Make Sense of Customer FeedbackUsing AI to Make Sense of Customer Feedback
Using AI to Make Sense of Customer FeedbackAlyona Medelyan
 
Natural Language Processing and Graph Databases in Lumify
Natural Language Processing and Graph Databases in LumifyNatural Language Processing and Graph Databases in Lumify
Natural Language Processing and Graph Databases in LumifyCharlie Greenbacker
 
Pipeline for automated structure-based classification in the ChEBI ontology
Pipeline for automated structure-based classification in the ChEBI ontologyPipeline for automated structure-based classification in the ChEBI ontology
Pipeline for automated structure-based classification in the ChEBI ontologyJanna Hastings
 
Large Scale Processing of Unstructured Text
Large Scale Processing of Unstructured TextLarge Scale Processing of Unstructured Text
Large Scale Processing of Unstructured TextDataWorks Summit
 

Viewers also liked (11)

“Semantic PDF Processing & Document Representation”
“Semantic PDF Processing & Document Representation”“Semantic PDF Processing & Document Representation”
“Semantic PDF Processing & Document Representation”
 
Ontologies for Mental Health and Disease
Ontologies for Mental Health and DiseaseOntologies for Mental Health and Disease
Ontologies for Mental Health and Disease
 
Ontology-based Data Integration
Ontology-based Data IntegrationOntology-based Data Integration
Ontology-based Data Integration
 
Ontology
OntologyOntology
Ontology
 
Entity-Relationship Extraction from Wikipedia Unstructured Text - Overview
Entity-Relationship Extraction from Wikipedia Unstructured Text - OverviewEntity-Relationship Extraction from Wikipedia Unstructured Text - Overview
Entity-Relationship Extraction from Wikipedia Unstructured Text - Overview
 
Using AI to Make Sense of Customer Feedback
Using AI to Make Sense of Customer FeedbackUsing AI to Make Sense of Customer Feedback
Using AI to Make Sense of Customer Feedback
 
Natural Language Processing and Graph Databases in Lumify
Natural Language Processing and Graph Databases in LumifyNatural Language Processing and Graph Databases in Lumify
Natural Language Processing and Graph Databases in Lumify
 
Knowledge representation
Knowledge representationKnowledge representation
Knowledge representation
 
Pipeline for automated structure-based classification in the ChEBI ontology
Pipeline for automated structure-based classification in the ChEBI ontologyPipeline for automated structure-based classification in the ChEBI ontology
Pipeline for automated structure-based classification in the ChEBI ontology
 
Large Scale Processing of Unstructured Text
Large Scale Processing of Unstructured TextLarge Scale Processing of Unstructured Text
Large Scale Processing of Unstructured Text
 
AI and the Future of Growth
AI and the Future of GrowthAI and the Future of Growth
AI and the Future of Growth
 

Similar to Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Unstructured Documents into Structured Information in the Enterprise Context

10 years of IBM Connections
10 years of IBM Connections10 years of IBM Connections
10 years of IBM ConnectionsLetsConnect
 
Official resume titash_mandal_
Official resume titash_mandal_Official resume titash_mandal_
Official resume titash_mandal_Titash Mandal
 
BRISSKit: biomedical research made easy - Jisc Digital Festival 2015
BRISSKit: biomedical research made easy - Jisc Digital Festival 2015BRISSKit: biomedical research made easy - Jisc Digital Festival 2015
BRISSKit: biomedical research made easy - Jisc Digital Festival 2015Jisc
 
Data Science Master 4.0 on Belgrade University - Drazen Draskovic
Data Science Master 4.0 on Belgrade University - Drazen DraskovicData Science Master 4.0 on Belgrade University - Drazen Draskovic
Data Science Master 4.0 on Belgrade University - Drazen DraskovicInstitute of Contemporary Sciences
 
Data science presentation 2nd CI day
Data science presentation 2nd CI dayData science presentation 2nd CI day
Data science presentation 2nd CI dayMohammed Barakat
 
On Computer Science Trends and Priorities in Palestine
On Computer Science Trends and Priorities in PalestineOn Computer Science Trends and Priorities in Palestine
On Computer Science Trends and Priorities in PalestineMustafa Jarrar
 
TechEvent Customer Project "Trend-Analytics"
TechEvent Customer Project "Trend-Analytics"TechEvent Customer Project "Trend-Analytics"
TechEvent Customer Project "Trend-Analytics"Trivadis
 
Deep Learning for Information Extraction in Natural Language Text
Deep Learning for Information Extraction in Natural Language TextDeep Learning for Information Extraction in Natural Language Text
Deep Learning for Information Extraction in Natural Language TextPankaj Gupta, PhD
 
Research data spring - Jisc Digital Festival 2015
Research data spring - Jisc Digital Festival 2015Research data spring - Jisc Digital Festival 2015
Research data spring - Jisc Digital Festival 2015Jisc
 
AIVS - AI, Industrial Data Space, and Innovation Transformation
AIVS - AI, Industrial Data Space, and Innovation TransformationAIVS - AI, Industrial Data Space, and Innovation Transformation
AIVS - AI, Industrial Data Space, and Innovation Transformationpantapong
 
Internet of Things - Steven Davy 2015
Internet of Things - Steven Davy 2015Internet of Things - Steven Davy 2015
Internet of Things - Steven Davy 2015Walton Institute
 
Towards a Community-driven Data Science Body of Knowledge – Data Management S...
Towards a Community-driven Data Science Body of Knowledge – Data Management S...Towards a Community-driven Data Science Body of Knowledge – Data Management S...
Towards a Community-driven Data Science Body of Knowledge – Data Management S...Research Data Alliance
 
OSFair2017 Workshop | Brokering services facilitating interoperability and da...
OSFair2017 Workshop | Brokering services facilitating interoperability and da...OSFair2017 Workshop | Brokering services facilitating interoperability and da...
OSFair2017 Workshop | Brokering services facilitating interoperability and da...Open Science Fair
 

Similar to Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Unstructured Documents into Structured Information in the Enterprise Context (20)

10 years of IBM Connections
Adam Bartusiak and Jörg Lässig | Semantic Processing for the Conversion of Unstructured Documents into Structured Information in the Enterprise Context

  • 1. Enterprise Application Development Group University of Applied Sciences Zittau/Görlitz The NXTM Project Development of a technology for live analysis of data streams with regard to semantics and cross-linked data structures Adam Bartusiak M.Sc. University of Applied Sciences Zittau/Görlitz January 7, 2015 SEMANTiCS’16 - 13.09.2016 Adam Bartusiak M.Sc. University of Applied Sciences Zittau/Görlitz Semantic Processing for the Conversion of Unstructured Documents into Structured Information in the Enterprise Context The NXTM research project
  • 2. Agenda
      • Motivation
      • The NXTM Project
      • Data analysis
      • Search Engine
      • Representation Layer
      • Use case
  • 3. Motivation
      • unstructured data overload (80-90% of digital data)
      • unstructured data is intended mainly for human consumption
      • it holds useful knowledge that can be utilized for:
          • trend analytics
          • decision support
          • problem solving
          • discovering new facts and relations
      • it can improve knowledge management within the enterprise
      • it helps SMEs gain a sustainable competitive advantage in the market
  • 4-7. The NXTM Project
      • cooperation project between HSZG and an IT company from Dresden
      • lifetime: January 2015 - October 2016
      • Goal: improving SMEs’ processes for extracting valuable business information from unstructured data (UD)
      • extraction of structured data from unstructured data from multiple sources:
          • emails and text messages
          • MS Office and PDF documents
          • XML and HTML files
      • dynamic recognition and representation of linked information in documents
      • flexible and intuitive graphical user interface enabling easy access to the analyzed data
  • 8-13. Data analysis
      Diagram components: Data Input Interface; NXTM Data and Text Analysis Engine (Metadata Analysis, Text Extraction, Segmentation, Morphology, Semantic Analysis, Similarity Analysis); Linked Open Data Knowledge Integrator; Data Persistence Layer; DB Mapper; Clustering Engine
      1. import of documents as Java objects from the input pipeline
      2. language identification, MIME type and metadata analysis
      3. natural language processing in chained analysis engines, annotating semantic information
      4. similarity calculation and document clustering
      5. storing extracted data in the DB, updating the search index
      6. mapping annotated entities to LOD resources
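The "chained analysis engines" of step 3 can be pictured as a list of stages that each enrich the same document object. The following is only a minimal Java sketch of that idea: the names `AnalyzedDocument`, `AnalysisStage` and `AnalysisChain` are hypothetical, and the two toy stages (whitespace segmentation, capitalised tokens as entity candidates) are stand-ins, not the NXTM engines.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical document wrapper: raw text plus annotations added stage by stage. */
final class AnalyzedDocument {
    final String text;
    final List<String> tokens = new ArrayList<>();
    final List<String> entityCandidates = new ArrayList<>();

    AnalyzedDocument(String text) {
        this.text = text;
    }
}

/** One engine in the chain; this interface is an assumption, not taken from the slides. */
interface AnalysisStage {
    void process(AnalyzedDocument doc);
}

/** Runs the registered stages in order, each one enriching the same document. */
final class AnalysisChain {
    private final List<AnalysisStage> stages = new ArrayList<>();

    AnalysisChain add(AnalysisStage stage) {
        stages.add(stage);
        return this;
    }

    AnalyzedDocument run(String text) {
        AnalyzedDocument doc = new AnalyzedDocument(text);
        for (AnalysisStage stage : stages) {
            stage.process(doc);
        }
        return doc;
    }
}

final class AnalysisChainDemo {
    /** Toy chain: whitespace segmentation, then capitalised tokens as entity candidates. */
    static AnalysisChain toyChain() {
        return new AnalysisChain()
            .add(doc -> {
                for (String token : doc.text.trim().split("\\s+")) {
                    if (!token.isEmpty()) {
                        doc.tokens.add(token);
                    }
                }
            })
            .add(doc -> {
                // Relies on the tokens produced by the previous stage.
                for (String token : doc.tokens) {
                    if (Character.isUpperCase(token.charAt(0))) {
                        doc.entityCandidates.add(token);
                    }
                }
            });
    }

    public static void main(String[] args) {
        AnalyzedDocument doc = toyChain().run("John Smith wrote the report");
        System.out.println(doc.tokens);
        System.out.println(doc.entityCandidates);
    }
}
```

Because each stage only reads and writes the shared document, further stages (for example morphology or similarity analysis) could be appended without changing the existing ones.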
  • 14. Data analysis. NXTM Item: ID; Type (DOC, ENT); Attribute []; … Attribute: Predicate; Value (NXTM_Item_ID; String); Provenance (NXTM_Item_ID); Confidence; Access policy. Steps: 1. import of documents as Java objects from the input pipeline; 2. language identification, MIME type and metadata analysis; 3. NL processing in chained analysis engines, annotating semantic information; 4. similarity calculation and document clustering; 5. storing extracted data in the DB, updating the search index; 6. mapping annotated entities to LOD resources.
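The NXTM Item structure listed on this slide can be sketched as two small record types. The field names follow the slide. The Python types, the default values, the `sources` helper and the confidence numbers in the example are assumptions made for illustration, and the access policy is modeled as a plain string because the slides do not say how it is represented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Union


class ItemType(Enum):
    DOC = "DOC"  # a document
    ENT = "ENT"  # an extracted entity


@dataclass
class Attribute:
    predicate: str
    # Either the ID of another NXTM item (int here) or a literal string.
    value: Union[int, str]
    # ID of the item this attribute was derived from, if any.
    provenance: Optional[int] = None
    confidence: float = 1.0          # assumed range 0..1
    access_policy: str = "public"    # hypothetical representation


@dataclass
class NXTMItem:
    id: int
    type: ItemType
    attributes: List[Attribute] = field(default_factory=list)


def sources(item: NXTMItem) -> List[int]:
    """IDs of all items that contributed at least one attribute."""
    return sorted({a.provenance for a in item.attributes if a.provenance is not None})


# Entity #301 from the use-case slide: the name "NY" was found in documents #1 and #2.
entity = NXTMItem(
    id=301,
    type=ItemType.ENT,
    attributes=[
        Attribute("type", "PLACE"),
        Attribute("name", "NY", provenance=1, confidence=0.9),  # illustrative value
        Attribute("name", "NY", provenance=2, confidence=0.8),  # illustrative value
    ],
)
print(sources(entity))
```

Keeping provenance on each attribute, rather than on the item, lets one entity collect evidence from several documents, which is what the use case with documents #1 and #2 relies on.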
  • 15. Search Engine. Diagram: Search query… ; Semantic Search Machine; NXTM Search Layer; Data Persistence Layer; Clustering Engine; Results… Index fields (Field: Value): ID: NXTM_Item_ID; Content: LuceneAnalyzer; Semantic: SIREnAnalyzer. Points: querying the DB directly to retrieve the analysed data is an inefficient way of searching for information; a semantic search machine can search hierarchical data effectively; the search engine is still a subject of research.
  • 16. Representation Layer. Points: search results are represented as an interactive graph with nodes and edges; real-time browsing of the graph enables the user to discover other relevant sources of information and their dependencies; d3js.org JavaScript library. Diagram: Standalone Frontend; Plugins & Apps; NXTM Representation Layer. Example nodes: Document (Abstract: Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor…; Updated: 03.01.2003); Entity (Type: Person; Name: John Smith; Author of: XYZ); Title: XYZ.
  • 17. Use case. Sample documents #1 and #2 (placeholder text containing the place name NY): Lorem ipsum dolor sit amet, consetetur NY elitr, sed diam nonumy eirmod tempor invidunt ut labore et NY dolore magna aliquyam erat, NY sed diam voluptua. At vero eos et accusam et justo duo dolores NY et ea rebum. Stet #1 ipsum dolor sit amet. Lorem NY ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et #2. Entity: ID #301; type PLACE; name NY (#1); name NY (#2). Metadata: createdIn NY. Diagram labels: NXTM System; NXTM DB. Items: NXTM Item (ID #1; Type DOC; Attribute [] (Metadata)); NXTM Item (ID #2; Type DOC; Attribute [] (Metadata)); NXTM Item (ID #301; Type ENT; Attribute [] (Metadata)).
  • 18. Use case cont. Query: New York. NXTM Results: ResultItem (NXTM_ITEM_ID #1; Score; Attribute []); ResultItem (NXTM_ITEM_ID #2; Score; Attribute []); ResultItem (NXTM_ITEM_ID #301; Score; Attribute []). Result Triples (Source; Target; Distance): ResultItem#1; ResultItem#2; DOC-DOC. ResultItem#1; ResultItem#3; DOC-ENT. ResultItem#2; ResultItem#3; DOC-ENT. Graph nodes: ENT #301; DOC #1; DOC #2. Distances: DOC-DOC -> f(TF*IDF Similarity, Lucene score); DOC-ENT -> f(Confidence score, Lucene score). Graph example labels: Adam Bartusiak …; Person; DOC#45; Metadata; Keywords.
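The result triples on this slide pair up result items and attach a distance of the form f(similarity or confidence, Lucene score). The slides do not define f, nor which item's Lucene score enters it, so the sketch below makes two loudly labeled assumptions: f is a plain product, and the Lucene score of a pair is the mean of the two items' scores. All names (`ResultItem`, `combine`, `build_triples`) and all numeric values are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, NamedTuple, Tuple


class ResultItem(NamedTuple):
    item_id: int
    kind: str      # "DOC" or "ENT"
    score: float   # search (Lucene) score of this item for the query


def combine(link_score: float, lucene_score: float) -> float:
    """ASSUMPTION: the slides leave f unspecified; a product is used here."""
    return link_score * lucene_score


def build_triples(
    items: List[ResultItem],
    doc_similarity: Dict[Tuple[int, int], float],     # keyed by (doc_id, doc_id), sorted
    entity_confidence: Dict[Tuple[int, int], float],  # keyed by (doc_id, ent_id)
) -> List[Tuple[int, int, str, float]]:
    """Return (source, target, relation, strength) for every DOC-DOC and DOC-ENT pair."""
    triples = []
    for a, b in combinations(items, 2):
        # ASSUMPTION: the pair's Lucene score is the mean of both item scores.
        lucene = (a.score + b.score) / 2
        kinds = {a.kind, b.kind}
        if kinds == {"DOC"}:
            key = tuple(sorted((a.item_id, b.item_id)))
            triples.append((a.item_id, b.item_id, "DOC-DOC", combine(doc_similarity[key], lucene)))
        elif kinds == {"DOC", "ENT"}:
            doc, ent = (a, b) if a.kind == "DOC" else (b, a)
            key = (doc.item_id, ent.item_id)
            triples.append((a.item_id, b.item_id, "DOC-ENT", combine(entity_confidence[key], lucene)))
        # ENT-ENT pairs do not appear on the slide and are skipped.
    return triples


items = [ResultItem(1, "DOC", 0.8), ResultItem(2, "DOC", 0.6), ResultItem(301, "ENT", 0.9)]
triples = build_triples(
    items,
    doc_similarity={(1, 2): 0.5},                       # illustrative TF*IDF similarity
    entity_confidence={(1, 301): 0.9, (2, 301): 0.7},   # illustrative confidences
)
for triple in triples:
    print(triple)
```

With the three result items from the slide this yields one DOC-DOC triple and two DOC-ENT triples, matching the three rows of the Result Triples table; only the numeric strengths depend on the assumed f.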
  • 19. Questions. Partners/Cooperations. a.bartusiak@hszg.de | ead.hszg.de