Domain Knowledge makes Artificial
Intelligence Smart
Linda Andersson
AI-SDV 2022
© Artificial Researcher, 2022
Linda Andersson
CEO/CTO Artificial Researcher IT GmbH
Awards
• 2018 Commercial Viability award by
the Austrian Angel Investors
Association.
• 2017 PhD high potential R&D award,
i2c Award, Vienna University of
Technology
• 1999 Finalist in a Venture Cup held at
Mälardalens University College.
Research fields
• Natural Language Understanding
• Natural Language Processing
• Information Extraction
• Information Retrieval
Academic merits
• Information design, 1998-2001
• General Linguistics, 2001-2003
• Computational Linguistics, 2006-2009
• Language Engineering, 2004
• Library & Information Science, 2004-2006
• Computer Science, 2009-2019
“There is little you can do about prior art which
wasn’t found in the search that you made before
filing the patent application.”
Barry Franks Hynell, “Opposition before the EPO”, preface to
Winning with IP: managing intellectual property today, ISBN 978-1-9998329-6-4
Information Need: "Prior Art"
No. 37435 Benz Patent-Motorwagen (1886)
Search for:
• Problem and Solution (A)
• Scope of the invention (A, B, C)
• Specific technical details (B, C)
[Figure: patent drawing with regions labelled A, B and C; callouts: steering mechanism, engine function, the transport vehicle]
What makes an AI tool technically successful?
• A successful text mining solution does not focus only on developing the
technology or the best deep learning algorithm; it is much more complex.
The technology design reflects:
• Task Complexity
• Information needs: retrieve relevant paragraphs, not just documents
• Language Complexity
• The formation of new words is particularly important in the patent text
genre.
• Domain text-genre Complexity
• Multi-word terms
Known item search Explorative search
Find relevant paragraphs
Scientific Summarization
Comparative Summarization
Requests
Output example
Reading text
Different Information Needs
Task Complexity
Next generation Search Technology
Our search software generates the query
in the specific domain
Retrieves relevant paragraphs
Graph Search & Text Box Search
We have developed text mining technologies by combining traditional
Natural Language Processing and Deep Learning technologies.
Task Complexity
Example of automatic query generation
Automatic query expansion terms
Automatic Query formulation
- From text segment to automatic Boolean query
Task Complexity
Automatic Query formulation
- From terms to concept term graphs
https://graph-demo.artificialresearcher.com/
Term: environment climate fuel biofuel
(fuel OR gas OR engine OR vehicle OR oil OR coal OR energy OR hydrocarbon OR steam OR
liquid OR fluid OR furnace OR lubricant OR hydrogen OR feed) AND ("diesel fuel"~4 OR "liquid
fuel"~4 OR "fuel gas"~4 OR "fuel oil"~4 OR "fuel material"~4 OR "hydrogen gas"~4 OR "diesel
oil"~4 OR "liquid hydrocarbon"~4 OR "burning gas"~4 OR "synthetic gas"~4 OR "suitable
hydrocarbon"~4 OR "mixed gas"~4 OR "tail gas"~4)
With context similarity of 0.85
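The example query above has two parts: a disjunction of single-word expansion terms and a disjunction of quoted phrases, each carrying a proximity operator (`~4`). A minimal Python sketch of how such a string could be assembled from two term lists is given below. The function name `build_boolean_query`, its parameters and the short term lists are illustrative assumptions, not the Artificial Researcher implementation, and the step that selects expansion terms by context similarity is not modelled.

```python
def build_boolean_query(single_terms, phrases, slop=4):
    """Assemble a Boolean query string in the style of the slide's example.

    single_terms: single-word expansion terms, OR-ed together.
    phrases: multi-word terms, each quoted and given a proximity operator (~slop).
    Hypothetical helper for illustration only.
    """
    if not single_terms or not phrases:
        raise ValueError("both term lists must be non-empty")
    word_clause = " OR ".join(single_terms)
    phrase_clause = " OR ".join(f'"{phrase}"~{slop}' for phrase in phrases)
    return f"({word_clause}) AND ({phrase_clause})"


# Small subset of the terms shown on the slide.
query = build_boolean_query(
    ["fuel", "gas", "engine"],
    ["diesel fuel", "liquid fuel"],
)
print(query)
```

Keeping the single-word clause and the phrase clause separate means either list can be extended or pruned without touching the other.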
Task Complexity
Artificial Researcher-Search Solutions
Relation graph for
terms extracted
from patent and
scientific
publications
• Direct access
• Relevant paragraphs (passages)
• Abstracts
• Bibliographic information
• Links to full text documents
• Download
• Paragraphs including all bibliographic data, including
identified technical terms, subject classification terms,
scientific classifications
• Standard Index Collections
• Open Access scientific articles
• access to 204,582,649 – CORE.uk
• Patents
• EP full-text data for text analytics
• Medical collections
• COVID-19 Open Research Dataset
Task Complexity
How indices and ontologies are generated
• Technology and data flow overview
• Language and Domain Complexity
From Text to Structured Semantic Representation
of Patents and Scientific Publications
[Figure: data pipeline diagram with numbered steps 1 to 6. Components: any data collections (input); NLP pipeline, PoS & chunker, lexico-semantic patterns; harmonization & text normalization; external language resources, gazetteers; ontology (OWL, JSON); ontology & index repositories; metadata; external information resources; API (output); search options; graph search; Artificial Researcher NLP-Toolkit (Word Similarity Model, Context Similarity Model, Task Model)]
Documents are processed through several text analysis modules:
1. Text Normalization and harmonization (including PDF converter)
2. NLP and relation identification
3. Artificial Researcher NLP-Toolkit - Technical Terms
4. Packaged as enhanced XML/JSON
5. Software: AR Ontology Services and AR Passage Retrieval
6. End-user applications: REST APIs, desktop script, GUI
Cloud platforms (Google, Azure 2023)
Internal information resources (analyses are processed with SaaS, so no data are stored in the cloud)
Artificial Researcher Data Pipeline solution
Assembly line for creating Indices and Ontologies
Combining NLP & Distributional
Semantics
\[
\mathit{JoinedSimilarity} = \sum_{\substack{i,j=1 \\ i \neq j \\ i<j}}^{N} \frac{\cos(w_i, w_j)}{N}
\]
• cos(w_i, w_j) is the cosine similarity of each pair of word vectors w_i, w_j in a multi-word term (MWT)
• N is the number of words in the MWT (Andersson 2013-2017)
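The JoinedSimilarity score described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: word vectors are plain lists of floats, the sum runs over every unordered pair of words in the MWT, and the total is divided by N, the number of words, as the slide's formula indicates. The function names `cosine` and `joined_similarity` are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def joined_similarity(word_vectors):
    """Sum of pairwise cosine similarities (i < j) divided by the number of words N."""
    n = len(word_vectors)
    if n < 2:
        return 0.0  # a single word has no pairs to compare
    total = sum(
        cosine(word_vectors[i], word_vectors[j])
        for i, j in combinations(range(n), 2)
    )
    return total / n


# Toy 2-dimensional vectors for a three-word term (values are made up).
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
score = joined_similarity(toy_vectors)
print(score)
```

In the toy example only the first and third vectors are aligned, so the single non-zero pair contributes 1.0 and the score is 1/3.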
Embeddings identify similarities between different words; the standard reference
threshold for word2vec (w2v) is 0.60
• Underwear is similar to underpants, undergarment, panties, underclothes
• Strength is similar to strengths, strength, toughness, stronger, sfrength
Technical semantic relations are a mixture of single words and
phrases
• synthetic fibers synonym to polyester fibers
• thrips hypernym to bulb fly larvae
Language and Domain Complexity
What is a Technical Term?
Depends on who you are asking!
Language and Domain Complexity
BERT – Technical Term Prediction
• Task: detect domain term spans in sentences
Tokens: A  network  synchronization  method  allows
Tags:   O  I        I                I       O
Domain term: network synchronization method
• Using pre-trained SciBERT Model
• With additional training data for Technical Term detection
• Randomly picked 22 patents with different International Patent Classification codes
• 10,337 sentences; technical term dictionary of size 5,099
Fink T., Andersson L., Hanbury A. (2021). Detecting Multi Word Terms in patents the same way as entities. World Patent Information, 67, 102078, 1-6.
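The token-level tags in the example (I for tokens inside a domain term, O for all others) can be turned back into term spans with a short decoding step. The sketch below is a generic illustration of that step, not code from the cited paper; the function name `extract_term_spans` is hypothetical, and with a plain I/O scheme two directly adjacent terms would be merged into one span.

```python
def extract_term_spans(tokens, tags):
    """Group consecutive tokens tagged 'I' into domain term strings."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "I":
            current.append(token)
        elif current:
            # An 'O' tag closes the span collected so far.
            spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))  # term that runs to the end of the sentence
    return spans


tokens = "A network synchronization method allows".split()
tags = ["O", "I", "I", "I", "O"]
terms = extract_term_spans(tokens, tags)
print(terms)  # ['network synchronization method']
```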
Language and Domain Complexity
Technical term identification
- Evaluation using WIPO PEARL Terminology DB
RECALL AND PRECISION ASSESSMENT
WIPO CONCEPT RECALL 81%
ALL DOMAIN TERM RECALL 96%
ALL DOMAIN TERM PRECISION 85%
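Recall and precision figures of this kind compare the set of extracted terms with a reference term list. The sketch below shows the standard set-based definitions with exact string matching; the matching rules actually used against WIPO PEARL are not given on the slide, so exact matching, the function name `precision_recall` and the toy term sets are all assumptions.

```python
def precision_recall(predicted_terms, reference_terms):
    """Exact-match precision and recall of predicted terms against a reference set."""
    predicted = set(predicted_terms)
    reference = set(reference_terms)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy data: two of three predictions are correct, two of four reference terms are found.
predicted = {"brake pedal", "pedal device", "the method"}
reference = {"brake pedal", "pedal device", "steering gear", "accelerator pedal"}
precision, recall = precision_recall(predicted, reference)
print(precision, recall)  # precision = 2/3, recall = 0.5
```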
Language and Domain Complexity
What are Semantic Lexical Relations?
Hyponym, a narrower term,
e.g. horse is a hyponym of farm animals
Hypernym, a broader term,
e.g. farm animals is a hypernym of pig, cow, sheep, horse
Hyponymy relations:
• part of
• kind of
• member of
• type of
• domain specific relation
…. farm animals such as pigs, sheep, cows and horses
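The "such as" construction in the example is a lexico-semantic pattern of the kind commonly called a Hearst pattern. The regular-expression sketch below shows how hyponym/hypernym pairs could be read off such a sentence. It is a deliberately simplified illustration: it treats the one or two words before "such as" as the hypernym, where a real pipeline would use noun-phrase chunks, and the names `SUCH_AS`, `LIST_SEPARATOR` and `extract_hyponym_pairs` are hypothetical.

```python
import re

# Hypernym: the one or two words directly before "such as" (a simplification).
SUCH_AS = re.compile(r"(?P<hyper>\w+(?: \w+)?) such as (?P<hypos>[\w ,]+)")
# Separators between list items: commas, optionally followed by and/or, or a bare and/or.
LIST_SEPARATOR = re.compile(r"\s*,\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def extract_hyponym_pairs(sentence):
    """Return (hyponym, hypernym) pairs found by the 'X such as A, B and C' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group("hyper")
        for item in LIST_SEPARATOR.split(match.group("hypos")):
            item = item.strip()
            if item:
                pairs.append((item, hypernym))
    return pairs


pairs = extract_hyponym_pairs("farm animals such as pigs, sheep, cows and horses")
for hyponym, hypernym in pairs:
    print(f"{hyponym} -> {hypernym}")
```

Each pair records one "kind of" relation, for example that pigs are a kind of farm animals.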
Language and Domain complexity
Better than Distributional Semantics
Technical term: brake pedal
Artificial Researcher's text mining technology (related concepts):
• vehicle operating pedal
• conventional hydraulic brake system
• pedal devices
• position brake actuating member
• brake actuating member
• hydraulically-assisted rack pinion steering gear
• brake operating member
• conventional braking system
• pair pedals
• accelerator pedal
• case pedal device
• pedal device
Extraction using pre-trained contextual models only:
brake pedal                                        PatBERT  SciBERT
conventional hydraulic brake system                0.69     0.89
hydraulically-assisted rack pinion steering gear   0.49     0.80
conventional braking system                        0.66     0.84
With semantically enriched data, up to 60% more related concepts can be retrieved than with traditional pre-trained
contextual models alone (neural network-based methods, i.e. the BERT family).
Language and Domain complexity
Proof-of-concept
- Passage retrieval performance on the CLEF-IP 2013 text collection.
Method                      PRES@100  Recall@100  MAP@100  Prec(D)
(Albarede et al., 2021)     0.461     0.603       0.173    0.231
(Andersson et al., 2016)    0.444     0.560       0.187    0.282
(Andersson et al., 2017)    0.558     0.647       0.269    0.207
(Luo and Yang, 2013)        0.433     0.540       0.191    0.213
Artificial Researcher Passage Retrieval Service™
Artificial Researcher Graph Search Service™
Albarede L., Mulhem P., Goeuriot L., Le Pape-Gardeux C., Sylvain M., Trindad C. (2021). Passage retrieval in context: Experiments on Patents.
Andersson L., Lupu M., Palotti J., Hanbury A., Rauber A. (2016). When is the time Ripe for Natural Language Processing for Patent Passage Retrieval? In Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM 2016).
Andersson L., Rekabsaz N., Hanbury A. (2017). Automatic query expansion for patent passage retrieval using paradigmatic and syntagmatic information. The first WiNLP Workshop, co-located with the Annual Meeting of the Association for Computational Linguistics (ACL 2017), Vancouver.
Luo J., Yang H. (2013). Query formulation for prior art search: Georgetown University at CLEF-IP 2013. In Proc. of CLEF.
Language and Domain complexity
New REST API service
Text classification and semantic relation extraction
Output
Automatic classification according to
taxonomies such as IPC/CPC, MeSH,
PLoS, WIPO Scientific Subject fields
Input
Artificial Researcher - Services & Products
Test access: API key b10225791d2a478c9ff3d8cd106ad65a
• Artificial Researcher Passage Retrieval Service
• Access: REST API, Desktop App Script (Python)
• https://swagger.artificialresearcher.com/ (developer)
• https://passageretrieval.artificialresearcher.com/ (demo)
• SaaS or on-premises software (Docker)
• Artificial Researcher Ontology Service
• Access: REST API, Desktop App Script (Python)
• https://swagger.artificialresearcher.com/?urls.primaryName=onto-api (developer)
• Data format OWL or JSON
• SaaS or on-premises software (Docker)
• Ontology generator – only SaaS
• Artificial Researcher NLP-toolkit Services
• Access: REST API
• https://swagger.artificialresearcher.com/?urls.primaryName=unified-nlp-server (developer)
• SaaS or on-premises software (Docker)
• Artificial Researcher Graph Search Service
• UX: https://graph-demo.artificialresearcher.com/ (demo)
Thank you for your attention
artificialresearcher.com
linda.andersson@artificialresearcher.com

Contenu connexe

Similaire à Domain Knowledge Powers AI with Context

Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021Gérard Dupont
 
SFScon21 - Simone Tritini - The Environmental Data Platform web portal
SFScon21 - Simone Tritini - The Environmental Data Platform web portalSFScon21 - Simone Tritini - The Environmental Data Platform web portal
SFScon21 - Simone Tritini - The Environmental Data Platform web portalSouth Tyrol Free Software Conference
 
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...mlaij
 
Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...Gautier Poupeau
 
International Conference on NLP, Data Mining and Machine Learning (NLDML 2022)
International Conference on NLP, Data Mining and Machine Learning (NLDML 2022)International Conference on NLP, Data Mining and Machine Learning (NLDML 2022)
International Conference on NLP, Data Mining and Machine Learning (NLDML 2022)kevig
 
Semantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaSemantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaGiorgia Lodi
 
[ENCORE webinar] Artificial Intelligence for mapping skills of the future
[ENCORE webinar] Artificial Intelligence for mapping skills of the future[ENCORE webinar] Artificial Intelligence for mapping skills of the future
[ENCORE webinar] Artificial Intelligence for mapping skills of the futureEADTU
 
Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...
Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...
Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...EADTU
 
SiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdfSiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdfSiddhartha Mitra
 
8th International Conference on Natural Language Computing (NATL 2022)
8th International Conference on Natural Language Computing (NATL 2022)8th International Conference on Natural Language Computing (NATL 2022)
8th International Conference on Natural Language Computing (NATL 2022)ijcsity
 
Call for Papers - 8th International Conference on Natural Language Computing ...
Call for Papers - 8th International Conference on Natural Language Computing ...Call for Papers - 8th International Conference on Natural Language Computing ...
Call for Papers - 8th International Conference on Natural Language Computing ...ijistjournal
 
Call for Paper - 3rd International Conference on NLP & Information Retrieval ...
Call for Paper - 3rd International Conference on NLP & Information Retrieval ...Call for Paper - 3rd International Conference on NLP & Information Retrieval ...
Call for Paper - 3rd International Conference on NLP & Information Retrieval ...mlaij
 
Béatrice Markhoff - Semantic mediation ArSol and CIDOC CRM
Béatrice Markhoff - Semantic mediation ArSol and CIDOC CRMBéatrice Markhoff - Semantic mediation ArSol and CIDOC CRM
Béatrice Markhoff - Semantic mediation ArSol and CIDOC CRMariadnenetwork
 
2019 IRIS-HEP AS workshop: Particles and decays
2019 IRIS-HEP AS workshop: Particles and decays2019 IRIS-HEP AS workshop: Particles and decays
2019 IRIS-HEP AS workshop: Particles and decaysHenry Schreiner
 
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...mlaij
 
3rd International Conference on NLP & Information Retrieval (NLPI 2022)
3rd International Conference on NLP & Information Retrieval (NLPI 2022)3rd International Conference on NLP & Information Retrieval (NLPI 2022)
3rd International Conference on NLP & Information Retrieval (NLPI 2022)ijccsa
 
8 th International Conference on Natural Language Computing (NATL 2022)
8 th International Conference on Natural Language Computing (NATL 2022)8 th International Conference on Natural Language Computing (NATL 2022)
8 th International Conference on Natural Language Computing (NATL 2022)ijscai
 
Call for Papers - International Conference on NLP, Data Mining and Machine Le...
Call for Papers - International Conference on NLP, Data Mining and Machine Le...Call for Papers - International Conference on NLP, Data Mining and Machine Le...
Call for Papers - International Conference on NLP, Data Mining and Machine Le...dannyijwest
 

Similaire à Domain Knowledge Powers AI with Context (20)

Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021
 
SFScon21 - Simone Tritini - The Environmental Data Platform web portal
SFScon21 - Simone Tritini - The Environmental Data Platform web portalSFScon21 - Simone Tritini - The Environmental Data Platform web portal
SFScon21 - Simone Tritini - The Environmental Data Platform web portal
 
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
 
Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...
 
International Conference on NLP, Data Mining and Machine Learning (NLDML 2022)
International Conference on NLP, Data Mining and Machine Learning (NLDML 2022)International Conference on NLP, Data Mining and Machine Learning (NLDML 2022)
International Conference on NLP, Data Mining and Machine Learning (NLDML 2022)
 
Semantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaSemantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenza
 
[ENCORE webinar] Artificial Intelligence for mapping skills of the future
[ENCORE webinar] Artificial Intelligence for mapping skills of the future[ENCORE webinar] Artificial Intelligence for mapping skills of the future
[ENCORE webinar] Artificial Intelligence for mapping skills of the future
 
Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...
Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...
Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...
 
SiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdfSiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdf
 
8th International Conference on Natural Language Computing (NATL 2022)
8th International Conference on Natural Language Computing (NATL 2022)8th International Conference on Natural Language Computing (NATL 2022)
8th International Conference on Natural Language Computing (NATL 2022)
 
Call for Papers - 8th International Conference on Natural Language Computing ...
Call for Papers - 8th International Conference on Natural Language Computing ...Call for Papers - 8th International Conference on Natural Language Computing ...
Call for Papers - 8th International Conference on Natural Language Computing ...
 
Call for Paper - 3rd International Conference on NLP & Information Retrieval ...
Call for Paper - 3rd International Conference on NLP & Information Retrieval ...Call for Paper - 3rd International Conference on NLP & Information Retrieval ...
Call for Paper - 3rd International Conference on NLP & Information Retrieval ...
 
Béatrice Markhoff - Semantic mediation ArSol and CIDOC CRM
Béatrice Markhoff - Semantic mediation ArSol and CIDOC CRMBéatrice Markhoff - Semantic mediation ArSol and CIDOC CRM
Béatrice Markhoff - Semantic mediation ArSol and CIDOC CRM
 
2019 IRIS-HEP AS workshop: Particles and decays
2019 IRIS-HEP AS workshop: Particles and decays2019 IRIS-HEP AS workshop: Particles and decays
2019 IRIS-HEP AS workshop: Particles and decays
 
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
Call for Paper - International Conference on NLP, Data Mining and Machine Lea...
 
3rd International Conference on NLP & Information Retrieval (NLPI 2022)
3rd International Conference on NLP & Information Retrieval (NLPI 2022)3rd International Conference on NLP & Information Retrieval (NLPI 2022)
3rd International Conference on NLP & Information Retrieval (NLPI 2022)
 
8 th International Conference on Natural Language Computing (NATL 2022)
8 th International Conference on Natural Language Computing (NATL 2022)8 th International Conference on Natural Language Computing (NATL 2022)
8 th International Conference on Natural Language Computing (NATL 2022)
 
Resume_Karthick
Resume_KarthickResume_Karthick
Resume_Karthick
 
Resume (2)
Resume (2)Resume (2)
Resume (2)
 
Call for Papers - International Conference on NLP, Data Mining and Machine Le...
Call for Papers - International Conference on NLP, Data Mining and Machine Le...Call for Papers - International Conference on NLP, Data Mining and Machine Le...
Call for Papers - International Conference on NLP, Data Mining and Machine Le...
 

Plus de Dr. Haxel Consult

AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementAI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementDr. Haxel Consult
 
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...Dr. Haxel Consult
 
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...Dr. Haxel Consult
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...Dr. Haxel Consult
 
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...Dr. Haxel Consult
 
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...Dr. Haxel Consult
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
 
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...Dr. Haxel Consult
 
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...Dr. Haxel Consult
 
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...Dr. Haxel Consult
 
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...Dr. Haxel Consult
 
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...Dr. Haxel Consult
 
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...Dr. Haxel Consult
 
AI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterAI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterDr. Haxel Consult
 
AI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCAI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCDr. Haxel Consult
 
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...Dr. Haxel Consult
 
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...Dr. Haxel Consult
 
The Artificial Intelligence Conference on Search, Data and Text Mining, Analy...
The Artificial Intelligence Conference on Search, Data and Text Mining, Analy...The Artificial Intelligence Conference on Search, Data and Text Mining, Analy...
The Artificial Intelligence Conference on Search, Data and Text Mining, Analy...Dr. Haxel Consult
 

Plus de Dr. Haxel Consult (20)

AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementAI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
 
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
 
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
 
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...

Domain Knowledge Powers AI with Context

  • 1. Domain Knowledge makes Artificial Intelligence Smart. Linda Andersson, AI-SDV 2022. © Artificial Researcher, 2022
  • 2. Linda Andersson, CEO/CTO, Artificial Researcher IT GmbH
      Awards
      • 2018 Commercial Viability award by the Austrian Angel Investors Association
      • 2017 PhD high potential R&D award, i2c Award, Vienna University of Technology
      • 1999 Finalist in a Venture Cup held at Mälardalens University College
      Research fields
      • Natural Language Understanding
      • Natural Language Processing
      • Information Extraction
      • Information Retrieval
      Academic merits
      • Information design, 1998-2001
      • General Linguistics, 2001-2003
      • Computational Linguistics, 2006-2009
      • Language Engineering, 2004
      • Library & Information Science, 2004-2006
      • Computer Science, 2009-2019
  • 3. “There is little you can do about prior art which wasn’t found in the search that you made before filing the patent application.” Barry Franks Hynell, “Opposition before the EPO”, preface to Winning with IP: managing intellectual property today, ISBN 978-1-9998329-6-4
  • 4. Information need: “Prior Art”
      Example: Patent No. 37435, Benz Patent-Motorwagen (1886)
      Search for:
      • Problem and solution (A)
      • Scope of the invention (A, B, C)
      • Specific technical details (B, C)
      [Figure: patent drawing with regions marked A, B and C; labels: steering mechanism, engine function, the transport vehicle]
  • 5. What makes an AI tool technically successful?
      A successful text mining solution does not focus only on developing the technology or the best deep learning algorithm; it is much more complex. The technology design reflects:
      • Task complexity: information needs, i.e. retrieving relevant paragraphs and not just documents
      • Language complexity: the formation of new words is particularly important for the patent text genre
      • Domain text-genre complexity: multi-word terms
  • 6. Different information needs (section: Task Complexity)
      • Known item search
      • Explorative search
      • Find relevant paragraphs
      • Scientific summarization
      • Comparative summarization
      [Figure: requests, output example, reading text]
  • 7. Next-generation search technology (section: Task Complexity)
      • Our search software generates the query in the specific domain
      • It retrieves relevant paragraphs
      • Graph Search & Text Box Search
      We have developed text mining technologies by combining traditional Natural Language Processing and Deep Learning technologies.
  • 8. Automatic query formulation: from text segment to automatic Boolean query (section: Task Complexity)
      [Figure: example of automatic query generation, showing automatic query expansion terms]
  • 9. Automatic query formulation: from terms to concept term graphs (section: Task Complexity)
      Demo: https://graph-demo.artificialresearcher.com/
      Term: environment climate fuel biofuel
      Generated query, with context similarity of 0.85:
      (fuel OR gas OR engine OR vehicle OR oil OR coal OR energy OR hydrocarbon OR steam OR liquid OR fluid OR furnace OR lubricant OR hydrogen OR feed) AND ("diesel fuel"~4 OR "liquid fuel"~4 OR "fuel gas"~4 OR "fuel oil"~4 OR "fuel material"~4 OR "hydrogen gas"~4 OR "diesel oil"~4 OR "liquid hydrocarbon"~4 OR "burning gas"~4 OR "synthetic gas"~4 OR "suitable hydrocarbon"~4 OR "mixed gas"~4 OR "tail gas"~4)
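The shape of the query on slide 9 (a block of OR-ed single words AND-ed with a block of OR-ed phrases carrying a `~4` proximity suffix) can be sketched in a few lines of Python. This only illustrates the string assembly; how Artificial Researcher selects the expansion terms is not described on the slide. The function name `build_boolean_query` and the `slop` parameter are hypothetical, and the `"phrase"~4` notation is assumed to follow Lucene-style proximity syntax, which the slide does not confirm.

```python
def build_boolean_query(single_terms, phrases, slop=4):
    """Assemble a Boolean query of the shape shown on the slide.

    single_terms: expansion words, OR-ed together.
    phrases: multi-word terms, each given a proximity suffix such as ~4.
    slop: proximity window (assumed meaning of the "~4" on the slide).
    """
    if not single_terms or not phrases:
        raise ValueError("both term lists must be non-empty")
    # Single words form one OR clause.
    word_clause = " OR ".join(single_terms)
    # Each phrase is quoted and followed by the proximity operator.
    phrase_clause = " OR ".join(f'"{phrase}"~{slop}' for phrase in phrases)
    # The two clauses are combined with AND, as in the slide example.
    return f"({word_clause}) AND ({phrase_clause})"


if __name__ == "__main__":
    print(build_boolean_query(["fuel", "gas", "engine"],
                              ["diesel fuel", "liquid fuel"]))
```

Feeding in the full term lists from the slide reproduces the query string shown there.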
  • 10. Artificial Researcher search solutions (section: Task Complexity)
      Relation graph for terms extracted from patents and scientific publications
      • Direct access
          • Relevant paragraphs (passages)
          • Abstracts
          • Bibliographic information
          • Links to full-text documents
      • Download
          • Paragraphs including all bibliographic data, identified technical terms, subject classification terms and scientific classifications
      • Standard index collections
          • Open Access scientific articles: access to 204,582,649 (CORE.uk)
          • Patents: EP full-text data for text analytics
          • Medical collections: COVID-19 Open Research Dataset
  • 11. How indices and ontologies are generated
      • Technology and data flow overview
      • Language and domain complexity
  • 12. From Text to Structured Semantic Representation of Patents and Scientific Publications
  • 13. Artificial Researcher data pipeline solution: assembly line for creating indices and ontologies
      Documents are processed through several text analysis modules:
      1. Text normalization and harmonization (including PDF converter)
      2. NLP and relation identification
      3. Artificial Researcher NLP-Toolkit: technical terms
      4. Packaged as enhanced XML/JSON
      5. Software: AR Ontology Services and AR Passage Retrieval
      6. End-user applications: REST APIs, desktop script, GUI
      Cloud platforms (Google, AZURE2023). Internal information resources: analyses are processed with SaaS, so no data are stored in the cloud.
      [Figure: data flow diagram. Input: any data collections, meta data, external information resources. Components: NLP pipeline (PoS & chunker, lexico-semantic patterns); harmonization & text normalization; external language resources and gazetteers; Artificial Researcher NLP-Toolkit (word similarity model, context similarity model, task model); ontology (OWL, JSON); ontology & index repositories. Output: API, search options, Graph Search.]
  • 14. Combining NLP & distributional semantics (section: Language and Domain Complexity)
      $$\mathit{JoinedSimilarity} = \frac{1}{N} \sum_{\substack{i,j=1 \\ i \neq j,\ i<j}}^{N} \cos(w_i, w_j)$$
      • $w_i$, $w_j$ are the word vectors of each word pair in a multi-word term (MWT); $\cos(w_i, w_j)$ is their cosine similarity
      • $N$ is the number of words in the MWT
      (Andersson 2013-2017)
      Embeddings identify similarities between different words; the standard reference threshold for word2vec (w2v) is 0.60:
      • underwear is similar to underpants, undergarment, panties, underclothes
      • strength is similar to strengths, strength, toughness, stronger, sfrength
      Technical semantic relations are a mixture of single words and phrases:
      • synthetic fibers: synonym of polyester fibers
      • thrips: hypernym of bulb fly larvae
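As a minimal sketch of the joined-similarity score on slide 14, the Python below sums the cosine similarity over every pair of word vectors with i < j and divides by N, the number of words in the multi-word term. This is a reading of the formula as it appears on the slide, not the authors' implementation. The helper names `cosine` and `joined_similarity` are hypothetical, and the tiny two-dimensional vectors stand in for real word embeddings.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def joined_similarity(word_vectors):
    """Sum of pairwise cosines (i < j) divided by N, the number of words."""
    n = len(word_vectors)
    if n < 2:
        raise ValueError("a multi-word term needs at least two word vectors")
    # combinations() yields each unordered pair once, i.e. i < j.
    total = sum(cosine(u, v) for u, v in combinations(word_vectors, 2))
    return total / n


if __name__ == "__main__":
    # Toy vectors (assumption): two similar words and one orthogonal word.
    print(joined_similarity([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]))
```

Note that the normalizer is the number of words, as on the slide, not the number of pairs, so a two-word term made of identical vectors scores 0.5 rather than 1.0.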
  • 15. What is a technical term? It depends on whom you ask! (section: Language and Domain Complexity)
  • 16. BERT: technical term prediction (section: Language and Domain Complexity)
      • Task: detect domain term spans in sentences
            Token:        A   network   synchronization   method   allows
            Domain term:  O   I         I                 I        O
      • Using a pre-trained SciBERT model
      • With additional training data for technical term detection
          • 22 randomly picked patents with different International Patent Classification codes
          • 10,337 sentences; technical term dictionary of size 5,099
      Fink T., Andersson L., Hanbury A. (2021). Detecting Multi Word Terms in patents the same way as entities. World Patent Information, 67, 102078, 1-6.
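The tagging scheme on slide 16 marks each token as inside (I) or outside (O) a domain term. The sketch below shows one way to turn such a tag sequence back into term spans. It covers only this post-processing step and assumes the tags have already been predicted by a model; the function name `extract_term_spans` is hypothetical.

```python
def extract_term_spans(tokens, tags):
    """Group consecutive tokens tagged "I" into multi-word term strings."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "I":
            # Token is inside a term: extend the current span.
            current.append(token)
        elif current:
            # An "O" tag closes any open span.
            spans.append(" ".join(current))
            current = []
    if current:
        # Close a span that runs to the end of the sentence.
        spans.append(" ".join(current))
    return spans


if __name__ == "__main__":
    tokens = "A network synchronization method allows".split()
    tags = ["O", "I", "I", "I", "O"]
    print(extract_term_spans(tokens, tags))
```

With plain I/O tags, two terms that are directly adjacent merge into one span; a scheme with an explicit begin tag would be needed to separate them.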
  • 17. Technical term identification: evaluation using the WIPO PEARL terminology database (section: Language and Domain Complexity)
      Recall and precision assessment:
      WIPO concept recall          81%
      All domain term recall       96%
      All domain term precision    85%
  • 18. What are semantic lexical relations? (section: Language and Domain Complexity)
      • Hyponym: a narrower term, e.g. horse is a hyponym of farm animals
      • Hypernym: a broader term, e.g. farm animals is a hypernym of pig, cow, sheep, horse
      Hyponymy relations:
      • part of
      • kind of
      • member of
      • type of
      • domain-specific relation
      Example: “… farm animals such as pigs, sheep, cows and horses”
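The "such as" example on slide 18 is the kind of lexico-syntactic pattern that can be matched with a regular expression. The sketch below is a deliberately simplified illustration: it assumes the hypernym is one or two words directly before "such as" and that the hyponyms are a comma/"and" separated list. The pattern and the function name `extract_hyponyms` are hypothetical and are not taken from the Artificial Researcher toolkit.

```python
import re

# Hypernym: one or two words directly before "such as" (simplifying assumption).
# Items: the list of words, commas and spaces that follows.
SUCH_AS = re.compile(r"(?P<hypernym>\w+(?: \w+)?) such as (?P<items>[\w ,]+)")

# Separators between list items: a comma (optionally followed by "and"), or "and" alone.
ITEM_SEPARATOR = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+")


def extract_hyponyms(text):
    """Return (hypernym, [hyponyms]) for the first "X such as A, B and C" match."""
    match = SUCH_AS.search(text)
    if match is None:
        return None
    items = ITEM_SEPARATOR.split(match.group("items").strip())
    return match.group("hypernym"), [item for item in items if item]


if __name__ == "__main__":
    print(extract_hyponyms("farm animals such as pigs, sheep, cows and horses"))
```

A production pattern would work on part-of-speech tagged noun phrases rather than raw word counts, which is where the chunker in the pipeline would come in.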
  • 19. Better than distributional semantics (section: Language and Domain Complexity)
      [Figure: related concepts for “brake pedal” extracted with Artificial Researcher’s text mining technology, compared with extraction using pre-trained contextual models only. Figure labels: vehicle operating pedal, conventional hydraulic brake system pedal devices position brake actuating member brake actuating member hydraulically-assisted rack pinion steering gear brake operating member conventional braking system pair pedals accelerator pedal case pedal device pedal device]
      Similarity to “brake pedal” for selected technical terms:
      Technical term                                      PatBERT   SciBERT
      conventional hydraulic brake system                 0.69      0.89
      hydraulically-assisted rack pinion steering gear    0.49      0.80
      conventional braking system                         0.66      0.84
      Using semantically enriched data can retrieve up to 60% more related concepts than traditional pre-trained contextual models (a.k.a. neural network-based methods, the BERT family).
  • 20. Proof of concept: passage retrieval performance on the CLEF-IP 2013 text collection (section: Language and Domain Complexity)
      Method                      PRES@100   Recall@100   MAP@100   Prec(D)
      (Albarede et al., 2021)     0.461      0.603        0.173     0.231
      (Andersson et al., 2016)    0.444      0.560        0.187     0.282
      (Andersson et al., 2017)    0.558      0.647        0.269     0.207
      (Luo and Yang, 2013)        0.433      0.540        0.191     0.213
      Products: Artificial Researcher Passage Retrieval Service™, Artificial Researcher Graph Search Service™
      References:
      • Albarede L., Mulhem P., Goeuriot L., Le Pape-Gardeux C., Sylvain M., Trindad C. (2021). Passage retrieval in context: Experiments on Patents.
      • Andersson L., Lupu M., Palotti J., Hanbury A., Rauber A. (2016). When is the time Ripe for Natural Language Processing for Patent Passage Retrieval? In Proceedings of the 25th ACM International Conference on Conference on Information and Knowledge Management (CIKM 2016).
      • Andersson L., Rekabsaz N., Hanbury A. (2017). Automatic query expansion for patent passage retrieval using paradigmatic and syntagmatic information. The first WiNLP Workshop co-located with the Annual Meeting of the Association for Computational Linguistics (ACL 2017), Vancouver.
      • Luo J., Yang H. (2013). Query formulation for prior art search: Georgetown University at CLEF-IP 2013. In Proc. of CLEF.
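Two of the column headers on slide 20, Recall@100 and MAP@100, can be illustrated with a short Python sketch. It uses the common textbook definitions and is not the official CLEF-IP scoring script. It assumes the ranked list contains no duplicate passages and that average precision is normalized by the total number of relevant passages (other normalizations exist). PRES and Prec(D) are not implemented here, and all function names are hypothetical.

```python
def recall_at_k(ranked, relevant, k=100):
    """Fraction of the relevant passages that appear in the top k results."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("at least one relevant passage is required")
    hits = sum(1 for passage in ranked[:k] if passage in relevant)
    return hits / len(relevant)


def average_precision_at_k(ranked, relevant, k=100):
    """Average of precision values at each rank where a relevant passage occurs."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("at least one relevant passage is required")
    hits, total = 0, 0.0
    for rank, passage in enumerate(ranked[:k], start=1):
        if passage in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Assumption: normalize by all relevant passages, found or not.
    return total / len(relevant)


def mean_average_precision(runs, k=100):
    """MAP: mean of the per-query average precision over (ranked, relevant) pairs."""
    if not runs:
        raise ValueError("at least one query is required")
    return sum(average_precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    ranked = ["p1", "p2", "p3", "p4"]   # hypothetical passage ids
    relevant = {"p1", "p3", "p9"}
    print(recall_at_k(ranked, relevant), average_precision_at_k(ranked, relevant))
```

In this toy run two of the three relevant passages are retrieved, at ranks 1 and 3, so recall is 2/3 and average precision is (1/1 + 2/3) / 3.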
  • 21. New REST API service: text classification and semantic relation extraction
      Automatic classification according to taxonomies such as IPC/CPC, MeSH, PLoS, WIPO Scientific Subject fields
      [Figure: input and output example]
  • 22. Artificial Researcher: services & products
      Test access: API key b10225791d2a478c9ff3d8cd106ad65a
      • Artificial Researcher Passage Retrieval Service
          • Access: REST API, Desktop App Script (Python)
          • https://swagger.artificialresearcher.com/ (developer)
          • https://passageretrieval.artificialresearcher.com/ (demo)
          • SaaS or on-premises software (Docker)
      • Artificial Researcher Ontology Service
          • Access: REST API, Desktop App Script (Python)
          • https://swagger.artificialresearcher.com/?urls.primaryName=onto-api (developer)
          • Data format: OWL or JSON
          • SaaS or on-premises software (Docker)
          • Ontology generator: SaaS only
      • Artificial Researcher NLP-toolkit Services
          • Access: REST API
          • https://swagger.artificialresearcher.com/?urls.primaryName=unified-nlp-server (developer)
          • SaaS or on-premises software (Docker)
      • Artificial Researcher Graph Search Service
          • UX: https://graph-demo.artificialresearcher.com/ (demo)
  • 23. Thank you for your attention. artificialresearcher.com, linda.andersson@artificialresearcher.com