Copyright ©1997-2022 Search Technology, Inc. TheVantagePoint.com | 1
Nils Newman | October 10, 2022
Finding the WHAT
Will AI help?
The WHAT - How to find concepts in Text
• For a computer, finding concepts within text is an ongoing struggle
• How can machines help us find concepts without us reading?
• What can machines actually find?
• How will AI change things?
NOUNS
Machines do not understand what they are “reading”
Two Main Approaches
• There are two main approaches to finding the WHAT in a document
➢ Natural Language Processing (NLP)
➢ Machine Learning (ML)
Natural Language Processing
• NLP is about finding the WHAT through the structure of language
• Based on knowledge of language structure, either programmed in as rules or learned from documents
• Uses semantic and syntactic rules to “understand” text
• Usually language-specific
• Some projects are trying to generalize across languages
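To make the "syntactic rules" idea concrete, here is a minimal sketch, assuming a tiny hand-written part-of-speech lexicon (the `LEXICON` entries and the function names are hypothetical, chosen only for illustration). It applies one rule, "optional adjectives followed by one or more nouns", to pull candidate noun phrases out of a sentence. A real NLP system would use a trained tagger rather than a lookup table.

```python
import re

# Hypothetical mini-lexicon: a real system would use a trained tagger instead.
LEXICON = {
    "the": "DET", "a": "DET", "an": "DET",
    "solid": "ADJ", "thin": "ADJ", "rechargeable": "ADJ",
    "battery": "NOUN", "electrolyte": "NOUN", "film": "NOUN", "anode": "NOUN",
    "uses": "VERB", "coats": "VERB", "and": "CONJ",
}


def tag(text):
    """Lower-case the text, split it into word tokens, and look each one up."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [(tok, LEXICON.get(tok, "UNK")) for tok in tokens]


def noun_phrases(text):
    """Apply one syntactic rule: zero or more adjectives, then one or more nouns."""
    phrases, adjectives, nouns = [], [], []
    # The ("", "END") sentinel flushes a phrase that ends the sentence.
    for token, pos in tag(text) + [("", "END")]:
        if pos == "NOUN":
            nouns.append(token)
        elif pos == "ADJ" and not nouns:
            adjectives.append(token)
        else:
            if nouns:
                phrases.append(" ".join(adjectives + nouns))
            # An adjective seen after a noun starts a new candidate phrase.
            adjectives = [token] if pos == "ADJ" else []
            nouns = []
    return phrases


sentence = "The rechargeable battery uses a solid electrolyte and a thin film anode."
print(noun_phrases(sentence))
```

Words missing from the lexicon are tagged `UNK` and never become part of a phrase, which is one way to see why this style of NLP is language-specific and needs training or curation for each new domain.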
Natural Language Processing
• NLP requires training!
• Even if it is done by someone else, as with Google’s Parsey Universal models
• Training is particularly important if you are interested in technical topics that do not adhere to normal sentence structure (for instance, a patent)
• Some of this training might have to be supervised (by humans)
NER – NLP’s Concept Shortcut
• Named Entity Recognition (NER) targets specific types of entities such as:
➢ People
➢ Places
➢ Things
• For example:
• Geographic Names
• Chemical Names
• Pharma Concepts
NER
• NER still requires training, but if you are working in an area with a constrained vocabulary, NER can save a lot of time and effort
*Text Courtesy of Wikipedia
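One simple way to see why a constrained vocabulary helps is a dictionary ("gazetteer") lookup. The sketch below is an illustration under assumptions, not the approach of any particular NER product: the `GAZETTEER` entries, labels, and function names are hypothetical, and trained NER systems go well beyond exact matching.

```python
import re

# Hypothetical gazetteer for a constrained vocabulary; the entries are
# illustrative and would normally come from a curated domain list.
GAZETTEER = {
    "CHEMICAL": ["sodium chloride", "acetic acid", "ethanol"],
    "PLACE": ["Atlanta", "Vienna"],
}


def build_pattern(terms):
    """Compile one regex per entity type, longest terms first so that
    multi-word names win over any shorter term they contain."""
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


PATTERNS = {label: build_pattern(terms) for label, terms in GAZETTEER.items()}


def find_entities(text):
    """Return (start, matched_text, label) tuples sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(0), label))
    return sorted(hits)


for start, surface, label in find_entities("Ethanol and acetic acid were shipped from Vienna."):
    print(start, surface, label)
```

The lookup only finds names that are already in the list, so the time saved comes from the list being short and stable; in open vocabularies the list is never complete.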
Machine Learning: AKA Alphabet Soup
• Machine Learning in Concept Extraction is all about finding patterns
• Decades of research have produced many different approaches:
• LSI
• LSA
• PCA
• SVM
• MI
• TM
• etc.
Machine Learning: Patterns via Math
• The core of many of these techniques is finding patterns using math with little explicit instruction (no rules given)
• The math runs on your data and finds connections between items on its own
• The advantage of this approach is that you do not have to know what you are looking for
• The disadvantage is that the output is sometimes rubbish
• The other issue is that many of these approaches return a collection of related terms, but naming that collection is up to the human
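As an illustration of "patterns via math", the sketch below scores term pairs by pointwise mutual information (PMI) over a handful of records. PMI is chosen here only as a simple stand-in for the family of techniques; the slides do not say which measure is meant, and the function name, the `min_count` threshold, and the sample records are assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_pairs(documents, min_count=2):
    """Score term pairs by pointwise mutual information over documents.

    documents: list of sets of terms, one set per record.
    Returns {(term_a, term_b): pmi} for pairs that co-occur in at least
    min_count documents. No rules are supplied; only counts are used.
    """
    n_docs = len(documents)
    term_counts = Counter()
    pair_counts = Counter()
    for terms in documents:
        term_counts.update(terms)
        # Sorting gives every pair a single canonical ordering.
        pair_counts.update(combinations(sorted(terms), 2))

    scores = {}
    for (a, b), joint in pair_counts.items():
        if joint < min_count:
            continue
        p_joint = joint / n_docs
        p_a = term_counts[a] / n_docs
        p_b = term_counts[b] / n_docs
        scores[(a, b)] = math.log2(p_joint / (p_a * p_b))
    return scores


records = [
    {"lithium", "anode", "battery"},
    {"lithium", "anode", "cathode"},
    {"battery", "solar", "panel"},
    {"solar", "panel", "inverter"},
]
for pair, score in sorted(pmi_pairs(records).items()):
    print(pair, round(score, 2))
```

The output is a set of related term pairs with scores and nothing else: deciding that one group is "energy storage" and another "photovoltaics" is still left to the human.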
Impact of AI on NLP
• Natural Language Processing is now merging with AI
• NLP was transformed by the BERT family of language models (SciBERT, BioBERT, FinBERT, RoBERTa, ALBERT, etc.)
• GPT has also been impactful, but its largest models are not open-source
• The technique works because enormous training sets form the foundation
• The original BERT was trained on BookCorpus (800 million words) and English Wikipedia (2,500 million words)
Impact of AI on Machine Learning
• Machine Learning can be considered a branch of AI
• The distinction is in the level of training
• The latest round of AI development, combined with access to large amounts of unlabeled data, means that ML-based concept extraction may be drawing on prior training without you knowing it
• For example: Deep Learning
AI + ML + NLP
• AI has facilitated the fusion of ML with NLP to improve concept identification
• NLP supplies the language structure, AI gives the ability to learn, and ML enhances that learning by looking for patterns, particularly patterns not seen before
• For example, NER systems, given some initial training, can learn on their own using ML techniques + AI learning models
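The idea of an NER system extending itself from a little initial training can be sketched with classical pattern bootstrapping. This is a deliberately simple stand-in, not the AI learning models the slide refers to: the functions, the single-word context window, and the sample sentences are all assumptions made for illustration.

```python
import re


def _bigrams(sentence):
    """Yield (previous_word, word) pairs from a lower-cased sentence."""
    tokens = re.findall(r"\w+", sentence.lower())
    return zip(tokens, tokens[1:])


def learn_contexts(sentences, known):
    """Collect the word that immediately precedes each known entity."""
    return {prev for s in sentences for prev, word in _bigrams(s) if word in known}


def propose_entities(sentences, contexts, known):
    """Propose unseen words that follow one of the learned context words."""
    return {
        word
        for s in sentences
        for prev, word in _bigrams(s)
        if prev in contexts and word not in known
    }


def bootstrap(sentences, seeds, rounds=2):
    """Alternate between learning contexts and proposing new entities."""
    known = set(seeds)
    for _ in range(rounds):
        contexts = learn_contexts(sentences, known)
        new = propose_entities(sentences, contexts, known)
        if not new:
            break
        known |= new
    return known


corpus = [
    "Patients were dosed with aspirin daily.",
    "Patients were dosed with ibuprofen daily.",
    "The control group was dosed with naproxen.",
    "Tablets containing naproxen were stored cold.",
    "Tablets containing warfarin were recalled.",
]
print(sorted(bootstrap(corpus, {"aspirin"})))
```

Starting from a single seed, the first round learns the context "with" and the second learns "containing". The same mechanism will happily propose rubbish if a context word is too general, which is why some supervision usually stays in the loop.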
Beware the easy WHAT
• Finding the WHAT in records is still a real challenge
• Is WHAT a Concept or a Word?
➢ The Analyst’s WHAT
• An analyst with Subject Matter Expertise has an expected WHAT in mind when they look at data, based on their own knowledge. So their WHAT is sometimes not represented in the data. They are often looking for higher-order concepts.
➢ The Data WHAT
• Algorithms let the data speak for itself. The WHAT is the word in the data.
• The two WHATs often do not agree
• But AI is working to solve that as well…
Words vs. Concept
• Looking at a set of words and associating them with a concept is not beyond the scope of AI, given proper training
• In constrained lexicons, it is very possible now: for example, screening existing drugs to repurpose for COVID, or Google’s ill-fated human-impersonating Duplex
• However, a general model is not on the horizon
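In a constrained lexicon, mapping a set of words to a concept can be as simple as counting overlap with a curated vocabulary. The sketch below assumes a hypothetical hand-built thesaurus (`CONCEPTS`) and a hypothetical `min_overlap` threshold; it illustrates the words-to-concept step only and is not a trained AI model.

```python
# Hypothetical thesaurus: each concept is defined by a small curated vocabulary.
CONCEPTS = {
    "energy storage": {"battery", "anode", "cathode", "electrolyte", "lithium"},
    "photovoltaics": {"solar", "panel", "inverter", "silicon", "cell"},
}


def best_concept(terms, concepts=CONCEPTS, min_overlap=2):
    """Return the concept whose vocabulary shares the most terms with the
    input, or None if no concept reaches min_overlap shared terms."""
    terms = {term.lower() for term in terms}
    label, overlap = None, 0
    for name, vocabulary in concepts.items():
        shared = len(terms & vocabulary)
        if shared > overlap:  # ties keep the concept listed first
            label, overlap = name, shared
    return label if overlap >= min_overlap else None


print(best_concept(["Lithium", "anode", "coating"]))
print(best_concept(["graphene", "sensor"]))
```

The second call returns `None` because nothing in the thesaurus covers those words, which mirrors the point about general models: the mapping works only as far as the curated lexicon reaches.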
Questions?

AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology, USA)

  • 1. Finding the WHAT: Will AI help?
      Nils Newman | October 10, 2022
      Copyright ©1997-2022 Search Technology, Inc. TheVantagePoint.com
  • 2. The WHAT - How to find concepts in Text
      • For a computer, finding concepts within text is an ongoing struggle
      • How can machines help us find concepts without us reading?
      • What can machines actually find?
      • How will AI change things?
      • NOUNS
      • Machines do not understand what they are “reading”
  • 3. Two Main Approaches
      • There are two main approaches to finding WHAT in a document
          ➢ Natural Language Processing (NLP)
          ➢ Machine Learning (ML)
  • 5. Natural Language Processing
      • NLP is about finding WHAT through the structure of language
      • Based on learning from the structure of language either through programming or learning from documents
      • Uses semantic and syntactic rules to “understand” text
      • Usually language specific
      • Projects are trying to generalize across languages
  • 6. Natural Language Processing
      • NLP Requires Training!
      • Even if done by someone else, such as Google’s Parsey Universal
      • Training is particularly important if you are interested in technical topics which do not adhere to normal sentence structure (for instance, a patent)
      • Some of this training might have to be supervised (humans)
  • 7. NER – NLP’s Concept Shortcut
      • Named Entity Recognition (NER) targets specific types of entities such as:
          ➢ People
          ➢ Places
          ➢ Things
      • For example:
          • Geographic Names
          • Chemical Names
          • Pharma Concepts
  • 8. NER
      • NER still requires training, but if you are working in an area with a constrained vocabulary, NER can save a lot of time and effort
      *Text Courtesy of Wikipedia
  • 10. Machine Learning: AKA Alphabet Soup
      • Machine Learning in Concept Extraction is all about finding patterns
      • Decades of research have produced many different approaches:
          • LSI
          • LSA
          • PCA
          • SVM
          • MI
          • TM
          • etc.
  • 11. Machine Learning: Patterns via Math
      • The core of many of these techniques is finding patterns using math with little explicit instruction (no rules given)
      • The math runs on your data to look for connections between items and will find them on its own
      • The advantage of this approach is that you do not have to know what you are looking for
      • The disadvantage is that sometimes the output is rubbish
      • The other issue is that many of these approaches give a collection of related terms, but giving that collection a name is up to the human
  • 13. Impact of AI on NLP
      • Natural Language Processing is now merging with AI
      • NLP was transformed by the BERT language models (SciBERT, BioBERT, FinBERT, RoBERTa, ALBERT, etc.)
      • GPT is also impactful but not open-source
      • The technique works because enormous training sets form the foundation
      • Original BERT used BookCorpus (800 million words) and English Wikipedia (2,500 million words)
  • 14. Impact of AI on Machine Learning
      • Machine Learning can be considered a branch of AI
      • The distinction is in the level of training
      • The latest round of AI development, combined with access to a lot of unsupervised data, means that ML-based concept extraction may be drawing on training without you knowing it
      • For example: Deep Learning
  • 15. AI + ML + NLP
      • AI has facilitated the fusion of ML with NLP to improve concept identification
      • NLP has the language structure, AI gives the ability to learn, and ML enhances that learning by looking for patterns, particularly patterns not seen before
      • For example, NER systems, given some initial training, can learn on their own using ML techniques plus AI learning models
  • 16. Beware the easy WHAT
      • Finding the WHAT in records is still a real challenge
      • Is WHAT a Concept or a Word?
          ➢ The Analyst’s WHAT
              • An analyst with Subject Matter Expertise has an expected WHAT in mind when they look at data, based on their own knowledge. So their WHAT is sometimes not represented in the data. They are often looking for higher order concepts.
          ➢ The Data WHAT
              • Algorithms let the data speak for itself. The WHAT is the word in the data.
      • The two WHATs often do not agree
      • But AI is working to solve that as well…
  • 17. Words vs. Concept
      • Looking at a set of words and associating them with a concept is not beyond the scope of AI, given proper training
      • In constrained lexicons, it is very possible now: for example, screening existing drugs to repurpose for COVID, or Google’s ill-fated, human-impersonating Duplex
      • However, a general model is not on the horizon
  • 18. Questions?
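
The "Patterns via Math" slide says the math runs on your data, finds connections between items on its own, and returns a collection of related terms that a human still has to name. A minimal sketch of that idea follows, using plain term co-occurrence counting as a stand-in for the heavier techniques listed on the "Alphabet Soup" slide. The toy records, the function names, and the `min_count` threshold are all assumptions made for illustration; none of them come from the talk.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records (assumption: not taken from the talk).
docs = [
    "lithium battery anode capacity",
    "lithium battery cathode capacity",
    "gene therapy vector delivery",
    "gene therapy clinical delivery",
    "battery anode coating",
]


def cooccurrence(records):
    """Count how many records each unordered pair of terms shares."""
    pairs = Counter()
    for text in records:
        # Deduplicate and sort so each pair is stored as (a, b) with a < b.
        terms = sorted(set(text.lower().split()))
        pairs.update(combinations(terms, 2))
    return pairs


def related_terms(pairs, term, min_count=2):
    """Return terms sharing at least `min_count` records with `term`."""
    hits = {}
    for (a, b), n in pairs.items():
        if n >= min_count and term in (a, b):
            other = b if a == term else a
            hits[other] = n
    # Most frequent first, ties broken alphabetically.
    return sorted(hits, key=lambda t: (-hits[t], t))


pairs = cooccurrence(docs)
print(related_terms(pairs, "battery"))
print(related_terms(pairs, "gene"))
```

No rules about batteries or gene therapy are supplied; the groupings fall out of the counts alone. The output is only a list of terms, so deciding that one group means "energy storage" and the other "gene therapy" remains the analyst's job, and on noisier data the same counting can just as easily return groupings that mean nothing.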