
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology, USA)


It is relatively easy for a human to read a document and quickly figure out which concepts are important. However, this task is a difficult challenge for a machine. During the past few decades, there have been two main approaches to concept identification: Natural Language Processing and Machine Learning. During the early part of this century, Machine Learning made great strides as new techniques came into wider use (SVMs, Topic Modeling, etc.). Sensing the competition, Natural Language Processing responded by deploying emerging techniques (semantic networks, finite state automata, etc.). Neither approach has completely solved the WHAT problem. Advances in Artificial Intelligence have the potential to significantly improve the situation. AI is making the most impact as an enhancement that helps Machine Learning and Natural Language Processing work better and, more importantly, work together. This presentation looks at some of this history and at what might happen in the future when we blend the interpretation of language with pattern prediction.



  1. Finding the WHAT – Will AI help?
     Nils Newman | October 10, 2022
     Copyright ©1997-2022 Search Technology, Inc. TheVantagePoint.com
  2. The WHAT - How to find concepts in Text
     • For a computer, finding concepts within text is an ongoing struggle
     • How can machines help us find concepts without us reading?
     • What can machines actually find?
     • How will AI change things?
     Graphic label: NOUNS
     Machines do not understand what they are “reading”
  3. Two Main Approaches
     • There are two main approaches to finding WHAT in a document
       ➢ Natural Language Processing (NLP)
       ➢ Machine Learning (ML)
  5. Natural Language Processing
     • NLP is about finding WHAT through the structure of language
     • It learns that structure either through explicit programming or from documents
     • Uses semantic and syntactic rules to “understand” text
     • Usually language specific
     • Projects are trying to generalize across languages
  6. Natural Language Processing
     • NLP requires training!
     • Even if done by someone else, such as Google’s Universal Parsey
     • Training is particularly important if you are interested in technical topics that do not adhere to normal sentence structure (for instance, a patent)
     • Some of this training might have to be supervised (by humans)
  7. NER: NLP’s Concept Shortcut
     • Named Entity Recognition (NER) targets specific types of entities such as:
       ➢ People
       ➢ Places
       ➢ Things
     • For example:
       • Geographic Names
       • Chemical Names
       • Pharma Concepts
  8. NER
     • NER still requires training, but if you are working in an area with a constrained vocabulary, NER can save a lot of time and effort
     *Text Courtesy of Wikipedia
  10. Machine Learning: AKA Alphabet Soup
      • Machine Learning in Concept Extraction is all about finding patterns
      • Decades of research have produced many different approaches:
        • LSI
        • LSA
        • PCA
        • SVM
        • MI
        • TM
        • etc.
  11. Machine Learning: Patterns via Math
      • The core of many of these techniques is finding patterns using math with little explicit instruction (no rules given)
      • The math runs on your data, looks for connections between items, and finds them on its own
      • The advantage of this approach is that you do not have to know what you are looking for
      • The disadvantage is that the output is sometimes rubbish
      • Another issue is that many of these approaches return a collection of related terms, but naming that collection is up to the human
  13. Impact of AI on NLP
      • Natural Language Processing is now merging with AI
      • NLP was transformed by the BERT language models (SciBERT, BioBERT, FinBERT, RoBERTa, ALBERT, etc.)
      • GPT is also impactful but not open-source
      • The technique works because enormous training sets form the foundation
      • Original BERT used BookCorpus (800 million words) and English Wikipedia (2,500 million words)
  14. Impact of AI on Machine Learning
      • Machine Learning can be considered a branch of AI
      • The distinction is in the level of training
      • The latest round of AI development, combined with access to large amounts of unsupervised data, means that ML-based concept extraction may be drawing on training without you knowing it
      • For example: Deep Learning
  15. AI + ML + NLP
      • AI has facilitated the fusion of ML with NLP to improve concept identification
      • NLP has the language structure, AI gives the ability to learn, and ML enhances that learning by looking for patterns, particularly patterns not seen before
      • For example, NER systems, given some initial training, can learn on their own using ML techniques + AI learning models
  16. Beware the easy WHAT
      • Finding the WHAT in records is still a real challenge
      • Is WHAT a Concept or a Word?
        ➢ The Analyst’s WHAT
          • An analyst with Subject Matter Expertise looks at data with an expected WHAT in mind, based on their own knowledge. Their WHAT is therefore sometimes not represented in the data. They are often looking for higher-order concepts.
        ➢ The Data WHAT
          • Algorithms let the data speak for itself. The WHAT is the word in the data.
      • The two WHATs often do not agree
      • But AI is working to solve that as well…
  17. Words vs. Concept
      • Looking at a set of words and associating them with a concept is not beyond the scope of AI, given proper training
      • In constrained lexicons, it is already possible: for example, screening existing drugs to repurpose for COVID, or Google’s ill-fated, human-impersonating Duplex
      • However, a general model is not on the horizon
  18. Questions?
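
Slides 7 and 8 note that NER pays off most when the vocabulary is constrained. As a minimal sketch of that idea, the snippet below uses a hand-built lookup table (a gazetteer) in place of a trained model. The `GAZETTEER` entries, the entity labels, and the sample sentence are all invented for illustration and do not come from the presentation; a production NER system would normally be trained rather than hard-coded.

```python
import re

# Hypothetical gazetteer: maps a lower-cased surface form to an entity type.
GAZETTEER = {
    "aspirin": "CHEMICAL",
    "ibuprofen": "CHEMICAL",
    "acetylsalicylic acid": "CHEMICAL",
    "atlanta": "PLACE",
    "new york": "PLACE",
}


def find_entities(text, gazetteer=GAZETTEER):
    """Return (start, end, surface, label) tuples for gazetteer terms in text."""
    # Longer terms go first so multi-word names win over shorter overlaps.
    terms = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), m.group(0), gazetteer[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


sample = "Aspirin (acetylsalicylic acid) was shipped from Atlanta to New York."
for start, end, surface, label in find_entities(sample):
    print(f"{start:>3}-{end:<3} {label:<8} {surface}")
```

A lookup like this only finds names it already knows, which is why the slides stress that NER still requires training outside narrow domains.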

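Slide 11 describes math that runs over the data and finds connections between items without being given rules. One simple instance is pointwise mutual information over term co-occurrence, which may be what the "MI" entry on slide 10 refers to (that reading is an assumption). The sketch below scores term pairs in a toy, invented corpus; the `DOCS` list and the `pmi_pairs` helper are hypothetical names used only for this example.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus invented for illustration; each document is already tokenized.
DOCS = [
    ["battery", "lithium", "anode", "study"],
    ["battery", "lithium", "cathode", "study"],
    ["lithium", "anode", "battery"],
    ["protein", "enzyme", "binding", "study"],
    ["protein", "enzyme", "folding", "study"],
    ["enzyme", "binding", "protein"],
]


def pmi_pairs(docs):
    """Score every co-occurring term pair by pointwise mutual information."""
    n = len(docs)
    term_df = Counter()  # number of documents containing each term
    pair_df = Counter()  # number of documents containing both terms of a pair
    for doc in docs:
        terms = sorted(set(doc))
        term_df.update(terms)
        pair_df.update(combinations(terms, 2))
    scores = {}
    for (a, b), joint in pair_df.items():
        # PMI = log2(P(a, b) / (P(a) * P(b))), probabilities as document fractions.
        scores[(a, b)] = math.log2((joint / n) / ((term_df[a] / n) * (term_df[b] / n)))
    return scores


scores = pmi_pairs(DOCS)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(pair, round(score, 2))
```

In this toy data the generic word "study" scores at or near zero against the topical terms, while pairs such as battery and lithium score higher. The output is still only a set of related terms; as the slide says, naming the concept is left to the human.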
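
Slide 15 mentions NER systems that, given some initial training, keep learning on their own. The sketch below illustrates that loop with a much simpler stand-in technique, pattern bootstrapping from seed terms, rather than the AI learning models the slide has in mind. The sentences, the seed set, and the `bootstrap` function with its `rounds` and `min_support` parameters are all assumptions made for this example.

```python
from collections import Counter

# Invented toy sentences, already lower-cased and split on whitespace.
SENTENCES = [
    "patients treated with aspirin showed improvement".split(),
    "patients treated with ibuprofen showed improvement".split(),
    "patients treated with metformin showed no change".split(),
    "a daily dose of aspirin was given".split(),
    "a daily dose of metformin was given".split(),
    "a daily dose of warfarin was given".split(),
    "the committee met with investors on monday".split(),
]


def bootstrap(sentences, seeds, rounds=2, min_support=2):
    """Grow an entity set by alternating pattern learning and pattern matching."""
    entities = set(seeds)
    for _ in range(rounds):
        # Step 1: count the two-word left contexts in which known entities occur.
        contexts = Counter()
        for tokens in sentences:
            for i in range(2, len(tokens)):
                if tokens[i] in entities:
                    contexts[(tokens[i - 2], tokens[i - 1])] += 1
        trusted = {ctx for ctx, count in contexts.items() if count >= min_support}
        # Step 2: any word that follows a trusted context becomes an entity.
        for tokens in sentences:
            for i in range(2, len(tokens)):
                if (tokens[i - 2], tokens[i - 1]) in trusted:
                    entities.add(tokens[i])
    return entities


print(sorted(bootstrap(SENTENCES, seeds={"aspirin", "ibuprofen"})))
```

Each round can promote a wrong word if a context is too general, so the `min_support` threshold trades coverage against drift; real systems typically add stronger checks or human review.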