SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
Natural language processing and
machine learning
Nikola Milosevic
What is AI?
• Intelligence presented by a machine
• Flexible agent that interacts with the environment and
performs actions to maximize success towards certain goal
Popular AI
What is machine learning
• Subfield of computer science that explores
how machines can learn to perform certain
task without explicit programming
Data mining generally
Types of machine learning
• Supervised learning
• Semi-supervised learning
• Unsupervised learning
• Reinforcement learning
Machine learning problems
• Classification
• Clustering
• Regression
Testing the model
• Iteratively improve the model
• Test multiple algorithms – find the best one
• No free lunch theory
• Feedback loop for feature selection
• Konfuziona matrica
𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 =
𝑇𝑃 + 𝑇𝑁
𝑇𝑃 + 𝑇𝑁 + 𝐹𝑃 + 𝐹𝑁
Examples of ML frameworks and
algorithms
• SCI-kit learn
– Python library
– Implementation of the most useful algorithms
– Naïve Bayes, SVM, Random forests, decision
trees…
• Keras
– Python library implementing about everything
related to neural networks
Text data
• About 80% of data in organizations are in text
format
• Harder to analyse than structured data
• Huge amount of textual documents
– Only in biomedicine 2200 scientific papers are
published every day
• Growing exponentially
Main goals of text mining
• Make communication easier (e.g. translation)
• Automate some processes (e.g.
communication agents/chatbots)
• Do data mining on textual and unstructured
data
Process overview
Challenges
• Man saw a woman with the telescope.
– Who has a telescope?
• Multiple senses, synonyms,
homonyms, irony
• Grammar and context can help
• Acronyms
Approaches
• Rule based
– Human defined rules to extract information
– Needs expert humans who know how people express
certain things
– Is quite laborious
• Machine learning based
– Machine tries to learn what to extract guided by
human
– Needs annotated corpora (usually fairly large)
• This is expensive to create and quite laborious
Levels of analysis
• Lexical
– Analysis of words
• Syntactic
– Analysis of organization of words
(phrases, sentences)
• Semantic
– Analysis meaning
• Sometimes pragmatic
– Analysis pragmatics of the use of certain words,
phrases. Why author used that?
Steps
Lexical processing
• Part of speech tagging
• Parsing
– Constituency
– Dependency
Stanford parser
Semantic processing
• Text classification
– Sentiment analysis (positive/negative)
– Classification by topics (politics/sport/business)
– Authorship detection (Tolkien, Rowling, Shakespeare)
• Named entity recognition
• Topic modelling (unsupervised)
• Search
Sequence modelling
• Machine learning technique useful for named
entity recognition
• Conditional random fields (CRF) or recurrent
neural networks (often LSTM)
Feature engineering
• Selecting important features that help extract
information
• Can be:
– Words, PoS, word shapes, vocabulary features,
etc.
– May depend on task and methodology
– Iterative process of selecting and improving the
performance
– Some features may confuse the algorithm
Search
• Finds documents that are the most relevant
for a given user query
• Usual techniques include algorithm called TF-
IDF and cosine similarity
• May additionally use links towards text,
positions of matched words and similar things
to rank found documents
• Apache Lucene, Solr (Java), there are also
Python libraries
Language models
• Used as features to classification and other
NLP tasks
• Contain some basic characteristics of language
• The most naïve (but also frequently used) is
called Bag of Words
• NN use more advanced
models: word2vec, Glove,
ULMo, BERT…
Useful tools and libraries
• Apache OpenNLP – Java
• Apache Lucene – Java, C#
• Stanford Core NLP – Java
• NLTK – Python
• GATE – GUI alat
• SharpNLP
• ...
• Weka – for machine learning (GUI)

Contenu connexe

Tendances

Machine Learning
Machine LearningMachine Learning
Machine LearningKumar P
 
Introduction to data science.pptx
Introduction to data science.pptxIntroduction to data science.pptx
Introduction to data science.pptxSadhanaParameswaran
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data ScienceSpotle.ai
 
What is Deep Learning?
What is Deep Learning?What is Deep Learning?
What is Deep Learning?NVIDIA
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningEng Teong Cheah
 
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelinesjeykottalam
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)Yuriy Guts
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOswald Campesato
 
Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Simplilearn
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceANOOP V S
 

Tendances (20)

Machine Can Think
Machine Can ThinkMachine Can Think
Machine Can Think
 
Data science
Data scienceData science
Data science
 
Machine learning
Machine learningMachine learning
Machine learning
 
Machine learning
Machine learning Machine learning
Machine learning
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Introduction to data science.pptx
Introduction to data science.pptxIntroduction to data science.pptx
Introduction to data science.pptx
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data Science
 
Analytics 2
Analytics 2Analytics 2
Analytics 2
 
What is Deep Learning?
What is Deep Learning?What is Deep Learning?
What is Deep Learning?
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
 
Machine learning
Machine learningMachine learning
Machine learning
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Machine learning
Machine learningMachine learning
Machine learning
 

Similaire à Machine learning (ML) and natural language processing (NLP)

Introduction to Text Mining
Introduction to Text MiningIntroduction to Text Mining
Introduction to Text MiningMinha Hwang
 
Techniques of information retrieval
Techniques of information retrieval Techniques of information retrieval
Techniques of information retrieval Tariq Hassan
 
NLP, Expert system and pattern recognition
NLP, Expert system and pattern recognitionNLP, Expert system and pattern recognition
NLP, Expert system and pattern recognitionMohammad Ilyas Malik
 
NLP,expert,robotics.pptx
NLP,expert,robotics.pptxNLP,expert,robotics.pptx
NLP,expert,robotics.pptxAmanBadesra1
 
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information RetrievalIndexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information RetrievalVikas Bhushan
 
Intro 2 document
Intro 2 documentIntro 2 document
Intro 2 documentUma Kant
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...S. Diana Hu
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...Joaquin Delgado PhD.
 
Comm 1130 technical_communication_march2012-alcock
Comm 1130 technical_communication_march2012-alcockComm 1130 technical_communication_march2012-alcock
Comm 1130 technical_communication_march2012-alcockMelanie Parlette-Stewart
 
Artificial Intelligence by B. Ravikumar
Artificial Intelligence by B. RavikumarArtificial Intelligence by B. Ravikumar
Artificial Intelligence by B. RavikumarGarry D. Lasaga
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...RajkiranVeluri
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introductionRobert Lujo
 
Algorithms and Data Structures
Algorithms and Data StructuresAlgorithms and Data Structures
Algorithms and Data Structuressonykhan3
 
Semantic technology in nutshell 2013. Semantic! are you a linguist?
Semantic technology in nutshell 2013. Semantic! are you a linguist?Semantic technology in nutshell 2013. Semantic! are you a linguist?
Semantic technology in nutshell 2013. Semantic! are you a linguist?Heimo Hänninen
 

Similaire à Machine learning (ML) and natural language processing (NLP) (20)

Introduction to Text Mining
Introduction to Text MiningIntroduction to Text Mining
Introduction to Text Mining
 
Text Mining
Text MiningText Mining
Text Mining
 
Techniques of information retrieval
Techniques of information retrieval Techniques of information retrieval
Techniques of information retrieval
 
NLP, Expert system and pattern recognition
NLP, Expert system and pattern recognitionNLP, Expert system and pattern recognition
NLP, Expert system and pattern recognition
 
NLP,expert,robotics.pptx
NLP,expert,robotics.pptxNLP,expert,robotics.pptx
NLP,expert,robotics.pptx
 
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information RetrievalIndexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
 
machine learning
machine learningmachine learning
machine learning
 
Final presentation
Final presentationFinal presentation
Final presentation
 
Intro 2 document
Intro 2 documentIntro 2 document
Intro 2 document
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
Comm 1130 technical_communication_march2012-alcock
Comm 1130 technical_communication_march2012-alcockComm 1130 technical_communication_march2012-alcock
Comm 1130 technical_communication_march2012-alcock
 
IR
IRIR
IR
 
Artificial Intelligence by B. Ravikumar
Artificial Intelligence by B. RavikumarArtificial Intelligence by B. Ravikumar
Artificial Intelligence by B. Ravikumar
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
 
Algorithms and Data Structures
Algorithms and Data StructuresAlgorithms and Data Structures
Algorithms and Data Structures
 
ICS1020 NLP 2020
ICS1020 NLP 2020ICS1020 NLP 2020
ICS1020 NLP 2020
 
Labou "Data Science and the Library at UC San Diego"
Labou "Data Science and the Library at UC San Diego"Labou "Data Science and the Library at UC San Diego"
Labou "Data Science and the Library at UC San Diego"
 
Semantic technology in nutshell 2013. Semantic! are you a linguist?
Semantic technology in nutshell 2013. Semantic! are you a linguist?Semantic technology in nutshell 2013. Semantic! are you a linguist?
Semantic technology in nutshell 2013. Semantic! are you a linguist?
 

Plus de Nikola Milosevic

Classifying intangible social innovation concepts using machine learning and ...
Classifying intangible social innovation concepts using machine learning and ...Classifying intangible social innovation concepts using machine learning and ...
Classifying intangible social innovation concepts using machine learning and ...Nikola Milosevic
 
AI an the future of society
AI an the future of societyAI an the future of society
AI an the future of societyNikola Milosevic
 
Machine learning prediction of stock markets
Machine learning prediction of stock marketsMachine learning prediction of stock markets
Machine learning prediction of stock marketsNikola Milosevic
 
Equity forecast: Predicting long term stock market prices using machine learning
Equity forecast: Predicting long term stock market prices using machine learningEquity forecast: Predicting long term stock market prices using machine learning
Equity forecast: Predicting long term stock market prices using machine learningNikola Milosevic
 
BelBi2016 presentation: Hybrid methodology for information extraction from ta...
BelBi2016 presentation: Hybrid methodology for information extraction from ta...BelBi2016 presentation: Hybrid methodology for information extraction from ta...
BelBi2016 presentation: Hybrid methodology for information extraction from ta...Nikola Milosevic
 
Extracting patient data from tables in clinical literature
Extracting patient data from tables in clinical literatureExtracting patient data from tables in clinical literature
Extracting patient data from tables in clinical literatureNikola Milosevic
 
Supporting clinical trial data curation and integration with table mining
Supporting clinical trial data curation and integration with table miningSupporting clinical trial data curation and integration with table mining
Supporting clinical trial data curation and integration with table miningNikola Milosevic
 
Mobile security, OWASP Mobile Top 10, OWASP Seraphimdroid
Mobile security, OWASP Mobile Top 10, OWASP SeraphimdroidMobile security, OWASP Mobile Top 10, OWASP Seraphimdroid
Mobile security, OWASP Mobile Top 10, OWASP SeraphimdroidNikola Milosevic
 
Table mining and data curation from biomedical literature
Table mining and data curation from biomedical literatureTable mining and data curation from biomedical literature
Table mining and data curation from biomedical literatureNikola Milosevic
 
Sentiment analysis for Serbian language
Sentiment analysis for Serbian languageSentiment analysis for Serbian language
Sentiment analysis for Serbian languageNikola Milosevic
 
Sigurnosne prijetnje i mjere zaštite IT infrastrukture
Sigurnosne prijetnje i mjere zaštite IT infrastrukture Sigurnosne prijetnje i mjere zaštite IT infrastrukture
Sigurnosne prijetnje i mjere zaštite IT infrastrukture Nikola Milosevic
 
Mašinska analiza sentimenta rečenica na srpskom jeziku
Mašinska analiza sentimenta rečenica na srpskom jezikuMašinska analiza sentimenta rečenica na srpskom jeziku
Mašinska analiza sentimenta rečenica na srpskom jezikuNikola Milosevic
 
Software Freedom day Serbia - Owasp - informaciona bezbednost u Srbiji open s...
Software Freedom day Serbia - Owasp - informaciona bezbednost u Srbiji open s...Software Freedom day Serbia - Owasp - informaciona bezbednost u Srbiji open s...
Software Freedom day Serbia - Owasp - informaciona bezbednost u Srbiji open s...Nikola Milosevic
 

Plus de Nikola Milosevic (20)

Classifying intangible social innovation concepts using machine learning and ...
Classifying intangible social innovation concepts using machine learning and ...Classifying intangible social innovation concepts using machine learning and ...
Classifying intangible social innovation concepts using machine learning and ...
 
Veštačka inteligencija
Veštačka inteligencijaVeštačka inteligencija
Veštačka inteligencija
 
AI an the future of society
AI an the future of societyAI an the future of society
AI an the future of society
 
Machine learning prediction of stock markets
Machine learning prediction of stock marketsMachine learning prediction of stock markets
Machine learning prediction of stock markets
 
Equity forecast: Predicting long term stock market prices using machine learning
Equity forecast: Predicting long term stock market prices using machine learningEquity forecast: Predicting long term stock market prices using machine learning
Equity forecast: Predicting long term stock market prices using machine learning
 
BelBi2016 presentation: Hybrid methodology for information extraction from ta...
BelBi2016 presentation: Hybrid methodology for information extraction from ta...BelBi2016 presentation: Hybrid methodology for information extraction from ta...
BelBi2016 presentation: Hybrid methodology for information extraction from ta...
 
Extracting patient data from tables in clinical literature
Extracting patient data from tables in clinical literatureExtracting patient data from tables in clinical literature
Extracting patient data from tables in clinical literature
 
Supporting clinical trial data curation and integration with table mining
Supporting clinical trial data curation and integration with table miningSupporting clinical trial data curation and integration with table mining
Supporting clinical trial data curation and integration with table mining
 
Mobile security, OWASP Mobile Top 10, OWASP Seraphimdroid
Mobile security, OWASP Mobile Top 10, OWASP SeraphimdroidMobile security, OWASP Mobile Top 10, OWASP Seraphimdroid
Mobile security, OWASP Mobile Top 10, OWASP Seraphimdroid
 
Serbia2
Serbia2Serbia2
Serbia2
 
Table mining and data curation from biomedical literature
Table mining and data curation from biomedical literatureTable mining and data curation from biomedical literature
Table mining and data curation from biomedical literature
 
Malware
MalwareMalware
Malware
 
Sentiment analysis for Serbian language
Sentiment analysis for Serbian languageSentiment analysis for Serbian language
Sentiment analysis for Serbian language
 
Http and security
Http and securityHttp and security
Http and security
 
Android business models
Android business modelsAndroid business models
Android business models
 
Android(1)
Android(1)Android(1)
Android(1)
 
Sigurnosne prijetnje i mjere zaštite IT infrastrukture
Sigurnosne prijetnje i mjere zaštite IT infrastrukture Sigurnosne prijetnje i mjere zaštite IT infrastrukture
Sigurnosne prijetnje i mjere zaštite IT infrastrukture
 
Mašinska analiza sentimenta rečenica na srpskom jeziku
Mašinska analiza sentimenta rečenica na srpskom jezikuMašinska analiza sentimenta rečenica na srpskom jeziku
Mašinska analiza sentimenta rečenica na srpskom jeziku
 
Malware
MalwareMalware
Malware
 
Software Freedom day Serbia - Owasp - informaciona bezbednost u Srbiji open s...
Software Freedom day Serbia - Owasp - informaciona bezbednost u Srbiji open s...Software Freedom day Serbia - Owasp - informaciona bezbednost u Srbiji open s...
Software Freedom day Serbia - Owasp - informaciona bezbednost u Srbiji open s...
 

Dernier

DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdfAkritiPradhan2
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsResearcher Researcher
 
priority interrupt computer organization
priority interrupt computer organizationpriority interrupt computer organization
priority interrupt computer organizationchnrketan
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONjhunlian
 
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxTriangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxRomil Mishra
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Sumanth A
 
11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdfHafizMudaserAhmad
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...Erbil Polytechnic University
 
Forming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptForming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptNoman khan
 
Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxStephen Sitton
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating SystemRashmi Bhat
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Communityprachaibot
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier Fernández Muñoz
 
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxCurve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxRomil Mishra
 
Python Programming for basic beginners.pptx
Python Programming for basic beginners.pptxPython Programming for basic beginners.pptx
Python Programming for basic beginners.pptxmohitesoham12
 
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESCME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESkarthi keyan
 
Secure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech LabsSecure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech Labsamber724300
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
A brief look at visionOS - How to develop app on Apple's Vision Pro
A brief look at visionOS - How to develop app on Apple's Vision ProA brief look at visionOS - How to develop app on Apple's Vision Pro
A brief look at visionOS - How to develop app on Apple's Vision ProRay Yuan Liu
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfShreyas Pandit
 

Dernier (20)

DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending Actuators
 
priority interrupt computer organization
priority interrupt computer organizationpriority interrupt computer organization
priority interrupt computer organization
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
 
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxTriangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
 
11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...
 
Forming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptForming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).ppt
 
Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptx
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating System
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Community
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptx
 
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxCurve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
 
Python Programming for basic beginners.pptx
Python Programming for basic beginners.pptxPython Programming for basic beginners.pptx
Python Programming for basic beginners.pptx
 
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESCME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
 
Secure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech LabsSecure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech Labs
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
A brief look at visionOS - How to develop app on Apple's Vision Pro
A brief look at visionOS - How to develop app on Apple's Vision ProA brief look at visionOS - How to develop app on Apple's Vision Pro
A brief look at visionOS - How to develop app on Apple's Vision Pro
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdf
 

Machine learning (ML) and natural language processing (NLP)

  • 1. Natural language processing and machine learning Nikola Milosevic
  • 2. What is AI? • Intelligence presented by a machine • Flexible agent that interacts with the environment and performs actions to maximize success towards certain goal
  • 4. What is machine learning • Subfield of computer science that explores how machines can learn to perform certain task without explicit programming
  • 6. Types of machine learning • Supervised learning • Semi-supervised learning • Unsupervised learning • Reinforcement learning
  • 7. Machine learning problems • Classification • Clustering • Regression
  • 8. Testing the model • Iteratively improve the model • Test multiple algorithms – find the best one • No free lunch theory • Feedback loop for feature selection • Konfuziona matrica 𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = 𝑇𝑃 + 𝑇𝑁 𝑇𝑃 + 𝑇𝑁 + 𝐹𝑃 + 𝐹𝑁
  • 9. Examples of ML frameworks and algorithms • SCI-kit learn – Python library – Implementation of the most useful algorithms – Naïve Bayes, SVM, Random forests, decision trees… • Keras – Python library implementing about everything related to neural networks
  • 10. Text data • About 80% of data in organizations are in text format • Harder to analyse than structured data • Huge amount of textual documents – Only in biomedicine 2200 scientific papers are published every day • Growing exponentially
  • 11. Main goals of text mining • Make communication easier (e.g. translation) • Automate some processes (e.g. communication agents/chatbots) • Do data mining on textual and unstructured data
  • 13. Challenges • Man saw a woman with the telescope. – Who has a telescope? • Multiple senses, synonyms, homonyms, irony • Grammar and context can help • Acronyms
  • 14. Approaches • Rule based – Human defined rules to extract information – Needs expert humans who know how people express certain things – Is quite laborious • Machine learning based – Machine tries to learn what to extract guided by human – Needs annotated corpora (usually fairly large) • This is expensive to create and quite laborious
  • 15. Levels of analysis • Lexical – Analysis of words • Syntactic – Analysis of organization of words (phrases, sentences) • Semantic – Analysis meaning • Sometimes pragmatic – Analysis pragmatics of the use of certain words, phrases. Why author used that?
  • 16. Steps
  • 17. Lexical processing • Part of speech tagging • Parsing – Constituency – Dependency Stanford parser
  • 18. Semantic processing • Text classification – Sentiment analysis (positive/negative) – Classification by topics (politics/sport/business) – Authorship detection (Tolkien, Rowling, Shakespeare) • Named entity recognition • Topic modelling (unsupervised) • Search
  • 19. Sequence modelling • Machine learning technique useful for named entity recognition • Conditional random fields (CRF) or recurrent neural networks (often LSTM)
  • 20. Feature engineering • Selecting important features that help extract information • Can be: – Words, PoS, word shapes, vocabulary features, etc. – May depend on task and methodology – Iterative process of selecting and improving the performance – Some features may confuse the algorithm
  • 21. Search • Finds documents that are the most relevant for a given user query • Usual techniques include algorithm called TF- IDF and cosine similarity • May additionally use links towards text, positions of matched words and similar things to rank found documents • Apache Lucene, Solr (Java), there are also Python libraries
  • 22. Language models • Used as features to classification and other NLP tasks • Contain some basic characteristics of language • The most naïve (but also frequently used) is called Bag of Words • NN use more advanced models: word2vec, Glove, ULMo, BERT…
  • 23. Useful tools and libraries • Apache OpenNLP – Java • Apache Lucene – Java, C# • Stanford Core NLP – Java • NLTK – Python • GATE – GUI alat • SharpNLP • ... • Weka – for machine learning (GUI)