Introduction to Natural Language Processing (NLP)

•

2 j'aime•141 vues

WingChan46

Presenting NLP concepts and demo my NLP project which is building NLP pipeline with AWS services.

Technologie

Introduction to Natural
Language Processing (NLP)
Presented By Wing Chan

Agenda
What is NLP? How NLP
works?
NLP
Applications
Project Demo –
Switch

What's NLP?
• Stands for Natural Language
Processing.
• A process that try to understand or
generate pieces of human language.
NLP often uses AI techniques such
as Machine Learning.
• To convert unstructured data such
as text documents and voice data
from human to structured and
meaningful knowledges. This is an
important part of Text mining / Text
Analysis.

How NLP Works
Key features in NLP pipeline can do:
• Entity recognition- extracting
entities from text documents
• Topics analysis - extracting
features and topics
• Sentiment analysis - Analyzing
text for positive and negative
feelings
• Classification - Classifying
documents

NLP Applications
• Spellcheckers or Spam filters
• Recommendation system, i.e. Netflix
• Voice assistants, I.e. Alexa, Siri
• Online search, i.e. Google
• Language Translation, i.e. Google translate

Project Demo - Switch
• Switch is a comprehensive Jobs search engine using content from Twitter. It allows Job
seekers to search relevant job posting, submitted by Twitter users. We collect all tweets
related to job posting using Twitter's API and process them with Natural Language
Processing (NLP) techniques to return relevant job results to our users.
• Website: http://switch-ui.s3-website.us-east-2.amazonaws.com

Technologies we used
• Twitter APIs - To scrape data from data source Twitter we used Twitter API
• Amazon Web Service (AWS) - We used AWS to host our services. We leveraged many
AWS services for this project includes Lambda functions, DynamoDB, Elastic Search
Engine, S3 and CloudWatch Events.
• Python Libraries used are Tweepy, Spacy, Pandas, NumPy, re, json, requests
• Web Development Framework– For our end user Web portal we used React as our
platform of development.
• Other Tools – Google Colabs (Jupyter notebook), Trello (Kanban board).

NLP techniques we used
• Named Entity Recognition – is an information extraction technique to extract
unstructured text (i.e. Tweet messages) into pre-defined categories information such as
organization name and geographic location.
• Naive Bayes Classification – is a simple text retrieval technique for constructing a
classifier based on applying Bayes' algorithm with independence features and labeled
training data. We used it to identify and discard irrelevant tweets in this project. This is
also a type of supervised learning methods.

Lesson learned
• Data exploration is important – be sure to understand your data by sampling them first.
• The defaulttweet message was 140 chars. We need to get the full text message (256 chars) instead.
• Missing valuablefield values, i.e. Screen name (@interstate_batteries)
• Collecting good training datais hard.
• Understand your platform limitation
• AWS Lambda deploymentpackage limitation - 50 MB (zipped for direct upload).

References
• Switch Web Portal: http://switch-ui.s3-website.us-east-2.amazonaws.com/
• Switch Source Codes and Documentation:https://lab.textdata.org/wingkc2/switch
• PresentationVideo: https://youtu.be/loMG4rdGXkg

Contenu connexe

Tendances

Sentiment analysis using naive bayes classifier Dev Sahu

Deep learning - A Visual IntroductionLukas Masuch

Syntactic analysis in NLPkartikaVashisht

Document SummarizationPratik Kumar

Word embedding ShivaniChoudhary74

Support Vector Machine ppt presentationAyanaRukasar

LSTM Based Sentiment Analysisijtsrd

NLP_KASHK:Minimum Edit DistanceHemantha Kulathilake

camera-based Lane detection by deep learningYu Huang

Lecture 1: Semantic Analysis in Language TechnologyMarina Santini

Nlp toolkits and_preprocessing_techniquesankit_ppt

Rnn and lstmShreshth Saxena

Sentiment AnalysisAditya Nag

Lecture: Context-Free GrammarsMarina Santini

Introduction to Text MiningMinha Hwang

Aspect Based Sentiment AnalysisGaurav kumar

NLP.pptxRahul Borate

Text classification & sentiment analysisM. Atif Qureshi

Introduction to Named Entity RecognitionTomer Lieber

Semantic Role LabelingMarina Santini

Tendances (20)

Sentiment analysis using naive bayes classifier

Deep learning - A Visual Introduction

Syntactic analysis in NLP

Document Summarization

Word embedding

Support Vector Machine ppt presentation

LSTM Based Sentiment Analysis

NLP_KASHK:Minimum Edit Distance

camera-based Lane detection by deep learning

Lecture 1: Semantic Analysis in Language Technology

Nlp toolkits and_preprocessing_techniques

Rnn and lstm

Sentiment Analysis

Lecture: Context-Free Grammars

Introduction to Text Mining

Aspect Based Sentiment Analysis

NLP.pptx

Text classification & sentiment analysis

Introduction to Named Entity Recognition

Semantic Role Labeling

Similaire à Introduction to Natural Language Processing (NLP)

Natural language processing and searchNathan McMinn

Sentiment Analysis on TwitterSmritiAgarwal26

Building NLP solutions using Pythonbotsplash.com

Building NLP solutions for Davidson ML Groupbotsplash.com

SE-IT JAVA LAB OOP CONCEPTnikshaikh786

An Introduction to Natural Language ProcessingTyrone Systems

AI & MLKaran Shaw

Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...Sri Ambati

Webinar: Question Answering and Virtual Assistants with Deep LearningLucidworks

Natural Language Processing for Data Analytics - Tel Aviv Summit 2018Amazon Web Services

How Oracle Uses CrowdFlower For Sentiment AnalysisCrowdFlower

NLP and Deep Learning for non_expertsSanghamitra Deb

ProjectsSummary.pptxJamesKirk79

Solved Big Data and Data Science Projects pdf.pdfProjectPro Big Data and Data Science Projects

Software Programming with Python II.pptxGevitaChinnaiah

Taming TextGrant Ingersoll

SG_UserGroup_Oct20_2022_NLP_AzureLangStudio.pptxPriyankaShah668821

Crowdsourcing or bust: The Indexer, Archives NZ donellemckinley

Final presentationNitish Upreti

score based ranking of documentsKriti Khanna

Similaire à Introduction to Natural Language Processing (NLP) (20)

Natural language processing and search

Sentiment Analysis on Twitter

Building NLP solutions using Python

Building NLP solutions for Davidson ML Group

SE-IT JAVA LAB OOP CONCEPT

An Introduction to Natural Language Processing

AI & ML

Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...

Webinar: Question Answering and Virtual Assistants with Deep Learning

Natural Language Processing for Data Analytics - Tel Aviv Summit 2018

How Oracle Uses CrowdFlower For Sentiment Analysis

NLP and Deep Learning for non_experts

ProjectsSummary.pptx

Solved Big Data and Data Science Projects pdf.pdf

Software Programming with Python II.pptx

Taming Text

SG_UserGroup_Oct20_2022_NLP_AzureLangStudio.pptx

Crowdsourcing or bust: The Indexer, Archives NZ

Final presentation

score based ranking of documents

Dernier

Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun

Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer

The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los

04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG

Presentation on how to chat with PDF using ChatGPT code interpreternaman860154

How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes

Histor y of HAM Radio presentation slidevu2urc

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung

GenCyber Cyber Security Day PresentationMichael W. Hawkins

Real Time Object Detection Using Open CVKhem

IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge

A Year of the Servo Reboot: Where Are We Now?Igalia

Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1

Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2

How to convert PDF to text with Nanonetsnaman860154

From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software

Artificial Intelligence: Facts and MythsJoaquim Jorge

🐬 The future of MySQL is Postgres 🐘RTylerCroy

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j

Dernier (20)

Data Cloud, More than a CDP by Matt Robison

Tata AIG General Insurance Company - Insurer Innovation Award 2024

The 7 Things I Know About Cyber Security After 25 Years | April 2024

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx

Presentation on how to chat with PDF using ChatGPT code interpreter

How to Troubleshoot Apps for the Modern Connected Worker

Histor y of HAM Radio presentation slide

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...

GenCyber Cyber Security Day Presentation

Real Time Object Detection Using Open CV

IAC 2024 - IA Fast Track to Search Focused AI Solutions

A Year of the Servo Reboot: Where Are We Now?

Boost Fertility New Invention Ups Success Rates.pdf

Exploring the Future Potential of AI-Enabled Smartphone Processors

How to convert PDF to text with Nanonets

From Event to Action: Accelerate Your Decision Making with Real-Time Automation

Artificial Intelligence: Facts and Myths

🐬 The future of MySQL is Postgres 🐘

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...

Introduction to Natural Language Processing (NLP)

1. Introduction to Natural Language Processing (NLP) Presented By Wing Chan

2. Agenda What is NLP? How NLP works? NLP Applications Project Demo – Switch

3. What's NLP? • Stands for Natural Language Processing. • A process that try to understand or generate pieces of human language. NLP often uses AI techniques such as Machine Learning. • To convert unstructured data such as text documents and voice data from human to structured and meaningful knowledges. This is an important part of Text mining / Text Analysis.

4. Basic Concepts in NLP

5. How NLP Works Key features in NLP pipeline can do: • Entity recognition- extracting entities from text documents • Topics analysis - extracting features and topics • Sentiment analysis - Analyzing text for positive and negative feelings • Classification - Classifying documents

6. List of common NLP Algorithms

7. NLP Applications • Spellcheckers or Spam filters • Recommendation system, i.e. Netflix • Voice assistants, I.e. Alexa, Siri • Online search, i.e. Google • Language Translation, i.e. Google translate

8. Project Demo - Switch • Switch is a comprehensive Jobs search engine using content from Twitter. It allows Job seekers to search relevant job posting, submitted by Twitter users. We collect all tweets related to job posting using Twitter's API and process them with Natural Language Processing (NLP) techniques to return relevant job results to our users. • Website: http://switch-ui.s3-website.us-east-2.amazonaws.com

9. Technologies we used • Twitter APIs - To scrape data from data source Twitter we used Twitter API • Amazon Web Service (AWS) - We used AWS to host our services. We leveraged many AWS services for this project includes Lambda functions, DynamoDB, Elastic Search Engine, S3 and CloudWatch Events. • Python Libraries used are Tweepy, Spacy, Pandas, NumPy, re, json, requests • Web Development Framework– For our end user Web portal we used React as our platform of development. • Other Tools – Google Colabs (Jupyter notebook), Trello (Kanban board).

10. NLP techniques we used • Named Entity Recognition – is an information extraction technique to extract unstructured text (i.e. Tweet messages) into pre-defined categories information such as organization name and geographic location. • Naive Bayes Classification – is a simple text retrieval technique for constructing a classifier based on applying Bayes' algorithm with independence features and labeled training data. We used it to identify and discard irrelevant tweets in this project. This is also a type of supervised learning methods.

11. Architecture Diagram

12. Lesson learned • Data exploration is important – be sure to understand your data by sampling them first. • The defaulttweet message was 140 chars. We need to get the full text message (256 chars) instead. • Missing valuablefield values, i.e. Screen name (@interstate_batteries) • Collecting good training datais hard. • Understand your platform limitation • AWS Lambda deploymentpackage limitation - 50 MB (zipped for direct upload).

13. References • Switch Web Portal: http://switch-ui.s3-website.us-east-2.amazonaws.com/ • Switch Source Codes and Documentation:https://lab.textdata.org/wingkc2/switch • PresentationVideo: https://youtu.be/loMG4rdGXkg

Introduction to Natural Language Processing (NLP)

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Introduction to Natural Language Processing (NLP)

Similaire à Introduction to Natural Language Processing (NLP) (20)

Dernier

Dernier (20)

Introduction to Natural Language Processing (NLP)