Active Learning on Question Answering with Dialogues
Shen Gao
Content
● Question Answering
● Data Collection
● Active Learning
● User Interaction
● System Architecture
● Results & Future Work
● “Question Answering”
Question Answering
Question Answering is a computer science discipline focused on building
automated systems that can answer questions posed by humans in
natural language.
Question Answering
[Diagram: Passage and Question are fed to the Model, which outputs the Answer]
Question Answering Data Sets
Data Set    Text Source                      QA Source
Quasar-T    Search Engine (Google / Bing)    Trivia
SearchQA    Search Engine (Google / Bing)    Jeopardy!
SQuAD       Wikipedia Articles               Annotation
Why Dialogue?
● Natural
● Machine User Interaction
● Availability
○ Transcripts
○ Texting
● Little previous work
Source: Statistics Brain
Question Answering in Dialogue
● TV Series Friends
● 10 Seasons
● 236 Episodes
● 3,000+ Scenes
● Datasets from Character Mining
○ JSON formatted data
○ Tokenized
○ Season - Episode - Scene - Utterance
○ Plots available for 44% of scenes
Classification on Question Types
● Based on type of answer:
○ Categorical (Multichoice) - Binary (Polar)
○ Continuous (Span of text)
● Based on Inference
○ Explicit
○ Implicit
● Based on answerability (newly introduced in SQuAD):
○ Unanswerable
○ Answerable
Explicit Questions
Q1: Does the job interview include cooking a salad?
Implicit Questions
Q2: Is the interviewer picky?
Explicit vs Implicit
● The contextual similarity between question and answer
● The amount of inference needed to resolve
● Q1: Explicit; Abundant Similarity; Little Inference
● Q2: Implicit; Little / No Similarity; Substantial Inference
Annotation Tool
Annotation
● Annotation Phases:
○ Experimental Phase - Small Data Chunk
○ Production Phase - All Data
● Tasks per phase:
○ Question & Answer Generation
○ Verification - Inter-Annotator Agreement
[Diagram: Experimental -> Revision on Template -> Stable, High IAT -> Production]
Challenges on Annotation
● Ambiguous Pronouns:
○ Example: In a scene with Chandler and Joey: "Is he excited about the date?"
● Exact wording from the original text
● Low Agreement measured
● Attempted Resolution:
○ Update instructions
○ Integrate Plots in Scene
○ Reduce the number of Questions
Evaluation Metric
● Binary Questions - Exact Match (EM)
● Continuous Questions - F1 Score
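The two metrics above can be sketched as follows. This is a minimal illustration, assuming lowercase whitespace tokenization and SQuAD-style token-level F1; the exact normalization used in the project is not given in the slides, and the function names are hypothetical.

```python
from collections import Counter


def exact_match(prediction: str, truth: str) -> float:
    """Return 1.0 when the normalized answers are identical, else 0.0."""
    return float(prediction.strip().lower() == truth.strip().lower())


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between a predicted span and a reference span."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    if not pred_tokens or not true_tokens:
        # Two empty spans count as a match; one empty span counts as a miss.
        return float(pred_tokens == true_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

For a binary question, `exact_match("Yes", "yes")` scores a hit; for a span question, `f1_score` gives partial credit when the predicted span only partly overlaps the reference.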
Results from Annotation
● Second Round:
○ Added Plot
○ Updated Instructions
● Third Round: Reduced the number of questions
● Random guess would give 50%!
● Cannot obtain high quality data
Change in Path
[Diagram: Dialogue QA splits into Continuous and Binary questions; the path Annotation -> Model Dev -> Analysis is shown alongside the path Active Learning System Dev -> Online Production -> Analysis]
Active Learning
● Active Learning is a sub-branch of Machine Learning in which the learning
system interactively queries the user to obtain the desired data.
● The goal of our system is to:
○ Collect the data the model needs to improve
○ Improve the model by applying this data
● What we offer:
○ Answer queries from the user
○ Learn from the user
● What the user provides:
○ Annotation on the data
Baseline Model
● BERT (Bidirectional Encoder Representations from Transformers) from Google AI
● Contextual vs Context Free
○ Bank account
○ River Bank
[Diagram: Pre-trained Network -> Contextual Representation -> Downstream Model -> Output]
Baseline Model
● Unprecedented results on SQuAD
● Power of Bidirectional Flow
○ Versus Left->Right; Right->Left
○ Allows learning a word from all of its context
● Masked training
User Interaction - Tutorial
User Interaction - Post Question
User Interaction - Receive Answer
User Interaction - Correct Answer
User Guidance
● Which Scene the user needs to work on
○ Ensure all scenes are evenly annotated
● Which Type of question the user needs to work on
○ Type we have least data on
○ Type the model performs worst on
● User Experience: Too Monotonous?
User Guidance
● Scene Selection
○ Randomly select from the least annotated scenes
● Type Selection
○ Use a probability function to control the random selection
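Scene selection as described above can be sketched in a few lines. This is an illustration only: the function name, the scene identifiers, and the in-memory dictionary of counts are assumptions, since the real system keeps these counts in its database.

```python
import random


def pick_scene(annotation_counts: dict, rng: random.Random) -> str:
    """Pick uniformly at random among the scenes with the fewest annotations."""
    fewest = min(annotation_counts.values())
    # Sorting makes the candidate order deterministic for a seeded generator.
    candidates = sorted(
        scene for scene, count in annotation_counts.items() if count == fewest
    )
    return rng.choice(candidates)


# Hypothetical scene ids in a season-episode-scene style.
counts = {"s1-e1-c1": 3, "s1-e1-c2": 0, "s1-e1-c3": 0}
print(pick_scene(counts, random.Random(0)))
```

Restricting the draw to the least annotated scenes keeps coverage even, while the random choice avoids sending every user to the same scene.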
User Guidance
● Constant c is used to linearly scale the probabilities
● Describes the degree of discrepancy between question types
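The probability function itself is not reproduced in this transcript, so the sketch below uses an assumed stand-in: each question type gets a weight of 1 + c * (1 - dev score), and the weights are normalized into probabilities. With c = 0 all types are equally likely; a larger c steers users more strongly toward types where the model is weak. The formula and all names here are hypothetical, not the author's.

```python
import random


def type_probabilities(dev_scores: dict, c: float) -> dict:
    """Map per-type dev scores (0..1) to sampling probabilities."""
    # ASSUMPTION: linear weighting by the score gap, scaled by the constant c.
    weights = {t: 1.0 + c * (1.0 - score) for t, score in dev_scores.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def pick_type(dev_scores: dict, c: float, rng: random.Random) -> str:
    """Sample one question type according to the probabilities above."""
    probs = type_probabilities(dev_scores, c)
    types = list(probs)
    return rng.choices(types, weights=[probs[t] for t in types], k=1)[0]


# Hypothetical dev-set scores for two question types.
scores = {"binary": 0.5, "continuous": 0.9}
print(type_probabilities(scores, c=4.0))
```

Keeping some probability mass on every type is one way to address the "too monotonous" concern, since users are not always asked for the same kind of question.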
User Guidance
● Train - Train the model
● Dev - Obtain statistics for guidance
● Test - Evaluate performance
● Test statistics are never shown to the system
Tech Stack Overview
● Front-End: HTML, JavaScript, jQuery, CSS
● Back-End: Django backend framework (Routing, Request Parsing, ORM), Python
● Database: MySQL
● Machine Learning Service: TensorFlow
● Deployment: AWS EC2 instance
Model View Controller (MVC)
● View: User Interface
● Controller: Logic
● Model: Data Storage
Controller
[Diagram: get-scene returns scene, type; post-question returns answer; post-correction]
● REST API
● Unauthenticated
● GET get-scene
● POST post-question
● POST post-answer
Controller - Security
● Server needs to know which question the user is changing
● A dummy id could create a loophole
● It would allow a malicious user to change the responses of others
● Session is anonymous, unauthenticated
post-correction:
question-id: 1
question-id: 3/26-s1-e1-c1-1
Controller - Security
● Solution - Hashing + Salt
● Passwords should not be stored in plain text
● Salt mitigates brute-force attacks
● Hash also prevents secret disclosure:
○ Prevents users from knowing how we compute the hash
● The hash itself is returned to the user
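One way to realize the hashing + salt idea is sketched below. It assumes a random per-response salt kept on the server and SHA-256 over salt + question id; the slides do not state the actual hash function or salt handling, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_response_token(question_id: str) -> tuple:
    """Return (salt, token). Only the token is sent back to the client."""
    # The salt stays on the server, so the token cannot be recomputed from a
    # guessable id such as "1" or "3/26-s1-e1-c1-1".
    salt = secrets.token_hex(16)
    token = hashlib.sha256((salt + question_id).encode("utf-8")).hexdigest()
    return salt, token


def verify_response_token(question_id: str, salt: str, token: str) -> bool:
    """Check that a token presented with post-correction matches the record."""
    expected = hashlib.sha256((salt + question_id).encode("utf-8")).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, token)


salt, token = make_response_token("3/26-s1-e1-c1-1")
print(verify_response_token("3/26-s1-e1-c1-1", salt, token))
```

Because the client only ever sees the token, an anonymous user can correct the answer they just received but has no practical way to address someone else's response.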
Django Object-Relational Mapping (ORM)
● Mapping Between Database Language and Programming Language
● SQL <-> Python
● Apply structural changes to Database
● Query Database in Programming Language
● Widely used in industry; reduces errors
Database Schema
Optimization on DB
● Indexing on fields that are queried
○ hash in User Response
○ count in Scene
● Delay in Database writes:
[Diagram: Receive Request -> Handle Request -> Return Response -> Database IO]
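The delayed-write idea can be sketched with a background thread that drains a queue, so the response goes out before the database IO happens. This is an assumed design for illustration: `DeferredWriter`, `handle_request`, and the `write_fn` callback are hypothetical names, and the slides do not say how the delay is implemented.

```python
import queue
import threading


class DeferredWriter:
    """Runs database writes on a background thread (hypothetical helper)."""

    def __init__(self, write_fn):
        self._write_fn = write_fn  # e.g. a function that saves one row
        self._pending = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def submit(self, record):
        """Queue a record and return immediately."""
        self._pending.put(record)

    def flush(self):
        """Block until every queued record has been processed."""
        self._pending.join()

    def _drain(self):
        while True:
            record = self._pending.get()
            try:
                self._write_fn(record)
            except Exception:
                pass  # a real service would log the failure here
            finally:
                self._pending.task_done()


def handle_request(question, writer):
    """Build the response first; the write happens after we return."""
    response = {"question": question, "status": "received"}
    writer.submit(response)
    return response
```

The trade-off is that a crash between returning the response and completing the write can lose a record, which may be acceptable for annotation data but would need a durable queue otherwise.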
Concurrency on DB
● Two users could work on the same question type / scene
● Both increment the count at the same time
● Pessimistic Row-Level Locking
○ Must acquire lock before write
○ Prevents dirty write
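The locking rule above can be illustrated with an in-memory stand-in for the table, where each row has its own lock that must be held for the whole read-modify-write. This is a simulation, not the project's code: `SceneCounts` is a hypothetical class. In a Django application the same effect is usually obtained with `select_for_update()` inside `transaction.atomic()`.

```python
import threading
from collections import defaultdict


class SceneCounts:
    """In-memory stand-in for a table of per-scene annotation counts."""

    def __init__(self):
        self._counts = defaultdict(int)
        self._row_locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects creation of row locks

    def _lock_for(self, scene_id):
        with self._guard:
            return self._row_locks[scene_id]

    def increment(self, scene_id):
        # Acquire the row lock BEFORE reading, and hold it through the write,
        # so two users cannot both read the same old value.
        with self._lock_for(scene_id):
            current = self._counts[scene_id]
            self._counts[scene_id] = current + 1

    def get(self, scene_id):
        with self._lock_for(scene_id):
            return self._counts[scene_id]
```

Locking per row rather than per table means users working on different scenes do not block each other.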
BERT Service
● Performance
○ Reduce Overhead
● Concurrency
○ Modularize into workers
○ Synchronize
● Update
BERT Service - Predict
● Workers
○ Dedicated Model
○ Dedicated Local Space for compute
● Worker Array - Size N
● Mutex Array - Size N
● Semaphore - N available
● Acquire Semaphore first
● Then acquire mutex
● Exception handling ensures no deadlock
[Diagram: an array of workers (W) and a matching array of mutexes (M), gated by one semaphore]
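The semaphore-then-mutex scheme can be sketched with Python's `threading` primitives. This is an illustration under assumptions: `WorkerPool` is a hypothetical name, and each worker is modeled as a plain callable standing in for a dedicated model instance.

```python
import threading


class WorkerPool:
    """N workers, N mutexes, and a semaphore counting idle workers."""

    def __init__(self, workers):
        self._workers = list(workers)
        self._mutexes = [threading.Lock() for _ in self._workers]
        self._available = threading.Semaphore(len(self._workers))

    def predict(self, question):
        # Step 1: acquire the semaphore, which blocks until a worker is idle.
        with self._available:
            # Step 2: find an idle worker and take its mutex without blocking.
            while True:
                for worker, mutex in zip(self._workers, self._mutexes):
                    if mutex.acquire(blocking=False):
                        try:
                            return worker(question)
                        finally:
                            # Runs even if the worker raises, so a failed
                            # prediction never leaves a mutex held.
                            mutex.release()


# Hypothetical workers that tag the answer with their index.
pool = WorkerPool([lambda q, i=i: (i, q) for i in range(3)])
print(pool.predict("Is the interviewer picky?"))
```

The `with` block and the `finally` clause release the semaphore and the mutex on every exit path, which is the role the slide assigns to exception handling.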
BERT Service - Train
● Query DB for new responses
● Check batch size
● Train with batch
● Populate new worker array
● Change pointers
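The training steps above can be sketched as a function that builds a fresh worker array and then swaps a single reference. All names here (`WorkerRegistry`, `train_and_swap`, and the callback parameters) are hypothetical, and the real training and database code is replaced by callables passed in by the caller.

```python
import threading


class WorkerRegistry:
    """Holds the pointer to the worker array currently serving requests."""

    def __init__(self, workers):
        self._workers = list(workers)
        self._swap_lock = threading.Lock()

    def current(self):
        with self._swap_lock:
            return self._workers

    def swap(self, new_workers):
        # Only the reference changes; requests already running keep using
        # the old array until they finish.
        with self._swap_lock:
            self._workers = list(new_workers)


def train_and_swap(registry, fetch_new_responses, train_fn, make_worker,
                   batch_size, n_workers):
    """Train on new responses once a full batch exists, then swap workers."""
    responses = fetch_new_responses()        # query DB for new responses
    if len(responses) < batch_size:          # check batch size
        return False
    weights = train_fn(responses)            # train with batch
    new_workers = [make_worker(weights) for _ in range(n_workers)]
    registry.swap(new_workers)               # change pointers
    return True
```

Building the new array off to the side means prediction never waits on training; the only shared step is the brief pointer swap.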
BERT Service
[Diagram: two worker arrays (W W W W W), one above the other]
Snapshot
● Keep track of model progress
● Cron Jobs
● Use the latest worker to test against
○ dev dataset
○ test dataset
● Record:
○ Respective performance
○ Counts
○ User-Model F1
Production
● Advertised through email to students in the department
● Collected data for 7 days
● Will continue online in future
Result - System Performance
● Measured by average of 100 requests
● Predict interface measured by 100 randomly selected scenes with test questions
● Performance in deployment environment
Results - Data Collection
● Collected 151 responses
● Concentrated on weak types (72.18% vs 50.64%)
● No evaluation improvement yet
● 1.76% of training data
Result - User-Model F1
● Model cannot learn from its own predictions
● Denotes the reverse of the similarity between the model response and the user input
Future Work
● Funding
● Current Major Limitation: Responses
● More advertising through:
○ Community of NLP
○ Community of Friends
“Question Answering”

Active Learning on Question Answering with Dialogues

  • 1. Active Learning on Question Answering with Dialogues Shen Gao
  • 2. Content ● Question Answering ● Data Collection ● Active Learning ● User Interaction ● System Architecture ● Results & Future Work ● “Question Answering”
  • 3. Question Answering Question Answering is a Computer Science discipline focused on building automated systems that can answer questions posed by humans in natural language.
  • 5. Question Answering Data Sets (Data Set: Text Source; QA Source) ● Quasar-T: Search Engine (Google / Bing); Trivia ● SearchQA: Search Engine (Google / Bing); Jeopardy! ● SQuAD: Wikipedia Articles; Annotation
  • 6. Why Dialogue? ● Natural ● Machine User Interaction ● Availability ○ Transcripts ○ Texting ● Little previous work Source: Statistics Brain
  • 7. Question Answering in Dialogue ● TV Series Friends ● 10 Seasons ● 236 Episodes ● 3000 + Scenes ● Datasets from Character Mining ○ JSON formatted data ○ Tokenized ○ Season - Episode - Scene - Utterance ○ Plots available for 44% scenes
  • 8. Classification on Question Types ● Based on type of answer: ○ Categorical (Multiple Choice) - Binary (Polar) ○ Continuous (Span of text) ● Based on Inference ○ Explicit ○ Implicit ● Based on answerability (newly introduced in SQuAD): ○ Unanswerable ○ Answerable
  • 9. Explicit Questions Q1: Does the job interview include cooking a salad?
  • 10. Implicit Questions Q2: Is the interviewer picky?
  • 11. Explicit vs Implicit ● The contextual similarity between question and answer ● The amount of inference needed to resolve ● Q1: Explicit; Abundant Similarity; Little Inference ● Q2: Implicit; Little / No Similarity; Substantial Inference
  • 13. Annotation ● Annotation Phases: ○ Experimental Phase - Small Data Chunk ○ Production Phase - All Data ● Tasks per phase: ○ Question & Answer Generation ○ Verification - Inter-Annotator Agreement ● Workflow diagram labels: Experimental, Revision on Template, Stable, High IAT, Production
  • 14. Challenges on Annotation ● Ambiguous Pronouns: ○ Example: In a scene with Chandler and Joey: "Is he excited about the date?" ● Exact wording from the original text ● Low agreement measured ● Attempted Resolution: ○ Update instructions ○ Integrate Plots in Scene ○ Reduce the number of Questions
  • 15. Evaluation Metric ● Binary Questions - Exact Match (EM) ● Continuous Questions - F1 Score
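The two metrics on slide 15 can be sketched as below. This is a minimal illustration that assumes SQuAD-style token-overlap F1 and a simple lowercase, whitespace-split normalization; the slides do not state the exact normalization used, and the function names are hypothetical.

```python
from collections import Counter


def normalize(text):
    """Lowercase and split on whitespace (assumed normalization, not from the slides)."""
    return text.lower().split()


def exact_match(prediction, gold):
    """Exact Match for binary questions: 1.0 if the normalized answers are identical."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction, gold):
    """Token-overlap F1 for continuous (span) answers."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, a predicted span "the job interview" against the gold span "job interview" shares two tokens, giving precision 2/3, recall 1, and an F1 of 0.8.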
  • 16. Results from Annotation ● Second Round: ○ Added Plot ○ Updated Instructions ● Third Round: Dropped # of Questions ● Random guess would give 50%! ● Cannot obtain high quality data
  • 17. Change in Path (diagram labels) Dialogue QA: Continuous, Binary ● Annotation, Model Dev, Analysis ● Annotation, Model Dev, Analysis ● Active Learning System Dev, Online Production, Analysis
  • 18. Active Learning ● Active Learning is a sub-branch of Machine Learning in which the learning system interactively queries the user to obtain the data it needs. ● The goal of our system is to: ○ Collect the data the model needs in order to improve ○ Improve the model by training on this data ● What we offer: ○ Answer queries from the user ○ Learn from the user ● What the user provides: ○ Annotation on the data
  • 19. Baseline Model ● BERT (Bidirectional Encoder Representations from Transformers) from Google AI ● Contextual vs Context Free ○ Bank account ○ River bank ● Pipeline diagram labels: Pre-train Network, Contextual Representation, Downstream Model, Output
  • 20. Baseline Model ● Unprecedented results on SQuAD ● Power of Bidirectional Flow ○ Versus Left->Right; Right->Left ○ Allows learning a word from all of its context ● Masked training
  • 21. User Interaction - Tutorial
  • 22. User Interaction - Post Question
  • 23. User Interaction - Receive Answer
  • 24. User Interaction - Correct Answer
  • 25. User Guidance ● Which scene the user needs to work on ○ Ensure all scenes are evenly annotated ● Which type of question the user needs to work on ○ Type we have the least data on ○ Type the model performed worst on ● User Experience: Too Monotonous?
  • 26. User Guidance ● Scene Selection ● Randomly select from the least annotated scenes ● Type Selection ● Use a probability function to control the random selection
  • 27. User Guidance ● Constant c is used to linearly scale the probabilities ● Describes the degree of discrepancy between question types
  • 28. User Guidance ● Train - Train the model ● Dev - Obtain statistics for guidance ● Test - Evaluate performance ● Test statistics are never shown to the system
  • 29. Tech Stack Overview ● Front-End: HTML, JavaScript, jQuery, CSS ● Back-End: Django backend framework (Routing, Request Parsing, ORM), Python ● Database: MySQL ● Machine Learning Service: TensorFlow ● Deployment: AWS EC2 instance
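The scene and type selection described on slides 26 and 27 might look like the sketch below. The actual probability function is not recoverable from the transcript, so the weighting here (1 + c times the gap to the best-performing type) is purely an assumption chosen so that c linearly scales how strongly weak types are favoured; all function and parameter names are hypothetical. It also uses only dev-set scores, not the per-type data counts mentioned on slide 25.

```python
import random


def pick_scene(annotation_counts, rng=random):
    """Randomly pick one of the scenes with the fewest annotations so far."""
    fewest = min(annotation_counts.values())
    candidates = sorted(s for s, n in annotation_counts.items() if n == fewest)
    return rng.choice(candidates)


def type_probabilities(dev_scores, c=1.0):
    """Turn per-type dev scores into selection probabilities.

    ASSUMED form: weight = 1 + c * (best_score - score). With c = 0 every
    type is equally likely; a larger c favours the weaker types more.
    """
    best = max(dev_scores.values())
    weights = {t: 1.0 + c * (best - s) for t, s in dev_scores.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def pick_type(dev_scores, c=1.0, rng=random):
    """Sample a question type according to type_probabilities."""
    probs = type_probabilities(dev_scores, c)
    types = list(probs)
    return rng.choices(types, weights=[probs[t] for t in types], k=1)[0]
```

Sampling instead of always choosing the weakest type keeps some variety in what the user is asked to do, which speaks to the "Too Monotonous?" concern on slide 25.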
  • 30. Model View Controller (MVC) ● View: User Interface ● Controller: Logic ● Model: Data Storage
  • 31. Controller ● REST API ● Unauthenticated ● GET get-scene ● POST post-question ● POST post-answer ● Diagram labels: get-scene (scene, type), post-question (answer), post-correction
  • 32. Controller - Security ● Server needs to know which question the user is changing ● A dummy id could create a loophole ● It would allow a malicious user to change the responses of others ● Session is anonymous, unauthenticated ● Example: post-correction with question-id: 1 versus question-id: 3/26-s1-e1-c1-1
  • 33. Controller - Security ● Solution - Hashing + Salt ● Password should not be stored in plain text ● Salt mitigates brute-force attacks ● Hash also prevents secret disclosure: ○ Prevents the user from knowing how we compute the hash ● The hash itself is returned to the user
  • 34. Django Object-Relational Mapping (ORM) ● Mapping Between Database Language and Programming Language ● SQL <-> Python ● Apply structural changes to the database ● Query the database in the programming language ● Widely used in industry; reduces errors
  • 36. Optimization on DB ● Index the fields that are queried ○ hash in User Response ○ count in Scene ● Delay in database writes (diagram labels): Receive Request, Handle Request, Return Response, Database IO
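The idea on slides 32 and 33 is that the client never sees a guessable question id, only an opaque hash of it. The sketch below swaps in an HMAC (a keyed hash from Python's standard library) for the "hashing + salt" the slides mention; that choice, the in-memory `responses` dictionary standing in for the database, and all function names are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; in a real deployment it would be loaded
# from configuration so that tokens stay valid across restarts.
SERVER_SALT = secrets.token_bytes(32)

# Stand-in for the User Response table, keyed by the opaque token.
responses = {}


def public_question_token(internal_id, salt=SERVER_SALT):
    """Derive the opaque identifier handed to the client for a question."""
    return hmac.new(salt, internal_id.encode("utf-8"), hashlib.sha256).hexdigest()


def store_question(internal_id, question):
    """Record a posted question and return the token the client must echo back."""
    token = public_question_token(internal_id)
    responses[token] = {"id": internal_id, "question": question, "correction": None}
    return token


def post_correction(token, corrected_answer):
    """Apply a correction only if the token matches a stored response."""
    record = responses.get(token)
    if record is None:
        return False  # guessed or sequential ids such as "1" are rejected
    record["correction"] = corrected_answer
    return True
```

Because the token cannot be computed without the server's secret, an anonymous user can only correct a response whose token was returned to them.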
  • 37. Concurrency on DB ● Two users could work on the same question type / scene ● Increment the count at the same time ● Pessimistic Row-Level Locking ○ Must acquire lock before write ○ Prevents dirty write
  • 38. BERT Service ● Performance ○ Reduce Overhead ● Concurrency ○ Modularize into workers ○ Synchronize ● Update
  • 39. BERT Service - Predict ● Workers ○ Dedicated Model ○ Dedicated Local Space for compute ● Worker Array - Size N ● Mutex Array - Size N ● Semaphore - N available ● Acquire Semaphore first ● Then acquire mutex ● Exception handling ensures no deadlock
  • 40. BERT Service - Train ● Query DB for new responses ● Check batch size ● Train with batch ● Populate new worker array ● Change pointers
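The worker scheme on slides 39 and 40 (a counting semaphore in front of N per-worker mutexes, plus a pointer swap after training) can be sketched with Python's `threading` module as follows. Workers are modelled here as plain callables taking a scene and a question; the class and method names are hypothetical, and the real service presumably wraps TensorFlow models instead.

```python
import threading


class WorkerPool:
    """N workers guarded by one semaphore (free slots) and N mutexes (one per worker)."""

    def __init__(self, workers):
        self._workers = list(workers)
        n = len(self._workers)
        self._mutexes = [threading.Lock() for _ in range(n)]
        self._available = threading.Semaphore(n)

    def predict(self, scene, question):
        # Acquire the semaphore first: at most N callers get past this point.
        self._available.acquire()
        try:
            while True:
                # Then acquire a mutex: scan for a worker nobody else holds.
                for i, mutex in enumerate(self._mutexes):
                    if mutex.acquire(blocking=False):
                        try:
                            return self._workers[i](scene, question)
                        finally:
                            # Released even if the worker raises, so no deadlock.
                            mutex.release()
        finally:
            self._available.release()

    def swap_workers(self, new_workers):
        """After training, point the pool at a freshly populated worker array."""
        new_workers = list(new_workers)
        if len(new_workers) != len(self._mutexes):
            raise ValueError("new worker array must have the same size N")
        # Single reference assignment; in-flight calls finish on the old workers.
        self._workers = new_workers
```

The semaphore keeps callers from spinning over busy mutexes when all workers are in use, and the `finally` clauses are one way to read "exception handling ensures no deadlock".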
  • 41. Snapshot ● Keep track of model progress ● Cron Jobs ● Use the latest worker to test against ○ dev dataset ○ test dataset ● Record: ○ Respective performance ○ Counts ○ User-Model F1
  • 42. Production ● Advertised through email to students in the department ● Collected data for 7 days ● Will continue online in future
  • 43. Result - System Performance ● Measured by average of 100 requests ● Predict interface measured by 100 randomly selected scenes with test questions ● Performance in deployment environment
  • 44. Results - Data Collection ● Collected 151 responses ● Concentrated on weak types (72.18% vs 50.64%) ● No evaluation improvement yet ● 1.76% of training data
  • 45. Result - User-Model F1 ● Model cannot learn from its own prediction ● Denotes the inverse of the similarity between the model's response and the user's input
  • 46. Future Work ● Funding ● Current Major Limitation: Responses ● More advertising through: ○ The NLP community ○ The Friends community