Quora Duplicate Question Pair Detection Using Semantic Analysis
Guided by: Ms. Safa Hamdare
Group Members
Name Roll No.
Jai Mulye 64
Anshul Pawaskar 87
Tannmay Redij 88
Akshata Talankar 89
St. Francis Institute of Technology
Department of Computer Engineering
28/05/2021
Content
● Introduction
● Literature
● Problem Statement
● Proposed Solution
● Work Flow of the system
● Algorithm with Implementation details
● Experimental Set Up
● Data Set
● Performance Evaluation Parameters
● Validation with Test Cases
● Results & Discussion
● Conclusion
● References
Introduction
• What is Quora?
Current Scenario:
Quora uses a Random Forest technique to identify duplicate questions.
Let’s look at two hypothetical questions:
1. Is it true that time flies like an arrow?
2. Do fruit flies like a banana?
These questions share two words, "flies" and "like", yet their meanings are unrelated, so word overlap alone is a poor signal of duplication.
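The point of the two example questions can be checked with a small word-overlap sketch. The Jaccard measure used here is only an illustrative assumption, not the feature set Quora or this project uses; the function names `tokens` and `jaccard` are hypothetical.

```python
import re

def tokens(question):
    """Lowercase a question and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", question.lower()))

def jaccard(a, b):
    """Fraction of distinct words that occur in both sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

q1 = "Is it true that time flies like an arrow?"
q2 = "Do fruit flies like a banana?"

shared = tokens(q1) & tokens(q2)            # words common to both questions
overlap = jaccard(tokens(q1), tokens(q2))   # 2 shared words out of 13 distinct words
```

The overlap score is non-zero even though the questions are unrelated, which is why a purely lexical comparison can mislead a duplicate detector.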
Let’s consider these
Literature
• Paper [1] explores the Transformer-based Universal Sentence Encoder, which relies on the attention mechanism.
• Paper [2] introduces the Deep Averaging Network, which performs competitively with neural networks that model semantic and syntactic compositionality.
Literature
• Paper [3] explores the two variants of the Universal Sentence Encoder: the Transformer and the Deep Averaging Network (DAN).
• Paper [4] analyses several neural network designs and their variations for sentence pair modelling and compares their performance extensively across eight datasets, covering paraphrase identification, semantic textual similarity, natural language inference, and question answering tasks.
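As a rough illustration of the Deep Averaging Network idea mentioned above, the sketch below averages word vectors and passes the result through feed-forward layers. It is a minimal pure-Python sketch under assumptions: the vector size, layer widths, random weights, and the names `average`, `dense_relu`, `dan_encode`, and `random_layer` are all hypothetical and do not reproduce the trained Universal Sentence Encoder.

```python
import random

def average(vectors):
    """Element-wise mean of equally sized word vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def dense_relu(x, weights, bias):
    """One fully connected layer followed by a ReLU activation."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def dan_encode(word_vectors, layers):
    """Average the word vectors, then apply the feed-forward layers in order."""
    hidden = average(word_vectors)
    for weights, bias in layers:
        hidden = dense_relu(hidden, weights, bias)
    return hidden

# Toy setup (hypothetical values): 4-dimensional word vectors, two layers of width 3.
rng = random.Random(0)

def random_layer(n_in, n_out):
    """Random weight matrix (n_out rows of n_in values) with a zero bias."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

layers = [random_layer(4, 3), random_layer(3, 3)]
words = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]

encoding = dan_encode(words, layers)
# Averaging discards word order, so reversing the words gives the same
# encoding up to floating-point rounding.
reordered = dan_encode(list(reversed(words)), layers)
```

Because the average ignores word order, this kind of encoder is cheap to compute but cannot distinguish sentences that differ only in how the same words are arranged.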
Problem Statement
• On Quora, users often ask a question that already exists, phrased differently. Detecting such duplicates reduces redundancy on the platform and the manual effort of matching questions to their existing answers. Identifying which newly asked questions duplicate earlier ones also makes it possible to provide existing answers instantly.
• The goal is a model that predicts whether two entered questions are similar in meaning, using a deep learning approach based on the DAN and Transformer models.
Proposed Solution
1. Pre-processing
2. Sentence-to-vector conversion (USE)
3. Deep learning approach (DAN & Transformer)
Fig 1: Workflow of the System
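The three stages of the proposed solution can be sketched end to end as follows. This is only a sketch under stated assumptions: the slides do not list the pre-processing steps, so lowercasing and punctuation removal are guesses; `embed` is a hashed bag-of-words stand-in and not the Universal Sentence Encoder; the pair-feature layout and the single logistic unit are hypothetical placeholders for the trained network.

```python
import hashlib
import math
import re

def preprocess(question):
    """Stage 1 (assumed steps): lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", question.lower())
    return " ".join(cleaned.split())

def embed(sentence, dim=16):
    """Stage 2 stand-in: hashed bag-of-words vector, NOT the real USE model."""
    vector = [0.0] * dim
    for word in sentence.split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vector[digest[0] % dim] += 1.0
    return vector

def pair_features(u, v):
    """Assumed pairing scheme: concatenate u, v, |u - v| and the element-wise product."""
    difference = [abs(a - b) for a, b in zip(u, v)]
    product = [a * b for a, b in zip(u, v)]
    return u + v + difference + product

def classify(features, weights, bias):
    """Stage 3 stand-in: one logistic unit in place of the trained deep network."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

q1 = preprocess("How do I learn Python?")
q2 = preprocess("How do I learn  PYTHON ??")
features = pair_features(embed(q1), embed(q2))
weights = [0.0] * len(features)  # untrained placeholder weights
probability = classify(features, weights, bias=0.0)
```

With untrained zero weights the classifier is undecided; in the actual system the embedding and the classifier weights would come from the trained models.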
Work Flow of the system
Fig 2: Architecture Diagram
Algorithm with Implementation Details
Fig 3: Algorithm
Algorithm with Implementation Details
Fig 4: Implementation
Experimental Setup
Fig 5: Dataset[5]
Experimental Setup
Fig 6: Model accuracy of Transformer
Fig 7: Model loss of Transformer
Experimental Setup
Fig 8: Model accuracy of DAN
Fig 9: Model loss of DAN
Validation with Test cases
Results and Discussions
Fig 10: Browse Questions
Results and Discussions
Fig 11: Post Questions
Results and Discussions
Fig 12: Results by DAN Model
Results and Discussions
Fig 13: Results by Transformer Model
Conclusion
Model | Embedding technique | F1-score (weighted average) | F1-score (macro average)
Logistic Regression | Word2Vec, similarity scores | 0.66 | 0.62
Random Forest | Word2Vec, similarity scores | 0.70 | 0.69
Table 1: F1-scores of machine learning models
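The difference between the weighted-average and macro-average F1 columns can be illustrated with a small worked example. The labels and predictions below are made-up toy data, not the project's results, and the helper names `f1` and `f1_averages` are hypothetical.

```python
def f1(tp, fp, fn):
    """F1-score from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def f1_averages(y_true, y_pred):
    """Return (macro, weighted) F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    scores, support = {}, {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        scores[label] = f1(tp, fp, fn)
        support[label] = tp + fn  # number of true examples of this class
    macro = sum(scores.values()) / len(labels)
    weighted = sum(scores[l] * support[l] for l in labels) / len(y_true)
    return macro, weighted

# Toy data: 6 non-duplicate pairs (0) and 4 duplicate pairs (1).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
macro_f1, weighted_f1 = f1_averages(y_true, y_pred)
```

Macro averaging treats both classes equally, while weighted averaging scales each class by its number of examples, so the weighted score is higher whenever the majority class is predicted better, as in this toy data.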
Conclusion
Table 2: Accuracy of deep learning models (DAN & Transformer)
Model | Embedding technique | Epochs | Training accuracy (%) | Validation accuracy (%)
Neural Network | Universal Sentence Encoder (DAN) | 20 | 88.63 | 86
Neural Network | Universal Sentence Encoder (Transformer) | 20 | 89.16 | 85
Conclusion
• Deep learning models using sentence-level embeddings outperform the basic classification models.
• The DAN model sometimes underperforms on questions containing double negation.
• The Transformer-based Universal Sentence Encoder can be used as an alternative.
References
[1] Mueller J., Thyagarajan A. Siamese recurrent architectures for learning sentence similarity. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (2016).
[2] Agirre E., Gonzalez-Agirre A., Lopez-Gazpio I., Maritxalar M., Rigau G., Uria L. SemEval-2016 Task 2: Interpretable semantic textual similarity. In: Proceedings of the 10th International Workshop on Semantic Evaluation (2016).
[3] Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., Kaiser Ł., Polosukhin I. Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998-6008 (2017).
[4] Cer D., Yang Y., Kong S.-Y., et al. Universal Sentence Encoder for English. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. doi: 10.18653/v1/d18-2029 (2018).
[5] https://www.kaggle.com/c/quora-question-pairs/data
Thank you
