Google Duplex
By
Deepak Sanaka
Contents
● Introduction
● Abstract
● Context about Google Duplex
● Architecture
● DNNs and RNNs
● Closed domains and Vanishing gradient problem
● Process Flow
Introduction
A long-standing goal of human-computer interaction has
been to enable people to have a natural conversation with
computers, as they would with each other. In recent years,
we have witnessed a revolution in the ability of computers to
understand and to generate natural speech, especially with
the application of deep neural networks (e.g., Google voice
search, WaveNet).
Abstract
Google Duplex is a new technology for conducting
natural conversations to carry out “real world” tasks over the
phone. The technology is directed towards completing
specific tasks, such as scheduling certain types of
appointments. For such tasks, the system makes the
conversational experience as natural as possible, allowing
people to speak normally, like they would to another person,
without having to adapt to a machine.
Defining a natural conversation
A natural conversation can be described with the following
characteristics:
● Speaker is exhibiting goal-directed, cooperative, rational
behavior.
● Speaker is using the appropriate tone.
● Speaker can understand and control the conversational
flow and use the right timing.
What is Google Duplex?
● Google Duplex is an artificial intelligence (AI) chat agent
that can carry out specific verbal tasks, such as making a
reservation or appointment, over the phone.
● It works to conduct natural conversations to
accomplish certain types of tasks.
Closed domain operation
Google Duplex is not able to carry out random casual
conversation. Rather, it was trained to autonomously handle
three specific types of tasks:
● Scheduling a hair salon appointment,
● Making a restaurant reservation, and
● Asking about the business hours of a store.
How does Google Duplex model natural
conversations?
● Duplex uses a deep neural network (DNN); in more
complex cases, it makes use of a recurrent neural
network (RNN) which is more expensive, but better at
modeling language.
● At the core of Duplex is a recurrent neural network (RNN)
designed to cope with the challenges of natural conversation,
built using TensorFlow Extended (TFX).
Architecture
Incoming sound is processed through an Automatic Speech Recognition (ASR) system.
This produces text that is analyzed with context data and other inputs to produce a
response text that is read aloud through the Text-to-Speech (TTS) system.
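The flow above can be sketched as a single conversational turn. This is a minimal illustration, not Duplex's actual implementation: `recognize_speech`, `generate_response`, `synthesize_speech`, and the `Conversation` structure are hypothetical stand-ins for the ASR stage, the response model, the TTS stage, and the context data.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Context carried across turns (hypothetical structure)."""
    task: str
    history: list = field(default_factory=list)


def recognize_speech(audio: bytes) -> str:
    # Stand-in for ASR: here the "audio" is just UTF-8 encoded text.
    return audio.decode("utf-8")


def generate_response(text: str, conversation: Conversation) -> str:
    # Stand-in for the model: combines the recognized text with context.
    conversation.history.append(text)
    return f"[{conversation.task}] heard: {text}"


def synthesize_speech(text: str) -> bytes:
    # Stand-in for TTS: returns bytes instead of a waveform.
    return text.encode("utf-8")


def handle_turn(audio: bytes, conversation: Conversation) -> bytes:
    """ASR -> response generation with context -> TTS."""
    text = recognize_speech(audio)
    reply = generate_response(text, conversation)
    return synthesize_speech(reply)
```

Each stage only sees the output of the previous one, while the context object is the place where information from earlier turns is kept.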
Deep Neural Networks (DNNs)
● DNNs consist of an input layer, one or more hidden layers
(whose weight matrices are trained against data), and an
output layer that produces what can be
interpreted as a prediction or a classification.
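The layer structure described above can be illustrated with a small forward pass. This is a sketch under assumed toy sizes (3 inputs, 4 hidden units, 2 output classes) with random, untrained weights; the function names are made up for the example.

```python
import math
import random


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [math.exp(v - max(values)) for v in values]
    total = sum(shifted)
    return [s / total for s in shifted]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> hidden layer (tanh) -> output layer (softmax)."""
    hidden = dense(x, hidden_w, hidden_b, math.tanh)
    scores = dense(hidden, out_w, out_b, lambda v: v)
    return softmax(scores)


random.seed(0)
# Assumed toy sizes: 3 inputs, 4 hidden units, 2 output classes.
hidden_w = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
hidden_b = [0.0] * 4
out_w = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
out_b = [0.0] * 2

probabilities = forward([0.5, -1.0, 2.0], hidden_w, hidden_b, out_w, out_b)
```

The output can be read as a classification: the index of the largest probability is the predicted class. Training would adjust the weight matrices, which this sketch leaves random.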
Recurrent Neural Networks (RNNs)
RNNs ingest not only the current
input but also their past
hidden state. This allows
them to learn sequential
patterns.
[Figure: “rolled up” RNN and “unrolled” RNN]
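The recurrence can be sketched with a scalar RNN cell. This is an illustrative toy, assuming a single hidden unit and hand-picked weights (`w_x`, `w_h`, `b` are not taken from Duplex): the "rolled up" view is `rnn_step`, and the "unrolled" view is the loop in `run_rnn`.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One time step: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_x=0.8, w_h=0.5, b=0.0):
    """Unroll the same cell over a sequence, carrying the hidden state."""
    h = 0.0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# The same final input (0.0) yields different states depending on what came before.
states_a = run_rnn([1.0, 0.0])
states_b = run_rnn([-1.0, 0.0])
```

Because the hidden state is fed back in, the second state in each run depends on the first input even though the second input is identical.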
DNNs versus RNNs
● DNNs are good at one-shot prediction—if a single
observation is all it takes to produce suitable output.
● However, data often comes in sequences; language in
particular arrives in a specific order. It’s for this
reason that RNNs are used.
● Since it is very important to remember the context when
conducting a longer human-like conversation, RNNs
became one of the obvious, go-to choices to do the job.
Why is closed domain operation important?
● Closed domains are loosely defined as any setting that
has a limited number of conceivable interactions.
● Any closed domain has a sort of closed (and well-worn)
number of conversational paths and options.
● When a domain is closed, conversations are
pigeonholed—the same sorts of conversations occur over
and over, building up a stronger dataset for harder-to-
reach features such as natural timing, knowing
industry/trade slang, and so on.
Advantages of closed domain operation
● It has a number of advantages, but a major one is that it
helps Duplex avoid the “vanishing gradient problem,”
which is an issue for many DNNs and RNNs alike.
● It increases the sample size for particular conversational
paths in Duplex’s training data.
Vanishing Gradient Problem
● When many hidden layers are stacked, such as in a
multi-layer DNN or between time steps in an RNN, the network
begins to “forget” the past.
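● As the network processes a long sequence of words, the
original context gets lost, so it fails to capture the
relationship between words that stand far apart in a
conversation.
● This happens due to the underlying mechanics of
backpropagation: gradients are multiplied layer by layer (or
time step by time step), so they shrink toward zero when the
individual factors are small.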
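The shrinking effect can be shown numerically. This sketch assumes a scalar tanh RNN with a recurrent weight of 0.5 and a constant toy input; none of these values come from Duplex. It tracks how strongly the state at step t still depends on the initial state h_0.

```python
import math


def gradient_through_time(inputs, w_h=0.5, w_x=1.0):
    """Return |d h_t / d h_0| for each step t of a scalar tanh RNN.

    Backpropagation multiplies one local derivative per time step:
    d h_t / d h_{t-1} = w_h * (1 - h_t ** 2).
    """
    h = 0.0
    grad = 1.0
    magnitudes = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)
        grad *= w_h * (1.0 - h * h)
        magnitudes.append(abs(grad))
    return magnitudes


# Assumed toy input: 30 time steps of a constant signal.
magnitudes = gradient_through_time([0.5] * 30)
```

Each factor is at most 0.5 in magnitude, so the product decays at least as fast as 0.5 to the power t. After 30 steps the gradient is too small to carry a useful learning signal back to the start of the sequence.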
Illustration of vanishing gradients
● Given a closed domain,
how far back one has
to look into the past is
constrained.
● Vanishing gradients aren’t
as much of an issue if you
don’t need to remember
much.
Understanding Nuances
● The same phrase can mean different things depending on
what was said before it.
● In the example, we can see how the meaning of
“OK for 4” changes in different contexts.
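Context-dependent interpretation can be sketched with a toy resolver. This is not how Duplex works internally: the `interpret` function, the `last_question` labels, and the two readings used here (a time and a party size) are assumptions made for illustration.

```python
import re


def interpret(utterance, last_question):
    """Resolve a bare number using the question that preceded it.

    `last_question` is a hypothetical context label: "time" or "party_size".
    """
    match = re.search(r"\d+", utterance)
    if match is None:
        return None
    number = int(match.group())
    if last_question == "time":
        return {"slot": "time", "value": f"{number}:00"}
    if last_question == "party_size":
        return {"slot": "party_size", "value": number}
    return {"slot": "unknown", "value": number}


as_time = interpret("OK for 4", last_question="time")
as_party = interpret("OK for 4", last_question="party_size")
```

The utterance is identical in both calls; only the remembered context decides which slot the number fills.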
Process Flow
Conclusion
Allowing people to interact with technology as naturally as
they interact with each other has been a long-standing
promise. Google Duplex takes a step in this direction,
making interaction with technology via natural
conversation a reality in specific scenarios.
References
● https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html
● https://willowtreeapps.com/ideas/an-introduction-to-google-duplex-and-natural-conversations
Thank You
