Google Duplex
By
Deepak Sanaka
Contents
● Introduction
● Abstract
● Context about Google Duplex
● Architecture
● DNNs and RNNs
● Closed domains and Vanishing gradient problem
● Process Flow
Introduction
A long-standing goal of human-computer interaction has
been to enable people to have a natural conversation with
computers, as they would with each other. In recent years,
we have witnessed a revolution in the ability of computers to
understand and to generate natural speech, especially with
the application of deep neural networks (e.g., Google voice
search, WaveNet).
Abstract
Google Duplex is a new technology for conducting
natural conversations to carry out “real world” tasks over the
phone. The technology is directed towards completing
specific tasks, such as scheduling certain types of
appointments. For such tasks, the system makes the
conversational experience as natural as possible, allowing
people to speak normally, like they would to another person,
without having to adapt to a machine.
Defining a natural conversation
A natural conversation can be described with the following
characteristics:
● Speaker is exhibiting goal-directed, cooperative, rational
behavior.
● Speaker is using the appropriate tone.
● Speaker can understand and control the conversational
flow and use the right timing.
What is Google Duplex?
● Google Duplex is an artificial intelligence (AI) chat agent
that can carry out specific verbal tasks, such as making a
reservation or appointment, over the phone.
● It works to conduct natural conversations to
accomplish certain types of tasks.
Closed domain operation
Google Duplex is not able to carry out random casual
conversation. Rather, it was trained to autonomously handle
three specific types of tasks:
● Scheduling a hair salon appointment,
● Making a restaurant reservation, and
● Asking about the business hours of a store.
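The closed set of tasks listed above can be sketched as a small enumeration with a router that refuses anything outside it. This is only an illustration: the keyword lists and the `route` function are hypothetical, and simple keyword matching is an assumption made for brevity, not how Duplex itself classifies requests.

```python
from enum import Enum
from typing import Optional


class Task(Enum):
    """The three task types the slides say Duplex was trained on."""
    HAIR_SALON_APPOINTMENT = "hair salon appointment"
    RESTAURANT_RESERVATION = "restaurant reservation"
    BUSINESS_HOURS_INQUIRY = "business hours inquiry"


# Hypothetical keyword lists, chosen only for this illustration.
KEYWORDS = {
    Task.HAIR_SALON_APPOINTMENT: ("haircut", "salon"),
    Task.RESTAURANT_RESERVATION: ("table", "reservation", "restaurant"),
    Task.BUSINESS_HOURS_INQUIRY: ("hours", "open", "close"),
}


def route(request: str) -> Optional[Task]:
    """Return the matching task, or None when the request is out of domain."""
    text = request.lower()
    for task, words in KEYWORDS.items():
        if any(word in text for word in words):
            return task
    return None  # closed domain: anything else is not handled


in_domain = route("Book a table for four")
out_of_domain = route("Tell me a joke")
```

Returning `None` for unmatched requests makes the closed-domain boundary explicit instead of guessing at an unsupported task.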
How does Google Duplex model natural
conversations?
● Duplex uses a deep neural network (DNN); in more
complex cases, it makes use of a recurrent neural
network (RNN), which is more expensive but better at
modeling language.
● At the core of Duplex is a recurrent neural network (RNN)
designed to cope with the challenges of natural
conversation, built using TensorFlow Extended (TFX).
Architecture
Incoming sound is processed through an Automatic Speech Recognition (ASR) system.
This produces text that is analyzed with context data and other inputs to produce a
response text that is read aloud through the Text-to-Speech (TTS) system.
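The ASR, response generation, and TTS flow described above can be sketched as one conversational turn. All names here (`CallContext`, `run_turn`, and the `fake_*` stand-ins) are hypothetical and exist only so the sketch runs; in a real system each stage would be a trained model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CallContext:
    """Hypothetical container for the 'context data and other inputs'."""
    task: str
    history: List[str] = field(default_factory=list)


def run_turn(audio: bytes,
             asr: Callable[[bytes], str],
             respond: Callable[[str, CallContext], str],
             tts: Callable[[str], bytes],
             context: CallContext) -> bytes:
    """One turn: incoming sound -> text -> response text -> outgoing sound."""
    heard = asr(audio)                # Automatic Speech Recognition
    context.history.append(heard)
    reply = respond(heard, context)   # text analyzed together with context
    context.history.append(reply)
    return tts(reply)                 # Text-to-Speech


# Toy stand-ins (assumptions) that treat audio as UTF-8 encoded text.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str, ctx: CallContext) -> str:
    return f"[{ctx.task}] heard: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


ctx = CallContext(task="restaurant_reservation")
out = run_turn(b"what time?", fake_asr, fake_respond, fake_tts, ctx)
```

Passing the three stages in as callables keeps the turn logic independent of any particular speech or language model.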
Deep Neural Networks (DNNs)
● DNNs involve an input layer, one or more hidden layers
(each with a matrix of weights that is trained against
data), and an output layer capable of producing what can
be interpreted as a prediction or a classification.
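A minimal sketch of the input, hidden, and output layers in plain Python may make this concrete. The weights below are hand-picked for illustration (a trained network would learn them), and the choice of ReLU and softmax is an assumption rather than anything the slides specify.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one row of weights per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)                       # subtract max for stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative weights only: 2 inputs -> 3 hidden units -> 2 classes.
W_HIDDEN = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]
B_HIDDEN = [0.0, 0.1, 0.0]
W_OUT = [[1.0, -1.0, 0.5], [-0.5, 1.0, 0.2]]
B_OUT = [0.0, 0.0]


def forward(x):
    hidden = relu(dense(x, W_HIDDEN, B_HIDDEN))   # input -> hidden layer
    return softmax(dense(hidden, W_OUT, B_OUT))   # hidden -> output layer


probs = forward([1.0, 2.0])
predicted_class = probs.index(max(probs))  # interpret output as a class
```

The output is a list of class probabilities, and taking the largest one is the classification the slide refers to.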
Recurrent Neural Networks (RNNs)
RNNs ingest not only the current
input but also their past
hidden state. This allows
them to learn sequential
patterns.
Figure: “rolled up” and “unrolled” views of an RNN
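The recurrence can be sketched with a scalar RNN cell, where each new hidden state depends on the current input and the previous hidden state. The weights `w_x`, `w_h`, and `b` are arbitrary illustrative values, and the tanh activation is an assumption.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """h_t = tanh(w_x * x_t + w_h * h_prev + b), scalar version."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.8, w_h=0.5, b=0.0):
    """Unroll the cell over a sequence and return every hidden state."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# Both sequences end with the same input, but their histories differ.
states_a = run_rnn([1.0, 0.0, 1.0])
states_b = run_rnn([0.0, 1.0, 1.0])
```

The final hidden states differ even though the last input is identical, which is the sense in which an RNN carries information about earlier inputs forward.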
DNNs versus RNNs
● DNNs are good at one-shot prediction, where a single
observation is all it takes to produce suitable output.
● However, data often comes in sequences; language in
particular arrives in a specific order. It’s for this
reason that RNNs are used.
● Since it is very important to remember the context when
conducting a longer human-like conversation, RNNs
became one of the obvious, go-to choices to do the job.
Why is closed domain operation important?
● Closed domains are loosely defined as any setting that
has a limited number of conceivable interactions.
● Any closed domain has a limited (and well-worn) set of
conversational paths and options.
● When a domain is closed, conversations are
pigeonholed: the same sorts of conversations occur over
and over, building up a stronger dataset for
harder-to-reach features such as natural timing, knowing
industry/trade slang, and so on.
Advantages of closed domain operation
● Closed domain operation has a number of advantages; a
major one is that it helps Duplex limit the impact of the
“vanishing gradient problem,” which is an issue for many
DNNs and RNNs alike.
● It increases the sample size for particular conversational
paths in Duplex’s training data.
Vanishing Gradient Problem
● When many hidden layers are stacked, such as in a
multi-layer DNN or between time steps in an RNN, the
network begins to “forget” the past.
● As the network processes a long sequence of words, the
original context gets lost, so it fails to capture the
relationship between words that stand far apart in a
conversation.
● This happens due to the underlying mechanics of
backpropagation: the gradient is a product of one factor
per layer or time step, and when those factors are smaller
than 1 the product shrinks toward zero.
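The shrinking gradient can be shown numerically with a scalar tanh RNN. By the chain rule, the derivative of the current hidden state with respect to the initial one picks up a factor of `w_h * (1 - h_t**2)` at every time step. The weights and the constant input are arbitrary illustrative values, not parameters taken from Duplex.

```python
import math


def gradient_through_time(steps, w_h=0.5, w_x=0.8, x=1.0):
    """Track d h_t / d h_0 for a scalar tanh RNN fed a constant input."""
    h = 0.0
    grad = 1.0
    history = []
    for _ in range(steps):
        h = math.tanh(w_x * x + w_h * h)
        # Chain rule: tanh'(z) = 1 - tanh(z)**2, times the recurrent weight.
        grad *= w_h * (1.0 - h * h)
        history.append(grad)
    return history


grads = gradient_through_time(20)
first, last = grads[0], grads[-1]
```

Because every factor here is at most 0.5, the gradient after 20 steps is below one millionth, so an error signal from that far back barely influences learning.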
Illustration of vanishing gradients
● Given a closed domain, the
number of times one has
to look into the past is
constrained.
● Vanishing gradients aren’t
as much of an issue if you
don’t need to remember
much.
Understanding Nuances
● When many hidden layers are stacked such as in a
multi-layer DNN or between time steps in an RNN, the
network begins to “forget” the past.
● In the above example, we can see how the meaning of
“OK for 4” changes in different contexts.
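The context dependence of “OK for 4” can be illustrated with a toy interpreter that looks at the preceding question. The function name, the matching phrases, and the returned labels are all hypothetical; a system like Duplex would use a learned model rather than hand-written rules such as these.

```python
def interpret_ok_for_4(previous_question: str) -> str:
    """Guess what 'OK for 4' means from the question that preceded it."""
    question = previous_question.lower()
    if "how many" in question or "people" in question:
        return "party_size=4"     # the 4 refers to the number of guests
    if "what time" in question or "when" in question:
        return "time=4:00"        # the 4 refers to a time of day
    return "ambiguous"            # not enough context to decide


as_party_size = interpret_ok_for_4("How many people will there be?")
as_time = interpret_ok_for_4("What time would you like?")
```

The same words map to different meanings depending only on the earlier turn, which is why the model needs to retain conversational context.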
Process Flow
Conclusion
Allowing people to interact with technology as naturally as
they interact with each other has been a long-standing
promise. Google Duplex takes a step in this direction,
making interaction with technology via natural
conversation a reality in specific scenarios.
References
● https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html
● https://willowtreeapps.com/ideas/an-introduction-to-google-duplex-and-natural-conversations
Thank You
