Copyright © 2017. All rights reserved.
Pardon My French
And Other Adventures on the Road to
Enterprise Virtual Assistants
Editt Gonen-Friedman
Oracle Voice & Emerging Technologies
Editt.gonen-friedman@oracle.com
Voice Interaction
“Voice-based technologies are the most important area of growth for mobile user interfaces… hands-free use and always-on interfaces will drive increased use of speech recognition… enterprise application developers will need to accommodate new ways of accepting input.” - Intelligence report, May 2015, Tractica
“Enterprises are going to be affected by a worker’s need to do more than type, click and swipe.” - ITWC
“2016 will be the year of Conversational Commerce.” - Chris Messina on Medium
What Does it Take to Build an Enterprise Virtual Assistant?
• Multiple technologies must come together to build it:
– Automatic Speech Recognition (ASR)
– Voice UI
– Dialog Management
– Natural Language Understanding (NLU)
A Mobile Enterprise VA
• Needs speech recognition (SR)
• Build your own. Advantages:
– Build a massive language corpus (Google)
– Handle surround sound and prioritize by proximity (Amazon’s Alexa)
– Use voice biometrics to identify the speaker (Alexa, Nuance)
• Or use third-party services
Image source: itpro.co.uk
[Diagram: Automatic Speech Recognition]
• Speech service considerations:
– Footprint: local install (Sensory) or cloud service
– Security: enterprise data is sensitive
– WER (word error rate)
– Device support
– Languages (global enterprise)
– Vocabulary customization: the ability to add recurring entity names and industry jargon
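The word error rate (WER) listed among the speech service considerations is commonly computed as the word-level edit distance between the recognizer's output and a reference transcript, divided by the number of reference words. The following is a minimal sketch under that assumption; the function name and sample sentences are illustrative and do not come from any particular speech service.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Dynamic-programming edit distance over words, one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion: reference word missing
                current[j - 1] + 1,      # insertion: extra hypothesis word
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(word_error_rate("go to leads", "go to leeds"))
```

A short command such as "go to leads" shows why WER alone can mislead in an enterprise app: a single wrong word is a WER of one third, yet it changes the intent completely.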
Speech Considerations: Vocabulary
• Compared to a general-purpose VA:
– Supported actions are limited
– Context is limited
• Is it easier?
• As a rule there is less ambiguity, but sometimes the app needs to resolve to the less popular meaning
• Example:
– The user says: “Leads” or “Go to leads”
– Intent: navigate to the leads page in my speech-enabled mobile app for sales
– Result: “Go to Leeds”. A general-purpose VA might understand this to mean “bring up the map for Leeds, England”
(Image: Leeds, Northern England)
– In this case we had to add “Leads” to the ASR custom vocabulary with an increased ‘weight’ of 50% instead of 10%
• This could also be solved at the NLP step, with full NLP that resolves ambiguity
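To illustrate how a vocabulary weight can tip the result between homophones, here is a toy sketch. It assumes the recognizer returns candidate words with acoustic scores and that a custom-vocabulary weight simply multiplies that score. Real speech services expose customization differently, so the function, the scores, and the scoring rule are all hypothetical; only the 10% and 50% weights come from the example above.

```python
def pick_transcript(candidates, vocabulary_weights, default_weight=0.10):
    """Return the candidate with the highest acoustic score times vocabulary weight.

    candidates: list of (word, acoustic_score) pairs (hypothetical recognizer output).
    vocabulary_weights: custom weights keyed by lowercase word.
    """
    def score(candidate):
        word, acoustic_score = candidate
        return acoustic_score * vocabulary_weights.get(word.lower(), default_weight)

    return max(candidates, key=score)[0]


# Homophones: on sound alone, the recognizer slightly prefers the place name.
candidates = [("Leeds", 0.52), ("leads", 0.48)]

print(pick_transcript(candidates, {}))               # Leeds (both at the 10% default)
print(pick_transcript(candidates, {"leads": 0.50}))  # leads (boosted to 50%)
```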
A Mobile Enterprise VA
[Diagram: Automatic Speech Recognition, Voice UI]
• Needs voice interaction design
– How do you make it look like a speech app?
– How do you deal with command discoverability?
– Can you ‘wake it’ with a keyword?
– In what cases do you allow a combination of touch and voice?
– How do you indicate ‘listening’?
– Here are some attempts to answer those questions in a dedicated speech app, Oracle Voice
– And this is a UI change to a regular app, Oracle Sales Cloud Mobile, where speech capabilities have been added
[Diagram: Automatic Speech Recognition, VUI, Dialog Management]
• Needs dialog management
– One-step response: gives you a simple answer or link, or navigates you to another page
– Multi-step dialog: manages a back-and-forth dialog in which context is retained
– Perhaps add more useful interactions
• Such as business content reading (news, emails, app listings)
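The difference between a one-step response and a multi-step dialog can be sketched with a toy dialog manager. The commands, slot names, and reply strings below are hypothetical and are not taken from Oracle Voice or any other product. The point is only that the multi-step path keeps context between turns while the one-step path does not need to.

```python
from dataclasses import dataclass, field


@dataclass
class DialogManager:
    """Toy dialog manager: one-step replies plus a slot-filling multi-step flow."""
    pending_slots: list = field(default_factory=list)  # details still to ask for
    context: dict = field(default_factory=dict)        # answers retained across turns
    current_slot: str = ""                             # slot the last question asked about

    def handle(self, utterance: str) -> str:
        text = utterance.strip().lower()

        # Multi-step: a question is outstanding, so treat the input as its answer.
        if self.current_slot:
            self.context[self.current_slot] = utterance.strip()
            return self._next_step()

        # One-step: navigate immediately, no context required.
        if text in ("leads", "go to leads"):
            return "NAVIGATE leads"

        # Multi-step: start collecting the details of a new task.
        if text == "create a task":
            self.context = {}
            self.pending_slots = ["subject", "due date"]
            return self._next_step()

        return "Sorry, I did not understand that."

    def _next_step(self) -> str:
        if self.pending_slots:
            self.current_slot = self.pending_slots.pop(0)
            return f"What is the {self.current_slot}?"
        self.current_slot = ""
        return f"Created task '{self.context['subject']}' due {self.context['due date']}."


if __name__ == "__main__":
    dm = DialogManager()
    for turn in ["Go to leads", "Create a task", "Call Acme", "Friday"]:
        print(f"> {turn}\n{dm.handle(turn)}")
```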
[Diagram: Automatic Speech Recognition, VUI, Dialog Management, Natural Language Understanding]
• Needs NLU
• Many third-party solutions are available
– It’s possible to start with a basic solution that understands a number of meanings and intents and can follow up with specific actions and task flows
– Soon you’ll run into the need for full language and context intelligence
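A "basic solution" of the kind described here might be little more than keyword matching. The sketch below assumes a hypothetical set of intents for a sales app. It is not how any specific third-party NLU product works, and its last example shows where such an approach runs out.

```python
import re

# Hypothetical intents: each one fires when all of its keywords are present.
INTENT_KEYWORDS = {
    "navigate_leads": {"leads"},
    "create_task": {"create", "task"},
    "read_email": {"read", "email"},
}


def detect_intent(utterance: str):
    """Return the intent whose keywords all appear in the utterance, or None."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    best_intent, best_size = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        # Prefer the most specific match (the largest keyword set).
        if keywords <= words and len(keywords) > best_size:
            best_intent, best_size = intent, len(keywords)
    return best_intent


if __name__ == "__main__":
    print(detect_intent("Go to leads"))              # navigate_leads
    print(detect_intent("Please create a new task")) # create_task
    print(detect_intent("Go to Leeds"))              # None
    # No understanding of negation: this still maps to create_task.
    print(detect_intent("Don't create a task"))      # create_task
```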
• Robust NLU needs to resolve ambiguity in context
– Leads vs. Leeds is a simple example
– ‘Diversification’ means ‘investment variety’ in Finance, but ‘getting rid of assets’ in Marketing
Image source: Oracle Intelligent UX
• Adding an NLU solution to the mobile app is no simple task
– Test performance
– Word error rate
– Intent error rate
Image source: Right Now Intent Guide
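Intent error rate is less standardized than word error rate. The sketch below assumes it means the fraction of test utterances for which the predicted intent differs from the expected one. The classifier and the test set are toy stand-ins, not a real NLU service.

```python
def intent_error_rate(test_cases, classify) -> float:
    """Fraction of (utterance, expected_intent) pairs that classify() gets wrong."""
    if not test_cases:
        raise ValueError("test_cases must not be empty")
    errors = sum(1 for utterance, expected in test_cases
                 if classify(utterance) != expected)
    return errors / len(test_cases)


def toy_classifier(utterance: str) -> str:
    """Hypothetical classifier: recognizes only the literal word 'leads'."""
    return "navigate_leads" if "leads" in utterance.lower().split() else "unknown"


# The last case simulates the ASR having already produced "Leeds".
TEST_CASES = [
    ("go to leads", "navigate_leads"),
    ("leads", "navigate_leads"),
    ("show me my leads", "navigate_leads"),
    ("go to leeds", "navigate_leads"),
]

if __name__ == "__main__":
    print(intent_error_rate(TEST_CASES, toy_classifier))  # 0.25
```

Measuring both rates on the same test set helps separate recognition problems from understanding problems, since an intent error may simply be an upstream word error.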
Are We Done Yet?
[Diagram: Automatic Speech Recognition, VUI, Dialog Management, Natural Language Understanding, Languages]
• Users want language support
Languages
– Your speech service recognizes 40 languages. Why doesn’t your app?
[Diagram: the user says “How are you doing?”; the speech engine outputs the text “How are you doing?”; the app interprets it as “The user asked how I’m doing” and decides to “Respond that I’m doing well”. Source: Technology Review]
– A user speaks French. The SR output is French text.
[Diagram: the user says “Comment allez-vous?”; the speech engine outputs the text “Comment allez-vous?”; the app concludes “I have no idea what that means” and falls through to error handling. Source: Technology Review]
• A middle step is missing: a translation or a mapping
• You could translate the text to English before further processing, or
• You could add NLP in other languages
– When adding NLP in other languages, you essentially also add a mapping between the English keywords that associate intents with actions and the corresponding words in the other supported languages
Source: Technology Review
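The mapping option can be sketched as per-language phrase tables that all resolve to the same intent. The sketch assumes the speech service reports which language it recognized. The intent name, tables, and replies are hypothetical, and a real system would need far more than exact phrase matching.

```python
# Phrases in each supported language map to a shared, language-neutral intent.
INTENT_PHRASES = {
    "en": {"how are you doing": "ask_wellbeing"},
    "fr": {"comment allez-vous": "ask_wellbeing"},
}

# Each intent has a reply per language.
RESPONSES = {
    "ask_wellbeing": {"en": "I'm doing well.", "fr": "Je vais bien."},
}


def respond(text: str, language: str) -> str:
    """Look up the intent in the table for the given language and reply in kind."""
    normalized = text.lower().strip(" ?!.")
    intent = INTENT_PHRASES.get(language, {}).get(normalized)
    if intent is None:
        return "ERROR: I have no idea what that means"
    return RESPONSES[intent][language]


if __name__ == "__main__":
    print(respond("How are you doing?", "en"))   # I'm doing well.
    print(respond("Comment allez-vous?", "fr"))  # Je vais bien.
    # French text pushed through the English-only path fails, as on the slide.
    print(respond("Comment allez-vous?", "en"))  # ERROR: I have no idea what that means
```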
• Translation services work differently, using statistics on many translated examples
• In a late-2016 blog post, Google researchers suggested that Google’s AI translation tool had formed its own internal representation (reported in the press as a “secret internal language”): a machine-initiated mapping
– The tool was trained to translate between English and Korean, and between English and Japanese
– The team found that the tool had spontaneously acquired the ability to translate between Korean and Japanese
– Science fiction? Read here:
https://techcrunch.com/2016/11/22/googles-ai-translation-tool-seems-to-have-invented-its-own-secret-internal-language/
Source: Technology Review
[Figure: A visualization of the translation system’s memory when translating a single sentence in multiple directions. Source: Technology Review]
Are We Done Yet?
• Users want AI
Source: Technology Review
Analytics = AI
[Diagram: Automatic Speech Recognition, VUI Design, Dialog Management, Natural Language Understanding, Languages, Analytics]
• Users want AI
• What they are really asking for is analytics
• Simple analytics gives you hindsight about what happened
What Does It Take to Build an Enterprise Virtual Assistant?
[Diagram: Automatic Speech Recognition, VUI, Dialog Management, Natural Language Understanding, Languages, Descriptive Analytics, Predictive Analytics, Internet of Things, Prescriptive Analytics, Machine Learning]
• Descriptive analytics allows answering more complex questions and gives you insight about what’s happening
• Predictive analytics gives you foresight about what will happen. It should also pull data from the real world
• Prescriptive analytics tells you what to do to get specific outcomes
• Machine learning makes sure the system gets better and smarter with every interaction
That’s What It Takes to Build an Enterprise Virtual Assistant
[Diagram: Automatic Speech Recognition, VUI, Dialog Management, Natural Language Understanding, Languages, Descriptive Analytics, Predictive Analytics, Internet of Things, Prescriptive Analytics, Machine Learning]
When will you be done?
Source: http://theegeek.com/artificial-intelligence/
Editt Gonen-Friedman
Editt.gonenfr@gmail.com
Editt.gonen-friedman@oracle.com
https://www.linkedin.com/in/editt