Voice commands are pretty ubiquitous nowadays: more mobile phone users rely on voice assistants like Siri and Cortana, and devices like Amazon Echo and Google Home have been invading our living rooms. You can interact with an intelligent assistant without leaving your couch. At the same time, chatbots have become insanely popular, and services like Slack and Facebook Messenger let you get multiple tasks done without leaving the client: you can schedule a meeting, order some pizza, call a taxi, etc.
Historically in web development, we have relied on various UI elements to interact with our users. Now, with these new technologies, you can develop rich applications with natural user interactions and a minimal visual interface. This enables countless use cases for richer and more accessible web applications.
In this talk, Tomomi Imura covers examples of conversational interfaces, and what and how you can build with JavaScript in the browser using the Web Speech API, an open web standard, as well as with Node.js to work with third-party platforms!
4. 4
@girlie_mac
What are Bots?
a software application or script that
runs automated tasks
Typically, bots perform tasks that are both simple and
structurally repetitive, at a much higher rate than would be
possible for a human alone.
https://en.wikipedia.org/wiki/Internet_bot
5. 5
What are Bots?
Bots are also a user experience that can expose
services to users via conversational
engagement and rich interactions.
6. 6
"Bots are like new applications
that you can converse with."
-- Satya Nadella, Microsoft (2016)
7. 7
Future of computing revolves around
three principal factors:
● Humans
● Digital assistants
● Bots
13. 13
Google Assistant
Web Search to Assistant (Voice)
★ Assistant queries are 20% longer than similar
searches
★ Assistant queries are 200x more conversational than
search
14. 14
Web Search vs. Voice Assistant
Web Search:
“Weather Stockholm”
“Time San Francisco”
Voice Assistant:
“How’s the weather in Stockholm?”
“What time is it now in San Francisco?”
15. 15
Google Assistant
Web Search to Assistant
★ 40% as likely to be an action
“Set a timer”
“Call home”
“What is the song?”
“Open YouTube”
“Turn light off”
OK Google, walk my dog!
31. 31
Bots on Slack
Using Slack APIs to customize your workspace
[Diagram: Slack Client, Slack Platform, and Your App Server (or go serverless, e.g. Google Cloud Functions). The app server talks to the Slack Platform via the Web API (HTTP) or the RTM API (WebSocket).]
32. 32
Slack Interactions:
Slash Commands
1. A user sends a command
2. A payload is sent to the configured server
URL via HTTP POST
3. The server sends back a JSON response
with HTTP 200
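The three steps above can be sketched in Node.js roughly as follows. This is a minimal illustration, not the code from the talk: it assumes Slack's form-encoded slash-command payload with the `command`, `text`, and `user_name` fields, and the function names (`parseSlashCommand`, `buildReply`, `createServer`) are made up for this example. A real app would also verify the request signature before trusting the payload.

```javascript
const http = require('http');

// Step 2: Slack POSTs the payload as application/x-www-form-urlencoded.
function parseSlashCommand(body) {
  const params = new URLSearchParams(body);
  return {
    command: params.get('command'),
    text: params.get('text') || '',
    userName: params.get('user_name'),
  };
}

// Step 3: build the JSON body that is sent back with HTTP 200.
function buildReply({ command, text, userName }) {
  return {
    response_type: 'ephemeral', // only the user who typed the command sees it
    text: `Hi ${userName}, you ran ${command} with "${text}"`,
  };
}

// Hypothetical wiring; call createServer().listen(port) to run it.
function createServer() {
  return http.createServer((req, res) => {
    let body = '';
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      const reply = buildReply(parseSlashCommand(body));
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(reply));
    });
  });
}
```

Keeping the parsing and reply-building in plain functions makes them easy to check without a running Slack workspace.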
33. 33
Slack Interactions:
Conversational
1. A user sends a message
2. Events API “pings” the
configured URL with the message
details
3. The bot sends back a message
via Web API method
chat.postMessage
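A rough sketch of the conversational flow above, assuming the Events API JSON shapes (`url_verification` and `event_callback`) and a bot token. `handleSlackEvent` and `postMessage` are illustrative names, the echo reply is a placeholder for real bot logic, and the global `fetch` assumes Node.js 18 or later.

```javascript
const SLACK_POST_MESSAGE_URL = 'https://slack.com/api/chat.postMessage';

// Step 2: decide what to do with the payload Slack sends to the configured URL.
function handleSlackEvent(payload) {
  // Slack checks the endpoint once by sending a challenge to echo back.
  if (payload.type === 'url_verification') {
    return { respond: { challenge: payload.challenge } };
  }
  const event = payload.event;
  // Ignore anything that is not a user message, including the bot's own posts.
  if (payload.type !== 'event_callback' || !event ||
      event.type !== 'message' || event.bot_id) {
    return { respond: {} };
  }
  return {
    respond: {}, // acknowledge the event with HTTP 200
    post: { channel: event.channel, text: `You said: ${event.text}` },
  };
}

// Step 3: send the reply through the Web API method chat.postMessage.
// `token` is a placeholder for your bot token.
async function postMessage(token, message) {
  const res = await fetch(SLACK_POST_MESSAGE_URL, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json; charset=utf-8',
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(message),
  });
  return res.json();
}
```

An HTTP handler would send `respond` back as the response body and, when `post` is present, pass it to `postMessage`.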
38. 38
Conversational Bots + NLP
What is Natural Language Processing (NLP)?
Processing and analyzing natural language data to make
interactions between computers and humans (artificial and natural
languages) possible
42. 42
NLP w/ Dialogflow
User: “Remind me to water plants at 7am every day”
Unstructured Text (Human Understandable) → Dialogflow → User Intent (Machine Understandable)
Intent: Agenda / Create event
Task: Water plants
Time: 7am
Date: Every day
{
  "id": "7098-293d-343e-b4556-6e3df36",
  "timestamp": "2019-06-04T10:45.4382",
  "lang": "en",
  "result": {
    "source": "agent",
    "fulfillment": {
      "speech": "Reminds me to water plants at 7am every day",
      "messages": [
        {
          "type": 0,
          "speech": "Reminds me to water plants at 7am every day"
        }
      ]
    },
    "score": 0.34
  }
}
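As an illustration of how a bot might consume a response shaped like the one on this slide, here is a small sketch. The `getReply` helper, the fallback text, and the `minScore` threshold are assumptions made for this example; they are not Dialogflow defaults or part of its SDK.

```javascript
// Sample response with the same shape as the JSON on the slide.
const sampleResponse = {
  id: '7098-293d-343e-b4556-6e3df36',
  timestamp: '2019-06-04T10:45.4382',
  lang: 'en',
  result: {
    source: 'agent',
    fulfillment: {
      speech: 'Reminds me to water plants at 7am every day',
      messages: [
        { type: 0, speech: 'Reminds me to water plants at 7am every day' },
      ],
    },
    score: 0.34,
  },
};

const FALLBACK = "Sorry, I didn't get that.";

// Pull the text to say back to the user, falling back when the response is
// incomplete or the match score is below an illustrative threshold.
function getReply(response, minScore = 0.3) {
  const result = response && response.result;
  const speech = result && result.fulfillment && result.fulfillment.speech;
  if (!speech || typeof result.score !== 'number' || result.score < minScore) {
    return FALLBACK;
  }
  return speech;
}
```

The returned string can then be posted to a chat platform or passed to speech synthesis.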
43. 43
Powerful NLP Platforms
Natural Language Processing & Cognitive platforms with a variety of
APIs (e.g. text analysis, spell-check, translations, etc.)
● IBM Watson
● Google Cloud Natural Language API
● Microsoft Azure Bot Services & LUIS
● Amazon Lex
● Baidu UNIT
45. 45
User: Hi, I want to book a flight!
Bot: Hi Linda, welcome back! Are you flying from San Francisco, as usual?
User: Yes, from SFO.
Bot: Where are you flying to?
User: Stockholm
Bot: Okey-dokey, hold on.
51. 51
Web Speech APIs
1. Speech recognition API
2. Speech synthesis API
http://caniuse.com/#feat=speech-recognition
http://caniuse.com/#feat=speech-synthesis
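A minimal sketch of using both APIs together: listen for one phrase, then speak it back. It assumes a browser that exposes `SpeechRecognition` (or the prefixed `webkitSpeechRecognition`), `speechSynthesis`, and `SpeechSynthesisUtterance`. The helper names are made up for this example, and the global object is passed in as `root` so the feature detection is explicit.

```javascript
// Feature-detect the two Web Speech APIs on a global object (window in a browser).
function getSpeechSupport(root) {
  const Recognition = root.SpeechRecognition || root.webkitSpeechRecognition;
  return {
    Recognition: Recognition || null,
    synthesis: root.speechSynthesis || null,
    Utterance: root.SpeechSynthesisUtterance || null,
  };
}

// Listen for one phrase and read it back aloud.
// Returns false when the browser lacks either API.
function listenAndEcho(root) {
  const { Recognition, synthesis, Utterance } = getSpeechSupport(root);
  if (!Recognition || !synthesis || !Utterance) return false;

  const recognition = new Recognition();
  recognition.lang = 'en-US';
  recognition.interimResults = false;
  recognition.onresult = (event) => {
    // The first alternative of the first result holds the recognized text.
    const transcript = event.results[0][0].transcript;
    synthesis.speak(new Utterance(`You said: ${transcript}`));
  };
  recognition.start();
  return true;
}
```

In a page, this would typically be called as `listenAndEcho(window)` from a button click handler, since browsers ask the user for microphone permission.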
52. 52
Accessibility & Voice APIs
Who depends on the features?
● People with physical disabilities who cannot use a keyboard or mouse
● People with chronic conditions, such as repetitive stress injuries (RSI)
● People with cognitive and learning disabilities who need to use voice rather
than type
● People who are blind or partially sighted and cannot see what is on the screen
● People with dyslexia and other cognitive and learning disabilities who need to
hear and see the text to better understand it
https://www.w3.org/WAI
62. 62
What I’ve mentioned:
● Give voice assistants some UI / visual cues
● Don’t trick users - let them know they are chatting with a bot
● Pick the right APIs - Need no-code tools? Need translations?
● Think accessibility
Think Humans First
63. 63
Also,
● Avoid a text-only UI. Combine it with rich interactions like links
& buttons, if the platform allows
● Give users a proper intro & onboarding guide
● Support keywords like help
● Handle errors gracefully
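Two of the points above, supporting a help keyword and handling errors gracefully, could look something like this in a bot's message handler. The `weather` command and all the wording are hypothetical placeholders for this sketch.

```javascript
const HELP_TEXT = 'Try "weather <city>" or type "help" to see this again.';

// Hypothetical command table; a real bot would call its services here.
const commands = new Map([
  ['weather', (args) => {
    if (!args) throw new Error('missing city');
    return `Looking up the weather in ${args}...`;
  }],
]);

// Turn an incoming message into a reply that never leaves the user stuck.
function reply(message) {
  const trimmed = message.trim();
  const [keyword, ...rest] = trimmed.split(/\s+/);
  const name = keyword.toLowerCase();

  // An empty message or "help" always gets the onboarding text.
  if (name === '' || name === 'help') return HELP_TEXT;

  const command = commands.get(name);
  if (!command) {
    return `Sorry, I did not understand "${trimmed}". ${HELP_TEXT}`;
  }
  try {
    return command(rest.join(' '));
  } catch (err) {
    // Fail gracefully and point the user back to something that works.
    return `Something went wrong (${err.message}). ${HELP_TEXT}`;
  }
}
```

Every branch ends with a message the user can act on, so unknown input and failures lead back to the help text instead of silence.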
Think Humans First
68. “Imitation is the sincerest form of flattery that mediocrity can
pay to greatness.” - Oscar Wilde
There are some near-exact copies of my talk floating around, and I’d like to claim my originality. I created the
content from my own research and experiences, and wrote the sample code from scratch.
I have modified my content slightly since I discovered them, yet the copied talks are still way too similar to mine. The
side-by-side comparison proved the plagiarism. Yes, I am flattered, but this is definitely not cool.
However, I would still like to keep my content under CC BY-SA 4.0 for educational purposes, so if you’d like to use the
materials for meetups, brown bag sessions, etc., please do state that you borrowed content from me.
Thank you ❤
Tomomi