Voice enable all the things with Alexa

VOICE ENABLE ALL THE THINGS WITH ALEXA
M a r k B a t e
Solutions Architect, Alexa Skills Kit
@markbate
markbate@amazon.com

VOICE EXPERIENCE
A B R I E F H I S T O R Y O F

70s 80s 90s 00s Present
mode
GUI
web
mobile
character
VUI

WE BELIEVE VOICE REPRESENTS
THE NEXT MAJOR DISRUPTION IN COMPUTING

CONVERSATION IS THE MOST NATURAL WAY TO ENGAGE
WITH YOUR PRODUCTS
VOICE RELEASES THE FRICTION OF TRADITIONAL
TECHNOLOGY INTERACTION
USERS CAN NOW INTERACT WITH YOUR PRODUCT IN A
MORE INTIMATE WAY

EUROPEAN ALEXA SKILLS PARTNERS

MEET OUR ALEXA
ENABLED DEVICES

Create Great Content:
ASK is how you connect
to your consumer
THE ALEXA SERVICE
Supported by two powerful SDKs
ALEXA
VOICE
SERVICE
Unparalleled Distribution:
AVS allows your content
to be everywhere
Lives In The Cloud
Automated Speech
Recognition (ASR)
Natural Language
Understanding (NLU)
Always Learning
ALEXA
SKILLS
KIT

ALEXA FOR MANY KINDS OF DEVICES

Skills are how you can make Alexa smarter.
They give customers new experiences.
They’re the voice-first apps for Alexa.

ALEXA APP
http://alexa.amazon.com

Key Design Principles for
ALEXA SKILLS
 Skills Should Provide High Value
 A Skill Should Evolve Over Time
 Users Can Speak to Your Skill Naturally and
Spontaneously
 Alexa Should Understand Most Requests to
Your Skill
 A Skill Should Respond in an Appropriate Way

High Utility Low Utility
Doing
Performs a Task
“Alexa, ask Scout to arm
away mode.”
“Away mode armed. You
have 45 seconds to leave
the house.”
Searching
Identifies specific info
“Alexa, ask Vendor if there
are Pearl Jam tickets
available for this
weekend.”
“There are a limited
number of tickets, ranging
from $49 to $279.”
Telling
Provides a quick reference
point
“Alexa, tell me a cat fact.”
“It is well known that dogs
are superior to cats.”
Browsing
Gives info on a broad
subject
“Alexa, ask Amazon what’s
on sale.”
“The following items are on
sale right now...”

Example of Automatic Learning
ALEXA SKILL
Alexa, launch Travel Buddy
Hi, I’m Travel Buddy. I can easily tell you about your
daily commute. Let’s get you set up. Where are you
starting from?
Las Vegas
Okay, and where are you going?
Los Angeles
Great, now whenever you ask, I can tell you about the
commute from Las Vegas to Los Angeles. The current
drive time is four hours and fourty-two minutes. There
is an accident on I-15 near Pasadena.
Alexa, launch Travel Buddy
Your commute is currently four hours and two minutes.
User engages skill
Give traffic information
Get destination
location
Do we have their
destination location?
Do we have their home
location?
Get home
location
YesNo
Is their home/destination
set up?
Give traffic information
No Yes
No Yes

AVOID FEATURE CREEP. KEEP IT SIMPLE
Don’t overwhelm your users with features out of the box. Voice is a new way for users to interact
with your product. Keep it simple and grow from there.
AS NATURAL CONVERSATION AS POSSIBLE
Try to make your utterances as natural as they possibly can. Top Tip: Have a real world
conversations with one another to create these.
CORE BUSINESS FUNCTIONALITY AS A MINIMUM
It’s important to do the fundamentals right. If you are a news company. Your users will naturally
expect you to at least provide the news. Do the extra features later.
UTILIZE THE BUILT IN LIBRARY
There are hundreds of entities that Alexa can understand using the Built-In library. You can handle
this in your skill by simply including them in your interaction model and respond with a useful
response.
VOICE DESIGN TOP TIPS

WHERE DO YOU START?
The Evolution of a Skill
Traffic Skill Example
Give an estimated time of
arrival from home to work.
Include accidents, construction
and closures on route.
Proactively alert user to delays
and provide alternate routes.
R U N
Evolve Over Time
CRAWL
What’s Your Core Functionality?
ANALYZE USER FEEDBACK
& OPTIMIZE SKILL
WALK
Expand Capabilities & Features
INNOVATE FOR
CUSTOMERS

UNDER THE HOOD OF THE ALEXA SKILLS KIT
A closer look at how the Alexa Skills Kit processes a request and returns an appropriate response
You Pass Back a Textual or
Audio Response
You Pass Back a Graphical
Response
Alexa Converts Text-to-Speech
(TTS) & Renders Graphical
Component
Respond to Intent
through Text & Visual
Alexa sends Customer
Intent to Your Service
Your Service
processes
Request
User Makes a
Request
Audio Stream is
sent up to Alexa Alexa Identifies Skill & Recognizes
Intent Through ASR & NLU

WHAT COMPONENTS MAKE UP A SKILL
Skills are made up of two components
Skill configuration in the Amazon Developer Portal. Our
Voice Interaction Model
and
Your skill code, hosted in AWS Lambda or your own
HTTPS endpoint. Our hosted service.

INVOCATION NAMES
Invocation names are how we know to route traffic
to your particular skill.
Interactions can be either:
One Shot – open your skill and perform an action
such as ‘Alexa, ask National Rail for my commute’
Conversational – Alexa, ask National Rail to set up
my commute’ - ‘OK, what is your regular departure
station’ – ‘Birmingham New Street’
Open Only – Alexa, open National Rail
Your skill can support all of these, it’s not one or
the other.
‘Alexa, ask National Rail for my commute’
Alexa, open Just Eat
Alexa tell Uber to get me a ride
Alexa, launch Cat Facts
Alexa, play RuneScape

INTENTS AND SLOTS
You define interactions for your voice app through
intent schemas
Each intent consists of two fields. The intent field
gives the name of the intent. The slots field lists the
slots associated with that intent.
Slots can also included types such as LITERAL,
NUMBER, DATE, etc.
intent schemas are uploaded to your skill in the
Amazon Developer Portal
{
"intents": [
{
"intent": "tubeinfo",
"slots": [
{
"name": "LINENAME",
"type": "LINENAMES"
}
]
}
]
}

CUSTOM SLOTS
Custom Slots increase the accuracy of Alexa when
identifying an argument within an intent.
They are created as a line separated list of values
It is recommended to have as many possible slots
as possible.
There are some built in slots for things such as
GB.City and GB.FirstName
bakerloo
central
circle
district
hammersmith and city
jubilee
metropolitan
northern
piccadilly
victoria
waterloo and city
london overground
tfl rail
DLR

 Intents for human driven events such as: Cancel, Play, Pause, Repeat, Stop
or Help
 Intents across multiple categories including: Books, Calendar, Cinema
Showtimes, General, Local Search, Music, Video, and Weather
 Slots for Numbers, Dates, Times and List Types
 AMAZON.DATE – converts words that indicate dates (“today”, “tomorrow”, or
“July”) into a date format (such as “2015-07-00T9”).

SAMPLE UTTERANCES
The mappings between intents and the typical utterances that
invoke those intents are provided in a tab-separated text document
of sample utterances.
Each possible phrase is assigned to one of the defined intents.
tubeinfo are there any disruptions on the {LINENAME} line
tubeinfo {LINENAME} line
“What is…”
“Are there…”
“Tell me…”
“Give me…”
“Give…”
“Find…”
“Find me…”

PUTTING IT ALL TOGETHER
tubeinfo are there any delays on the {LINENAME} line
{
"intent": "tubeinfo",
"slots": [
{
"name": "LINENAME",
"type": "LINENAMES"
}
]
}
bakerloo
central
. . .
Utterance
Intent
Slots

REQUEST TYPES
LaunchRequest
Occurs when the users launch the app without specifying
what they want
IntentRequest
Occurs when the user specifies an intent
SessionEndedRequest
Occurs when the user ends the session

AN EXAMPLE REQUEST
If hosting your own service, you will need to handle
POST requests to your service over port 443 and
parse the JSON
With AWS Lambda, the event object that is passed
when invoking your function is equal to the request
JSON
Requests always include a type, requestId, and
timestamp
If an IntentRequest they will include the intent and
its slots
type maps directly to LaunchRequest,
IntentRequest, and SessionEndedRequest
"request": {
"type": "IntentRequest",
"requestId": "string",
"timestamp":"2016-05-13T13:19:25Z",
"intent": {
"name": "tubeinfo",
"slots": {
"LINENAME": {
"name": "LINENAME",
"value": "circle"
}
}
},
"locale": "en-GB"
}

AN EXAMPLE RESPONSE
Your app will need to build a response object that
includes the relevant keys and values.
The alexa-sdk for Node.js makes this super simple.
ouputSpeech, card and reprompt are the supported
response objects.
ShouldEndSession is a boolean value that
determines wether the conversations is complete
or not.
You can also store session data in the Alexa Voice
Service. These are in the sessionAttributes object.
{
"version": "1.0",
"response": {
"outputSpeech": {
"type": "SSML",
"ssml": "<speak>There are
currently no delays on the circle
line.</speak>"
},
"shouldEndSession": true
},
"sessionAttributes": {}
}

CHANGING ALEXA’S INFLECTION WITH SSML
• Alexa automatically handles normal punctuation, such as
pausing after a period, or speaking a sentence ending in a
question mark as a question.
• Speech Synthesis Markup Language (SSML) is a markup
language that provides a standard way to mark up text for the
generation of synthetic speech.
• Tags supported include: speak, p, s, break, say-as, phoneme,
w and audio.

Existing Customer with
ACCOUNT LINKING
• Allow your customers to link their existing
accounts with you, to Alexa.
• Customers are prompted to log in to your
site using their normal credentials with
webview url you provide.
• You authenticate the customer and
generate an access token that uniquely
identifies the customer and link the
accounts.

DEMO
Alexa Skills Kit for NodeJS

SOME USEFUL RESOURCES
http://developer.amazon.com/ask
http://developer.amazon.com/blog
http://developer.amazon.com/alexa-fund
http://bit.ly/alexadevchat
http://bit.ly/alexaforums
http://bit.ly/alexacerthelp
http://bit.ly/alexadevevents

Voice enable all the things with Alexa

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

En vedette

En vedette (18)

Similaire à Voice enable all the things with Alexa

Similaire à Voice enable all the things with Alexa (20)

Plus de Mark Bate

Plus de Mark Bate (6)

Dernier

Dernier (20)

Voice enable all the things with Alexa

Notes de l'éditeur