This document provides information about integrating Cortana into Windows 10 applications. It discusses using text-to-speech to convert text to audio, speech recognition to convert audio to text, and creating foreground and background integrations with Cortana to launch apps or handle voice commands without launching apps. Examples are provided for common Cortana integration scenarios. Contact information is included for the authors.
2. WHO ARE WE?
Edward Moemeka – Chief Architect, e-Builder
Follow me on Twitter @moemeka
Email me at edward.moemeka@thethinkmine.com
Connect with me on LinkedIn https://www.linkedin.com/in/edwardmoemeka
Obinna Igbokwe – Solution Architect, e-Builder
Email obinna.igbokwe@platformbasedsolutions.com
3. GET THE BOOK TO LEARN MORE
#1 Windows 10 development title on Amazon
Get it to learn more:
http://www.amazon.com/Real-World-Windows-10-Development/dp/1484214501/ref=sr_1_1?ie=UTF8&qid=1455962970&sr=8-1&keywords=windows+10+development
4. WHY CORTANA?
The obvious reasons
World’s most personal digital assistant helps you get things done throughout the day
Be the one helping!
A whole new, natural way of interacting with your PC
Create the experience
Available to millions of users across Windows 10 PCs, tablets, and phones
More money, more problems
Integrated search across device, the Web and personal and professional clouds
Omni-channel integration can lead to directional growth
Typical usage scenarios
“App first” experiences
I’ve got a pre-existing app and I want to integrate voice
I’ve got an app on the system and I want to surface its content
“Voice first” experiences
I want to create an audio interaction experience
5. MAKING YOUR PROGRAM SPEAK: THE EASY WAY
Text to speech, commonly referred to as TTS, uses a speech synthesis engine (voice) to convert
a text string into spoken words. The simplest form of this takes as input the text you actually
want the engine to utter.
7. MAKING YOUR PROGRAM SPEAK: THE HARD WAY
For added flexibility, Speech Synthesis Markup Language (SSML) can be used to describe how you want a speech synthesis engine to read the text passed to it.
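As an illustration of what such a description can look like, here is a minimal sketch of an SSML document, wrapped in a few lines of Python that only check that the markup is well-formed. The wording and the prosody values are made up for the example. In a UWP app the string would be handed to the speech synthesizer (for instance through SpeechSynthesizer.SynthesizeSsmlToStreamAsync); that part is not shown here.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"

# Hypothetical SSML document: the sentences and prosody values are illustrative only.
SSML = """<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  Hello, <break time="300ms"/>
  <prosody rate="slow" pitch="high">this sentence is read slowly and at a higher pitch.</prosody>
  <emphasis level="strong">This one is stressed.</emphasis>
</speak>"""


def check_ssml(document):
    """Parse the document and return the local names of the elements directly under <speak>."""
    root = ET.fromstring(document)
    if root.tag != "{" + SSML_NS + "}speak":
        raise ValueError("root element must be <speak> in the SSML namespace")
    # Tags come back as "{namespace}name"; keep only the name part.
    return [child.tag.split("}")[1] for child in root]


if __name__ == "__main__":
    print(check_ssml(SSML))
```

The break, prosody, and emphasis elements are the parts that plain text-to-speech cannot express: a pause, a change of rate and pitch, and stress on a phrase.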
9. RECOGNIZING SPEECH
Speech recognition converts words spoken by the user into text, which can then be used for form input, text dictation, specifying an action or command, or accomplishing tasks.
Supports:
Pre-defined grammars for free-text dictation and web search
Custom grammars authored using the Speech Recognition Grammar Specification (SRGS)
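To give a feel for a custom grammar, the following is a small sketch of an SRGS document, with a Python helper that lists the phrases it accepts. The rule name colorChoice and the phrases are hypothetical. In a UWP app a grammar file like this would typically be added to the recognizer as a constraint (for example via SpeechRecognitionGrammarFileConstraint), which is outside the scope of this sketch.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"

# Hypothetical grammar: a fixed lead-in phrase followed by one of three colors.
GRAMMAR = """<grammar version="1.0" xml:lang="en-US" root="colorChoice"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="colorChoice">
    <item>set the background to</item>
    <one-of>
      <item>red</item>
      <item>green</item>
      <item>blue</item>
    </one-of>
  </rule>
</grammar>"""


def accepted_phrases(document):
    """Expand the root rule of this simple grammar into the full phrases it matches."""
    ns = {"g": SRGS_NS}
    root = ET.fromstring(document)
    root_rule_id = root.get("root")
    # Locate the rule that the grammar declares as its entry point.
    rule = next(r for r in root.findall("g:rule", ns) if r.get("id") == root_rule_id)
    prefix = rule.find("g:item", ns).text
    choices = rule.findall("g:one-of/g:item", ns)
    return [prefix + " " + choice.text for choice in choices]


if __name__ == "__main__":
    for phrase in accepted_phrases(GRAMMAR):
        print(phrase)
```

The helper only understands this one shape (a lead-in item followed by a one-of list); it is meant to show what the grammar describes, not to be a general SRGS interpreter.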
11. RECOMMENDATIONS ON SPEECH RECOGNITION
Always provide a visual cue to indicate that speech recognition is supported
and available to the user and whether the user needs to turn it on.
Provide ongoing recognition feedback to minimize any apparent lack of
response while recognition is being performed.
Let users revise recognition text using keyboard input, disambiguation
prompts, suggestions, or additional speech recognition.
Stop recognition if input is detected from a device other than speech
recognition, such as touch or keyboard. This probably indicates that the user
has moved on to another task, such as correcting the recognition text or
interacting with other form fields.
Specify the length of time for which no speech input indicates that
recognition is over. Do not automatically restart recognition after this period
of time as it typically indicates the user has stopped engaging with your app.
Disable all continuous recognition UI and terminate the recognition session
if a network connection is not available. Continuous recognition requires a
network connection.
12. HELLO CORTANA
Utilizes Voice Command Definition (VCD) files
Requires an alias that uniquely identifies your app
If an alias name collision happens, a prompt (within the Cortana interface) will be presented to the
user, giving them the option to select which app they would like to use to service the request
Does not require apps to be running for it to function
Lets you use Cortana to launch your UWP apps as though the command were intrinsically built
into the system
Can launch your app
Can use your app to service the customer’s requests
Two categories
Foreground apps
Apps launched in this manner are launched into the foreground, meaning that the app takes focus and Cortana is
dismissed.
The recognized text of your voice input is passed into the app through the arguments of the OnActivated method.
Best saved for commands that require additional context or user input.
Background apps
The user is given no visual indication that your app is servicing the request (the primary request from the user is handled
by the Cortana interface)
Allows for providing lists, secondary actions (links), and images to the customer. Your app can be launched from one of those.
13. STEPS TO CREATING FOREGROUND INTEGRATION
Create a VCD file. This is an XML document that defines all the spoken commands that the user can say to initiate actions or invoke commands when activating your app.
Register the command sets in the VCD file when the app is launched.
Handle the foreground activation of the app through the OnActivated handler. As part of the launch parameters you will have access to the command that was triggered by Cortana and the text of the words that were uttered.
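The first step can be sketched as follows: a small VCD document for a foreground command, plus a Python helper that lists each command and the phrases it listens for. The app name "Trip Planner", the command name showTrip, and the destination phrase list are all hypothetical. Registration in the app itself (for example through VoiceCommandDefinitionManager) and the OnActivated handling are not shown.

```python
import xml.etree.ElementTree as ET

VCD_NS = "http://schemas.microsoft.com/voicecommands/1.2"

# Hypothetical VCD file. <Navigate/> marks the command as a foreground activation.
VCD = """<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.2">
  <CommandSet xml:lang="en-us" Name="TripCommands_en-us">
    <AppName>Trip Planner</AppName>
    <Example>Trip Planner, show my trip to London</Example>
    <Command Name="showTrip">
      <Example>show my trip to London</Example>
      <ListenFor>show [my] trip to {destination}</ListenFor>
      <Feedback>Showing your trip to {destination}</Feedback>
      <Navigate/>
    </Command>
    <PhraseList Label="destination">
      <Item>London</Item>
      <Item>Lagos</Item>
    </PhraseList>
  </CommandSet>
</VoiceCommands>"""


def describe_commands(document):
    """Map each command name to the list of phrases it listens for."""
    ns = {"v": VCD_NS}
    root = ET.fromstring(document)
    commands = {}
    for command in root.iterfind("v:CommandSet/v:Command", ns):
        phrases = [listen.text for listen in command.findall("v:ListenFor", ns)]
        commands[command.get("Name")] = phrases
    return commands


if __name__ == "__main__":
    print(describe_commands(VCD))
```

In the ListenFor phrase, square brackets mark an optional word and curly braces refer to the PhraseList with the matching Label, so "show trip to Lagos" and "show my trip to London" both trigger the same command.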
16. HANDLING ACTIVATION FROM CORTANA
If the window content is not set to a UIElement by the time the OnActivated method completes, it will not be initialized.
18. STEPS TO CREATE BACKGROUND INTEGRATION
Create a VCD file. This is an XML document that defines all the spoken commands that the user can say to initiate actions or invoke commands when activating your app.
Create an app service (Windows.ApplicationModel.AppService) that Cortana invokes in the background.
Register the command sets in the VCD file when the app is launched.
Handle the background activation of the app service and the execution of the voice command.
Display and/or speak the appropriate feedback to the voice command within Cortana.
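In the VCD file, the difference between the two categories comes down to one element per command: Navigate for a foreground launch, VoiceCommandService for a background app service. The sketch below shows a hypothetical VCD with one command of each kind and a Python helper that reports how each command would be activated. The names showTrip, cancelTrip, and TripVoiceCommandService are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

VCD_NS = "http://schemas.microsoft.com/voicecommands/1.2"

# Hypothetical VCD file mixing a foreground and a background command.
VCD = """<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.2">
  <CommandSet xml:lang="en-us" Name="TripCommands_en-us">
    <AppName>Trip Planner</AppName>
    <Example>Trip Planner, cancel my trip</Example>
    <Command Name="showTrip">
      <Example>show my trip</Example>
      <ListenFor>show [my] trip</ListenFor>
      <Feedback>Opening your trip</Feedback>
      <Navigate/>
    </Command>
    <Command Name="cancelTrip">
      <Example>cancel my trip</Example>
      <ListenFor>cancel [my] trip</ListenFor>
      <Feedback>Cancelling your trip</Feedback>
      <VoiceCommandService Target="TripVoiceCommandService"/>
    </Command>
  </CommandSet>
</VoiceCommands>"""


def activation_targets(document):
    """Map each command name to ("background", service name) or ("foreground", None)."""
    ns = {"v": VCD_NS}
    root = ET.fromstring(document)
    targets = {}
    for command in root.iterfind("v:CommandSet/v:Command", ns):
        service = command.find("v:VoiceCommandService", ns)
        if service is not None:
            targets[command.get("Name")] = ("background", service.get("Target"))
        else:
            targets[command.get("Name")] = ("foreground", None)
    return targets


if __name__ == "__main__":
    for name, target in activation_targets(VCD).items():
        print(name, target)
```

The Target value has to match the name of the app service declared in the app manifest; otherwise Cortana has nothing to invoke for that command.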
22. DEEP LINKING
A "Go to <app>" link on various Cortana screens.
A link embedded in a content tile on various Cortana screens.
The app service programmatically launches the foreground app.
Requires the following element to be added to the Extensions node of the Application
element:
<uap:Extension Category="windows.personalAssistantLaunch"/>
As with any protocol contract, your app must override its OnActivated method and
check for an ActivationKind of Protocol.
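Handling that Protocol activation comes down to checking the URI scheme and reading its LaunchContext query parameter. Here is a minimal sketch of that parsing step in Python; the function name launch_context and the sample argument trip=London are hypothetical, and a UWP app would do the equivalent on the URI it receives in its activation arguments.

```python
from urllib.parse import parse_qs

SCHEME = "windows.personalassistantlaunch"


def launch_context(uri):
    """Return the LaunchContext value of a Cortana deep-link URI, or None if absent."""
    scheme, separator, rest = uri.partition(":")
    if not separator or scheme.lower() != SCHEME:
        return None  # not a Cortana deep link
    # What follows the scheme is just a query string: "?LaunchContext=...".
    values = parse_qs(rest.lstrip("?")).get("LaunchContext")
    return values[0] if values else None


if __name__ == "__main__":
    # The argument is percent-encoded in the URI and decoded by parse_qs.
    print(launch_context("windows.personalassistantlaunch:?LaunchContext=trip%3DLondon"))
```

Returning None for anything that is not a well-formed deep link lets the app fall back to its normal start page instead of failing on an unexpected URI.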
When your app is launched in this manner, the resulting URI sent to it is
"windows.personalassistantlaunch:?LaunchContext=<AppLaunchArgument>"
23. PROMPTING THE USER
In certain scenarios it may be necessary to ask the user for a confirmation before proceeding with an action through the background app service. For this kind of situation, the CreateResponseForPrompt method of the VoiceCommandResponse class can be used.