The document discusses Pebble's voice dictation API. It provides an overview of how the dictation works, including capturing audio with the microphone, encoding it with Speex, and sending it to a recognizer. It also covers the dictation API basics, UI flow, an example demo app, best practices, and development tools. The API allows apps to integrate voice input and transcription directly on the watch.
2. Voice
• Intro
• Basic overview
• Dictation API - Intro
• Dictation API - Example app
• How it works
• Do’s and don’ts with the API
• Development Help
12. API - Demo
• Translation tool
• Use dictation session to get text to be translated from user
• Use Google Translate API to translate the text
• Display response in the form of text to user
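13.
#include <pebble.h>
#define BUFFER_SIZE (512)
static DictationSession *session;
// Minimal placeholder callback (not from the slide); it only logs a successful transcription.
static void handle_dictation_result(DictationSession *sess, DictationSessionStatus status,
                                    char *transcription, void *context) {
  if (status == DictationSessionStatusSuccess) {
    APP_LOG(APP_LOG_LEVEL_INFO, "Transcription: %s", transcription);
  }
}
static void init(void) {
  session = dictation_session_create(BUFFER_SIZE, handle_dictation_result, NULL);
  if (!session) {
    APP_LOG(APP_LOG_LEVEL_ERROR, "No phone connected, platform is not supported or "
                                 "phone app does not support dictation APIs!");
  }
  // more initialization
}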
17.
static void init(void) {
  session = dictation_session_create(BUFFER_SIZE, handle_dictation_result, NULL);
  if (!session) {
    APP_LOG(APP_LOG_LEVEL_ERROR, "No phone connected, platform is not supported or "
                                 "phone app does not support dictation APIs!");
  }
  dictation_session_enable_confirmation(session, false /* is_enabled */);
  // more initialization
}
19. How it Works - Microphone
[Diagram: steps numbered 1 to 6 linking Microphone, MCU, PDM -> PCM, Speex Encoder, Recognizer, and Dictation UI]
20. How it Works - Microphone
• Single MEMS (microelectromechanical system) microphone
• PDM output @ ~1MHz
  • Pulse Density Modulation
  • 1-bit signal that encodes 16-bit data
• Pass the 1-bit signal through decimation and a low-pass filter (LPF) to convert it to 16-bit PCM (pulse-code modulation) data at 16kHz
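The PDM-to-PCM step can be illustrated with a deliberately simple sketch. It assumes a 1.024 MHz PDM clock (the slide only says ~1MHz), so that a decimation factor of 64 yields 16 kHz, and it uses a plain boxcar average (counting ones per block) in place of whatever decimation and low-pass filter the firmware really uses. The name `pdm_to_pcm` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 1.024 MHz PDM clock / 64 = 16 kHz PCM sample rate. */
#define DECIMATION 64

/* Convert packed PDM bits (8 per byte) to signed 16-bit PCM.
 * Each output sample is the density of ones in a block of DECIMATION bits,
 * rescaled to the 16-bit range. Counting ones over a block is a boxcar
 * low-pass filter combined with decimation; real designs use better filters.
 * Returns the number of PCM samples written. */
size_t pdm_to_pcm(const uint8_t *pdm, size_t pdm_bytes,
                  int16_t *pcm, size_t pcm_cap) {
  const size_t bytes_per_sample = DECIMATION / 8;
  size_t n = pdm_bytes / bytes_per_sample;
  if (n > pcm_cap) {
    n = pcm_cap;
  }
  for (size_t i = 0; i < n; i++) {
    int ones = 0;
    for (size_t b = 0; b < bytes_per_sample; b++) {
      uint8_t v = pdm[i * bytes_per_sample + b];
      while (v) {          /* count the set bits in this byte */
        ones += v & 1;
        v >>= 1;
      }
    }
    /* ones is in [0, 64]: 0 maps to -32768, 32 to 0, 64 is clamped to 32767. */
    int32_t s = (int32_t)(ones - DECIMATION / 2) * (65536 / DECIMATION);
    if (s > 32767) {
      s = 32767;
    }
    pcm[i] = (int16_t)s;
  }
  return n;
}
```

A block of all ones gives the maximum positive sample, all zeros the most negative one, and an alternating bit pattern (half ones) gives zero, which is how a 1-bit stream can carry 16-bit amplitude information.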
21. How it Works - Encoder
[Diagram: steps numbered 1 to 6 linking Microphone, MCU, PDM -> PCM, Speex Encoder, Recognizer, and Dictation UI]
22. How it Works - Encoder
• Why encode?
  • Bluetooth throughput is limited
• Why Speex?
  • Developed specifically for voice encoding
  • Outperforms most telephony codecs (compression ratio vs. quality)
  • Tunable quality
  • Recovers from dropped frames
23. How it Works - Encoder
• CELP (Code-excited linear prediction) coding
• Converts PCM to a sequence of frames
• Converts a 16kHz, 16-bit PCM signal (256kbps) to a 12.8kbps sequence of frames
• ~50% CPU cost
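The bit-rate figures above can be checked with a little arithmetic. The sketch below reproduces the 256kbps raw rate and works out the size of one encoded frame, assuming a 20 ms frame duration (Speex's usual frame length, which the slides do not state). The helper names are hypothetical.

```c
#include <stdint.h>

/* Raw PCM bit rate in bits per second. */
uint32_t pcm_bitrate_bps(uint32_t sample_rate_hz, uint32_t bits_per_sample) {
  return sample_rate_hz * bits_per_sample;
}

/* Size in bytes of one frame at the given bit rate and frame duration. */
uint32_t frame_bytes(uint32_t bitrate_bps, uint32_t frame_ms) {
  return bitrate_bps * frame_ms / 1000 / 8;
}
```

With these, `pcm_bitrate_bps(16000, 16)` is 256000 bps, and under the 20 ms assumption a frame shrinks from `frame_bytes(256000, 20)` = 640 bytes of raw PCM to `frame_bytes(12800, 20)` = 32 bytes encoded, a 20:1 reduction in what has to cross the Bluetooth link.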
24. How it Works - The rest
[Diagram: steps numbered 1 to 6 linking Microphone, MCU, PDM -> PCM, Speex Encoder, Recognizer, and Dictation UI]
25. Do’s and Don’ts
• Only create one session instance (~1.5kB RAM + buffer space); it can be reused.
• While a session is in progress:
  • No heavy processing
  • No AppMessage
• Clean up the session to recover precious RAM
• If you decide to disable error messages, provide some useful feedback for the user.
26. Do’s and Don’ts
• Common failures:
  • User not speaking clearly (it helps to enunciate and speak slowly)
  • Background noise
• Encourage users to keep phrases brief
• Voice language setting may be different from watch language
27. Development Tools
• Dictation API works in local emulator!
• Coming to CloudPebble soon!
• To use with the emulator:
•Fire up the emulator
•With the pebble tool:
•Use voice-enabled app like you would on a watch
$ pebble transcribe <status code> -t <transcription string>
$ pebble transcribe 0 -t "What is the current time in London England"
28. More Info
• API Documentation: http://developer.getpebble.com/docs/c/preview/Foundation/Dictation/
• Guide: https://developer.getpebble.com/guides/pebble-apps/sensors/dictation/
• Example: https://github.com/pebble-hacks/voice-demo