3. A voice browser is a device that:
interprets voice input and voice markup languages
to generate voice output;
interprets a script that specifies exactly
what to present verbally to the user, as well as
when to present each piece of information.
4. An advantage for people with visual impairments
Mobile Web
Naturalistic dialogs with Web-based services
5. There are 10 times as many telephones as connected PCs.
Cell phone usage is growing dramatically.
Speaking and listening are the natural modes of
interaction. Easy to use, even for people with no
knowledge of computers or a fear of them.
Voice interaction can escape the physical limitations
of keypads and displays as mobile devices become
ever smaller.
6. Time frame: 1998 to ??
Hands-free access to the Web.
Pragmatic interface for functionally blind users.
10. World Wide Web Consortium (W3C)
Voice Browser Working Group
Speech Interface Framework
11. Established on 26 March 1999.
Re-chartered through 31 January 2009.
W3C Team Contacts are Kazuyuki Ashimura and Matt Womer.
Co-chaired by Jim Larson and Scott McGlashan.
12. VoiceXML 1.0
VoiceXML 2.0
VoiceXML 2.1
VoiceXML 3.0
Speech Recognition Grammar Specification (SRGS) 1.0
Speech Synthesis Markup Language (SSML) 1.0
Speech Synthesis Markup Language (SSML) 1.1
Call Control XML (CCXML)
State Chart XML (SCXML)
Semantic Interpretation (SISR) 1.0
Pronunciation Lexicon Specification (PLS) 1.0
13. VoiceXML 1.0 - designed for creating audio dialogs.
VoiceXML 2.0 - uses the Form Interpretation Algorithm (FIA).
VoiceXML 2.1 - adds 8 elements.
VoiceXML 3.0 (31 August 2010) - relationship between
semantics and syntax.
14. HTML does not have:
Tapered prompts.
Grammars specifying the alternative words that the user can
speak in response to a question.
Instructions to the text-to-speech synthesizer about how
to say words and phrases.
Adding these capabilities would complicate HTML, a
language developed for visual user interfaces.
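
To make the missing capabilities concrete, the following is a minimal sketch in Python that assembles a small VoiceXML 2.0 document. It is an illustration, not taken from the slides. It shows a field whose prompt is shortened on the second attempt (via the `count` attribute) and which points to an external grammar. The form id `order`, the field name `drink`, the prompt wording, and the grammar file `drink.grxml` are all hypothetical.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_dialog() -> ET.Element:
    """Build a one-field VoiceXML 2.0 dialog (all names are hypothetical)."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "order"})
    field = ET.SubElement(form, "field", {"name": "drink"})

    # Full prompt on the first attempt, shorter prompt from the second attempt on.
    ET.SubElement(field, "prompt", {"count": "1"}).text = (
        "Welcome. Would you like coffee or tea?"
    )
    ET.SubElement(field, "prompt", {"count": "2"}).text = "Coffee or tea?"

    # Reference to an external SRGS grammar listing the allowed answers.
    ET.SubElement(
        field, "grammar", {"src": "drink.grxml", "type": "application/srgs+xml"}
    )

    # Spoken confirmation once the field has been filled.
    filled = ET.SubElement(field, "filled")
    reply = ET.SubElement(filled, "prompt")
    reply.text = "You chose "
    value = ET.SubElement(reply, "value", {"expr": "drink"})
    value.tail = "."
    return vxml


dialog = build_dialog()
print(ET.tostring(dialog, encoding="unicode"))
```

None of these elements has an HTML counterpart, which is the point the slide makes: prompts, grammars, and synthesis hints belong to a dialog-oriented language rather than a visual one.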
15.
16. SRGS 1.0 - for specifying the grammar of each user input to
a speech application.
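
As an illustration of what such a grammar contains, here is a minimal Python sketch that builds a grammar in the XML form of SRGS and checks utterances against it. The rule name `drink` and the phrases are hypothetical, and `accepts` is a toy matcher written for this example. A real recognizer works on audio and supports far richer rule structures.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"


def build_grammar(rule_id: str, phrases: list[str], lang: str = "en-US") -> ET.Element:
    """Build an SRGS XML grammar whose root rule is a list of alternatives."""
    grammar = ET.Element(
        "grammar",
        {
            "version": "1.0",
            "xmlns": SRGS_NS,
            "xml:lang": lang,
            "mode": "voice",
            "root": rule_id,
        },
    )
    rule = ET.SubElement(grammar, "rule", {"id": rule_id})
    one_of = ET.SubElement(rule, "one-of")
    for phrase in phrases:
        ET.SubElement(one_of, "item").text = phrase
    return grammar


def accepts(grammar: ET.Element, utterance: str) -> bool:
    """Toy check: does the (already transcribed) utterance match an alternative?"""
    normalized = " ".join(utterance.lower().split())
    return any(item.text.lower() == normalized for item in grammar.iter("item"))


grammar = build_grammar("drink", ["coffee", "tea", "hot chocolate"])
print(ET.tostring(grammar, encoding="unicode"))
print(accepts(grammar, "Hot  Chocolate"))
print(accepts(grammar, "lemonade"))
```

Restricting each input to a small set of alternatives is what makes recognition tractable: the recognizer only has to choose among the phrases the grammar allows at that point in the dialog.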
17. SSML 1.0 - for specifying the rendering of synthesized
speech to the user.
SSML 1.1 - an enhancement of SSML 1.0 for better
support of the world's languages, including Asian,
Eastern European, and Middle Eastern languages.
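
To show what "rendering" instructions look like in practice, the following Python sketch assembles a short SSML 1.0 document containing an emphasized word and a pause. The sentence and the helper name `build_ssml` are hypothetical and chosen only for illustration. How a given synthesizer actually renders the markup varies by engine.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(before: str, stressed: str, pause_ms: int, after: str) -> ET.Element:
    """Build an SSML document: text, an emphasized word, a pause, more text."""
    speak = ET.Element(
        "speak", {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"}
    )
    speak.text = before

    # Ask the synthesizer to stress this word.
    emphasis = ET.SubElement(speak, "emphasis")
    emphasis.text = stressed

    # Ask the synthesizer to pause before continuing.
    pause = ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    pause.tail = after
    return speak


ssml = build_ssml("Your flight leaves at ", "nine", 500, " from gate twelve.")
print(ET.tostring(ssml, encoding="unicode"))
```

The markup carries no audio itself. It only tells the text-to-speech engine how to say the text, which is why the same document can be rendered by different synthesizers.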
18. CCXML - for specifying call control functions.
SCXML - an execution environment based on CCXML and
Harel State Tables.
19. SISR 1.0 - for specifying how the text output by a
speech recognizer is translated into a semantic interpretation.
PLS 1.0 - a syntax for specifying pronunciation
lexicons to be used by speech recognition and speech
synthesis.
20.
21.
22. It can be divided into three categories:
Web Browsing
Limited Information Access
Spoken Dialog Systems
23. Browse any web page using speech input.
Parsing for voice recognition is done
when the page is accessed.
May or may not produce voice feedback.
24. Provides useful information in limited domains, such as
the weather in a city or stock updates.
Audio feedback.
25. A client-server architecture is used.
A Java applet (the client) connects to a remote
server.
Example: connecting to email servers.
26. Voice is a very natural user interface that speeds up
browsing.
Lower space requirements.
Portable voice browsers can also be implemented.
Practical interface for functionally blind users.
Users can browse the web while keeping their hands and
eyes free for other tasks.
27. Voice browsing will become visual (multimodal).
Can be integrated into an OS.
Can be integrated into every application.
28. Browser technology is changing rapidly,
and we are moving from the visual paradigm to the
voice paradigm.
The voice browser is the technology that enables this
shift.
A voice browser is a device that interprets voice input
and generates voice output.