Standards Update:

VoiceXML 3

Dan Burnett, Ph.D.
Dir. of Speech Technologies, Voxeo
(Dir. of Standards, Voxeo)
Voxeo on Standards

        Develop ahead of standards

        Make it Open Source




        Lead in standards creation

        Lead in standards adoption
© Voxeo Corporation
Past Leadership
    W3C
      •  VoiceXML 2.0/2.1, SRGS 1.0, SISR
         1.0, SSML 1.0
      •  CCXML 1.0, SCXML 1.0, EMMA 1.0

    IETF
      •  MRCPv1 extensions, MRCPv2,
            P-charge-info, SIP security




Where we are now
    W3C
      •  VoiceXML 3, SSML 1.1, Pronunciation
            Alphabet Registry, Speech in HTML 5
      •  CCXML 1.0, SCXML 1.0, EMMA next, MMI
            architecture

    IETF, 3GPP
      •  MRCPv2, XMPP (incl. multi-party Jingle and
            multiple chat), Media Control, SIP Overload,
            SIPREC, CODEC (Speex)

    JCP
      •  JSR 289, 309 – SIP servlets, media control
      •  JSR 154, 254 – Java servlets and servlet
         pages
      •  XMPP SIP servlet – submitting to JCP
VoiceXML

        VoiceXML 1.0 (2000)
        VoiceXML 2.0 (2004)
        VoiceXML 2.1 (2007)
        VoiceXML 3 (2010)
V3 Motivations

        FIA flexibility

        New features

        Extensibility

        Better integration with other W3C languages




V3 is . . .

        a restructured core

        some new features

        convenience elements to mimic VoiceXML 2.1




V3 Architecture

        Core functionality defined in modules

        Modules combined with convenience syntax into
         profiles




Core functionality defined in modules




        Module behavior defined precisely as state
         machines



Modules + Conv. Syntax = Profiles




        Modules grouped into profiles
        Legacy (V2.1), Basic, Maximal
        Convenience syntax simplifies authoring

Convenience Syntax

        New elements and attributes, but no new
         functionality

        Behavior defined in terms of core functionality

        For example, <menu> defined in terms of
         <form> with grammars and prompts
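
The idea that <menu> adds no new functionality can be sketched as a mechanical rewrite. The Python below is only an illustration, not the definition used by the specification: it assumes VoiceXML 2.x-style <menu>, <choice>, <option>, <filled>, <if>, <elseif>, and <goto> elements, and the function name expand_menu and the field name "selection" are hypothetical.

```python
import xml.etree.ElementTree as ET


def expand_menu(menu: ET.Element) -> ET.Element:
    """Rewrite a <menu> as a <form> with one <field> (illustrative only)."""
    form = ET.Element("form")
    if "id" in menu.attrib:
        form.set("id", menu.attrib["id"])
    # "selection" is a hypothetical field name chosen for this sketch.
    field = ET.SubElement(form, "field", name="selection")

    # Prompts carry over unchanged.
    for prompt in menu.findall("prompt"):
        field.append(prompt)

    # Each <choice> contributes one recognizable <option> (the grammar part).
    choices = menu.findall("choice")
    for choice in choices:
        label = (choice.text or "").strip()
        option = ET.SubElement(field, "option", value=label)
        option.text = label

    # Once the field is filled, branch to the matching choice's target.
    # Labels containing a quote character are not handled in this sketch.
    filled = ET.SubElement(field, "filled")
    branch = None
    for choice in choices:
        label = (choice.text or "").strip()
        cond = f"selection == '{label}'"
        if branch is None:
            branch = ET.SubElement(filled, "if", cond=cond)
        else:
            ET.SubElement(branch, "elseif", cond=cond)
        ET.SubElement(branch, "goto", next=choice.get("next", ""))
    return form


menu = ET.fromstring(
    '<menu id="main"><prompt>Say sports or news.</prompt>'
    '<choice next="#sports">sports</choice>'
    '<choice next="#news">news</choice></menu>'
)
print(ET.tostring(expand_menu(menu), encoding="unicode"))
```

The output is an ordinary form: the prompt is reused, the choices become the field's recognizable options, and the transitions move into the <filled> handler.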




Convenience Syntax

        Definite candidates are
          •  menu/choice/enumerate/option
          •  error/help/noinput/nomatch shortcuts
          •  link

        Possible (but different) candidates might be
          •  if/else/elseif (using SCXML)
          •  transfer (using CCXML)




New Stuff

        New media, SIV functions

        Session root documents

        Real-time controls

        Author-specifiable transition controllers

        V2 eventing model now async & compatible
         with DOM Level 3



New Functionality – Video 

        Video -- <audio> replaced by <media>, which
         allows both audio and video


       <media type="audio/x-wav" src="http://www.example.com/resource.wav"/>

       <media type="video/3gpp" src="http://www.example.com/resource.3gp"/>


       <media>     <!-- inline SSML with audio media fallback -->
        <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">
          Ich bin ein Berliner.
        </speak>
        <media type="audio/x-wav" src="ichbineinberliner.wav"/>
       </media>




New Functionality – Media Control 

        Media control -- media clipping, speed, and
         volume control now possible without resorting
         to SSML


  <media type="audio/x-wav" soundLevel="+6.0dB" speed="50%" repeatcount="2"
         src="http://www.example.com/resource.wav"/>

  <media type="video/3gpp" clipBegin="2s" clipEnd="5s" repeatDur="25s"
         src="http://www.example.com/resource.3gp"/>




New Functionality – SIV 

        SIV – speaker authentication capabilities
         available as core functionality
          •  Enrollment – creates voice model, associates it with
                id in speaker database
          •  Identification – which voice model in speaker
                database is a match for the speech?
          •  Verification – for the claimed id,
                does the speech match the voice
                model in the speaker database?
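
The difference between the three SIV operations can be sketched with a toy speaker database. Everything here is an assumption made for illustration: voice models are plain feature vectors, matching is cosine similarity against a fixed threshold, and the names SpeakerDatabase, enroll, identify, and verify are hypothetical. A real SIV engine works very differently; only the shape of the three operations is the point.

```python
import math
from typing import Dict, List, Optional

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerDatabase:
    """Toy speaker database mapping ids to voice models (hypothetical)."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.models: Dict[str, Vector] = {}
        self.threshold = threshold

    def enroll(self, speaker_id: str, samples: List[Vector]) -> None:
        # Enrollment: build a voice model (here, the mean of the samples)
        # and associate it with the id.
        dims = len(samples[0])
        self.models[speaker_id] = [
            sum(sample[i] for sample in samples) / len(samples)
            for i in range(dims)
        ]

    def identify(self, speech: Vector) -> Optional[str]:
        # Identification: which enrolled model, if any, best matches the speech?
        best_id, best_score = None, self.threshold
        for speaker_id, model in self.models.items():
            score = cosine(speech, model)
            if score >= best_score:
                best_id, best_score = speaker_id, score
        return best_id

    def verify(self, claimed_id: str, speech: Vector) -> bool:
        # Verification: does the speech match the model for the claimed id?
        model = self.models.get(claimed_id)
        return model is not None and cosine(speech, model) >= self.threshold
```

Identification searches every model in the database, while verification checks a single model chosen by the caller's claim.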



New Control – Session Root 

        Just like application root
          <vxml session="blahblah.vxml" ...>



        Well, not exactly
          •  If not specified, no session root
          •  Session root change is ignored or causes error


        First, let’s review application roots




Application Root Review



  A: <vxml>             AppRoot A

  B: <vxml>             AppRoot B

  C: <vxml root="B">    AppRoot B

  D: <vxml root="E">    AppRoot E

  F: <vxml root="E">    AppRoot E

  G: <vxml>             AppRoot G




Session Root



  A: <vxml>                                      No Session Root

  B: <vxml session="C">                          Session Root C

  D: <vxml>                                      Session Root C

  E: <vxml session="F">                          Session Root C

  G: <vxml session="H" requiresession="true">    error.badfetch
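
The contrast between the two kinds of root can be sketched in a few lines of Python. This is a reading of the two slides only, not of the specification: the application root is recomputed for every document, while the session root is fixed by the first document that names one, and a later change is either ignored or, with requiresession="true", raises error.badfetch. The names Doc, BadFetch, application_roots, and session_roots are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Doc(NamedTuple):
    name: str
    root: Optional[str] = None      # root="..." attribute, if present
    session: Optional[str] = None   # session="..." attribute, if present
    requiresession: bool = False    # requiresession="true"


class BadFetch(Exception):
    """Stands in for the error.badfetch event in this sketch."""


def application_roots(docs: List[Doc]) -> List[str]:
    # Each document picks its own application root; without a root
    # attribute the document acts as its own root.
    return [doc.root or doc.name for doc in docs]


def session_roots(docs: List[Doc]) -> List[Optional[str]]:
    current: Optional[str] = None
    seen: List[Optional[str]] = []
    for doc in docs:
        if doc.session is not None and doc.session != current:
            if current is None:
                current = doc.session   # first session root wins
            elif doc.requiresession:
                raise BadFetch(f"{doc.name} requires session root {doc.session}")
            # otherwise the attempted change is ignored
        seen.append(current)
    return seen
```

Feeding in the documents from the Session Root slide gives no session root for A, C for B, D, and E, and an error for G.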




Real-time Controls

        Special grammars that are always active (not just in
         the wait state)
          •  Allows arbitrary speech/dtmf
          •  Immediate: volume, speed, skip
          •  At next event processing: cancel, goto
         <form>
           <rtc grammar="digit3.grxml" action="volume" params="+5"/>
           <field name="a"> ... </field>
           <field name="b">
             <cancelrtc grammar="digit3.grxml"/>
             ... 
           </field>
         </form>

        Acts as pre-filter on input stream,
         replacing matches with silence
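
The pre-filter behavior can be sketched as follows. This is a simplified model under stated assumptions: the input stream is a list of already-tokenized inputs, a grammar is a set of tokens it accepts, digit3.grxml is assumed to match the DTMF digit "3", and only an immediate action (volume) is modeled. The names RealTimeControl, PlaybackState, prefilter, and SILENCE are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

SILENCE = "<silence>"


@dataclass
class PlaybackState:
    volume: int = 0


@dataclass
class RealTimeControl:
    grammar: Set[str]                        # tokens this control's grammar accepts
    action: Callable[[PlaybackState], None]  # immediate action to apply on a match


def volume_up(state: PlaybackState) -> None:
    # Mirrors action="volume" params="+5" from the slide's example.
    state.volume += 5


def prefilter(stream: List[str], controls: List[RealTimeControl],
              state: PlaybackState) -> List[str]:
    """Apply matching controls and hide their input from the rest of the dialog."""
    filtered: List[str] = []
    for token in stream:
        matched = next((c for c in controls if token in c.grammar), None)
        if matched is None:
            filtered.append(token)      # passes through to normal recognition
        else:
            matched.action(state)       # takes effect immediately
            filtered.append(SILENCE)    # the match is replaced with silence
    return filtered
```

With a single control on "3", the stream ["1", "3", "boston", "3"] raises the volume twice and reaches the form's own grammars as ["1", "<silence>", "boston", "<silence>"].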

Transition Controllers

        Inter-element transitions now under author
         control

        Controllers at form, document, application, and
         perhaps session levels
          •  e.g. form controller specifies which form item to
                execute next

        Controllers can be in SCXML or another flow
         control language

        Default controllers will give FIA behavior in
         Legacy Profile
Transition Controllers Example 1

    <!-- document-level transition controller controls inter-form transitions -->
    <vxml ...>
     <controller ...>
       <scxml:scxml version="1.0" ...>
         <!-- SCXML code determining which form to go to next -->
       </scxml:scxml>
     </controller>

      <form id="form_a" >
       ...
        <goto next="form_b"/>     <!-- goto is only a suggestion now -->
      </form>

     <form id="form_b" >
      ...
     </form>
     ...
    </vxml>
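
What "goto is only a suggestion" could mean in practice can be sketched like this. The model is an assumption made for illustration, not the controller interface of the specification: a document-level controller is a function that receives the form that just finished plus the target its <goto> suggested, and returns the form to run next. The names honor_goto, make_redirecting_controller, run_document, and the form form_c are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# (finished form, suggested target) -> form to run next, or None to stop
DocController = Callable[[str, Optional[str]], Optional[str]]


def honor_goto(finished: str, suggested: Optional[str]) -> Optional[str]:
    """Controller that always follows the suggestion."""
    return suggested


def make_redirecting_controller(overrides: Dict[str, str]) -> DocController:
    """Controller that replaces some suggested targets with its own choice."""
    def controller(finished: str, suggested: Optional[str]) -> Optional[str]:
        if suggested is None:
            return None
        return overrides.get(suggested, suggested)
    return controller


def run_document(start: str, gotos: Dict[str, str],
                 controller: DocController, max_steps: int = 20) -> List[str]:
    """Return the sequence of forms visited under the given controller."""
    path = [start]
    current = start
    for _ in range(max_steps):
        target = controller(current, gotos.get(current))
        if target is None:
            break
        path.append(target)
        current = target
    return path
```

With gotos = {"form_a": "form_b"}, honor_goto visits form_a then form_b, while a controller built with {"form_b": "form_c"} visits form_a then form_c: the author's <goto> is an input to the controller rather than the final word.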



Transition Controllers Example 2


  <!-- form-level transition controller controls inter-field transitions -->
  <vxml ...>
   <form>
     <controller src="myformbehavior.scxml"/>

     <field name="field_a" > ... </field>
     <field name="field_b" > ... </field>
     <field name="field_c" > ... </field>
     <field name="field_d" > ... </field>
   </form>
   ...
  </vxml>
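
A form-level controller decides which form item runs next. The sketch below contrasts a default that behaves roughly like the FIA (visit the first unfilled field in document order) with an author-supplied ordering. It is an illustration under assumptions: the real controller would be written in SCXML or another flow control language, and fia_controller, make_ordered_controller, run_form, and the collect callback are hypothetical names.

```python
from typing import Callable, Dict, List, Optional

Fields = Dict[str, Optional[str]]            # field name -> value, None if unfilled
Controller = Callable[[Fields], Optional[str]]


def fia_controller(fields: Fields) -> Optional[str]:
    """Default behavior: pick the first unfilled field in document order."""
    for name, value in fields.items():
        if value is None:
            return name
    return None


def make_ordered_controller(order: List[str]) -> Controller:
    """Author-specified behavior: visit unfilled fields in a custom order."""
    def controller(fields: Fields) -> Optional[str]:
        for name in order:
            if fields.get(name) is None:
                return name
        return None
    return controller


def run_form(fields: Fields, controller: Controller,
             collect: Callable[[str], str]) -> List[str]:
    """Run the form until the controller has nothing left to select."""
    visited: List[str] = []
    while (name := controller(fields)) is not None:
        fields[name] = collect(name)   # stands in for prompting and recognition
        visited.append(name)
    return visited
```

The form's fields stay the same in both cases; only the controller changes, which is what lets a Legacy profile keep FIA behavior by default.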




For More V3 Info

        Follow the work
          •  http://www.w3.org/Voice

        Check out our recent Developer Jam Session
          •  http://developers.voiceobjects.com/tech-topics/monthly-jam-sessions/

        Contact me
          •  dburnett at voxeo dot com


                               Dan Burnett, Ph.D.
                      Dir. of Speech Technologies, Voxeo
Voxeo Summit 2010: Standards Update: VoiceXML3
