Voice Enabled Desktop Interaction and Control System (VEDICS).

• Team Members:
 Nischal E Rao
 Bharat Joshi
 Suhas Kamath N
 Sharath M Puranik

• Project Guide: Prof. Shantharam Nayak
• Carried out at:
R.V. College of Engineering,
Bangalore, India.

• Voice Enabled Desktop Interaction and
Control System (VEDICS) is a software
solution for controlling the desktop system
using voice-based commands.

• The system takes audio signals as input,
processes and recognizes them, and executes
the desired action on the desktop system.

• All software products should incorporate
accessibility features to enable differently-abled
people to use the software easily and efficiently.

• For persons with physical disabilities, the ability
to simply talk to a computer could be a priceless
asset.

• Hands-free computing is more convenient than
conventional I/O.

• The user should be able to
o access any element present on the user’s screen.
o run common programs and applications.
o navigate through the file system.
o perform common window operations such as minimize,
maximize, and close.

• User commands should be easy to remember and use.

• The user must be able to turn the system on and off
whenever required.

• VEDICS follows the Model-View-Controller (MVC) design pattern.

• Any speech-to-text converter can be used
with VEDICS.

• VEDICS uses a feedback mechanism to learn what is
being displayed on the desktop.

• Accuracy increases because only words relevant to
the current screen are recognized.

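[Figure: VEDICS architecture. The Speech-to-Text Converter passes the recognized text to the Desktop Control System, which sends a command to the user’s desktop. The names of the currently visible objects flow back from the desktop, and the Desktop Control System supplies the converter with the grammar and the names of visible elements.]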

• Speech-to-text conversion

• Grammar and Dictionary are used to
convert sound signals into text.
[Diagram: the Speech-to-Text Converter reads the Grammar and the Dictionary.]

• The recognized text is given as input to
the Desktop Control System.
[Diagram: the Speech-to-Text Converter passes the text “Open Firefox” to the Desktop Control System.]

• The Desktop Control System determines
the command to execute on the desktop.
[Diagram: the Desktop Control System issues the Open_firefox command to the desktop.]
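The mapping from recognized text to a desktop command can be sketched as a small dispatch table. This is a minimal illustration, not the VEDICS implementation: the naming rule (first word capitalized, words joined by underscores) is inferred only from the "Open Firefox" to Open_firefox example, and `command_name`, `build_dispatch`, `execute`, and the "Close_window" entry are hypothetical. The actions only record what they would do instead of touching a real desktop.

```python
from typing import Callable, Dict, List


def command_name(recognized_text: str) -> str:
    """Turn recognized text such as "Open Firefox" into "Open_firefox".

    The naming rule is an assumption based on the single example in the slides.
    """
    words = recognized_text.split()
    if not words:
        raise ValueError("no words were recognized")
    return "_".join([words[0].capitalize()] + [w.lower() for w in words[1:]])


def build_dispatch(log: List[str]) -> Dict[str, Callable[[], None]]:
    """Hypothetical command table; each action appends a note to `log`."""
    return {
        "Open_firefox": lambda: log.append("launch firefox"),
        "Close_window": lambda: log.append("close active window"),
    }


def execute(recognized_text: str, dispatch: Dict[str, Callable[[], None]]) -> bool:
    """Run the action for the recognized text; return False if it is unknown."""
    action = dispatch.get(command_name(recognized_text))
    if action is None:
        return False
    action()
    return True


if __name__ == "__main__":
    actions: List[str] = []
    table = build_dispatch(actions)
    print(execute("Open Firefox", table), actions)
```

Keeping the table separate from the recognizer fits the slide's point that any speech-to-text converter can feed the Desktop Control System.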

• After successful execution, the names of
objects visible on the screen are collected.
[Diagram: the names “File”, “Edit”, and “Google” are collected from the desktop.]

• The collected names are used to update
the grammar and the dictionary files.
[Diagram: the names “File”, “Edit”, and “Google” are written to the Grammar and the Dictionary.]

• The updated grammar and dictionary files
are used in the next recognition cycle.
[Diagram: the Speech-to-Text Converter reads the updated Grammar and the updated Dictionary.]
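The recognition cycle described in these steps can be sketched as a loop in which each executed command changes the set of phrases the recognizer will accept next. This is a toy model under stated assumptions: `FakeDesktop`, `recognize`, and `run_cycle` are hypothetical names, the "recognizer" is plain string matching rather than Sphinx 4, and the screen contents are hard-coded from the example in the slides.

```python
from typing import Dict, List, Optional, Set


class FakeDesktop:
    """Hypothetical stand-in for the real desktop and its accessibility layer."""

    def __init__(self) -> None:
        # Which element names become visible after each executed phrase.
        self.screens: Dict[str, List[str]] = {
            "start": ["Open Firefox"],
            "Open Firefox": ["File", "Edit", "Google"],
        }
        self.current = "start"

    def visible_names(self) -> List[str]:
        return list(self.screens.get(self.current, []))

    def execute(self, phrase: str) -> None:
        self.current = phrase


def recognize(utterance: str, grammar: Set[str]) -> Optional[str]:
    """Accept an utterance only if it matches a phrase in the active grammar."""
    wanted = utterance.strip().lower()
    for phrase in grammar:
        if phrase.lower() == wanted:
            return phrase
    return None


def run_cycle(utterance: str, grammar: Set[str], desktop: FakeDesktop) -> Set[str]:
    """One cycle: recognize, execute, collect visible names, return the new grammar."""
    phrase = recognize(utterance, grammar)
    if phrase is None:
        return grammar  # nothing recognized, so the grammar is unchanged
    desktop.execute(phrase)
    return set(desktop.visible_names())


if __name__ == "__main__":
    desktop = FakeDesktop()
    grammar = set(desktop.visible_names())
    grammar = run_cycle("open firefox", grammar, desktop)
    print(sorted(grammar))
```

Because the grammar is rebuilt from the screen after every command, a phrase that was valid in one cycle can be rejected in the next, which is the context-aware behavior the slides describe.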

• VEDICS consists of the following parts:
o Sphinx 4 sub-system: Open-source tool used to convert
speech to text.

o Desktop Control sub-system: Maps the recognized text to the
corresponding command and executes it on the desktop. It also
re-creates the grammar file based on what is displayed on the screen.

o Logios tool: Used to generate a new dictionary based on
what is displayed on the screen.
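Re-creating the grammar file from the visible element names might look like the sketch below. It assumes the grammar is written in the Java Speech Grammar Format (JSGF), a format Sphinx 4 can read; the function name `build_jsgf`, the grammar name "vedics", the rule name `<element>`, and the choice to lowercase names and strip punctuation are all illustrative assumptions, not details taken from VEDICS.

```python
import re
from typing import Iterable, List, Set


def build_jsgf(names: Iterable[str], grammar_name: str = "vedics") -> str:
    """Build JSGF text with one public rule listing the visible element names."""
    seen: Set[str] = set()
    alternatives: List[str] = []
    for name in names:
        # Keep letters and digits only, so labels like "File..." stay valid tokens.
        cleaned = " ".join(re.findall(r"[A-Za-z0-9]+", name)).lower()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            alternatives.append(cleaned)
    if not alternatives:
        raise ValueError("no usable element names on screen")
    body = " | ".join(alternatives)
    return (
        "#JSGF V1.0;\n\n"
        f"grammar {grammar_name};\n\n"
        f"public <element> = {body};\n"
    )


if __name__ == "__main__":
    print(build_jsgf(["File", "Edit", "Google"]))
```

Every word that appears in the generated rule also needs a pronunciation entry in the dictionary, which is the part the slides assign to Logios.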

• The accuracy of VEDICS depends on the accuracy of Sphinx 4.
• Summary of the performance of Sphinx 4:

Parameter                                Performance
Vocabulary size                          79
Word error rate (%)                      1.192
RT ratio, single-CPU configuration*      0.25
RT ratio, dual-CPU configuration*        0.20

* RT ratio: ratio of the time taken to decode the utterance to the utterance duration.
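For readers unfamiliar with the word error rate metric, the sketch below computes it in the standard way: the minimum number of word substitutions, deletions, and insertions needed to turn the recognized text into the reference, divided by the number of reference words. The function name `word_error_rate` and the sample phrases are illustrative; this is not the evaluation code used to obtain the figure reported for Sphinx 4.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate in percent using word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("open firefox now", "open firefox"))
```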

• Increased accuracy due to the context-aware nature of
VEDICS.

• Use of small vocabulary further improves accuracy.

• Use of Logios enables recognition of custom words.
Words with any sequence of characters can be
recognized.

• Almost all components on the desktop are accessible.

• VEDICS can be used to perform most actions that can
be done using a pointing device.

• Using voice to access and control the desktop has many
advantages. This feature can be a boon to
differently-abled people.

• VEDICS can navigate through the file system, open
applications, control desktop windows, and recognize
almost any word.

• VEDICS is context-aware. It determines what
is currently being displayed on the desktop and
dynamically generates the grammar and the dictionary.

• Dictation facility: The ability to dictate into a text editor or
text field.

• Artificial Intelligence in VEDICS.

• If several objects on the screen share the same name,
the user should be able to select the right object.

• The user should be able to either pronounce the entire
word or spell individual characters of the word.

• Facility to add custom commands to suit the user.

• Screen Reader Facility.

Project Link: http://vedics.sourceforge.net/
