Introduction to Myanmar Text-To-Speech

Introduction to
Myanmar Text-To-Speech Engine
Ngwe Tun
Consultant
Yangon Education Center for Blinds

What is a Text-To-Speech Engine?
 A text-to-speech (TTS) system converts normal language text into speech.
 It is also known as a speech synthesizer.
 Example: Microsoft Sam.
 "The quick brown fox jumps over the lazy dog 1,234,567,890 times. soi"

Overview of Text-To-Speech Engine
 A text-to-speech system (or "engine") is composed of two parts: a front-end and a
back-end.
 The front-end converts raw text containing symbols like numbers and
abbreviations into the equivalent of written-out words. This process is often called
text normalization, pre-processing, or tokenization.
 The front-end then assigns phonetic transcriptions to each word, and divides and
marks the text into prosodic units, like phrases, clauses, and sentences.
 The process of assigning phonetic transcriptions to words is called text-to-
phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and
prosody information together make up the symbolic linguistic representation that
is output by the front-end.
 The back-end, often referred to as the synthesizer, then converts the symbolic
linguistic representation into sound. In certain systems, this part includes the
computation of the target prosody (pitch contour, phoneme durations), which is
then imposed on the output speech.
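
The text normalization step of the front-end can be illustrated with a minimal sketch. It expands digit strings and a few abbreviations into written-out English words. The abbreviation table and the function names are assumptions made for illustration only, and real engines handle many more cases (dates, currency, ordinals).

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
SCALES = [(1_000_000_000, "billion"), (1_000_000, "million"), (1_000, "thousand")]

# Hypothetical abbreviation table; a real front-end would use a larger one.
ABBREVIATIONS = {"Dr.": "doctor", "e.g.": "for example"}

# Digits with optional thousands separators, e.g. "1,234,567" or "42".
NUMBER = re.compile(r"\d{1,3}(?:,\d{3})+|\d+")


def number_to_words(n: int) -> str:
    """Spell out a non-negative integer in English words."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[rest] if rest else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " " + number_to_words(rest) if rest else ""
        return ONES[hundreds] + " hundred" + tail
    for value, name in SCALES:
        if n >= value:
            head, rest = divmod(n, value)
            tail = " " + number_to_words(rest) if rest else ""
            return number_to_words(head) + " " + name + tail
    raise AssertionError("unreachable: every n >= 1000 matches a scale")


def normalize(text: str) -> str:
    """Replace known abbreviations and digit strings with written-out words."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)
    return NUMBER.sub(
        lambda m: number_to_words(int(m.group().replace(",", ""))), text
    )
```

For example, `normalize("The lazy dog 1,234 times.")` rewrites the number as "one thousand two hundred thirty-four" before any phonetic transcription takes place.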

Features and Functions of
Text-to-Speech Software
 Text-to-speech (TTS) software tools are similar in that they speak text on a
computer. However, they vary widely in their functionality.
 Formatting text – Allows you to format digital text that you create, download
from the Internet, or scan into your computer, much as a word processing
program does.
 Speaking what you type – Speaks text as you type to give you support in
writing. Within this function there may be the ability to set the level of
support, such as speaking words or speaking each letter and then the word.

Speaking the Text
 Continuous reading – reads from where you choose to begin and stops
when it reaches the end of the text or you issue a stop command.
 Incremental reading – reads one increment of text, such as a word, sentence,
chunk/phrase, or paragraph, then stops and waits for you to request the next
increment.
 Highlighted text – reads only the text you highlight with the cursor. Some TTS
programs read the document from a starting point until the user stops the
program. Other TTS programs read only the text selected and highlighted by the
user.
 Voices – TTS software can use one voice or allow you to choose from a
selection of male, female, and even foreign-language voices.
 Reading speed – you can have text read faster or slower, set either in precise
words per minute or in speed increments.
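
As a rough sketch of incremental reading and reading speed, the generator below hands out one increment of text per request, and a helper estimates how long an increment takes at a given words-per-minute setting. The function names and the simple splitting rules are assumptions for illustration, not taken from any particular TTS product.

```python
import re
from typing import Iterator

# Simplified splitting rules per increment type (assumed for this sketch).
PATTERNS = {
    "word": r"\S+",
    "sentence": r"[^.!?]+[.!?]*",
    "paragraph": r"[^\n]+(?:\n[^\n]+)*",  # paragraphs are separated by blank lines
}


def increments(text: str, unit: str = "sentence") -> Iterator[str]:
    """Yield one increment at a time; the caller asks for the next one."""
    if unit not in PATTERNS:
        raise ValueError(f"unknown unit: {unit}")
    for match in re.finditer(PATTERNS[unit], text):
        chunk = match.group().strip()
        if chunk:
            yield chunk


def estimated_seconds(chunk: str, words_per_minute: int) -> float:
    """Approximate speaking time of a chunk at the chosen reading speed."""
    return len(chunk.split()) * 60.0 / words_per_minute
```

Calling `next()` on the generator corresponds to the user requesting another increment, while iterating over it without pausing corresponds to continuous reading.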

Who will be using speech synthesis?
 Screen readers are TTS software tools for users with visual disabilities. They not
only read text files but also give the user other audible navigation support, such
as reading the user interface, indicating where the user’s cursor is on the screen,
and indicating when the user’s cursor has passed over a folder.
 Text readers are commercial TTS software tools for users who read below grade
level because of a learning disability, English as a second language, a reading
disability, or low vision.
 Stephen Hawking is one of the most famous people using speech synthesis to
communicate.
 Mobile phones can read out incoming text such as SMS, e-mail and notifications.

Advantages of Text-To-Speech
 Listen to class notes, textbooks and electronic text.
 Facilitate education.
 Avoid eyestrain from too much reading.
 Make proofreading more effective.
 Learn English, Myanmar or other languages.
 Prepare for speeches by hearing your work read aloud.
 Listen to e-books or e-material during your commute.
 Amuse children by letting your PC read stories to them.
 Help seniors or those with vision problems.

Myanmar Language in the Digital Era
 Used on computers, mobile phones, MP3 players, watches and other electronic devices.
 Widely used in social networks and online content.
 Gives access to news, information and knowledge from international and local sources.
 Localization and rich Myanmar content in electronic form.

Why do we need Myanmar Text-To-Speech?
 Myanmar language users cannot use screen readers.
 Screen, file and text readers do not support Myanmar.
 Text-To-Speech features would make the Myanmar language easier to use for
every Myanmar computer user.
 The open source Myanmar Text-To-Speech Engine initiative will empower
other software vendors who want to develop with it or integrate it into their
applications. For example, a mobile phone manufacturer can integrate Myanmar
language support easily, without reinventing the wheel.
 Learning the Myanmar language will be easier with a text reader.

How shall we develop Myanmar TTS?
 Learn
     Define the scope of work
     Collect digital assets
     Discover the best tool to build the TTS and plan for the future
 Develop
     Develop the Myanmar language model/tokenization and grapheme-to-sound conversion
     Train the TTS engine
     Test the TTS engine internally and through public review
 Enhance
     Review the work plan, look for improvements and apply the feedback
     Invite TTS specialists and improve the engine with third-party opinions
     Develop tools and apply them in real environments (e.g. audio books)

Open Source Model of Myanmar TTS
 Everyone can participate in the development team.
 Anyone can help guide the project.
 Anyone can contribute ideas.
 The TTS engine for the Myanmar language follows an open source model.
 We realized that 1 consultant, 1 project leader and 3 developers cannot
complete the Myanmar TTS on their own.
 We need you, your feedback and your contribution.

The purpose of a Text-to-Speech system
 To convert any text into natural sounding speech.
 First, text needs to be normalized. Normalization is the process of
transforming text into a single canonical form; the text is then parsed into
individual tokens.
 Next, the text-to-speech system assigns the appropriate phonetic
transcriptions to each word which reflect how text should be pronounced in
any given natural language. The synthesizer then converts the symbolic
linguistic representations into sound.
 The last step is to choose the right speech units, which ensures the high quality
and natural sound of the generated speech.
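
The tokenization and phonetic transcription steps described above can be sketched with a simple lexicon lookup. The tiny ARPAbet-style lexicon, the `Transcription` type and the function names are hypothetical and stand in for a full pronunciation dictionary plus grapheme-to-phoneme rules.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical toy pronunciation lexicon (ARPAbet-style symbols).
LEXICON: Dict[str, List[str]] = {
    "the": ["DH", "AH"],
    "quick": ["K", "W", "IH", "K"],
    "fox": ["F", "AA", "K", "S"],
}


@dataclass
class Transcription:
    word: str
    phonemes: List[str]
    in_lexicon: bool  # False means a grapheme-to-phoneme fallback is still needed


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def transcribe(text: str, lexicon: Dict[str, List[str]] = LEXICON) -> List[Transcription]:
    """Assign a phonetic transcription to every token found in the lexicon."""
    result = []
    for word in tokenize(text):
        phonemes = lexicon.get(word)
        result.append(Transcription(word, phonemes or [], phonemes is not None))
    return result
```

Tokens flagged with `in_lexicon=False` mark where a real system would apply letter-to-sound rules rather than leave the transcription empty.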

Architecture of Myanmar TTS
 The minimal unit of sound will be the syllable or syllable chain.
     က ကာ ကား ကိ ကီ ကီး
     မန္တ သစ္စာ ဥမ္မာ
 Word segmentation or tokenization
     ကလေးတွေကျောင်းကိုကားဖြင့်သွားခဲ့ကြသည်။
     ကလေး / တွေ / ကျောင်း / ကို / ကား / ဖြင့် / သွား / ခဲ့ကြသည် / ။
 Syllable sounds are concatenated to compose word sounds.
     က+လေး / တွေ / ကျောင်း / ကို / ကား / ဖြင့် / သွား / ခဲ့+ကြ+သည် / ။
 Speed and intonation need to be adjusted between syllables and words.
     ကလေးတွေ / ကျောင်းကို / ကားဖြင့် / သွားခဲ့ကြသည်။
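
The segmentation and concatenation steps above can be sketched as follows. This is a simplified illustration and not the project's implementation: the break rule is a common regular-expression heuristic for Unicode-encoded Myanmar text, the two-word lexicon and the integer "samples" are hypothetical, the sample sentence is an assumed clean Unicode rendering of the slide's example, and prosody adjustment is reduced to a fixed pause between words.

```python
import re
from typing import Dict, List, Set

# A new unit starts at a consonant (U+1000..U+1021) unless it is stacked under
# a virama (U+1039) or closes the previous syllable, i.e. is followed by asat
# (U+103A) or virama, optionally after a dot below (U+1037). Independent
# vowels, digits and punctuation also start a new unit.
SYLLABLE_START = re.compile(
    "(?<!\u1039)[\u1000-\u1021](?!\u1037?[\u103a\u1039])"
    "|[\u1023-\u102a\u103f-\u104f]"
)


def syllables(text: str) -> List[str]:
    """Split Myanmar text into syllables (stacked words stay as one chain)."""
    starts = [m.start() for m in SYLLABLE_START.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(text)]
    pieces = (text[a:b].strip() for a, b in zip(bounds, bounds[1:]))
    return [p for p in pieces if p]


def words(sylls: List[str], lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Greedy longest-match grouping of syllables into dictionary words."""
    out, i = [], 0
    while i < len(sylls):
        for n in range(min(max_len, len(sylls) - i), 0, -1):
            candidate = "".join(sylls[i:i + n])
            if n == 1 or candidate in lexicon:  # a single syllable always passes
                out.append(candidate)
                i += n
                break
    return out


def synthesize(word_list: List[str], inventory: Dict[str, List[int]],
               word_gap: int = 3) -> List[int]:
    """Concatenate per-syllable samples, with silence between words."""
    samples: List[int] = []
    for k, word in enumerate(word_list):
        if k:
            samples.extend([0] * word_gap)  # crude stand-in for prosody control
        for syllable in syllables(word):
            samples.extend(inventory[syllable])
    return samples


if __name__ == "__main__":
    sentence = "ကလေးတွေကျောင်းကိုကားဖြင့်သွားခဲ့ကြသည်။"
    lexicon = {"ကလေး", "ခဲ့ကြသည်"}  # hypothetical two-word dictionary
    print(" / ".join(words(syllables(sentence), lexicon)))
```

Running the script prints the sentence with word boundaries marked by " / ". A syllable-level inventory keeps the set of recorded units small, at the cost of needing separate handling for speed and intonation across syllable and word boundaries.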

Application of Myanmar TTS
 The longest-standing application has been screen readers for people with
visual impairment.
 TTS is also commonly used by people with dyslexia and other reading
difficulties, as well as by pre-literate children.
 Speech synthesis techniques are also used in entertainment productions such
as games and animations.
 Text-to-speech communication aids for people with disabilities have
become widely deployed in mass transit.
 Text-to-speech is also used in second language acquisition.

Dream for the Myanmar TTS
 Text-To-Speech Engine integrated with mobile phones, computers and
electronic devices.
 Voice command integrated with mobile phones, computers and electronic
devices.
 Everyone can have any Myanmar news, information and electronic text read
by a screen reader.
 Text-To-Speech Engine powering public announcements and weather
notifications.
 Screen reader functions integrated with OCR, so that even images can be read
aloud.

Thanks for being here and participating
in the Project
 Sponsorship of the TTS Project by KBZ Group of Companies
 Great arrangements by Yangon Education Center for Blind
 Contribution of knowledge by several people
 Last but not least, a warm welcome to future contributors.
