© 2021 Bernard Marr, Bernard Marr & Co. All rights reserved
HOW AI CREATES SYNTHETIC SPEECH
Talking machines are getting more and more sophisticated, and with the
help of AI and machine learning, it is now possible to create high-quality,
customizable synthetic speech.
Having machines turn text into speech is nothing new.
Professor Stephen Hawking communicated with a computerized voice for many years, and by
now, we're used to our GPS devices or smart speakers asking questions and responding to our
queries.
What is different these days is the quality of synthesized speech, which is improving as
several companies use AI to create voice skins that give enterprises and content creators
more options for turning text into speech.
LOVO, an AI voice and synthetic speech startup, offers a voiceover API that turns text into
speech in real time, drawing on a “voice library” of 200+ human-like voices in 33 languages. Users
can also clone their own voices to create custom skins simply by reading a script for 15 minutes.
LOVO recently announced the close of a $4.5 million pre-Series A round led by South Korea's
Kakao Entertainment.
WHAT IS AI SPEECH SYNTHESIS?
Speech synthesis is simply the computer-generated production of audible human words.
Traditional text-to-speech robotic voices you hear on software or hardware products like
Amazon Echo, Google Home, your GPS, or your ebook reader are fast and cheap for
companies to create, but they can also be unoriginal and unrealistic.
Artificial intelligence or AI voice operates a little differently. AI voice uses deep learning to
create higher-quality synthetic speech that more accurately mimics the pitch, tone, and pace
of a real human voice.
For example, if you want to use LOVO AI to generate synthetic speech, you can upload a script
that you want to turn into audio content, then choose one of the voices in its library based
on language, style, and character. With the click of a button, LOVO turns your script into audio
that sounds pretty lifelike.
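The workflow described above (choose a voice by language and style, then submit a script) can be sketched in Python. This is only an illustrative sketch: the `Voice` class, the sample library entries, and the `find_voices` and `build_request` helpers are all hypothetical and do not represent LOVO's actual API or voice catalog. No network call is made; the code only shows how a request for such a service might be assembled.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """One entry in a hypothetical voice library."""
    name: str
    language: str
    style: str


# Made-up sample data standing in for a real voice library.
VOICE_LIBRARY = [
    Voice("Ava", "en-US", "narration"),
    Voice("Min-jun", "ko-KR", "conversational"),
    Voice("Lucia", "es-ES", "narration"),
]


def find_voices(library, language, style=None):
    """Return voices matching the language and, if given, the style."""
    return [
        v for v in library
        if v.language == language and (style is None or v.style == style)
    ]


def build_request(script, voice):
    """Assemble a JSON payload for a hypothetical text-to-speech endpoint."""
    if not script.strip():
        raise ValueError("script must not be empty")
    return json.dumps(
        {"text": script, "voice": voice.name, "language": voice.language}
    )


# Pick an English narration voice and prepare a request for a short script.
matches = find_voices(VOICE_LIBRARY, "en-US", "narration")
payload = build_request("Welcome to the course.", matches[0])
print(payload)
```

A real service would add authentication, an audio format choice, and an HTTP call; those details vary by provider and are left out here.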
You can also clone your own voice by reading a short script, and LOVO will generate a
custom voice skin you can use over and over again for videos, audiobooks, or anything else
that requires voiceover.
Will AI voice technology replace voiceover professionals? Tom Lee, Co-founder and COO of
LOVO, says no.
“I believe that isn’t going to happen. If you think about how humans and how AIs work,
we can complement each other. As a voice actor, you can only do 6 or 7 hours of work a
day. You can't work 24/7, and you want to focus your energy on the most important
gigs, or maybe you want to have a day job, and then you want your AI voice to make
money while you sleep. You can record once with us, then take the revenue shares. One
of our most famous voices is raking in a couple of grand a month without doing any
work.”
THE MANY POTENTIAL USES OF SYNTHETIC SPEECH
AI voice has a myriad of use cases, including:
Translation: Papercup is using AI voice to translate videos by generating voices that sound like
the original speaker.
Video or audio ads: You can upload a script and create an ad without the added expense and
time involved in hiring a voiceover artist. Descript has a collaborative audio/video editor that
works just like a regular Word document.
E-learning (for kids, or for corporate training): Teachers and trainers will be able to make written
materials more accessible for different types of learners with the help of AI voice automation.
Augmented reality and virtual reality: With the AR and VR markets exploding right
now, there is a huge need for realistic, authentic human voices for apps and
The global text-to-speech (TTS) market is estimated to reach $5.0 billion by 2026,
according to marketsandmarkets.com – so the sky's the limit for this exciting new
technology.
To find out more about the latest trends in AI and machine learning, check out
my website or subscribe to my YouTube channel.
Bernard Marr is an internationally best-selling author, popular keynote speaker,
futurist, and a strategic business & technology advisor to governments and
companies. He helps organisations improve their business performance, use data
more intelligently, and understand the implications of new technologies such as
artificial intelligence, big data, blockchains, and the Internet of Things.
LinkedIn has ranked Bernard as one of the world’s top 5 business influencers. He is
a frequent contributor to the World Economic Forum and writes a regular column for
Forbes. Every day Bernard actively engages his 1.5 million social media followers
and shares content that reaches millions of readers.