1. eTranscriber Transcriptions | Online Transcriber
Google Voice Transcription vs. Humans
In the first week of May 2010, Google announced the worldwide release of its YouTube video transcription
service. A beta version had been released in mid-2009, but it was available only to a select few
universities, news broadcasters and government agencies.
The history of speech recognition technology dates back to the late 1930s, when AT&T Bell Laboratories
developed a primitive device that could recognize speech. Researchers knew that the widespread use of
speech recognition would depend on the ability to perceive subtle and complex verbal input accurately and
consistently. But because the computing technology of the time was not powerful enough, the development of
speech recognition proceeded at a snail's pace.
Fifty years down the line, the capabilities of many digital electronic devices had surpassed even the best and
costliest technologies of the 1930s, thanks to breakthroughs in chip and semiconductor fabrication. The
largest barriers to the speed and accuracy of speech recognition, computer speed and power, were no longer
an issue.
With more computing power (measured in FLOPS, floating-point operations per second) than the computer
scientists of the 1930s could have imagined, programmers could now develop algorithms to encode and decode a
multitude of voice patterns. In practice, they could build a database of thousands of different voice patterns,
convert them into digital sine waves and analyze words based on the mathematics of voice pattern signals.
Over time, as speech-to-text technologies became usable, many companies started offering voice recognition
to their customers. Examples include Dragon Dictation, Microsoft (XP, Vista), Google Voice and offerings from
niche companies.
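The idea of describing a voice signal as a sum of sine waves can be sketched with a discrete Fourier
transform. This is only a minimal illustration, not the method used by any of the products named above. The
function names, the 8000 Hz sample rate and the synthetic 440 Hz tone are assumptions made for the example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the strength of each sine-wave component (frequency bin)
    in the lower half of the spectrum, using a plain discrete Fourier transform."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        magnitudes.append(abs(acc))
    return magnitudes


def dominant_frequency(samples, sample_rate):
    """Return the frequency (in Hz) of the strongest sine-wave component."""
    magnitudes = dft_magnitudes(samples)
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * sample_rate / len(samples)


# Hypothetical input: 0.05 seconds of a pure 440 Hz tone standing in for a voice signal.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(400)]

print(dominant_frequency(tone, SAMPLE_RATE))
```

Real speech contains many such components that change over time, so a recognizer would analyze short
overlapping slices of the signal rather than one tone.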
So the question arises: how reliable are these technologies, particularly Google's YouTube
transcription, and will they ever match, if not surpass, the accuracy of human transcription?
Those who like to watch YouTube videos with captions turned on may have noticed that the accuracy of the
captions has increased severalfold over the past few months. The accuracy is going up day by day and should
only improve as more people use the service. As Eric Schmidt, CEO of Google Inc., says: “Our Google voice will
improve over a period of time as more and more users use it, it’s a self learning technology.”
But a few major flaws can still be foreseen, even in a self-learning technology:
eTranscriber Transcription Services | www.etranscriber.net | Academic Transcriber Services
1. Accurate captioning is possible only when the speaker speaks very clearly and
distinctly.
2. The environment has to be free from any sort of disturbance.
3. Errors creep in because of similar-sounding words, such as sky and high. When they are spoken quickly, the
system is not able to differentiate between the two.
4. Interjections: people often pause or make thinking sounds during speeches, such as
“uh”, “hmm” and “ahh”. The recognition software makes an effort to transcribe these as well, at times
with hilarious results. (Search YouTube for “Hilarious Google voice transcription”.)
And finally comes the biggest downside of them all:
5. Psychological satisfaction: after the captioning has been done by the Google robots, can the
uploader be sure of its accuracy? Clearly, the transcribed captions would need to be
thoroughly checked for errors and proofread several times. This means going through the whole video
several times, manually correcting the words and the grammar, including commas,
hyphens, quotes etc., and then uploading the corrected captions. A very time-consuming process.
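One common way to put a number on how much correction a machine caption needs is the word error rate: the
number of word substitutions, insertions and deletions required to turn the caption into the corrected
transcript, divided by the length of the corrected transcript. The sketch below is illustrative only. The
function name and the sample sentences are made up, and it reuses the sky/high confusion from point 3.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between a corrected transcript (reference)
    and a machine caption (hypothesis), divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one misheard word out of five.
human = "the sky is clear today"
machine = "the high is clear today"
print(word_error_rate(human, machine))  # 0.2
```

A rate of 0.2 means one word in five had to be fixed by hand, which gives an uploader a rough sense of how
much proofreading a caption track still requires.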
So what is the ultimate solution for transcribing files, if not voice-to-text recognition technology?
The answer is simple, and it is the way digital and analog files have been transcribed for the past 50 years: humans.
Can speech recognition technologies ever surpass human transcribing abilities?