Accuracy Still a Problem for Google’s Ears

June 30, 2009 BY JOSH MILLER
Updated: January 4, 2018

As we’ve discussed, speech recognition can be a very powerful tool.  But it can’t quite complete the transcription process on its own.  There is still a gap between what it can do and what a high-quality, readable transcript requires.  Many have tried to conquer this automated linguistic feat, including Google.

On Friday, David Gallagher of the NY Times started a discussion on Google’s new Google Voice app, which lets users have their voicemail transcribed into text automatically.  The app uses an automatic speech recognizer to convert the spoken message into a friendly email format.  While some might expect Google to be able to put the speech recognition puzzle together, even Google Voice gives us some entertaining reading material.  Yesterday, Mr. Gallagher posted the results of his Google Voice testing.

As much as speech and AI experts try to model the human voice, only a human ear can pick up all the tiny nuances of speech.  From dialect to tone to context, so much can go wrong so fast with a machine.  Speech recognition has a lot to offer, but there has to be a way for a human to be part of the process to ensure quality.  And if search or ad delivery is part of the equation (as we might guess with our Google friends), you can imagine what happens to those results when they are built on an inaccurate transcript.

