On Accuracy, Part 2: What does accurate even mean?

June 11, 2009 BY JOSH MILLER
Updated: January 4, 2018

A couple of weeks ago, CJ laid out how detrimental just a few percentage points of inaccuracy can be to the integrity of a transcript.  Even at 99% accuracy, the remaining errors can jeopardize the true quality of the result.  Pretty startling really.  But the question remains: what does 99% accuracy even mean?

Speech and writing are different means of communication.  For example, writing has the advantage of being independent of time.  It can be revised several times before it is made public in what is essentially a final draft format.  Speech, on the other hand, is a one-swing event.  First draft and that’s it – what’s said is said.  Naturally, people have developed mechanisms to give their brains a little more time to polish their speech.  Fillers, including “um”, “uh”, and “you know”, or elongated sounds are common examples of these time-buying speech patterns.  In addition, we don’t speak with proper sentence structure.  Filler words allow us to talk in run-on sentences without even hesitating.  As listeners, we’ve even trained ourselves to filter out a lot of these sounds to capture a seemingly clean delivery from the speaker.  Imagine if the next book you read was filled with false sentence starts and “you knows” – you’d go insane.

So should a written transcript capture every single utterance or should it be edited for a reading audience?  Should accuracy be measured on every sound that comes out of a speaker’s mouth or should it be based on a cleaned-up representation that captures all the intended content?  All of a sudden, the objective measure that so many people want to use turns out to be extremely subjective.  For reference, many transcription firms guarantee an accuracy rate of 98%.

We call capturing every single utterance a “verbatim” transcript.  As such, we would capture every single “um”, “uh”, stutter, interrupting speaker change of “uh-huh”, and so on.  Very frustrating to read.  But in a way, easier to measure.  You either have the sound written down or you don’t.

Most customers prefer what we call a “clean read.”  This is the case where we cut out the stutters and unnecessary filler words.  The most important part here is to preserve the meaning of every single sentence.  But how can one measure accuracy in objective terms for a process that calls for subjective editing?

There is a significant difference between the two methods, and this is just a brief excerpt.  Over the course of a one hour interview, the gap in word counts will dramatically widen.

We tend to avoid throwing guaranteed accuracy rates around because we realize just how difficult it is to measure.  What if a transcript really is 98% accurate, but the 2% of mistakes happen to be critical words within sentences, resulting in lost meaning?
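To make the measurement problem concrete, here is a minimal sketch in Python.  It assumes that "accuracy" means word-level accuracy (one minus the word-level edit distance divided by the number of reference words), which is a common convention but not necessarily the one any particular firm uses.  The sample sentences and the function names `word_errors` and `word_accuracy` are made up for illustration.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance: insertions, deletions, and substitutions."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing
                            curr[j - 1] + 1,      # extra word in transcript
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed metric: 1 - (word errors / number of reference words)."""
    return 1.0 - word_errors(reference, hypothesis) / len(reference.split())


# Hypothetical sample: the same spoken sentence, written two ways.
verbatim = "um so we uh we shipped the the update on you know friday"
clean = "so we shipped the update on friday"

# A perfect clean read scores 100% against the clean reference,
# but only about 54% against the verbatim one (6 of 13 words "missing").
print(word_accuracy(clean, clean))
print(word_accuracy(verbatim, clean))

# Two verbatim transcripts with exactly one error each get the same score,
# even though only the second one changes the meaning.
missed_filler = "so we uh we shipped the the update on you know friday"
wrong_noun = "um so we uh we shipped the the update on you know monday"
print(word_accuracy(verbatim, missed_filler))
print(word_accuracy(verbatim, wrong_noun))
```

Under these assumptions, the score depends as much on which reference you pick as on the quality of the transcript, and a dropped "um" costs exactly as much as a wrong day of the week.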

We firmly believe that it is our responsibility to provide an output with the critical content intact.  We’d rather miss a filler word than mistype a noun.  Stated rate or not, the work we do is high quality and consistent.  One of my favorite quotes from one of our customers is, “this is better than what I get back from my copy editor!”

MIT has a reputation for throwing numbers at every problem imaginable.  In fact, every course, room, building, and student is identified by number.  But in reality, MIT excels at teaching how to use quantitative models effectively, as well as when these models break down in the real world.  Transcription accuracy is one of those cases where the numbers cannot tell the entire story.

Besides, if you’re worried that we’re not comfortable using numbers, just remember that we put one in our name.
