How Accurate is Your Transcription & Subtitling Service?
Updated: January 4, 2018
Accuracy is often the most important quality to look for when hiring a video transcription and subtitling service. If you’re going to pay to outsource your transcription, you deserve an accurate transcript.
Whether you’re a media broadcaster who needs to meet certain FCC standards for closed caption accuracy, an educator who needs maximum accuracy for accessibility reasons, or someone who simply wants to avoid embarrassing caption errors, accuracy matters.
When choosing a captioning vendor, find a company that guarantees a transcript accuracy rate of 99% or higher. Investigate how the transcription company copes with accents, inarticulate speakers, poor audio quality, background noise, and complex vocabulary. Can they still guarantee that level of accuracy despite those challenges?
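When a vendor quotes an accuracy rate, it helps to know how you would check it yourself. The article does not say how the rate is measured, so the sketch below assumes one common approach: compare the vendor's transcript against a trusted reference transcript word by word, count the substitutions, insertions, and deletions (a word-level edit distance), and divide by the length of the reference. The function name `word_accuracy` and the simple lowercase-and-split tokenization are illustrative choices, not a vendor's or regulator's official method.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Estimate word-level accuracy of `hypothesis` against `reference`.

    Assumption: accuracy = 1 - (word edit distance / reference length).
    Tokenization is a naive lowercase whitespace split, so punctuation
    attached to a word counts as part of that word.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the hypothesis
                curr[j - 1] + 1,     # extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr

    errors = prev[-1]
    # Many inserted words can push the error count past the reference length.
    return max(0.0, 1 - errors / len(ref))


print(word_accuracy("the quick brown fox", "the quick brown box"))  # 0.75
```

Running a check like this over a sample of difficult files (accents, background noise, technical vocabulary) gives you a number to compare against the vendor's guarantee.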
Automatic Speech Recognition
YouTube’s automatic captions rely on automatic speech recognition alone to create captions for YouTube videos. This is an example of a well-intentioned initiative that has produced some hilariously inaccurate captions.
Typically, automatic speech recognition produces transcripts that are about 60-70% accurate, which means that roughly 1 out of 3 words is wrong. And when speech recognition is wrong, it’s usually spectacularly wrong (like in the example above).
Accuracy and Comprehension
The chart below shows how a speech recognizer's word-level accuracy compounds into sentence-level accuracy, for a range of accuracy rates and for 8-word and 10-word sentences, assuming each word is recognized independently. You can see how quickly sentence accuracy drops as more words are added to a sentence. For example, 67% accuracy means 1 out of every 3 words is incorrect. For an 8-word sentence, the likelihood that the recognizer got all 8 words correct is $0.67^{8} \approx 4\%$. Similarly, for a 10-word sentence, the likelihood of the recognizer getting all 10 words in a row correct is $0.67^{10} \approx 2\%$.
This explains why an accuracy rate of at least 99% is needed to provide an equivalent experience for deaf and hard-of-hearing viewers.
Video Transcription Accuracy Rates
| Word-to-Word Accuracy | 1 of x Words Incorrect | 8-Word Sentence Accuracy | 10-Word Sentence Accuracy |
|---|---|---|---|
| 50% | 1 of 2 | 0% | 0% |
| 67% | 1 of 3 | 4% | 2% |
| 75% | 1 of 4 | 10% | 6% |
| 85% | 1 of 7 | 27% | 20% |
| 90% | 1 of 10 | 43% | 35% |
| 95% | 1 of 20 | 66% | 60% |
| 98% | 1 of 50 | 85% | 82% |
| 99% | 1 of 100 | 92% | 90% |
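The sentence-level figures in the table can be reproduced with a few lines of Python. This is a minimal sketch that assumes every word is recognized independently with the same probability, so the chance of getting a whole sentence right is the word accuracy raised to the number of words. The function name `sentence_accuracy` is illustrative.

```python
def sentence_accuracy(word_accuracy: float, words: int) -> float:
    """Probability that all `words` words are correct, assuming each word
    is recognized independently with probability `word_accuracy`."""
    return word_accuracy ** words


# Recompute the 8-word and 10-word columns for each word-level accuracy rate.
for percent in (50, 67, 75, 85, 90, 95, 98, 99):
    p = percent / 100
    print(f"{percent}%: 8 words -> {sentence_accuracy(p, 8):.0%}, "
          f"10 words -> {sentence_accuracy(p, 10):.0%}")
```

Real recognition errors tend to cluster rather than occur independently, so treat these numbers as an illustration of how quickly errors compound, not as a precise prediction for a given file.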
Caption Quality Standards
The FCC released quality standards for closed captioning of all network and broadcast video, including online distribution of that content. The FCC rules are a helpful guideline for other industries.
Your captioning vendor should comply with the FCC’s standards for caption accuracy, synchronicity, program completeness, and caption placement. On accuracy, the FCC states, “Captions must match the spoken words in the dialogue, in their original language (English or Spanish), to the fullest extent possible.”
Captions must include essential nonverbal information, such as sound effects, music playing, and audience reactions, in order to be considered accurate.
Captions and subtitles should also preserve the tone and intent of the speaker. The ultimate goal: maintain the impact of the original performance as much as possible.
How Accuracy Affects SEO
If YouTube search rank or video SEO is your main objective, then accuracy is critical.
Transcription errors are not uniformly distributed. The most common errors happen with words that are most vital for search: names of products, people, and places, URLs, formulas, technical vocabulary, and acronyms. What this means is that even a slight reduction in accuracy rate (e.g. 98% instead of 99%) makes the content significantly less viable for search.
Keep in mind that using automatic speech recognition alone may register with Google as “automatically-generated gibberish” and could actually harm your SEO efforts.
Speaker Identification and Verbatim vs. Clean Read
Does your video subtitling company provide options for speaker identification? If not, what is the default? If you need something different, will the transcription service follow instructions correctly?
Do they allow you to choose between verbatim and clean read practices for transcription? Most people prefer a “clean read” transcript, where the transcriptionist removes words like “um” or “uh,” as well as stutters and unnecessary filler words that take away from the meaning of the sentence.
Verbatim transcripts capture every utterance that comes out of the speaker’s mouth, including “um,” “uh,” and stutters. They are usually much more frustrating to read and follow than clean read transcripts. However, for scripted television, where stutters are intentional, verbatim transcription is preferred.
Make sure the captioning service can provide you with the transcription style of your choice.
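To make the difference between the two styles concrete, here is a rough sketch of turning a verbatim transcript into a clean read. It is only an illustration: it assumes fillers are limited to "um" and "uh" and that stutters appear as a whole word repeated with hyphens (such as "I-I"). A human transcriptionist makes far more nuanced judgments, and a naive pattern like this would also collapse legitimate words such as "bye-bye". The name `clean_read` and both patterns are hypothetical.

```python
import re

# Filler words "um"/"uh" (with optional trailing comma or period and spaces).
FILLERS = re.compile(r"\b(?:um+|uh+)\b[,.]?\s*", flags=re.IGNORECASE)
# A whole word repeated with hyphens, e.g. "I-I" or "we-we-we".
STUTTER = re.compile(r"\b(\w+)(?:-\1)+\b", flags=re.IGNORECASE)


def clean_read(verbatim: str) -> str:
    """Remove simple fillers and hyphenated stutters from a verbatim line."""
    text = FILLERS.sub("", verbatim)
    text = STUTTER.sub(r"\1", text)
    # Collapse any doubled spaces left behind by the removals.
    return re.sub(r"\s{2,}", " ", text).strip()


print(clean_read("Um, I-I think, uh, we should start."))
# I think, we should start.
```

For scripted content where stutters are intentional, you would skip this step and keep the verbatim text.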
Finally, it’s important to assess a vendor’s ability to maintain accuracy and consistency across many files. When testing out vendors, keep in mind that anyone can produce high accuracy for just a few files. Vendors should be tested with a large quantity of files containing a range of different types of content.
Your video transcription and captioning vendor should provide you with near-perfect captions. Inaccurate captions could even be detrimental to your accessibility and SEO initiatives.
Learn more about How to Select the Right Closed Captioning Vendor.