Survey Results: Taking a Closer Look at the Quality of Video Captions

Updated: June 3, 2019

Many of us go for best-looking produce when shopping at the grocery store, while others aren’t always as discerning — maybe it’s all just going in the juicer, anyway.

But when organizations add captioning to their video content, quality is usually taken much more seriously. Captioning standards, and in many cases legal requirements, mandate that viewers who are Deaf or hard of hearing get an equivalent experience while watching a video. Captions that fall below 99% accuracy (measured word for word against the spoken audio in the video) can contain enough mistakes to make the content extremely hard to understand, or completely unintelligible.
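That word-for-word comparison is typically done by counting substitutions, insertions, and deletions against a reference transcript, the same idea behind the word error rate metric used to evaluate speech recognition. Here is a minimal sketch of such a calculation; the sample sentences are invented for illustration and are not drawn from the survey:

```python
# Minimal sketch: word-level caption accuracy, computed as
# 1 - (word edit distance / number of words in the reference transcript).
# The example strings below are hypothetical.

def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    m, n = len(ref_words), len(hyp_words)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i          # delete all remaining reference words
    for j in range(n + 1):
        dist[0][j] = j          # insert all remaining hypothesis words
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution
    return dist[m][n]

def caption_accuracy(reference, hypothesis):
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    return 1.0 - word_edit_distance(ref, hyp) / len(ref)

reference = "captions make video content accessible to deaf viewers"
hypothesis = "captions make video content excess of all deaf viewers"
print(f"{caption_accuracy(reference, hypothesis):.0%}")  # 3 errors over 8 words
```

Even a couple of mistaken words per sentence, as in this made-up pair, drags accuracy far below the 99% bar that captioning standards call for.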

In our recently published report, the 2017 State of Captioning, we surveyed over 1,400 people, representing a broad range of industries, about video captioning at their place of work.

Here are highlights from the section entitled "Caption Accuracy and Quality."

Rating of Caption Quality

As it turns out, most people are pretty satisfied with the quality of video captions at their organizations. On a scale of 1 to 10, with 10 being near perfect and 1 being very inaccurate, about 75% of respondents rated their captions 8 or above.

How would you rate the quality of your organization’s video captions?

We also asked whether respondents received any feedback about the quality of their organization’s video captions. According to our respondents, about 89% of the feedback was positive: over 45% said, “captions are generally high quality but sometimes inaccurate,” and nearly 44% said, “captions are high quality.”

Use of Automatic Captions

Automatic captions, meaning captions generated by automatic speech recognition (ASR) software, are an impressive technological achievement. The fact that Google has added automatic captions to over 1 billion videos on YouTube is an amazing feat, and it shows that captioning is growing steadily in popularity.

However, automatic captions typically achieve only 50-80% accuracy, which makes them a poor solution for organizations that want their video content to be accessible to the estimated 360 million people in the world who have little or no hearing.

Does your organization use automatic captions?

More than half of our respondents’ places of work don’t use automatic captions at all.

About a quarter of organizations use automatic captions but then edit them for accuracy after they've been generated. Done properly, this process is actually fairly efficient, and it is similar to 3Play Media's own captioning process: we run every video through a speech recognition program to produce a rough draft of the transcript, and two rounds of human editors then clean up that draft to make the captions as accurate as possible.

Surprisingly, though, over 22% of organizations use automatic captions for "all" or "some" of their captioning without cleaning up the transcript. This is fairly concerning, and we are curious to see how this number changes over time.

Organizations that rely exclusively on ASR to caption their videos risk excluding Deaf and hard of hearing viewers who want to watch that video content. Harvard and MIT are currently fighting a lawsuit brought by the National Association of the Deaf (NAD) over the inaccurate, ASR-only captions on their online course videos.

To learn more about trends in video captioning including how organizations use video, what captioning budgets look like, and how organizations get their captions, download the 2017 State of Captioning report below:

2017 State of Captioning: Keep up to date with the latest trends in video captioning. Download the free report here.

