Q&A: The Nuts & Bolts of Captioning and Describing Online Video

Updated: June 3, 2019

With the predominance of accessibility laws and the upcoming refresh of Section 508 (which requires organizations to comply with WCAG 2.0 Level AA standards by early 2018), it’s the perfect time to become well-versed in captioning and audio description.

In this Q & A portion of the webinar The Nuts & Bolts of Captioning and Describing Online Video, Lily Bond, Director of Marketing at 3Play Media, and Owen Edwards, Senior Accessibility Consultant at SSB BART Group, dive into the legal requirements, standards, and best practices for captioning and audio description.

Read on for some helpful tips and highlights from the Q & A session:

What are the legal considerations of captioning and describing someone else’s videos? What are the necessary steps to get such videos captioned?

LILY BOND: Captioning videos for accessibility purposes in higher education is often considered fair use, but you would want to consult with your legal counsel at your organization to make sure that they are comfortable with that.

However, some vendors, and certainly 3Play Media, provide a captions plugin, which allows you to add captions without having to republish or edit the original video at all. It’s a very simple one-line embed code that includes both the YouTube embed and the caption embed, and it’ll play the captions along with the video.

It does work for several other video players as well. And that allows you to publish captions to a video to make it accessible without worrying about copyright. And the audio description plug-in would be very similar. It would allow you to publish description to a video without having to republish and get in trouble with copyright law.

What is the acceptable accuracy rate or quality requirement for captioning and describing educational video content?

LILY BOND: For captioning, the generally assumed accuracy rate is 99% accurate or higher. Errors really start to compound when you get below 99% accurate. And the first errors to go are really clarifying words, like did and didn’t, which can be a big issue for educational content, when students are relying on those captions to understand and to learn. 99% or higher is the generally assumed rate for captioning, although I will say that there are no clearly specified accuracy rates in any of the laws, which is something that we are hoping for more clarity on.
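As a rough illustration of what a 99% accuracy rate can mean in practice, the sketch below computes word-level accuracy between a reference transcript and a caption track using edit distance. This is only one common way to measure accuracy; as noted above, the laws do not specify a formula, and the function name and sample sentences are hypothetical.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 minus the word error rate, using word-level edit distance."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Dynamic-programming edit distance over words, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the captions
                curr[j - 1] + 1,     # extra word in the captions
                prev[j - 1] + cost,  # match or substituted word
            ))
        prev = curr

    errors = prev[-1]
    return 1 - errors / len(ref)


# A single "did" / "didn't" swap in a four-word sentence is one error in four words.
accuracy = word_accuracy("the experiment did work", "the experiment didn't work")
```

At this scale a single wrong word is a large drop, but the same arithmetic shows why 99% matters: in a 1,000-word lecture, 99% accuracy still leaves room for 10 wrong words, and any of them could be a "did" that should have been "didn't".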

OWEN EDWARDS: There aren’t clear ways to measure quality of audio description. There are certainly things to avoid, which are things like stepping on the audio in the main video content, particularly dialogue, and giving away things that are coming up in the video. But there haven’t been test cases that say this kind of description isn’t good enough. So really, we’re recommending that people go to reputable vendors of description where they have a quality level that certainly exceeds those requirements.

Can you share some best practices for audio description?

OWEN EDWARDS: I would refer to the DCMP description key, which really breaks down how description should be created in terms of what needs to be described and what style should be used. In general, if there’s onscreen text, that needs to be read out if it isn’t included in the soundtrack. But that’s really a matter of, is it text which is there to convey information? It wouldn’t be necessary, for example, if somebody was speaking and there was a road sign. But it would be necessary if the name of a speaker and maybe their job title appeared on the screen. So the DCMP description key is a great open guideline to the best practices around description.

Are there instances where you can’t describe everything in the video itself, or where a description isn’t necessary, or a description is not even possible?

OWEN EDWARDS: WCAG itself specifically points out that if all of the information in the video track is already provided in the audio track, no audio description is necessary. We at SSB create our own training videos where, in the production process, we intend for that main audio track to describe everything that’s important in the video content, so that there’s no need for supplemental description. The video is considered self-described.

There aren’t clear guidelines on the situations in which that is considered acceptable. We can certainly give guidance on that, on a case-by-case basis. But it’s really a matter of considering whether there is something that would be missed if you couldn’t see the content.

And then, a separate case is the situation where you can’t add description successfully, or there isn’t a possibility of doing it. There are certainly videos that don’t have gaps in the audio to insert that additional description, and there isn’t a clear way to deal with that situation.

WCAG requires audio description at Level AA. At Level AAA, there’s a feature called extended description, where the video is paused, the description plays, and then the video resumes. It’s been a little confusing that this is considered a AAA requirement when, really, it’s a solution to a AA problem. The AA problem is video that has too much speech for there to be gaps for description.
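The difference between standard and extended description can be sketched as a simple planning step: if a description fits in the silent gap, it plays over the video; if not, the video pauses while it plays. This is a minimal sketch under assumptions, not how any particular player or plug-in works. The `Description` class, its field names, and the choice to pause for the full length of the description are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Description:
    start: float     # seconds into the video where the description begins
    duration: float  # length of the recorded description audio, in seconds
    gap: float       # silent time available in the main audio at `start`


def plan_descriptions(descriptions):
    """Label each description as standard or extended, with the pause it needs."""
    plan = []
    for d in descriptions:
        if d.duration <= d.gap:
            # Fits in the gap: plays alongside the video, no pause needed.
            plan.append((d.start, "standard", 0.0))
        else:
            # Does not fit: assume the video pauses while the description plays.
            plan.append((d.start, "extended", d.duration))
    return plan


plan = plan_descriptions([
    Description(start=10.0, duration=3.0, gap=4.0),
    Description(start=30.0, duration=6.0, gap=2.0),
])
```

Here the first description fits in its four-second gap, while the second needs the video paused for six seconds.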

Can you still add captions and audio descriptions for videos that have already been posted to sites like YouTube and Vimeo?

LILY BOND: You can always add captions after you’ve published a video, particularly to video players like YouTube and Vimeo. Both platforms allow you to upload a caption file, so it’ll just be associated with your video once you publish that caption file.

It will never republish your video or require you to publish a new video. Audio description, it’s kind of dependent on how you choose to publish. YouTube and Vimeo do not really allow for a secondary audio track anyway. But if you are using something like an audio description plug-in, you could certainly add that for a video that already exists on YouTube.
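For readers who have not seen a caption file, the sketch below builds one in WebVTT, a plain-text caption format. The answer above does not name a format, so WebVTT is an assumption here; check which formats your video platform accepts before uploading. The function names and sample cues are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues) -> str:
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


caption_file = build_webvtt([
    (0.0, 2.5, "Welcome to the webinar."),
    (2.5, 6.0, "Today we're talking about captioning."),
])
```

The resulting text can be saved with a `.vtt` extension and uploaded alongside the existing video, which is why the video itself never needs to be republished.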

OWEN EDWARDS: There isn’t a mechanism to add description onto an existing video, except for these new plug-ins that are coming along, like 3Play’s. There is also a research platform called YouDescribe, which allows people to describe videos.

How does captioning increase views for Facebook and other social media platforms?

LILY BOND: Facebook started implementing captioning just over a year ago. They have done a little bit of research into how captions have impacted viewer engagement, and they found a couple of interesting things.

One was that the vast majority of people did not like videos auto playing with sound, which is why they turned the sound off, although they also, obviously, were in violation of the WCAG requirements, as Owen said. But they also found that adding captions increased viewer engagement by over 12%.

When videos auto play on your newsfeed without sound, captions really help get viewers engaged in your video and draw them into something that they otherwise would likely just skip. So captions are actually really important for getting that engagement on Facebook.

If we are concerned about SEO and we have a video with no voiceover and only music, would adding audio description help improve our SEO?

LILY BOND: It really depends on how you are publishing that audio description. SEO, or search engine optimization, really draws from text. And if you are publishing audio description, for the most part, it’s another audio file, which doesn’t contain a text alternative. Google wouldn’t read the audio track, just like it wouldn’t read the audio of a video with spoken word.

You could certainly publish a text version of the audio description in the description of your video, or for the very few players that have the ability to use a WebVTT description track, that would also help with SEO. But the use case is really small there. The main SEO benefit from the description would come from adding it to the description of your video itself in the video player.

OWEN EDWARDS: Right. That’s part of why there’s been a lot of discussion around the idea of whether description is better done as an audio track or as a text track. But inherently, the people who are looking to consume it, looking to get the benefit of it, want to hear it as audio. So it’s usually a recorded audio track. There have been ways to do it with a WebVTT track that’s spoken by the screen reader. I touched on some limitations around that, but it would have SEO benefits.

The best solution there would be to include an audio description track as a piece of recorded audio, but to also include the transcript, which combines both the captioning and text version of that audio description track.
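A combined transcript of this kind can be sketched as a merge of two timed lists, one of caption text and one of description text, ordered by start time. This is a minimal sketch; the function name, the `[Description]` label, and the sample cues are hypothetical conventions, not a 3Play Media or SSB format.

```python
def merge_transcript(captions, descriptions) -> str:
    """Merge (start_seconds, text) cues into one time-ordered transcript."""
    entries = [(start, text) for start, text in captions]
    # Label description text so readers can tell it apart from spoken words.
    entries += [(start, f"[Description] {text}") for start, text in descriptions]
    # sorted() is stable, so a caption stays ahead of a description at the same time.
    entries = sorted(entries, key=lambda entry: entry[0])
    return "\n".join(text for _, text in entries)


transcript = merge_transcript(
    captions=[(0.0, "Welcome."), (8.0, "Let's begin.")],
    descriptions=[(4.0, "A title slide appears.")],
)
```

Publishing this text on the page alongside the video gives search engines the description content as text, while listeners still get the recorded audio track.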

Watch the full webinar below!
