5 Elements of Accessible Video

Updated: June 29, 2021

If you follow any marketing or tech blogs, you’ve probably read about the staggering growth of online video. As technology continues to evolve, video remains the preferred medium for content. In fact, nearly one-third of online activity is spent watching video.

With so much video being produced and viewed, it’s important that online video is accessible to everyone – including people with disabilities.

An inaccessible video excludes millions of people living with a sensory or motor disability. Not only is this discriminatory, but it also keeps your brand from reaching an entire demographic of viewers who want to engage with your content.

So what exactly does a video need to become accessible? An accessible video should be perceivable, operable, understandable, and robust enough that anyone using the broadest range of technologies and interfaces can easily navigate the video.  

Below, we equip you with the tools you need to make achieving accessibility easy. Here are 5 elements you can incorporate today for accessible video.

1. Captions

Captions are a visual representation of the audio in a video. They make video accessible to deaf and hard of hearing viewers by providing a text-based and time-synchronized alternative to the audio. Alongside the spoken audio, captions also include non-speech elements like speaker IDs and sound effects that are critical to understanding the plot of the video.

There are two types of captions: closed captions and open captions. The difference is based on user control.

Closed captions are a separate asset from the video, and are usually added to the video as a sidecar file. The viewer can toggle the captions on or off when needed. Closed captions are typically used for online videos.
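To make the idea of a sidecar file concrete, here is a minimal sketch that builds one in WebVTT, a common caption file format. The `Cue` class and the function names are our own illustrative choices, not part of any particular platform or tool.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # caption text, including speaker IDs or [sound effects]


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Build the text of a WebVTT sidecar file from a list of cues."""
    blocks = ["WEBVTT"]  # every WebVTT file starts with this header line
    for cue in cues:
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{timing}\n{cue.text}")
    # Blocks are separated by a blank line
    return "\n\n".join(blocks) + "\n"
```

Saving the returned text to a file such as `captions.vtt` gives you a separate asset that can be uploaded alongside the video, which is what lets the viewer toggle the captions on or off.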

Open captions, on the other hand, are burned directly into the video and don’t allow the user to toggle the captions on or off. This works well if a video platform doesn’t accept sidecar files. Open captions are ideal for social media videos because they catch the viewer’s attention. Social media videos on platforms like Facebook and Instagram autoplay without sound, so adding captions helps viewers understand the audio. Facebook found that adding captions to video ads increased view time by an average of 12%.

How to Add Captions to Video

Not all video players are created equal. The method you use to add captions to your video will therefore depend on the platform you use.

There are four main ways to add captions:

  • A sidecar file
  • Open captions
  • Caption encoding
  • Integration and API

Creating Captions

Some individuals and organizations create captions themselves, but this isn’t recommended if you want high-quality captions. Do-it-yourself captioning typically starts with automatic speech recognition (ASR) software, which transcribes the audio from your video into text.

ASR is notoriously inaccurate, and you’ll spend a good amount of time editing the transcript. When you have a large amount of content, this can become extremely inefficient and time-consuming.

The best way to create high-quality captions is to go to a professional captioning vendor. A captioning vendor that uses both technology and human editing when creating a transcript is the most cost-effective way to ensure accurate captions.

At 3Play, our team of professional transcriptionists reviews each file to deliver true 99% accuracy. 

Captioning Standards

Caption quality is important because deaf and hard of hearing viewers rely on captions to understand the audio information of a video. Inaccurate or unintelligible captions are not only inaccessible, but they’re distracting.

The Web Content Accessibility Guidelines, or WCAG, were created by the World Wide Web Consortium (W3C). WCAG is a set of guidelines for accessible web content, including accessible video. Although WCAG isn’t a law, it’s a standard recognized by many countries.

WCAG provides criteria for achieving inclusive web design with three different compliance levels: A, AA, and AAA. 

  • Level A: captions are provided for all pre-recorded audio content
  • Level AA: captions are provided for all live audio content
  • Level AAA: sign language interpretation is provided for all pre-recorded audio content in synchronized media

To provide quality captioning, follow these elements from the Described and Captioned Media Program (DCMP).

  • Accurate: errorless captions are the goal for each production.
  • Consistent: uniformity in style and presentation for all captioning features is crucial for viewer understanding.
  • Clear: a complete textual representation of audio, including speaker identification and non-speech information, provides clarity.
  • Readable: captions are displayed with enough time to be read completely, are in synchronization with the audio, and are not obscured by (nor do they obscure) the visual content.
  • Equal: equal access requires that the meaning and intention of the material is completely preserved.
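The “readable” element above lends itself to a quick automated check. The sketch below estimates how fast each caption is presented, in words per minute, and flags cues that exceed a limit you choose. The function names and the `(start, end, text)` tuple layout are our own assumptions, and the limit is deliberately left as a parameter rather than a fixed value, since the right rate depends on your audience and the guideline you follow.

```python
def words_per_minute(text: str, start: float, end: float) -> float:
    """Presentation rate of one caption cue, in words per minute."""
    duration = end - start
    if duration <= 0:
        raise ValueError("a cue must end after it starts")
    return len(text.split()) * 60.0 / duration


def cues_too_fast(cues, max_wpm):
    """Return the (start, end, text) cues shown faster than max_wpm.

    max_wpm is a limit supplied by the caller; this sketch does not
    assume any particular published threshold.
    """
    return [
        (start, end, text)
        for start, end, text in cues
        if words_per_minute(text, start, end) > max_wpm
    ]
```

Running `cues_too_fast` over a caption file before publishing gives you a short list of cues to re-time or split, rather than having to watch the whole video to find them.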

Captioning Laws

The laws that apply to captioning are the Americans with Disabilities Act (ADA), the Rehabilitation Act, the 21st Century Communications and Video Accessibility Act (CVAA), and FCC rules.

2. Audio Description

Audio description is an audio track that narrates the relevant visual information in media, which makes it a key element of accessible video.

Audio description assumes the viewer cannot see. Therefore, it’s used as an accommodation for blind and low vision viewers.

With audio description, a human or synthesized voice describes character movements, facial expressions, and other key visual information essential to understanding the plot of the video.

Standard vs. Extended Audio Description

Standard audio description allows a small amount of narration to be interspersed within the natural pauses in dialogue of the original content. The descriptions should be concise and able to fit into the allotted time to ensure that they enhance the original piece rather than distract from it. Standard description works best for content that normally has frequent pauses or a small amount of detail that needs to be described.

Unlike standard description, extended audio description is not constrained to the natural pauses of a video. Instead, it allows you to pause the original source content to make room for descriptions as needed. With extended description, the video and description begin playing together, then the source video pauses temporarily while the description continues. After the description is complete, the source video resumes playing.
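The difference between the two approaches comes down to simple arithmetic: a description either fits in the natural pause or it doesn’t. Here is a rough sketch of that logic under a simplified model where each description has one natural pause available to it. The `Description` class and its field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Description:
    video_time: float  # where the description starts, in source-video seconds
    gap: float         # length of the natural pause in dialogue at that point
    narration: float   # how long the spoken description takes


def required_pause(d: Description) -> float:
    """Seconds the source video must pause so the narration can finish."""
    return max(0.0, d.narration - d.gap)


def fits_standard(descriptions: list[Description]) -> bool:
    """True if every description fits inside its natural pause."""
    return all(required_pause(d) == 0.0 for d in descriptions)


def extended_runtime(video_length: float, descriptions: list[Description]) -> float:
    """Total playback time once the extended-description pauses are added."""
    return video_length + sum(required_pause(d) for d in descriptions)
```

If `fits_standard` returns true, standard description is enough. Otherwise, `extended_runtime` shows how much longer the described version will play once the source video pauses for each description that overruns its gap.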

How to Create & Add Audio Description

Since audio description is a newer, less understood technology, there are fewer platforms and systems that support it. 

[Infographic: how to create audio descriptions]

[Infographic: how to add audio descriptions]

Pros & Cons of Synthesized Speech

Typically, audio descriptions are voiced by human voice actors. However, technology has evolved to the point where we can also use computer-generated, or synthesized, speech. Synthesized speech has its advantages and disadvantages.

Pros:

  • Extremely cost-effective compared to paying a human voice actor
  • Shorter production/turnaround time
  • Easy to make edits post-production
  • Control over voice output settings
  • Screen reader users are familiar with synthesized voice

Cons:

  • Doesn’t portray tone and emotion as a human would
  • No subjective input from voice actors (exact dictation from script)
  • Pronunciation isn’t exactly the same as a human

Audio Description Laws

The laws that apply to audio description are the Rehabilitation Act, the ADA, and the CVAA.

3. Interactive Transcripts

An interactive transcript is a time-synchronized transcript that allows a user to search across the spoken audio of a video and play from any point in the video by clicking within the transcript. The interactive transcript is connected to the video player, giving the viewer an interactive experience.

With an interactive transcript, you can search for a keyword in the search bar to see every instance where the keyword is spoken. By clicking on a result, the viewer can jump directly to the point in the video where the keyword is spoken. This is especially helpful when you want to navigate to the most pertinent parts of the video.
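Under the hood, keyword search only needs a transcript in which every word carries a timestamp. The following is a simplified sketch of that lookup, not how any particular transcript plugin is implemented. The `Word` class and `find_keyword` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds into the video when the word is spoken


def find_keyword(transcript: list[Word], keyword: str) -> list[float]:
    """Return the start time of every word that matches the keyword.

    Matching ignores case and surrounding punctuation.
    """
    needle = keyword.casefold()
    return [
        w.start
        for w in transcript
        if w.text.strip(".,!?;:\"'()").casefold() == needle
    ]
```

Each returned time is a point the video player can seek to when the viewer clicks a search result.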

[Image: an interactive transcript in use. A viewer searches for “WCAG” as the video plays, and the search bar shows everywhere that WCAG is mentioned.]

There are many benefits to providing an interactive transcript in tandem with your video. It improves user engagement, aids comprehension, helps English as a second language learners, and boosts SEO, to name just a few.

Improve Engagement for Learners

Interactive transcripts have been a very useful tool in the classroom. They help students with spelling, note-taking, comprehension, and clarification.

Without an interactive transcript, students would have to scan an entire video to find the most useful part. Now, students can simply search for a specific keyword and go directly to that point in the video – saving them time to study what’s most important.

At the University of South Florida St. Petersburg (USFSP), the Distance Learning Accessibility Committee conducted a study to determine whether interactive transcripts were beneficial for students. They found that 94% of students, including those who aren’t deaf or hard of hearing, found interactive transcripts helpful as a learning tool.

How to Add an Interactive Transcript

You can add an interactive transcript to your video in one of the following ways.

  1. Under the video player, select the ellipsis “more” icon (…).
  2. Select “Open transcript.” A transcript will pop up on the right-hand side of the video.

[Example video: John Oliver and Sesame Street video clip]

With 3Play Media, you can publish an interactive transcript on your own web page. After your file has been transcribed, you’ll get the embed code from your 3Play account. The embed code is dynamically generated based on the video player you use, features enabled, size, and styling.

[Example video: MIT OpenCourse lecture]

4. Accessible Video Player

An accessible video player is one that supports audio descriptions, captions, and transcripts. In addition, an accessible video player should:

  • Be screen reader and keyboard accessible
  • Not start automatically; if it does, there should be a mechanism for viewers to pause or stop the video
  • Use proper color contrast in video player controls
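Color contrast is one item on this list you can measure. WCAG defines a contrast ratio between two colors based on their relative luminance, and the sketch below follows that published formula. The function names are our own, and colors are assumed to be 8-bit sRGB `(red, green, blue)` tuples.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an sRGB color, from 0 (black) to 1 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background) -> float:
    """Contrast ratio between two colors, from 1:1 up to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)
```

For example, you could compare the color of a play button icon against the control bar behind it. WCAG 2.1 asks for a ratio of at least 3:1 for user interface components at Level AA, so a result below that is a sign the controls may be hard to see.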

You can check whether a video player is accessible by running a simple test. To test whether it’s keyboard accessible, unplug your mouse and try to operate the video player using only the keyboard.

  • Enter or the spacebar should activate the play/pause control
  • The up and down arrow keys should control the volume
  • The left and right arrow keys should control the rewind/fast forward function

Alternate Forms of Media

When video players aren’t fully accessible and don’t support audio description, captions, or transcripts, there are some ways around it so that you can ensure accessible video content.

When captions aren’t supported:

  • Use open captions: burn them into your main video, or publish a separate open-captioned video and link to it as an alternative
  • Provide a transcript: you can paste it in the description of the video or provide a link to the transcript

When descriptions aren’t supported:

  • Use the audio description plugin: this allows you to have a separate audio track on your original video
  • Provide a separate video: make sure to link the described video where the original video is found
  • Provide a merged descriptive transcript: you can publish it within the video description or provide a link to the transcript for screen readers to access

Video Player Legal Requirements

The laws that apply to accessible video players are the ADA, the Rehabilitation Act, and the CVAA.

5. Use a Script & Good Audio Equipment

An element of accessible video is ensuring the sound is clear and easy to understand. This will provide an equal viewing experience no matter how your viewers choose to engage with your content.

The first step is to create a script and stick to it as closely as possible. This will limit unnecessary filler words like “um” and “uh,” which can be distracting for the viewer. The only time filler words should be used is in scripted content, like a TV show or movie, where they are a deliberate part of the character’s line.

If you don’t have a script, go back and edit the final transcript to eliminate any filler words.

When using ASR, filler words can bleed into neighboring words and completely change their meaning.

Note: if you work with a captioning vendor, ask them for a “clean read” transcript. They will omit the filler words for you.
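If you’re cleaning up a transcript yourself, a first pass over the filler words can be automated. The sketch below is a deliberately naive example: the filler list and the `clean_read` function name are our own assumptions, and the result still needs a human read-through, since a script can’t tell a deliberate “um” from an accidental one.

```python
import re

# Hypothetical filler list; extend it to suit your own speakers.
FILLERS = ("um", "uh")

# Match a standalone filler word plus an optional trailing comma and spaces.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(FILLERS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def clean_read(transcript: str) -> str:
    """Remove standalone filler words and tidy the leftover spacing."""
    cleaned = _FILLER_RE.sub("", transcript)
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```

Because the pattern only matches whole words, text like “umbrella” or “drum” is left alone. Capitalization and punctuation around the removed words may still need a manual touch-up.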

The second step is to use a good microphone, which is essential for audio quality. It may cost a bit of money, but trust me when I say it’s worth the investment! The clearer your audio is, the easier it will be to transcribe later.

Here are some microphone types we recommend:

  • Dynamic microphones
  • Condenser microphones
  • Ribbon microphones

One last tip: make sure you avoid placing the mic right next to your mouth. The ideal microphone placement is below or to the side of the speaker’s mouth.
