5 Elements of Accessible Video

June 13, 2019 BY ELISA LEWIS
Updated: February 14, 2025

Online video has exploded in popularity and necessity in recent years, for producers and consumers alike. Weekly video consumption is estimated to have increased 52% over the past two years, and the power of video content can be harnessed for the benefit of your brand. With so much video being produced and viewed on a regular basis, it’s important that online video is accessible to everyone.

Inaccessible video excludes the millions of people living with a sensory or motor disability from watching your videos – not only is this discriminatory, but it prevents your brand from reaching an entire demographic of viewers who want to engage with your content.

So how exactly do you make video content accessible? According to the Web Content Accessibility Guidelines (WCAG), the most comprehensive and widely accepted standards for digital accessibility, accessible video should be perceivable, operable, understandable, and robust, so that anyone can easily navigate and engage with your video using the broadest range of technologies and interfaces.

In this blog, we’ll equip you with the tools you need to make achieving accessibility easy – let’s dive into the top 5 elements of accessible video.

1. Captions

It’s no surprise we’re talking about captions first – it’s what we do! Captions provide a text-based and time-synchronized alternative to a video’s audio content. In order to make video accessible to d/Deaf and hard of hearing viewers, captions must also include relevant non-speech elements (like speaker identifications and sound effects) that are critical to understanding the content’s plot.

There are two types of captions: closed captions and open captions. The difference is based on user control.

Closed captions are a separate asset from the video, and are usually added to the video as a sidecar file. The viewer can toggle the captions on or off when needed. Closed captions are typically used for online videos.

Open captions, on the other hand, are burned directly into the video and don’t allow the user to toggle the captions on or off. This works well if a video platform doesn’t accept sidecar files. Open captions are ideal for social media videos because they catch the viewer’s attention. Social media videos on platforms like Facebook and Instagram autoplay without sound, so adding captions helps viewers understand the audio. Facebook found that adding captions to video increased viewership by 12%.

Creating Captions

While it is possible for individuals/organizations to create captions themselves, we don’t recommend this if you’re looking for truly high-quality captions.

To create captions yourself, you’ll typically start with automatic speech recognition (ASR) software, which transcribes the audio from your video into text.

ASR is notoriously inaccurate, and you’ll spend a good amount of time editing the transcript. When you have a large amount of content, this can become extremely inefficient and time-consuming.

The best way to create high-quality captions is to go to a professional captioning vendor. A captioning vendor that uses both technology and human editing when creating a transcript is the most cost-effective way to ensure accurate captions.

At 3Play, our team of professional transcriptionists reviews each file to deliver true 99% accuracy.

How to Add Captions to Video

Not all video players are created equal. Therefore, the method by which you add captions to your video will vary depending on what platform you use.

There are four main ways you can incorporate captions:

Sidecar file = the most common method for adding captions; a separate caption file, predominantly in the SRT format, is uploaded alongside the video.
Open captions = captions are burned directly into the video content.
Caption encoding = captions are embedded into the video file, and the two are delivered as a single asset.
Integration or API delivery = automates the captioning process by creating a link between your vendor and video player.
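
To make the sidecar option more concrete: an SRT file is plain text made up of numbered cues, each with a start and end timestamp followed by the caption text. The short Python sketch below builds one from a list of cues. The function names and the sample cues are hypothetical and only for illustration; in practice the timings would come from your captioning vendor or caption editor.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical sample cues, including a sound effect and a speaker ID.
sample_cues = [
    (0.0, 2.5, "[upbeat music]"),
    (2.5, 6.0, "NARRATOR: Welcome to the course."),
]
srt_text = build_srt(sample_cues)
```

Saving `srt_text` to a file with an `.srt` extension gives you a sidecar file that many video players and platforms accept as an upload.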

Captioning Standards

Caption quality is important because deaf and hard of hearing viewers rely on captions to understand the audio information of a video. Inaccurate or unintelligible captions are not only inaccessible, but they’re distracting.

The Web Content Accessibility Guidelines (WCAG) were created by the World Wide Web Consortium (W3C). WCAG is a set of guidelines for accessible web content, including video. Although WCAG isn’t a law, it’s a standard recognized by many countries and provides comprehensive criteria for achieving inclusive web design.

For clearer guidance on what captions should and shouldn’t include, we recommend following these key elements from the Described and Captioned Media Program’s (DCMP) Captioning Key:

Accurate = errorless captions are the goal for each production.
Consistent = uniformity in style and presentation for all captioning features is crucial for viewer understanding.
Clear = a complete textual representation of audio, including speaker identification and non-speech information, provides clarity.
Readable = captions are displayed with enough time to be read completely, are in synchronization with the audio, and are not obscured by (nor do they obscure) the visual content.
Equal = equal access requires that the meaning and intention of the material is completely preserved.

2. Audio Description

Audio description is an audio track that narrates the relevant visual information in media. It is an accommodation that makes video accessible to blind and low-vision viewers.

Audio description can be voiced by synthesized speech or by human voice actors. It narrates key elements like character movements, facial expressions, and other visual information that is essential to understanding the media’s plot.

Standard vs. Extended Audio Description

Standard audio description allows a small amount of narration to be added in the natural pauses in dialogue of the original content. The descriptions should be concise and able to fit into the allotted time, so they enhance the original piece rather than distract from it. Standard description works best for content that normally has frequent pauses or a small amount of detail that needs to be described.

Extended audio description is not constrained to the natural pauses of a video. Instead, it allows the original source content to be paused in order to make room for descriptions as needed. When using extended description, you will notice the video and description begin playing, then the source video pauses temporarily while the description continues. After the description is complete, the source video will resume playing again.
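
The difference between the two approaches comes down to simple arithmetic: if a description is longer than the natural pause it has to fit into, the source video must pause for the remainder. The Python sketch below illustrates that idea under a simplified model. The function name, the input format, and the sample numbers are all hypothetical, and real description tools may schedule narration differently.

```python
def plan_descriptions(descriptions):
    """Work out how long the source video must pause for each description.

    Each item is (gap_seconds, narration_seconds): the natural pause
    available in the dialogue, and the length of the description audio.
    """
    plan = []
    for gap, narration in descriptions:
        # Narration that fits in the gap needs no pause (standard);
        # anything longer pauses the video for the overflow (extended).
        pause = max(0.0, narration - gap)
        mode = "standard" if pause == 0 else "extended"
        plan.append({"mode": mode, "pause_seconds": pause})
    return plan


# Hypothetical example: the first description fits in its gap, the second does not.
sample_plan = plan_descriptions([(3.0, 2.0), (1.0, 4.5)])
added_runtime = sum(item["pause_seconds"] for item in sample_plan)
```

Here the second description overflows its one-second gap by 3.5 seconds, so under this model the described version runs 3.5 seconds longer than the original.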

How to Create & Add Audio Description

Because audio description is a newer technology, it’s less widely understood and fewer systems support it. However, there are still ways you can include audio description alongside your media content!

When creating descriptions…

Include descriptions early in the video production stage, so they can be naturally woven into the content
Create a separate soundtrack with additional time allowed for description
Write a separate description script that aligns with the video content’s timeline

When adding descriptions…

Publish a second track
Publish a text-based description that screen readers can read
Use an audio description embed, or work with a vendor

3. Transcripts

An interactive transcript is a time-synchronized transcript that provides a better user experience by letting viewers search the spoken audio of a video and start playback from any point.

With an interactive transcript, you can search for a keyword and see every instance of where that keyword is spoken within the transcript. By clicking on the keyword, the viewer can jump directly to the point in the video where it’s spoken. This is especially helpful when you want to navigate to more pertinent parts of the video.
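
The keyword search described above depends on every word in the transcript carrying a timestamp. As a rough Python sketch of the idea, assuming a hypothetical transcript stored as (start time in seconds, word) pairs, a search simply returns the start time of each match so the player can seek to it:

```python
def find_keyword(transcript, keyword):
    """Return the start times (in seconds) of every word matching the keyword."""
    target = keyword.casefold()
    return [
        start
        for start, word in transcript
        # Ignore surrounding punctuation and letter case when comparing.
        if word.strip(".,!?;:\"'").casefold() == target
    ]


# Hypothetical word-level transcript: (start_seconds, word).
sample_transcript = [
    (0.0, "Captions"), (0.6, "help"), (1.1, "everyone."),
    (2.0, "Good"), (2.4, "captions"), (3.0, "are"), (3.4, "accurate."),
]
matches = find_keyword(sample_transcript, "captions")
```

Each value in `matches` is a point the viewer could jump to. A production interactive transcript would also handle phrases and highlight the matches in the displayed text.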

Benefits of adding transcripts

Associating a transcript with your video content has the potential to improve user engagement, increase comprehension, boost search engine optimization (SEO), and help foreign language learners. Interactive transcripts have proven to be an especially beneficial tool in the classroom, aiding in note-taking, comprehension, and clarification.

Without an interactive transcript, students would have to scan an entire video to find the most useful part. Now, students can simply search for a specific keyword and go directly to that point in the video – saving them time to study what’s most important.

At the University of South Florida St. Petersburg (USFSP), the Distance Learning Accessibility Committee conducted a study to determine whether interactive transcripts were beneficial for students. They found that 94% of students, including those who aren’t deaf or hard of hearing, found interactive transcripts helpful as a learning tool.

4. Accessible Video Player

An accessible video player is one that supports audio descriptions, captions, and transcripts. In addition, an accessible player should:

Be screen reader and keyboard compatible
Not autoplay; if it does, it should provide a mechanism for viewers to pause or stop the video
Use proper color contrast in video player controls
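
Color contrast can be checked numerically. WCAG defines contrast as a ratio between the relative luminance of two colors, ranging from 1:1 to 21:1. The Python sketch below computes that ratio for a control color against its background. The hex colors are hypothetical, and the 3:1 threshold is the WCAG 2.1 minimum for user interface components rather than a figure from this article, so confirm the conformance level your organization is targeting.

```python
def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an sRGB color such as '#1A1A1A' (WCAG 2.x formula)."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Convert each gamma-encoded channel to linear light.
    red, green, blue = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio between two colors, from 1.0 up to 21.0."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


# Hypothetical player colors: a white play icon on a near-black control bar.
ratio = contrast_ratio("#FFFFFF", "#1A1A1A")
passes_ui_minimum = ratio >= 3.0  # assumed threshold: WCAG 2.1 non-text contrast
```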

You can find out whether a video player is accessible by running a simple test. To test for keyboard accessibility, for example, unplug your mouse and try the following using only your keyboard:

Press Enter or the spacebar to activate the play/pause control
Press the up or down arrow keys to control the volume
Press the left or right arrow keys to rewind or fast forward

Accessibility workarounds

When video players aren’t fully accessible and don’t support audio description, captions, or transcripts, there are some ways around it so that you can ensure accessible video content.

When captions aren’t supported:

Use open captions: burn them into your main video, or publish a separate open-captioned version and link to it from the original
Provide a transcript: you can paste it in the description of the video or provide a link to the transcript

When descriptions aren’t supported:

Use the audio description plugin: this allows you to have a separate audio track on your original video
Provide a separate video: make sure to link the described video where the original video is found
Provide a merged descriptive transcript: you can publish it within the video description or provide a link to the transcript for screen readers to access

5. Scripting & Quality Audio Equipment

Some elements of accessible video can be influenced at the time of production. When recording media, it’s critical to ensure the sound is clear and understandable. No matter how viewers choose to engage with your content, starting with an emphasis on sound quality will provide an equitable viewing experience for everyone.

The first step is to create a script and stick to it (as best as possible). The goal here is to limit any unnecessary filler words like ‘um’ and ‘uh,’ which can be distracting for the viewer. Not only that, but when using technology like ASR in the caption production process, filler words can bleed into others and ultimately result in incorrect caption outputs. If you don’t have a script, edit the final transcript to eliminate filler words.

Note: the only time filler words should be included is in scripted content, like for television or movies. In these instances, filler words are used deliberately as a part of the character’s performance.
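
If you’re cleaning up an unscripted transcript yourself, a first pass at removing filler words can be automated. The Python sketch below is one rough way to do it with a regular expression. The filler list and function name are hypothetical, it only covers ‘um’ and ‘uh,’ and the result should still be reviewed by a person before it becomes a caption file.

```python
import re

# Hypothetical filler list: 'um' and 'uh' (including drawn-out forms like 'umm'),
# together with an adjacent comma on either side.
FILLER_PATTERN = re.compile(r",?\s*\b(?:um+|uh+)\b,?", re.IGNORECASE)


def remove_fillers(text: str) -> str:
    """Remove standalone 'um' and 'uh' from a line of transcript text."""
    cleaned = FILLER_PATTERN.sub("", text)
    # Collapse any doubled spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


cleaned_line = remove_fillers("So, um, captions are, uh, really important.")
```

Because the pattern requires word boundaries, words that merely contain those letters (such as ‘umbrella’ or ‘drum’) are left alone. Skip this step for scripted content, where filler words are part of the performance.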

The second step is to use good audio recording equipment – namely a microphone, which is an essential component of quality sound. The clearer the original audio, the easier it will be to transcribe later. For optimal output, avoid placing the microphone right next to the speaker’s mouth, and instead place it below or to the side. We recommend three types of microphone for various use cases:

Dynamic = perfect for recording at home or in noisier environments; less sensitive to background noise and can handle loud volumes.
Condenser = ideal for recording in locations where background noise is controlled; sensitive to noise to provide high output levels.
Ribbon = best for capturing all sounds in a room; sensitive to noise, and provides a smooth, warm output sound.

