Intro to Audio Description [TRANSCRIPT]

REBECCA KLEIN: Hi, everyone. And thank you for joining me today for Intro to Audio Description with 3Play Media. This presentation is under 30 minutes. Let’s get started.

My name is Rebecca. And I’m a content marketing specialist at 3Play Media. And just so you all know, I stutter. So when you hear those pauses, that’s all that’s happening.

We have a simple agenda today. I’ll start out by covering what audio description is at a high level so that you have a solid understanding. This overview will include how to publish audio description, the benefits of audio description, and some accessibility laws that impact audio description. And then I’ll briefly cover who 3Play Media is and what we do. And we’ll end with a Q&A session.

So let’s start with the basics of audio description. What is it? Audio description is an accommodation for blind and low vision viewers. It’s a secondary audio track that plays in addition to the main audio track and is often represented by a little AD icon similar to the CC icon for closed captions.

On the next slide, I’ll show you an example of audio description in a trailer for the first Frozen movie by Disney. Pay close attention to any dialogue or contextual clues from the characters, or the lack of contextual clues, and how the audio description supplements what’s lacking in the dialogue. If you’re interested in watching more examples of audio description, we can drop a link in the chat to our website.

OK. So for this video, you can try watching without looking at the screen. You can close your eyes or just look away. And see if you can figure out what’s going on in the scene based on the audio alone. So I’m going to start this now.


– From the creators of Tangled and Wreck-It Ralph. Disney. A carrot-nosed, coal-eyed snowman shuffles up to a purple flower peeping out of deep snow.

– Ooh. Hello.

– He takes a deep sniff.

– His nose lands on a frozen pond. A reindeer looks up and pants like a dog. Seeing the reindeer slip on the ice, the snowman smiles and moves towards him, though actually he’s running on the spot. The reindeer falls on his chin. The snowman uses his arm as a crutch. The reindeer paddles his front legs.

Head over heels, the snowman crawls along the ice. The reindeer does the breaststroke. The snowman rolls his body but flips onto his back. The reindeer’s tongue sticks to the ice. The snowman hurls his head. Twig arm and reindeer lips tug at the carrot. The carrot flies off and lands in soft snow.


– The reindeer goes after it.


REBECCA KLEIN: OK. So I’m going to stop there. As you may have noticed, there’s not any real dialogue in this scene to provide context. Really all that we have to go off of are some verbal expressions in the musical track. So the audio description track makes up for this lack of dialogue. And without it, it would be nearly impossible to know what’s happening.

The audio description track does a great job of visually bringing these fictional characters to life. And the description really paints a great picture of a whimsical scene. And it’s creative. It’s accurate. And it fits the nature of the scene perfectly. So I hope this gives you a better idea of what audio description is and why it’s important.

So next we’ll talk about the two types of audio description, standard and extended. The Frozen example was standard AD. The audio description snippets were able to fit into the natural pauses within the video. And since there was no dialogue, there was a lot of space to insert the descriptions without interrupting the flow of the scene.

Extended audio description allows you to add pauses to the video to make room for descriptions as needed. So if content is packed with dialogue, then extended is a great option. And it’s also useful for more dense and complex content such as lectures or dialogue-heavy presentations.
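The difference between standard and extended description can be sketched in a few lines of Python. This is only an illustration, not 3Play Media’s method: the `Description` class, the `plan_descriptions` function, and the timing values are all hypothetical, and the sketch assumes each description starts in a gap rather than in the middle of a dialogue segment.

```python
from dataclasses import dataclass


@dataclass
class Description:
    start: float     # seconds into the video where narration should begin
    duration: float  # seconds the narration takes to speak


def plan_descriptions(descriptions, dialogue, video_length):
    """Return (description, mode, pause_seconds) for each description.

    dialogue is a list of (start, end) times, in seconds, of spoken segments.
    """
    plan = []
    for d in descriptions:
        # The natural pause ends when the next dialogue segment begins,
        # or at the end of the video if no dialogue follows.
        upcoming = [s for s, _ in dialogue if s >= d.start]
        gap_end = min(upcoming, default=video_length)
        available = gap_end - d.start
        if d.duration <= available:
            # Standard AD: the narration fits in the natural pause.
            plan.append((d, "standard", 0.0))
        else:
            # Extended AD: pause the video for the time that does not fit.
            plan.append((d, "extended", d.duration - available))
    return plan
```

A description that fits inside the natural pause is labeled standard. One that runs longer is labeled extended, along with the number of seconds the video would need to pause.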

So now let’s talk a bit about how to create audio descriptions. The first option is a more proactive solution. And you can narrate at the time of the recording. So for example, in a recorded lecture, the professor can describe the visuals on the slides as they present. And this allows you to eliminate the need to go through and add AD in post-production. But if you can’t do this, then there are some other solutions.

So you can create a text-only description, writing down all of the relevant visual information in the video and making the text available to viewers. However, it’s important to note that this method loses any cinematic effect for the viewer and doesn’t truly offer an equitable experience for blind and low vision viewers.

If you created a text description and have good recording equipment and video editing software, then you can record your own voice descriptions or use synthesized speech and then merge this with your source audio and output a second video with AD. And then lastly, there’s the option to outsource to a professional description center.

When it comes to creating audio descriptions, quality really matters. The Described and Captioned Media Program provides helpful guidelines and standards to follow for audio description. From the DCMP, we learn what to describe, when to describe, and how to describe to create great descriptions. It’s a great resource to check out and reference whether you’re making your own descriptions or outsourcing. And we can drop the link to the DCMP in the chat.

So the DCMP has five main measures for quality. According to the DCMP, a quality description is accurate, meaning there must be no errors in the word selection, pronunciation, diction, or enunciation. It must be prioritized, meaning that the description should narrate what is essential to the intended learning and enjoyment outcomes. It must be equal. And to create an equal viewing experience, the meaning and intention of the program must be well conveyed.

It must be appropriate, meaning the description should consider the intended audience, be objective, and seek simplicity. And it must be consistent, meaning that both the description and the voicing, whether that’s through synthesized speech or a professional voice artist, should match the style, tone, and pace of the program. So a great example of consistency is the Frozen example that I showed earlier where the audio description really matched the whimsical nature of the [INAUDIBLE].

So once you have your descriptions created, how can you publish them? The first option is to upload the audio-described MP4 track to your host video platform if it supports it. And this is one of the more user friendly ways to publish AD since it allows viewers to toggle the description on and off. However, there’s limited player compatibility for in-player audio description tracks so this option is not always possible.

If your player doesn’t support in-player AD, you can publish one video without the description and one video with the description burned into the track. And so this is like the Frozen example. And a helpful comparison is closed versus open captions where open captions are burned into a video and closed captions can be toggled on or off by the viewer. And then the third option is to have the MP4 file on hand and to provide it when someone requests it, or to host it directly on your website for viewers to access.

And then with 3Play, there’s actually a fourth option with our ACCESS player, which is a fully accessible media player that works with the existing media player of your choice. The ACCESS player integrates additional accessibility capabilities with your media player that enable the user to search and interact with the time-synced transcript and also listen to an audio description track. So you’re able to add AD without having to republish the video.

And I just want to note that the ACCESS player is currently in a pre-release stage so while it is ready for use, you won’t find it on our website yet. But if you’re interested in getting started with the ACCESS player, you can reach out to your account manager or sales at 3PlayMedia.com if you’re not yet a customer.

So what’s the benefit of audio description and why should it be a priority? The most important benefit for AD is that it provides equal access for blind and low vision viewers. In 2018, the National Health Interview Survey found that 32.2 million adult Americans, about 13% of the population, have trouble seeing even with corrective lenses. So audio description is a critical accommodation for these viewers to have access to video content, entertainment, and really information in general.

And then besides accessibility, there are also other benefits to audio description. Audio description provides flexibility to view videos in eyes-free environments or in situations where someone is unable to look at the screen 100% of the time. Audio description also helps to increase focus for viewers as we all tend to miss important visual cues when looking at screens for extended periods of time. And AD can also help improve brand image for companies. Prioritizing accessibility and inclusive design won’t go unnoticed by consumers. And in fact, a 2018 study showed that two-thirds of consumers prefer to purchase from brands that stand for something important, such as equal access. And then finally, you may be required by law to provide audio description, which I will talk about in the next slide.

So in terms of legal compliance, there are multiple laws that impact audio description. And although many of these laws don’t explicitly mention video accessibility since they were created years before today’s technological advancements, case law has shown that these laws have strong backing for audio description. So the Rehabilitation Act of 1973 was the first legislation to address equal access for individuals with disabilities. Section 504 applies to federal programs and federally funded programs which must make their content accessible. And Section 508 applies to federal programs and can be applied on a state level. And 508 references WCAG 2.0 level AA, which requires audio description.

Next, we have the Americans with Disabilities Act, which prohibits disability discrimination and requires auxiliary aids for effective communication, which means providing services like AD. Title II covers government entities so the content and materials they offer must be accessible. And Title III covers places of public accommodation. And under Title III, some precedent has been set that the ADA may apply to websites as well as physical locations. So [INAUDIBLE] example for this is the American Council of the Blind versus Netflix lawsuit, which required Netflix to provide audio description for many of its programs.

And then the third major accessibility law in the US is the 21st Century Communications and Video Accessibility Act. The CVAA makes sure that accessibility laws enacted in the 1980s and ’90s are brought up to date with 21st century technology. And the CVAA started phasing in AD requirements in 2010 for some of the largest television markets. And as of 2020, the CVAA provided the FCC with authority to expand its AD requirements through 2024 so you can expect to see more description requirements in the future.

And last, we have the Federal Communications Commission, which enforces description requirements for broadcast television and online video that previously aired on broadcast TV.

And now let’s briefly talk about WCAG, which is the international set of guidelines that helps make digital content accessible. The Web Content Accessibility Guidelines outline best practices for making web content universally perceivable, operable, understandable, and robust. Now there are three levels of accessibility standards, A, AA, and AAA, with AAA being the highest level of accessibility and AA being what most organizations aim to achieve.

Audio description is required under WCAG 2.0 level A guidelines for pre-recorded synchronized video. Under level A, you can provide an AD track or full text alternative. And for AAA, you must provide extended descriptions as necessary. And also please note that WCAG 2.1 is the most recent published update and provides the most inclusive and mobile-friendly guidelines.
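The way the WCAG levels build on each other for audio description can be written out as a small lookup. This is a sketch for illustration only: the success criterion numbers and names are those of WCAG 2.0 Guideline 1.2, while the `AD_CRITERIA` table and the `criteria_for` helper are hypothetical names, not part of any WCAG tooling.

```python
LEVELS = ["A", "AA", "AAA"]

# WCAG 2.0 success criteria that concern audio description.
AD_CRITERIA = [
    ("1.2.3", "A", "Audio Description or Media Alternative (Prerecorded)"),
    ("1.2.5", "AA", "Audio Description (Prerecorded)"),
    ("1.2.7", "AAA", "Extended Audio Description (Prerecorded)"),
]


def criteria_for(target_level):
    """Return the AD criteria to meet for a target conformance level.

    Levels are cumulative: AA includes A, and AAA includes both.
    """
    if target_level not in LEVELS:
        raise ValueError(f"unknown WCAG level: {target_level}")
    cutoff = LEVELS.index(target_level)
    return [c for c in AD_CRITERIA if LEVELS.index(c[1]) <= cutoff]


for number, level, name in criteria_for("AA"):
    print(number, level, name)
```

Because the levels are cumulative, an organization aiming for level AA would need to meet both 1.2.3 and 1.2.5. A text alternative can satisfy 1.2.3, but 1.2.5 calls for an actual audio description track.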

So that concludes a high level overview of audio description. I hope that you learned something and found it helpful. Before we wrap up, I quickly want to talk about who 3Play is and what we offer in the media accessibility space. So here at 3Play, we want to help you create compliant, accessible, and engaging media. And we offer a range of services to help you do so from closed captioning to live captioning, subtitling, dubbing, translation, audio description, and more.

Our goal is to provide a future-proof solution to make accessibility easy, flexible, and scalable. Our customers can upgrade their services at any time. So if you come to us only needing captions but then decide that you need audio description later on, you can easily add that. And we also have a dedicated support and account management team who can help you reach your goals. They can talk about your account strategy and just serve as advisors for your success.

And one of the big things about 3Play is that we provide you with flexibility. So we work with many different industries. We understand that every company has different needs. So we can help accommodate numerous workflows, turnarounds, and formats. And we also offer a lot of really great free resources. On our website, you’ll find weekly blogs, ebooks, checklists, and research studies. And then we also have a ton of monthly webinars and a monthly podcast called Allied, which features a different accessibility expert each month and covers a range of topics.

Great. And we are almost out of time, so I apologize for that. Maybe I can get to one question. OK. So there’s a great question about how to go about choosing whether to use synthesized speech or human voice.

Now there are a lot of considerations here. First off, you’ll want to consider your audience, your budget. Some people prefer listening to a human voice while others don’t mind synthesized speech. But voice artists are definitely more expensive.

So if synthesized is all that you can afford, then it’s definitely better to offer this than nothing at all. And it also depends on what kind of content you’re creating. So synthesized speech can work great for online video and for non-cinematic content while voice artists are often preferable for cinematic content like movies, television shows, and other media and entertainment content.

So unfortunately, I think that’s all that we have time for. I just want to thank you all for being here today. And I hope you have a great rest of your day.