
Closed Captioning with 3Play Media [TRANSCRIPT]

SOFIA LEIVA: Thank you, everyone, for joining us today for the webinar entitled “Closed Captioning with 3Play Media.” Presenting today is myself. My name is Sofia, and I work on our marketing team here at 3Play, and then I also have my colleague, Ryan, who is part of our implementation team. And he’ll be here to help answer your questions at the end.

On the agenda today, we are going to talk about what captions are. So we’ll look at all the basics of how to create them, the laws around them, the benefits, how to publish them. We’ll briefly chat about who 3Play Media is, and then we’ll leave plenty of time at the end to answer all your questions. All right, let’s dive right into what captions are.

We’ve probably all seen the CC icon on most video players that indicates that a video has captions. And essentially, captions are time-synchronized text that you read along with the audio. They were created as an FCC mandate in the 1980s as an accommodation for deaf and hard of hearing individuals. And the main thing around captions, instead of transcripts or subtitles, is that they include non-speech elements. So for example, they’re going to include the sound of a car, or if keys are jingling off screen. All those elements are essential to having an equal experience for those who are deaf or hard of hearing while they’re watching a movie with captions.

It’s important to distinguish between captions, subtitles, and transcripts. So captions, like I mentioned, are going to assume that the viewer can’t hear the audio, and so they’re an accessibility accommodation. Subtitles, on the other hand, are just going to translate the audio into another language. So subtitles aren’t going to include the non-speech elements like the sound of a car or keys jingling. And then transcripts are just a plain text version of the audio, and it’s not time-coded. Around the world, you’ll hear people use subtitles synonymously to captions, but here in the United States, closed captions are the accessibility accommodation and subtitles are a translation.

There are many ways for you to create captions. Number one is to do it yourself. And this is probably going to be very tedious. You are going to be transcribing every single word in the audio by yourself and adding the time codes. So you can imagine that that’s a very hefty thing to do. So you can bring the use of automatic speech recognition or automatic captions to help you be more efficient and have a more accurate transcript.

This is going to be, for example, like YouTube automatic captions, and then what you would do is you would go back and edit that transcript for maximum accuracy. And we use a similar process here at 3Play, which brings us to the third option: you can use a captioning vendor. This is ideal if you have a lot of content, if it’s time-sensitive, and maybe if you have special requirements and you don’t have a team that can create captions for you.

It’s important to talk about caption quality. If you’re shopping around for captions, you’re going to hear a 99% accuracy rate. And what exactly does that mean? This means that in a transcript of 1,500 words, there’s going to be 15 errors total. So it leaves a 1% chance of error per word. And you can only achieve 99% accuracy with human intervention.
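To make that arithmetic concrete, here is a minimal Python sketch. The function name `allowed_errors` is made up for illustration, and it assumes accuracy is measured as a simple per-word rate; a vendor may count errors differently.

```python
def allowed_errors(word_count: int, accuracy: float) -> int:
    """Return roughly how many word errors a given accuracy rate allows.

    Assumption: accuracy is a plain per-word rate between 0 and 1.
    """
    # Round to the nearest whole word to absorb floating-point noise.
    return round(word_count * (1 - accuracy))


print(allowed_errors(1500, 0.99))  # 15
print(allowed_errors(1500, 0.80))  # 300
```

At the 80% figure often quoted for automatic captions, the same 1,500-word transcript would carry about 300 errors, which is why the editing pass matters.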

Automatic softwares aren’t at that accuracy level currently. They are more around 80% to 90%. And you’ll never achieve 100% accuracy rate, just because you need to have that little leeway for human error. But you can get close to it for sure.

It’s also important to talk about placement. So for example, captions are usually in the bottom center of the screen, but they should be moved if they’re obstructing important visual elements. So think of a documentary, where you have the name and title of the person talking on screen. If the captions are obstructing that, ideally you would want to move them away.

Then we have the question of will you do it verbatim or clean read. And this really comes down to the type of content that you’re creating. If you’re doing a scripted show, you want to do it verbatim. So in your captions, it would be important to include any “uh”s or “um”s that the speaker may say out loud. And then for clean read, this would be more for a lecture, where the “um”s and “uh”s can be really distracting, and you just want to provide a very clean read for the viewer.

Then we have the frame requirements. Typically, you do one to three lines with 32 characters per line. And they should last a minimum of a second on the screen to give plenty of time to see and read the captions.
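As a rough illustration, those frame requirements could be checked in code. This is only a sketch: the `CaptionFrame` class and `check_frame` function are hypothetical names, and the limits are simply the numbers quoted above.

```python
from dataclasses import dataclass

# Limits taken from the talk: one to three lines, 32 characters per line,
# and at least one second on screen.
MAX_LINES = 3
MAX_CHARS_PER_LINE = 32
MIN_DURATION_SECONDS = 1.0


@dataclass
class CaptionFrame:
    start: float      # seconds from the start of the video
    end: float        # seconds from the start of the video
    lines: list[str]  # the text lines shown together on screen


def check_frame(frame: CaptionFrame) -> list[str]:
    """Return a list of human-readable problems; empty means the frame passes."""
    problems = []
    if not 1 <= len(frame.lines) <= MAX_LINES:
        problems.append(f"expected 1 to {MAX_LINES} lines, got {len(frame.lines)}")
    for number, line in enumerate(frame.lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(
                f"line {number} has {len(line)} characters (max {MAX_CHARS_PER_LINE})"
            )
    if frame.end - frame.start < MIN_DURATION_SECONDS:
        problems.append("frame is on screen for less than one second")
    return problems


frame = CaptionFrame(0.0, 2.5, ["Thank you, everyone,", "for joining us today."])
print(check_frame(frame))  # []
```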

In terms of style requirements, you want to use a non-serif font. So one like we’re using here in the presentation, where it doesn’t have those serifs– little symbols– that serif fonts usually have. A non-serif font is just much easier on the eyes to read.

And then lastly, it’s important to mention some of the standards that there are for caption quality. There is one called the Described and Captioned Media Program. And this one basically gives you basic guidelines for how you should create captions, how you should include things like speaker identifications, non-speech elements. These are general guidelines and they’re really helpful if you’re creating your own captions.

But then if you’re in entertainment, you would want to adhere to the FCC standards of quality. And these ones basically cover things like your captions must be complete, they must be readable, things like that. And then WCAG standards we’ll cover in a little bit.

All right, so you’ve created your captions, and now it’s time to actually publish them. The most common method for publishing captions is going to be with a sidecar file. So essentially what this means is it’s a separate caption file that you download and then upload into the video player alongside your video. One of the most common ones is going to be something called an SRT file. And I’ll share a link to a blog that talks more about what that is in a little bit.
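For readers who have not seen one, an SRT file is plain text: a running number, a start and end time, and the caption text, with a blank line between captions. The sketch below builds that layout in Python. The helper names `srt_timestamp` and `to_srt` are invented for this example, and the cues are sample data, not anything 3Play produces.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp that SRT files use."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Turn (start_seconds, end_seconds, text) tuples into SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        # Each block: sequence number, time range, caption text.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Blocks are separated by a blank line.
    return "\n".join(blocks)


sample_cues = [
    (0.0, 2.5, "Thank you, everyone, for joining us."),
    (2.5, 4.0, "[KEYS JINGLING]"),
]
print(to_srt(sample_cues))
```

Note that the second sample cue is a non-speech element, which is what separates captions from subtitles.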

The encoded captions are going to be perfect for offline video. So I would think things like DVDs or kiosks. And these allow the user to turn the captions on or off.

If you’re looking to provide captions for social video– for example, like Instagram, where you can’t necessarily upload a caption file– then you would want to use open captions. And these are actually burned into the video, and the user can’t turn them off or on.

And then lastly, there is integrations. This is mainly if you’re working with a captioning vendor. It’s just going to be a workflow that allows you to automatically publish the completed captions back to your video.

Now I want to talk about the benefits of captioning, because there is definitely more than one. Number one is obviously going to be accessibility. There are 48 million Americans with hearing loss, and 360 million people around the world. And so adding captions to your videos makes them accessible to everyone.

There is also the SEO. So SEO stands for Search Engine Optimization, and this is really helpful if you’re trying to rank in Google. You can actually use captions to help your videos rank better. So for example, Googlebots can’t crawl a video, so having captions or a transcript actually allows them to know what the video is about, and they can crawl that and then rank your video accordingly. And a study by Facebook found that there is a 135% greater organic traffic for videos with captions.

Then there’s the branding side. Branding is definitely very important nowadays, and captions can really help your brand look good because essentially you’re saying, I want to be inclusive, but also it provides a better user experience for the people who are consuming your content. And in a study by the Journal of the Academy of Marketing Science, they found that captions improve brand recall, verbal memory, and behavioral intent.

Then there’s the comprehension and focus side of captions. So if you’re in the education space, or if you create a lot of educational content, this is going to be really valuable. A study by the University of South Florida St. Petersburg found that 98.6% of students find captions helpful, and many of them are actually using transcripts as a study guide. And then 65% of students use captions to help them focus. So you can think about if you have a really complicated lecture, captions are helpful to sort of know what the speaker is saying and really understand what the content is about.

And then lastly, we have the engagement side, which I mentioned a little bit earlier about improving the user experience. Captions– in a study by Facebook, they found that 41% of videos are incomprehensible without sound or captions. So really, having captions in your video will help people focus and engage, will possibly prevent them from scrolling past your video if it’s not captioned, and allow them to have the flexibility to view it wherever they are.

All right, let’s dive into the accessibility laws. And I’m going to preface this by saying that I am not a lawyer, so I would definitely consult your legal counsel to really understand how these laws apply to you. But I will provide just a brief overview of the laws and what you should know. So the Rehabilitation Act of 1973 is going to apply to anyone who receives federal funding or runs federally funded programs. So think if you’re a university or an educational institution.

Section 504 is a broad anti-discriminatory law that requires equal access for individuals with disabilities, and then Section 508 requires federal communications and information technology to be made accessible. Section 508 actually references something called the Web Content Accessibility Guidelines, or more commonly known as WCAG. And I’ll cover those in the next slide, but that is something that if you need to adhere to Section 508, you would need to adhere to those guidelines.

Then we have the Americans with Disabilities Act, or more commonly known as the ADA. Traditionally, it’s been applied to physical environments. So for example, you would need to provide a ramp or an elevator for wheelchair use. But we’re seeing more and more it being applied to the online space, particularly in the last couple of years. You are going to want to care about Title II and Title III. So Title II is going to apply to public entities, and Title III applies to private entities.

And Title III in particular is going to mention something called a place of public accommodation, which traditionally, like I said, would be something like an airport, but due to precedent, it’s now being applied to online environments. So for example, Netflix was sued a couple of years ago because they didn’t closed caption their content, and the courts actually said that because you can enjoy Netflix wherever you are, it actually is considered a place of public accommodation and therefore does need to adhere to Title III of the ADA and provide closed captions.

Then we have the CVAA. This is mainly if you are in entertainment. This requires any content that previously appeared on television with captions to also have captions when you publish it online. And it also has requirements for something called audio description, which is an accommodation for blind or low-vision users, and those started to phase in last year.

And then lastly, we have the FCC, as I mentioned briefly. This one, again, is for broadcast, and it has specific caption quality standards that talk about things like the synchronicity of the captions, the completeness, and the placement.

So I mentioned WCAG earlier, or Web Content Accessibility Guidelines, and essentially these are guidelines to make web content more accessible. And they cover everything from your website to your videos, even to mobile accessibility. There are several versions of WCAG. So the one that is more commonly referenced in laws is going to be WCAG 2.0, but I believe now they have a newer version, WCAG 2.1, and when you’re trying to make your content accessible, it’s always best practice to look at the more recent ones because they’re going to be more up-to-date with the way the world is functioning right now.

So WCAG has three levels of accessibility compliance, and each one requires a little bit more work or dev work to achieve. Level A, for example, in terms of video accessibility, is going to require transcripts for audio-only content and captions for pre-recorded video. Level AA is going to require captions for pre-recorded video and captions for live. And then level AAA is going to require a sign language track and also live transcripts for audio-only content.

All right. Now, briefly, I’m going to talk about who we are here at 3Play and then share some free resources that you can find on our website. So here at 3Play, we really want to help you create compliant videos, and we offer a range of services from closed captioning to live automatic captioning, subtitles, and audio description.

And our goal here is really just to make it easy and flexible and scalable for you. We allow you to upgrade your services any time. So if you come to us only needing captions, but then down the line you need audio description, you can definitely easily add that on. And we also provide you with an account manager who helps you reach your goals and talk about your account strategy.

And one of the big things about us is that we provide you with a lot of flexibility. So we can help accommodate numerous workflows, turnarounds, formats. Anything that you need, we can help you with.

We also offer a lot of really great free resources, and I’m going to send the link here in the chat. You’ll find weekly blogs, free white papers, checklists, and research studies all on our website in that link that I just sent. We also have tons of monthly webinars, which are free, and I’ll send a link here in the chat and share about them in the next slide. And then we also have a video accessibility course, which will go more in-depth into the things that we talked about here, and it’s free and you’re able to test your knowledge.

All right, some upcoming free events. So this webinar is part of a three-part series. Next week we’re going to be talking about audio description, so if you’re interested in learning more about that I would encourage you to sign up, and then we’ll also be talking about live captioning the following week. And then if you would love to celebrate Global Accessibility Awareness Day with us, or GAAD, we’re going to be doing a special session with actor Mickey Rowe, who is an autistic actor and one of the first actors with autism to be featured on Broadway, and then activist Michael Agyin, who is a deaf advocate and has done some really great work and has an awesome TED Talk I encourage you to check out.

And then we also are hosting another ACCESS at Home event. This is basically a free conference, and this one will last three days, with two sessions per day, and it’s focusing on post-pandemic accessibility. So we’ll have a lot of really great people there, and it’s free for you to register. And I’ll throw the link here.

Well, I already see a lot of great questions coming in, and I believe my colleague has gotten into some of them. But we encourage you to keep asking them and we’ll try to get to as many as possible. Cool.

So the first question we have here is, “how do Googlebots crawl videos? Does it matter if the CC is in the file or does it need a sidecar file?” So essentially, Google cannot crawl a video. It can look at the title of the video, but you would need to have something called JSON-LD, which basically injects the metadata into the top of the page, and so then the bots can crawl that. And within that, you can include your captions or any keywords that would allow the bots to know what the video is about and how to rank it accordingly.
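As a hedged illustration of what that metadata might look like, the Python sketch below builds a JSON-LD script tag using the schema.org `VideoObject` type, which has a `transcript` property. The function name `video_json_ld`, the sample values, and the example.com URL are placeholders, and nothing here guarantees how any search engine will use the data.

```python
import json


def video_json_ld(name, description, upload_date, content_url, transcript):
    """Build a <script> tag holding schema.org VideoObject metadata."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "uploadDate": upload_date,   # ISO 8601 date string
        "contentUrl": content_url,
        "transcript": transcript,    # plain-text transcript of the audio
    }
    payload = json.dumps(data, indent=2)
    # Escape "</" so transcript text can never close the script tag early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(
    video_json_ld(
        name="Closed Captioning with 3Play Media",
        description="A webinar on captioning basics.",
        upload_date="2021-05-01",                        # placeholder date
        content_url="https://example.com/webinar.mp4",   # placeholder URL
        transcript="Thank you, everyone, for joining us today.",
    )
)
```

The resulting tag would go in the HTML of the page that hosts the video.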

We have here a question around captioning PowerPoint presentations. Ryan, are you familiar with this?

RYAN MARTINEZ: Yeah, so that’s a great question. I was actually just typing it out. And thanks so much for everyone submitting these questions. They’re awesome. So in the case of PowerPoint, when there is visual information on screen and that text can be read– captions relate specifically to words that are spoken aloud so that those words are actually transcribed and available to someone who is deaf or hard of hearing and wouldn’t otherwise be able to hear those words that were spoken, whereas for the text on screen, both for captions and the slides themselves, the assumption is that the person who’s reading the captions can also read what is on-screen on those slides.

In cases where somebody is blind and not able to read that embedded text, that is a valid concern, and it definitely comes into play with WCAG 2.0 AA in terms of that standard, where most content does, in fact, need captions and audio description. So in cases where you upload a video that is mostly slides, the on-screen text would actually be read back aloud so that somebody who is deaf or hard of hearing– excuse me, blind or low vision– could actually hear what is just included as words on the screen.

So there are different use cases for captions versus audio description, which was not the subject of today’s presentation. Captions, though, do deal specifically with writing out visually what was spoken in a piece of content.

SOFIA LEIVA: We had another question, which you typed an answer to but would be good to say aloud: can you clarify what integrations are and how they work?

RYAN MARTINEZ: Absolutely. So there are a number of different integrations that we do offer, and I’ll go ahead and include that as well in the chat just so folks can find that easily. But we do offer over 40 platform integrations that, in most cases, allow for you to upload content directly from your video platform with the option to configure that integration so that 3Play can actually automatically post back the appropriate format to that video.

So if you have content hosted on YouTube, for example, you would log into 3Play, enter your YouTube username and password. Your media library would then be visible in 3Play. So you could check boxes of what you wanted to order, and once that work is done, we take care of delivering that captioned file right back to YouTube so that your audience can simply click CC and view the captions that 3Play has produced.

But we offer a number of different integrations. I mentioned YouTube. That’s just one example, but most of them function in very similar ways.

SOFIA LEIVA: Great, thank you. And we had another question around speaker identifications. How exactly do those work?

RYAN MARTINEZ: Oh, sure. Great question. So speaker identifications work by allowing our editors– in some cases, you can choose a label of preference if you have– but we do have the option to allow our human editors to choose a speaker label that is most appropriate based on the type of content that was uploaded. So speaker labels is just a setting within your 3Play account where you can turn that on or off.

It’s also very easy to add speaker labels once our work is done. So if you prefer to leave that up to your own internal QA process, for example, you can add those labels in after the fact. But if you choose one of the labeling options that we have, all of that is taken care of during the captioning process. So by the time the file is delivered back to you and ready for download, those speaker labels are included.

SOFIA LEIVA: Great. And then we have one here– if you’re able to answer this– can screen readers read the closed captions of the video, and how exactly does that work?

RYAN MARTINEZ: That’s something that we would have to get back to you on. I’m not sure if there are specific requirements for specific video players, for example. What I can tell you is that when we post a closed caption file back to a video platform, that closed caption file is posted in a very standard format. So you may hear things like SRT or WebVTT, these various caption file formats kind of thrown around.

That format is very standard. So when the file gets posted back to YouTube, most screen reader technology looks for specific metadata or specific formats to be able to basically isolate that and read that back to the end user. So I know that sort of functionality depends on the video platform that the content is hosted in, but the formats that we post back to those players are compatible with screen readers. I just would have to understand the logistics of that a bit more.

But if you would like, you can certainly either privately or in the chat there, you can throw in your email address and I’d be happy to circle back and follow up with you on that.
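The two formats named above, SRT and WebVTT, are close relatives. As a simplified sketch, converting one to the other mostly means adding a `WEBVTT` header line and swapping the comma before the milliseconds for a period. The function name `srt_to_webvtt` is hypothetical, and this version assumes a well-formed SRT file with no styling tags or positioning data.

```python
import re

# Matches an SRT timestamp such as 00:00:02,500 and captures both halves.
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_webvtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT (assumes clean, unstyled input)."""
    lines = []
    for line in srt_text.strip().splitlines():
        if "-->" in line:
            # Only timing lines are rewritten; caption text is left alone.
            line = _SRT_TIME.sub(r"\1.\2", line)
        lines.append(line)
    # WebVTT requires the header line followed by a blank line.
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


sample = "1\n00:00:00,000 --> 00:00:02,500\nHello\n\n2\n00:00:02,500 --> 00:00:04,000\n[KEYS JINGLING]\n"
print(srt_to_webvtt(sample))
```

The SRT sequence numbers are kept because WebVTT allows an optional identifier line before each cue.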

SOFIA LEIVA: Great, thank you. And another question we have here is, how many languages of captioning do you provide?

RYAN MARTINEZ: So quite a few. So traditionally– or at least I guess when we first rolled out our transcription and captioning service many years ago, English was the focus. Of course, fast forward almost 20 years, and things have changed and folks are translating content more and more than they ever have. And so 3Play offers translation into over 40 different languages.

So if your content, for example, is spoken in English but you need that same piece of content to be available around the world, we have many international customers that translate content on a regular basis. But we also offer transcription services if your file happens to be spoken in a language other than English. So if you have a Japanese file, for example, folks are speaking in Japanese, we can transcribe content in that language as well.

SOFIA LEIVA: Great, thank you. We have another question here. If the Zoom auto captioning is on, does that meet the caption requirement for the webinars? And essentially, there are no caption requirements in terms of the accuracy for your captions and in terms of a webinar, but auto captions are notoriously inaccurate and they can really disrupt the experience for users. And essentially, if you have viewers who are deaf or hard of hearing, it can be really inaccessible for them. So usually for live events like this, and even for a pre-recorded video, we do recommend that you use an accurate service or something like live transcribers or live stenographers for a live webinar like this.

We have a question here, “how can I get started with 3Play?”

RYAN MARTINEZ: Yeah, so the easiest way to get started, if you’re looking to do it through the 3Play Media website, if you go to 3play.com, and then at the top of– I should say– I should actually read that correctly. 3playmedia.com. If you go to that web page– and we can throw that in the chat as well– but there is a Getting Started link at the top. It’s a pink button that says “Get Started.” And if you click that, that will allow for you to choose– we have a couple of different tiers to choose from– and that will walk you through that entire process.

Or if you wanted to speak with somebody directly, there is a chat widget within our account system. So same thing. From the main page, in the bottom right, you’ll see the option where you can chat someone at 3Play. Or you could start by emailing sales@3playmedia.com, and somebody can assist you directly.

SOFIA LEIVA: Well, I believe that’s all the time we have for today. Thank you, everyone, for joining us today, and thank you, Ryan, for joining us here for the Q&A.