Toolkit for Live Captioning Events [TRANSCRIPT]

GEORGIA MCGOLDRICK: Thanks, everyone, for joining this webinar entitled Toolkit for Live Captioning Online Video. I’m Georgia McGoldrick from 3Play Media, and I’ll be presenting today. So let’s begin.

On today’s agenda, we’ll cover the following topics. The first question we’ll cover is what are live captions, followed by strategies for streaming remote events and meetings, some best practices for streaming live captions, what do you do with your live captions after your event has concluded, why you should live caption, and who is 3Play Media. Then, finally, we’ll finish off with a Q&A at the end.

So what are live captions? Live captioning is much different than closed captioning, which is typically used for prerecorded video. Live captioning is for events happening in real time– for example, this webinar or a meeting, fitness classes, conferences, or online learning environments. Live captions ensure that all of your live events are accessible to the deaf or hard-of-hearing community as well as make your content more engaging.

Live captions are created in various ways. The two main options would be through automatic speech recognition technology or by human stenographer. There might be a slight delay in live captioning as ASR technology is processing the words being spoken or the stenographer is typing. It all depends on the platform.

When talking about live captioning, there are some important terms to know. For example, CART, C-A-R-T, which stands for Communication Access Real-Time Translation. And this is a service for live captioning that involves humans or stenographers, or captioners, as they’re typically called. And this is mostly done remotely, although there are ways to do this on site as well at your live events.

Next we have ASR. This stands for Automatic Speech Recognition, and this technology is used for live automatic captioning solutions.

Now let’s talk about streaming remote events and meetings. This is clearly a topic that’s been coming up lately with growing concerns over the coronavirus. For example, this is remote right now, so this is a heavy topic here.

So it’s important to create a strategy for this. Before going remote, it’s important to have that plan in place. You’ll want to give ample notice to any potential meeting attendees. You’ll need to choose a platform that’s accessible for all attendees and make sure to host a training so everyone is in the loop on how everything works and what the expectations are.

You can give notice a few different ways. You can send notification reminders. Check your time zone compatibility with others. Make sure attendees are aware of the software you’re using and what they need to join.

You can also create and publish a public calendar that people can subscribe to, including meeting credentials on the meeting descriptions, and that makes it easier for people to join.

You’ll also definitely want to test your live stream to make sure the audio and video feeds are high quality. This is very important. The connection affects the stream quite heavily. And also have your attendees test their connections and alert you if they cannot hear you or see you.

Choosing a platform is critical. There’s a number of requirements you should look for, including whether or not the platform fits into your workflow. Some other things to keep in mind are, is it accessible? Can you live caption? Is it screen-reader and keyboard accessible? Is it secure? How do they use your data? How do they use your meeting attendees’ data? Is it interactive? Can attendees ask questions, raise their hands, answer polls? So these are all things to consider when looking at a number of different platforms.

Hosting a training is the best way to get people started and also answer any outstanding questions before you scale this up. Some things you can do are walkthroughs. This can act as a dry run to make sure everything works as designed. Also engagement tactics– things like handouts, polling, chat windows, Q&A, hand raising– are all ways to engage your audience. And finally, live captioning should also be covered in the training to make sure it’s working and everyone knows how to turn the captions on or off.

We have another great webinar coming up that kind of dives into this a bit more. It’s entitled How to Create Accessible Presentations. This is coming up on Thursday, March 26 at 2:00 PM Eastern Time. And again, we will send out this presentation after the fact, so you can register using that link there, which is bit.ly/2Q3ZqSV.

Let’s talk about quality of live captioning. For closed captioning, you’ll often hear about the 99% accuracy rate. And again, when I talk about closed captioning, that is that prerecorded video content. And since live captioning is happening in real time, the accuracy rate can be lower here.

There are many factors that can affect the accuracy of your live captions. For example, live captioning tends to be verbatim, so “ums” and “uhs” can get in the way. In addition, many words are homophones, or words that sound like other words, which will result in a distorted meaning.

Live captioning accuracy can typically range from about 80% to over 90%, sometimes 95%, and it really depends on the external environment, like background noise, for example. Also, accuracy changes depending on whether you’re involving human transcriptions or solely using ASR, or Automatic Speech Recognition, technology.
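As a rough illustration of what an accuracy percentage can mean, here is a minimal Python sketch that scores a caption against a corrected reference transcript at the word level. It assumes accuracy is computed as one minus the word error rate (word-level edit distance divided by the number of reference words); the webinar does not say how the quoted figures are measured, so this metric and the function names `word_edit_distance` and `caption_accuracy` are assumptions for illustration only.

```python
def word_edit_distance(reference, hypothesis):
    """Count the word substitutions, insertions, and deletions needed
    to turn the hypothesis word list into the reference word list."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from caption
                            curr[j - 1] + 1,      # extra word in caption
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def caption_accuracy(reference_text, caption_text):
    """Return word-level accuracy in [0, 1], assumed here to be 1 - WER."""
    reference = reference_text.lower().split()
    hypothesis = caption_text.lower().split()
    if not reference:
        raise ValueError("reference transcript is empty")
    errors = word_edit_distance(reference, hypothesis)
    return max(0.0, 1 - errors / len(reference))


# Two homophone errors ("there" for "their", "too" for "two") in 8 words.
score = caption_accuracy("we will meet at their house at two",
                         "we will meet at there house at too")
print(f"{score:.0%}")
```

With two homophone substitutions in an eight-word sentence, the sketch reports 75%, which shows how quickly a few sound-alike errors pull the number down.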

Typically, humans tend to be more accurate, but they will miss more words, whereas ASR will be less accurate but not miss any words. Live automatic captions with ASR will also not have speaker identifications or include non-speech elements. So if music is playing, for example, closed captions would typically have “[MUSIC PLAYING]” in brackets, but live captions would not.

Latency is another element of live captioning quality. Since the live captions are being generated in real time based on the spoken dialogue within the video or audio, there’s always going to be a slight latency. It ranges depending on the quality of streaming equipment and overall connection, but on average, latency should be about three to five seconds.

Currently there are no governing bodies for live captioning, but certain states do have standards in place for the quality of live captioning in instances like live court reports or even sporting events, for example.

Let’s cover some general best practices for live captioning. These are both helpful if you’re using an automatic software or a CART service with human captioners. As I mentioned before, you definitely need a strong network connection based on the streaming equipment that you’re using. And it helps if you have a hardwired connection to the internet via an ethernet cable as opposed to using Wi-Fi since sometimes Wi-Fi can be quite unreliable.

You’ll want good audio quality. So we recommend investing in a microphone or a headset instead of relying on your computer or phone mic. Those built-in microphones can often pick up surrounding sounds and sound very distant or echoey. A good microphone will cost around $50.

Next, your surrounding sounds– so this is very important. In particular, with automatic live captioning, you have to be careful of your surroundings. A computer’s not as smart as a human to detect what is just background noise, so making sure you have little to no background noise is imperative. You can hang blankets around your room, for example, to help dampen the sound, or sound boards also help. And we also recommend you avoid echoey rooms.

Next best practice here is you’ll want a single speaker at a time, if possible, to avoid multiple speakers speaking at the same time. And lastly, clear speech and pronunciation is imperative for accuracy.

So how do I add captions to live events? Most meeting, streaming, and webinar platforms allow you to add external captioning to events. So in other words, you can incorporate captions generated by a third-party vendor for your live events.

Right now, 3Play has live automatic captioning integrations with players like Brightcove, YouTube, JW Player, and Zoom. However, this is expanding to other platforms as well.

What platforms do you host your live streams on? So I pose this question to the group here if you guys can chat to us in the chat box. Zoom, Facebook, Brightcove, WebEx, Adobe, Blackboard Collaborate. Great. These are all top platforms that we’ve definitely heard of before from some live customers, like Skype, Periscope, YouTube again. Great. Awesome.

So what happens if my live platform doesn’t support live captioning? Live captioning– again, it is growing a lot, especially lately, but it’s still underserved in some cases.

So if your video or meeting player doesn’t support live captioning, you can always link to an external URL that streams the live captions or transcripts. And to do this, you just need to add a snippet of code, code that includes the streaming captions or transcript output, and embed that code onto a page that you’re linking to. This works similarly to a plug-in embed solution.

So to do this, you just want to make sure to share out that URL with your audience prior to the event starting, and the player and the captions should be located on that page. 3Play offers this capability to copy the embed code of the live streaming captions for implementing on your external URL page.
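The external-page approach described above could be sketched roughly as follows in Python: a small function that writes out an HTML page holding both the video player and an embedded caption stream. This is only an illustration. The function name `build_caption_page` and the example URLs are hypothetical, and the real embed code would be whatever your captioning vendor gives you to copy.

```python
from html import escape


def build_caption_page(player_url, captions_url, title="Live event with captions"):
    """Return HTML for a page that shows the player and the captions together.

    Both URLs are placeholders for whatever embed sources your video
    platform and captioning vendor actually provide.
    """
    return f"""<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>{escape(title)}</title>
</head>
<body>
  <h1>{escape(title)}</h1>
  <iframe src="{escape(player_url)}" title="Live video player"
          width="640" height="360" allowfullscreen></iframe>
  <iframe src="{escape(captions_url)}" title="Live captions"
          width="640" height="200"></iframe>
</body>
</html>
"""


# Hypothetical URLs, used only to show the shape of the page.
page = build_caption_page("https://video.example.com/embed/live",
                          "https://captions.example.com/embed/abc?x=1&y=2")
print(page)
```

Giving each iframe a `title` helps screen-reader users tell the player and the captions apart, and the resulting page URL is what you would share with your audience before the event.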

So an example of this would be– obviously YouTube supports live captions, but, for example, if it didn’t and you had your YouTube Live event there, in the description of the video, which is usually located below the video player, you could include a link out to a separate page that has the captions or transcript on that page.

So what should you do post-event? So are live captions good enough? That’s such a great question. So can we stop there? Are they good enough? And accuracy, obviously, comes into play here.

And from our tests and research, live automatic captions range from 80% to over 90%, like I mentioned before. And the industry standard for post-production captioning, or closed captioning, is 99% measured accuracy. So if you want to provide a truly accessible solution, you’ll want to make sure that you edit the transcript and remove any errors. Also add speaker IDs and non-speech elements once the live event is over.

And in a recent State of Live Captioning report, we analyzed ASR software from various providers such as IBM Watson, Google, and Amazon, and we uncovered that while ASR accuracy rates are improving across the board, they’re still not a sufficient solution to rely on in post-production captioning. So you’ll still want to strive to get to that 99% accuracy rate.

The report also helped us conclude that humans are crucial for providing accurate captions, which is why, when you download your transcript after a live broadcast, you should take the necessary steps to correct it if you’re going to publish the recording after the fact.

There are a few different ways you can go to make sure your transcript or captions from your live events are cleaned up for post-production. So you can do it yourself or DIY it by editing yourself and converting file formats, but this tends to take time and resources to keep up with. You can also utilize a third-party service to do it for you. You just need to submit your recording and let it process through.
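For the do-it-yourself route, the file-format conversion step might look something like this Python sketch, which turns a cleaned-up list of timed transcript segments into a WebVTT caption file, including speaker IDs and bracketed non-speech elements. The segment structure (`start`, `end`, `speaker`, `text`) is an assumption made for the example; the format of a downloaded live transcript will depend on your vendor.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(segments):
    """Build WebVTT text from segments with start, end, text, and optional speaker."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        lines.append(f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}")
        text = seg["text"]
        if seg.get("speaker"):
            # A WebVTT voice tag carries the speaker identification.
            text = f"<v {seg['speaker']}>{text}"
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical edited segments from a recorded live event.
segments = [
    {"start": 0.0, "end": 2.5, "speaker": "GEORGIA MCGOLDRICK",
     "text": "Thanks, everyone, for joining."},
    {"start": 2.5, "end": 4.0, "text": "[MUSIC PLAYING]"},
]
print(to_webvtt(segments))
```

The output can be saved with a `.vtt` extension and uploaded alongside the recording on platforms that accept WebVTT captions.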

So why should you caption? Accessibility, also known as “a11y.” For the 11 letters between the A and Y in accessibility, we call that a11y for short.
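The abbreviation comes from counting the letters between the first and last letter of the word. A minimal Python sketch of that rule, with `numeronym` as an illustrative function name:

```python
def numeronym(word):
    """Abbreviate a word as first letter + count of inner letters + last letter."""
    if len(word) < 3:
        return word  # nothing between the first and last letter to count
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("accessibility"))  # a11y
```

The same rule gives "i18n" for "internationalization".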

And there are many reasons why you should caption. Obviously the biggest is accessibility. So there’s 48 million Americans with hearing loss, which is about 20% of the US population, and 360 million deaf or hard-of-hearing individuals around the world, so captions help make your content accessible to them.

Also, a good stat here is that 41% of videos are incomprehensible without sound or captions, which means that if someone doesn’t have headphones, they won’t watch your video, typically.

Facebook uncovered that 85% of Facebook videos are watched with the sound off. So if your video relies heavily on sound, a lot of people are probably scrolling past it if there are no captions. So video accessibility has tremendous benefits, also for improving SEO, the user experience, reach, and brand lift.

We included a study by Liveclicker that found that pages with transcripts earned an average of 16% more revenue than they did before transcripts were added. Also according to Facebook here, videos with captions have 135% greater organic search traffic.

So a research study from the Journal of the Academy of Marketing Science found that captions improve brand recall, verbal memory, and behavioral intent. So there are true benefits of captions beyond just accessibility.

Also, taking a look at the education space, video accessibility definitely benefits students, as noted here. So 98.6% of students find captions helpful. 65% of students use captions to help them focus, and 75% of students that use captions said they use them as a learning aid.

Let’s quickly go into some of the accessibility laws. So laws around captioning– we have the Rehabilitation Act of 1973, which was the first major a11y law in the US. It has two sections which specifically impact video accessibility. Section 504 is a broad antidiscrimination law that requires equal access for individuals with disabilities. And this applies to federal and federally funded programs. Also, section 508 requires federal communications and information technology to be made accessible. There is also a section 508 refresh which references Web Content Accessibility Guidelines, or also known as WCAG.

Diving into WCAG a little bit, there are three levels– level A, AA, and AAA. So depending on what your organization is striving for, level A is the easiest to maintain. Level AA is what most people are aiming for, sort of the mid-level here of standards. And level AAA is the most comprehensive highest-accessibility standard.

Most laws and lawsuits actually mention WCAG compliance. So for now, that’s what is legally required. Only if a law explicitly states that web developers have to adopt the newest WCAG version do you need to make your content WCAG 2.1 compliant.

So who is 3Play Media? So a little bit about who we are– we’re a video accessibility company based in Boston and spun out of MIT in 2007. We started out offering captioning, transcription, and subtitle services. We also offer audio description service for blind and low-vision individuals, and we’ve recently released our live automatic captioning solution.

We have over 5,000 customers spanning higher education, media, government, e-commerce, fitness, associations, and enterprise companies. So our goal is really just to make the whole captioning and video-accessibility process much easier. And we do that in a number of ways.

We have an easy-to-use online account system where you can manage everything from one place. We have a number of different options for turnaround, anywhere from a couple of hours to over a week– whatever fits your needs, really.

We have different video search plugins. I briefly touched on the plugin earlier. And we also have integrations for captioning and description that help simplify the process of creating accessible video. And really what we’re working toward is being a one-stop shop for captioning, description, transcription, and subtitling, and video accessibility as a whole.

So just quickly about our live automatic captioning solution– this allows you to schedule automatic captions for live streaming events and meetings. The solution uses 3Play’s automatic speech recognition technology and can be implemented through iframe or JavaScript embeds or natively within select video players.

All right, so now we can dive into some questions here. If you want to submit some of your questions in the Q&A box, we’ll get started in a minute.

All right, so let’s dive in. First question here is, “How do you get started with 3Play’s live captioning?” So to get started with that, you can feel free to contact us at livecaptioning@3playmedia.com. That’s an email alias that includes everyone on our team that is involved in this process, and that can get you up and running as quickly as possible. So again, that email is livecaptioning@3playmedia.com. You can also visit our website to learn more, and that specific page is 3playmedia.com/solutions/services/liveautocaptioning.

“Does your live captioning service integrate with any video platforms?” Yes. So we integrate with top live streaming video and meeting platforms such as YouTube, Zoom, Brightcove, and JW Player. I alluded to this a bit earlier, but we’re also looking to expand those players. So it’s great to know some of the platforms that everyone here is using.

“How does the live captioning process work?” So first you would create a live event in any of our integrated live stream video platforms. So you’d create your event there. Then you would schedule the live automatic captioning in your 3Play account for that corresponding live event. Then you would start streaming your event, and your captions will display directly in the video player, or you can also display them through that embed code external URL solution.

Once the event is over, you can also download, edit, or upgrade your transcripts to our full human process. So that would include human QA and human editing. And you can also access the final transcript for editing yourself.

“Is there API support?” So yes, we do support API endpoints for ordering and also managing 3Play live events. All of our settings for live events are parameters you can set through the API. Our API documentation is located at docs.3playmedia.com/apiv3/overview. And we can include all of these links as well.

“Is Zoom an accessible platform? What other platforms are there that you would recommend?”

So Zoom has many accessibility features. They have the ability to add live captions, keyboard accessibility, and screen-reader support also. And then Brightcove, JW Player, and YouTube are the other players that allow live captioning and have built-in accessibility features.

“If the speaker talks more slowly, does human captioning accuracy increase?” So yes, there are many ways to improve the quality of live automatic captioning, such as a single speaker, clear pronunciation, and little to no background noise. But yes, if the speaker does talk more slowly, it helps.

“Will the captioning be done by ASR or humans?” So 3Play’s solution uses ASR strictly. So there’s that option, but there are also many other human-only options.

“Can you caption your own YouTube Live presentations, or do you have to hire a professional?” So you don’t necessarily have to hire a professional. One thing I will mention about human solutions is that there are definitely a few more pieces to the process. You have to obviously schedule time with the humans that are available to caption for you, and you have to hire them. So there are a lot more logistics involved there.

But as far as YouTube goes, our integration works with YouTube Live. So that’s all done automatically based off that ASR.

“Can 3Play Media’s live captioning solution be integrated into the Panopto live stream?” So not yet, but this is definitely something we’re willing to explore. We want to explore several new platforms that people are using for live, obviously. So it’s definitely on our minds. And as a workaround, you can also use the embed code external URL solution.

So we’ll take one more question here. “Can captions be added to existing commercially available videos that don’t have captions?” So yes, you can always add captions to prerecorded videos with the 3Play Plugin solution, which is similar to that embed code that I mentioned earlier. And that can basically provide captions for any videos you don’t own and videos that don’t currently have captioning.

All right, last question here. “What are the impacts of using it for meetings where many people speak?”

So this is obviously a common occurrence for any meeting. There might be multiple people talking at the same time, and live captioning does not capture those speaker identifications. So it’s important, really, to focus on that post-production process where you can apply speaker identifications after the fact.

All right. Thanks, everyone, for joining today.