Quick Start to Captioning [Transcript]
TOLE KHESIN: Welcome, and thanks for attending this webinar on captioning. Before we begin, can we just do a quick sound check. If you wouldn’t mind just raising your hand if you hear me OK.
OK, great, thanks. My name is Tole Khesin with 3Play Media. And again, this webinar is called Quick Start to Captioning. We have about 30 minutes to go through the basics of captioning.
We’ll try to keep the presentation to about 20 minutes, and leave the rest of the time for your questions. And the best way to ask questions is just to type them into the bottom right corner of your control panel. And we’ll keep track of them, and address them all at the end. And also feel free to email or call us after the webinar is done. Very good.
So for the agenda, we’ll start with a bit of an introduction to captioning. We’ll talk about captioning terminology, and how to create captions. We’ll then discuss the different benefits that come with captioning video. We’ll discuss the applicable accessibility laws, and then we’ll talk about some of the captioning products and services that we provide as well as some of the technologies that are out there. And we’ll leave the rest of the time for Q and A.
Let’s begin with just the very basics. So what are captions? So captions are text that has been synchronized with the media so that it can be read while watching the video.
Captions assume that the viewer can’t hear the audio at all. And so the objective is not only to convey the spoken content, but also the sound effects, speaker identification, and other non-speech elements. Basically, the objective is just to convey any sound that’s not visually apparent, but integral to the plot of the video.
So captions originated in the early ’80s as a result of an FCC mandate specifically for broadcast TV. But now, with the proliferation of online video in pretty much every facet of our lives, just the need for web captions has expanded a lot. And as a result, captions are being applied across many different types of media devices, especially as people become more aware of the benefits and as laws become increasingly more stringent.
So a little bit about the terminology that you’ll hear with captioning– so first of all, what is the difference between captions and a transcript? The difference is that a transcript is just a text. It’s something that you can create, for example, in a Microsoft Word document. There’s no time information.
Captions, on the other hand, have time information, and also the transcript is broken up into caption frames. And the timing information basically aligns each caption frame so it can be read along while watching the video. From an accessibility, from a legal point of view, captions are required anytime you have moving pictures or video. That even includes, for example, a PowerPoint presentation with an audio track.
A transcript, on the other hand, is sufficient in the case where you just have an audio-only program. So captions versus subtitles– the difference here is that captions assume that the viewer can’t hear anything at all, and so they include the non-speech elements, as I mentioned before– speaker identification, sound effects, and things like that. On the other hand, subtitles assume that the viewer can hear everything, but just can’t understand the language.
And so as a result, subtitles are really associated more with translation to foreign languages. Closed versus open captions– so this has to do with the way that the captions are rendered on the video. So closed means that they exist on the video as a separate track, and as a result, the users can turn them on and off.
Open captions means that they’re actually burned into the video, and the result is that users can’t turn them off. They’re there all the time. And there are advantages and disadvantages to each type, but with online video, things are really sort of moving toward closed captions, just because it provides a lot more flexibility.
Post production versus live captions– this has to do with the timing of when the captioning process is done. Post production means that the captioning process is done after the event has already been recorded. And the captions are usually ready a day later, or at some point afterward.
In contrast, live captions are performed by a person that’s typing in real time while the event is happening. And usually, there’s a lag of several seconds, but that’s happening live. And there are advantages and disadvantages to each type of process here as well.
All right, so we’ll go through some of the benefits of captioning. And there are a number of different reasons why people transcribe and caption their video content. And we’ll cover some of these things. So the first one is accessibility for deaf and hard of hearing. So this is really a very big issue.
In the US alone, there are 48 million people– so that’s roughly one in six– that have some sort of hearing loss that’s significant enough to impact the ability to understand a video. The second reason is search engine optimization, or SEO. If you’re creating video content for marketing purposes, and you’re trying to increase the traffic to your site, adding captions is a great way to allow search engines to understand more about your video, because search engines can’t watch a video.
So they rely on the metadata and surrounding text. And often with video, that means that they know the title of a video, and maybe some tags, but they really don’t know very much else about it. Whereas if you transcribe a video, it just provides search engines a much deeper and broader understanding of what that video is about, so that you have more keyword diversification.
We’ve done several studies with some of our customers to try and understand exactly what is the impact of adding captions to videos. And one such study that we did was with Discovery Digital Networks, where they did a controlled study with two pretty large groups of videos on their YouTube channels. And one group they added captions to, and the other group they didn’t, and they monitored the impact over the course of over a year. And they found that on average, the videos that had captions had 13.48% more views in the first two weeks, and then the lifetime increase was 7.3% more views for the videos that had captions compared to the ones that didn’t.
That next benefit is that captions improve comprehension, and they allow the video to be consumed in noise-sensitive environments, like a library or workplace, especially in cases where computers sometimes don’t even have speakers, so it’s impossible to hear the audio without headphones. So captions really make that content a lot more flexible.
And there is a really interesting study that was done by the BBC in conjunction with the Office of Communications. They found that 80% of people who turn on captions actually don’t have any sort of hearing loss whatsoever. They do it because it helps them to understand the content better. So we thought that was really interesting.
The other benefit is that it’s possible to use transcripts and captions to make the video more searchable, interactive, and more engaging. And we make a number of tools that work with the video for that very purpose. Captions may be required by law. We’ll talk more about that in just a minute.
Captions and transcripts also make the content a lot more reusable. So a couple of examples here– this is coming out of the University of Wisconsin. They did a study with some of their graduate students, and they found that 50% of their students use the transcripts for the videos. They actually download them, or print them out. And they use them as study guides, which was really interesting.
And then what a lot of our customers are also doing, is they’re taking a transcript for a video– for example, from a webinar– and they’re using that transcript to create a lot of different derivative works, such as case studies, blog posts, and white papers. Professors often use transcripts as a starting point to create journals, and even to write textbooks. And lastly, transcripts and captions are used as a starting point for translation and the creation of multilingual subtitles to foreign languages.
A little bit about the applicable accessibility laws– so in the US, there are three Federal laws that are applicable to captioning. The first one is the Rehabilitation Act, which was originally enacted in 1973. So the two sections there that impact captions are sections 508 and 504.
Section 508 is a fairly broad law that requires Federal communications and information technology to be accessible for employees and the public. And for video, of course, this means having closed captions. For audio only, transcripts are sufficient.
Section 504 is very similar, but it has a different angle, in that it’s more of an anti-discrimination law that just requires equal access for disabled people with respect to electronic communications. And both of these laws apply to all governmental agencies and certain public colleges and universities that receive federal funding, such as through the Assistive Technology Act.
The second Federal law that impacts captioning is the ADA. That’s the Americans with Disabilities Act, enacted in 1990. And so the ADA actually has five parts to it. The two that impact captioning are Title II and Title III.
So Title II is for public entities, and Title III is for commercial entities. And the one that has had the most activity recently is Title III. And so the way– that’s the one for commercial entities. And the bar there with Title III is that in order to qualify for that, you have to be considered a place of public accommodation.
And historically, the ADA has really been applied more towards physical structures, such as requiring a ramp for wheelchair access to a building. It has never been applied to online businesses. But one of the landmark lawsuits that happened a couple years ago was the NAD– that’s the National Association of the Deaf– sued Netflix on the grounds that a lot of their movies, their streaming movies, lacked captions. And they cited to the ADA Title III.
And one of Netflix’s arguments was that they do not qualify as a place of public accommodation. And the courts ended up ruling in the end that Netflix does qualify as a place of public accommodation. They ended up settling, and Netflix has captioning, I think, close to 100%, if not 100%, of all of their content at this point.
But the interesting thing that came out of the case is that if Netflix is considered a place of public accommodation, that sets a very profound precedent. That means that there are many other types of organizations out there that would also potentially qualify as places of public accommodation. Certainly private colleges and universities potentially would be covered by the ADA as well.
The last and most recent Federal law is the CVAA, which is the 21st Century Communications and Video Accessibility Act that was enacted in 2010. And that law requires captioning for online video that also was broadcast on television with captions.
So again, this is something that only strictly applies to content that aired on television as well. And the most recent update with the CVAA is a month ago, in July of this year. The FCC ruled that video clips are also covered by the CVAA. So if you publish, for example, a two-minute video clip that’s part of a 30-minute TV show, then you have to add captions to that by law.
Also, earlier in the year, in February this year, the FCC published some rules around captioning quality. In the past, this was sort of a gray area about how good do captions really need to be. But FCC quantified a lot of that, and really just standardized the requirements for captioning quality.
And that rule was broken up into four parts. So one is captioning accuracy– that the accuracy really needs to be pretty much flawless– at least 99% accurate. It provided some leniency for live captioning, which is understandable, but for post production captioning, they pretty much said the accuracy needs to be flawless.
The second part of it was synchronization. Captions really need to be pretty much perfectly aligned with the audio track. And when captions lag, or drift, they said that that was unacceptable.
The third part is program completeness. There were a lot of complaints previously about sometimes captions would end before the show actually ends. Sometimes there would be a scene after the credits, which did not include captioning. So the FCC ruled that captions need to cover the entire program from start to finish.
And then the last part of it is on-screen caption placement, and this refers to the position of the captions on the screen. Captions are typically placed in the lower third. And that works well, but sometimes that obstructs other text. And so the requirement now is that if captions obstruct something critical on the screen, then they need to be relocated to a different part of the screen.
So we actually have a patent-pending process that we use for this where we actually go through, and we look at the pixels on each frame. And if we notice that the captions are obstructing critical text, we’ll automatically reposition the captions. And that’s something that’s done automatically.
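To make the idea of relocating captions concrete, here is a minimal sketch in Python. It is not the patent-pending process described above, which is not specified in this webinar. It assumes the bounding boxes of critical on-screen text have already been detected by some other means, and the names `Box`, `overlaps`, and `choose_caption_region` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixels; (0, 0) is the top-left of the frame."""
    left: int
    top: int
    right: int
    bottom: int


def overlaps(a: Box, b: Box) -> bool:
    """Return True when the two rectangles share any area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def choose_caption_region(frame_w: int, frame_h: int,
                          critical: Iterable[Box]) -> str:
    """Pick 'bottom' unless it would cover critical text, then try 'top'.

    Assumption: `critical` holds boxes of on-screen text that must stay
    visible. How those boxes are detected is outside this sketch.
    """
    critical = list(critical)
    band = frame_h // 3  # captions occupy roughly one third of the frame
    regions = {
        "bottom": Box(0, frame_h - band, frame_w, frame_h),
        "top": Box(0, 0, frame_w, band),
    }
    for name, region in regions.items():
        if not any(overlaps(region, box) for box in critical):
            return name
    return "bottom"  # both bands are blocked; fall back to the default
```

For a 1280 by 720 frame with a name graphic in the lower third, for example `Box(100, 600, 600, 680)`, the function chooses the top band. With no critical text it keeps the usual bottom placement.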
A little bit about caption formats– so there are all kinds of different caption formats. And they really depend on the video player or platform that you’re using. What you see here in this slide in the top right corner is an example of an SRT caption format. This is a very common caption format for online video.
For example, this is what you’d use if you want to add captions to YouTube. And what you’re seeing there are actually three caption frames. In the first frame, the first line is the caption number, so it’s caption number two. Then the line under that is the start time and the end time, so this is the window during which that caption frame appears. And then under that is the actual text.
So there are two lines of text. In that caption frame at the top, you’ll see it says, “Hi I’m Arne Duncan, the Secretary of Education.” So that would appear in that time window, and then the subsequent caption frame would appear. That’s one example. This is a very simple one.
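The SRT layout described above (a caption number, a line with the start and end time, then the text) can be read with a few lines of code. The following Python sketch parses that layout into caption frames. The timestamps and the second frame in `SAMPLE` are invented for illustration, since the slide itself is not reproduced in this transcript, and the names `CaptionFrame`, `to_ms`, and `parse_srt` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class CaptionFrame:
    index: int      # caption number from the first line of the frame
    start_ms: int   # when the frame appears, in milliseconds
    end_ms: int     # when the frame disappears, in milliseconds
    text: str       # one or more lines of caption text


TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def to_ms(stamp: str) -> int:
    """Convert an SRT timestamp such as 00:00:04,200 to milliseconds."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt(data: str) -> list:
    """Split SRT text into frames; frames are separated by blank lines."""
    frames = []
    for block in re.split(r"\n\s*\n", data.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip incomplete frames in this sketch
        start, end = lines[1].split("-->")
        frames.append(CaptionFrame(int(lines[0]), to_ms(start), to_ms(end),
                                   "\n".join(lines[2:])))
    return frames


# Illustrative data only: the times and the second frame are made up.
SAMPLE = """\
2
00:00:04,200 --> 00:00:07,500
Hi, I'm Arne Duncan,
the Secretary of Education.

3
00:00:07,500 --> 00:00:10,000
(text of the next caption frame)
"""

frames = parse_srt(SAMPLE)
```

Each `CaptionFrame` carries exactly the three pieces of information described above, which is also the difference between captions and a plain transcript: the same text, plus a time window for every frame.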
In the bottom right corner is an example of an SCC caption format, and this is something that is also used with web video, but it’s also used frequently with broadcast and for authoring DVDs. And so this is actually much more difficult to create and interpret. In fact, it’s actually sort of hard to understand what’s going on, because it’s in hexadecimal representation. But these are a couple of different examples.
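The hexadecimal caption format described above is commonly known as SCC (Scenarist Closed Caption). As a rough illustration of why it is harder to read than SRT, the Python sketch below decodes the text characters from a run of SCC hex words. It assumes the usual encoding in which each byte carries 7 data bits plus an odd-parity bit, and it simply skips control codes (the pairs that handle positioning, styling, and display). The function names are hypothetical, and the sample words are hand-built for illustration rather than taken from the slide.

```python
def strip_parity(byte: int) -> int:
    """Check odd parity on one byte and return its low 7 data bits."""
    if bin(byte).count("1") % 2 != 1:
        raise ValueError(f"parity error in byte 0x{byte:02x}")
    return byte & 0x7F


def decode_scc_words(words: str) -> str:
    """Extract plain text from space-separated 4-digit SCC hex words.

    Simplifications in this sketch: control-code pairs are dropped, and
    characters are mapped as ASCII, although a handful of code points in
    the broadcast caption character set differ from ASCII.
    """
    chars = []
    for word in words.split():
        first = strip_parity(int(word[:2], 16))
        second = strip_parity(int(word[2:], 16))
        if 0x10 <= first <= 0x1F:
            continue  # control-code pair, not printable text
        for value in (first, second):
            if value >= 0x20:  # 0x00 is padding
                chars.append(chr(value))
    return "".join(chars)


# Hand-built sample: control codes around the text "Hi!" plus one pad byte.
decoded = decode_scc_words("9420 9420 c8e9 a180 942f 942f")
```

Here `decoded` is the string "Hi!". In a real SCC file each line also begins with a timecode, which this sketch leaves out.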
As far as associating the caption file with the video file, there are several ways to do that. The most common way to do it is what’s called a Sidecar File. What this means is that you have your video file, and you have your caption file, and those two things are kept as separate assets. And the video player basically renders the captions as an overlay on top of the video, but keeps the two as separate files.
Another way to do it is to encode the captions in the video asset itself as a separate track. And that allows the users to turn it on or off. So in that case, you would only have one file. And then the third way is to actually burn in the captions in the video itself, so that they can’t be turned off.
So a little bit about 3Play Media– we’ve been around for about seven years. We focus on captioning, transcription, and subtitling. This is our bread and butter. It’s what we’ve been doing this whole time. We’re based in Cambridge, Massachusetts. We work with over 1,000 customers across a range of different industries in higher ed, media and entertainment, enterprise, and government.
We have a range of different products and services. Captioning, transcription, and subtitling– we talked about that. I should mention that the translation and multilingual subtitling, that’s something that’s completely integrated into our account system. That’s something you can just order after your videos have been transcribed.
We also have a process called automated transcript alignment. And this is something that’s really useful in a case where you already have a transcript for your video. So you don’t have captions, but you already have a transcript.
And so what you can do is you can upload your transcript along with the video, and we use an automated algorithm that will synchronize the transcript with the video file, and create closed captions. And that’s something that’s a lot less expensive, and it’s a very quick process.
So as for our process, the way that we transcribe and create closed captions is through a three-step process. We first put a video through speech recognition, which gets it to a point where it’s about 70% accurate. We then have a professional transcriptionist who will go in and clean up the mistakes left behind by the computer. And subsequently, we’ll have a QA person who will go through the transcript again, and will research difficult words, double-check the grammar and punctuation, and make sure that we’re adhering to all the different transcription and captioning standards.
And so the result of this is that we produce a pretty much flawless, very well-synchronized caption file that is at least 99% accurate. We average about 99.6% accuracy, and it’s a very efficient process, because 2/3 to 3/4 of the work is actually done through an automated process using speech recognition. And our transcriptionists are all based in the US.
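The accuracy figures above are not defined in detail in this webinar. One common way to put a number on transcript accuracy is to count word-level edits (substitutions, insertions, and deletions) against a trusted reference transcript. The Python sketch below does that, as an assumption about how such a figure could be computed rather than a description of how 3Play Media or the FCC measures it. The function name `word_accuracy` is hypothetical.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - (word edit distance / number of reference words).

    Comparison is case-insensitive and ignores punctuation handling,
    which a real measurement would need to define explicitly.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 1.0 if not hyp else 0.0

    # Classic dynamic-programming edit distance, computed over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur

    errors = prev[-1]
    return max(0.0, 1.0 - errors / len(ref))


reference = "captions are text that has been synchronized with the media"
draft = "captions are text that has been synchronised with the media"
score = word_accuracy(reference, draft)  # one wrong word out of ten
```

On this measure, 99% accuracy allows about one wrong word per hundred, and a 70% accurate speech recognition draft has roughly three wrong words in every ten.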
In the account system, we’ve also developed a caption and subtitle editor. And this is something that is not needed very often, but once in a while, for example, you might want to change the capitalization of a word, or maybe somebody’s name was misspelled. And we found that the easiest way for our customers is that they just go in, and make that edit on the fly. And as soon as you click Save, your changes propagate to all of the output files.
In the account system, there are many different ways to upload your videos. You can upload videos from your computer. If your videos are on a public server somewhere, you can just provide those links, and we’ll ingest them.
You can use FTP. You can upload them over API. You can also link your 3Play Media account with a bunch of different video platforms, and automate the upload process that way.
You can also control the turnaround. So we have a range of different turnaround options. Our standard turnaround is four business days, but you can select two-day, one-day, or even same-day service, which gets it back to you literally within a matter of hours.
As I mentioned, we are integrated with, I think, all the leading video platforms, and a bunch of different video players as well, just to make the process as simple as possible. And the way that that round-trip integration typically works– let’s say you have your videos on YouTube. What you do is you just link your YouTube channel with your 3Play Media account. And then you just select which videos you want to have captioned, and then we pull in those videos from YouTube. We create captions, and then we send it back, and they just show up on the video, so the workflow is completely automated.
As I mentioned, there are all kinds of different caption formats and subtitle formats. We produce more than 50 different formats. So basically whatever you need to do, we would have you covered. There is, as I mentioned before, the automated vertical caption placement option through the account system, and then you can also encode captions in the video file.
And the last thing I just want to mention is that while we’ve built a lot of different tools and technologies that are self-serve, one thing that’s really important to us is providing the best possible support to our customers, and we’ve received a lot of praise for this. If you go to our website and look at some of the testimonials, people really address this a lot. We really spend a lot of time talking to our customers, and we really enjoy understanding what they’re doing, and walking them through every part of the process to make sure that they have the best possible experience.
OK, so we have a little bit of time left, and we’ll open it up to some Q and A. There are also some resources here that might be useful. There are some webinars, white papers, and some how-to guides in that first link, and some testimonials, case studies, and some other information as well. So we’ll be back in just a minute after we organize all the questions that have come in. And feel free to type them in that questions window.
OK, so there are some questions here about pricing. And so I’ll give you an overview of how the pricing works. But also feel free, if you go to our website, there’s a link there to pricing, and you’ll see all of the details of how this works. It’s all based on the duration of the recorded content.
And we have two types of accounts that you could sign up for. There’s an express account and a pro account. The express account is sort of like a simplified version of the account system. It’s really intended more for light users with small or one-off projects. It has a variety of basic caption formats.
Then the pro account is something that has the complete suite of the different tools and technologies. It has all the different advanced caption formats. It also provides volume discounts. So the way that works is that you can pre-purchase a certain number of hours.
For example, if you pre-purchase 100 hours, what that does is that locks in a discount. And then whenever you use the service, it just debits against that balance. So if you click on that link– that bottom link for plans and pricing– you’ll be able to see all of the information on there.
So there’s a question here about trends of captioning formats, and whether we see some formats becoming more popular than others. Yes, so there are so many different formats. The ones that I’d say are the most common are SRT and DFXP. SCC is something that’s used very often in media and entertainment. Then there’s Cheetah .cap and SMPTE Timed Text– those are sort of the main ones that come to mind.
So there are some questions about how we handle technical content and accents. There are a number of ways that we address technical content. We have quite a large pool of professional transcriptionists. We actually have over 800 transcriptionists that work for us now. And so as a result, they have a lot of domain-specific expertise– in pretty much any sort of domain.
So for example, if you upload content that’s based on math or chemistry, there will be a number of transcriptionists that specialize in those areas. Another way is that we provide the ability to upload glossaries, or specialized vocabulary, or supporting documents to each file. Or you can do that by folder, or for an entire project. And so that can be things like product names or people’s names. And that really helps the transcriptionist to decipher what is being spoken.
And then another part of it is that because we have this three-step process, when the first transcriptionist goes through the video, if he or she comes to a word that he or she can’t decipher, that person will flag it. And then subsequently, the QA person will go in and research that word. And because of this flagging system that we have in place, we’re really able to get almost all of the difficult words.
Basically anything that we can find on the internet, we usually find it. And if you can provide some glossaries or supporting documents along with your videos, then that will pretty much ensure that we’ll get it. So the result is that we handle the technical content very well. But even if we do miss something, we also have that real-time editor, where you can go in, and make the change, and have those changes instantly propagate to all the output files, so you wouldn’t need to reprocess anything.
There is a question here about whether this webinar is being recorded. And the answer is yes, it is being recorded, and you will receive an email with a link to watch the recording tomorrow. So you’ll have that.
So there’s a question here also related to pricing, and whether we charge separately for different caption formats, and the use of different video platforms. And the answer is no. There are no additional prices. That’s all included.
Basically, when you upload videos, you’re only paying for the recorded duration of each video, and that includes all of the different caption formats. And in fact, we even store those indefinitely on the system, so you can download as many as you want, whenever you want. And that includes all of the different integrations with the video platforms.
So there’s a question here about how to set up an account. It’s really easy. You just go to our website, and just click Get Started, and just fill out that brief form, and then you get rolling. It’s really easy.
Great, so I wanted to thank everyone for joining us today. So as I mentioned, we’re recording. This webinar with captions will be available tomorrow, and so you’ll receive an email to watch it. Thanks again, and hope you have a great day.