Quick Start to Captioning [Transcript]
LILY BOND: Welcome, everyone, and thanks for attending this webinar entitled Quick Start to Captioning. I’m Lily Bond from 3Play Media, and I’ll be presenting today.
Great. So as we get started, I just want to go over the agenda. I’m going to begin with some captioning basics, go over the benefits of captioning, go through the accessibility laws, go through our captioning services and tools, and then I’ll leave about 10 minutes for Q&A at the end. So let’s get started.
We’ll take it from the very beginning. What are captions? Captions are text that has been time-synchronized with the media so that it can be read while watching a video. Captions assume that the viewer cannot hear the audio at all, so the objective is not only to convey the spoken content, but also any relevant sound effects, speaker identification, and other non-speech elements.
Basically, the objective is to convey any sound that’s not visually apparent but is integral to the plot. For example, you would definitely want to include the sound effect “keys jangling” if you hear this sound behind a locked door, because it’s important to the plot development that someone is trying to get in. But you wouldn’t include the sound of keys jangling in the pocket of someone walking down the street, because that’s not relevant.
Captions originated in the 1980s as a result of an FCC mandate specifically for broadcast television. And now as online video becomes more and more an everyday part of our lives, the need for web captions has expanded and really continues to expand greatly. So as a result, captions are being applied across many different types of devices and media, especially as people become more aware of the benefits and as laws become increasingly more stringent.
I’m going to go through some of the terminology before we get to those laws just to make sure that everyone is on the same page. So the difference between captions and a transcript is that a transcript is not synchronized with the media. On the other hand, captions are time-coded so that they can be displayed at the right time while watching a video. And for online media, transcripts are sufficient for audio only content, but captions are required any time there’s a video component.
The distinction between captions and subtitles is that captions assume that the viewer cannot hear, whereas subtitles assume that the viewer can hear but cannot understand the language. So that’s why captions include all relevant sound effects. Subtitles are really more about translating the content into a language that the viewer understands. The difference between closed and open captions is that closed captions allow the end user to turn the captions on and off, while open captions are burned into the video and cannot be turned off. With online video, you’ll usually see closed captions.
Post-production versus real-time refers to the timing of when the captioning is done. So real-time captioning is done by live stenographers, whereas post-production captioning is done offline and takes a few days. There are advantages and disadvantages to both of those.
There are a lot of different caption formats that are used with specific media players. On the left, you’ll see a list of some of the more common caption formats and where you might need to use them. The image at the top right shows what a typical SRT caption file looks like, and that’s the type of caption file you would use for a YouTube video, for example. And you can see that it has three caption frames, and each frame has a start time and an end time followed by the text that appears in that time frame. And then below that on the bottom right is an SCC file, which uses hexadecimal codes and is a little bit more complicated for the average viewer.
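To make the SRT structure concrete, here is a minimal sketch of a two-frame SRT file and a small parser. The sample timestamps, the caption text, and the parse_srt helper are all invented for illustration and are not part of any 3Play tool:

```python
# Each SRT frame: a sequence number, a "start --> end" timecode line,
# one or more lines of caption text, then a blank line.
SAMPLE_SRT = """\
1
00:00:01,000 --> 00:00:04,200
Welcome, everyone, and thanks for attending.

2
00:00:04,300 --> 00:00:07,500
[keys jangling]
"""

def parse_srt(srt_text):
    """Split an SRT file into (start, end, text) caption frames."""
    frames = []
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        # lines[0] is the sequence number; lines[1] holds the timecodes.
        start, end = (t.strip() for t in lines[1].split("-->"))
        frames.append((start, end, "\n".join(lines[2:])))
    return frames

frames = parse_srt(SAMPLE_SRT)
print(len(frames))    # 2
print(frames[1][2])   # [keys jangling]
```

Note how the second frame carries a non-speech sound effect in brackets, which is exactly the kind of information captions include and subtitles typically do not.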
So once you’ve created a caption file, it needs to be associated with the corresponding video file. And the way to do that depends on the type of media and video platform that you’re using. So for sites like YouTube, all you have to do is upload the caption file for each video, and we call that a sidecar file. But in other cases, like for iTunes, you actually need to encode the caption file onto the video.
And another way to associate captions with the video is with open captions. I mentioned these when I was talking about caption terminology, but again, these are burned directly into the video and can’t be turned off. And if you’re using one of the video platforms that we’re partnered with like Brightcove, Mediasite, Kaltura, or Panopto, then this stuff really becomes trivial, because it all happens automatically.
So the primary purpose of captions and transcripts is to provide accessibility for people who are hard of hearing. 48 million Americans experience hearing loss, and closed captions are the best way to make media content accessible to them. Outside of accessibility though, people have discovered a number of other benefits to closed captioning.
Closed captions provide better comprehension for everyone. The Office of Communications in the UK actually conducted a study where they found that 80% of people who were using closed captions were not deaf or hard of hearing, and that closed captions really provide increased comprehension in cases where the speaker has an accent, the content is difficult to understand, there’s background noise, or the viewer speaks English as a second language. Captions also provide the flexibility to view videos in noise-sensitive environments like the office, the library, or the gym.
Captions also provide a really strong basis for video search, and there are certain plug-ins that we offer that can make your video searchable. People are used to being able to search for a term and go directly to that point, and that’s what our interactive transcripts let viewers do within a video. And I’ll go over that a little bit more later on.
For people who are interested in SEO, or Search Engine Optimization, closed captions provide a text alternative for spoken content. Because search engines like Google can’t watch a video, this text is the only way for them to correctly index your videos. Discovery Digital Networks did a study to see the impact of captions on their SEO, and they actually found that adding captions to their YouTube videos increased their views by 7.3%.
Another benefit of captions and transcripts is their reusability. The University of Wisconsin found that 50% of their students were actually repurposing video transcripts as study guides, so they make a lot of sense for education. And you can also take the transcript from a video and use it to quickly create infographics, white papers, case studies, and other docs. Of course, once you have a caption file in English, you can translate that into foreign languages to create subtitles, which makes your video accessible to people on a more global scale. And finally, captions may be required by law. And I’m going to dive into the federal accessibility laws right now.
The first big accessibility law in the US was the Rehabilitation Act of 1973, and in particular, the parts that apply to captioning are Sections 508 and 504. Section 508 is a fairly broad law that requires federal communications and information technology to be accessible for government employees and the public. So this is where closed captioning requirements come in.
Section 504 is basically an anti-discrimination law that requires equal access for disabled people. Section 504 applies to both federal and federally funded programs, and Section 508 applies only to federal programs. However, any states receiving funding from the Assistive Technology Act are required to comply with Section 508. So often that law will extend to state-funded organizations like colleges and universities, because most states do receive funding from the Assistive Technology Act.
The Americans with Disabilities Act is a very broad law that is comprised of five sections. It was enacted in 1990, but the ADA Amendment Act of 2008 expanded and broadened the definition of disability. Title II and Title III of the ADA are the ones that pertain to video accessibility and captioning. Title II is for public entities, and Title III is for commercial entities. And this is the area that has had the most legal activity. Title III requires equal access for places of public accommodation. So the gray area here is what constitutes a place of public accommodation.
In the past, this was really applied to physical structures, for example, requiring wheelchair ramps on buildings. But recently, that definition has been tested against online businesses. So one of the landmark lawsuits that happened a couple of years ago was the National Association of the Deaf versus Netflix.
The National Association of the Deaf sued Netflix on the grounds that a lot of their streaming movies didn’t have captions, and they cited Title III of the ADA. One of Netflix’s arguments was that they do not qualify as a place of public accommodation, but the courts ruled that Netflix does qualify. They ended up settling, and now Netflix has captioning on close to 100%, if not 100%, of their content at this point.
So the interesting thing to come out of this case is that if Netflix is considered a place of public accommodation, that sets a really profound precedent for the ADA’s application to the web and online content, including for places like colleges and universities. Harvard and MIT were sued by the National Association of the Deaf in February for discriminating against the deaf and hard of hearing by not providing captions for their online content. The decision in this case will have huge implications for higher education.
And an interesting thing to note is that edX, which is the online video platform launched by Harvard and MIT in 2012, entered into a separate settlement agreement with the Department of Justice in April. And in that agreement, the Department of Justice, which has the duty of enforcing the ADA, required edX to provide accessibility measures to their online content, including closed captioning for all videos.
This settlement could play a role in the current lawsuit since it indicates that the Department of Justice believes that online content is subject to the ADA. There are a couple of other ADA cases that haven’t had decisions yet, for example, against FedEx and against Time Warner. And the decisions on these will further shape the scope of the ADA.
The 21st Century Communications and Video Accessibility Act is the most recent accessibility act, and it was passed in October of 2010. And this requires captioning for all online video that previously aired on television. So for example, this applies to publishers like Netflix and Hulu or any network websites that stream previously aired episodes online. There are a lot of upcoming FCC updates to this law, but the biggest one is that starting in 2016, clips from television programs must be captioned when they go online.
So, for example, a two-minute excerpt from a show that you can view online would need to be captioned. And then by 2017, that will be expanded to montages, so things like trailers and previews for an upcoming show would need to be captioned as well. And with the CVAA, the copyright owner bears the responsibility for captioning.
In February of 2014, the FCC came out with specific quality standards for captions, which was the first time that legal standards had been placed on things like accuracy. And this applies to broadcast captions and online video that previously appeared on television, but they are a good standard for all captions to follow.
There were four parts to the ruling– accuracy, synchronicity, program completeness, and on-screen caption placement. All of these are fairly self-explanatory. Captions must match the spoken words and include pertinent nonverbal information. They must coincide with the spoken words. They must run from start to finish, and they must not obscure important on-screen content. For example, in a documentary, if the name and occupation of the speaker are written at the bottom of the screen, the captions should be moved. So our solution for that is vertical caption placement, where we move the captions to the top of the screen if we detect important information that the captions would otherwise obscure.
To talk a little bit about our company, 3Play Media is an MIT spin-out, and we’re based in Cambridge, Massachusetts. For the last seven years, we’ve been providing captioning, transcription, and subtitling services to over 1,600 customers in higher education, government, enterprise, and media and entertainment.
Our goal is really to simplify the process of captioning and transcription, which can be a barrier for a lot of people. We have a really user-friendly online account system, and we offer fast and reliable turnaround with a lot of options in terms of turnaround time. We have integrations with most of the leading video players and platforms, which can automate the process to make captioning even easier. And as I mentioned earlier, there are a lot of caption formats, and we offer over 50 different output options, so you should never really run into a problem there.
We also recently released a feature that allows you to import existing captions and subtitles, which gives you access to all of our tools. And those tools include various video search plug-ins that make your video searchable and attractive. I’ll go over those more soon. We now provide captioning and transcription for Spanish source content. And if you’re affected by the FCC rules, the FCC has the same captioning requirements for Spanish and mixed English-Spanish content as it does for English content.
OK, accuracy is very important to us. We comply with all of the FCC’s quality standards and have developed very strict best practices for our transcriptionists. We use a multi-step review process that delivers more than 99% accuracy, even in cases of poor audio quality, multiple speakers, difficult content, or accents. Typically 2/3 of the work is done by a computer, and then the rest is done by our transcriptionists. And that makes our process more efficient than other vendors, but more importantly, it affords our transcriptionists the flexibility to spend more time on the finer details.
For example, we diligently research difficult words, names, and places, and we also put more care into ensuring correct grammar and punctuation. We’ve also done a lot of work on the operational side of the business, so what we can do now is actually match transcriptionists’ expertise to certain types of content. We have about 700 transcriptionists, and they cover a broad range of disciplines.
For example, if you send us tax-related content, we can match that content with a transcriptionist who has a financial background. And without exception, all of our work is done by professionally trained transcriptionists in the USA. Every transcriptionist goes through a really rigorous training program before they ever touch a real file, and they also go through a background check and enter into a confidentiality agreement.
So once your account is set up, the next step is to upload your video content to us. There are many different ways to do that. You can use our secure uploader in the account system. You can use our API, FTP, or upload via links, or you can use one of our integrations. Our account system is all web-based, and there’s no software to install. And as I mentioned earlier, we have flexible turnaround options.
So once you’ve uploaded your file, you can select the turnaround you need. Our standard is four business days. But if you have urgent deadlines, we have options for that. We actually just introduced two-hour turnaround, so if you have short files and need quick turnaround, that’s a really great option. Or if you have a more relaxed deadline, we have options for that as well.
We’ve also built integrations with the leading video platforms and lecture capture systems. You can see a lot of them listed on this slide, but it includes Brightcove, Mediasite, Kaltura, Panopto, and YouTube. If you’re using one of these platforms, then the process is even further simplified. Many of these integrations allow you to just select your files directly from within your video platform and tag them for captioning, and then those files will be submitted to us, and we’ll post the captions directly back to your videos when they’re completed.
These are some of the 50-plus output formats that we offer. After you’ve uploaded your media, it goes into processing. When your captions are ready, you’ll receive an email alert, and you can log in and download your files in many different transcript and caption formats. There’s no limit to the number of downloads or the number of formats that you use, and there are a lot of features in the account system that you’ll have access to at this point.
One of these features is the Captions Editor, which is an editing interface that lets you make changes to your transcripts or captions. When you finalize your edits, they propagate to all outputs and plug-ins without having to reprocess anything. And if you already have captions or subtitles, we have a feature called Caption Import, which is a monthly subscription, where you can import your captions and have access to all of our tools and plug-ins, like the interactive transcripts. You can also translate those captions into multilingual subtitles, convert them into other formats, and securely manage your assets.
If you already have transcripts, then you can use our automated transcript alignment service to create time-coded captions for your videos. Then you’d have access to all of the same tools as with our captioning service like translation and interactive transcripts.
Speaking of interactive transcripts, this is one of our plug-ins, which is available to you and included in the cost of captioning. Basically, the interactive transcript is a time-synchronized transcript that highlights the words as they’re spoken in the video. And you can click anywhere in the transcript to jump directly to that point in the video, or search for a term within the transcript and go directly to that point. So this is really popular and engaging for all viewers. It really appeals to the modern tendency to be able to search and find something immediately, and that’s a big limitation in video when it doesn’t have something like this associated with it.
Interactive transcripts are really popular in education, because they help students so much when they’re studying. So, for instance, imagine that a student wants to find a specific section of an hour-long video that they know is really important for their exam. Rather than struggling to find that section, they can just search for the term they’re looking for and go directly to that point. They can also print or download the transcripts and highlight important sections to study later.
So while we’ve built a lot of tools that are self-service and automated, a lot of our success as a company is based on the fact that we give all of our customers a lot of attention. We take the time to walk people through the account tools, and we really enjoy building relationships with people. In December, we did a survey with our current customers, and the word cloud on the right highlights the most common terms that were mentioned. So as you can see, support really stands out as one of the key aspects that our customers are happy with.
So that brings us to Q&A. I apologize we’re running a little bit late due to the audio malfunction earlier, but I’m going to stay on the line for about 10 more minutes. If you have to go, don’t worry. This is being recorded, and you can view the Q&A later. So the first question here is “For accessibility compliance, is a transcript sufficient, or do you need captions?”
So transcripts are sufficient for audio-only content. A podcast would be an example of that. But captions are really required any time that there’s a video component, such as a recorded lecture. Even a PowerPoint slideshow that has an audio track would require captions.
There’s a question here about copyright. “If a faculty member provides a clip of a full motion picture, is the campus legally allowed to caption that file?” So many people really believe that this use case is covered by fair use, which is a set of exemptions to copyright law that balances the needs of content creators with those of content users. For closed captioning, fair use is evaluated on four factors. Another layer of defense is that Section 107 of the US Copyright Act states that teaching is a purpose that is considered exempt from copyright infringement.
Another question here. “How do you ensure high accuracy for complex or technical content?” I went into that a little bit, but we address it in a few different ways. First of all, our staff are highly trained on transcription and captioning standards. For example, we have specific standards on how to transcribe a mathematical equation or a chemical reaction, and then our staff are also continuously audited to ensure consistent quality.
Second, we make it very easy to upload terminology or other information along with your video. So adding names, places, or specialized vocabulary really helps our transcriptionists to decipher what is spoken.
And then third, our staff covers a broad range of disciplines, and we try to match their expertise with your subject matter. And although it’s rarely necessary, we do provide that caption editing interface that I showed you that makes it really easy to edit captions even after they’ve been processed. So any changes that you make there would automatically propagate to all of your outputs.
There’s a question here. “What are the pricing models? Is it subscription or per video?” Our pricing is based on the duration of your video content, per minute, and all of our pricing information is listed on our website. So you can just see a full breakdown of that there.
Question. “Are there any creative ways to pay for captions?” Yes, there are. We’ve actually compiled a list of resources showing how some of our customers pay for captioning through grants and other sources of funding. That’s a white paper, and I’ll include a link to that in the email that we send out with the recording.
“What are the considerations for captioning in-house versus using a third party vendor?” So that’s a complicated question. A lot of people honestly use some of both. There are a lot of different considerations for that. We’ve covered that in depth in another webinar on in-house captioning workflows that you can find on our website as well.
Another question here. “What is the workflow for captioning a film?” So that would really depend on how you choose to upload the content to us. You could upload it directly to our account system, use our API, or upload it via FTP. Or if you’re using a video platform, you can just link your accounts, and we’ll receive the file directly and automatically post the captions back.
But once you’ve uploaded the file to us, you’ll just receive an email when the captions are ready for you. You can select your turnaround options and all of that, and then you can download the caption file that you need– we offer a lot of different options– and then associate it with your file.
A question here. “If you upload a private YouTube video for the students to watch, does it have to be captioned?” So I mean, that would really depend on whether or not you have a student who is deaf or hard of hearing. If so, then yes, regardless of whether the video is private, that would need captioning to be accessible to your students.
A question here. “Are your closed captioning services only asynchronous, or do you offer live services?” We are a post-production captioning company, so we only handle post-produced video. But there are a lot of great live services out there.
It looks like that’s about it for questions. So thank you, everyone, so much for attending. I apologize again for the audio issue. And we will send out an email tomorrow with a link to view the recording with the slideshow and an interactive transcript. I hope everyone has a great day. Thank you.