Quick Start to Captioning [Transcript]
JOSH MILLER: All right, welcome. And thanks for attending this webinar on closed captioning. My name is Josh Miller, and we have about 30 minutes to cover the basics of captioning. We’re going to make the presentation about 20 minutes and leave the rest of the time for questions.
If you need to drop off at the 30-minute mark, that’s no problem. And if there are a few extra questions that you want to stick around for, we’ll spend a few extra minutes. The best way to ask questions is by typing them into the Questions window in the bottom right corner of your Control Panel.
We’ll keep track of them and address them all at the end. Please also feel free to call or email any time after this webinar. We’re happy to answer questions outside of this forum as well. And for anyone following on Twitter, the hashtag for the webinar is #3PlayCaptioning, as shown on the screen here.
So today we’re going to give you an overview of closed captioning for web video, including some of the applicable legislation. We’ll talk about the services that we provide and go over the process and workflow of adding captions to video step-by-step.
So from the beginning, we should answer the question, what are closed captions? Captioning refers to the process of taking an audio track, transcribing it to text, and synchronizing that text with the media. Closed captions are typically located underneath a video or overlaid on top of the video. And in addition to spoken words, captions convey all of the meaning, including non-speech elements like sound effects. And this is a key difference from subtitles.
Closed captions originated in the early 1980s with an FCC mandate that applied to broadcast television. And now that online video is rapidly becoming the dominant medium, captioning laws and practices are proliferating there as well. So, some terminology that’s worth going over– captioning versus transcription.
A transcript is usually a text document without any time information. On the other hand, captions are time synchronized with media. So you can make captions from a transcript by breaking the text of that transcript into smaller segments, which are called caption frames, and then synchronizing each one of those segments with the media. So that way, each caption frame is displayed at the right time.
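The segmentation step described here can be sketched in a few lines of Python. This is a minimal illustration, not 3Play’s actual process; the 64-character frame limit and the word-boundary break rule are assumptions chosen for the example.

```python
def segment_transcript(text, max_chars=64):
    """Split transcript text into caption-frame-sized chunks at word boundaries."""
    words = text.split()
    frames, current = [], ""
    for word in words:
        candidate = (current + " " + word).strip()
        if len(candidate) > max_chars and current:
            # Current frame is full; start a new one with this word.
            frames.append(current)
            current = word
        else:
            current = candidate
    if current:
        frames.append(current)
    return frames
```

Each chunk returned here would then be paired with a start and end time during synchronization, so that it displays at the right moment.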
Captioning versus subtitling– the difference between captions and subtitles is that subtitles are intended for viewers who do not have a hearing impairment but may not understand the language. Subtitles capture the spoken content but not necessarily any sound effects. So for web video, it’s possible to create multilingual subtitles for video that you already have English captions for, or captions for the spoken language.
Closed versus open captioning– the difference here is that closed captions can be turned on or off by the viewer, while open captions are burned into the video and cannot be turned off. Most web video players and most web video media in general really do support closed captions. So it’s rare that you would be forced into having open captions.
And then post production versus real time– post production means that the captioning process occurs offline and usually takes a few days to complete. Real-time captioning is done by a live captioner. And there are certainly advantages and disadvantages to either process, depending on your needs.
So although captions originated with broadcast television, nowadays captions are being applied across many different types of media, especially as people become more aware of the benefits with the internet. And as laws become increasingly more stringent, we’re seeing the use of captions expand as well. Since every video player and software application handles captions a little bit differently, we’ve created a number of “how-to” guides, which you can find on our website under the how-to page. And that’s something that’s definitely worth checking out since it can vary from player to player. And those are all free resources.
So we’ll talk a little bit about accessibility laws because these come into play quite often. And sometimes there’s some confusion as to which ones apply. Section 508 is a fairly broad law that requires all federal electronic and information technology to be accessible to people with disabilities, including employees and the public. In some cases, it applies to public higher education institutions that receive federal funding, for example through the Assistive Technology Act. For video, this means that captions must be added. For podcasts or audio files, a transcript is usually sufficient.
Section 504 entitles people with disabilities to equal access to any program or activity that receives federal subsidy. Web-based communications for educational institutions and government agencies are covered by this as well. Sections 504 and 508 are both from the Rehabilitation Act of 1973, although Section 508 wasn’t added until the mid-’80s. Many states have also enacted legislation similar to Sections 504 and 508. Sometimes it’s called something different, but the language and the intent are quite similar.
Next is the ADA, which is the Americans with Disabilities Act of 1990. And that covers federal, state, and local jurisdictions. It applies to a range of domains, including employment, public entities, telecommunications, and places of public accommodation. The Americans with Disabilities Act Amendments Act of 2008 broadened the definition of disability and made it pretty much the same as Section 504, which means more people ended up being covered by the ADA.
In a recent ADA lawsuit– and this is the ADA that was cited in the lawsuit versus Netflix– the National Association of the Deaf sued Netflix. Netflix argued that the ADA should not apply, because it only covers physical places of public accommodation, not virtual ones, and therefore should not extend to its streaming video service.
However, the judge ruled that the ADA does, in fact, apply to online content and that Netflix does qualify as a place of public accommodation. The definition that the court has offered is that an entity qualifies as a place of public accommodation if it has global economic impact. And this really has profound implications for anyone publishing online content, including video content used in education and enterprise.
The next law that has been talked about quite a bit recently is the 21st Century Communications and Video Accessibility Act, which is often referred to as the CVAA. And this was signed into law in October of 2010 and expands the closed caption requirements for all online video that previously aired on television with captions. So this is basically an expansion of the mandate that started with broadcast television.
So it really is an interesting law because it originally was a much broader law, well beyond television. And it was narrowed in focus back to content that aired on television with captions going online. But expanding legislation to move beyond network television is definitely being discussed.
So what you see here is a timeline of some of the milestones for this particular law, by which networks and studios have to make sure that their content is captioned. These new FCC rules apply to broadcast and cable networks and pretty much any TV station that makes content available online. One area that’s interesting is edited content: if content is edited for web distribution, it still has to be captioned.
However, the decision on video clips has been deferred from the original timeline. Clips are defined as smaller segments of content that either came from a larger show or film and went to the web, or it’s a short segment straight to the web. So there are a number of interest groups debating the captioning requirements for clips in particular. And this is something that we’ll probably see a decision on over the next year.
User-generated content, just to clarify, is not included in this law. And that’s something that will probably continue for a while. The most likely expansion will focus more on professionally-produced content.
So here’s a bit of global and national data from the 2011 WHO Report on Disability. It states that more than 1 billion people in the world today have a disability, and nearly one in five Americans age 12 and older experience hearing loss severe enough to interfere with day-to-day communication. The other interesting conclusion is that the number of people requiring accessibility accommodations is rapidly on the rise, relative to population growth.
And so, of course, we want to understand why. One reason is certainly medical advances. So people are now more likely to survive premature birth or a car accident. We have an aging population, and we’re able to accommodate that. So these are all really good things. It does come with a number of side effects, and we have to be aware of that. And that’s something that’s really important.
Another example would be we’ve been at war for the better part of a decade or more. So now we have people on the battlefield who are actually more likely to survive serious injuries. We have better armor. We have better equipment. Again, it’s great that these people are able to survive. However, there are side effects, and there are real accessibility concerns that come with, especially, those types of injuries.
So all this points to the fact that accessibility is a critical issue, and it will continue to be a prevalent issue in the years ahead. Although the primary purpose for captions and transcripts is to provide accommodation for people with hearing disabilities, which is critical, people have discovered that there are many other benefits as well, especially with web video.
Captions improve comprehension and remove language barriers for people who know English as a second language. Captions can compensate for poor audio quality or a noisy background. And it allows the media to be used in sound-sensitive environments, like an office or a library.
Search Engine Optimization is made much more powerful when you have the text alongside the video. Once your video has been found, captions and transcripts allow it to be searched and reused more efficiently. This is especially important with long-form video. So, for example, if you’re looking for something within a one-hour lecture, you can quickly search through text, instead of having to watch the entire video and hope you come across the small segment that you’re looking for.
We actually have a number of tools ourselves that really focus on leveraging that caption and transcript data that we already have and creating an interactive, searchable experience. And we have a number of resources and guides on our website about that. And finally, transcription and captioning is really the first step to translating into other languages. So if you are interested in subtitles, if you are interested in reaching a broader audience, this is the first thing you would have to do.
There are many different caption formats that are used with different types of media players. So the image on the right shows what a typical SRT caption file looks like. It shows three caption frames. And you can see that each caption frame has a start time and an end time and then the associated text. So once a caption file is created, it needs to be associated with the corresponding video file.
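Since the slide image isn’t reproduced in this transcript, here is a minimal Python sketch of what that SRT structure looks like when generated from timed caption frames. The frame data and helper names are made up for illustration.

```python
def to_srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def frames_to_srt(frames):
    """Serialize (start_seconds, end_seconds, text) caption frames to SRT."""
    blocks = []
    for i, (start, end, text) in enumerate(frames, 1):
        # Each SRT block: index, time range, then the caption text.
        blocks.append(f"{i}\n{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"
```

Running this on two sample frames produces the familiar numbered blocks with a start time, an end time, and the associated text, exactly as described above.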
And the way to do that really depends on the type of media and the type of video platform that you’re using. So for sites like YouTube, all you have to do is upload that caption file for each video. And it’s really, really easy. Other video platforms also have varying degrees of ease in terms of adding caption files.
And then we certainly offer plug-ins and tools to allow for easy caption publishing along with your video player. So for example, we have what’s called a Captions Plugin which up until recently was really one of the only ways to add captions to a Vimeo video player. Now that they offer some captioning support, there are other ways as well.
So I’ll give you a little bit of background, just on us and who we are. The inspiration for 3Play Media started when we were doing some work in the Spoken Language Lab at CSAIL, the Computer Science and Artificial Intelligence Laboratory at MIT. We were approached by MIT OpenCourseWare with the idea of applying speech technology to captioning for a more cost-effective solution. We quickly recognized that speech recognition alone would not suffice, but it did provide an interesting starting point.
So from there, we developed an innovative transcription and captioning process that uses both technology and humans and yields an extremely high rate of accuracy for transcripts with time synchronization, so we can use them for captions. We’re constantly developing new products and ways to use the transcripts, largely with the input of our customers. We have over 700 customers who work across higher ed, media, entertainment, enterprise and government.
So our focus is to provide premium quality transcription and captioning services. That’s number one. We also can translate into many different languages. We have a number of interactive tools that I mentioned to bring the transcripts and captions to life and really add another way of consuming that content. We also have a number of integrations and an API to make the workflow as easy as possible.
So a little more about this process we’ve developed. We use a multi-step review process that delivers more than 99% accuracy, even in cases of poor audio quality, multiple speakers, difficult content, and different accents. Typically, about two-thirds of the work is done by the machine, and the rest is done by transcriptionists. So if the draft from the speech recognition engine is 65% to 70% accurate, a human handles the cleanup process that closes the gap between that 70% and the 99%.
So this actually makes our process more efficient than a number of other providers. But more importantly, it affords our transcriptionists the flexibility to spend a little more time on the finer details. So for example, we will diligently research difficult words. We’ll research names, places. We’re able to put a lot of care into making sure that the correct grammar and punctuation is in place.
We’ve also done a lot of work on the operational side of the business, such as making it possible to match transcriptionists’ expertise and solid performance to certain types of content. And we have about 600 transcriptionists on staff. They cover a broad range of disciplines. For example, if you send us tax-related content, we’ll match someone with some financial background or someone who’s done a lot of math and financial work.
So without exception, all of the work is done by a trained transcriptionist, here in the United States. Every transcriptionist goes through a very rigorous training program before they even touch a real file. And they go through a number of audits and checks throughout the process.
One thing we’ve found is that no matter how hard we try– and we really push really hard on the quality– there is a chance we can’t get certain proper nouns, or certain vocabulary can be difficult. So we’ve built the ability for you to make changes on the fly. So once a file is complete, you can actually go in, see if anything is misspelled. You can decide if you want to redact something, change something, modify the paragraph breaks.
You can make all these changes and simply press Save. And those changes will update all of the different output files that we create for you. That’s transcripts, closed captions. You don’t have to reprocess anything at all. It’s just done automatically for you. And all the time codes and caption frame breaks, all of that’s accounted for. So it’ll regenerate right after you save the file.
We’ve built a number of tools that really try to make this process self-service or even automated for some people. But really, a lot of our success as a company is based on the fact that we are constantly in touch with our customers. And we’re really building relationships, and we’re giving our customers a lot of attention. We expect to walk people through the account tools. We expect to build those relationships.
And it’s through those conversations that we learn about what other features are worth developing, because to us, as much as we want it to be a simple upload-download-publish process, the reality is there are different pieces to the process that can always be made better. And we want to know that.
So in terms of just getting set up and going, the account set-up process is very quick. You can pay by credit card. You can be invoiced. You can add different users with different levels of access. So there are definitely different security measures in place. And you have full control, when you’re the administrator, as to who has what type of access.
Once the account is set up, there are a number of different ways to upload content to us. We have a secure web uploader, FTP. We have an API and then, as I mentioned, a number of integrations with the leading online video platforms and lecture capture systems– so Brightcove, Mediasite, Kaltura, Ooyala, Echo360, and a number of others. So if you’re using one of these platforms, the process becomes extremely quick.
We really aim to make the captioning workflow as unobtrusive as possible. We give you the ability to automate much of the workflow if you have those tools. And the captions that we provide are really compatible with most video players. We basically will give you probably about 15 different caption formats so that you have whatever you need, depending on how you’re publishing your video. Another thing to note is that the account is completely web-based, so there’s never software to install. You should never have to go through a big set-up process to do any work with us.
I mentioned the integrations. These are out-of-the-box integrations for any 3Play Media account. What that means is, with a quick set-up process, you’re able to make captioning requests for your videos from your video platform. Those videos would get sent to us, and then we’d send the finished captions back to the video platform and have them associated with the correct video. So as soon as you publish those videos and the captions are ready, they just show up and work. And you have very little to do.
One service that is actually worth mentioning is this idea of transcript alignment, meaning if you have a transcript already, you can upload that along with the video. And we’ll synchronize the text to create the closed captions. This is an automated service, so the turn-around is extremely fast. And it’s also available for Spanish content, if you have a Spanish transcript.
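To make the alignment idea concrete, here is a deliberately naive Python sketch of what the service produces: timed caption frames from plain text. Real transcript alignment analyzes the audio itself; this toy version just spreads a known duration across frames in proportion to their length, purely to show the shape of the output.

```python
def naive_align(frames, duration):
    """Assign (start, end) times to text frames proportionally to their length.

    This is NOT how real alignment works -- actual alignment matches the
    transcript against the audio -- but the output shape is the same:
    one timed frame per text segment, ready to serialize as captions.
    """
    total = sum(len(f) for f in frames)
    timed, t = [], 0.0
    for f in frames:
        dt = duration * len(f) / total
        timed.append((round(t, 3), round(t + dt, 3), f))
        t += dt
    return timed
```

The timed tuples this returns are the same kind of data a caption file is built from: each segment now knows when to appear and disappear.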
When you upload content, you can choose from several different turnaround options, and it goes into processing. Once it’s done, you would log back into your account, and you can download any of the different transcript or caption formats that we make available. You can download entire batches at once. You can download one file at a time. And we store these for you.
So you can come back and download different versions anytime you want. We just process it one time, and you have access to all of these. And certainly we have different methods to delete files if you wanted to remove them for security purposes. All that can be done as well.
Once files are complete, you have the option of translating them into pretty much any other language. You’d get both transcripts and subtitle files back for the languages that you’ve selected. And there are multiple quality and price options as well for the translation. One thing that’s important is that you can submit a style guide, and you also have the option of editing the subtitles after the translation is complete.
The Captions Plugin, which I mentioned before, is a free tool that lets you add closed captions or multilingual subtitles to almost any video player online. It works with a number of video players that, as I mentioned, up until recently didn’t even support captions, so that was really useful. And it makes the video searchable. So there’s actually a Search function. If you look at the magnifying glass icon here, that actually allows you to search the caption text and jump to a point in the video, all off of the captions.
The Captions Plugin can be embedded on a web page with just a single line of script. It’s very easy. And it just ties into the media player that you’re already using. We would host the text for you. You’d still be hosting or delivering your video the same way you normally do. So it’s really easy to use.
The interactive transcript that we offer is a very similar concept in terms of installation and compatibility. It basically expands the view to see more of the transcript at any given time. So the text would be highlighted as it’s being spoken. You can click on a word, jump to that part of the video.
There’s even a feature you can turn on that allows someone to highlight a segment of text and share a link directly to that exact segment. So it takes someone to the page and starts playing the video from that segment. So for social media, this is a great tool to start sharing content.
We recently launched a separate service called CaptionsForYouTube.com. This is designed specifically for YouTube users. It’s incredibly easy to use, very much streamlined for people who have YouTube channels. And you really just sign in with your YouTube account. And you can pay with a credit card.
It’ll show you what videos you have on your YouTube channel. And you can pick which ones you want to have captioned, and then we’ll post the captions back for you. It’s the same captioning service that you get with 3Play Media in the background. But the UI and the process is just really, really focused on YouTube.
So that’s the presentation. There are a number of links here that have some resources that might be useful. We’ll definitely take a minute to look at some of the questions that have come through. And we’ll be back in just one minute.
All right, so we have a few good questions we want to address. So first, before I go too deep, we are definitely going to be posting this online, so you will receive an email with a link to the recorded version. That’ll, of course, be captioned. So we have a couple questions about technical content and whether we can handle really, really technical, software-type content.
We do quite a bit of work for a number of technology companies, very large software companies and hardware companies. So we’re really used to pretty complex content and technological concepts. We do a lot of that work. And there are also a number of ways that we can really improve the quality of the output. For example, we can take glossaries or vocabulary lists ahead of time and incorporate that into the process.
There are even ways to do that on a file-by-file basis. Let’s say you have a webinar. You could upload a PDF document of the slides. And that’s something that we would use in the process as well. So we’re quite comfortable with pretty complex content. And that goes for enterprise, product-type content, as well as academic content.
There are some questions about mobile. And that’s been an evolving process for captions. We do offer the option to encode a video with the captions that we’ve created. So if you upload a video to us, we can actually encode the captions right into that video, meaning you would download a new video file with the captions as part of that video file. That’s one of the safest ways to deliver captions for mobile devices right now.
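This kind of burn-in encoding is commonly done with a tool like ffmpeg. The sketch below only constructs the command-line arguments (the `subtitles` video filter is a real ffmpeg feature for rendering an SRT file onto the picture); the file names are placeholders, and this is a generic illustration, not 3Play’s internal pipeline.

```python
def burn_in_command(video_in, srt_file, video_out):
    """Build an ffmpeg argument list that burns captions into the video
    frames (open captions), so they display on any device or player.

    Assumes ffmpeg is installed with subtitle-rendering support; this
    function only constructs the argument list and does not run anything.
    """
    return [
        "ffmpeg", "-i", video_in,
        "-vf", f"subtitles={srt_file}",  # render the SRT text onto the video track
        "-c:a", "copy",                  # pass the audio stream through untouched
        video_out,
    ]
```

The resulting list could be handed to `subprocess.run` to produce a new video file with the captions permanently part of the picture.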
Different devices, different operating systems, and different media players all handle captions differently on mobile, which obviously adds some complexity to all this. For example, YouTube, on most mobile browsers, does support captioning at this point. So it really depends on what it is you’re using and how you’re delivering it. But there usually is at least an option.
Just to clarify what we do in terms of translation and different languages, our core transcription and captioning is based on English content. We offer a Spanish alignment option. That’s really the only non-English solution we offer for the source content. From there, we can translate from the English or Spanish into other languages. So we do offer translation and subtitling through our platform.
So there’s some good questions about using captions for SEO and social media. So just to clarify how this is important for SEO, basically search engines can only ingest, really, text-based information. So a video can only be indexed based on what you, as the publisher, give Google beyond the actual video content, meaning title, tags, description. That information that goes on the page in a text form can definitely be indexed. The captions or transcript would just simply supplement that even more and really complement what you have in terms of title and tags.
The reason why that’s important is that you’re increasing your chances of Google really understanding what content is on that page. Google will not, anytime soon, be able to really know what’s inside that video without you giving it more information in a text-based way. So that’s where captions and transcripts make a huge difference when it comes to just pure Search Engine Optimization.
The next part is content consumption in general. The more you’re able to make the content consumable to a broader audience, the more you increase the chances of it being consumed and being shared. That’s a huge part of it. And the other part is user engagement, which is a huge factor in the search and indexing algorithms that Google uses and, in general, a great measure of how successful your content is.
Engagement can absolutely be helped by a transcript or captions, because you’re enabling people to consume the content however they want. So really, improving that user experience is a big part of this. And that is measured by time on site and things of that nature.
There are some questions about different caption formats. So there are a number of different caption formats, some used more than others with web media. Some of the common formats that you’ll hear about for web media are SRT, DFXP. One of the emerging standards is WebVTT, and there are definitely others. The media player will dictate which caption format you need to use.
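The gap between two of these formats is small enough to show directly: WebVTT is close to SRT, with a `WEBVTT` header and a period instead of a comma before the milliseconds. Here is a minimal Python conversion sketch (a full converter would also handle cue settings, styling, and positioning):

```python
import re

def srt_to_vtt(srt_text):
    """Minimal SRT-to-WebVTT conversion: add the WEBVTT header and switch
    the timestamp millisecond separator from a comma to a period."""
    vtt = re.sub(
        r"(\d{2}:\d{2}:\d{2}),(\d{3})",  # match HH:MM:SS,mmm timestamps only
        r"\1.\2",
        srt_text,
    )
    return "WEBVTT\n\n" + vtt
```

The regex targets only timestamp patterns, so commas inside the caption text itself are left alone.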
And then there are some other formats that are more standard broadcast or even DVD formats, like SCC and some others, that are making their way into the streaming and digital world. One of the reasons for that is that it’s just easier for these studios and networks to provide those files to get the captions up online. And there was a decision that was made. The balance is, what’s easiest, and what will get captions up in the most technologically sound way?
And they weren’t exactly a perfect match. So the decision was made to just get captions up using the caption files that already exist. And for many people, it was just easier to use the existing standard and adapt to that, rather than try to get everyone to adapt the captions to another standard.
So what we are seeing is some use of maybe some antiquated standards, if you will. But it’s a way to ensure that content’s actually getting captioned. And that was what was viewed as more important.
So if you’re distributing content to some of the large distribution options, such as Netflix or Amazon, sometimes they’ll ask for an SCC file, as opposed to one of the more digital-friendly formats, or even a SMPTE-TT file, which is really an XML-based format that is very friendly to work with. So that’s what we’re seeing. The challenge, obviously, is that SCC is a more complex format to work with. So there are different tools that can help, but you’re forced into a standard that’s a little bit harder to work with.
OK, we’ve gone over a few minutes, and so I just wanted to thank everyone for sticking around. Thank you for the questions. As I said, we’ll be putting a recorded version up online. And please do feel free to reach out with other questions. We’re happy to talk about any of this in more detail. So thanks for your time, and I hope you enjoy the rest of the day.