Quick Start To Captioning – Webinar Transcript
JOSH MILLER: All right, welcome and thank you for attending this webinar on closed captioning. My name is Josh Miller. We have about 30 minutes to cover the basics of captioning. We will try to make the presentation about 15 minutes, maybe 20 tops, and leave the rest of the time for your questions. The best way to ask questions is by typing them in the Questions window in the bottom right corner of your control panel. We’ll keep track of them and address all questions at the end of the webinar. And certainly feel free to email or call us any time after the webinar. If questions come up, we’re happy to discuss offline. And for anyone following on Twitter, the hashtag for this webinar will be #3PlayCaptioning, as shown on the screen here.
So today we’re going to give you an overview of closed captioning for web video, including some of the applicable legislation. We’ll talk about the services we provide and go over the process and workflow step by step, so you have an idea of what it really takes. So first, what are closed captions? We’re going to take it from the beginning.
Captioning refers to the process of taking an audio track, transcribing it to text, and synchronizing it with media. Closed captions are typically located underneath a video or overlaid directly on the video. In addition to spoken words, captions convey all meaning and include sound effects. This is a key difference from subtitles that we’ll talk about.
Closed captions originated in the early 1980s by an FCC mandate that applied to broadcast television. And now that online video is rapidly becoming the dominant medium, captioning laws and practices are proliferating there as well.
Some basic terminology– so first, captioning versus transcription. A transcript is usually a text document without any time information. On the other hand, captions are time-synchronized with the media. You can make captions from a transcript by breaking the text into small segments, called caption frames, and synchronizing them with the media, such that each caption frame is displayed at the right time.
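As a rough illustration of the segmentation step just described, the following Python sketch groups timed words into caption frames. It assumes each word already has a start and end time in seconds, which the talk does not specify. The names `CaptionFrame` and `build_frames` and the `max_chars` limit are hypothetical, chosen only for this example, and do not describe 3Play Media's actual process.

```python
from dataclasses import dataclass


@dataclass
class CaptionFrame:
    start: float  # seconds from the beginning of the media
    end: float    # seconds from the beginning of the media
    text: str


def _close(words):
    """Turn a list of (word, start, end) tuples into one caption frame."""
    text = " ".join(word for word, _, _ in words)
    return CaptionFrame(start=words[0][1], end=words[-1][2], text=text)


def build_frames(timed_words, max_chars=32):
    """Group (word, start, end) tuples into frames of at most max_chars.

    max_chars is an assumed display limit, not a value from the talk.
    A single word longer than max_chars still gets its own frame.
    """
    frames = []
    current = []
    for word, start, end in timed_words:
        candidate = " ".join([w for w, _, _ in current] + [word])
        if current and len(candidate) > max_chars:
            frames.append(_close(current))
            current = []
        current.append((word, start, end))
    if current:
        frames.append(_close(current))
    return frames


words = [("Welcome", 0.0, 0.4), ("to", 0.4, 0.5),
         ("the", 0.5, 0.6), ("webinar", 0.6, 1.1)]
for frame in build_frames(words, max_chars=12):
    print(frame)
```

Each frame takes its start time from its first word and its end time from its last word, which is what lets a player display it at the right moment.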
Then captioning versus subtitling, which I referred to just now. The difference between captions and subtitles is that subtitles are intended for viewers who don’t have any kind of hearing impairment, but may not understand the source language. So subtitles capture the spoken content, but not necessarily things like sound effects or other relevant noises because, most of the time, those people can actually hear everything that’s going on. And in most web video, it’s possible to create multilingual subtitles in addition to closed captions.
The difference here with closed versus open captioning– and we don’t hear open captioning quite as often with web video– the difference here is that closed captions can be turned on or off by the user. So there’s often a little CC button that you’ll see on a video player. Whereas open captions are usually burned into the video and can’t be turned off in any way. So they are always being displayed on the video. And as I said, most web video will use a closed captioning standard so that the viewer has control over whether those captions show up or not.
And then post production versus real-time. Post production means that the captioning process occurs offline, and usually after the content has been recorded or created. Whereas real-time captioning is done by live captioners and will be, basically, streamed in conjunction with the video content. And certainly, there are advantages and disadvantages of either process, depending on what it is you’re trying to do.
So real quick, how are captions used? Although captions originated with broadcast television, nowadays captions are being applied across many different types of media, especially as people become more aware of the benefits of captions on the internet and as laws become increasingly stringent. Since every video player and software application handles captions a little bit differently, we’ve created a number of how-to guides that you can find on our website. So certainly, feel free to ask us if you have a particular question about a particular video player, but we do have quite a bit of information on our site to walk you through this process.
A little bit of information about some accessibility laws. Section 508 is a fairly broad law that requires all federal electronic and information technology to be accessible to people with disabilities, including employees and the public. So for video, this means that captions must be added to any video content on the website. For podcasts and audio files, a transcript is usually sufficient.
Section 504 entitles people with disabilities to equal access to any program or activity that receives federal subsidy. So web-based communications for educational institutions and government agencies are both covered by this. Section 504 and 508 are both from the Rehabilitation Act. And many states have even enacted similar legislation to sections 504 and 508 under different names.
The 21st Century Communications and Video Accessibility Act was signed into law at the end of 2010. This law was talked about quite a bit for a while. This expands closed caption requirements for all online video that previously aired on television. And as of right now, it is specifically focused on content that was on television. This law’s often referred to as the CVAA. And there is talk of more legislation around video captioning beyond just television content as well.
So a bit of an update on the CVAA law– as of right now, any content that was pre-recorded and not edited for internet distribution should have captions online. This goes for previously aired content, as well as any content that goes up now, on an ongoing basis, is supposed to be captioned.
And as you can see, there are certain milestones for different types of content and when a steady-state captioning capability is supposed to be ready to go. And this is really for the content producers to have the captions ready to go. Most of the distribution channels, whether it be YouTube, or Hulu, Netflix, and even most websites, such as the network websites, most of them can support the captioning. It’s just a matter of making sure it’s published with the captions.
So although the primary purpose for captions and transcripts is to provide an accommodation for people with hearing disabilities, which is certainly critical, people have discovered that there are many other benefits, especially with the tools available on the internet. So first, captions improve comprehension and remove language barriers for people who know English as a second language. We actually hear this quite a bit with some of the educational organizations that we work with. Captions compensate for poor audio quality or noisy background and allow the media to be used in sound-sensitive environments, like a workplace or a library. So if you can’t listen to the audio, or if you can’t even put headphones on, there’s really no other way to follow along if you don’t have captions.
From a search engine optimization point of view, captions make your video a lot more discoverable. Since the search engines can’t actually index a video file itself, it requires text. And certainly, once your video has been found, captions allow it to be searched even more deeply and reused. So this is especially important with long-form video. For example, if you’re looking for something in a one-hour lecture, you can quickly search through the text instead of having to watch the whole thing. So you can search through the text and jump to an exact point in a video if you use the time-synchronized nature of the captions in a certain way. We have tools to do that.
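To make the search-and-jump idea concrete, here is a minimal Python sketch. It assumes the captions are available as `(start_seconds, end_seconds, text)` tuples. The function name `find_in_captions` is hypothetical and is not one of the 3Play tools mentioned in the talk.

```python
def find_in_captions(frames, query):
    """Return (start_seconds, text) for every frame containing the query.

    The match is a simple case-insensitive substring test; a real tool
    would likely handle phrases that span two frames as well.
    """
    needle = query.lower()
    return [(start, text) for start, _end, text in frames
            if needle in text.lower()]


lecture = [
    (0.0, 4.0, "Welcome to the lecture."),
    (1805.2, 1809.0, "Now let's turn to Section 508."),
    (3590.0, 3594.5, "That wraps up Section 508 and 504."),
]

# Each hit carries the time a player could seek to.
for start, text in find_in_captions(lecture, "section 508"):
    print(f"{start:.1f}s: {text}")
```

The returned start times are what a video player would seek to, so the viewer lands on the matching passage without watching the whole recording.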
And then finally, transcription is required as the first step towards translation. So if you are trying to reach a global audience, and you do want to make your content available in other languages, the first step would be to create an English closed captions file that can be turned into any other language.
So real quick, in terms of caption formats, there are many different caption formats that are used with different media players. Each media player might have a slightly different spec or different requirement for caption file formats. The image here shows what a typical SRT caption format looks like. You see here three different caption frames with the starting and end time of each frame, along with the corresponding text. SRT, for example, is the common format used for YouTube.
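The slide image is not reproduced in this transcript, so the sketch below shows the general SubRip (SRT) layout instead: a running index, a time range written as `HH:MM:SS,mmm --> HH:MM:SS,mmm`, the caption text, and a blank line between frames. The function names and the sample frames are hypothetical and are not taken from the slide.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(frames):
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(frames, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    # Joining with a newline leaves one blank line between frames.
    return "\n".join(blocks)


sample = [
    (0.0, 2.5, "All right, welcome and thank you"),
    (2.5, 5.0, "for attending this webinar"),
    (5.0, 7.25, "on closed captioning."),
]
print(to_srt(sample))
```

Other caption formats carry the same three pieces of information per frame and differ mainly in syntax, which is why one time-synchronized transcript can be exported to many formats.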
So I’ll give you a little bit of a background on 3Play Media, who we are, what we focus on. 3Play Media started when we were doing some work in the spoken language lab at CSAIL, which is the computer science department at MIT. And we were actually approached by MIT OpenCourseWare with the idea of applying speech technology to captioning for a more cost-effective and more efficient solution.
We quickly recognized that speech recognition alone would not suffice, but it did provide an interesting starting point. And so from there, we developed an innovative transcription process that uses both technology and humans and yields high quality transcripts with time synchronization. So we’re constantly developing new products and ways to use these time-synchronized transcripts, largely with the input of our customers. Whether it be search tools, or connecting pieces of content to the time-synchronized text, and therefore the video, we’re very interested in hearing what your thoughts are and how it can be made more useful.
So our focus is to provide premium quality transcription and captioning services. We can translate into other languages. We can take an existing transcript and add timing information to enable all the same tools. And we have some unique interactive tools that use our time-synchronized transcripts to enhance search navigation. We actually talk about that in another webinar and have quite a bit of information on our website about that.
So we use a multi-step review process that delivers more than 99% accuracy, even in cases of poor audio quality, multiple speakers, difficult content, or accents. So if you think about the speech technology aspect, that usually covers maybe about two thirds of the transcript. That’s basically two thirds is often accurate. So the rest is done by trained humans who, therefore, clean up all the mistakes. And they do it in a very efficient way, based on the interface and software that we’ve built. It actually affords our transcriptionists the flexibility to spend more time on some of the finer details of the transcript. For example, we’ll diligently research difficult words, names, and places. So we can put more care to ensure that the correct grammar, and punctuation, and speakers are all identified properly.
So one thing that we’ve found is that no matter how hard we try to get everything right, there are going to be certain proper nouns or vocabulary that can be difficult to get exactly right the first time around. So we’ve actually built in the ability for you to make a change on the fly yourself. So if a name is misspelled, or if you decide you even want to remove a sentence, you can go in, make that change, with the video right there as a reference, save it, and then all those changes immediately propagate through all the output files. So there’s never a need to reprocess something just for a small change.
While we’ve built many tools that are self service, or automated, and we really do try to make the process as easy as possible, a lot of our success as a company is based on the fact that we give our customers a lot of attention and really listen to suggestions and what they have to say about how the process could be made even easier. So we expect to walk people through the account tools. We enjoy building those relationships. It’s through these conversations that we learn about other features that might be worth developing. So we really do value your feedback. We take it very seriously.
So a little bit on the process itself and how you’d actually do this. Getting an account set up is very quick. You can pay by credit card. You can be invoiced. We’re very flexible in terms of the payment process. You can even have different levels of user privileges set up for– you can have multiple users per account, and you can easily set who has access to what.
When it comes to uploading files, basically you would upload the video content to us. We do need the actual video file. And then there are a number of ways you can do that. You could do it through the secure web uploader within the account. You could use FTP. We have an API. We also have a number of integrations with some of the leading online video platforms and lecture capture systems. So Brightcove, Mediasite, Kaltura, Ooyala, Echo, Tegrity– all of those have a specific, out of the box workflow that you can use to make the process easier.
And really, what we’re trying to do is make the captioning workflow as unobtrusive as possible. So we’re giving you the ability to automate as much as possible. And the captions themselves, the tools we offer, are all compatible with most video players. So one thing I should note also is that this whole account system we’re talking about is all web based. There’s no software to install at all.
So when you upload your files, you’ll actually be prompted to specify whether you need full transcription and captioning, or if you have a transcript and just need the timing added, so if you just need it to be aligned to the video file. In either case, you’ll get access to multiple caption and transcript formats. You’ll have access to the interactive transcripts. So in terms of the output, there’s no difference. It’s just a matter of what you’re starting from.
And as I mentioned, with every file you’d have access to many different output formats. You never have to specify ahead of time. You would be able to download one or many formats very, very quickly whenever you want. It’s completely on demand once that file has been created. And you have full access to that at all times.
One thing we recently added is a fully integrated translation workflow as well. So now, if you are going to be adding subtitles to your video, you can do all that right from your account system. So as soon as that file’s complete, you can very easily select which language. There are multiple price and quality options for you. And what we’ve also built in is the ability to add specific preferences, so stylistic preferences such as tone, formality, any kind of vocabulary that would be relevant, in terms of translation. So maybe there are certain words you do or do not want translated a certain way– all of that can be conveyed to the translators through this translation profile that you would set up. And then from there, once the file is back and translated, we also have an editing interface for the translated file. So that you’ll always be able to make those changes very quickly, and, again, those will update all the formats as soon as you hit that save button.
The captions plug-in we offer is a free tool that lets you add closed captions or multilingual subtitles to pretty much any video. It works with a number of video players that don’t support captions as well, such as Vimeo. It makes your video searchable and SEO friendly when published properly. So it’s a great tool to kind of get the best of both worlds, in terms of the accessibility aspect, but also the high value SEO boost in searchability.
So to install that plug-in, it’s really just based on an embed code that you would pull from your account system. It’ll automatically communicate with the video player so that the captions show up very easily. All of the captions data by default is hosted by 3Play Media, whereas the video will also be hosted the way you normally would host your video. And certainly, if you wanted to, you could self-host the captions and plug-in as well.
So the captions plug-in works out of the box with a number of video players, as I mentioned. As of right now, all the players I mentioned before– Brightcove, YouTube, Kaltura, Vimeo, Ooyala, JW Player, Flowplayer, Wistia, those are all supported, as well as a few others. HTML5, certainly. And that’s something that we are adding to as well. So whenever there is another video player that’s getting quite a bit of use, we’ll certainly add it for compatibility purposes.
So we’ve included a few URLs here that might be helpful. As I mentioned before, we have a number of guides on our website on how to add captions to different types of video players. So that’s the first link. And then our support site has quite a bit of documentation about our tools, how to add captions to different video players, how to set up some of those integrations, and, certainly, how to add the captions plug-in, or what we call the interactive transcript, to a video player. So you should definitely feel free to check those out. The support documents are public. You don’t need an account to access those. So we’re just going to take a minute to aggregate the questions. And we will be back shortly.
All right, so first I’m going to talk about– there’s a question about how we handle specialized academic content or really, for that matter, any kind of complex content. And this is a good question that comes up quite a bit. So there are a few ways we attack this. First, I mean, quality in general is really important to us. So one of the first things that we’ve done is, every transcriptionist that we have on staff is here in the US. So that’s actually the first step towards getting high quality and being able to recognize more difficult vocabulary. So that’s one thing.
Next is, in terms of any kind of difficult vocabulary that’s domain-specific, if you have reference materials or vocabulary lists those can absolutely be used in the process. One way would be to just literally send us that whole list. We can incorporate it into the instruction set that will get used throughout the process for the whole project, or for a batch. The other option would be, on a file by file basis, there are ways for you to add keywords, vocabulary, speaker names, or anything like that that’s relevant to that specific file. So you can do that through the account system.
Next, in terms of the actual transcription process, we have a kind of unique way of doing it, because we are using speech technology. We do have that full human review, which is a pretty in depth scrubbing and editing process. And then we have a third step, actually, where there’s a QA step on top of all that, which is kind of in addition to what most firms would do.
So what happens in that QA step, we have this concept called flags. So if the person doing that first pass isn’t sure about the word that they are going over, they’ll flag it, which means the QA person, therefore, knows to try to research that word. If the QA person still can’t figure out what that word is, they’ll leave it flagged, which means you, as the customer, can go into that editing interface and actually highlight any of those flags to quickly edit them. And in that editing interface, when you click on a word, you can actually jump to that part of the video and review it very quickly.
So obviously, our goal is to not leave words flagged. But in certain cases, the reality is that we just may not know the content as well as you will. So we’ll do everything we can to get it right, but we also try to make it really, really easy to quickly tweak a word that really should be reviewed by an expert.
And then the last part that we can do is, we actually can filter content based on the domain that it falls into, and try to route that to people with some more of that expertise. So we might not have biochemists on staff, but we probably do have people who have taken more biology classes than others, and we can take advantage of that. So that’s how we handle more complex content.
In terms of editing files, this is a question related to editing files and in regards to also having integration with one of the video platforms set up. Right now, we have a number of integrations that allow you to basically automatically send us a file to be captioned. And then we can automatically send those captions back to the appropriate place so that they show up when you publish your video.
When you edit, you certainly can edit any file, still. You still have full access to the account system and all the different file types. When you edit, the files that have been changed will need to be updated in the system that you’re using. In most cases right now, it does require one or two steps to do that. And we can walk you through how that happens. But most often, it’s not automatically going to update. If you’re using our plug-ins, it will automatically update. So I try not to be confusing because there are certainly cases where it will or won’t update, but think about it this way: an integration requires an extra step. If it’s a plug-in from 3Play, it will automatically update.
Some questions about file formats and video editing software. We offer most caption file formats at this point, pretty much everything except for broadcast formats. And we do even offer some broadcast standards. In terms of video editing and DVD publishing, we do offer the standard caption formats for that. And then how you’re ultimately going to publish that will kind of dictate which formats you ultimately should use. But more likely than not, we do have the caption format that you would want to use.
And also, a couple of people have asked whether this will be available afterwards as a recorded webinar. This will be posted on our website. We will send out a link to the recorded version of this webinar with captions so that everyone can view it. And that will be public, so you can certainly share if you want, as well.
There’s a question about pricing. Everything is duration-specific, or usage based, if you will. So you really only pay for how much content you have us transcribe or caption. It will be prorated to the exact duration of each file. And all of the fees are based on that duration. So we offer different turnaround options, so our standard being four business days, also offering one or two day options. And we’re even playing around with the same-day option.
So there are a number of options for you. Again, everything will be prorated, though, whether it be the one day or the four day. And the pricing is actually available on our website. So if you want the breakdown, I would encourage you to take a look there.
So we are going to wrap this up. Thank you all for your questions. Certainly feel free to reach out to us with any other questions if they come up. We’re happy to be a resource through this process, so we look forward to speaking with you. Thanks again for joining us.