
Quick Start To Captioning – Webinar Transcript


JOSH MILLER: OK. We’re going to get started. So welcome, and thank you for attending this webinar on closed captioning. My name is Josh Miller, and we have about 30 minutes to cover the basics of captioning today. We’re going to try to make the presentation about 15 minutes, and then leave the rest of the time for your questions.

The best way to ask questions is by typing them into the questions window in the bottom right corner of your control panel. We’ll keep track of them, and address them all at the end of the presentation. Also, certainly feel free to email or call us any time after this webinar. My contact information is on the screen right now. And for anyone following along on Twitter, the hashtag for this webinar will be #3PlayCaptioning, as shown on the screen here.


So today, we’re going to give you an overview of closed captioning for web video, including some of the applicable legislation. We’ll talk about the services that we provide, and go over the process and workflow step by step.

What Are Closed Captions

So we’ll take it from the beginning. What are closed captions? Captioning refers to the process of taking an audio track and transcribing it into text, then synchronizing that text with the media. Closed captions are typically located underneath a video or overlaid on the lower third of the visual. In addition to spoken words, captions convey all relevant meaning, including sound effects. This is a key difference from subtitles.

Closed captions originated in the early 1980s with an FCC mandate that applied to broadcast television. Now that online video is rapidly becoming the dominant medium, captioning laws and practices are proliferating there, as well.

So some basic terminology. Captioning versus transcription. A transcript is usually a text document without any time information. On the other hand, captions are time-synchronized with the media, as I mentioned. You can make captions from a transcript by breaking up the text into smaller segments, which are usually called caption frames, and then synchronizing them with the media, such that each caption frame is displayed at the right time.
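As a rough illustration of that last step, the sketch below groups words into caption frames. It assumes word-level timings are already available as (word, start, end) tuples; that input format, the `CaptionFrame` and `build_caption_frames` names, and the character limit are hypothetical choices for this example, not a description of any particular captioning system.

```python
from dataclasses import dataclass


@dataclass
class CaptionFrame:
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media
    text: str


def build_caption_frames(timed_words, max_chars=32):
    """Group (word, start, end) tuples into frames of at most max_chars characters.

    A single word longer than max_chars still gets a frame of its own.
    """
    frames = []
    words, frame_start, frame_end = [], None, None
    for word, start, end in timed_words:
        candidate = " ".join(words + [word])
        # Close the current frame when adding this word would exceed the limit.
        if words and len(candidate) > max_chars:
            frames.append(CaptionFrame(frame_start, frame_end, " ".join(words)))
            words, frame_start = [], None
        if frame_start is None:
            frame_start = start
        words.append(word)
        frame_end = end
    if words:
        frames.append(CaptionFrame(frame_start, frame_end, " ".join(words)))
    return frames


# Hypothetical word timings, made up for illustration.
timed_words = [
    ("So", 0.0, 0.2), ("welcome,", 0.2, 0.7), ("and", 0.7, 0.9),
    ("thank", 0.9, 1.2), ("you", 1.2, 1.4), ("for", 1.4, 1.6),
    ("attending", 1.6, 2.2), ("this", 2.2, 2.4), ("webinar.", 2.4, 3.0),
]
frames = build_caption_frames(timed_words, max_chars=20)
```

Each frame takes its start time from its first word and its end time from its last word, which is what lets a player display it at the right moment.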

Then captioning versus subtitling. The difference between captions and subtitles is that subtitles are intended for viewers who do not have a hearing impairment, but may not understand the language. Subtitles capture the spoken content, but not the sound effects. For web video, it’s possible to create multilingual subtitles.

Then closed captioning versus open captioning. The difference here is that closed captions can be turned on or off, while open captions are burned into the video and are always displayed. Most web video examples that you may have seen are using closed captions.

Post production versus real-time. Post production means that the captioning process occurs offline, and usually takes a few days to complete, while real-time captioning is done by a live captioner or court stenographer. And there are certainly advantages and disadvantages of each process.

How Are Captions Used

So real quick, how are captions used? Although captions originated with broadcast television, nowadays captions are being applied across many different types of media, especially as people become more aware of the benefits with internet-based content, and as laws become increasingly more stringent. Since every video player and software application handles captions a little bit differently, we’ve created a number of how-to guides which you can find on our website under the Resources page.

Accessibility Laws – Section 508

So a quick overview of some of the accessibility laws. One of the laws that comes up quite a bit is Section 508. Section 508 is a fairly broad law that requires all federal electronic and information technology to be accessible to people with disabilities, including employees and the public. For video, this means that captions must be added. For podcasts or audio files, a transcript is usually sufficient.

Section 504

Section 504 entitles people with disabilities to equal access to any program or activity that receives federal subsidy. Web-based communications for educational institutions and government agencies are covered by this, as well.

Sections 504 and 508 are both from the Rehabilitation Act, and many states have enacted similar legislation at the state level, mirroring the section 504 and 508 at the federal level.

21st Century Communications and Video Accessibility Act (CVAA)

The 21st Century Communications and Video Accessibility Act was signed into law in October of 2010, and expands the closed caption requirements to all online video that previously aired on television. This law is often referred to as the CVAA. And there are laws to expand beyond network television in discussion right now as well.

CVAA Update

The Video Programming Accessibility Advisory Committee, known as VPAAC, was created by the FCC to oversee all recommendations for implementing a quality captioning experience across streaming devices. The VPAAC is in the process of releasing its final report and recommendations to the FCC so that the final rules can be established.

And what you see here is a timeline for content owners to implement the processes to adhere to the new captioning rules. So at the minimum, publishers will have six months to figure out how to make sure any content that had aired on television with captions will also have captions on the internet. So the new FCC rules will apply to broadcast and cable networks, and any other TV station that makes content available online.


So we’ll talk a little bit about the benefits of closed captioning, especially with web video. So originally, the purpose of closed captions was to provide accommodations for the millions of people who have hearing disabilities. But people started using captions for many different reasons. And there are many benefits beyond accessibility in the formal sense, especially in the context of online video.

And we often like to talk about access and accessibility to content being much more than just about the ability to hear or not hear. It’s really about being able to find the content you’re looking for and consume the content that you’re looking for.

So captions improve comprehension and remove language barriers for people who know English as a second language. Captions also compensate for poor audio quality or a noisy background and allow the media to be used in sound-sensitive environments like a workplace. From a search engine optimization point of view, captions make your video a lot more discoverable, because search engines are able to index what is actually being spoken, whereas they can’t index anything within the video other than titles and tags.

Once your video has been found, captions allow it to be searched and reused much more efficiently. This is especially important with longer-form video. For example, if you’re looking for something in a one-hour lecture, you can quickly search through text instead of having to watch the whole thing. And we actually provide a number of search and interactive tools that we will have on display in another upcoming webinar.

So lastly, transcription is also the first step in translating to other languages. So if you are working with a global audience, this is going to be one of the steps you need to take.

Caption Formats

There are many different caption formats that are used with specific media players. The image at the top of this slide shows what a typical SRT caption file looks like. Here we have three caption frames on display. And you can see that each frame has a start time and an end time and then the associated text.
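For readers who cannot see the slide, the sketch below shows the general shape of an SRT file: a frame number, a start and end time written as HH:MM:SS,mmm and separated by an arrow, then the caption text, with a blank line between frames. The function names and the sample frames are made up for illustration.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(frames):
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(frames, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Joining with a newline leaves one blank line between frames.
    return "\n".join(blocks)


# Three illustrative caption frames (timings are invented).
sample_frames = [
    (0.0, 2.5, "OK. We're going to get started."),
    (2.5, 5.0, "So welcome, and thank you"),
    (5.0, 7.25, "for attending this webinar."),
]
srt_text = to_srt(sample_frames)
```

Printing `srt_text` gives three numbered frames, each with its timing line followed by its text.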

Once a caption file is created, it needs to be associated with a corresponding video file. And the way to do that depends on the type of media and video platform or video player that you’re using. So for sites like YouTube, all you really have to do is upload that caption file to the particular video page, and the association is done for you.

Many other video platforms make the process very easy. And we recently launched a captions plug-in that works with many web video players to simplify the process as well. It even works with video players such as Vimeo, which actually has no caption support at this time. And we’ll talk a little bit more about that in a few minutes.

Company Background

So some information just about 3Play Media. The inspiration for 3Play Media started when we were doing some work in the spoken language lab at CSAIL, which is the computer science department at MIT. And we were approached by MIT OpenCourseWare with the idea of applying speech technology to captioning for a more cost-effective solution.

We quickly recognized that speech recognition alone would not suffice, but it did provide a starting point. So from there we developed an innovative transcription process that uses both technology and humans, and yields high-quality transcripts with time synchronization. So we’re constantly developing new products and ways to use these transcripts, largely with the input of our customers.

Overview of Services

So a little bit about the services we offer. Our focus is really to provide premium-quality transcription and captioning services. That’s really the core of what we’re doing. We also can translate into many different languages. And we have some unique interactive tools that use our time-synchronized transcripts to enhance search and navigation, and just the general user experience when it comes to consuming video content. But like I said, that will be discussed in a separate webinar.

Accuracy & Quality

So we use a multi-step review process that delivers more than 99% accuracy, even in cases of poor audio quality, multiple speakers, difficult content, or accents. Typically about 2/3 of the work can be considered to be done by a computer, or an automated process. And the rest is really done by transcriptionists, where there is that human cleanup process to make sure it’s high-quality.

This makes our process more efficient than many other vendors. But more importantly, it affords our transcriptionists the flexibility to spend more time on the finer details. For example, we’ll diligently research difficult words or names and places to really try to make sure we’re delivering the best possible transcript file we can. We’re also putting in more care to ensure correct grammar and punctuation.

We’ve done a lot of work on the operational side of the business, such as making it possible to match transcriptionists’ expertise with certain types of content. We have about 300 transcriptionists on staff. So they really do cover a broad range of disciplines. For example, if you send us tax-related content, we can actually match that content with a transcriptionist who has a financial or quantitative background.

And without exception, all of our work is done by professionally-trained transcriptionists here in the United States. Every transcriptionist goes through a rigorous training program before they touch any file. They also go through background checks on occasion, and enter into confidentiality agreements.

Captions Text Editor

One thing we found is that no matter how hard we try, certain proper nouns or vocabulary can be very difficult to get exactly right. So we’ve actually built the ability for you, the customer, to make changes on the fly. If a name is misspelled or if you decide you want to redact even an entire paragraph, you can actually make that change yourself in this interface within the account.

And then when you save it, your changes will immediately propagate through all of the output files. So there is no need to reprocess anything, and you can make any change you need as soon as you need to.

Customer Support

While we’ve built a number of tools that are self-service or automated, much of our success is really based on the fact that we give all of our customers a lot of attention. We expect to walk people through the account tools. Some of them are a little bit more advanced for what used to be a more simplified process for transcription and captioning.

So we really do enjoy building those relationships and getting feedback on what we’ve built. And it’s those continuous conversations that allow us to learn about what other features might be useful. So we really do want that feedback and to learn about what else could be useful.

Account Setup

So a little bit about how the process works. How do you get started? Getting an account set up is a very quick process. In general, payment is very flexible. You can pay by credit card within the account, or you can get invoiced. The account system also has a number of security measures built in. So you can set privileges for different types of users, and make sure that access really is restricted properly.

Uploading Files

Once the account is set up, the next step is to upload the video content to us. There are many different ways to do that. You can use a secure web uploader within the account. You have unique FTP credentials. Or we have an API that you can use. We’ve also built integrations with the leading online video platforms and lecture capture systems, including Brightcove, Kaltura, Ooyala, Mediasite, Echo360, and several others. So if you’re using one of these platforms, then the process is even easier than what we’ve just described.

We really aim to make the captioning workflow as unobtrusive as possible. So we give you the ability to automate a lot of the workflow. The captions and tools are compatible with a number of different video players.

Final Output

After you’ve uploaded the content, it goes into processing. Standard turnaround is four business days. And we have service levels for one and two days as well, if necessary. And you can log into your account, and download pretty much any transcript or caption format that you might need right from that account at any time.

The transcript and caption files are stored in the account indefinitely, so you have full access whenever you need them. You can also use that editing interface that I mentioned to make changes at any time. That is a full-access type tool. There are a number of other features in the account system that we can certainly talk about separately, as well.

Captions Plugin

One thing that we recently launched is our captions plugin. This is a free tool that lets you add closed captions or multilingual subtitles to any video. It even works with video players that don’t support captions, such as Vimeo, as I mentioned. It also makes your videos searchable and SEO-friendly. So we’ve tied all that in, and it is an embeddable plugin.

So to install the captions plugin, you are going to end up inserting some embed code into your web page, just like you would with your video player. The plugin will automatically communicate with that video player. And then the captions data and search are all hosted by 3Play Media. So it’s pulled in over our API. And if you needed, you could also self-host the plugin. So that’s an easy option as well.

The plugin works out of the box with many different video players, as I mentioned, even players like Vimeo, YouTube, and then certainly paid platforms like Brightcove, Kaltura, Ooyala. All of them are compatible, as well as some of the open-source players like JW or Flowplayer. So it’s very flexible and really does give you a number of options when it comes to publishing.

The caption plugin also has search built into it, if you want. So if you see on this image here, there’s a magnifying glass on the right. You can actually search by keyword, and that’s what the popup shows. You can search by keyword and see where that word appears in the entire file, and then jump to that part of the file, something that’s a bit unique for captions. And then if there are multiple languages available, there would be an arrow to allow the user to select which language they want to see the captions in. And they can do that on the fly.
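The idea behind that kind of keyword search can be sketched in a few lines. This is only an illustration of searching time-synchronized text, not the plugin's actual implementation; the `search_captions` name and the sample frames are hypothetical.

```python
def search_captions(frames, keyword):
    """Return the start times (in seconds) of frames whose text contains keyword.

    Matching is a case-insensitive substring match, so "caption" would also
    match "captioning".
    """
    needle = keyword.lower()
    return [start for start, _end, text in frames if needle in text.lower()]


# Illustrative (start_seconds, end_seconds, text) frames.
frames = [
    (0.0, 4.0, "Welcome to this webinar on closed captioning."),
    (4.0, 8.5, "Captions are time-synchronized with the media."),
    (8.5, 12.0, "A transcript has no time information."),
    (12.0, 16.0, "Subtitles differ from captions."),
]
hits = search_captions(frames, "captions")
```

A player could then seek to any of the returned start times, which is the "jump to that part of the file" behavior described above.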

The plugin also has SEO support. So there’s a way to make sure the text of the captions is part of the embed, and therefore in the HTML, which is important. And therefore it will get indexed by search engines.

So that’s what we wanted to go through today. We have some time for questions. So go ahead and type any questions you have into the questions window. We will take a minute to aggregate them, and then start answering them.

We’ve also included several URLs here on this page for resources on our website. And certainly, if you wanted to register, all the relevant links are here, as well as my contact information. So please feel free to get in touch with us if you do have any other questions. And we will be back in one minute.


All right. Thanks for your patience. We’ve got a number of good questions here. I’m going to start with– there’s a question about pricing. So real quick, our standard rates start at $150 per recorded hour. So everything is based on the duration of the content itself. And then that gets prorated to the exact duration of every file. So even if it’s two minutes, it would be $5 to transcribe and synchronize. And that would include all the different formats you actually saw on that screen, as well as all the tools we offer.
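The proration arithmetic can be checked with a short sketch. It uses the $150 per recorded hour rate quoted above; the function name and the rounding to the nearest cent are assumptions made for this example, and it leaves out any expedite or difficult-audio fees, since no amounts are given for those.

```python
from decimal import Decimal, ROUND_HALF_UP


def prorated_cost(duration_seconds, hourly_rate=Decimal("150")):
    """Prorate an hourly rate to a file's duration, given in whole seconds.

    Rounding to the nearest cent is an assumption for this sketch.
    """
    cost = hourly_rate * Decimal(duration_seconds) / Decimal(3600)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


two_minutes = prorated_cost(120)    # the two-minute file from the example
one_hour = prorated_cost(3600)
```

For the two-minute file, 150 × 120 / 3600 comes to exactly $5, matching the figure in the answer.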

So the only fees that might come into play are if the file needs to be expedited, there’s a little bit of an extra fee, or for content that’s especially difficult to transcribe, whether it be background noise or the microphone was too far away. Then there would be an additional fee. But basically, we’re talking about $150 per recorded hour.

There’s a question about live events. Unfortunately, we do not offer support for any live video or live streaming. All of the content that we work with is based on recorded content.

In terms of SEO, the rules about SEO in video, really, it comes down to general SEO practices. And to get something indexed, it needs to be on the page and in the HTML for a Google bot or any other search engine to recognize it. For video, this is especially difficult, because there’s no way to see what’s inside that video. So in that case, you want to basically offer what is the equivalent of that video, which is really a text transcript. And that will capture what is being said.

So there are a number of ways to add the text to the page to make sure it gets indexed properly. And it really depends on publishing preferences. In some cases, it might make sense to publish the text visually as part of the page. In other cases, it might make sense to not make it show up, especially if you are using captions or one of our interactive transcript plug-ins. In that case, it becomes redundant.

And there is a way to get the text into the HTML so that it’s not redundant, and so that it is legal according to Google and other search engines’ rules. We have a number of ways to do that. We can walk you through it. It’s actually built into the plugin that we offer, because we have a plugin builder with different options. As you choose to embed that into your page, you can actually pick from certain options. And SEO is one of them.

So questions about translation and multilingual subtitles. Basically, the starting point would be this time-synchronized English text file, or the captions file. From there, what we’re essentially doing is maintaining those timecodes for each caption frame, and translating the text from English to another language for each frame. We would then be able to ingest that back into our system for use with a number of our tools.
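A minimal sketch of that idea: the timecodes of each caption frame stay exactly as they are, and only the text is replaced. The `translate` argument here is a hypothetical stand-in for whatever translation step is used, and the tiny glossary is purely illustrative.

```python
def translate_frames(frames, translate):
    """Return new (start, end, text) frames with the same timecodes and translated text."""
    return [(start, end, translate(text)) for start, end, text in frames]


# Stand-in for a real translation step: a fixed lookup table.
GLOSSARY = {
    "Welcome.": "Bienvenidos.",
    "Thank you.": "Gracias.",
}


def toy_translate(text):
    # Fall back to the original text when no translation is known.
    return GLOSSARY.get(text, text)


english = [(0.0, 1.5, "Welcome."), (1.5, 3.0, "Thank you.")]
spanish = translate_frames(english, toy_translate)
```

Because the frame boundaries are untouched, the translated track stays in sync with the media in the same way the English captions do.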

Or if you just want to use the caption support and subtitle support of the video player you’re already using, each player has a specified way of either linking to multiple tracks, or we’ll have documentation on how to combine those tracks into one track so that it can reference multiple languages.

There’s a question about how those captions are associated with the file. In general with web video, the captions remain as a separate file. And the player knows how to recognize that there is a caption track, and can provide the proper user experience so that the viewer can either turn the captions on or off.

This is different from, say, DVD or other video production techniques, where sometimes the caption track has to be actually embedded into the core video file. So it’s important to recognize that there is a bit of a different process there. We would actually say that the web process, where it remains as a separate file, is actually a lot easier and a lot faster for publishing purposes.

There’s a question about some of the bigger platforms and common platforms that we see and certainly work with. For online video platforms, we work with Brightcove, Kaltura, Ooyala, Wistia, Limelight, YouTube, Vimeo, KIT Digital. And then lecture capture– we work with Mediasite, Echo360, Panopto, and Tegrity. It’s possible there are even some more that we work with, but those are the ones that we work with quite a bit.

Someone asked about how some of the integrations with the video platforms work to simplify the process. So for example, with Mediasite or Echo, Brightcove, Kaltura, even a number of others, the way it works is there’s a quick setup process where you essentially authenticate one account with the other. So you can actually link your 3Play and, say, Brightcove account, or your 3Play and Mediasite account.

And from there, you literally just have to press a button next to each file saying which ones you want to process. That file then gets ingested into our system. And we transcribe it, synchronize it, and then make the captions available. And in certain cases, those captions will automatically post back to the right place. And in other cases, you have, then, the option of either getting the captions out yourself or publishing an interactive transcript. But really, the entire process is dramatically simplified so you never have to reupload any video files.

Some more questions about pricing. We do offer volume discounts, just to clarify that. I mentioned the starting point of $150 per hour. Like I said, that’s a starting point. We absolutely do offer volume discounts from there. We don’t distinguish pricing based on the type of content at all. We actually accept any type of content. The only time we would distinguish is if the audio is such that it’s just really, really difficult to transcribe, more because of the recording than the actual content.

There are some questions about iOS and how that works with compatibility. The captions plugin and interactive transcript that we’ve built are JavaScript-based, which means that they are iOS-friendly. They also are more dependent on the actual video player that’s being embedded with them. So they have to be able to link to the video player. And that means the video player has to be embedded on the page, as opposed to an iFrame.

And many video players right now, to be HTML5-friendly and iOS-friendly, are using an iFrame method. So we are looking at ways to work around that so that we can still offer the same interactive tools.

In terms of just captions, if you’re using the caption support of an existing video platform, it really depends more on them than it will on us. So the captions will display without a problem as long as they’ve built in that support for HTML5.

One question, just to clarify– so for live events, while we don’t offer services for live events, if you do capture that content into recorded files and then post them, that’s absolutely something that we could work with. The key for us is having an audio or video file that can be uploaded or ingested into our system to work from. That’s the key input that we need.

And we accept a number of different video formats. Pretty much anything, as long as it’s not a proprietary format, we can accept. So any of the .mov, .mp4, .flv, .wmv, any of those formats that you’re familiar with, will be no problem at all.

And with that, we are actually out of time. So thank you very much for joining us today. Thank you for your questions. Please don’t hesitate to reach out if you do have further questions. We’re happy to go into much more detail if that would be helpful. So thanks very much.
