Captioning the 3Play Way [TRANSCRIPT]

JOSH MILLER: All right. So we’re going to dive in and talk about captioning the 3Play way. So my name’s Josh Miller. I’m one of the founders here at 3Play Media. My contact info is here. Please feel free to reach out.

So we’re going to talk a little bit about who we are, why caption content, and really how we think about captioning content and what are the tools that we’ve made available and what we think is important. I’m going to try to go through the slides quickly and spend most of the time really doing a live walkthrough, and then certainly have some time for questions.

So who is 3Play Media? We’re a captioning, transcription, subtitling, and now audio description company really focused on video accessibility. We have a little over 2,000 customers across a number of different industries– so education, e-learning, corporate, and certainly media and entertainment. We were spun out of MIT, and so we’re here in Boston, and have been around for almost 10 years.

So why caption content? There are a number of benefits to captioning content. Certainly the most obvious and most traditional one is this accessibility need. So there are 48 million Americans who are deaf or are hard of hearing. That’s a very large percentage of the population. And so there are legal requirements to go along with that in many cases.

There is more and more data coming out about how comprehension is very much affected by having text, really benefited by having text, along with the video. So this could be because you’re in a sound-sensitive environment, so that’s the flexibility piece. But maybe English is not your native language, and therefore, captioning actually helps quite a bit.

One of the nice things about having transcripts and captions is it also aids with video search and SEO. So if you’re putting marketing-oriented content out and you want to make sure you’re optimizing it, it definitely makes sense to think about search engine optimization and how you can use text that you’re creating from captions for search engine purposes.

And certainly if you want to reach a larger audience, and translate, and create subtitles, the first step is creating captions. And then what you’ll do from there is take those captions and translate them into other languages.

Again, going back to that marketing use case, once you have a piece of video, there’s a lot you can do with it. And whether it be actually marketing or even in the classroom, having the text of what was said is really an enormous amount of content, and you can do a lot with that.

So we’ve been doing more and more research on the effect of captions. You know, what does it mean? What do people think about it? And so this data comes out of a study we did with Oregon State. It’s on our website. It’s free to download. And I definitely encourage you to take a look. The link is there.

What it found was that nearly all students found captions helpful, and certainly not all of the students were deaf or hard of hearing. This was a normal student population. 75% of students were using captions as a learning aid with their video. And the number one reason wasn’t about an impairment issue or any kind of disability. It was about focus. It was helping them focus on content.

So at 3Play, we’re all about making this easy. That is really what it all comes down to. How do we make it easy? And how do we really do a good job at it as well? So we have an online account system. We give you turn-around options, lots of ways to automate the workflow so that you’re not figuring out ways to transfer files, and then a number of tools to address those search questions as well.

So in terms of captioning, we are using speech technology. Not all companies do this. We’re kind of unique in that regard. We’re using speech recognition as a starting point. We don’t believe it can really stand alone for captions, but we do believe it can add some value.

And so what we’ve designed is a process to take the draft that we get from the speech-recognition engine and clean it up really efficiently. So that cleanup process, that full scrub that happens in step two, is doing things like correcting mistakes, putting in speaker changes, and also putting in the non-spoken elements that are required for captions. So there’s a lot that happens in that cleanup process.

And then in step three, there’s another human step, which is a QA, or a quality review step, that is directed kind of by what happens in step two.

The people who do the work, the transcriptionists, are all over the US. They are all US-based, and that is something that we actually take very seriously. They’ve all gone through a very rigorous certification process, and they continue to get evaluated.

What’s really interesting is their stories. These are people who really come from all walks of life. They have all kinds of different skill sets, and that lends itself really well to the work that’s happening here, because we have all different types of content. And so people really can ultimately get matched up with content that they’re either really interested in or know really well, which helps with quality as well.

Part of making it as easy as possible is tying into existing media platforms. So this list might even be outdated already. We continue to add to this. This is really important to us. We started doing this type of work years ago, and this, to us, is really central to making it as easy as possible.

And so these are all out-of-the-box integrations that can be set up in a matter of minutes so that you can then pass video files from your video platform to us and then captions back to your video platform with minimal work.

And so I mentioned there’s the account system. I’m going to go into this in more detail, so I’m going to go through these slides pretty quickly. There are lots of different upload options, lots of output format options as well.

So we offer, I think, over 50 different output formats, when you think about all the different caption formats and transcript formats. You can even import existing captions so that our account system could operate as kind of a centralized management console, if you will.

And we can also create captions from existing transcripts. And so we have an alignment option that creates the timing from the existing text that you already have and synchronizes it with the video that you have.

In the world of media and entertainment, it’s really important to be able to support caption placement so you don’t cover text or graphics in the lower third. So we support that as well.

I’m not going to talk about it too much today, but we did launch audio description last year. This is a service for blind and low-vision users. This is where we actually are creating an audio track where there is a voice explaining what’s happening visually on a screen for someone who actually can’t follow along. Happy to talk about that as well. Many of the accessibility requirements are expanding now to include audio description as a requirement, so happy to talk about that if there are some questions about that.

I mentioned that captions and transcripts lend themselves to video search. So we have a number of interesting plugins and tools that actually allow you to search across a video, search across an entire media library. And it’s essentially using the same underlying data, that time data, the text data. And so these are very easy-to-install plugins that latch on to your video and make it searchable.

All right. So I’m going to jump in to another screen here.


Excuse me. And so what you see here is what we call the My Files page of the 3Play Media account. And this is what you’ll see when you initially log in. So I’m going to start by going through getting media into the system.

So for starters, there are a number of different ways to upload content into 3Play for captioning. So you could upload videos right from your desktop, from your computer. You could paste links into the account system, whether that’s a YouTube link or, even better, a direct link to a media asset. So let’s say they’re stored in an Amazon bucket, you could paste those links right in.

Then there’s what we call Linked Accounts, so these are already-linked video platforms here. And if I say I want to create a new one, I can pick from any of these different platforms. And it’s going to prompt me to provide certain credentials. So if I click on YouTube, for example, it’s going to actually prompt me to authenticate with my Google or YouTube account. Or if I click on Brightcove, I’m going to be prompted to put in certain credentials that I get from my Brightcove account, and there are instructions here on how to do that.

So let’s say I link up my YouTube account, just to give you a quick view. I would then see all the content in my YouTube channel and be able to quickly choose which ones I want to have captions.

And so what will happen, just so you can see, if you’ve already captioned the file, I’ll get a prompt to say, are you sure you want to caption this? This has already been captioned. So we track everything very carefully to make sure you’re not reprocessing files.

And in many of the cases for these linked accounts, these platform integrations, you can actually make the caption request from the media platform. So in cases like Brightcove, or even Vimeo, and Wistia, and Kaltura, you can actually make the request right from that platform.

In some cases, there is a button. In some cases, there’s a tag concept, but it is very, very easy. And that way, you might not ever even have to log into 3Play Media if you don’t need to. You could actually do everything from the platform interface.

From there, there are options from cloud storage. So you could actually pull content in from a Box account, Dropbox, or Google Drive. And we can also post captions back to your account as well.

And then certainly FTP. So every account is provisioned with unique FTP credentials, so you can use that as well. What’s not shown here, and I’ll quickly show it, is an API. So we have an API. You can create a unique API collection for any reason at any time.

So you can create your own automated workflow with different options, and we have a number of different parameters that can be used for things like turnaround, or metadata. And you can really choose what you want to make available or what you want to take advantage of.

So now that we’ve gotten our content into the system, let’s talk about what it means to have files finished. Actually, let me back up real quick and show you what it means to choose the different turnarounds. So I can put in a dummy link here for a second and show you.

There are a number of different options when I go through the upload process, and all of these options are also available over the API or through a number of the different integrations. So I’m going to stick with our English transcription service. I can pick what turnaround I want, and it’s going to tell me when to expect it back by.

And I can choose whether to put it into a folder or I can create a new folder. And finally, I get a summary. So then I submit my order, and I’m off to the races.

So now if we go back, we see all the files here. Now, no matter how we upload, whether it be from our desktop, or from a video platform, or from a cloud-storage account, the files will show up here. So you’re going to have this centralized repository of everything that gets processed.

And you can see on the left here there are different filters for different types of services that you can filter down on to organize your content, and then there are the folders as well. So you can scroll through and find what you’re looking for. I can create a folder here. I can also move content to a different folder very easily.

Another option here is a tag concept. So if you use Gmail, it’s the same idea. You can tag files with different tags. So it’s really meant to allow you to organize your effort however you want.

So if you have multiple people working within the same account, you can have different ways for organizing stuff. If you’re a school and you have different classes, maybe a folder per course makes sense. Or if you’re a media company, maybe a folder per show. And so there are a lot of different ways you can view things.

So let’s take a look at a file. So what you’ll see here is a text transcript of the file. On the right is all the information about the services that were ordered, metadata on this file. And so if SMPTE settings matter, they’re all here as well. So there’s quite a bit of information about the file.

Now, if I go back here– and let’s see if I can find one that came from YouTube, maybe. Ah, here we go. So what you’ll see here is if a file came in from a video platform, you’ll see which platform it came in from, and you’ll see that it’s linked, and you’ll see the video ID.

Now, let’s say you have an existing platform integration active. You could actually populate this information– so all this metadata is editable. You could populate that video ID manually if you wanted to, and then actually click a button to link it to one of your active platforms, and then actually repost it. So there are ways to link files that maybe came in a different way.

So we’ve got this file here. Here’s our transcript. There are a number of different options. I can preview the captions. I can download captions. I can get a link to the captions. I can also edit them.

So what you’ll see here is the text. It’s all synchronized with the video. So if I click on this part, this text here, it actually jumps to that part of the video. And I’ll pause it. And the captions are shown as a preview as well.

I can go into an Edit mode, and I can go in and make any change I want. And when I make these changes, if I hit Save, it will update it. And then if I hit Finalize, it will actually start reprocessing everything.

And why that’s important is that captions have a number of different rules with regards to number of characters per line or time on screen, and we’re actually going to recalculate the optimal caption frames for you based on changes you might have made. So even one little character in a word might make a difference, and we want to make sure that you’re getting the right captions. So it will take a few minutes at most, but it will be pretty quick and then you’ll be able to re-download anything you want.
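The re-framing idea described above can be sketched in a few lines. This is only an illustration, not 3Play Media’s actual algorithm: the 32-character line limit, the two-line frame limit, and the function name `build_caption_frames` are all assumptions made for the example, and real caption framing also has to account for timing.

```python
import textwrap

# Assumed limits for illustration only; actual caption rules vary by format and style guide.
MAX_CHARS_PER_LINE = 32
MAX_LINES_PER_FRAME = 2


def build_caption_frames(text, max_chars=MAX_CHARS_PER_LINE, max_lines=MAX_LINES_PER_FRAME):
    """Wrap text into lines of at most max_chars, then group lines into frames."""
    lines = textwrap.wrap(text, width=max_chars)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


if __name__ == "__main__":
    edited = "So even one little character in a word might make a difference"
    for number, frame in enumerate(build_caption_frames(edited), start=1):
        print(number, frame)
```

Because the line breaks are computed from the full text, editing a single word can shift every break that follows it, which is why the frames are rebuilt after an edit rather than patched in place.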

So speaking of download, once I’m ready to download the captions, I can click this Download button. And you’ll see all of these different caption formats that I can download. If I click one of them, it will just download it on the spot. So a number of different formats, transcript formats, a keyword cloud if I wanted. So all this is available for every file that’s processed. You can download any of these as many times as you want, whenever you want.

What I’ll show you real quick is this idea of Favorite Formats. You can pick a number of file formats that you know you’re going to be using quite often. And why that’s important is that if you’re opening one of these files, there’s this quick Download button. So rather than having to go to another page to pick what file formats you want, you can actually pick one right now and it will download, just like it did before. So you can kind of save yourself a couple of steps. So that’s what that’s all about.

Another thing you can do once you have your file here is order other services. So you could order a translation from here, you could order encoding from here. And so encoding is the idea where I want a standalone digital file with the captions actually encoded into that one file. Because as you saw in the Download options before, I’m downloading just caption files, and that’s usually the way it works for most web video.

So I’m going to go real quick to some of the settings here. So I mentioned the Favorite Formats already. One of the other settings that comes in handy quite a bit is what we call Cheat Sheets or Glossaries.

So if you know that you have specialized vocabulary or you have sports content with roster [INAUDIBLE], lots of specific names, you can actually upload content for us to use as resources, whether it be a list or whatever it may be. And it could be a PDF, it could be just some text. And you can upload that for a specific folder, a whole project, or even file by file when you’re uploading.

So this is information that we use in the transcription process. It’s made available to all the transcriptionists themselves so that they have a resource to make sure they’re getting everything as accurate as possible.

In the Transcription Settings here, this is where you can choose what the speaker identification would be. There’s no additional charge for picking how you want speakers identified. And what this is showing you is kind of what the fallbacks are.

If I choose number, we’ll actually try to get the speaker’s name if it’s appropriate and it’s easy to figure out. Otherwise, this is what the fallback would be. Same thing in the classroom, so we’d be able to have a standardized way of identifying speakers.

This Flag Setting concept basically explains what we’re going to do if we really can’t figure out what’s being said. Maybe it’s a unique word or something like that. We’ll actually flag it, and it would look like this, so you have this flagged word here with some question marks and brackets, which get highlighted for you in that Editing interface. So it’s really easy to quickly find those words. It’s unusual for files to have a lot of flags. And again, you get to choose how you want us to handle that.

I’ll quickly show you Audio Description Settings here, only because it’s a new, pretty cool service. You can actually choose from different voices and the speed. So if this looks fast– 200, 250, 300 words per minute– this is actually, based on feedback we’ve gotten, pretty slow for a blind or low-vision user who’s used to screen readers. So that’s something to keep in mind.
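To put those speeds in perspective, here is a small arithmetic sketch of how long a description takes to speak at each rate. The 50-word description length and the function name `speaking_time_seconds` are assumptions chosen for the example; only the three words-per-minute values come from the walkthrough.

```python
def speaking_time_seconds(word_count, words_per_minute):
    """Time needed to speak word_count words at a steady rate."""
    return word_count * 60 / words_per_minute


if __name__ == "__main__":
    # A hypothetical 50-word description at the three speeds mentioned above.
    for wpm in (200, 250, 300):
        print(wpm, "wpm:", speaking_time_seconds(50, wpm), "seconds")
```

At 200 words per minute the hypothetical description takes 15 seconds, while at 300 it takes 10, so a faster voice fits noticeably more description into the same gap in the audio.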

And then there are a number of samples here. So if you make changes to the settings, you can actually hear a sample, just so you can get a sense for what it might sound like.

I mentioned FTP settings before. They’re shown here as well. So you can grab them anytime you want.

The Translation Profile is something that we encourage people to fill out if they’re going to translate content. This is basically where you can provide a bit of a style guide to the translator. So things like tone or specialized vocabulary, it will be very helpful for the translator to have as they’re doing their work.

For alignment, and I mentioned the Alignment Settings, so the– or, sorry, the Transcript Alignment for captioning. This is an automated process, meaning we’re using speech technology to take your video and your transcript that you provide and automatically synchronize the text to the video. So depending on the audio characteristics of your video, this may work wonderfully, or it may have some challenges. And so if there’s a lot of music or other sounds in the video, you might want to select one of the review options so that you can actually take a look at how it comes out before you actually consider it final. And there’s even an option to cancel if you don’t think it’s working very well.

I’m going to go to Manage Users and then come back to Plugin Templates. So we have this concept of an account and then projects. So this dropdown here shows all the different projects we have in this one account.

And so Super Users, when I add them, will have access to all of the projects in a given account. And every account starts with one project. And if you want to add projects, great. You don’t have to. You can operate with one project forever if you wanted to.

The reason why projects might be useful is that from a billing perspective, it’s a nice way to organize everything, so you could have an invoice be set out at the project level. It also allows you to control user access, or even the content, and organize everything.

So when I select what roles this person has as a Super User, they’ll have access to all of the different projects. And if I add a Project User, they’ll have access just to that project.

So let’s say I had six projects. I could add someone to two projects in that account. They will only have access to the specific projects we’ve added them to. So each project will be kind of like a replica of everything I’ve walked through so far. It will just have different content and different users.
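The access model described above can be summarized in a short sketch. The user names, project names, and the functions `can_access` and `visible_projects` are hypothetical and exist only to illustrate the rule; they are not part of the 3Play Media account system.

```python
# Hypothetical account with six projects, as in the example above.
ALL_PROJECTS = ["Webinars", "Marketing", "Courses", "Support", "Events", "Archive"]

# Hypothetical users: Super Users see every project, Project Users only the ones assigned.
SUPER_USERS = {"alice"}
PROJECT_USERS = {"bob": {"Webinars", "Marketing"}}


def can_access(user, project):
    """Super Users can access any project; Project Users only their assigned ones."""
    if user in SUPER_USERS:
        return True
    return project in PROJECT_USERS.get(user, set())


def visible_projects(user):
    """Projects a user would see in the account's project dropdown."""
    return [project for project in ALL_PROJECTS if can_access(user, project)]


if __name__ == "__main__":
    print("alice:", visible_projects("alice"))
    print("bob:", visible_projects("bob"))
```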

So quickly going to Plugin Templates, this is where we could create that interactive search capability that I mentioned before. And so this is just a sample here. And so if I click Play, you’ll see that as the video plays, the text follows along as it’s being spoken. And I can click on a word to jump to that part of the video.
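The two behaviors described above, text that follows playback and words that act as seek targets, both rest on per-word timing data. A minimal sketch, assuming each word carries a start time in seconds; the sample timings and the names `word_index_at` and `seek_time_for` are invented for the example and say nothing about how the actual plugin is built.

```python
from bisect import bisect_right

# Hypothetical word timings: (start time in seconds, word).
WORDS = [(0.0, "So"), (0.4, "who"), (0.7, "is"), (0.9, "3Play"), (1.5, "Media?")]
START_TIMES = [start for start, _ in WORDS]


def word_index_at(time_s):
    """Index of the word being spoken at time_s (the last word starting at or before it)."""
    return max(bisect_right(START_TIMES, time_s) - 1, 0)


def seek_time_for(index):
    """Playback position to jump to when the word at index is clicked."""
    return WORDS[index][0]


if __name__ == "__main__":
    print(WORDS[word_index_at(0.8)][1])  # word to highlight at 0.8 seconds
    print(seek_time_for(3))              # where to seek when "3Play" is clicked
```

A binary search over the sorted start times keeps the highlight lookup cheap even when it runs on every playback tick.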

I have the search capability. And then I’ve got all these different options to choose from. So I could choose a different style, in terms of the skin itself. I can make it collapsible, and I can have it keyword toggle.

So I’m going to update the preview, and you can see I’ve changed the skin. Also, I would choose which media player I’m using, because that’s part of what matters in the background code.

Here’s that keyword toggle, and so it’s going to actually increase the size of the keyword in this file. And I’ve made it collapsible, so I have this Hide Transcript option.

So when I save this, and I’ll show you with some existing templates, if I open one of these files and I hit Publish, I have this Publish Plugin option.

And these are all saved templates. We have a number of different saved templates here. But I could basically pick my saved template, and it’s going to give me the code to put in here. So I would basically paste the video player code and then this code, and it renders that interactive transcript experience.

So now, looking at some of the other options here, there are a number of different modules also that you have options for. And we keep these off by default so that there isn’t too much in the account system all at once.

And so if you need these services, by all means, turn them on or ask for access. In some cases, you can just turn them on yourself. So things like Caption Import you can actually turn on yourself, and you have this option to import caption files anytime you want. So if you already have caption files, that works well.

So I’m going to quickly show you the Playlist Search tool as well. And so, again, it’s a simple line of code, and this is going to actually create an entire searchable experience across an entire library. So I could search for captions. I’m going to have to try a different example. So I’ll do that.

So basically what I’d be able to do is I’d be able to search across the entire library and it will show me results of which files have the words I’m searching for. It also gives me this search– this interactive transcript like I showed you before– and it’s all publishable with this single line of code.
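Library-wide search of this kind can be pictured as an index from words to the files and timestamps where they are spoken. The sketch below is an assumption-laden illustration, not the Playlist Search implementation: the file IDs, cue texts, and the functions `build_index` and `search` are all made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical library: file ID -> list of (start time in seconds, caption text).
LIBRARY = {
    "webinar-a": [(0.0, "Why caption content"), (12.5, "Captions help with search")],
    "webinar-b": [(3.0, "Audio description is a new service")],
}


def build_index(library):
    """Map each lowercase word to the (file ID, start time) pairs where it occurs."""
    index = defaultdict(list)
    for file_id, cues in library.items():
        for start, text in cues:
            for word in set(re.findall(r"[a-z0-9']+", text.lower())):
                index[word].append((file_id, start))
    return index


def search(index, query):
    """Return sorted (file ID, start time) hits for a single-word query."""
    return sorted(index.get(query.lower(), []))


if __name__ == "__main__":
    index = build_index(LIBRARY)
    print(search(index, "captions"))
    print(search(index, "description"))
```

Each hit carries a timestamp as well as a file, which is what lets a result link straight to the moment in the video where the word is spoken.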

So it works with a number of different media platforms, and it’s really easy to use. This is one of the few tools that has an additional fee, just to be aware of. The interactive transcript alone does not have an additional fee. So if you’re just looking for a simple, interactive search experience, that might work just fine.

Just to quickly go over some reporting options for you, I mentioned the different projects in here. I can select different time windows and then basically have a view of the project itself, and even export CSV files. So let’s load the Webinars project. Let’s actually expand this window a little bit.

And so I see in the last year how many files we processed and what users have access to this project. And I could actually pick which projects I want and then export a CSV file that gives me all this data based on this time window.

Similarly for billing, you’ll see there’s a whole repository of invoices– the Current Invoice, where files land until the invoice gets closed out at the end of the month, which invoices need to be paid, and then which are paid. So you’ll always have access to these invoices to review if you ever need to.

And you’ll see that every invoice is completely itemized, so you see exactly what you’re paying for all the time. And again, if you needed to get a CSV report of the invoice itself, you’d be able to do that.

So the last thing I’ll show you real quick is also the idea of notifications. So by default, there are a number of different notifications you might receive, such as an Upload Fail or a File’s Completed.

So you can choose which type of alerts you get, how often you get them. And this is unique to each user, so each user can choose what’s best for them. For billing users, they’ll get these billing options. For non-billing users, they won’t.

And so what I mean by frequency here is that this once-a-day setting means I’m going to get a daily digest of all the files completed in the last day. I could make it much more frequent if I wanted. Or if I’m uploading a number of rush files, I could actually get a notification immediately when they finish if I have a workflow where I need to know right away.

So that’s the nuts and bolts of the 3Play Media account system. This is what we would walk you through and help you get up and running with. We’re always looking for feedback and other tools that might be useful in the captioning process.

Almost everything you see here is based on user feedback. And so we’re constantly trying to build out a system, and a workflow, and an architecture that will really support the captioning process for our users. And so this is the 3Play way of captioning video content.

All right. So there are a few questions here. First, about using a linked account, how are the captions actually added? Are they added directly to the video or is it a separate caption file?

It is a bit unique based on the platform itself and the media player that you’re using, because each player supports, or might support, a different caption file format. Almost all streaming players or web video players these days are operating with the idea of what’s sometimes referred to as a sidecar file, which basically means we’re not going to post a video back. We’re just going to post a caption file back.

So if your platform supports SRT files or VTT files, that’s all we’re posting back. We’re not changing your video at all. We’re just essentially associating a caption file with your video for you. And then, based on the metadata that we’re also associating with the caption file, when you play that video, it knows to display the captions as well. So it gets associated nicely and it’s actually a much more flexible, more lightweight way to publish video in a streaming sense, especially because you might be delivering to many different types of endpoints, whether it be mobile or a desktop. It actually works much better this way.
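For readers who have not seen a sidecar file, the sketch below builds a small SRT document from timed cues. The cue texts and the function names `srt_timestamp` and `to_srt` are assumptions for the example; the layout (cue number, `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, text, blank line) follows the common SRT convention. This is not how 3Play Media generates its files, just a picture of what the player receives alongside the video.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues):
    """Build SRT text from (start seconds, end seconds, caption text) cues."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical cues; the video file itself is never touched.
    print(to_srt([(0.0, 2.5, "All right."), (2.5, 5.0, "So we are going to dive in.")]))
```

The video is untouched in this model: the player loads the text file separately and shows each cue during its time window.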

Kind of related to that, there’s a question about actually providing burned-in captions if needed, which we can do. So that’s getting into the idea of our encoding options that I mentioned briefly.

So if you do want a standalone video with captions burned in– this could be captions that act like captions, where the user has the option to turn them on and off, so with QuickTime, or the captions are actually burned in as open captions, meaning the captions are always displayed, and that works really well for some social video platforms, like Instagram. We do offer that service as well, so that’s slightly separate from the core captioning. It’s actually a separate step. So by default, that’s not what we would do unless you specifically request it. So it is a separate step in the process.

With automatic transcript alignment, there’s a question about what formats should be sent, and could you still use some of the other services and tools? So we try to keep it as simple as possible here. Basically a video file and then a transcript. And ideally, that transcript is a plain-text transcript.

And we have actually some documentation on our site. I didn’t talk too much about it, but we have lots of documentation on our support site, which I would certainly encourage people to take a look at. And that should walk you through everything.

But there is a specific document on best practices for the transcript itself that we would recommend for uploading for transcript alignment. Once that file is into our system as captions through that service, all the same other additional options are available, so caption encoding, interactive transcripts, translation, all of it is there. And so you can absolutely use any of them.

There are some questions about how the captions are displayed and fonts being used. Usually, we let that be dictated by the video player or the delivery mechanism itself. There are cases where we can override that and dictate the fonts, or the size, and things like that.

There are dependencies there, though. That means the caption file format being used has to be able to support that. Not all do. And the media player being used, or the delivery mechanism, has to be able to support that as well. So if those all line up and you wanted to change the way the captions were displayed, we could absolutely have that conversation.

Well, thank you all for taking the time to meet with us today, and we look forward to hopefully speaking with you soon.