Demo of Captioning Integrations and Workflows
JOSH MILLER: Great. So as I said, feel free to ask any questions. Just type them into that chat window, and I’ll keep an eye on that. So for everyone in the room, my name is Josh Miller. I’m one of the cofounders of 3Play Media. I’m going to walk you through some of the integration points that our system has with other lecture capture systems and video platforms.
Whenever you have the chance, if there are specific systems you’re using– I know there’s a group on campus that uses Tegrity– just type them into that window, and I’ll make sure I go deeper into those. I’ll quickly go through just kind of the basic workflow so that you have a sense for what’s going on. And then, certainly, fire any questions at me, and I’ll do my best to answer them.
So basically what you see here is Echo360 and Camtasia. Perfect. So what you see here is the screen when you log in. What you see right away is the ability to upload. You can look at your files. You can use this tool called Clipmaker.
So real quick, what you’ll see here if we hit My Files is these are all the files that have been uploaded into the system. And you can scroll down. Let’s say there are hundreds of files in here. There’d be an option to switch pages. We can search here by title or keyword or video ID, if we want to locate a particular video.
So the reason why I show you this is no matter how you upload, whether it be a manual upload or through one of the integration points, the files will end up here. And that’s important because it gives you a few other options as well.
So let’s say you’re using Echo360. You’re using the integration. We send the captions to Echo360. If for any reason you needed a caption file for a different purpose, you could come in here and get it. Which means we can download pretty much whatever format you need.
So all these different caption formats, transcript formats, these are always at your disposal that you can download any time once the file’s been processed. Even if it goes to another system, you’d still have access to this interface with the ability to download other formats.
So now let me go through. I’ll go back. Let’s go through the upload process. So basically, there are a number of different options to actually upload the content. You could manually upload right from your computer. You can actually send links. So if you have content on a server and you have links to each file, you could actually paste those right in to a field here. I’ll go back.
What we’re actually going to focus on, though, is the linked account. This is what we call it. This is the idea that you’re using another platform and you’re hooking 3Play up to that platform. So by default, or when we start, we’d actually add a new linked account. And we can choose from any of these platforms. We’re actually adding a couple right now. We’re about to add Panopto, and we’re about to– the audio’s getting choppy. Slow down for a second. Great.
So when we first start out, we’re going to have this option to add a new linked account. What you see here are multiple linked accounts already. So you can actually have multiple linked accounts. That’s fine. You could be hooked up to YouTube. You could be hooked up to Echo. That’s no problem at all. You can have multiple linked accounts per 3Play account.
So now, let’s say we go to Echo. What this does is it actually walks you through some instructions on what you’re supposed to do to enable the integration. And all of these linked accounts– let me go back for a second.
All of these linked accounts, the way they work is there’s usually about a five minute set up process, maybe 10. See, you can use Mediasite as well. That’s great. So the Echo and Mediasite integrations operate very similarly. And I’ll explain.
Basically, there’s usually about a 5 to 10 minute set up process. That set up process, the idea is that you’re actually creating that linkage between Mediasite and 3Play Media, or Echo and 3Play Media, so that our system can recognize a request. And that means we can then pull content in and post captions back as soon as they’re done.
So that set up is just basically enabling all that. And that only has to be done one time. Once that set up is in place, there’s basically a button or a link to request captions for any presentation by default. You wouldn’t have to go through any set up process ever again.
So if we go back to Echo, what you’ll see here is there are a few things to do. You’re going to actually take– these are credentials that are unique to your 3Play Media account. And each account has it. It’s very different. And then you’ll follow these instructions.
All of the instructions– and here, on our website, you’ll see the integrations that are available. They all lead to the support site. So what you’ll see here is this ability to– scroll down a bit. So these are all some of the different integration points.
And we can even go here to Echo. And there’s a step-by-step guide for exactly what you’ll want to do. And this is the process we’re going through right now, which is linking your Echo account. So now, there are actually screenshots. It’s a little bit easier to go through here. We’ll pick back up at these credentials.
And the one thing with Echo that you’ll need to do is you’ll need to download this JAR file. And if we look here, it’s actually right here off your accounts. You just click on this link. You’ll download the JAR file. And then you would add that to your Echo server and then also enter these credentials in.
Once that’s done, basically, you’re all set. That means if we look here, here’s what it looks like. You’re adding a publisher– a closed captioning publisher. You’d be able to enter these credentials in because this is all built into the Echo interface. And then once you hit Save, you’re all set.
So now, on every presentation, you’ll see an option to submit captions. And then, here we go. We have documentation on exactly what that looks like. So now, here if you’re in your Echo system, and you want to caption a particular presentation, you basically would just select which one you want. Go to Actions. And add the publisher, which in this case is 3Play Media.
So the nomenclature is a little bit unintuitive, in the sense that it’s add publisher, but what you are basically doing is adding a– think of it as adding a service type. And because you’ve enabled us as a service type, you’d be able to see that option. So enable 3Play Media service onto the Echo system, and that would basically start the process for captioning.
So the process for Mediasite is very similar. I’ll walk you through that, and then I’ll talk about what’s actually happening. So you’ll see here there are instructions. And what it’s going to do is here are the credentials you need to take from 3Play. And you’ll basically go into Mediasite, follow these instructions of where to go, and you’ll be able to paste those credentials right into your Mediasite account.
And then again, same idea. You can, in a folder of content, you can basically pick which files you want to have processed.
So what happens in either case– from either Echo or Mediasite– when you select a video to be captioned, you’re basically telling the Echo system or the Mediasite system to push that video file to us. So that video file comes into our system, and because you’ve already set up the credentials and you’ve set up that integration, it knows exactly where to go. It goes into your account.
And so as soon as that request is made, within a few minutes, you’ll actually see– let me go back to what it will look like. Within a few minutes, you’ll actually see the file on the My Files page. It’ll show up here, and it’ll show that it’s processing. So that will go in to process. We’ll take that file. We will create the transcript and the closed captions.
And then as soon as the file is done, we have the proper metadata from that file to post it– to basically send it back to Echo or Mediasite, to the right place. So as soon as that file’s back in the system, it’ll show up as closed captions for the viewers.
So I believe in both cases, both Echo and Mediasite, they both operate using closed captions. So the file that we send technically remains as a separate caption file that’s associated with those presentations so the viewers can turn them on and off as they choose. And that’s what makes them closed captions. They don’t actually get burned into the video itself, which would make it more of an open caption set up. So that’s basically how those work.
Camtasia’s a little bit different. Camtasia, we don’t have a direct integration quite like that where you can automatically send files back and forth. We definitely have– let me go back here– we definitely have a workflow built out so that you can absolutely caption Camtasia content. So let’s go here and we’ll walk through that. So where do we have it? Right here.
So you would have to export the video from Camtasia and then upload it. So if we go back to this upload process, you basically just upload it from your computer. I believe that’s how you’ll do it. Yes. So you basically download the video. And then you can upload it into the system, upload into 3Play, and then you would download the SRT or SMI caption file, which is one of the files we offer any time. And you can just import that into Camtasia.
So it’s all certainly compatible. So I’ll walk you– just to talk through the upload process and the different options. So let’s– I’m just going to type in Test. And what you’ll see is you can choose what folder you want to add it to.
Now, this is a test account, so the costs are all zeroed out. But basically, when you upload, you’ll always have the option of different turnaround times. I believe with Mediasite, you have the choice between standard and rush. With Echo, I believe there’s only a standard option, and that’s just because of the functionality that they’ve built into everything. But if you know what you want for your content– say we know that this course needs to be back in two days every time– we can set your account to do that. That’s not a problem.
So basically, standard is four business days. Expedite is two business days. Rush is one business day. And same day is eight hours. So basically, you have all these options. The only caveat is with same day: the eight-hour guarantee only applies to files of up to 10 minutes.
So if it’s longer than 10 minutes, you can choose same day. That’s fine. We’re going to do our best to get it back. We just can’t guarantee the eight hours anymore. But more often than not, it’s still back, honestly, within 8 or 10 hours. So it’s still pretty fast.
There’s no concern with an hour lecture for rush, certainly. So you can definitely have it back in one business day. And the four business day turnaround is pretty flexible as well in the sense that that’s a guarantee. It’s very possible that you’ll get it back in two or three days. We just wouldn’t guarantee it without choosing one of the other service levels.
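As a rough illustration, the service levels just described can be written down as a small lookup. This is a minimal Python sketch: the turnaround numbers are the ones stated in the talk, while the identifiers (`Turnaround`, `TURNAROUNDS`, `is_guaranteed`) are hypothetical names chosen for this example and are not part of any 3Play Media interface. It also assumes that same day is the only level with a file-length limit on its guarantee, since that is the only limit mentioned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Turnaround:
    name: str
    business_days: Optional[int]  # None for the hours-based same-day option
    hours: Optional[int]          # None for the business-day options


# Numbers as stated in the talk; the identifiers are illustrative only.
TURNAROUNDS = {
    "standard": Turnaround("standard", 4, None),
    "expedite": Turnaround("expedite", 2, None),
    "rush": Turnaround("rush", 1, None),
    "same_day": Turnaround("same_day", None, 8),
}

SAME_DAY_GUARANTEE_LIMIT_MINUTES = 10


def is_guaranteed(level: str, duration_minutes: float) -> bool:
    """Return True if the stated deadline is guaranteed for this file length.

    Same day is only guaranteed for files of up to 10 minutes. Longer files
    can still be submitted as same day, but the eight hours are best effort.
    """
    if level not in TURNAROUNDS:
        raise ValueError(f"unknown turnaround: {level}")
    if level == "same_day":
        return duration_minutes <= SAME_DAY_GUARANTEE_LIMIT_MINUTES
    return True


print(is_guaranteed("same_day", 8))   # short clip, within the guarantee limit
print(is_guaranteed("same_day", 60))  # hour lecture, best effort only
print(is_guaranteed("rush", 60))      # hour lecture on rush
```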
I can also tell you a little bit about what we’re actually doing when we transcribe and caption the content. We’re basically taking your file. We put it through speech recognition first. And then we basically edit that. So we take it from the speech recognition engine. And we really built an editing platform. That’s really what we focus on is this idea of taking a draft from the speech recognition and inserting a human into the process to clean that up really efficiently.
So we’ve built a unique interface specifically designed for that process. And the human editor is going through, making any corrections necessary, putting in punctuation, identifying speakers, putting in any non-spoken cues that are relevant to make sure that they’re true closed captions. Things like that, to make sure that you end up getting a very, very accurate time coded document.
It then gets another human QA check based on a number of different flagged words that are meant to be researched more carefully. So what we end up getting is a time coded document. It’s literally a text file that has a time code for every single word. And what we get out of that– I can show you what it looks like.
What we get from there is this core document that we store. And then we have all these templates for the different caption files and transcript files that we turn that initial document into. And then from there– so this is what it actually looks like. You’ll see millisecond time codes and words throughout the entire file.
And what that allows us to do is be really flexible and really thoughtful about how we create the caption frames and even what we do with these files. What it means if we have really accurate time codes for every word, it’s very easy to scale that back and basically have zero time codes or time codes at every certain interval. As opposed to inserting time codes the other way, this way makes sure it’s extremely accurate.
What it also means is if we need to change a template– let’s say the captioning standards change, and now, instead of 32 characters per line, everyone wants 36 characters per line. That’s very easy for us to change. And any changes we make would retroactively apply to all your other files as well. So you wouldn’t have to reprocess anything or even worry about having different-looking caption files. We can apply any kind of stylistic change or template change.
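To make the point about template changes concrete, here is a minimal Python sketch of how caption lines could be rebuilt from word-level time codes when the line-length limit changes. The data layout, the greedy packing rule, and the name `build_lines` are assumptions made for this illustration, not 3Play Media’s internal format or algorithm, and the word timings are invented.

```python
from typing import List, Tuple

Word = Tuple[int, str]   # (start time in milliseconds, word)
Frame = Tuple[int, str]  # (start time of the first word, caption line)


def build_lines(words: List[Word], max_chars: int) -> List[Frame]:
    """Greedily pack timed words into caption lines of at most max_chars."""
    frames: List[Frame] = []
    current: List[str] = []
    start_ms = 0
    for time_ms, word in words:
        candidate = " ".join(current + [word])
        # Close the current line if adding this word would overflow it.
        if current and len(candidate) > max_chars:
            frames.append((start_ms, " ".join(current)))
            current = []
        if not current:
            start_ms = time_ms  # a new line starts at this word's time code
        current.append(word)
    if current:
        frames.append((start_ms, " ".join(current)))
    return frames


# Hypothetical word timings, invented for this example.
words = [
    (0, "So"), (180, "what"), (350, "we"), (470, "end"), (640, "up"),
    (780, "getting"), (1150, "is"), (1290, "a"), (1370, "time"),
    (1620, "coded"), (1980, "document."),
]

# The same stored words yield different frames under each limit.
for limit in (32, 36):
    print(limit, build_lines(words, limit))
```

Because every word keeps its own time code, switching the limit from 32 to 36 only means rerunning the packing step. Nothing has to be transcribed or synchronized again.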
So WebVTT is the emerging standard for closed captions on the web. We offer it now, but the spec still isn’t fully developed. So as that changes, we can easily update the spec and all your files will be ready in that format. So it’s little things like that that can make a big difference. And we’re storing that core time coded document that allows you to make any change.
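The idea of one stored document feeding several output templates can be sketched as follows. This Python example renders the same list of cues as WebVTT and as SRT. The cue data and the function names are made up for illustration, and only the basic shape of each format is covered: WebVTT opens with a `WEBVTT` line and uses a period before the milliseconds, while SRT numbers each cue and uses a comma.

```python
from typing import List, Tuple

Cue = Tuple[int, int, str]  # (start_ms, end_ms, caption text)


def timestamp(ms: int, millis_sep: str) -> str:
    """Format milliseconds as HH:MM:SS<sep>mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{millis_sep}{millis:03d}"


def to_webvtt(cues: List[Cue]) -> str:
    """Render cues as WebVTT: header line, then one block per cue."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines += [f"{timestamp(start, '.')} --> {timestamp(end, '.')}", text, ""]
    return "\n".join(lines)


def to_srt(cues: List[Cue]) -> str:
    """Render cues as SRT: numbered blocks, comma before the milliseconds."""
    lines: List[str] = []
    for number, (start, end, text) in enumerate(cues, start=1):
        lines += [
            str(number),
            f"{timestamp(start, ',')} --> {timestamp(end, ',')}",
            text,
            "",
        ]
    return "\n".join(lines)


# Hypothetical cues, not taken from a real caption file.
cues = [
    (0, 2500, "Welcome to the lecture."),
    (2500, 6120, "Today we cover captioning."),
]
print(to_webvtt(cues))
print(to_srt(cues))
```

If a format’s rules change, only the matching render function needs to change, and every stored file can be re-exported from the same cues.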
Another thing you can do in the system is actually edit the transcript. So if you ever need to make a change, you can actually go through all this text that’s synchronized with the video here. So if you click on a word or you hit the Edit button, un-select it, it’ll actually play the video from this point.
So I can review something more specifically. I can click this Edit button. Go in. Make any change I want. I then would want to hit Save Changes and then Finalize, so that it reprocesses anything that needs to be reprocessed in terms of caption frames. But that’s all you have to do.
And then this file would be ready to republish or redownload pretty much immediately. And all the time coding and everything would be taken into account for you. So this is just a tool that you can use.
Basically, in terms of the integrations, the nice part about using Echo or Mediasite– let me go back to this– is that the whole work load is really taken care of for you. It’s really just that set up process, and you can basically have things going pretty much automatically.
One kind of cool linked account option is this YouTube option. You basically can– I’ll show you what this looks like real quick. So this is our channel. We hit Save. So now we can see all the content in our YouTube channel and we can pull that in. And if I select any files and I click Upload, you’ll see that same upload process to walk through.
The other thing that’s really neat about this, though, is I can enable post back. I’m actually using someone else’s computer right now. So you’ll see that I can now choose which Google account to log in with. And what that basically allows me to do is I can basically give permission to 3Play Media to post captions back to my YouTube account.
So that means now we can have that same automated workflow for YouTube as we would with Echo. And we can pull that content in. We’ll transcribe it. Create the captions. It’ll show up on My Files. But as soon as those files are done, we’ll post the captions to YouTube for you and your YouTube files are all captioned now as well. And that’s something that’s pretty unique, as far as we know. We don’t know of too many providers that can do a round-trip integration for YouTube.
Is there anything else that you’d like to make sure I cover? Any questions about what you’ve seen here? So, great question. A question about text-only transcripts. So basically, when we process a file, we’re making all these different output formats available every single time.
And when you make an edit to a file, it’s going to update all these different files. So you’ll have options for different closed caption files, but all these are transcript formats down here. So plain text will be just a TXT file. Plain doc is a Word document. Stamped doc is a Word document with time codes at paragraph breaks or speaker changes. PDF is just a PDF. So these will all be available as well. So you wouldn’t have to really do anything other than just download the files that you want.
And you can even download multiple files at once or a whole folder. So if I click here, I can download the whole folder. And you can move files around to different folders very easily. I can select multiple files. See request download now has a four here. It recognizes that it has four files. And now I can choose which options. It’ll actually create a zip file for me to download with all the files in it.
And you’ll notice that those check boxes– if I go back. I went through a little quick. But I can select multiple formats all at once if I want and download all of those. You’ll get these three formats. If I hit continue, you’ll get these three formats for all four files that have been selected. And you can do this any time. Just because you download once doesn’t mean you can’t download again. You’d be able to download whatever you need.
And we have an API as well. So if you wanted to even build out a more customized workflow. So let’s say you know that files are being posted to Echo. That’s great. You can build out a kind of custom set up so that every day, at a certain time, you ping our API to download any new files in whatever format you want. That’s definitely possible to do as well.
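The daily download idea could look roughly like the Python sketch below. It deliberately does not call the real 3Play Media API: `list_files` and `download` are hypothetical placeholders that you would implement against the actual API documentation, and the `"id"` and `"state"` fields are likewise assumptions. The sketch only shows the bookkeeping, which is to fetch each finished file once, in every format you want.

```python
from typing import Callable, Dict, Iterable, List, Set


def sync_new_files(
    list_files: Callable[[], Iterable[Dict[str, str]]],
    download: Callable[[str, str], bytes],
    seen_ids: Set[str],
    formats: List[str],
) -> Dict[str, Dict[str, bytes]]:
    """Download every completed file not seen before, in each requested format.

    list_files and download stand in for real API calls. They are
    placeholders for this sketch, NOT actual 3Play Media functions.
    """
    fetched: Dict[str, Dict[str, bytes]] = {}
    for record in list_files():
        file_id = record["id"]
        # Skip files that are still processing or were already downloaded.
        if record.get("state") != "complete" or file_id in seen_ids:
            continue
        fetched[file_id] = {fmt: download(file_id, fmt) for fmt in formats}
        seen_ids.add(file_id)
    return fetched


# Stand-in implementations so the sketch runs without a network.
def fake_list_files() -> List[Dict[str, str]]:
    return [
        {"id": "101", "state": "complete"},
        {"id": "102", "state": "processing"},
    ]


def fake_download(file_id: str, fmt: str) -> bytes:
    return f"{file_id}.{fmt}".encode()


seen: Set[str] = set()
first = sync_new_files(fake_list_files, fake_download, seen, ["srt", "txt"])
second = sync_new_files(fake_list_files, fake_download, seen, ["srt", "txt"])
print(sorted(first), sorted(second))
```

Run from a daily scheduled job, with `seen_ids` saved between runs, a function like this would pick up only the files finished since the last run.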
And I should note, the transcription process that we use is definitely different. The synchronization and transcription is happening at the same time. And so we end up producing both plain transcripts and captions at the same time no matter what. The accuracy rate– we are using speech technology as the first pass, so a lot of people want to know what does that mean?
So the accuracy rate for the speech recognition alone is usually 60% to 70%. You will never see that. We would never expose that level of accuracy. We would never consider that to be good enough. We really think about how we can improve the human component of the process of getting a really clean transcript and caption file. That’s the goal, is how can we make that human component better.
So what you’re actually going to see is basically an extremely accurate caption file. The measurement we’ve seen is that our transcripts tend to be even more accurate than purely manual transcription. We have a measured accuracy rate of usually about 99.6%, and we guarantee, certainly, over 99% accuracy, for what that’s worth.
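To put those percentages in perspective, one common way to score a transcript is word accuracy: one minus the number of word-level edits (substitutions, insertions, deletions) divided by the number of words in a trusted reference. The talk does not say how 3Play Media measures its figures, so the Python sketch below is only an assumed definition, shown with invented sentences.

```python
from typing import List


def word_edit_distance(reference: List[str], hypothesis: List[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # word missing from the hypothesis
                current[j - 1] + 1,      # extra word in the hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - (word errors / reference word count)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference transcript is empty")
    errors = word_edit_distance(ref_words, hypothesis.split())
    return 1.0 - errors / len(ref_words)


# Invented example sentences: one wrong word out of four.
print(word_accuracy("the quick brown fox", "the quick brown box"))
```

Under this definition, 99.6% accuracy corresponds to about 4 word errors per 1,000 words, while 65% corresponds to about 350.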
Great, well, thanks for taking time today. We’ve got this recording, so we’ll post it up with a captioned version for people to view. Definitely feel free to reach out to me directly if you have any questions. My email address is just josh– J-O-S-H– @3playmedia.com. So feel free to get in touch if you have any questions.