Closed Captioning Workflows for MediaCore [TRANSCRIPT]
AIDAN HORNSBY: Hi guys. I’m Aidan from MediaCore, and welcome to today’s special webinar on closed captioning with our partner, 3Play Media. So presenting today will be my colleague, James Cross, from MediaCore, and Tole Khesin, 3Play’s VP of Marketing.
And just a quick look at the agenda. We’ll be running over the basics of captioning, including the common terminology you might come across and how captions are created, the benefits of captioning, and some pertinent accessibility laws and standards in the United States. And then we’ll briefly be touching on some relevant trends in video usage and looking at how those relate to captioning workflows and strategies at higher education institutions. And finally, we’ll take a very quick look at MediaCore’s new simple captioning integration with 3Play before closing out with a Q&A session.
So I’d like to hand it over to Tole, who’s going to talk to us about closed captioning.
TOLE KHESIN: Great. Thank you. So as Aidan mentioned, I’m going to talk about the basics of captioning– so, what captions are and how they’re made. Then I’ll discuss some of the applicable accessibility laws and talk through the different benefits that come with captioning.
I’d like to begin with some recent global and national data from the World Health Organization, Statistics Canada, and the US Census Bureau. So there are many interesting data and trends. More than 1 billion people in the world today have some sort of disability, which is obviously a very large number. Nearly one in five Americans aged 12 and older experience hearing loss severe enough to interfere with day-to-day communications. This is a really interesting statistic.
But one of the most interesting conclusions from these studies is that the number of disabled people is increasing rapidly, and disproportionately with population growth. And you might ask, why is that happening? And there are a number of reasons, but the main one actually has to do with medical and technological advancements. For example, the survival rate for premature babies has increased significantly, which is obviously great. But the side effect of that is that more babies are being born with disabilities.
We’re also coming out of a decade of war, and there have been a lot of developments there. For example, with modern armor, soldiers today are 10 times more likely to survive an injury than in previous wars. And again, this is a very good thing, but it also means that they’re more likely to sustain an injury, such as something that leads to hearing loss.
And so all of this points to the fact that disability is a critical issue, and that it will become even more prevalent in the years ahead. And captioning, being an important part of all of this, is why we’re talking about it today.
So we’ll take it from the very beginning. We’ll talk about what are captions. Captions are text that has been time-synchronized with the media so that it can be read while watching a video. Captions assume that the viewer can’t hear the audio at all. So the objective is not only to convey the spoken content, but also the sound effects, speaker identification, and other non-speech elements. Basically, the objective is to convey any sound that’s not visually apparent but that’s integral to the content.
Captions originated in the early ’80s as a result of an FCC mandate specifically for broadcast television. Now, with the proliferation of online video pretty much everywhere, captions are being applied across many different types of devices and media, especially as people become more aware of the benefits and as laws become increasingly stringent.
So I’m going to talk a little bit about the terminology– so captions versus transcript. The difference here is that a transcript is not synchronized with the media. On the other hand, captions are time-coded so they can be displayed at the right time while watching the video. For online video, transcripts are sufficient for audio-only content, but captions are required any time there’s a video component. And that doesn’t necessarily mean that there has to be a moving picture. For example, a slide show presentation with an audio track would require captions, as well.
Captioning versus subtitling. The distinction here is that captions should assume that the viewer can’t hear, whereas subtitles are intended for viewers who can hear but can’t understand the language. So for that reason, captions include all the relevant sound effects, and subtitles are really more about translating the content.
And I should also point out that this is sort of the distinction in the US, and really in North America. This is the difference between captions and subtitles. But outside of North America, those two words are really used interchangeably, and they’re basically synonyms.
Closed versus open captions. So the difference is that closed captions allow the end user to turn the captions on or off. In contrast, open captions are burnt into the video and can’t be turned off. With online video, people have really moved away from open captions, for a number of different reasons– workflow, obstruction of video, having to maintain multiple versions of the video content. And certainly, closed captioning has become much more of a standard these days.
Post-production versus real-time captioning really refers to the timing of when the captioning is done. So real-time captioning is done by live stenographers, whereas post-production captioning is done offline and happens after the event has taken place. It usually takes a day or two to complete the captions. And there are advantages and disadvantages to each type of process.
Caption formats. There are a number of different caption formats. In fact, we produce more than 50 different formats of captions. And this really sort of has to do with the type of media player that you’re using. So you can see on the left here, there are the most common types of caption formats for broadcast and web video.
The image at the top right– that’s the one with a blue border– shows what a typical SRT caption format looks like. Here we have three caption frames, and you can see that each caption frame has a start time and an end time. So for example, the first caption frame there says, “Hi I’m Arne Duncan, the Secretary of Education.” And you can see the start time, when that caption frame will appear in the video, and then the end time, when that caption frame will disappear. And then it’ll be replaced by the next caption frame.
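As a rough illustration of the SRT structure described here, the following Python sketch parses caption frames into start time, end time, and text. The timings and the second frame are made up for illustration (the slide's actual values are not in the transcript), and `Cue` and `parse_srt` are hypothetical names, not part of any 3Play or MediaCore API.

```python
import re
from dataclasses import dataclass

# Illustrative sample only: the timings and the second frame are invented.
SAMPLE_SRT = """\
1
00:00:00,000 --> 00:00:03,500
Hi, I'm Arne Duncan,
the Secretary of Education.

2
00:00:03,500 --> 00:00:06,000
(next caption frame)
"""

# An SRT timing line looks like "HH:MM:SS,mmm --> HH:MM:SS,mmm".
TIMING = re.compile(
    r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2,}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    index: int
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def _seconds(hours: str, minutes: str, seconds: str, millis: str) -> float:
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(data: str) -> list[Cue]:
    """Split SRT text into caption frames; malformed blocks are skipped."""
    cues = []
    # Caption frames are separated by blank lines.
    for block in re.split(r"\n\s*\n", data.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        if len(lines) < 3:
            continue
        match = TIMING.match(lines[1].strip())
        if match is None or not lines[0].strip().isdigit():
            continue
        parts = match.groups()
        cues.append(
            Cue(
                index=int(lines[0].strip()),
                start=_seconds(*parts[:4]),
                end=_seconds(*parts[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues


for cue in parse_srt(SAMPLE_SRT):
    print(f"{cue.start:6.2f}s to {cue.end:6.2f}s  {cue.text!r}")
```

A player would show each cue's text once playback reaches `start` and hide it again at `end`.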
In the bottom right corner is an example of an SCC caption format. So this is also a very common format used for broadcast and web media, but as you can see it’s a lot more complicated. It actually uses hexadecimal representation instead of plain text.
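To give a sense of what that hexadecimal representation hides, here is a simplified Python sketch that recovers plain text from SCC-style hex words. It assumes the usual CEA-608 convention that each byte carries a 7-bit character with the high bit used for parity, it skips control codes rather than interpreting them, and it treats characters as ASCII even though the CEA-608 character set differs at a few code points. The sample words were constructed for illustration and `decode_scc_words` is a hypothetical helper, not a full SCC decoder.

```python
def decode_scc_words(words: str) -> str:
    """Recover printable text from space-separated four-digit hex words."""
    chars = []
    for word in words.split():
        # Mask off the high (parity) bit to get the 7-bit value of each byte.
        first = int(word[:2], 16) & 0x7F
        second = int(word[2:], 16) & 0x7F
        if first < 0x20:
            # Padding or the start of a two-byte control code
            # (positioning, caption mode, and so on): skip the whole word.
            continue
        chars.append(chr(first))
        if second >= 0x20:
            chars.append(chr(second))
    return "".join(chars)


# "9420" is a control code; "c8e9" and "ae80" carry the text.
print(decode_scc_words("9420 9420 c8e9 ae80"))
```

A real SCC file also has a header line and a timecode in front of each group of words, which this sketch leaves out.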
So once a caption file is created, it needs to be associated with the corresponding video. And there are basically three different ways to do that, depending on which publishing platform or which device you’re developing the video for.
The most common way is called a sidecar file. So basically what this is, this is a separate file that contains the captions. And what you do is you feed that caption file, along with the video file, to the video player or platform, and the platform will render the captions on top of the video. YouTube is an example. With YouTube, all you have to do is you upload a caption file for each video, and then the YouTube video player takes care of overlaying the captions on the video.
With certain platforms, like iTunes, you need to encode the captions into the video itself. So that means that the captions are inserted into the video file as a track. With encoded captions, you’re not going to have a separate caption file. It’ll just be one file. Usually it’s an MP4 file, and the captions will already be inserted in there.
And then the last way is open captions. This is something that we discussed in a previous slide. So basically with open captions, they’re burned into the pixels of the video itself. And so with open captions, you don’t have to worry about having a separate caption file. You don’t have to worry about even uploading or encoding the captions into the video. They’re in the pixels of the video frames.
But as we mentioned before, there are a number of disadvantages of open captions, mainly that they can’t be turned off and they may obstruct certain content on the video.
I should also point out that if you’re using a video platform like MediaCore, then this step of associating captions becomes trivial, because all of this happens automatically.
All right, so we’ll talk about the benefits of captions and transcripts. Accessibility for the deaf and hard of hearing is probably the most important benefit. In the US alone, there are 48 million people that have some degree of hearing loss, and captions are a very effective way of providing equivalent access.
Another benefit is that captions improve comprehension. The BBC conducted a study showing that 80% of people who turn on closed captions actually don’t have any hearing disability whatsoever. They do it because it helps them to better understand the content. It also allows people to view the content in sound-sensitive environments, like a workplace.
Searchability is another important benefit. We create a number of tools that leverage the caption data to make the video searchable and interactive. In a recent study conducted by MIT OpenCourseWare, 97% of students using interactive transcripts said that being able to search through the videos enhanced their learning experience.
Search engine optimization, or SEO, is another benefit. If your goal is to maximize the number of people who find your videos through search engines, then transcription can really help because it gives search engines a much deeper understanding of what the video’s about. The result is that it ranks higher in search results. There’s a study that was conducted by Discovery Digital Networks that concluded that videos with closed captions had 7.3% more views as compared to videos that did not have closed captions.
Reusability is something else that should be considered. So captions and transcripts can be repurposed into many other forms of content. For example, some of our customers have told us that their professors use transcripts from recorded lectures as a starting point to write a textbook or a journal. And then at a graduate class at the University of Wisconsin, a survey found that 50% of students use transcripts as study guides.
So if you want your video to reach a global audience, then creating translated subtitles is obviously something that’s very important. And having captions in the source language is always the first place to start. So that’s another benefit.
And of course, probably the main reason why people add captions is because they have to, because it’s required by law. And so we’ll talk more about that in the upcoming slides here.
OK, so in the US, there are three federal laws, each having a different area of impact. Sections 508 and 504 are both from the Rehabilitation Act, which was enacted in 1973. And it requires equal access to people with disabilities for all federal programs and programs that receive federal funding. Section 508 is a fairly broad law, and it requires federal communications and information technology be accessible for employees and the public.
Section 504 is slightly different. It’s basically an anti-discrimination law that requires equal access for disabled people with respect to electronic communications. For video content, this means captioning. Captioning is required. For audio content, transcripts are sufficient.
Both of these laws, both Section 508 and 504, apply to all government agencies and certain public colleges and universities that receive federal funding, such as through the Assistive Technology Act.
Another thing is that many states have enacted their own laws on accessibility that mirror federal laws.
So the ADA– that’s the Americans with Disabilities Act– is another very broad law that requires equal access to people with disabilities. It was enacted in 1990 and was broadened in 2008. Title II and Title III pertain to video accessibility and captioning. Title II is for public entities, and Title III is for commercial entities.
In recent years, Title III has had a lot of activity. There was a landmark lawsuit a few years ago where the NAD sued Netflix on the grounds that many of Netflix’s movies and videos did not have captions. And this was a really interesting case, because the way that the ADA works is that in order for it to apply, the entity needs to be construed as a place of public accommodation. And historically, this law has been used for physical structures, such as the requirement for buildings to have wheelchair ramps. And in this case, Netflix argued that it’s not a place of public accommodation. Rather, it’s basically a website that provides streaming video.
But the court ruled that Netflix did, in fact, qualify as a place of public accommodation. And this had a very profound implication for the future, because many other types of organizations, especially those that are large enough to have a global economic impact, can also, and will likely also, be construed as places of public accommodation and will be subject to the ADA.
More recently, or actually very recently, the NAD filed lawsuits against both MIT and Harvard, and they cited the ADA as well as Section 504.
So the next one here is the CVAA, which is the 21st Century Communications and Video Accessibility Act. This is the most recent accessibility law. It was passed in October of 2010, and it requires captioning for all online video that previously aired on television. So this applies to publishers like Netflix and Hulu. It usually does not apply for educational or government or even corporate content, because it’s really only for content that aired on television and in parallel is also being published on a website somewhere.
OK, and then the last slide I wanted to talk about– this is the AODA. This is the Accessibility for Ontarians with Disabilities Act. This is probably the world’s most progressive law on accessibility. It was instated in 2005, but the most stringent aspects of it were phased in last year. The AODA regulates accessibility standards for government and business sectors in Ontario, Canada.
And as of January of last year, all new websites and web content must conform to WCAG 2.0 Level A guidelines, which I’ll talk about briefly in the next slide. The only exceptions really are small private organizations with fewer than 50 employees. So apart from that, everyone else needs to comply with WCAG 2.0 Level A, which does require closed captioning for video content.
So Web Content Accessibility Guidelines– that’s what the acronym WCAG stands for. So these were created by the World Wide Web Consortium, the W3C Web Accessibility Initiative. WCAG consists of a series of guidelines for making web content accessible. It’s important to know that WCAG is not a law. These are basically just guidelines set up by a nongovernmental organization.
There was WCAG 1.0, which was introduced in 1999 and was superseded by WCAG 2.0 in 2008. WCAG 2.0 consists of three levels of fulfillment criteria: Level A, Level AA, and Level AAA. And so looking at this through the lens of captioning, Level A requires captioning for all pre-recorded content, Level AA requires captions for all live content, and then Level AAA also requires sign language for pre-recorded content.
WCAG has been adopted by many different countries throughout the world, and the EU also references the WCAG guidelines. And the other important thing is that Section 508 is in the process of getting a refresh. Originally, when the Section 508 standard was developed, the internet was really just getting going and online video wasn’t something that people were really thinking very much about. There wasn’t enough bandwidth, really, for people to use online video the way it’s currently being used.
And so the standards didn’t really take captioning into account very much, but the Section 508 refresh that’s coming into place right now, that is going to specifically reference WCAG 2.0. So basically, there isn’t 100% overlap, but there’s a lot of overlap. If you conform with WCAG 2.0, then you’re going to conform for the most part with the Section 508 refresh.
So the FCC last year came out with specific quality standards for captions, and there were four parts to the ruling. So previously, there wasn’t really a good rule to be able to distinguish high-quality captions from low-quality captions. And this was sort of the reason why this ruling came out.
And so the four parts were– the first one was caption accuracy. Basically, the FCC says that the captions must match the spoken words to the fullest extent possible. That means greater than 99% accuracy. There was some leniency for live captioning, because that’s something that needs to be done, obviously, very quickly, and so they forgive some errors with that. But in terms of prerecorded content, they said that the accuracy pretty much needs to be flawless.
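One common way to put a number on caption accuracy is word-level accuracy: one minus the word error rate, computed from the edit distance between a trusted reference transcript and the captions. The sketch below is only an illustration of that arithmetic. The ruling as described here does not prescribe this formula, and `word_accuracy` is a hypothetical helper.

```python
def word_accuracy(reference: str, captions: str) -> float:
    """Return 1 - WER, where WER = word edit distance / reference word count."""
    ref = reference.lower().split()
    hyp = captions.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Levenshtein distance over words, keeping one row of the table at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from captions
            insertion = curr[j - 1] + 1  # extra word in captions
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 1.0 - prev[-1] / len(ref)


reference = "captions must match the spoken words to the fullest extent possible"
captions = "captions must match the spoken word to the fullest extent possible"
print(f"{word_accuracy(reference, captions):.1%}")
```

Under this measure, one wrong word in a 200-word passage gives 99.5%, while two wrong words gives exactly 99%, which would not count as greater than 99%.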
The next part was caption synchronization. And so basically, what this means is that the captions need to sync up pretty much flawlessly with the video. It’s not acceptable for caption frames to appear five seconds after the words were actually spoken on the screen.
The next one is program completeness. This means that captions need to exist for the content from the beginning to the end. There had been some complaints that sometimes the captions would drop off toward the end of a program. For example, there might be some content that would happen after the credits rolled, and that content would not be captioned. And so now, with this new ruling, captions need to exist from the very beginning to the very end of a program.
And then the last part of it has to do with on-screen caption placement. So typically, captions are placed at the bottom of the video frame. But sometimes those captions can obstruct critical content that’s in the video frame, especially if there’s other text that’s on the screen. And so the FCC rules require that if there is critical content in the bottom of the screen, then the captions need to be relocated, usually to the top of the video frame so they’re not obstructing that content.
So we have done a number of surveys, and we ask people this question. We ask them, what is your biggest accessibility concern? And this is sort of like an aggregate of those results. And so the options here are cost/budget, ease of use, resource time, and not sure I need to. And the second biggest contributor here is cost/budget, and that’s something that’s very understandable. Captioning is definitely a lot less expensive now than it used to be, but it’s still a pretty expensive thing, especially if you have a lot of video content. So we understand that that’s going to be a concern for people.
But what we found surprising is that the biggest reason that people check off is resource time. And the reason for that is that captioning, while on the surface it seems like it’s something that would be pretty easy to do, once you actually dive into it and once you actually start thinking about how to add captions to video content in order to make it universally accessible across all the different media devices, it can become pretty complicated.
And that is really sort of the basis for our company. This is the reason why we do what we do. Our goal is really to just make that process as simple and streamlined as possible. And so we have developed a process that produces extremely high-quality captioning, but we also provide an online account system to allow you to track all of the jobs that have been requested and to see the progress of all of the caption requests. We provide a number of different turnaround options. We even have our fastest turnaround as two hours. You can actually have captions for your videos back within two hours. Our standard turnaround is actually four business days, and there is a range of turnaround options in between.
We’ve also put in a lot of development efforts in order to streamline and automate workflows. So in a lot of cases– for example with MediaCore, as James is going to show you in a few minutes– basically a lot of these potential headaches that we’ve been talking about just now are just a non-issue. Basically, with MediaCore you just select which files you want to have captioned, you press a button, and everything happens automatically behind the scenes. So that’s something that we’ve put a lot of effort into.
We’ve also made it possible to import your existing captions. Or if you have just transcripts without the timing information, you can import those as well and we’ll synchronize those to the video and create captions out of them.
And we also create a number of different video search tools, as I mentioned in one of the other slides. We create a number of tools, like an interactive transcript, which allows you to search through the video and click on any word to jump to that point in the video, or search across a large archive, which may contain hundreds if not thousands of hours of video content and jump to a very specific point in the transcript.
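The idea behind an interactive transcript can be sketched in a few lines of Python: if every word carries the time at which it is spoken, a search returns the timestamps a player could jump to. The data and the names `Word` and `find_word` are hypothetical, chosen for illustration, and this is not a description of how 3Play's tools are implemented.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds into the video when the word is spoken


def find_word(words: list[Word], query: str) -> list[float]:
    """Return the start time of every occurrence of query, ignoring case and punctuation."""
    needle = query.lower()
    punctuation = ".,!?;:\"'"
    return [w.start for w in words if w.text.lower().strip(punctuation) == needle]


# Made-up timings for a short stretch of a time-coded transcript.
transcript = [
    Word("Captions", 0.0),
    Word("are", 0.6),
    Word("text.", 0.9),
    Word("Captions", 4.2),
    Word("assume", 4.9),
]

for seconds in find_word(transcript, "captions"):
    print(f"jump to {seconds:.1f}s")
```

Searching across a large archive is the same lookup repeated per video, usually backed by an index rather than a linear scan.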
Great. So with that, that’s my last slide. I’m going to hand things off to James.
JAMES CROSS: Great, Tole. Thank you very much. Right now, video is absolutely eating the internet. Video has become huge in both the consumer and the educational world. So these are some charts from some Cisco research. In 2013, video took up just under half of all global internet traffic. And by 2016 next year, Cisco predicts that it’s going to take up 86% of all internet traffic. So video is absolutely huge right now.
As I mentioned, that’s not just happening in the consumer world. It’s also happening in the educational world, too. And a lot of this change is being driven by millennials and college-age students. So of all of the age groups that are consuming online content, the 18-to-24 age bracket are consuming more than anybody else. Everybody’s watching a lot, but they’re watching even more of it. And so these college-aged students are really driving massive video consumption in the educational world, too.
And for this reason, all of a sudden video has now become very, very central and core to the online learning experience. Lots of the universities and institutions that we work with here at MediaCore now see video as a mission-critical system. It’s just so core to all of the work they’re doing in online learning.
And I was actually at a large university today that had really ambitious goals to double registration for their online programs in the next two years. They actually see high-quality video content as being one of the key competitive differentiators in that space and as being something that’s really going to drive the growth of their online programs. So in 2015, video is absolutely huge in online learning. And it’s only going to get bigger.
And because videos have suddenly become so core to the whole online learning experience, it’s now more important than ever that every student is able to access that content, because it’s playing such a lead role in the learning experience. So in 2015, video accessibility– it’s always been important, but because video is so core to that whole learning process and because institutions are leaning so heavily on video content as part of their online and blended courses, it’s more important than ever that every student is able to access that learning content, regardless of any disability they might have.
And of course, when you look at the type of video content being produced across a typical university, there are lots and lots of different types of content being produced. The marketing department might be producing high-quality promotional videos and things like videos of visiting speakers being shared publicly to raise the profile of the institution and its courses and staff. Then there might be lecture capture content that’s automatically captured in the classroom and that’s uploaded online for students to access as part of the online learning experience.
Then we’ve also got student-created content, which is something we’re really passionate about here at MediaCore. And so we’re actually seeing increasingly institutions allowing students to create their own video content to demonstrate and share their learning, and even as part of things like video assessments, as well.
Then we have instructor- and professor-created content– so instructors creating short pieces of the content to embed into their online courses and share with students, often as part of a flipped classroom or a flipped learning or blended learning approach.
And then finally, another trend we’re seeing is instructional designers and media teams actually creating high-quality video learning content that’s specifically designed for online video and is specifically designed to go into their online and distance courses. So we’re seeing lots and lots of different types of content being produced across campus. And of course, when you’ve got these requirements around video captioning accessibility, it’s really important to think about how the different types of content that are being produced across campus are captioned and made accessible to learners, as well.
Another thing that’s really important to consider is that most students are actually consuming video content on mobile devices and smartphones. So this is a chart from the end of 2014, and this is a chart that shows the video consumption across the web and which types of devices people are viewing videos on.
So at the end of 2014, video viewership on mobile phones and tablets was around 40%. And actually, I saw some news. The head of engineering at YouTube spoke at a conference yesterday and actually shared that now over 50% of all YouTube video views are happening on mobile.
And so it’s really likely that when your students watch video content, they’re not going to be doing it on a desktop machine. They’re going to be doing it on a smartphone or tablet device. And so it’s really important to then think about the student video experience when it comes to those devices. And in 2015, mobile video accessibility and the ability to access those captions is absolutely crucial, because it’s likely that students are going to be accessing that video learning content on mobile devices and smartphones.
And a little bit earlier, Tole mentioned that there are different types of caption formats. And some of those are more compatible with mobile devices than others. Here at MediaCore we convert our captions to WebVTT, which is an emerging HTML5 way of sharing captions and which gives us lots of control over the captioning experience on mobile devices, as well.
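For a sense of how close SRT and WebVTT are, here is a minimal Python sketch of a conversion. It adds the `WEBVTT` header and switches the millisecond separator in timing lines from a comma to a period, leaving the numeric frame counters in place as cue identifiers. This is an assumption-laden illustration, not MediaCore's actual conversion, and `srt_to_webvtt` is a hypothetical name.

```python
import re

# SRT writes timestamps as HH:MM:SS,mmm; WebVTT writes them as HH:MM:SS.mmm.
_SRT_TIMESTAMP = re.compile(r"(\d{2,}:\d{2}:\d{2}),(\d{3})")


def srt_to_webvtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT (no styling or positioning handled)."""
    body = srt_text.replace("\r\n", "\n").strip()
    converted = []
    for line in body.split("\n"):
        if "-->" in line:
            # Only timing lines are rewritten, so commas in caption text survive.
            line = _SRT_TIMESTAMP.sub(r"\1.\2", line)
        converted.append(line)
    # A WebVTT file starts with the WEBVTT signature followed by a blank line.
    return "WEBVTT\n\n" + "\n".join(converted) + "\n"


sample = "1\n00:00:00,000 --> 00:00:03,500\nHi, welcome to the webinar.\n"
print(srt_to_webvtt(sample))
```

In an HTML5 player, the resulting file is what a `<track>` element would point to alongside the video source.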
So we’re very much a mobile-first company. We recognize that students are going to be consuming video learning content more often than not on their mobile devices, and so it’s really important to us that we give those students a great video accessibility experience on their mobile devices, too.
So here at MediaCore, we’re an online cloud-based video platform being used by some of the world’s leading institutions to share media with their students across any device. We also link up to things like LMS platforms and authentication platforms, so we’re really highly adapted to the workloads and the systems that higher education institutions are using.
So MediaCore offers a cohesive, intuitive cloud-based platform that gives institutions all the cloud-based tools they need to, first of all, get content into the cloud, then to manage it in a way that kind of mirrors the way that higher education institutions need to manage that content and permissions and security and view analytics around that content.
And finally, we give institutions tools to then share that content to students across any device, including via the learning management system, [INAUDIBLE], or content management system, and also to any device that a student might be viewing that content on.
So MediaCore works with some of the world’s leading institutions to help them really bring video into the learning mix and to build video into their online courses and blended courses and to safely share that video with students. And due to some recent legislative changes and lawsuits, we’ve heard from our institutions that video captioning is very much on their minds right now, and that they want to ensure that their content is captioned and that they’re able to put smooth workflows in place to allow and empower their instructors to caption content, and also for students to request captions for content, as well.
And Tole mentioned earlier that when 3Play did some research and asked institutions and organizations what was stopping them, what was the biggest barrier to them providing captions for their content, it was mentioned that it was kind of resourcing and workflow that was the issue. When you’ve got thousands of videos being shared across hundreds of online courses, it makes it really difficult to actually manage captions, to see what’s captioned and what’s not, and also to manage costs around that captioning as well and to ensure a great student experience at the same time.
So for the past six months we’ve been working really closely with our university partners to look at how they want to approach captioning and look at the types of workflows that were going to work best for them. And so we’re really pleased to announce that, in the past week, we’ve actually launched a new captioning workflow that’s specifically designed for higher education.
And this allows instructors to request a caption for a video they upload. It actually puts the power in the hands of instructors at the point that they upload a piece of video learning content. We then give the institution some back-end management tools to vet those requests, and then to decide which type of captioning should be applied and to decide which of 3Play’s services should be applied to each piece of content.
As Tole mentioned, there were lots of different turnaround times, and institutions need to start thinking, really, about the types of different content they’re producing and how quickly captions need to be made available for those different types of content. So it might be that a piece of student-produced content that’s going to be shared with a smaller group can be submitted to 3Play with a longer turnaround time than maybe an urgent event that’s happened or a visiting speaker that’s going to be shared very quickly or a really core piece of online course content. So we give institutions control over the services they use and the workflows they use to actually get that content captioned.
And then once the captioning is complete, the captions are automatically pulled back through into MediaCore and automatically made available with that video wherever that video is embedded, whether it’s in the MediaCore experience itself or inside of an LMS course [INAUDIBLE]. And in a moment, we’re taking a look at exactly how this works to give you a sense of that, also. But the workflow here is really designed in conjunction with our university partners, and we’ve worked with them really closely to look at the kind of workflow they need behind the scenes and that’s going to reduce the administrative burden of providing captions for that content in conjunction with services like 3Play.
And really backing all of this up is the new caption command center. This gives administrators within an institution and instructional designers a quick, at-a-glance overview of captions across the whole MediaCore platform. So you can very quickly see how many captions have been uploaded, what percentage of media across an institution’s MediaCore sites have been transcribed, and the status of different captioning jobs that have been sent to providers like 3Play, as well.
So I actually wanted to jump into MediaCore very quickly and give you a sense of how this works in the flesh. I’m going to jump through to a MediaCore site, and then I’m going to go into the admin panel here. Just to give you a sense, first of all, of how easy it is for an instructor to request a transcript of a video, and then how easy it is for an administrator on the site to then approve that request and see the status of transcription and captions across the site.
So I’m going to go to the upload interface. So right now, I’m a professor and I’m going to go ahead and upload a piece of video course content to share with my students. So I just click onto the file, and then I’m able to go ahead and upload a video to MediaCore. I’m just going to go and grab a video and upload that piece of video course content.
So that’s uploading to MediaCore. And then as I scroll down the page, you’ll see that there is a section for transcriptions. So as an instructor here, I can actually go ahead and request a transcription for a specific video.
So I’ve got two options here. If I have an existing caption file– so if I already have a caption file for this content, I can go ahead and upload that straightaway and MediaCore will just make that available instantly along with that video. But if I don’t yet have a caption or transcription file for this video, I can click the Request a Transcription button. And then when I get to publish that video, once that video finishes uploading, MediaCore is then going to recognize that as a transcription job request.
So I’ve jumped into the transcription command center now as an administrator. And first of all, I can see on the dashboard here all of the different statuses of captions across my site. So I can see that I’ve got two jobs that are pending approval, two requests from either students or instructors to actually have this content captioned. I can see a number of transcripts are in progress with 3Play’s captioning service. I can see 46 of my videos have been captioned across my site, and I can see that that represents 39% of the videos across my site. So this gives administrators a quick, nice, at-a-glance view of how many of all of their video learning materials have been transcribed and are accessible to students.
Now, in the command center here I can also see all the requests that have come through from either students or professors for a transcript for a video. And if I want to, I can click into the video. I can check that it’s a high-quality piece of content with decent audio that’s going to reflect well when it’s captioned. I can even jump into the info panel here and see which instructor or student requested that transcript, as well.
So I want to approve this, and I want to send this to 3Play to be transcribed. So I’ve got three different options here, and I can send this video to 3Play to transcribe with any of their different turnaround times. So this is a high-quality piece of course content. I want to make this available very quickly, so I’m going to have an expedited transcription created. And then I can choose to approve or deny this request. And of course, I’m going to approve it.
That’s then being sent automatically to 3Play. 3Play are then going to transcribe that content in line with the profile I’ve selected here. Then once that’s been completed, the transcript is automatically going to be pulled back through to MediaCore and presented as closed captions along with this video wherever that video’s embedded. And that will just automatically update as soon as those captions have been sent back from 3Play.
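The request-and-approval flow James walks through here can be sketched as a small state machine. This is only an illustrative sketch: the status names follow the dashboard labels mentioned in the demo, while the class, method, and turnaround names are hypothetical and are not MediaCore’s or 3Play’s actual API.

```python
from enum import Enum


class Status(Enum):
    PENDING_APPROVAL = "pending approval"  # instructor or student has requested captions
    IN_PROGRESS = "in progress"            # approved and sent to the captioning provider
    CAPTIONED = "captioned"                # captions returned and attached to the video
    DENIED = "denied"                      # administrator declined the request


# Which status changes are allowed from each status (hypothetical model).
TRANSITIONS = {
    Status.PENDING_APPROVAL: {Status.IN_PROGRESS, Status.DENIED},
    Status.IN_PROGRESS: {Status.CAPTIONED},
    Status.CAPTIONED: set(),
    Status.DENIED: set(),
}


class CaptionRequest:
    """One caption request for one video (illustrative only)."""

    def __init__(self, video_title, requested_by):
        self.video_title = video_title
        self.requested_by = requested_by
        self.status = Status.PENDING_APPROVAL
        self.turnaround = None

    def _move_to(self, new_status):
        # Reject any change the workflow does not allow.
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.status = new_status

    def approve(self, turnaround):
        # Administrator approves and picks a turnaround profile.
        self._move_to(Status.IN_PROGRESS)
        self.turnaround = turnaround

    def deny(self):
        self._move_to(Status.DENIED)

    def captions_returned(self):
        # Provider has finished; captions are now shown with the video.
        self._move_to(Status.CAPTIONED)


request = CaptionRequest("Week 3 lecture", requested_by="instructor")
request.approve(turnaround="expedited")
request.captions_returned()
print(request.status.value)
```

Modeling the allowed transitions in one table keeps the rule “a request must be approved before it is sent for transcription” in a single place, which mirrors the administrator vetting step shown in the demo.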
So the institution would have a relationship with 3Play. 3Play would handle things like billings for transcriptions, and the institution would also have access to 3Play’s control panel on their end. But MediaCore, really, we want to facilitate the workflow and make it very easy for institutions to manage the transcription workflow, to send content to be transcribed, and then to really have confidence that the videos across their site are transcribed in the correct way.
Now, I’ve shown you this through the MediaCore platform here. But of course, institutions don’t just use videos through their platform. They also use them increasingly through their LMS, as well. So this same workflow works great inside of the MediaCore LMS plugin. I’m showing you it in Canvas right now, but this would look exactly the same in any of the other LMSs that we support, like Drupal and WordPress.
And I can upload a brand new video here. It’s the same workflow you saw just moments ago. I can go and upload a video directly to my LMS course, and I have the same option here. So I can actually request a transcript from 3Play for a video directly from my LMS. I request a transcript, and then as soon as that transcript is ready, those captions are automatically going to appear in the video that’s embedded in my online course. So we’re just trying to make it as smooth and as seamless as possible for institutions to make captions available with their videos and to give them the workflow they need to really manage that process across the institution.
Great. So that’s it for my input. If anybody does have any questions, we’d love to hear them. You can use the questions feature in GoToWebinar to ask any remaining questions you might have. And I think Aidan is going to take a look through the questions and ask Tole and myself the relevant ones.
AIDAN HORNSBY: Thanks, James. So yeah, we’ve had a few questions come through. But if anyone has any other ones, please submit them now.
The first one is, I think, one for you, Tole. So what is the best process for creating closed captions?
TOLE KHESIN: The way that we do it, we have a three-step process. When we receive a video, we first put it through speech recognition, which gets it to a point where it’s about 75% or 80% accurate. We then have a professional transcriptionist who will go through word by word and clean up the mistakes left behind by the computer. And then we also have a separate person, a QA person, who will go through and research difficult words and make sure we’re adhering to all the different standards that we have in place for transcription, double check all the grammar and punctuation.
And so by the time we’re done with it, it’s pretty much a flawless, time-synchronized transcript. We average about 99.6% accuracy, but the computer, the speech recognition process, has done 3/4 of the work. So it’s just a very efficient model.
AIDAN HORNSBY: Thanks, Tole. And just one more on the actual pricing model for captioning. How much does it cost to have things captioned?
TOLE KHESIN: Yeah. So I think, as James mentioned, basically the way that the captioning works is that you would need to just set up an account with 3Play Media. And the pricing is all based on what you use. It is based on the duration of the content. So if you upload a one-minute video, typically it starts off at $2.50. If you upload a two-minute video, it’ll be $5. And it’s all based on what you use.
There are also volume discounts, so what a lot of universities will do is pre-purchase a bucket. For example, you buy a bucket of 100 hours and that locks in a volume discount. And if you go to the 3Play Media website, there’s a link there for pricing and you can see the entire pricing schedule there.
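The duration-based pricing described above works out to simple arithmetic. The sketch below assumes the quoted starting figure of $2.50 per minute scales linearly with duration, as the one-minute and two-minute examples suggest; it does not model volume discounts or different turnaround rates, and the function name is hypothetical.

```python
from decimal import Decimal

# Starting per-minute rate quoted in the session (assumed to scale linearly).
RATE_PER_MINUTE = Decimal("2.50")


def caption_cost(duration_minutes, rate_per_minute=RATE_PER_MINUTE):
    """Estimate the captioning cost in dollars for a video of the given length."""
    if duration_minutes < 0:
        raise ValueError("duration must not be negative")
    cost = Decimal(str(duration_minutes)) * rate_per_minute
    return cost.quantize(Decimal("0.01"))  # round to whole cents


print(caption_cost(1))   # 2.50
print(caption_cost(2))   # 5.00
print(caption_cost(60))  # 150.00
```

At this rate a one-hour lecture comes to $150 before any discount, which is why pre-purchasing a block of hours can matter for an institution captioning at volume.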
AIDAN HORNSBY: Thanks very much, Tole. And now one for James. It’s from an instructor who said, we’ve got hundreds of instructors creating and uploading content. What’s the best way to control those costs? I assume they mean on captioning that much content.
JAMES CROSS: Sure. Yeah, this is something we hear quite often. Because video has suddenly exploded in higher education, as I mentioned earlier there are lots of different types of content being produced across campus, including instructor-created content, but even student-created content, too.
And what we’d encourage institutions to do is to have policies in place around the different types of content and the accessibility for that content. So I’d look at things like the audience that content is going to reach, whether it’s high-quality course content that plays a core role in the learning experience or whether it’s maybe just a piece of content that’s going to reach a much smaller audience and have a much smaller lifespan, or a piece of content that’s going to be primarily just archived and not viewed on a regular basis by students or instructors.
So it’s really thinking about the different types of content that are being produced across the institution, and then thinking about the different captioning options available and deciding on which to apply to the different types of media, really.
And also, another thing we do through the MediaCore platform through our captioning workflow– which I didn’t get a chance to show just now– is we actually have a feedback mechanism for students, as well. So students, if they reach a video that for whatever reason isn’t captioned, they can actually request a caption through the same process from the front end of the MediaCore platform, and then the institution can decide what to do with that request and send it to 3Play or a different provider to be captioned.
So it’s a case of just really having a policy in place, thinking about the different types of content that are being produced, and thinking carefully about how those different types of content can be captioned.
AIDAN HORNSBY: Thanks, James. And now one more for you, Tole. Is speech recognition sufficient for accessibility?
TOLE KHESIN: [CHUCKLES]
Yeah, speech recognition is interesting. Video platforms have done a lot of interesting things with speech recognition. For example, YouTube is probably the best example. They provide– they call it auto-captioning for all of the videos that you upload. I think often that sort of gives you a gist of what’s going on, depending on how well the audio was recorded. And there are, perhaps, other cases where speech recognition is sufficient. But in education, it absolutely is not sufficient.
So speech recognition, as I mentioned before, typically produces, if you have very good speech recognition, maybe 75% accuracy. And so what that really means is that one out of four words is wrong. And the thing about speech recognition is when it’s wrong, it’s wrong spectacularly. So it’ll completely send you in the wrong direction if you’re relying on that text to follow along and understand what the content is about.
And especially within education, a lot of the content can be complex, have a lot of specialized vocabulary, professors can have accents. And all of those factors really tend to work adversely when it comes to speech recognition. So it’s really, really critical to have professional transcriptionists go through and clean up those errors so that we can reach an accuracy of a minimum of 99%. And as I’ve said, we’re often even much higher than that. But 75% is absolutely not sufficient for education.
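To put those accuracy figures side by side, here is a short worked example. It assumes accuracy means the fraction of words transcribed correctly, and the 1,000-word excerpt is a hypothetical sample size chosen only for illustration.

```python
def expected_word_errors(word_count, accuracy):
    """Approximate number of wrong words at a given word-level accuracy (0.0 to 1.0)."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0.0 and 1.0")
    return round(word_count * (1.0 - accuracy))


words = 1000  # hypothetical excerpt length
print(expected_word_errors(words, 0.75))   # 250, speech recognition alone
print(expected_word_errors(words, 0.99))   # 10, the stated minimum target
print(expected_word_errors(words, 0.996))  # 4, the quoted average
```

In other words, over the same 1,000 words, the gap between 75% and 99.6% is roughly 250 wrong words versus 4.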
AIDAN HORNSBY: Thank you. One more for you, Tole, around the services that 3Play are offering. Do you offer any live captioning for webcasts or special events, like commencements or guest speakers? Or could you recommend a service?
TOLE KHESIN: We do not do live captioning. Everything that we do is pre-recorded content. We have some very fast turnaround options, but we don’t do live captioning.
AIDAN HORNSBY: Fantastic, thank you. And one last one for you here regarding captioning laws. Are there laws yet addressing legacy content and when those should be brought into compliance?
TOLE KHESIN: Yeah. It’s a difficult question to answer because it really has a lot to do with the type of content and the type of sector that you’re asking about. So, for example, the CVAA, the 21st Century Communications and Video Accessibility Act, which was put in place in 2010, does have sort of legacy laws. I mean, basically, it says that any time a piece of content is put online, if it has ever aired on television with captions, then when it’s published online, it also has to have captions. So that is sort of a legacy rule in place.
As far as education goes, a lot of that really depends on which state you’re based in. California’s probably the strictest state, and there’s sort of a spectrum of different laws around the country. And I know that in Ontario– Ontario, as I mentioned before, is probably the strictest. With the AODA, that has the strictest rules. That does have some legacy aspects to it built in.
AIDAN HORNSBY: Thanks very much. And now one more for James. So within MediaCore, once someone has transcripts for content, can they use that metadata for anything else besides closed captioning?
JAMES CROSS: Great. So this is actually something that we’re very interested in. As Tole mentioned earlier, it’s not just captions that these transcripts can be used for. It’s also things like SEO and in-video search. And so we’re actually working behind the scenes right now to develop some new ways of using that caption data to provide interactivity and things like [INAUDIBLE] search, as well.
So it’s kind of a watch-this-space on that one. And we’re really looking forward to using the additional rich metadata that we now have from captions to build lots of interactive experiences around videos for learners.
AIDAN HORNSBY: That’s great. Thanks, James. And now one final question for Tole. How does 3Play ensure high accuracy for complex or technical content?
TOLE KHESIN: That’s a good question. We process lots and lots of content that’s very complex and very technical. Oftentimes people have strong accents, there’s background noise, so this is sort of our bread and butter.
We approach that in a number of different ways. First of all, we have very highly trained, professional transcriptionists who follow very rigid standards. For example, let’s say it’s a math lecture and the professor talks about a math formula. There’s a very specific way to transcribe that math formula. And we have pages and pages of those standards in place that everybody gets trained on so that the video, when it’s finished, those math formulas are transcribed exactly the same way every time.
We also have a multi-step process, as I mentioned before. So basically, the way that works, there are multiple people looking at each video. So if the first transcriptionist can’t decipher a word, then what that person will do is they’ll flag it and then there will be a QA person that will later research that word and try and find out exactly what it is. That really helps a lot.
The other part of it is that we have a very large pool of transcriptionists across many different domains. And what we can do is we can route content to people who are most familiar with it. That applies not only to the type of content, but we might have transcriptionists that are familiar with a certain type of accent and it makes it easier for them to transcribe that content.
And then lastly, we have an editing interface in the 3Play Media account system which makes it really, really easy to make quick fixes on the fly. So let’s say you find– because I mentioned even though our accuracy rate is about 99.6%, that still means that occasionally you’ll find an error. That might be a person’s name that we couldn’t research or couldn’t locate, or something might be misspelled. And sometimes there’s sort of gray areas, where you want something to be written a certain way and we did it differently.
So what you can do is you can just go into that editor and make that edit on the fly. And then as soon as you click Save, all those changes will propagate to all the final outputs. So that’s sort of the last safety that just makes it very easy.
AIDAN HORNSBY: That’s great. Thank you, Tole.
So I saw the questions, and I think we’re over time now. So we’re going to wrap things up. If you have any additional questions, if we just go to the next slide, we’ve got the contact details for James and Tole here. Please feel free to send them an email with any questions you may have.
We’ll also be sending out a recording of today’s session to everybody who registered. So thanks for attending, guys, and goodbye.
TOLE KHESIN: Thank you very much.
JAMES CROSS: Thanks, everybody.