Video Captioning for Accessibility – The Penn State Solution [TRANSCRIPT]


JOSH MILLER: My name’s Josh Miller. I’m one of the founders of 3Play Media. I’m going to give just a really quick overview of closed captioning and just a tiny bit about us, just so that we’re all on the same page and when we talk about different terminology, we’re talking about the same thing.

Then I’ll turn it over to Keith Bailey from Penn State. He’ll go through the solution that they’ve built, which is a complete media solution with closed captioning. And then Tole Khesin from 3Play will give a few demos of some of the new technology utilizing core captioning functionality and what else can be done with it.

OK, so real quick, I’m going to go through a quick overview of closed captioning so we’re all on the same page, then turn it over to Keith Bailey from Penn State. The bulk of the presentation will be from Keith. Then we’ll show some of the new technology we’re working on that utilizes that core captioning functionality.

So captions are really text that is time-synchronized with media content. Captions also convey relevant sound effects and any other actions taking place on the screen that couldn’t be conveyed as easily unless they’re put into text.

It’s really important to keep in mind.

Captioning originated in the early 1980s from an FCC mandate really focused on broadcast television. So captioning versus transcription. You’ll hear both of these terms used quite a bit today. Transcripts usually mean a text document that has no time data. So the text is not necessarily synchronized with the media, whereas captions are.

Going back to what I was talking about with captions conveying all relevant sound effects, subtitles, on the other hand, do not. Subtitles are really meant to provide a language indicator. So subtitles usually mean you have multiple languages showing up on the screen so that people can follow along, but that audience usually can hear all the relevant sound effects, so it’s not necessary to put that into text.

Closed captioning versus open captioning. Sometimes you’ll hear about this. Open captioning, you’ll see the captions on the screen at all times. There’s no way to turn them on or off. Whereas closed captions, by default, are off on the screen, and then the user can choose to turn them on.

And then post production versus real-time. Post production means it’s recorded content. The captioning takes place after the content is captured and then published, whereas real-time is what you see on the news or sports, where the event is in real-time, the captioning is taking place in real-time. And those really are two very different products and services that can be offered.

So really quickly about accessibility laws. Section 508 and 504 are the very commonly discussed ones. They affect a number of universities, certainly federal agencies. Any federally funded content that goes up online is supposed to be accessible. That is a federal law.

The most recent law that has come into play is the 21st Century Communications and Video Accessibility Act, which is commonly referred to as the CVAA. That’s the law that was recently passed about a year and a half ago. And what that currently says is that all content that was broadcast on television, once it goes up online, it also has to have captions. And so the experience has to be as good or better than the captioning experience on television.

So a number of networks and content publishers are kind of scrambling right now, because they actually have to have all recorded content ready with captions online in a couple months. So that’s the first deadline. So anything that had been online already is going to have to have captions. And so this law will probably continue to evolve. There’s a lot of talk about what else will it start to include.

So really quickly, when we think about closed captioning, I think a lot of people often think about, well, for people who can’t hear the content that’s being spoken, that’s kind of the obvious application. And that’s a huge benefit, no question. You’re expanding your audience, and that’s really important. You’re also including the entire audience, which is important. So it’s inclusive.

But captions, when you have that text synchronized with video, especially when it comes to learning content, it’s really, really beneficial for a number of other people as well. So for people who speak other languages, they can now actually follow at their own pace. It’s easier for them to read, most often, than it is to hear a language that they’re less familiar with. It also provides everyone the ability to view that content anywhere. So if students are in a library or in a noisy environment where the sound really can’t be used, captions enable them to follow that content.

The beauty of the internet is that captions also make content searchable. This is something that can’t be done on television, certainly, but with a computer, having that synchronized text with the right tools will allow you to actually search through all the content by what’s being spoken. Certainly, if the content is public– the internet is a text-based entity, really, and so having that text allows you to search across the internet for content. And without that text, there’s really no way to find the keywords that are within any video. And then certainly having the English first allows you to then translate into other languages, so if you do have a global audience, that’s something to consider.
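[Editor's illustration] The point about searchable captions can be sketched in a few lines of Python. This is a minimal sketch, not 3Play Media's implementation: the `Cue` class, the `search_cues` function, and the sample cue text are all hypothetical names and data introduced here. It only shows why time-synchronized text lets a viewer jump to the moment a word is spoken.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption cue: a span of time plus the text shown during it."""
    start: float  # seconds from the start of the video
    end: float
    text: str


def search_cues(cues, keyword):
    """Return the start time of every cue whose text contains the keyword."""
    needle = keyword.lower()
    return [cue.start for cue in cues if needle in cue.text.lower()]


# Hypothetical sample data; real cues would come from a caption file.
cues = [
    Cue(0.0, 4.0, "Captions are text synchronized with media."),
    Cue(4.0, 9.5, "[APPLAUSE]"),
    Cue(9.5, 15.0, "Synchronized text makes video searchable."),
]

print(search_cues(cues, "synchronized"))  # [0.0, 9.5]
```

A plain transcript with no time data could answer whether a word appears, but not where in the video; the start times are what make the jump possible.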

Really quickly on the process– we’ve built what is really meant to be a very flexible captioning process. We do everything through the web. We have a number of different ways to upload, APIs to build custom processes. And that’s what you’re going to see from Keith today: a complete solution that’s built around the API so that it can be customized to the exact needs of this Penn State group. And so we’re all about integrating with different platforms, making the process as easy as possible. But what you’re going to see today is really focused on what you see here, the API.

A couple things that we’re also really, really focused on is making the publishing process as easy as possible. So what you see here is a captions plug-in. You’ll see more of that later. It has search functionality built in, switching between multiple languages if translations are available, subtitles are available.

And it can also be embedded with pretty much any video player. So this is actually a Vimeo player that’s on display here. Vimeo does not support captions at all, but this plug-in does work with the Vimeo player so you can actually add captioning to it.

And then we also have even more interactive text tools. So again, using that same transcription and captioning process, we can provide even more text for a viewer to navigate through a video and search based on synchronized text.

So with that, I’m going to turn it over to Keith. Keith Bailey is the Assistant Dean of Online Learning at the College of Arts and Architecture at Penn State. He’s going to walk through the solution that they’ve built. One thing I should say is, we are open to taking questions during this presentation. The majority of the rest of this presentation will be Keith going through this, so if you do have questions, definitely feel free to raise your hand.

Presentation by Dr. Keith Bailey from Penn State University

DR. KEITH BAILEY: Well, thank you, Josh. That’s a good background to captioning, and we’ll talk a little bit more about exactly how we’ve implemented some of this in an academic setting and the importance behind making this happen. And is that a bit loud?

So I’m Keith Bailey, Assistant Dean for Online Learning at the College of Arts and Architecture within Penn State University. My sole responsibility is to look at online learning within our own academic discipline. So much of this solution has come out of the needs we’ve seen over the years working within the college. And I’m going to start broad. I’m going to start with kind of looking at Penn State as a whole, and then I’m going to narrow it down into exactly how we’ve taken kind of the transcription and the automation and the system we’ve built to help implement what we do in an online world within our own college.

So just a little bit about Penn State, just to kind of give you the scope and the scale of what we are and what we’re trying to reach. We are a 24-campus system with one virtual campus. So one that is purely online, where we have to be thinking about proactively managing these types of needs. We are a land-grant University, state-related, state funding. So a very large system, 96,000 undergraduate students, so we have a lot of possibilities and a lot of demands that are potentially out there.

So then if we look down to the e-Learning Institute, a lot of people look at it as the e-Learning Institute, that we’re a university, we’re part of– we’re a college. We’re an institute within the college who has funded an institute purely with the goal of managing online course-ware within the seven academic disciplines in that academic college. So our primary goal is to work with faculty. We have learning designers on staff, we have multimedia specialists, and a technologist whose pure responsibility is to work with faculty to grow online learning for the University.

Then if we look at Penn State as a whole, when we talk about the 24-campus system, we have this opportunity to start re-utilizing our curriculum across the various aspects of the University. So one of the features is, you’ll notice UP is our [? mouth ?]. But then we have the 24 campuses, that when we develop an online course, wouldn’t it make sense to be able to share that course to other campuses?

Many campuses have different levels of resources, and we want to make the best use of our resources and share that curriculum. So we have this thing called the e-Learning Cooperative, which is a mechanism by which we can offer seats to other campuses, and then students can enroll in those campuses and it doesn’t matter what campus they’re coming from. So one of the Berks campuses can open up courses, and University Park can take seats in that. So the idea is to make the best use of it and allow students to take the curriculum the way that they want to take it.

And then we have the World Campus. So that’s our virtual campus. Actually, the University has a policy that says that any time we’re going outside of the traditional student that’s at any one of the campuses, we have to deliver through the World Campus. So it is a nice mechanism to have in place.

And then it’s actually, there’s incentives behind that. There’s a revenue stream that comes back for us enrolling courses and getting students from the World Campus back to the academic home. Thus the e-Learning Institute can fund itself through enrollments through that population.

So if we look at our college as a whole, this is kind of the portfolio that we have. We primarily serve a general education, general arts population. Every student at the University taking an undergraduate course or undergraduate program has to fulfill six credits of general education.

So when we think about that across the 24 campuses and the World Campus, all students have to take six credits. So that means, there is our primary audience. So we developed online courses primarily to reach that goal.

We also became– one of the issues is not enough space. We have a lot of students, 7,200 new students every year. There’s not enough space to house all of them. So online education, in some ways, is another way to reach those same students. Summer enrollments as well– students go home, they want to take courses, they want to work towards getting their degree in four years, and they can take courses in the summer then.

We do have several degree programs. We have an MPS in Art Education that’s fully online through the World Campus only. A Digital Art certificate, and Music Education, we have a couple of courses that we offer in there. And then we have a new online program that we’re going to be offering in the area of geodesign. So 55 courses in all that the college manages and maintains. And we hit about 12,000 enrollments annually in those courses, so we do have issues of scale that we have to accommodate and accomplish every year.

So let’s start looking at the increasing need for accommodation at the University. Obviously, one of the publications, more recently, was the growth in online education. So as online education grows, obviously the demand for specific needs is going to grow along with that. And having over 6 million students taking one online course in a fall semester is a rather large trend, and something we need to be cognizant of so we can accomplish and accommodate those individuals.

We don’t have any sense that this is going down. It’s probably only going to go up. And you are probably talking strategically within your own universities about how you move into online education, blended learning, hybrid learning, whatever it may be. Those are common discussions that are going on.

So then if we look at online enrollments at Penn State University, just through the World Campus and University Park in 2011-2012, we had 524,000 enrollments as a university. This isn’t just online, but 9% of that was online. So that’s 47,000 enrollments that were either fully online through World Campus or University Park.

So we start to look at that and we break that down, and half of that is actually University Park students sitting at University Park, resident students taking online courses. So that’s a very different unique audience for us. And we see that this number is actually going to grow as well, especially when we move into the summer months.

So then accommodations. So we’re breaking it down. It’s all about the scale of the university. In ’11 and ’12, we had 1,100 students that registered with our Office of Disability Services, that registered for an accommodation of some sort.

That does not mean that there’s a hearing impairment or a visual impairment. That means an accommodation. So six of those students needed interpreters or captioning across the system, compared to four the previous year. And then we have five blind students at the University, which also includes students that are at the World Campus, so fully-online students.

So then if we look at all of these requests, in 2010, we had 18 students in 52 unique courses. So now think about the impact of that in the College of Arts and Architecture, which says that you have to take six credits of art. That means we know these students have to come through our curriculum, so now we have to be thinking about how we accommodate that.

In 2012, 37 students, 82 unique courses. So we’re seeing a growth. Of those, the blind– we had 0 in Fall 2010, and it grew to 3 and 6 unique courses. And I think that the number of 5 has increased since then, but this is World Campus accommodations, too, so 3 of those blind students are actually matriculating through the World Campus. Deaf or hearing-impaired, 3 students, 11 unique courses. In 2012, 9 students and 27 unique courses. So we feel that that number is going to be growing as well, and we have to figure out the best strategies for accommodating this.

So we do have a very defined process at the University for how one goes about making an accommodation. If a student comes in and walks into a faculty member and says, hey, I need X accommodation, that is not the approach. There is an Office of Disability Services that they need to go disclose this to.

And if they qualify, a letter gets written for the type of accommodation that is needed. That typically gets handed off to the faculty member, and then it’s up to them from that point forward to implement that accommodation whenever that accommodation is needed, within reasonable and limited opportunities.

So then our Institute actually comes into play at that point, because what normally happens is that accommodation comes to the faculty member, and they say, well, I’m teaching an online course. How do I implement that? So then they come to the e-Learning Institute and basically say, so can you help me fulfill these needs?

And that’s where we come into play. So it’s a very nice process, it works very well, and it gives us reasonable time, really, to be able to do this. But then the question becomes how to do it more proactively.

So if we look at it from an accessibility standpoint at the University, we have these Penn State Learning Design Standards. There’s 12 of them that have been defined by the University. And if we look at those, you’ll notice one of them, number 7 there, is accessibility.

And at the time that this came about– which I think was about a year and a half, two years ago– 508 compliance was kind of the standard, if you will, and that has since changed. But this became very important. So all learning designers are looking at these standards and saying, how do we fulfill this within each of our courses and be proactive on that?

So one of the recent things that I’ve been working with the World Campus on is we’ve put together an OCAQI task force, which is looking at how to implement accessibility standards on top of online courses, and do this more proactively. So much like my Institute exists, there are a lot of different groups on campus that focus on learning design, so the idea of trying to standardize on approaches is very important. And this group will hopefully be able to do that. And the real goal is to figure out best practices and strategies for online course-ware– not the public side, but more of the private course side of things.

So then, one of the benefits was that at first, it was coming up with what those standards were. But then the University came up with AD69, a new policy, in 2011, about three months after we started this task force, that standardized on the WCAG. And we went with the 2.0 standard at a compliance level of AA.

So that helped define about half of what the charge was. What are we going to do? How are we going to approach it?

But then as soon as we went in there and we looked at these standards, every instructional designer’s eyes glazed over, because trying to figure out how to implement each one of those standards in a course became very cumbersome. So we’ve been working on this task force for well over a year now, trying to develop a strategy and a plan to make things transparent and communicate to our key stakeholders and students with disabilities exactly how well these courses meet the standards, and knowing that we want to continue to improve the standards. So this is hopefully going to help create the guide by which we disclose how accessible each course is.

And say, in the arts, say someone comes in and they have a visual impairment. Do they want to get their general education credit taking an art history course, which requires a lot of visual looking at images, and then getting access via those images? Or do they want to take a music course, a jazz course? So the idea is that we’ll be able to offer sample lessons around these courses so students can make the best choice that they can possibly make. Not telling them what to go in, but giving them the opportunity to make a choice.



AUDIENCE: Are your courses template-based, or are they per instructor, like each instructor gets [INAUDIBLE]?

DR. KEITH BAILEY: Give me a couple more screens, and I’ll hit on that in a minute. But that’s an excellent question, and that is a challenge that we have. And we could probably have about a five-hour debate, just on what the faculty want and what you want as an administrator.

So basically, this policy also says that anything, two years back, needs to be compliant. So then all of a sudden, that becomes a challenge, especially when you’re talking to faculty that are used to working with something like Dreamweaver, don’t understand anything about tab orders and things like that, or understand how screen readers work. And then older sites, we need to be working towards compliance, but as soon as you update those, do those now need to be compliant?

So then as part of all of this, the University came up with these key blockers. And these were 12 of the key things that were like– these are a little bit easier to hit on. Let’s do these upfront, let’s get them into place, and then when these other requests come in, we accommodate those as is needed.

And notice down in the very bottom right here, closed captioning and audio transcription is one of those pieces. And there’s dollar amounts that come along with that, right? So every time you send a video out for transcription, there’s money that’s associated to it, there’s a lot of time associated to it, and it can become a very cumbersome proposition to have to implement. And that’s where we have come into play, to try and help streamline that for the University and for our own college.

So a lot of this stems back to our learning design approach within our own academic college and how we approach online learning. And ironically, as we’ve gone down this path, our approach was kind of validated as these accessibility needs came into play, because we were able to layer in accessibility demands much more quickly and easily, not being within a traditional LMS and relying on them to be compliant. We were able to make ourselves compliant.

So it starts with keeping our content separate from our LMS. So we’ve implemented our own content management solution, which allows us to have our own visual design, our own templates, if you will, that we can create as accessible templates. This is actually an instructional content management system, which implements a lot of those best practices that we saw earlier. Those 12 best practices are implemented in these themes.

Then students transfer immediately from there over to the LMS to do a lot of their communication. The discussion boards, the drop-boxes, the grading– all that stuff happens in the communication space. This is really a course website, if you will. But the benefit has been great in our ability to implement design standards.

So then if we take that one step further, and this goes to the actual theming or the templatization– which we don’t like to use. We like to use the word “frameworks,” because it feels a little better for faculty wanting to create courses.

But so the framework we can put into place, and it can now account for about 50% of the accessibility needs. And then we let the faculty member focus on what we really want them to focus on, their subject matter expertise, right? Let them work on the content.

And then the big debate comes in when they’re implementing this content: how do we make sure that it’s accessible? Well, if you don’t have an H3 tag, or if you use an H3 tag and you don’t have an H2 in front of it, it kills the screen reader. So I mean, those types of things are things, again, that we don’t want the faculty to have to think about. So it’s about implementing strategies, and how do you put this into place to make sure that it’s accessible, so we don’t have a lot of review that’s going on the back-end. Once all that’s done, the idea is that this layers together, and the student has their course experience.
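[Editor's illustration] The heading-order problem described here (an H3 with no H2 in front of it) is the kind of check that can be automated so faculty never have to think about it. The sketch below uses only Python's standard `html.parser` module. It is not the ELMS code or the Drupal module mentioned later in the talk; `HeadingCollector` and `skipped_headings` are hypothetical names introduced here, and the rule that a page starts at H1 is an assumption that a themed course page might relax.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Record the level of every h1..h6 start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_headings(html):
    """Return (previous, current) pairs where a heading jumps more than one level."""
    collector = HeadingCollector()
    collector.feed(html)
    problems = []
    previous = 0  # assumption: the outline is expected to start at h1
    for level in collector.levels:
        if level > previous + 1:
            problems.append((previous, level))
        previous = level
    return problems


print(skipped_headings("<h1>Lesson 1</h1><h3>Reading</h3>"))  # [(1, 3)]
```

A content system could run a check like this on submission and either warn the author or block the save, which is the administrative choice discussed later in the talk.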

Take this one step further. We have our content. A lot of it, especially in the arts, is very media-driven. So we use a lot of public information. We have a lot of purchased information– full streams and things like that to support our film music courses and our media courses.

And then we have our own private system. And you’ll see how I’m drilling down to the need for transcription, and then this private space which is a space that we’ve built out. So all of this material is kept separate from the content, and we layer that stuff in.

So basically what we’ve done is we’ve developed a suite of tools in an open-content system. We’ll call it ELMS Content, ELMS Media, and ELMS Studio. Studio is the interactive environment that’s specific to Arts and Architecture and studio-based learning. Not going to talk about that at all today. We can have a different conversation on that if you would like.

The Instructional Content Management System, you’ll see how those two pieces– the Instructional Content Management and ELMS Media, ELIMedia– those two things fit together to help produce that environment for us. So let’s look at the content really quickly.

It’s developed off of an open-source framework, Drupal. It’s PHP MySQL, basically. We’ve pulled together a lot of community modules. We’ve developed our own learning design modules that are specific to the instructional design flow. And then we create a whole bunch of themes, frameworks, templates– whatever we want to call them– but things that we can ensure are accessible but also give the course a nice look and feel that goes along with that course.

So as we went down this path originally, the idea was to give the faculty member the look and feel that they want, but the secondary thing that has happened is now we’re able to implement these global standards. And we’re able to determine that certain pieces of accessibility are met immediately by throwing these themes into place. Originally, what we thought faculty actually wanted was “not Dreamweaver.” They wanted something that they could make look a little bit better and not have to get into code, and that’s why the content management system was so important. But now the secondary benefits are that we can implement global standards at clicks of buttons, and the faculty do not have to think about that aspect.

And then going back to the content, now we have the ability to install a module. So someone out in the community has developed a module that you can go and say, WCAG 2.0 AA standard. Any time you submit that material, it’s going to scrub it, clean it, or look at it and tell you what is wrong. So we have the ability to either let it go and let the faculty submit whatever they want, or we can lock it down to a point where they can’t actually submit anything until it’s deemed accessible by this module.

So those are administrative decisions that we have to figure out. And I think one of my challenges next year– maybe I’ll be talking to you next year about this– it’s how we’ve gone about that process and engaged the faculty in the decision-making process, and allowing them to come to the decision that says how we want to implement something like this? The beauty is that we have it, and we have the ability to implement it. The question is, how do we want to implement it given a faculty-governed institution?

So again, this is an open distribution. It’s freely available. Our idea in higher education is that we want to give it back to the world. We want to grow the community.

And so the idea of opening this up to everybody is an important one to us. We would hope that coders would come in and start to develop more for us on this and with us, and we have an absolutely brilliant programmer in the Institute that focuses fully on developing this code. The University of Wisconsin is now starting to use this as part of their platform, as well, as part of their suite of tools.

So now let’s start drilling down into the media side and the transcription. The real need here was to eliminate duplication of media. We were uploading things– using LMSs out there or uploading materials. The next semester comes up, you have to re-upload those materials. You get into assessment, you’re uploading it all over the place, and then all of a sudden, the spaces become bloated with media and material.

So we wanted a place to keep that kind of separate, reuse that material, and be able to quantify where that was. But we also needed an easy way to embed it into our course material. The last thing we wanted to do was have our instructional designers or our faculty have to mess around with code and have to put something in, and then all of a sudden, the styling doesn’t work very well.

So the idea of having an embed string that you could copy and paste into the material and eliminate all of those needs was very important for us. Also, copyright and transcription– we wanted something that could manage all of that for us, and we didn’t have to manage that on the side. So copyright compliance and accessibility compliance became very important.

So the ELMS Media was the solution that we came up with. And again, it’s a Drupal-based solution. And it hosts more than just streaming video, though. It does images, audio, Flash. It’s an asset management system that can help quantify all of the digital assets that are used within the courses. It also manages our transcription and our captioning files now. It is front-end of Drupal, back-end of Flash– but that’s not to say it can’t be any other streaming server, and we are looking at that, actually, at the University, as to what that streaming server may be, but realizing that this is very specific to a learning design initiative.

So then the embed code was the other nice piece of how this all works. So let’s look at that just really quickly. And screen captures are probably the easiest way to look at this.

So this is a piece of media. This is a lecture file within the ELIMedia. You’ll notice that at the very bottom left there, there’s an embed code. So whenever that piece of media is uploaded, you automatically get this embed code. Anything that happens after the fact of that embed code automatically moves along with that embed code, so the transcription and everything comes along with that after it is done.

So here you’ll see the content editing window. Just paste that in, and off it goes. And once you’ve submitted that, you have it embedded directly within the course. And you’ll notice the caption file that’s in there as well. So this was likely put into the course, the captioning was done after the fact, and then once the captioning came into place, all of a sudden the closed captioning became available for the student.
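[Editor's illustration] The behavior described here, where captions added later show up in a course that was already published, follows from embedding by reference rather than by copy. The sketch below is a hypothetical model, not the ELMS Media code: the `MediaItem` class, the `library` dictionary, the `[media:...]` embed string format, the example URLs, and the HTML5 `<video>`/`<track>` output are all assumptions made for illustration (the system in the talk used a Flash back-end).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaItem:
    title: str
    video_url: str
    caption_url: Optional[str] = None  # filled in once captioning comes back


library = {}  # hypothetical media store, keyed by media ID


def embed_code(media_id):
    """The short string an author pastes into course content."""
    return f"[media:{media_id}]"


def render(media_id):
    """Resolve an embedded ID against the library at page-view time."""
    item = library[media_id]
    html = f'<video src="{item.video_url}" controls>'
    if item.caption_url:
        html += f'<track kind="captions" src="{item.caption_url}" srclang="en">'
    return html + "</video>"


# The lecture is uploaded and embedded before any captions exist.
library["lecture-01"] = MediaItem("Lecture 1", "https://example.edu/lecture-01.mp4")
before = render("lecture-01")

# Captioning finishes later; the course page itself is never edited.
library["lecture-01"].caption_url = "https://example.edu/lecture-01.vtt"
after = render("lecture-01")
```

Because the course page stores only the ID, `after` includes the caption track while the pasted embed string stays unchanged.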

So the benefits have been– we’ve been able to become more compliant with TEACH Act, we’ve been able to use open resources, Creative Commons resources and things like that, and embed them very quickly and easily into the courses. We’ve been able to layer on accessibility needs with captioning and transcription. We’ve been able to make a whole bunch of different decisions based on need and demand very quickly and easily.

Every semester, we end up getting a request. And say, we need a theater course. We need captioning for theater courses. So we have 50 videos, and off we go. We have to go get that done very quickly and easily. So this is allowing us to do that very, very fast.

Tagging– we’re able to find the materials very quickly, look back. The tracking back– we know exactly where these materials are being used in the course. Again, open distribution. This is the second piece of the suite of tools that we’ve developed, and it’s continuing to grow. We expect it to grow even more over time.

So now let’s look at the evolution of this transcription process, and going back to 2005, 2006, kind of the manual method, the brute force method behind the captioning process for us. So we had– this was a landscape architecture course. We had 39 Flash files in that course. We had 26 hours of lecture.

And basically what we had to do is, we took these files, we burned them to a disc, we sent them to a company. They transcoded it. They layered on open captions to each of those Flash files, they sent them back to us, and then we uploaded it and we put it in the course.

And as Josh mentioned earlier, open caption, you don’t have the ability to turn it on and off. So it was just there for all the students, and we weren’t giving them any control as to, do they want it? Don’t they want it? There was no transcript that came along with it, either. So A, it was very time-consuming. It was very costly– I don’t remember what the exact cost was on it, but it did take at least two weeks of time before we got it back. So expediency of solution became an inhibiting factor with this type of solution, and it was a very high-touch solution.

So that made us start to think about, what can we do? How can we make this better? How can we streamline it?

Accuracy and reliability was very important. If you have someone hand-transcoding things, accuracy and reliability start to drop a little bit. I mean, there’s going to be more people looking at things as you’re doing it, and there’s an increased number of opportunities for error. Rapid turnaround– there are times where we need this in a day. There are times that we can say, three, four days, not a big deal, let’s just get it done.

And then affordability. And affordability can come in a couple of different ways. I mean, you can look at the whole portfolio and determine how affordable things are, or you can look at it per-minute. I would suggest not looking at it per-minute, because you’ll gulp really deeply, but when you look at the whole scale of how all of this works, I’ll show kind of the cost-effectiveness of it for us, at least, and why my business decision has been to automate and not do in-house.

Volume– very important. We have a thousand videos in all of our courses, over that, and we know it’s going to grow. So we need to be able to do this, do a lot of them and a lot of them very quickly.

Multiple formats was also very important. We don’t know how we’re implementing on a day-to-day basis. We may need different types of formats. So we want to be able to do that.

And then the most important thing was the ability to integrate this into our media solution using the open APIs and be able to streamline it. So we have a tool that we want to use, but we have another company that we need to use to do that.

So our solution, really, was– I guess this was 2 and 1/2 years ago now– was we looked at 3Play Media and we started talking to them about ways to streamline this, meet our needs based on their solution. So the next step of this was 2011, when we first started looking at ELIMedia and then the potential to automate it. So we went with kind of a partially-automated solution, as I would say it. And basically at this point in time, we utilized ELIMedia to manage just closed captioning files. So we weren’t dealing with transcripts or anything like that. This was a high-touch type of a proposition in some ways.

So basically, how this would work– and you’re seeing screen shots of what ELIMedia is right now– basically, you would go into the system, and you would upload it, and you would classify it based on a course. You would title it, you’d put in all your copyright information, so forth and so on. Once that was uploaded, the multimedia specialists would then go to 3Play Media and would upload the file over there.

That would start a process where they would need to go in and kind of review where it was, what the status of that file was. So they would have to go back, and I think Theatre 105, one of our theatre courses, had 50+ videos that he had to go monitor and then had to go download. So once he saw that they were processed, you would go and you would download it, and you’d pick your format.

And as you can see here, we only picked one format at the time, because all we were worried about was closed captioning and giving the students the ability to turn that on and off. But we also knew that there was that transcription format. But at the time, that wasn’t the approach that we were taking.

So then once that was done, they would download the file, and then they would upload it and associate it to this media file. And so we go back to that embed code. Once that embed code was put into place and this captioning file was put into place, that automatically went back and was implemented on top of the video file.

So the learning designer didn’t have to think about how to add captioning to video anymore. They just had to know how to copy and paste an embed code. So that became very beneficial.

So of course, as the person running the Institute, I wasn’t happy with that solution. I wanted a little bit more. I wanted to streamline things. I felt that it was in some ways a waste of time for my media specialist to have to monitor and download and upload. So I wanted to see how we could fully automate that.

So we started doing that this semester. And basically we are starting to use the APIs to automate a lot of this. But realizing the importance of every time you submit something, if you fully automate this, it’s going to come back to the budget. Someone’s going to have to pay for it at the end of the day. So now we needed a process by which you start to determine how you’re going to push things up, and that became more of a review process.

So this is kind of the approach or the steps that we take to do this now. Same thing– upload the media. Once we’ve uploaded, instead of having to go to another system and re-upload that other file, now you have the ability to classify it and say it needs transcription. So the person that knows the best as to whether it needs transcription and what’s actually in that file is the media specialist at the time.

So in our case, when he uploads that file, he would classify it as needing transcription. And then you’ll see underneath that you’ll see a couple of BMR 1, KDB 163, and then none. That’s where the classification changes to Needs.

So now you can go in and now there’s an approval process associated to this. So once something is classified as needing transcription, we want the instructional designer and the media person to approve these before they ever get submitted, since there is a cost associated to the submission. So now we’re building a process and a workflow associated to this. And the new responsibility is for the learning designer and the media specialist to do that. So once we have a piece of media that has been approved by two people, and it’s classified as needing transcription, we can take the next step.
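As an illustration (not shown in the talk), the two-person approval gate described above can be sketched in a few lines. This is not ELIMedia's actual code: the `MediaAsset` class, the role names, and `ready_to_submit` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

# Hypothetical role names. The talk only says that the learning designer and
# the media specialist must both approve before anything is submitted.
REQUIRED_APPROVERS = frozenset({"learning_designer", "media_specialist"})


@dataclass
class MediaAsset:
    title: str
    needs_transcription: bool = False
    approvals: set = field(default_factory=set)

    def approve(self, role: str) -> None:
        """Record an approval from one of the required roles."""
        if role not in REQUIRED_APPROVERS:
            raise ValueError(f"unknown approver role: {role}")
        self.approvals.add(role)

    def ready_to_submit(self) -> bool:
        """True only if flagged as needing transcription AND approved by both roles."""
        return self.needs_transcription and REQUIRED_APPROVERS <= self.approvals


asset = MediaAsset("Theatre 105, Lecture 1", needs_transcription=True)
asset.approve("media_specialist")
print(asset.ready_to_submit())  # still waiting on the learning designer
asset.approve("learning_designer")
print(asset.ready_to_submit())
```

Keeping the check in one method means nothing with a cost attached can be queued unless both conditions hold.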

So that is, basically, to then classify those things that need it and have been approved as ready to submit for transcription. In this case, based on the APIs, we have a pull-down menu where you select 3Play Media and you click Submit. What will happen is it puts it in a queue, and at the end of the night, it will run a job that will push all those files up to 3Play Media in the middle of the night. So now, a specialist no longer has to push that up.

So once it’s submitted, we get a classification in our own system that says Submit. Now, there are times, obviously, when we say, boy, we submitted that and we didn’t want to. The nice thing about having kind of a midnight queue where this pushes is that we can go back and change that before that job runs.
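The midnight queue, including the chance to back out before the job runs, might look roughly like the sketch below. It is an illustration under assumptions: `SubmissionQueue` and its methods are hypothetical names, and the `push` callable stands in for whatever API call actually sends a file to the vendor.

```python
class SubmissionQueue:
    """Holds approved submissions until a nightly job pushes them out."""

    def __init__(self, push):
        # `push` is a stand-in for the real upload call (an assumption here).
        self._push = push
        self._pending = {}  # asset_id -> vendor name

    def submit(self, asset_id, vendor="3Play Media"):
        """Queue an asset. Nothing is sent, and nothing is billed, yet."""
        self._pending[asset_id] = vendor

    def cancel(self, asset_id):
        """Withdraw a queued asset before the job runs. Returns True if it was queued."""
        return self._pending.pop(asset_id, None) is not None

    def run_nightly_job(self):
        """Push everything still queued, then empty the queue."""
        pushed = []
        for asset_id, vendor in list(self._pending.items()):
            self._push(asset_id, vendor)
            pushed.append(asset_id)
        self._pending.clear()
        return pushed


queue = SubmissionQueue(push=lambda asset_id, vendor: print(f"pushing {asset_id} to {vendor}"))
queue.submit(267)
queue.submit(268)
queue.cancel(268)        # changed our mind before midnight
queue.run_nightly_job()  # only 267 is pushed
```

Batching at a fixed time is what makes the "undo" window possible: a submission is only a queue entry until the job fires.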

So then, not that we have to go to this step, but if we want to feel comfortable about the decision that we’ve made, we will see that file has been pushed up to 3Play Media. We go into our administrative system on that side and look at it, and we’ll see that it’s in progress. So we know something is happening. And the other nice thing is, you’ll see kind of quality levels associated to it. So we know how well we think this will come across at the end of the day.

And then the next thing is, once it’s processed– and the status on the right-hand side, you’ll see in 3Play Media, it says complete– again, at night, when it looks to push things up, it also looks to pull things back. So if it finds that things are completed, it will pull that back. It knows what the transcription is associated to, and it will automatically drop it in next to that media file.
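The pull side of the same nightly job can be sketched as follows. This is a hedged illustration, not the real integration: `get_status` and `download` are hypothetical stand-ins for vendor API calls, and the dictionary keys (`node_id`, `remote_id`, `state`, `files`) are invented for the example.

```python
def pull_completed(assets, get_status, download):
    """Attach finished caption/transcript files to the matching local media records.

    assets:     list of dicts with "node_id", "remote_id", and "state" keys
    get_status: callable(remote_id) -> status string (stand-in for an API call)
    download:   callable(remote_id) -> dict of output files (stand-in for an API call)
    """
    attached = []
    for asset in assets:
        if asset["state"] != "submitted":
            continue  # nothing outstanding for this asset
        if get_status(asset["remote_id"]) == "complete":
            asset["files"] = download(asset["remote_id"])
            asset["state"] = "complete"
            attached.append(asset["node_id"])
    return attached


# Fake vendor responses, for demonstration only.
statuses = {"v-1": "complete", "v-2": "in_progress"}
media = [
    {"node_id": 267, "remote_id": "v-1", "state": "submitted"},
    {"node_id": 268, "remote_id": "v-2", "state": "submitted"},
]
done = pull_completed(
    media,
    get_status=statuses.get,
    download=lambda rid: {"captions": f"{rid}.srt", "transcript": f"{rid}.txt"},
)
print(done)
```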

But the benefit that we’ve taken advantage of here is to say, now we no longer just do closed captioning, but we do transcripts as well. So the idea is that when we take the media player and actually embed it in the course, we have the ability to add features into the course now, and not only give the student closed captioning, but now we have a download button that will allow you to download the transcript.

Now of course, the question is going to come up, what happens when the faculty member doesn’t want you to download a transcript? Well, the nice thing is that because we’re a content management system, we have control of setting permissions at various levels, and say, what do students have the ability to do?

Everybody has the ability to see closed captioning. Some have the ability for downloading transcription. So we can modify and adjust things as we need, but now we’ve quantified the asset and we have everything in place.
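A per-asset permission like the one described here could be as small as the following sketch. The setting name `allow_transcript_download` and the feature labels are hypothetical; the talk does not say how the content management system stores these permissions.

```python
def student_features(asset_settings):
    """Return the player features a student gets for one media asset."""
    features = ["closed_captions"]  # everybody can see closed captions
    # Transcript download is opt-in per asset, e.g. if the faculty member agrees.
    if asset_settings.get("allow_transcript_download", False):
        features.append("transcript_download")
    return features


print(student_features({"allow_transcript_download": True}))
print(student_features({}))
```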

So here are proven results on this. First of all, we questioned it. We questioned the reliability of how well these transcription files were coming across. So we did an audit, kind of, on what we thought were ten of the most difficult audio or video files.

And we actually had a student go down and look through word by word as to how it all lined up, and we didn’t see a single issue. So we feel very, very confident that it’s a very reliable solution. We’ve had 80 hours’ worth of video already put in, about a two to three day turnaround. It’s really nice that all of a sudden, one day, transcripts and caption files just show up.

It’s cost us $14,000 over the last two years to do that. And as I was saying, instead of looking at the per-minute rate– if I look at that, over two years, if I were to hire a graduate student to do this for about 20 hours a week, that’s about $13,000 a year. So if I start to look at a cost-benefit, I can have someone else or I could hire someone else to maybe make better themes or create new templates for our courses. So it’s giving me the opportunity to make other decisions.
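The comparison above works out as follows, using only the two figures quoted in the talk. One caveat: the talk does not say whether a graduate student at 20 hours a week could have handled the same volume, so this compares the quoted dollar amounts and nothing more.

```python
vendor_cost_two_years = 14_000       # quoted: outsourced transcription over two years
grad_student_cost_per_year = 13_000  # quoted: about 20 hours a week
years = 2

grad_student_cost_two_years = grad_student_cost_per_year * years
difference = grad_student_cost_two_years - vendor_cost_two_years

print(f"In-house estimate over {years} years: ${grad_student_cost_two_years:,}")
print(f"Outsourced over {years} years:        ${vendor_cost_two_years:,}")
print(f"Difference:                       ${difference:,}")
```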

We’ve automated the process through the APIs. Tole will get into some of the new features that they’re putting in. And now that we know that we’ve kind of synchronized between, and our assets are synchronized with theirs, we know we can pull these new features very quickly and easily as they come about.

You’ll notice we have almost 1,100 videos, 337 audio files. One of the decisions we’re making is that we’re just going to do all transcripts. And we’re going to go through our approval process, but we’re going to proactively do this and not reactively. We don’t want the student to come in and say, by the way, I need to get this done. We’re going to say, it’s automatically available for you.

And the other benefit of this is English as Second Language. We’ve had a lot of students come in and actually say, boy, thank you for providing that transcript. It really helped me get through this material. So there have been other academic benefits of this, and reaching multiple learning styles. So by doing a lecture or a video and giving them the alternate modes, it empowers the student a little bit more.

So our projected time savings– I asked our media specialist how long it took them per media file to do this. And he thought, in the whole scheme of all that process, that it was about 15 minutes. We feel that we will be saving about eight weeks of time in this automation. And for something that is as redundant as just monitoring something and downloading something, it seems to be a good decision to have our technologist work on something that took about a week to do, and will save an awful lot of time, and he can do what he does best. Yeah?

AUDIENCE: This $15,000, does that just include transcription, or the whole process?

DR. KEITH BAILEY: Transcription. It was the cost of getting those files transcribed.

AUDIENCE: I want to add another benefit to that, is that people who aren’t used [INAUDIBLE], you said the word “suite” earlier, and I thought you said “sleep.” And if your slide had not been up there with the word “suite,”– I kept thinking, what is sleep? But you finally put the word “suite” up there. So it may help your population more than you’re thinking it’s helping.

DR. KEITH BAILEY: No, absolutely. I can’t agree with you more on that. And then the other side is that the ability to translate, to do languages potentially, can happen along with this. So when you have a large, diverse population, you may have the ability to switch back and forth between languages eventually. Yeah?

AUDIENCE: Just to clarify the cost, you said it was for the transcription. Not the closed captioning and transcription?

DR. KEITH BAILEY: No. OK, so the rate actually gives you access to all of those file types. So we get to choose, and we’ve chosen to implement using like the JW Player, and we want very specific file formats. So we get to choose what that is, but the nice thing is that we get to choose from kind of a portfolio of options, if you will. And then we just chose certain ones.

AUDIENCE: Have you found any other implications from your transcription service, as in medical transcription?

DR. KEITH BAILEY: As in medical transcription? No. I mean, I haven’t. It’s not my area. But I could definitely see that.

AUDIENCE: You all were doing the transcription, or are you still outsourcing?

DR. KEITH BAILEY: We’re outsourcing. So we push it into the system, and then it gets outsourced, and then it comes back to the system.

AUDIENCE: So a 100% rate on transcription is absolutely phenomenal.


AUDIENCE: Transpose that to the medical world–

DR. KEITH BAILEY: Yeah, absolutely. I definitely would see that as a benefit. Another thing that we’ve done with this, is one of the focus groups I’m running at the University is to look at like the future of instructional content management for the University.

And we had a day-long retreat where we had breakout sessions, and we recorded everything. And I remember when I was doing my dissertation and I had to get things transcribed, the cost associated to that just blew my mind. And with this focus group, I was able to take those files, just upload them, and it was a day-long of four different groups talking, and I have the transcripts now, so we can do a real solid content analysis.

And it actually pinpoints people’s names, as to who was speaking in it. So if they introduce themselves– I don’t know how it happens, but it would say, like, Sherry spoke here. And then every time she spoke, it would put her name associated to that chunk of material. So I mean, that was a huge benefit. And that’s something that wasn’t even course-related, but it provided an efficiency in another area for us.

So– how much time do we have left here? We have 11 minutes. OK, so I’m going to go through these very quickly so Tole can show some of the new features that hopefully, we’ll be adding into this system.

Media assets. So we have a whole bunch of things that are in play that don’t have transcription yet. So the question is, how do we go about managing this?

We’ve built kind of an audit system, if you will, to quantify all of the assets that are in that and look at exactly how we’re going to, what things we want to transcribe, and what ones we actually have transcriptions of. So we break it down by course. We look at the number of assets in there. If you see the [INAUDIBLE] the transcription over on the right-hand side, that will tell us A, the left number– or the Left 0, or the Left 24, whichever– is the number of files that have been classified as needing transcription, and the right side is how many actually exist.

So if we look at something like 23 of 6, that means we have 23 files that– oh, sorry. 23 files that exist, but 6 that have been classified as Needing. So there’s some cleanup that needs to go on, and we need to go through and quantify all of this. And this is more of a reactive mode since we’ve layered this new process into place.
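A per-course audit of this kind can be sketched as a simple tally. This is an illustration only: the field names (`course`, `needs_transcription`, `has_transcript`) are hypothetical, and the counts are labeled explicitly in the output so there is no doubt about which number is which.

```python
from collections import defaultdict


def audit_by_course(assets):
    """Count, per course, how many assets are flagged as needing
    transcription and how many already have transcription files."""
    report = defaultdict(lambda: {"assets": 0, "needs": 0, "exists": 0})
    for asset in assets:
        row = report[asset["course"]]
        row["assets"] += 1
        row["needs"] += int(asset["needs_transcription"])
        row["exists"] += int(asset["has_transcript"])
    return dict(report)


def needs_cleanup(row):
    """A course needs attention when the two counts disagree."""
    return row["needs"] != row["exists"]


# Made-up sample data for demonstration.
sample = [
    {"course": "THEA 105", "needs_transcription": True, "has_transcript": True},
    {"course": "THEA 105", "needs_transcription": False, "has_transcript": True},
    {"course": "LARCH 060", "needs_transcription": True, "has_transcript": False},
]
for course, row in audit_by_course(sample).items():
    print(course, row, "cleanup" if needs_cleanup(row) else "ok")
```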

So basically what someone would need to do is look at what files actually exist. They need to go in, make it show that it needs transcription, and then they classify that. And then you will see on the right-hand side that we have downloaded caption files, but we have not downloaded media files. But then they’re going in and approving them again, so either they can come back out of the system or they get approved for transcription. And then the other nice thing is that we can go ahead and just submit those for transcription, then, as needed.

Then the final step here is the downloading of caption and transcription files. For those files that already exist, that we have already had the caption files– because if you remember, the partially-automated approach, we downloaded a lot of these already. So now we want to synchronize and get all of the transcript files. So we’ve built a process in to do that, and then just automatically pull it down so we don’t have to go in and download them individually.

So basically, again, going through the approval process, we want to make sure that it needs transcription. We want to make sure that it’s approved by both individuals. But then we will see that we have the caption files and not the transcript files. We want to take that asset, and then in ELIMedia, if you see that number up at the top, 267, that is the node ID of the content management system. And then we go and find that same piece of media over in 3Play, and we associate it to the video ID.

So the next night, when we classify these as processing, when that cron job hits at midnight, it will look for those things that don’t exist, and it will pull down those things and synchronize everything for us. So we actually have a student working on all of this. The nice thing is we have Excel spreadsheets that we can download all of the assets on both sides, do searches, and then we have to synchronize those videos.
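The matching step, pairing each content management node ID with the vendor's video ID, could be partly automated along these lines. This is a sketch under assumptions: it matches on normalized titles, and the column names (`node_id`, `title`, `video_id`, `name`) and the video IDs are invented. Anything that does not match cleanly is left for a person to resolve.

```python
def normalize(title):
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())


def match_assets(local_rows, remote_rows):
    """Pair local node IDs with remote video IDs by title.

    Returns (mapping, unmatched): mapping is {node_id: video_id};
    unmatched lists node IDs that still need a manual lookup.
    """
    remote_by_title = {normalize(r["name"]): r["video_id"] for r in remote_rows}
    mapping, unmatched = {}, []
    for row in local_rows:
        video_id = remote_by_title.get(normalize(row["title"]))
        if video_id is None:
            unmatched.append(row["node_id"])
        else:
            mapping[row["node_id"]] = video_id
    return mapping, unmatched


# Rows as they might look after exporting both spreadsheets (made-up data).
local = [
    {"node_id": 267, "title": "Lesson 3  Overview"},
    {"node_id": 268, "title": "Lesson 4 Interview"},
]
remote = [{"video_id": "vid-9001", "name": "lesson 3 overview"}]
print(match_assets(local, remote))
```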

The benefit to that is going to be in the future as new tools come out, and we know that those assets are synchronized, and new choices come in, and new options come in of download format– we’ll be able to click a button and synchronize all that and just have it all come down and implemented to our students. So we feel that we’re going to have more flexibility in our options as time progresses.

Anticipated benefits of this– we’re building more processes behind this to ensure quality, to streamline efficiency. We’re hoping that this influences practice at Penn State as a whole. The original title of this was The Penn State Solution. It’s actually a Penn State solution.

We see that there are benefits, and we are working with the Office of Disability Services and other units to help do this. Student empowerment, giving options. I mean, I think at the end of the day, we want to give more learning options for students, and we want to give them the material that they need as they need it. And the benefits, well outside of accessibility, are going to be there for the students. So there are interactive transcripts that will give power, and then some of the stuff that Tole will show here actually will blow your mind, as to what kind of empowerment you’re giving students in the learning environment.

So with that said, I will pass it over to Tole and let him wrap up. And then we can talk more.


TOLE KHESIN: Thanks, Keith. Is that mic coming through OK? OK, very good. So my name is Tole Khesin, also with 3Play Media. And I just want to start out by clarifying the process and what we mean when we’re talking about automation and what happens on the back-end.

So when Penn State pushes content media files to us, what actually happens is first, we’ll take that video file and we’ll put it through speech recognition, which gets it to a point where it’s about 70% accurate. But then subsequently, we have professional transcriptionists that will go through and clean up the mistakes left behind by the computer. And we’ve built a whole platform that makes that process very efficient for them to do.

And actually subsequently, we’ll have a QA person who will go through and research difficult words, ensure the punctuation is correct. So at the end of the day, it’s pretty much a flawless transcript that’s time-synchronized word for word. So the core document that we create is basically a word, time code, word, time code, word, time code.
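To make the "word, time code" idea concrete, here is a minimal sketch that turns a word-level timed transcript into a SubRip (.srt) captions file, one common captions format. It is not 3Play Media's implementation. It assumes each word carries both a start and an end time in seconds, and it breaks cues purely on a character limit (`max_chars`, an arbitrary choice for the example).

```python
def fmt(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words, max_chars=32):
    """words: list of (text, start_seconds, end_seconds) tuples, in order."""
    cues, current = [], []
    for word in words:
        candidate = " ".join(w[0] for w in current + [word])
        if current and len(candidate) > max_chars:
            cues.append(current)  # close the cue before it gets too long
            current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = []
    for number, cue in enumerate(cues, start=1):
        text = " ".join(w[0] for w in cue)
        # A cue runs from its first word's start to its last word's end.
        blocks.append(f"{number}\n{fmt(cue[0][1])} --> {fmt(cue[-1][2])}\n{text}\n")
    return "\n".join(blocks)


timed_words = [("So", 0.0, 0.2), ("captions", 0.2, 0.8), ("are", 0.8, 1.0), ("text", 1.0, 1.5)]
print(words_to_srt(timed_words, max_chars=12))
```

Because every word keeps its own time code, other outputs (plain transcripts, other caption formats, interactive transcripts) can be derived from the same source without re-transcribing anything.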

And from that, we will produce a variety of different derivative outputs. The most common one is a captions file, and we produce that in a variety of different formats that can be published with video players. And Keith talked a lot about the solution that they’ve built. And I just want to show you some other captioning implementations, a lot of which involve automated workflows.

This is the first one. This is MIT OpenCourseWare. This is actually our first customer five years ago. And it’s actually a pretty straightforward implementation that just involves closed captions. These are actually being published on a JW Player that’s streaming YouTube content.

And to make things easier, we’ve actually integrated with a number of lecture capture systems. This is an example with Georgia Tech. So they’re using a lecture capture system called Tegrity. And the captioning workflow is very much automated here, as well. Basically, instructors can select which lectures they want to have captioned, and then they automatically come to us. We process them and then post the captions back to Tegrity, and then they just show up right here underneath the–

A similar set-up– this is Colorado State University, and they’re using a lecture capture system called Mediasite, which Mediasite is actually, I believe, what’s being used to record this lecture here right now. And it’s the same kind of automated captioning workflow. You just select which presentations you want to have captioned. It automatically comes to us. We produce the captions and post them back, and they just show up right here.

And this is with University of Florida. And they are using Kaltura, which is another video platform. We have a fully-automated captioning workflow set up with Kaltura as well. But this is interesting.

So here, they are using the captions, but they’re also using– on the right-hand side, you can see an interactive transcript. And this is an embeddable JavaScript plug-in that supports HTML5 and will automatically communicate with pretty much any type of video player that you happen to be using. And what it does is it sort of extends the capabilities beyond just captioning.

And there’s no additional cost for this, because the timed– we’re really just leveraging the timed text that’s already been created. And so what it does, here– let me just play this– it highlights words as they’re spoken. I can search through the video here using the timed text. I can click on any word to jump to that exact point in the video.
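The three behaviors of an interactive transcript (highlight the current word, search, click to jump) all reduce to lookups on the timed text. The sketch below shows the logic only, with hypothetical names; the actual plug-in is JavaScript and its internals are not described in the talk.

```python
import bisect
import string


class InteractiveTranscript:
    """words: list of (text, start_seconds) pairs."""

    def __init__(self, words):
        self.words = sorted(words, key=lambda w: w[1])
        self._starts = [start for _, start in self.words]

    def word_at(self, playback_time):
        """Word to highlight at the given playback time, or None before the first word."""
        i = bisect.bisect_right(self._starts, playback_time) - 1
        return self.words[i][0] if i >= 0 else None

    def search(self, term):
        """Start times of every occurrence of `term`, ignoring case and punctuation."""
        term = term.lower()
        return [
            start
            for text, start in self.words
            if text.lower().strip(string.punctuation) == term
        ]

    def seek_time_for(self, word_index):
        """Clicking the n-th word means seeking the player to its start time."""
        return self.words[word_index][1]


transcript = InteractiveTranscript([("Captions", 0.0), ("are", 0.6), ("text.", 0.9)])
print(transcript.word_at(0.7))
print(transcript.search("text"))
print(transcript.seek_time_for(2))
```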

Another example here, this is, actually, I’ve got a Penn State site. This is another plug-in that we make here. It’s a captions plug-in below here. And what makes this really powerful is that this is an embeddable plug-in that’s easy to put on any web page. It works with most video players.

And it even works with video players that don’t support captions. Josh mentioned this earlier– this is a Vimeo player, here. There’s no way to get captions on Vimeo. It just natively does not support it. But by adding this plug-in on the web page, it’s very easy to add captions on there.

And they’re even searchable, so I can, for example, search for a word. And it’ll show me where that word was spoken within that video, and I can click on any section and it’ll jump to that exact point in the video.

This is another example of an interactive transcript. This is EducationUSA. And this also shows you how you can easily switch between different languages. So we have a number of languages in here. I can switch to Arabic, for example, and it’ll immediately switch out. And it’s all interactive as well, so I can click on any phrase. It’ll jump to that point in the video. The internet here’s a little bit choppy.

This is sort of taking it one step further. This is an example of a site called MIT150 Infinite History Project. So here there are hundreds of interviews with MIT luminaries talking about a pretty broad range of topics, so hundreds of hours of content, and if I click on one of these films, here, it’ll pull up that video and it will start playing.

So below here, this is the interactive transcript that you’ve already seen. You can search through it, click on any word to jump to that point. On the right-hand side, this is another plug-in that you can install. And what it’ll do is it’ll let you search across the entire video library.

So for example, if I search for something like linguistics, it’ll show me where that word was spoken within all of these different interviews. And these orange segments show you that that’s a hit. So if I click on this, it’ll expand that section of transcript.

And there you go. The word “linguistics” was spoken. And so at this point, if I click here, it’ll switch out videos and jump to that exact point in the video. And again, I just want to clarify that this really doesn’t require any additional work, because we’ve already transcribed those, we’ve already created this source of data that’s feeding all of these plug-ins.
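Searching across a whole library follows from the same data: index every word of every transcript by video and time. The sketch below is one simple way to do that, an inverted index, offered as an assumption about the general technique rather than a description of how the plug-in works. The video IDs and sample words are made up.

```python
import string
from collections import defaultdict


def normalize(word):
    """Lowercase and strip surrounding punctuation."""
    return word.lower().strip(string.punctuation)


def build_index(library):
    """library: {video_id: [(text, start_seconds), ...]} -> {word: [(video_id, start), ...]}"""
    index = defaultdict(list)
    for video_id, words in library.items():
        for text, start in words:
            index[normalize(text)].append((video_id, start))
    return index


def search_library(index, term):
    """Return {video_id: [start_seconds, ...]} for every video where `term` was spoken."""
    hits = defaultdict(list)
    for video_id, start in index.get(normalize(term), []):
        hits[video_id].append(start)
    return dict(hits)


library = {
    "interview_a": [("Linguistics", 12.0), ("matters", 12.8)],
    "interview_b": [("about", 3.0), ("linguistics.", 3.4), ("linguistics", 40.0)],
}
index = build_index(library)
print(search_library(index, "linguistics"))
```

Each hit carries a video ID and a start time, which is all a player needs to switch videos and jump to that point.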


TOLE KHESIN: Yes, these are all 3Play Media plug-ins. Yeah. And they’re actually very easy to install. It’s just a few lines of code on a web page, and they automatically communicate with the video player.

And then the last demo, I want to go back to MIT OpenCourseWare. Because they were our first customer, we bounced a lot of ideas off them. And this is sort of the– this hasn’t been released, yet, but this is actually what we’re working on for the new MIT OpenCourseWare.

And what this allows you to do here– so this is the interactive transcript on the right that you’ve already seen. And so it works the same way, except that on the left side, this is a gallery plug-in. So what it’ll do is actually highlight the sections of this video where the word “lisp” was spoken, that I just searched for.

And the neat thing here is that what you can do, as a student or as a professor, is I can add these as favorites. So let’s say I add these sections as favorites, these three. These will actually get stored in a cookie on my local computer, so that when I come back tomorrow, they’ll already show up, and I can jump to that video and play this section of it.

And what this allows you to do, as a student, is to bookmark parts of the video, but not just within one video. You can actually create these video bookmarks across an entire course, or even beyond. If I click on the keywords, it’ll actually show me what the most prominent words were that were spoken within this video and where they appeared in the timeline. I can click on any point to jump to that exact point.

And then some other neat features here. So what I can do here is– let’s say this is an hour-long lecture, but there’s a really important 30-second nugget halfway through that lecture that I want to be able to share. What I can do is I can just highlight, for example, this paragraph, and then I click on the Share button.

And what that lets me do is it’ll create a unique URL which I can share through Twitter, Facebook, or just through email. And when other people click back through on that URL, it’ll bring them back to the page and play just the section of video that I highlighted. So it’s another way of clipping and reusing the content and the timed text that’s already been created.
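A shareable clip link only needs to carry the start and end times of the highlighted passage. The following is a hedged sketch of that idea: the `start` and `end` query parameter names and the example URL are invented, and the talk does not describe the real URL format.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def make_share_url(page_url, start, end):
    """Build a link that encodes which section of the video to play (times in seconds)."""
    query = urlencode({"start": f"{start:.1f}", "end": f"{end:.1f}"})
    return f"{page_url}?{query}"


def read_share_url(url):
    """Recover (start, end) in seconds from a link made by make_share_url."""
    params = parse_qs(urlparse(url).query)
    return float(params["start"][0]), float(params["end"][0])


# A 30-second passage roughly halfway through an hour-long lecture.
link = make_share_url("https://example.edu/lecture", 1805.0, 1835.0)
print(link)
print(read_share_url(link))
```

The start and end values come straight from the time codes of the first and last highlighted words, so no new video file has to be cut.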

So with that, I want to thank you all for attending. That wraps things up.



DR. KEITH BAILEY: Are there any questions? Yeah?

AUDIENCE: Is the cost associated with just [INAUDIBLE]?

DR. KEITH BAILEY: You transcribe, and you get all of it for the same cost. I mean, there is no breakdown of cost.

AUDIENCE: I mean, earlier, you were saying about closed captioning, that there’s a cost associated with transcription, so we had to [INAUDIBLE].

DR. KEITH BAILEY: Oh, no, no. That was our process, that we were only pulling the closed captioning, even though the transcript was available. So it’s the same–

JOSH MILLER: Yeah, it’s all-inclusive.



AUDIENCE: How prepared were you for resistance from faculty to have a potential off-color comment [INAUDIBLE] to print? Or did that happen? I would be worried about that.

DR. KEITH BAILEY: Well, yeah. That is definitely a concern. I mean, we have decisions that we have to work through.

One of the ones that recently came up, which I found was interesting, was a student asked for the transcript of everything of a lecture. And the faculty member actually came in and said, no, I can’t do that, because I’m looking at publishing this text as a book, and I can’t give up my transcript as a result. My take on that is that it’s not a book. It’s a word-for-word of what you spoke, and that we’re not giving away, really, intellectual property.

The flipside is that I’m willing to hand over all those transcripts to her and say, here’s a nice foundation. Build off of it for your book. So I mean, but those are things– I think as we do this, there’s going to be a lot of unintended questions that are going to come up as to how we use it and what we do.

Now we do have a copyright agreement that our faculty sign, and that says that whatever is produced as a package of that course is the University’s to do what they want with that. It’s part of their academic work for the University. So the question is, do we have to ask them to do that? Can we just do that?

Ethically, we want to talk to them and we want to work through this with them. And I think that’s the goal here, is to figure out the best approaches. So one size does not fit all. Yeah?

AUDIENCE: Just going off that question about something that’s in there that you don’t want to be there, can you go back and edit after all this has been done, to remove video and transcription?

DR. KEITH BAILEY: Sure. They have a great editing tool that’s live and dynamic, that you can go in and do that. Now if we take pieces of the video out, we may have to do things a little differently. But to edit some of the text itself, yeah.

The other piece that’s interesting, going back to what we do release– if we take a clippet of a movie, and under TEACH Act, we have the ability to clip a movie, circumvent copyright, and put it into place only for those students that are supposed to see it for the duration that they see it, we can’t give up the transcript on that. Because actually, we’re giving out the script of the movie. So we have to hold that back to the student– only that student can get that thing for that point in time. Right?

So now we have to start making decisions. And that’s where roles and rights associated to the system have to come into play, that we have all of these features at our disposal. The question is, how do we implement them, and how do we negotiate that with how we implement it with our faculty? Yeah.

AUDIENCE: What are you doing with YouTube videos?

DR. KEITH BAILEY: We have our own server, so for our stuff, we put it all through there. Now if we open up stuff in Open Education, we’ll push things into Vimeo or YouTube or something like that.

AUDIENCE: I mean, like, an instructor wanted to use YouTube videos [INAUDIBLE]?

DR. KEITH BAILEY: Oh, they need to be captioned? Actually, we haven’t really run into that yet. Yeah, no, I mean, again, that’s going to be one of those, how do we deal with that when we run into it? And I do speak an awful lot with our legal counsel to determine best approaches and what we feel they can protect if someone were to come at us. Yeah?

AUDIENCE: Historically– and I’ve been [INAUDIBLE] for 35 years– if a faculty member develops something, it won’t [INAUDIBLE]. And if faculty [INAUDIBLE] take it with them if they want to go to another job, but they also had to leave it for the continuing people. And part of the [INAUDIBLE] system, we had a [INAUDIBLE]. Does that also extend to things you develop [INAUDIBLE]. WU has not come up with an answer for that, yet. What do you all say?

DR. KEITH BAILEY: All right, so we have a copyright policy in place. The intellectual property of the faculty member is the intellectual property of the faculty member. No ifs, ands, or buts about it.

When university resources are being used to produce something, that’s when it starts to become a little grayer, right? So the Institute gets involved, helps produce videos, does transcription, creates a theme, things like that– now it’s a packaged material. It’s something that they didn’t fully produce themselves. That becomes the package of the University.

Technically, the way the policy stands right now, they can’t take that to another university. It’s a non-compete clause, really, at the end of the day. You can’t take the whole package and take it with you.

You take your intellectual property with you. Like the transcript– if they download the transcript of that and they take that with them? That’s theirs. That’s perfectly fine. They go publish off of it. That’s what they’re supposed to do as an academic within the institution.

Now my real answer to this, probably, is go to open education. Create it as an open course. It doesn’t matter what university you’re at anymore. You can teach it, right? Use the resources of the university to create an open package that you can transfer to any other university as you see fit. That’s another answer to it. Yeah?

AUDIENCE: The cost structure in the 3Play–

DR. KEITH BAILEY: What was that?

AUDIENCE: What’s the cost structure for 3Play Media?

DR. KEITH BAILEY: It’s– yeah, we’ve got to close out here so the next people can come in. It’s per-minute.

JOSH MILLER: We’ll get around to it.

TOLE KHESIN: Yeah, we can talk about that offline.

DR. KEITH BAILEY: Well, thank you very much, everyone. We’ll be around if you want to ask questions.

TOLE KHESIN: Thank you.