[Webinar Transcript] Closed Captioning Best Practices and Legal Requirements for Digital Delivery of TV & Film
TOLE KHESIN: Welcome, and thank you for attending this webinar on closed captioning best practices and legal requirements. My name is Tole Khesin, and I will be the moderator. We have four speakers that will be presenting for about 45 minutes. And we’ll leave the rest of the time for your questions.
The best way to ask questions is by typing them in the questions window in the bottom right corner of your control panel. We’ll keep track of them and address them all at the end. A captioned version will be posted tomorrow. And you will receive an email with a link once that’s ready.
So a little bit about the speakers that are going to be joining us today– Sean Bersell is the vice president of public affairs at Entertainment Merchants Association, which represents retailers and distributors of home entertainment, including internet-based movies and games. Dae Kim is a video engineer at Netflix, where he leads the timed text effort on the encoding and operations side. Josh Miller is a co-founder of 3Play Media, which specializes in captioning and subtitling services. And Claudia Rocha is the operations manager at 3Play Media, where she manages a staff of over 700 transcriptionists.
So as for the agenda, we’ll begin with a basic captioning overview, just to get everybody on the same page in terms of terminology and some other basics. We’ll do a legal update on some of the recent and upcoming legislation that impacts captions. Then we’ll hand things off to Sean Bersell from EMA, who will talk about best practices about caption certifications, formats, and frame rates.
Then Claudia from 3Play Media will discuss standards for transcription and captioning. And then Dae from Netflix will talk about best practices and industry trends about styling, onscreen placement, and some of the emerging standards. And we’ll leave the rest of the time for Q&A. So with that, I’ll hand things off to Josh.
JOSH MILLER: Great. So as Tole mentioned, we’re going to go through just a quick overview of the definition of closed captioning so that we’re on the same page. Captioning refers to the process of taking an audio track, transcribing it to text, and synchronizing that text with the media. Closed captions are typically located underneath a video or overlaid on top. And we’ll talk, obviously, more about that part in particular.
In addition to spoken words, captions convey all meaning and include sound effects. And this is a key difference from subtitles. Closed captions originated in the early 1980s with an FCC mandate that applied to broadcast television. And now that online video has become a pretty significant distribution channel, captioning laws and practices have proliferated there as well.
First, captioning versus transcription– a transcript is usually a text document without any time information. On the other hand, captions are time synchronized with the media. You can certainly make captions from a transcript by breaking the text up into smaller segments, which we would call caption frames, and then synchronizing those frames with the media. And the key point there is that each caption frame has to be displayed at the right time based on when the people are speaking.
Captions and subtitles are also different, and this is an important distinction. The difference between captions and subtitles is that subtitles are intended for viewers who do not have a hearing impairment. But they may not understand the language of what’s being spoken. So subtitles will capture the spoken content but not the sound effects the way captions would. For web video, it’s certainly possible to create multilingual subtitles as well.
The difference between closed and open captions is that closed captions can be turned on or off by the viewer. The viewer has full control on their own. Whereas open captions are burned into the video and cannot be turned off at all. Most digital media players at this point can support closed captions, and that’s certainly the preferable method.
And then post production versus real time– post production means that the captioning process occurs offline. It may take a day or several days to complete, whereas the real-time captioning is what you see on the news and sporting events, and it’s done by live captioners. And certainly there are advantages or disadvantages of either process depending on the type of content you’re working with.
There are many different caption formats. We’ll talk about a few of them specifically. Sean will get into those details. And a lot of it depends on the workflow or the media player that you’re working with. The call-outs on the right give a quick glimpse as to how different caption files can be built. An SRT file is a very basic web format that just has a start time and an end time for the caption frame, and then the text that goes within that frame. The time codes in this case are based on the media file run time.
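As a rough illustration of the SRT structure described here (a start time, an end time, and the text for each caption frame, with times measured from the start of the media file), the following Python sketch parses a small SRT sample into caption frames. The sample text and the names `Cue`, `to_seconds`, and `parse_srt` are illustrative assumptions, not part of any tool mentioned in the webinar.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int
    start: float  # seconds from the start of the media file
    end: float
    text: str


# SRT timestamps look like 00:00:03,500 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp: str) -> float:
    match = TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(data: str) -> list[Cue]:
    cues = []
    # Caption frames are separated by blank lines.
    for block in re.split(r"\n\s*\n", data.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip incomplete blocks in this simplified sketch
        start, end = lines[1].split("-->")
        cues.append(Cue(int(lines[0]), to_seconds(start), to_seconds(end),
                        "\n".join(lines[2:])))
    return cues


SAMPLE = """1
00:00:01,000 --> 00:00:03,500
[MUSIC PLAYING]

2
00:00:03,500 --> 00:00:06,250
Welcome, and thank you
for attending this webinar.
"""

cues = parse_srt(SAMPLE)
for cue in cues:
    print(cue.index, cue.start, cue.end, cue.text.replace("\n", " / "))
```

Note that nothing in the cue carries placement or styling, which is the limitation of SRT mentioned above.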
And then you see an SCC file, where the time codes are based on the SMPTE time codes. So there are definitely some differences there. And also you see that the text is based on a hexadecimal coding standard, as opposed to plain text. An SCC file will also support additional styling parameters, including caption frame placement, which an SRT file cannot support. And there are different gradients of what kind of styling different files can support depending on the particular caption file.
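To make the hexadecimal point concrete, here is a minimal Python sketch that pulls the readable text out of one SCC data line. It assumes the common SCC layout of a time code, a tab, and then four-digit hex words, where each byte carries a parity bit in its top position. It treats the remaining seven bits as plain ASCII and simply skips control-code pairs, so it ignores the handful of CEA-608 characters that differ from ASCII as well as all placement and styling commands. The function name `decode_scc_words` and the sample line are illustrative assumptions.

```python
def decode_scc_words(words: str) -> str:
    """Return the plain text carried by a run of SCC hex words (simplified)."""
    chars = []
    for word in words.split():
        # Each word is two bytes; mask off the top (parity) bit of each.
        first = int(word[:2], 16) & 0x7F
        second = int(word[2:], 16) & 0x7F
        if 0x10 <= first <= 0x1F:
            # Control-code pair (display commands, positioning, styling).
            continue
        for byte in (first, second):
            if byte >= 0x20:  # values below 0x20 are padding, not text
                chars.append(chr(byte))
    return "".join(chars)


# One hypothetical SCC line: time code, tab, then the hex payload.
line = "00:00:01:00\t9420 9420 c8e5 ecec ef80 942f 942f"
timecode, payload = line.split("\t")
text = decode_scc_words(payload)
print(timecode, text)  # prints: 00:00:01:00 Hello
```

The control-code pairs that this sketch discards are where the placement and styling information lives, which is why a conversion to a plain-text format such as SRT loses it.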
There are a number of benefits beyond the obvious need for providing people who have hearing impairments with the ability to follow the content. We think a lot about some of the web benefits with the business we’re in, in terms of search and navigation. But really, some of the even more powerful benefits are the fact that you can allow people who maybe speak English as a second language to improve comprehension and remove language barriers, just by having the text there. Captions can compensate for poor audio quality or a noisy background. Certainly if you’re in an airport, you see the TVs always have the captions on, because it wouldn’t be realistic to hear what’s going on.
One of my favorite stories is someone saying, when they found out that we do captioning, they got all excited because they had just had a baby. And he and his wife now watch all their television with captions on so the baby can sleep and not get disrupted. So you never really know what might be a really key benefit that you hadn’t thought about. But captions can really benefit a lot of people.
From a legal perspective, this is something that’s actually been talked about quite a bit. The 21st Century Communications and Video Accessibility Act, which is often referred to as the CVAA, is the law that’s been getting quite a bit of attention. And that was signed into law in October of 2010. So the CVAA expands closed caption requirements for all online video that previously aired on television. As of right now, full-length titles that aired on television with captions do need to have captions online.
But the law does not yet apply to clips, which we’ll talk about. There are also a number of exemptions for the online requirement that we’ll talk about as well. But for the most part, if content aired on television with captions, it’s supposed to have captions when shown online as well.
The Americans with Disabilities Act, the ADA, covers federal, state, and local jurisdictions. It applies to a range of domains. The two that are most applicable to the online video space are public entities and commercial entities, including places of public accommodation. So the ADA is the law cited in two recent lawsuits regarding the lack of captioning. One was against Netflix, and one, which is still ongoing, is against Time Warner, specifically CNN. The case law suggests that large video repositories that are readily accessed by the public can be considered places of public accommodation and therefore must be accessible to everyone.
These are obviously digital environments. They’re not physical environments. So it’s certainly not necessarily the most intuitive interpretation. But it’s really important, because there certainly could be further implications for other video sites and enterprises as online video becomes such a predominant method of delivering content.
So the following is a timeline for content owners to implement processes to adhere to the new CVAA captioning rules. The milestone that just passed made it such that any content that had aired on television with captions must also have captions online within 45 days of being posted online. So that lag time is actually scheduled to shrink over the next couple of years, meaning the content that gets posted online will have to have captions added pretty quickly in order to be compliant.
The decision on clips, as I mentioned, has been deferred. Clips are defined as smaller segments of content that either come from a larger show or film or went straight to the web. There are a number of interest groups debating the captioning requirements of clips. And so the timeline for when clips will be required to have captions is still up in the air. One thing to note is that user-generated content is not included in this law at all.
The FCC released a new ruling on caption quality for video programming in February of this year. The CVAA text states that online video that previously aired on television must have captioning of at least the same quality as when programs are shown on television. So this means that television captions are setting the baseline for what’s acceptable. For the first part of just text accuracy, this means the captions must match the spoken words in the dialogue in their original language, which is, in this law, English or Spanish. And they must not be paraphrased. That’s something that’s important.
Accurate captions must convey tone and intent, and nonverbal information, such as sound effects, audience reactions, and music, must be conveyed through the captions to be considered accurate. Synchronicity is very important as well. The captions must coincide with what is being spoken as much as possible, and they need to show up at a speed that can be read by viewers. For content being edited for rebroadcast, captions would also have to be edited for accurate synchronization.
And one thing to note is that captions should cover the entirety of a program or film. There have been complaints through the FCC where captions have actually fallen off, or just ended part way through a show.
The FCC states that captions should not block important visual content on the screen or other information that is essential to understanding a program’s content when the closed captioning is on. For example, a news segment might feature an interview that displays a graphic with the name of the person speaking, and therefore the caption should be moved to a location on the screen that does not obscure that graphic.
The FCC mandate on user control started on January 1 of this year. This part of the law focuses on the media player functionality, which means any video programming distributors and media player providers must comply with this part. With the increased functionality, the viewer should be able to control the look and feel of their captions, including the font type and size, color, and a number of other aspects as well. They should even be able to add shadowing or make the text raised. And this, actually, makes it a pretty cool functionality for viewers who might have partial vision impairments, so that they can design the optimal reading experience to follow along.
YouTube was very early to release this type of functionality. And so if anyone’s interested in seeing what this looks like in action, YouTube is a great place to go. And just look for a video with captions and you’ll be able to see this working very well.
With all the new requirements, it actually does make sense to consider some exemptions. There might be some cases where it’s unfair to force people to add captions to their content. There are a number of reasons why that might be the case. A number of them are financial. So it might not make sense to force a company that’s still new to airing content or programming content, or one whose revenues are under $3 million per year. It’s this idea of a financial hardship, or economic hardship.
And until recently, churches and religious broadcasters were exempt from the more general closed captioning requirements. But that exemption was actually rescinded by the FCC in 2011. So that means that all these faith-based organizations who were previously exempt, just by being a faith-based organization, are now subject to the same requirements and the same exemption criteria you see here as everyone else. So they’re considered just another programmer.
So I’m going to turn it back over to Tole.
TOLE KHESIN: Great. So I’m actually going to hand things off to Sean Bersell from the EMA, Entertainment Merchants Association, who will further discuss best practices.
SEAN BERSELL: OK. Thank you Tole. I’m going to talk about the best practices developed by EMA’s Closed Captioning Working Group. The Closed Captioning Working Group was developed to create a better understanding of closed captioning among our members, and develop appropriate best practices for compliance with the legal requirements that Josh just talked about, and to identify other best practices for video programming distributors.
And as Josh mentioned, there are a number of legal requirements. And we also were confronted with the fact that people were using a lot of different protocols for file formats and how to handle the legal requirements in captioning. And there are some difficulties in converting captions that have been developed for television over into captions for internet-delivered video. And also, we wanted to promote greater consistency in the development of captions.
So we put together a working group consisting of Amazon, Best Buy, CinemaNow, Google, Microsoft, Netflix, Rovi, Vudu, and with help from MovieLabs. And they developed a best practices document that really addresses three things. One, the certification– if somebody doesn’t include closed captions with the video file, why they didn’t, for example if they fall in one of the exemptions that Josh talked about.
The second area where we’ve been developing best practices is file formats. And then the third is frame rate, and how to convey the information about the frame rate for the captioning files. I should mention that these best practices have not yet been finalized. We’re still talking about the file format.
So the first issue is certification. And as Josh mentioned, the video programming owner has to provide a closed captioning file to the programming distributor. And they have to, under law, establish a mechanism for communicating whether a particular video requires captioning. And that distributor must make a “good faith effort” to identify the programming, using that mechanism.
And the distributor may rely on the certification from the owner that a video is not subject to captioning if the content provider provides a clear and concise explanation of why captioning is not required. So our best practice for this is, if the content provider does not provide a closed caption file, that video programming owner should include a certification in the avails and the metadata that closed captions are not required and why.
I should explain that avails are something particular to the video world, and it’s content availability metadata. It is the set of metadata that describes the attributes of a product, when it’s available online, how long it’s available online, et cetera. There’s a couple dozen fields in that. And that’s to say it’s a subset of the larger video metadata.
We have identified six reasons, and they’re on your screen, that should be in the metadata or the avails. The first is that it was never broadcast on television. The second is that it only aired on TV without captions. The third one is that it hasn’t appeared with captions since the law kicked in and became applicable.
The fourth one, it’s not full-length programming. The fifth one is kind of a catch-all, that for whatever reason it doesn’t fall within the category that requires captioning. For instance, it’s user-generated content. And then the sixth one is it falls in one of those many exemptions that Josh talked about. So that’s the process that we’ve developed for communicating if captions are not provided, why they’re not provided, so that people are compliant with the law.
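One way to picture this certification is as a small field in the avails metadata. The Python sketch below is only an illustration: the enum `NoCaptionReason`, the record keys `caption_file` and `no_caption_reason`, and the function `certification_problem` are all hypothetical names, and the reason wording paraphrases the six reasons listed here rather than quoting the EMA document.

```python
from enum import Enum
from typing import Optional


class NoCaptionReason(Enum):
    # Paraphrases of the six reasons described above (hypothetical labels).
    NEVER_ON_TV = "never broadcast on television"
    TV_WITHOUT_CAPTIONS = "aired on television only without captions"
    NOT_CAPTIONED_SINCE_RULE = "not aired with captions since the rule applied"
    NOT_FULL_LENGTH = "not full-length programming"
    NOT_COVERED = "outside the covered category, e.g. user-generated content"
    EXEMPT = "covered by one of the exemptions"


def certification_problem(record: dict) -> Optional[str]:
    """Return None if the record is acceptable, else a description of the gap."""
    if record.get("caption_file"):
        return None  # captions were delivered, so no certification is needed
    if isinstance(record.get("no_caption_reason"), NoCaptionReason):
        return None  # the owner certified why captions are not required
    return "no caption file and no certification reason supplied"


# Hypothetical avails records for three titles.
with_captions = {"title": "Feature A", "caption_file": "feature_a.scc"}
certified = {"title": "Web Short B",
             "no_caption_reason": NoCaptionReason.NEVER_ON_TV}
incomplete = {"title": "Feature C"}

for rec in (with_captions, certified, incomplete):
    print(rec["title"], "->", certification_problem(rec) or "ok")
```

A distributor relying on the owner's certification would treat the third record as incomplete and go back to the owner for either a caption file or a stated reason.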
The next issue is file formats. And you have to convert, as I mentioned, the captions that were developed for broadcast into captions that can be used for internet-delivered video. And then they require editing after that.
That can be difficult. It can be done either manually or by using software to extract and reformat the data. But even with software, it can be challenging, particularly when you’re dealing with a legacy format. And it’s desirable to have the file delivered in a format that’s relatively easy to use, and easy to extract and format. And you also want to, to the degree possible, preserve the original captioned presentation.
So the group has centered around two file formats. There’s not a preferred one. Some video programming distributors prefer one way, others prefer it another way. The first way is to deliver the closed caption format in SCC, which is Scenarist Closed Caption. That’s kind of the de facto standard for the conveyance of CEA-608 data.
We’ve said that’s acceptable. People can use it. It has a lot of flexibility. The only thing that we’ve said is make sure that you use a .SCC file extension so people know what it is.
The second option that has been identified is SMPTE Timed Text. And SMPTE is the Society of Motion Picture and Television Engineers. They have developed this format, which is called SMPTE Timed Text. And under the law there is a safe harbor if the content provider provides compliant captions in SMPTE Timed Text.
So content providers really like to do that. Because then they’re in compliance with the law. And what we’re finding, actually, is sometimes they will provide the SMPTE Timed Text file, and the programming distributors say, well, that’s nice, but I prefer it in SCC or some other format. And so they’re providing it in both formats so that they can be compliant with the law.
SMPTE Timed Text has a lot of options in it. And so the recommendation at this point is that SMPTE Timed Text must be constrained by a defined profile. And these profiles will have styling constraints, layout constraints, extension constraints, encoding constraints, structural constraints, and parameter constraints– a variety of different constraints in there. And what we’re finding is different retailers or video programming distributors will have different constraining requirements.
So what we’ve said is, that’s fine, as long as the video programming distributor communicates clearly how they want that, what their preferred profile is. We’ve also said you can use other features of SMPTE Timed Text beyond the ones that are in the constraints. And again, if you do anything different, make sure that you communicate that very clearly.
This is the open issue that we have for the best practices document. And we’re still working on coming to closure on the file format. But that’s where we have pretty much landed at this point.
The third issue is frame rate. Now, caption files will come over separately from the video data file. And ideally, the caption frame rate should match the native frame rate of the source. However, as Dae can explain, often they do not. And synchronization can be a problem.
Some video content distributors require that the file be synchronized when it’s delivered to them. So they say, don’t send it to us unless the captions and the video are already synched. Others will take unsynched caption files.
And so for those that do not require synchronization prior to delivery, what the group has said as a best practice is that the file can be submitted in any frame rate in which it was created, so long as the frame rate is clearly indicated. And for SCC files, the file name should indicate whether the file is drop-frame or non-drop-frame. And then there’s a protocol for the time code, depending on whether it’s drop-frame or non-drop-frame.
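The reason the drop-frame distinction matters can be shown with a little arithmetic. The Python sketch below assumes 29.97 fps video (30000/1001 frames per second) and the common convention that a drop-frame time code uses a semicolon before the frame field (`HH:MM:SS;FF`) while non-drop-frame uses a colon (`HH:MM:SS:FF`). That convention and the function names are assumptions for illustration and are not quoted from the EMA document.

```python
from fractions import Fraction

NOMINAL_FPS = 30                   # frame numbers count from 0 to 29
REAL_FPS = Fraction(30000, 1001)   # actual playback rate, about 29.97 fps


def timecode_to_frames(tc: str) -> int:
    """Convert an SMPTE time code to a frame count from 00:00:00:00."""
    drop_frame = ";" in tc
    hours, minutes, seconds, frames = (
        int(part) for part in tc.replace(";", ":").split(":")
    )
    total = ((hours * 60 + minutes) * 60 + seconds) * NOMINAL_FPS + frames
    if drop_frame:
        # Drop-frame skips frame numbers 00 and 01 at the start of every
        # minute, except minutes divisible by ten.
        total_minutes = hours * 60 + minutes
        total -= 2 * (total_minutes - total_minutes // 10)
    return total


def timecode_to_seconds(tc: str) -> float:
    """Convert an SMPTE time code to real elapsed seconds at 29.97 fps."""
    return float(timecode_to_frames(tc) / REAL_FPS)


print(timecode_to_frames("01:00:00;00"))   # drop-frame: 107892 frames
print(timecode_to_frames("01:00:00:00"))   # non-drop-frame: 108000 frames
print(timecode_to_seconds("01:00:00:00") - timecode_to_seconds("01:00:00;00"))
```

Under these assumptions, the same one-hour label differs by 108 frames, or about 3.6 seconds of real time, between the two interpretations, which is enough to put captions visibly out of sync if the file is read the wrong way.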
And those are the best practices that we’ve developed. As far as the certification and the frame rates, that’s finalized. And as I say, the open issue is the file format. So I’ll turn it back to Tole at this point.
TOLE KHESIN: Thanks, Sean. So now I’m going to hand things off to Claudia Rocha, who is going to talk about standards for transcription and captioning.
CLAUDIA ROCHA: Great. Thanks, Tole. So I’m going to talk a little bit about standards for what I’m calling transcription, and then afterwards, captions. And when I use “transcription” here, it’s more to describe the actual act of transcribing the audio, as compared to a transcription that would have no time stamps or anything like that.
So right now, there’s no sort of universally accepted style guide for how the audio should be transcribed. So we have standards that we use here and that seem to be applied throughout the captioning universe. I think the first one that’s very basic and that is applied everywhere is that spelling should be at least 99% accurate and that in the text, both uppercase and lowercase letters should be used. It’s easier to read, and it’s better for reading comprehension.
The other thing that we like to do is when multiple speakers are present, it’s good to show when there’s a speaker change if everyone’s onscreen, and if someone’s speaking offscreen, to identify them somehow with a label. So you might be using, when speakers change, a dash to indicate that there’s a new speaker talking, if everyone’s onscreen, or a double caret. And when they’re offscreen, you might have the name of the person actually speaking and then the text that they say, just to help with comprehension for the person who’s reading the captions.
And the other thing with speaker labels is that you want to make sure, if you are labeling someone who’s offscreen or on the telephone, that you’re not actually revealing a plot point. So if there’s a mysterious caller calling in on a horror movie, you certainly wouldn’t want to say the name of the murderer if at the beginning it’s just a mysterious voice. So you always want to be keeping the plot in mind as you are transcribing the audio content, to make it so that it’ll be most comprehensible for the person reading the captions without actually revealing plot points too early.
The other thing that you want to make sure you do, which distinguishes captions from subtitling, is that you want to capture non-speech sounds. So if there’s music playing, you want to put that in. If there’s laughter in the room, you want to be including that. Sounds like that we typically set off with square brackets to show that it’s a sound. And again, sound effects that are pertinent to the plot should be included. But you don’t need to capture every single background noise that’s happening in the film or show, as that can be distracting.
So an example is if someone’s sort of walking along the street, and you see them, and they happen to have their keys in their hand, and you hear the keys jingling, you certainly don’t need to put in that sound effect. But if someone’s in a room and you hear offscreen that keys are jingling to open a door, you want to include that because that’s part of the plot, and the person onscreen may then react to that. Another example is if someone’s in a busy bullpen in a newsroom, so there are telephones ringing everywhere, you don’t need to capture every telephone that rings in the background, unless it’s affecting the plot as it happens. So that’s something to keep in mind.
You also want to make sure that you’re using punctuation for maximum clarity. That’s pretty self-explanatory. But you don’t want to put in a descriptor. So if someone’s shouting, you don’t want to put, “SHOUTING, Hi.” Then that’s taking up a lot of caption frame. If you can convey that with punctuation, so just “Hi” with an exclamation point, then that’s much better. And that’s more clear for the person who is reading the captions and also trying to watch the program at the same time.
The other thing that you want to do is if someone is speaking with an accent throughout the whole film, you don’t necessarily want to be transcribing their speech phonetically. If someone has a thick accent, if you’re transcribing every word phonetically because it isn’t, quote unquote, “proper English,” that becomes very hard to read. So you still want to be adhering to normal spelling of words. If someone does happen to put on a fake accent for a scene or a line or two, you can denote that in parentheses to say that they’re speaking with a Scottish accent or something like that, to convey that it’s not the typical way that they speak.
The other thing that Josh mentioned earlier was that you want to be as close to verbatim as possible. So you want to be capturing as much of the content as you can that makes sense. So for a scripted show, you would be including every “um” that a person says, every stutter, every stammer, because it’s intentionally said. The movie’s been edited that way.
There’s a little more leeway for reality shows and documentaries that have interviews with people. Because people don’t speak proper English. And if they’re saying “um” every other word, and that’s in the captions, it becomes very hard to digest that content. But any scripted show definitely should be verbatim. And even any reality show or documentary show should be as close as possible to verbatim without capturing every single stammer and stutter that makes the captions difficult to read.
And then finally, some quick standards, actually, for how the captions should be presented. The font style should be a sans-serif, such as Helvetica medium. Each caption frame should hold one to three lines of text onscreen at a time. So it shouldn’t fill up the whole screen, because you still want to see the action that’s happening on the screen.
Each line should not exceed 32 characters. The minimum duration should be at least one second so the person has time to actually read the caption. And then if there’s extended sound effects, like there’s music playing, a full song is playing, you don’t want to keep up that whole caption on the screen the whole time. So you want that to drop off after four or five seconds.
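The numeric guidelines just mentioned (one to three lines per caption frame, at most 32 characters per line, at least one second on screen) are easy to check automatically. The Python sketch below is one hedged way to do that. The class `CaptionFrame` and the function `check_frame` are illustrative names, and the limits are taken directly from the figures quoted in this talk rather than from any formal specification.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MAX_LINES = 3               # one to three lines per caption frame
MAX_CHARS_PER_LINE = 32     # each line should not exceed 32 characters
MIN_DURATION_SECONDS = 1.0  # minimum time a frame stays on screen


@dataclass
class CaptionFrame:
    start: float  # seconds
    end: float    # seconds
    lines: list[str] = field(default_factory=list)


def check_frame(frame: CaptionFrame) -> list[str]:
    """Return a list of guideline violations for one caption frame."""
    problems = []
    if not 1 <= len(frame.lines) <= MAX_LINES:
        problems.append(f"{len(frame.lines)} lines (expected 1 to {MAX_LINES})")
    for number, line in enumerate(frame.lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(f"line {number} has {len(line)} characters")
    if frame.end - frame.start < MIN_DURATION_SECONDS:
        problems.append("shown for less than one second")
    return problems


good = CaptionFrame(0.0, 2.5, ["Welcome, and thank you", "for attending."])
bad = CaptionFrame(2.5, 3.0, ["This single line is far too long to fit."])

print(check_frame(good))
print(check_frame(bad))
```

A check like this only covers the mechanical limits. Judgments such as which sound effects matter to the plot still need a human transcriptionist.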
And then each caption frame should be replaced by another caption, unless there’s a long period of silence. So again, same thing with that music tag– if someone speaks and then there’s 15 seconds of silence, you don’t want their caption frames to hang out onscreen throughout that whole silence. It makes it look like they talked for longer than they did, and it’s unnecessary. All caption frames should be precisely time-synched to the audio, so they should appear when the person speaks. And when they start the next sentence, the new caption frame should come on, or the next part of that sentence.
And then finally, caption frames should be repositioned if they obscure onscreen text. So if there are burned-in subtitles, for instance, we have the ability to then move those caption frames that we’d normally have on the bottom up to the top of the screen to not obscure the onscreen burned-in subtitles. And that’s it. I’m going to pass it back to Tole.
TOLE KHESIN: Great. Thanks, Claudia. So now I will hand things off to Dae Kim from Netflix, who will talk about some additional best practices.
DAE KIM: Thanks, Tole. I would say personally I agree with Claudia on 3Play’s style choices, like brackets over parentheses, mixed case, transcribing slang. But there aren’t any conclusive studies saying one style choice is really better than another. So at Netflix we try to not focus too much on stylistic elements, and we’re really more worried about the technical quality of a file.
I’d say priority should be understanding the different file formats, how they relate to one another. In my experience with the major caption programs out there, no one program really does all file formats well. So one program could do SCC really well, but they don’t really support SMPTE Timed Text in a good way. Another program might support SMPTE Timed Text well, but not so well with EBU-STL.
So that technical understanding is where I think we really should focus first. In terms of new technology, like WebVTT and EBU-TT, they’ll be standardized soon. At Netflix, we will support those as delivery formats eventually. I’m sure Google, Amazon, and the others will support them as well.
The one thing we learned at Netflix is we want to get ahead of these emerging standards, so we’ve made an effort to work directly with the EBU group on EBU-TT, working directly with the editors of WebVTT, just so we can be more tightly aligned with the standards. So once again, that’ll be really important in terms of how the specific caption program you’re using supports those file formats.
Anyone that works internationally, there’s been a lot of noise in the European countries trying to implement their own version of the CVAA. So understanding their laws and how their file formats relate to the US formats, again, is going to be pretty important. And that’s it for me.
TOLE KHESIN: Great. Thanks, Dae. So we will begin the Q&A portion of this webinar in just a minute. We’re going to go offline for just a minute to aggregate the questions. And please feel free to type the questions into the lower right corner of the control panel. And also, on the screen you’ll see some resources that may be helpful.
There are some resources here about the recent FCC rules, the CVAA captioning requirements, and the EMA best practices on closed captioning. I can see that there was a question about the EMA draft specs. So those are available in that second link from the top, as well as the standard for the SMPTE Timed Text format. So we’ll be back in one minute.
OK, great. So we are back here. We have a bunch of questions to go through. So first question is to Dae. What standards do you pay attention to with respect to caption quality?
DAE KIM: So when it comes to quality, we focus more on is the transcription, like, correct? Is it the correct spelling, correct grammar, is it in sync? I think we’re in a unique place, because some of our captions are recycled from what was already created for broadcast, and some are created specifically for Netflix.
So for the ones that are created or recycled from broadcast, we don’t really judge the quality too harshly, other than hey, is it easy to read? Does it follow the video? Do we think this is what was actually broadcast on TV? As long as it meets technically our sync, I think we’re OK with that.
For captions created from scratch, we have a very minimalist style that I can share with you. But like I said, we try not to get too deep into stylistic elements. But it’s more the technical stuff, really, that we focus on.
TOLE KHESIN: Great. Thanks. There’s actually another question here for you. Another question for Dae. So who should be responsible for charges when a content provider wants to charge for the caption files?
DAE KIM: When a content provider wants to charge for a caption file. Is this a caption file already created? Is it a caption file that’s created from scratch? Is it one that’s archived at a different company? It depends, really.
TOLE KHESIN: Yes. I think I’d interpret that like if the captions need to be created.
DAE KIM: Well, I would say that’s based on whoever’s their negotiator, really. I can’t really speak too much on Netflix’s business practice, but I know in the past we’ve subsidized caption creation by saying, we’ll give you a little more on upfront license fees. But it’s really on you to find a caption company. You write the checks, and you get those captions created. I can’t say we always do that, but that’s been one approach that’s worked fairly well.
TOLE KHESIN: OK. Thank you. Another question, Dae, again for you. So since the FCC’s starting to look at requiring more CEA-708 support, what file format do you recommend for dealing with CEA-708 Digital Closed Captions as opposed to 608?
DAE KIM: I don’t know that the CEA is pushing for more. I don’t think that they’re actually saying, hey, we want captions created with 708 features. I don’t think they’re saying, hey, use all 708 features. I think they were just saying, hey, let’s find a baseline, and 708 should be our baseline. In terms of supporting 708, the only viable option now that’s an official standard is TTML. And I know WebVTT, once that’s standardized, will also support 708.
I personally have not seen any. And even for caption files created today or delivered for shows that were broadcast a few months ago, these are all still 608. So I really can’t speak to 708. I’ve not seen it myself.
And I’ve asked around, and no one’s really– none of the caption providers have been asked to create to 708 with those features. And no one’s really seen them in production. But I would say generically, TTML or WebVTT once that’s out.
TOLE KHESIN: Great. Thanks. So this question is maybe best for Josh. Videos made for corporate websites that never aired on television, is captioning required for that video content?
JOSH MILLER: So in the case where the content never aired on television, it basically means that the CVAA does not apply. So there’d be no concern from that perspective. In terms of other laws, such as the ADA, there are others, such as Section 508, it really kind of depends on the audience and the situation. So it’d be a little tough to comment on that part. But it definitely would not fall under any kind of CVAA requirements.
TOLE KHESIN: So there’s a question here for– I guess Dae, maybe you should take a shot at answering it. And Sean, maybe feel free to weigh in as well. So the question is, what do you recommend for the presentation of captions? Should we have a classic white text on black background, or other ways?
DAE KIM: I would say that’s, once again, back to more just personal preference. Everyone just assumes for a closed caption, that’s white text with black background. That was never a hard requirement, or that was never any specifically– it just sort of happened. Netflix personally, we were thinking about taking every captioned file and just defaulting it to a specific color, regardless of what the caption file source says.
And that’s something we’ll have to run through legal and run by our product teams. But I don’t think it really matters. Me personally, sure, white text on a transparent background looks the best, but I don’t know if there’s any one best practice way to do it.
JOSH MILLER: One thing to add to that, also, is that some of that falls under the user control requirements of the CVAA, in that technically there’s kind of a basic place to start, as Dae mentioned. But then, really, like he said, it’s kind of viewer preference at some point. And so the user control piece of the CVAA addresses that, in really giving the viewer the ability to kind of style it however they want.
TOLE KHESIN: The question here, I think Sean, maybe this one you’re best qualified to answer. So the question is, when captions are not available, do we need to convey those six exceptions to the subscribers?
SEAN BERSELL: Yes. If the caption file does not come with the video file, it should convey why they are not being conveyed, and one of those six exceptions there. So yes.
DAE KIM: But I guess to add to that, this is one of the operational challenges with the CVAA, where, so let’s say a caption’s delivered to Netflix, or a movie’s delivered to Netflix today. And the CEO says, you’re not getting any captions because this movie was never on US television, so it’s exempt. But if Netflix has it for three years, and let’s say two years from now, that same movie is on TV with captions, all of a sudden, you as the provider have to come back to us and say, look, here’s the new caption file. We’re going to update our [? rec ?] to you, and we’ve got to get it to you in a way.
So I think properly cataloging what’s available when, and when it falls out of the CVAA, falls into the CVAA, will be a huge challenge for each of the distributors. And it’d be nice if there was like one giant, universal database for this. I don’t think that’s ever going to happen– or not any time soon in our lifetimes.
So these are one of the challenges that the CVAA is kind of forcing on us. And I don’t know if it’s always as easy as we’re going to deal with it once on delivery, and then we’ll forget about it. Because things might change for you.
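To illustrate the cataloging problem described above, here is a minimal sketch of how a distributor might flag titles whose exemption status changes during a license window. Everything in it is an assumption made for illustration: the `TitleRecord` fields, the `needs_caption_followup` function, and the simplified rule that a title needs captions once it has aired on US television with captions. It is not how Netflix or any other distributor actually tracks this.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TitleRecord:
    """Hypothetical catalog entry for one licensed title."""
    title: str
    license_start: date
    license_end: date
    # Date the title first aired on US television with captions, if known.
    us_tv_caption_air_date: Optional[date] = None
    # Whether the provider has already delivered a caption file.
    has_caption_file: bool = False


def needs_caption_followup(record: TitleRecord, today: date) -> bool:
    """Return True when a title that was exempt at delivery now needs captions.

    Simplified assumption: captions are needed once the title has aired on
    US television with captions while it is still under license.
    """
    if record.has_caption_file:
        return False  # nothing to chase
    if not (record.license_start <= today <= record.license_end):
        return False  # not currently licensed
    aired = record.us_tv_caption_air_date
    return aired is not None and aired <= today


# A movie delivered without captions, then aired on TV two years later.
movie = TitleRecord("Example Movie", date(2013, 1, 1), date(2015, 12, 31))
print(needs_caption_followup(movie, date(2014, 6, 1)))
movie.us_tv_caption_air_date = date(2015, 1, 1)
print(needs_caption_followup(movie, date(2015, 6, 1)))
```

The point of the sketch is that the check has to be re-run over the life of the license, not just once at delivery, which is the operational burden being described.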
TOLE KHESIN: Great. Thank you. There’s a question here, how do you think this new law, assuming they’re referring to the CVAA, will affect online learning at the government level? And I think Josh, maybe you could take a shot at that one.
JOSH MILLER: Yeah. So again, the CVAA won’t apply to online learning if the content went straight to the web. And if the content had been on television, and then goes to the web, even if it’s for government learning, the same law would apply. If it is straight to the web, then there are other laws that would be applicable, such as the ADA, and Sections 504 or 508, which, depending on the type of content and the type of site, may apply. And those are laws that have to do with federal funding and where the content’s coming from. But the CVAA would not necessarily apply if it’s online learning content straight to the web.
TOLE KHESIN: OK, so there’s a question here, Sean and Dae. It’s a general question. So who is checking or finding out if captions really did air on TV or not?
SEAN BERSELL: This is Sean. My understanding is that would be the video programming owner, whoever has that– the owner of that content. It’s not on the video programming distributor to make that determination. The video programming owner should communicate that to the distributor. And under the law they have to have a mechanism for communicating and making sure that that is properly conveyed, and that the distributor can rely on what the programming owner tells them. But the onus to determine whether captions are required is on the video programming owner.
And the distributor can rely on a certification from the owner, whether captioning is required or not. But as a practical matter, there may be times when the distributor will want to do their own good faith inquiry to make sure that they’re compliant.
TOLE KHESIN: Great. Thanks, Sean. So there’s a question here for Claudia. If music goes on for a while and caption drops out, is it recommended that a “music stops” in brackets element be inserted if pertinent to the plot?
CLAUDIA ROCHA: Yeah. I mean, I think it really depends on why the music’s stopping. I think if there is a scene that’s transitioning from one scene to the next, and there’s some background music that comes to the foreground, you put the tag in. The scene changes, there’s no need to put in that the music stops.
Obviously if it’s a sudden, abrupt stop because something dramatic happens, or if there’s actually a scene in the movie where someone turns on the radio and starts listening to music, and then turns it off, you would want to caption that portion to show that the music stopped. But if it’s just background music that comes on, that you know because it’s a big, long stretch of background music, you don’t need to then say when it stops. You can assume it stops when someone starts speaking again.
TOLE KHESIN: Great. Thank you. There is a question here for Josh. “I have a small distribution company with revenues less than $3 million. Based on the closed captioning exemptions that you show, we are not required to create caption files for content that has never aired on TV. Is this accurate? Many digital platforms that we are working with say that caption files are required and may remove content that is not closed captioned.”
JOSH MILLER: So I’ll take a stab at this, and then Sean or Dae, you may want to– Dae especially may want to weigh in. So the law was kind of written in a vacuum. That’s something important to consider, that not every single situation is being evaluated here. So if you were to take the content that went on television and then put it on your site online, then technically you are correct. You would be exempt based on the revenue threshold.
However, based on historical events, most of the larger distributors and digital platforms are taking a stance that they need to have captions with all their content for a number of reasons, and that they’re actually susceptible to laws outside of the CVAA. So the exemption is specific to the CVAA. And so they’re dealing with some other restrictions as well that have to be addressed.
And that’s why that stance, as far as we understand, is being taken. So that’s something that you’d have to discuss with them directly. I don’t know, Dae or Sean, if you want to add anything to that.
DAE KIM: Yeah. I would say, depending on who you’re working with, you need to take into account that they might not be requiring captions specifically because of the CVAA, but just because of their own internal business practice. Like for Netflix, we were sued by the NAD, National Association of the Deaf. So we agreed that all of our content would be at least subtitled.
So even if a movie isn’t part of the CVAA, where it was never broadcast on US TV, we would still require captioning or subtitling be provided. So I don’t know, it might be because they just want it just because they want it for their service, and it might not be CVAA specific.
TOLE KHESIN: Great. Thank you. There’s a question here that’s probably best suited for Claudia. So it says, “I’ve seen in some documents that song lyrics need to be provided for captioning if pertinent to the plot. Are those handled differently?”
CLAUDIA ROCHA: So I think actually whether they’re required or not is still up for debate. I think there’s a difference between knowing that someone got the rights to use the performance and whether or not they got the rights for the written lyrics, to display those. So from our perspective, whether or not you even caption the lyrics, if someone onscreen is singing, a character is singing, we would caption those. But otherwise, we would actually notate just the name of the song, if known, and the artist performing, and not necessarily always capture those.
That being said, if we are captioning lyrics, we do tend to, rather than put the whole lyrics in square brackets, we would put a tag at the beginning to state that someone is– a tag in parenthesis to demonstrate that they are singing those words, rather than putting the text in brackets. We traditionally treat brackets as sound effects, whereas lyrics are still actually spoken.
DAE KIM: And I would say remember that sometimes when a movie’s broadcast on TV with a specific song, there might be some legal fallout with the artists. So when it’s then created for distribution, they’ll replace that song out with something else. So I’ve seen it personally myself where we had captions with song lyrics for a movie. And then when we play it back on our video, it’s a completely different song. So that’s something that you as the owner would probably have more control over.
TOLE KHESIN: So there is a question here, what is the difference between a clip and a segment? Dae or Sean, do you guys have a sense for what the threshold is, or what the definitions are?
SEAN BERSELL: This is Sean. Yeah, that’s a really good question. And in the regulations that accompany the CVAA, it does go into that in some detail.
It’s been a while since I’ve looked at that. But essentially, it’s whether this is self-contained, if you view it as having a beginning, middle, and end, rather than a clip, which would just be a portion of the overall program. But I would refer you to the regulations that went along with the law.
TOLE KHESIN: Thanks, Sean. So another question here, is there a requirement to have captioning available on mobile devices, such as an iPhone or an iPad? And I think Josh will take a shot at answering that.
JOSH MILLER: So in the January 1 milestone, one of the parameters was that a video playback device of any size should be designed to support accessible video, if technically feasible. So the answer is it should. Yes, it should be able to support captions.
Certainly there are more dependencies in that situation, in terms of the device manufacturer, the operating system on the mobile device. And so there are a lot more players involved in making sure that will work. But assuming the functionality is there, the captions should be provided to work that way.
TOLE KHESIN: Thanks, Josh. Great, so we are nearing the end of the hour. And unfortunately we weren’t able to get to all of the questions. But please feel free to reach out to us offline. I wanted to thank the speakers, and everyone in the audience. And as I mentioned before, a recording of this webinar with captions will be available tomorrow, and you’ll receive an email with a link to watch it. Thanks again, and hope you enjoy the rest of your day.