
Understanding Closed Captioning Standards and Guidelines [TRANSCRIPT]

LILY BOND: Welcome, everyone. And thank you for joining this webinar entitled Understanding Closed Captioning Standards and Guidelines.

I’m Lily Bond from 3Play Media, and I’ll be moderating today. And I’m thrilled to be joined by Jason Stark, who is the project director at the Described and Captioned Media Program, or DCMP, which sets forth guidelines for captioning and description to ensure the high quality required to provide equal access; as well as Cindy Camp, who is on a strategic planning team at pepnet 2, which provides training to increase the education, career, and lifetime choices available to individuals who are deaf or hard of hearing.

Their presentation will take about 45 minutes. And then we’ll leave 15 minutes for Q&A at the end. And with that, I will hand it off to Cindy and Jason, who have a wonderful presentation prepared for you.

CINDY CAMP: Thank you, Lily. And hopefully everyone is now seeing my screen. And I’ll be going through the slides here. As Lily said, the title of our presentation is Understanding Closed Captioning Standards and Guidelines. We think this is very important, because while captions are very essential for access for students who are deaf and hard of hearing, if they’re not high-quality captions, then they really aren’t providing that level of access that we want. So we want to go over some of the standards to help everyone understand how to get the most out of the captions.

Also Lily mentioned that I work with pepnet 2. We are a federally funded project. And she gave you a little bit of information about this. I’d just like to say that if you have any questions related to serving students who are deaf and hard of hearing at the post-secondary level, please feel free to contact us. We are here to help in all those situations. And I’ll turn it over to Jason.

JASON STARK: Great. Thanks, Cindy. I wanted to give a brief summary of DCMP. Our mission is to promote and provide equal access to communication and learning through described and captioned educational media. And we have an ultimate goal for achieving accessible media to be an integral tool in the teaching and learning process for all stakeholders in the educational community. And this includes students, educators, school personnel, parents, service providers, businesses, and agencies.

And we do this in line with the US Department of Education’s Strategic Plan for 2014-2018 by committing to a couple of key goals. The first is assuring that students– we serve students who are early learning through grade 12 who are blind, visually impaired, deaf, hard of hearing, or deaf-blind– that they have the opportunity to achieve the standards of academic excellence.

We advocate for equal access to educational media as well as the establishment and maintenance of quality standards for captioning and description by service providers. We also provide a collection of on-demand described and captioned educational media.

We furnish information and research about accessible media. We act as a gateway for internet resources related to accessibility. And we also are involved in adapting and developing new media and technologies that assist these students in obtaining and using available information. So I’m thrilled to be here with Cindy to talk to you about caption quality.

So I want to start out with just a very general definition of captions. Most all of you have seen them. But a general definition of captioning is the process of converting the audio content of a television broadcast, webcast, film, video, CD-ROM, DVD, live event, or other production into text and displaying that text on a screen or monitor. Captions not only display the words as a textual equivalent of the spoken dialogue or narration, but they also include speaker identification, sound effects, and music description.

But what are captions really? A key part of that definition is that captions convey not only what is said but also what is being communicated. In other words, captions are not just a straightforward translation of the spoken words.

So who benefits from captioning? Obviously, the answer that comes to mind for most people is individuals who are deaf or hard of hearing. But, of course, captions also benefit those for whom English is a second language, individuals in noisy environments, emerging readers, individuals with learning disabilities, and, in fact, all of us. Everyone can benefit from captioning.

So why is quality captioning important? Cindy spoke a little bit about this earlier. When captions are not high quality– meaning they are not correctly synchronized with the audio, are not properly formatted, or contain grammatical or spelling errors– an individual who is deaf or hard of hearing will not have full access to the content of the video.

So now we are going to play for you a small little video clip. Caption quality’s obviously important across the board. DCMP focuses on educational media. And quality and accuracy is critical in the classroom. So we’re going to attempt to push a short video clip to your screen.


– –to graduate and the first to be appointed to the faculty. While earning an advanced degree, he was put in charge of the school greenhouses. A distinguished botanist at the college described him as a brilliant student, the best scientific observer he had ever known.

On April 1, 1896, Booker T. Washington, founder of a small college for Negroes in Tuskegee, Alabama, wrote Carver a letter.

-“I can teach our people how to read and write and how to build a wall. But I don’t know how to teach them to plow and plant and harvest.”

JASON STARK: OK, great. As you saw there, there were a couple of pretty big errors. This video was actually marketed to schools and was closed-captioned, which was great– the producer decided that was an important thing to do. But unfortunately, they didn’t follow through and actually review the words that were added to their video, and they miscommunicated the message.

So we’re here today to talk about quality standards and specifically those as part of the DCMP’s Captioning Key. The Captioning Key guidelines were first published back in 1994 and have been a reference for captioning of both entertainment and educational media targeted to consumers at all levels, children through adults.

In the 20 years since the initial publication of the Key, DCMP has continuously monitored consumer feedback, captioning research, studies, and reports. And those results have been incorporated into revisions of the Key. The guidelines are a key for vendors performing captioning directly for DCMP. But the information is applicable to all others that provide captioning of all media at various levels. So we’re going to talk about a few of the major standards and guidelines here today. The full Captioning Key document can be found online at www.captioningkey.org.

So I want to first go over the five key points that make up quality captioning. It’s very important that captions are, number one, accurate. Errorless captions are the goal for each production. Secondly, they should be consistent. Uniformity in style and presentation of all captioning features is crucial for viewer understanding. Third, captions need to be clear. A complete textual representation of the audio, including speaker identification and non-speech information, provides clarity.

Fourth, they need to be readable. Captions need to be displayed with enough time to be read completely. They need to be synchronized with the audio, and they should neither obscure nor be obscured by the visual content. And finally, they need to be equal. Equal access requires that the meaning and the intention of the material is completely preserved. And it’s important to note that, although our breakdown here is a little bit different from the FCC’s 2014 mandates, these points go hand in hand with the verbiage that they use in their document.

So there are a couple of core captioning rules. Number one’s a pretty basic one. You’ve got to spell things correctly. As we saw from that video clip that we just watched, obviously if words are misspelled– sometimes grossly misspelled– it’s very difficult for the individual relying on the captions to garner the meaning.

Secondly, you need to use correct grammar and punctuation. Oftentimes grammar rules differ between reference manuals, and there are sometimes more than one way to produce a correct caption. But your goal should be to be consistent.

So again, we already hit on the first point about spelling with our example. But why is grammar important? Let’s take a look at this simple sentence– “Let’s eat, Grandma.” Now let’s take a look at it another way. We’ve simply removed one small little comma and now it becomes, “Let’s eat Grandma.” Obviously, the connotation of that sentence is remarkably different with just the one comma being removed.

So now we’re going to turn it back over to Cindy. She’s going to walk through some of the additional guidelines listed in the Key.

CINDY CAMP: While the Captioning Key is very detailed, we wanted to pull out some of the most pertinent information and share that with you so you can see what a valuable tool it is if you are captioning yourself or if you’re outsourcing to a company. You want to make sure that they’re following these guidelines.

They’ve been researched for their accuracy and readability. Because as I mentioned previously, if you’re not using high-quality captions, it really can be almost a waste of money, because the individuals relying on those captions aren’t going to benefit as they should.

So we’ll start with some of the basics. No more than 32 characters per line, including spaces. You can see in the image on the left, this caption follows that guideline. You also want only one to two lines of text per screen.

When you look at the caption on the left, it’s easy to read it, take it in at a glance, and also take in the other visual information on the screen. When you look at the caption on the right, this one has very long lines of text and multiple lines per screen.

And you can see how your eye would have to scan over this to read it, which is going to take away from your ability to see the visual information and the graphical information on the screen as well. So we want to make sure that we keep the captions to an appropriate amount so it’s easy for the reader to take them in.
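The two basics here– no more than 32 characters per line and one to two lines per screen– are mechanical enough that software can check them automatically. As a rough illustration (a hypothetical sketch, not a DCMP or pepnet 2 tool):

```python
# A minimal sketch that checks a caption cue against two Captioning Key
# limits: at most 32 characters per line (including spaces) and at most
# 2 lines per screen. Limits and function names are illustrative.
MAX_CHARS_PER_LINE = 32
MAX_LINES_PER_CUE = 2

def check_cue(cue_text):
    """Return a list of guideline violations for one caption cue."""
    problems = []
    lines = cue_text.split("\n")
    if len(lines) > MAX_LINES_PER_CUE:
        problems.append(f"{len(lines)} lines (max {MAX_LINES_PER_CUE})")
    for i, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(
                f"line {i} has {len(line)} chars (max {MAX_CHARS_PER_LINE})")
    return problems

print(check_cue("Let's eat, Grandma."))  # within both limits -> []
print(check_cue("This caption line runs far past the thirty-two character limit."))
```

A check like this catches only the countable rules; everything else discussed here still needs a human eye.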

Next is something that might sound contradictory, because a lot of the captions that we see on our television are all uppercase. But in reality, it’s much easier to read a font that uses upper- and lowercase letters and also has ascenders and descenders.

This means that parts of letters descend below the line– for example, in the first caption, the tail of the Y in “your” goes below the line. The T and the H in “this” go above the line, so it’s easier to read. This is what individuals are used to reading in books and newspapers, and it’s the easiest type of text to read. Research actually shows that all-capital text is one of the most difficult types of text to read. So we want to avoid that.

We also want to make sure that the font you’re using has the right density. If you’ll notice, in the top left, that font is very narrow, and it’s difficult to read. The font on the top right is too heavy. It’s in a bold, very heavy font that also makes it difficult to read. But in the bottom, you’ll see what is considered just right– kind of a medium-weight font.

Another element that helps make captions more readable is to use a drop shadow. You may not be able to see it as clearly in this graphic. But there’s a slight shadow behind the letters, using the white text with that drop shadow.

And then adding a gray translucent box as a background helps make the captions much more readable. That translucent box also helps to take away any distortion there might be when the background changes. I’m sure you’ve all seen captions that show up fine on some screens. But then the background will change to either being dramatically brighter or darker, and the captions kind of fade away. This type of captioning format really helps make sure that the captions stay clear and crisp throughout.

When possible, we also want to make sure that the captions are in the center of the screen but left-justified. This also makes reading easier, because it’s what we’re used to reading in books and other forms of text. When the captions are all centered, it can make it more difficult to read.

It also means that when the next line of captions appears, they can move around on the screen, which can make it also more difficult to read. Not all forms of captioning software will support this type of formatting. And even if your software does support it, sometimes the caption format that you use may not allow you to keep all of these formatting elements. We’ll talk about that a little bit later in the presentation. But just know that these guidelines are going to make your captions as accessible as possible.

Now we’ll talk about some of the fun part of captioning, which is making sure that your lines are broken in grammatically correct places. A lot of people overlook this element because they don’t realize how important it is.

If you’re using software to help you caption, a lot of times you’ll be able to program in some standards, such as you can ask it to not have more than 32 characters per line. You can also ask the software to recognize if there is punctuation, that a caption should be broken at that point.

However, software is not human and cannot understand all of the intricacies of English grammar. So that’s why it’s very important to have a human presence review these captions and to make sure that they are broken in grammatically correct places.

In our first rule, do not break a modifier from the word it modifies. So the incorrect version would be, “Mark pushed his black– truck.” Correctly, you would break it with, “Mark pushed– his black truck.” It may seem unimportant. But breaking up the modifier and the word it modifies can subtly alter the meaning, and it can make it more difficult to read. Another rule is to not break a prepositional phrase. Correctly, “Mary scampered– under the table.” Incorrect would be, “Mary scampered under– the table.”

Next, we want to look at a person’s name or title. Those need to stay together, or it can lead to confusion. Correctly, we would say, Bob and Suzy– excuse me– “Bob and Susan Smythe– are at the movies.” We wouldn’t want to break the first and last name– “Bob and Susan– Smythe are at the movies.” Another example, correctly done– “Suzy and Professor Baker– are here.” “Suzy and Professor– Baker are here,” would be the incorrect version.

Also, you don’t want to break a line after a conjunction. The incorrect way would be, “In seconds she arrived, and– he ordered a drink.” Correctly would be, “In seconds she arrived– and he ordered a drink.” Hopefully you’re beginning to see the logic in how we break these lines and follow the grammar rules.

Another example would be not breaking an auxiliary verb from the verb it accompanies. Incorrect would be, “Mom said I could– have gone to the movies.” Correctly, “Mom said I could have gone– to the movies.” Then, never end a sentence and begin a new sentence on the same line, unless they’re very short related sentences containing only a word or two.

Incorrectly, “He suspected that his face turned pale. He knew he–” and then we would have to wait till the next screen for the end of that sentence. It would be much better to just have the one sentence, “He suspected that his face turned pale.” Then on the next screen, we can see the next sentence.

As I’ve been involved in the captioning process, one error that I often see is that, when individuals do line breaks, they try and get as close to the 32-character count as possible, when in reality it can be much more beneficial to have short sentences and short phrases.

Breaking the line at a lower character count can make it easier to read and also easier to follow the grammar rules. So don’t feel that you need to get 30 or 31 characters. It can be just as beneficial to have 10 or 14 characters, just as long as you break the sentence so that it’s easier to read and to follow along.
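As a rough sketch of what software-assisted line breaking might look like– and of its limits– here is a toy line breaker that rejects break points falling right after a few common prepositions, conjunctions, articles, and auxiliary verbs. The word list is my own illustrative guess, not part of the Captioning Key, and a sketch like this would still miss cases such as a modifier split from its noun, which is exactly why human review remains necessary:

```python
# A toy illustration of automated line breaking: split a sentence into two
# caption lines, skipping break points that fall right after a word a line
# should not end on. The BAD_BREAK_AFTER set is a small illustrative sample.
BAD_BREAK_AFTER = {"under", "over", "and", "or", "but", "the", "a", "an",
                   "could", "would", "should", "his", "her"}

def break_line(sentence, max_len=32):
    """Return (first_line, second_line), preferring the longest
    acceptable first line, or None if no acceptable split exists."""
    words = sentence.split()
    best = None
    for i in range(1, len(words)):
        first = " ".join(words[:i])
        second = " ".join(words[i:])
        if len(first) > max_len or len(second) > max_len:
            continue
        if words[i - 1].lower() in BAD_BREAK_AFTER:
            continue  # e.g. never break "Mary scampered under / the table"
        best = (first, second)
    return best

print(break_line("Mary scampered under the table."))
```

Running this on “Mary scampered under the table.” yields the break after “scampered,” keeping the prepositional phrase intact– but a human still has to catch everything the word list doesn’t.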

Then we’ll talk about the sound effects. Videos include a lot of auditory information, and not all of it is spoken language. So we want to be sure that we include as much of that auditory information as possible. And the Captioning Key gives us some great guidelines on how to do that.

If you’re describing a sound effect, it should be in brackets so that the person viewing the video knows what is going on, since it may not be clearly apparent from what’s on screen. For example, in the first graphic, we see a person riding a horse and doing barrel racing. And we know now that the audience is cheering in the background.

We also want to describe the sound effects. But if it’s clear where the sound is coming from, then we don’t have to label that source. So in the second picture, we see a dog, and we have the sound effect for growling. And we know that that’s coming from the dog. So no additional information is needed.

Some sound effects are combined with text that shows the onomatopoeia. For example, in the first, we see a boat, and the engine is idling. So we have the string of Rs to show what that may sound like.

These sound effects are really important, I’ve learned, especially for those who are hard of hearing, because it helps them understand what the sound is. They can relate to that very well. And I’ve had hard-of-hearing friends tell me how much they appreciate that extra information.

If a sound effect is offscreen, we’re going to put that in italics and in brackets. This could also be music. In our example, thunder is rumbling. You wouldn’t know that from looking at the picture. But that could be very important information for what happens next.

One of my favorite examples here is the fact that horror movies use a lot of sound to indicate what’s about to happen. For example, a creaking door– we know that something is about to jump out at the character. Horror movies aren’t quite as scary if we don’t have those background noises and that ominous music playing. So we want to make sure that those depending on captions get that information as well.

Then music. It’s a very important part of most videos. And so we want to include that information as well. If background music is playing, we want to include that information. It’s a good idea to include descriptive words, since the music sets the mood for what’s happening. But you want to be careful and not be too subjective. Avoid things like delightful, beautiful, melodic, because those add a subjective meaning that not everyone might think is there.

We want to make sure that if it’s offscreen music and we don’t know where it’s coming from that we put that information in italics. If the music has lyrics, it’s important that we include those and also the name of the vocalist or the group that is playing the music.

This information can help the individual enjoy the movie more. And, in fact, I’ve heard a lot of people say that they really enjoy watching the captions, because it’s not always easy to understand the lyrics of a song. And so having the captions up, even if you do not have a hearing loss, can be very helpful, and you can get more information.

Next, we want to talk about, how do we identify who’s speaking? If there are multiple individuals on the screen, it’s very important that we’re able to identify who’s talking. So one of the best ways to do this is to actually move the captions around on the screen.

In the example below, you see that the caption on the left identifies that the actress on the left is speaking. And then when it shifts to the actress on the right, the caption moves under that person. This is a great visual way to show who is speaking. However, again, not all caption formats will support being able to move the caption on the screen. And so we’re going to talk about how to handle that.

If you’re not able to move the caption around on the screen and you know the name of the person speaking, then you would add that information in. If you don’t know the name of the person speaking, then you would identify the speaker with information that anyone viewing the video would know, such as “female #2” or “male narrator.”

If the person who is speaking is onscreen, then that information is included in standard text. But if the person is offscreen, then we include that information in italics so that we know that person is not visible.

Before I continue, there’s one other bit of information I wanted to add about placement of captions on the screen. One issue that can come up is when you have a video that includes a caption or a graphic. For example, during an interview, text may appear on the screen that gives the individual’s name and title. Quite often, this information is placed in the bottom middle of the screen, exactly where we would be putting our captions.

So when possible, you want to be able to place your caption at another point on the screen so it doesn’t interfere with any of the graphical information. If that is not possible, what will happen is the caption will overlay the text or the graphic. And that’s what we want to avoid.

Spoken language has so much meaning. It also has oddly formed sentences, and it even includes wordplay. So captions need to maintain accuracy, clarity, and readability. This can be very challenging when you’re creating a transcript. Captions should include the auditory information that is not conveyed visually to ensure full access for those who are deaf and hard of hearing.

Captioning can sound very simple. I’ve heard a lot of people say, oh, it’s no big deal to create a transcript and then to use a program to get those translated into captions. In reality, human speech is very complex. And some parts of captioning can be quite subjective. We want to include as much information as possible.

For example, the tone of voice that a person is using conveys a lot of information. And if you just read the words, you’re not always sure if the person was being very serious and stern when they said that or if they said it in a sarcastic manner so you know that it was not meant to be taken literally. It’s important that we consider all of those things as we are creating the transcript and displaying them as captions.

As I said, it’s important that the tone of the voice is indicated if we know it. The clues to the emotional state of the speaker can be very important. In this one, we know that the speaker is angry. So we add that information in. And that way, the individual reading the captions has the same access to the information as everyone else. If someone’s whispering, we would put that information in as well.

And just as important, we want to know when there’s no audio or if the speaking is muffled so that the individual looking at the captions knows what’s going on as opposed to just thinking perhaps the captions were missing at this point. So we want to include all of that information. And I’ll turn it back over to Jason at this point.

JASON STARK: Great. Thanks, Cindy. So the caption standards Cindy reviewed there are based on research, best practices, and the experience of experts in the field. We realize these guidelines are quite specific. But if you follow them, they will ensure high-quality captions, which will promote access and learning.

However, as Cindy kind of touched upon and many of you know, not all captioning software is capable of creating captions in this preferred formatting. And in addition, even if the software itself allows you to create them as you export that information, not all caption file types are going to support all of the formatting.

So for example, most captions on YouTube videos– internet videos in general– are positioned in the bottom third of the frame and are center-aligned, instead of the preferred method discussed here of having the text positioned appropriately to suit the scene while being left-aligned like a book. This is because SRT files, a common caption file type for online videos, don’t support caption placement.

So when choosing a piece of captioning software and a file format for captions, it’s very important to recognize the limitations of each. They can significantly impact the quality and the versatility of your captions. In addition, you may actually have to change your captions if a particular formatting option isn’t available. A great example of this would be using a platform that did not support positioning. Thus you could not use positioning to denote speaker ID and would have to explicitly state who was speaking instead.
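To make the file-format limitation concrete, here is a hypothetical side-by-side of the same cue in SRT, which carries only an index, timestamps, and text, and in WebVTT, whose cue settings (such as position and align) can place a caption under a speaker. The timestamps and caption text are made up for illustration:

```python
# A small sketch contrasting caption file formats. A plain SRT cue has no
# placement information, so the player centers it at the bottom. A WebVTT
# cue can carry settings after the timing line to shift the caption.
srt_cue = """1
00:00:01,000 --> 00:00:03,500
CINDY: Thank you, Lily."""

vtt_cue = """WEBVTT

00:00:01.000 --> 00:00:03.500 position:20% align:left
CINDY: Thank you, Lily."""

# Note the small format differences: SRT timestamps use a comma as the
# decimal separator, WebVTT uses a period and allows per-cue settings.
print(srt_cue)
print(vtt_cue)
```

So a caption positioned under a speaker in an authoring tool can silently lose that positioning the moment it is exported to SRT.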

So we’re going to push another video out to you. And what we’ve done here is we have a short clip from the classic movie Pygmalion. And in the first half of the video, we basically have stripped all of the formatting. We’ve obviously already discussed the importance of making your captions accurate in terms of spelling and grammar. The first section here strips all the formatting, and then we’ll jump into a section that has Captioning Key-formatted captions.


-You won’t get a taxi there, what with the rain and the theater traffic.

-Oh. Oh, here he is.

-There’s not a taxi to be had for love or money.

-You haven’t tried at all.

-Oh, you really are helpless, Freddy. Go again and don’t come back until you have found us one.

-I shall simply get soaked for nothing.

-Well, what about us, you selfish pig?

-Oh, very well. I’ll go, I’ll go.



-Now then, Freddy, look where you’re goin’, d’ya? Oh, all my violets trod in the mud. What’d ya do that for? As if I haven’t got enough to do. [INAUDIBLE].

-How do you know that my son’s name is Freddy?

-Oh, he’s your son, is he? Well, if you’d done your duty by him as a mother should, he’d know better than to spoil a poor girl’s flowers and run away without payin’.

-Well, uh, there’s a shill– sixpence for you.

-Oh. Thank you kindly, lady.

-Now will you tell me how you know the young man’s name?

-I don’t.

-But I heard you call him Freddy. Now, don’t you try and deceive me.

-Yeah, who’s tryin’ to deceive you?

-I calls him Freddy or Charlie, same as you might yourself if you was talkin’ to a stranger and wished to be pleasant.

-Come along, Mother, sixpence thrown away.

-Oh, cheer up, Captain. If it’s raining worse, it’s a sign it’s nearly over. Come on, buy a flower for poor gal.

-I’m sorry, I haven’t any change.

-G’on, Captain. I can change half a crown.

-Now, don’t be troublesome. There’s a good girl. I haven’t any change really. Here’s a tuppence, [INAUDIBLE]. Taxi!

-Oh, thank you, Captain.

-Be careful! Give him a flower for it. There’s a bloke over there taking on every blessed word you’re saying.

-But I ain’t done nothing wrong by speaking to the gentleman. I’m a good girl, I am, so help me.

-What’s she hollerin’ about?

-But I never spoke to him except to ask him to buy a flower off me.

-What’s the gooda fussin’?


JASON STARK: OK, great. I see from the question box that apparently at least some of you are not able to see the video playing. So I apologize for that. We’ll work on making sure that this content is available for you guys to watch as part of the archive.

Basically, as you saw there– most of you did, hopefully– the captions in the second half of the clip used a font that was much preferred over the thin font from the beginning. A translucent box was used behind the captions, speakers were identified by placement, and a few other improvements were made.

And again, it’s all centered around increased readability. Probably most of you sitting there reading that– obviously, you can read the words on the screen. They’re there. Following the Captioning Key guidelines is just going to increase that readability. And now we will open it up for questions.

LILY BOND: Great. Thank you so much, Jason and Cindy. That was a really informative presentation. And there are just a ton of questions coming in. So as we go into Q&A, all of our contact information is on the screen here. And you should feel free to reach out to us with further questions.

We have a few upcoming webinars on implementing universal design as well as implementing accessible lecture capture and Quick Start to Captioning. And you can register through those on our website at 3playmedia.com/webinars.

CINDY CAMP: Lily, let me also add that that video is actually part of a training on post-production captioning that we have housed on the pepnet 2 website. Jason and I worked on developing it. It takes about two hours to complete.

And it goes over all of the basics of captioning and gives a lot of really good video examples of why the Captioning Key standards are so important. It also includes a short simulation for making captions so that people can really understand that process. And it’s free of charge.

So if you would like to access that, it’s actually on the last page of this PowerPoint, on our resources. And anyone who would like to learn more is welcome to go and go through that training.

LILY BOND: That’s a great point, Cindy. I’m sure a lot of people will take you up on that. So as I said, we have a lot of questions. I’m just going to dive in and start asking some of them. So Cindy and Jason, the first question here is, what if the speaker makes a grammatical error? Should we caption based on the speaker or based on incorrect grammar?

JASON STARK: Cindy, I’ll jump in here.

CINDY CAMP: Oh, good.

JASON STARK: You basically want to follow what the speaker is saying. Certainly if a speaker is doing false starts, a lot of um’s and uh’s, those can be edited, particularly when you’re faced with rapid dialogue and the captions appearing too quickly on screen.

But obviously, as hearing individuals are able to hear what a speaker is speaking, whether they have an accent or they’re making grammatical errors, that allows us to make a judgment about that person and how they are conveying their message. We want to make sure that that’s conveyed to the viewers of captions as well.

LILY BOND: Thanks, Jason. Another question here– how should you handle foreign languages being spoken? I assume in the midst of a mainly English video.

JASON STARK: What we do here at DCMP is– it’s important to note that DCMP does open captioning. Those of you that were able to see the video– obviously those captions were superimposed on the video. And so the advantage of that for us and our clients is kind of twofold.

Number one, the captions don’t have to be turned on or off. And so they’re always going to be on. There’s not going to be a technical difficulty preventing somebody from viewing them. Because of that, we, DCMP, are not relying on a specific player or technology to display the captions. We have some leeway there.

What we do internally is we caption foreign-translated dialogue in yellow. So there’s just a little bit of a difference in color so that it’s very clear that it’s translated. If that’s not an option, then you would definitely need to ID, much like you would a sound effect, that it was translated dialogue.

LILY BOND: Great, thank you. Someone is asking if you could go into more detail about what “equal” means for what is important to captioning.

JASON STARK: Sure. Equal kind of is a summary of all of the other four points. And specifically the definition that I gave there was that equal access requires that the meaning and intention of the material is completely preserved. And again, that’s everything from making sure that you caption the sound effects and the dialogue, accents, the grammatical errors– like we just spoke. The goal here is to convey exactly what’s being communicated.

In addition, there are times when you must edit the dialogue, perhaps because somebody is speaking rapidly and there’s just not enough time to get all the words on the screen. This is something we do at DCMP because we’re focused on educational captioning.

And so if we have a video that is targeted particularly to younger populations– kids who are just learning to read– we realize that 200 words per minute thrown up on the screen is just going to be too much for them to deal with.

So we do have a policy where we edit that dialogue. But again, we pay attention to this rule, and that is that you’ve got to make sure that the meaning and the intention of the material’s completely preserved. We don’t want to edit out important vocabulary. We don’t want to change concepts. It needs to be an equal representation.

LILY BOND: Great, thank you. Someone also is asking, what font and size are considered correct for captioning?

JASON STARK: Again, DCMP does open captioning. And I’ll speak to that and what we do here internally. For those of you that are doing player-based captioning, whether it be for YouTube or some other player where the player itself is going to dictate the settings, obviously you’re going to have no control over that.

But for standard-definition videos and the captioning software we use, we use Arial, which is, of course, a sans-serif font. And we use a font size of 22. For the high-def stuff, where the resolution is greater, we bump that up to 44.

And I’d be glad to follow up directly with anybody who has more specific questions on what we use. The important thing there is a sans-serif font. Helvetica, Verdana, and Arial– those are the preferred types of fonts.

LILY BOND: Thanks, Jason. Someone else is asking, YouTube doesn’t allow you to add line breaks without copying them from somewhere else. They do this so that word wrapping for larger caption text does not increase the number of lines being used. How should we handle that when considering the 32-character-per-line limit?

CINDY CAMP: I can answer that one. Normally, if I’m trying to caption a video on YouTube, I will create the captions in a Word document as opposed to creating them in YouTube. Then I can go ahead and create the line breaks and upload that file. For me, I have not found that it takes any more time to create it in Word and upload the file than it would to type it out in YouTube. So that would be my suggestion.
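As a rough illustration of that workflow, the line breaks can even be scripted: the sketch below pre-wraps caption text to the 32-character limit and renders SRT blocks ready to upload. The function names, timings, and cue text are all hypothetical, not part of any YouTube or 3Play tool.

```python
import textwrap

MAX_CHARS = 32  # the per-line limit discussed in the webinar

def wrap_caption(text, width=MAX_CHARS):
    """Break caption text into lines of at most `width` characters."""
    return "\n".join(textwrap.wrap(text, width=width))

def to_srt(cues):
    """Render (start, end, text) cues as SRT blocks with explicit line breaks."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, 1):
        blocks.append(f"{i}\n{start} --> {end}\n{wrap_caption(text)}\n")
    return "\n".join(blocks)

# Hypothetical cue, timed and worded for illustration only.
cues = [("00:00:01,000", "00:00:04,000",
         "Captions are essential for access, but only if they are high quality.")]
print(to_srt(cues))
```

Uploading a file like this preserves your line breaks instead of letting YouTube re-chunk the text.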

LILY BOND: Great. Thank you, Cindy. Someone else is asking, how do you handle profanity or inflammatory language?

JASON STARK: DCMP is, again, an educational program. So part of our process in selecting media for our collection is oftentimes to avoid productions that have that in it. The strict interpretation of equal access is, if hearing folks are going to hear that, that profanity, it should be captioned. But obviously, that also has to do with the intended audience who’s the viewer of the production. But a strict interpretation of providing equal access would be to caption all dialogue.

LILY BOND: Great, thank you. Another question here is, how do you handle a person who speaks with an accent or uses provincial language?

CINDY CAMP: Actually, an example of that was included in the sample video, if you were able to view it. The person who spoke with a British accent had that information in brackets: [British accent]. Also, the woman was labeled as speaking with a cockney accent. And her words were typed out as they would sound.

So instead of typing out the full word, it was phonetically based. I’m sure most of you can tell that I’m from the South. So the example I would use is “y’all.” And we would type that out– Y-apostrophe-A-L-L. So you want to include that information as well.

LILY BOND: Thanks, Cindy. Someone is asking, should you include sound effects that are not integral to the plot?

JASON STARK: Again, that is a decision that you, as a captioner, need to make. If you were going to have to sacrifice other dialogue in order to fit them in, then they certainly would not be important enough to keep. But again, I’m going to fall back to this: providing equal access means providing a full, complete, equal interpretation. And so, even if a sound effect isn’t integral, if you’ve got time, it should be included.

LILY BOND: Thanks, Jason. Another question here is, I’ve been told by a vendor that including music lyrics may be a problem as it may violate copyright. How do we manage that?

JASON STARK: That’s a good question. And again, not speaking as an attorney, for us, what we do– we obviously are working with the producer or distributor of the product, who has the rights to the content as it’s provided to us. So we categorically do caption the lyrics.

I guess the best rule of thumb to avoid legal issues down the line is, if you are concerned about that, then don’t caption them. But DCMP’s rule is that we always do. And in 50-some odd years of our program, we’ve not had an issue doing so.

LILY BOND: Thanks, Jason. So a more broad question– a couple of people have asked, what captioning software would you recommend?

CINDY CAMP: This one can be kind of tricky. Jason and I would both probably recommend a company that makes software for both the Mac and the PC: CaptionMaker for the PC and MacCaption for the Mac. It’s a really nice piece of software that has all the bells and whistles.

However, it can be on the higher end in terms of cost, and I know a lot of people are looking for low-cost solutions. MAGpie is free captioning software. Amara.org is an online captioning solution, and, in fact, it works really well if you’re trying to caption YouTube videos which you do not own.

You can run a search, and there are going to be a lot of different options out there. Some are available for free download. The biggest issue to be aware of is, if you are going to use one of the free or low-cost options, you’re not going to have control over a lot of the standards which we talked about– placing captions on the screen, adding a translucent background, choosing a font even.

So if you’re going to do a lot of captions, I think it’s really worth the investment to get some high-quality software that’s going to meet all your needs.

LILY BOND: Thank you, Cindy. That was really helpful. Someone is asking if there’s a format for closed captioning that supports caption placement. SCC, STL, WebVTT, DFXP, and SMPTE Timed Text are some that I’m aware of. I don’t know if either of you have formats to add to that list.

JASON STARK: Yeah, I think that’s got it. Speaking personally about YouTube– I know there was a question about YouTube captioning– DCMP primarily uploads an open-captioned version, just because we want the captions to be displayed using our format.

But when we do upload files to YouTube, we use SCC files. We’ve had some difficulty getting WebVTT files to be exported in a usable fashion from the software that we utilize. And SCCs work beautifully with YouTube.

LILY BOND: Thanks, Jason. That’s helpful. Another music question– when music is used to set a mood, would we want to describe the music as ominous or happy to show how the music and visuals relate, or would that be considered subjective?

JASON STARK: Yeah, absolutely. Obviously, the word “subjective” is open to interpretation. And so to some extent, you definitely want to convey the meaning of the scene. General terms like “upbeat” are great. You just don’t want to use an adjective that might be your own personal opinion.

LILY BOND: Thanks, Jason. Another question here. For speaker identification, if you can’t left-align the captions, is it still better to put the speaker identification on a separate line?

JASON STARK: I would probably answer that yes, simply because if you repurpose the captions, take your file and use them in a different format, then from a readability standpoint you’d want them there. Specifically on YouTube, if you just have a single line with a speaker ID and then allow YouTube to chunk the rest of your captions without setting your own line breaks, you’re potentially going to end up with a very short line followed by a very long line.

And again, as we said, the standards that we presented here today are based on a lot of research. And DCMP in its various iterations has been doing captioning back to the 1950s. And so these are definitely standards for educational captioning that have been tried and true throughout all that period.

There are lots of different opinions. A lot of people do speaker ID with a colon. There are different ways to handle it. But specifically for this question, I would tend to say yes, caption it the way it’s supposed to be done so that, if you repurpose that file, you don’t have to go make changes later.

LILY BOND: Thanks, Jason. I think we have time for a few more questions. It is approaching 3 o’clock now. But if you guys have five minutes, then we can get through another couple. And those will be included in the recording, if anyone has to run right at 3 o’clock.

LILY BOND: Quickly, someone is asking if 3Play Media follows these captioning standards. For the most part, yes, we do, though we vary slightly with some of the guidelines. Quality and accuracy are very important to us. Cindy and Jason, someone else is asking how you should include captions if there are foreign-language subtitles that are burned into the video.

JASON STARK: Yeah. Originally, DCMP’s guidelines stated that you should not duplicate on-screen text. And so whether it be an on-screen graphic or, in this case, superimposed, burned-in foreign subtitles, we would not duplicate that.

But typically, you’d have to decide on a case-by-case basis whether or not the captions you’re providing would overlay those subtitles. And nowadays, captions, of course, are also used secondarily for content indexing and the searchability of particular content.

And so we have basically modified our guidelines to be flexible there. Certainly in a situation where you had on-screen text, whether it be the foreign translation or vocabulary words or the new drug that you were watching the video about– that type of thing– you probably want that content indexed. And so we would suggest that you caption it and just make sure that your caption placement is such that you don’t obscure that graphic.

LILY BOND: Thanks, Jason. Another question– is there a minimum time that a caption frame should remain on screen?

JASON STARK: Basically, we would do about a second to two seconds. The old analog closed-caption decoders required captions to be present for a minimum amount of time in order to recognize that a caption needed to be displayed.

It’s really going to come down to readability. If you’ve got a very short caption– somebody saying, “Yes”– obviously it needs a much shorter time than a multi-word caption.
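That rule of thumb can be sketched as a simple calculation. The 160-words-per-minute reading rate below is an assumed figure for illustration, not a DCMP guideline; the floor reflects the roughly one-second minimum Jason mentions.

```python
def min_display_seconds(text, words_per_minute=160, floor=1.0):
    """Estimate the minimum on-screen time for a caption.

    words_per_minute is an assumed reading rate, not a DCMP figure;
    floor enforces the roughly one-second minimum discussed above,
    so a very short caption like "Yes" still stays up long enough.
    """
    words = len(text.split())
    return max(floor, words * 60.0 / words_per_minute)

print(min_display_seconds("Yes"))  # → 1.0 (the floor applies)
```

A longer caption scales with its word count, which matches the point that readability, not a fixed number, drives the timing.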

LILY BOND: That makes sense. And how do you handle speakers talking over each other?

JASON STARK: It’s a challenge. Most higher-end captioning software will actually allow you to put two different captions on screen at the same time. Ideally, if placement is an option for you, then you can place the captions to denote the speaker ID.

Otherwise, if you did not have placement, the best option would be to do maybe a hyphen followed by the dialogue on top of each other just to denote that they are two different speakers. But again, it gets difficult if it’s not clear which of the speakers is saying which of the statements. And so the clarity of the message is important.
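In a caption file without placement support, that hyphen convention might look like this sketch (the timing and dialogue are invented):

```
00:01:12.000 --> 00:01:14.500
- I already told you that.
- No, you never did.
```

Each hyphen marks a different speaker within the same cue, which at least signals the overlap even when it can’t show who is where on screen.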

LILY BOND: Thanks. And one final question here– do you have any advice for captioning math content?

JASON STARK: Again, with the newer captioning software, it’s not as much of a challenge as it used to be in the sense that most of the characters are going to be available to you in your captioning software as well as on the display.

The Captioning Key does have a number section as a reference. And so I will follow up after this and confirm that we’ve got some math information there. I don’t off the top of my head remember whether we specifically cover that or not.

LILY BOND: Great. Thank you, Jason. So with that, I think we should come to a close. Thank you, Jason and Cindy, so much for your presentation and really great answers to the Q&A. It was really, really valuable information. And we appreciate you being here.

JASON STARK: You’re welcome.

LILY BOND: So thank you, everyone, so much for being here. And I hope you have a great rest of the day.