How to Meet Netflix Captioning Specs

January 11, 2019 BY ELISA LEWIS
Updated: May 1, 2020

Before streaming media became widely popular, there were brick-and-mortar stores that rented DVDs out to the public. On a Friday evening, you would find a movie you wanted to watch, wait in a long line of other renters, rent the film, then bring it back within the next couple of days to avoid a late fee. It’s hard to imagine doing this now, right? What may seem prehistoric to some was once the norm for many Americans who wanted to watch movies in the comfort of their homes. In fact, Netflix didn’t introduce its online streaming service until 2007. The streaming media giant started out as a DVD rental company and quickly grew into one of the most prominent online businesses of our time, boasting over 137 million subscribers worldwide. With popular phrases like “Netflix and chill” or viral challenges like the Bird Box challenge, there’s no denying the influence of Netflix.

Image courtesy of Netflix original movie Bird Box

Now you can watch from anywhere in the world as long as you have a smart device and an internet connection. The advancement of technology allows you to be virtually anywhere to enjoy your favorite binge-worthy shows. Many people watch programs in sound-sensitive environments or may be deaf or hard of hearing, and can’t rely on the audio to gather pertinent information. The reality is that for millions of people, captions are needed to enjoy a show or film. If you’re submitting content to Netflix, you must provide captions in order for all viewers to have an enjoyable viewing experience. Here’s how to meet the specifications required by Netflix.

Netflix Specs and General Requirements

All timed-text, regardless of whether it’s submitted for an original or non-original show, must follow these guidelines in order to be accepted by Netflix.

Duration: This is also known as load time or build-up time. Captions require a duration in order to be displayed at the correct time.  

  • Minimum duration: ⅚ of a second per subtitle event (e.g., 20 frames for 24 fps)
  • Maximum duration: 7 seconds per subtitle event
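As a rough illustration, the duration limits above can be converted into frame counts for a given frame rate. The sketch below is only an assumption about how you might check this in your own tooling; the function names are hypothetical, and how to round at frame rates where ⅚ of a second is not a whole number of frames is left open.

```python
from fractions import Fraction

MIN_DURATION_S = Fraction(5, 6)  # minimum duration per subtitle event
MAX_DURATION_S = Fraction(7)     # maximum duration per subtitle event


def duration_limits_in_frames(fps):
    """Return (min_frames, max_frames) as exact fractions for a frame rate."""
    fps = Fraction(fps)
    return MIN_DURATION_S * fps, MAX_DURATION_S * fps


def duration_is_valid(in_frame, out_frame, fps):
    """Check whether an event's length in frames falls within both limits."""
    min_frames, max_frames = duration_limits_in_frames(fps)
    return min_frames <= (out_frame - in_frame) <= max_frames


# At 24 fps the minimum works out to 20 frames, matching the example above.
print(duration_limits_in_frames(24))
```

Using exact fractions avoids floating-point surprises when comparing frame counts against the limits.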

File format: all subtitles and SDH files for all languages must be delivered in TTML (.dfxp or .xml)

Frame gap: must be two frames minimum regardless of frame rate
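If your caption events are stored as frame numbers, a quick pass can flag pairs that sit too close together. This is a minimal sketch, assuming events are (in_frame, out_frame) tuples sorted by in-time and that the gap is measured from one event's out-frame to the next event's in-frame; the function name is hypothetical.

```python
MIN_GAP_FRAMES = 2  # minimum gap between events, regardless of frame rate


def find_gap_violations(events):
    """Return (previous_out, next_in) pairs that are closer than the minimum gap.

    `events` is a list of (in_frame, out_frame) tuples sorted by in_frame.
    """
    violations = []
    for (_, prev_out), (next_in, _) in zip(events, events[1:]):
        if next_in - prev_out < MIN_GAP_FRAMES:
            violations.append((prev_out, next_in))
    return violations


# The second and third events are only one frame apart.
print(find_gap_violations([(0, 48), (50, 90), (91, 120)]))
```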

Glyph list: captions must only use text/characters included on the Netflix Glyph List

Line treatment: Always keep the text on one line, unless it exceeds the character limitation. If the text has to be broken into two lines, follow these rules:

line treatment for captions

Positioning: All subtitles should be center justified and placed at either the top or bottom of the screen, except for Japanese. Position captions so that they do not interfere with onscreen text. If interference is impossible to avoid (text at both the top and bottom of the screen), place captions where they are easiest to read.

Timing: captions should be timed to the audio, or in some cases, within three frames of the audio. If more time is required for better reading speed, the out-time can be extended up to 12 frames past the timecode at which the audio ends. Avoid captions that cross shot changes whenever possible because they can disrupt the viewing experience. If the dialogue crosses the shot change, adjust the timecodes to be either at the shot change or at least 12 frames from it.

  • If the dialogue starts 8 to 11 frames (indicated in the green zone) before the shot change, move the in-time up to 12 frames before the shot change.

shot change 1 timecode

Image courtesy of Netflix

  • If the dialogue starts seven frames or less (indicated in the red zone) before the shot change, move the in-time to the shot change. 

shot change 2 timecode

Image courtesy of Netflix

  • If the dialogue ends 8 to 11 frames (green zone) after the shot change, move the out-time to 12 frames after the shot change.

shot change 3 timecode

Image courtesy of Netflix

  • If the dialogue ends seven frames or less (red zone) after the shot change, move the out-time to the shot change, respecting the two-frame gap.

shot change 4 timecode

Image courtesy of Netflix

  • If there’s one caption before and one after the shot change, the second caption should start on the shot change, and the first should end two frames before the shot change.

shot change 3 timecode

Image courtesy of Netflix
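The shot change rules above can be sketched as two small functions that work in frame numbers. This is an illustrative sketch rather than Netflix's own logic: the function names are hypothetical, and reading "respecting the two-frame gap" as "end two frames before the shot change" is an assumption based on the last rule in the list.

```python
GREEN_ZONE = range(8, 12)  # 8 to 11 frames from the shot change
RED_ZONE = range(1, 8)     # seven frames or less from the shot change
SAFE_DISTANCE = 12         # frames to keep between a timecode and the shot change
FRAME_GAP = 2              # assumed meaning of "respecting the two-frame gap"


def adjust_in_time(in_frame, shot_change):
    """Adjust an in-time that falls shortly before a shot change."""
    frames_before = shot_change - in_frame
    if frames_before in GREEN_ZONE:
        return shot_change - SAFE_DISTANCE  # move earlier, to 12 frames before
    if frames_before in RED_ZONE:
        return shot_change                  # snap to the shot change
    return in_frame


def adjust_out_time(out_frame, shot_change):
    """Adjust an out-time that falls shortly after a shot change."""
    frames_after = out_frame - shot_change
    if frames_after in GREEN_ZONE:
        return shot_change + SAFE_DISTANCE  # extend to 12 frames after
    if frames_after in RED_ZONE:
        return shot_change - FRAME_GAP      # assumption: end two frames before
    return out_frame


print(adjust_in_time(90, 100), adjust_out_time(104, 100))
```

Timecodes that are already on the shot change or at least 12 frames away are returned unchanged.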

Consistency: KNPs (key names and phrases) and formality tables must be created and used for translation to ensure consistency across episodes and seasons. Netflix recommends that content creators ask their Netflix representative which KNP workflow is best for their content.

Netflix credit translations: Translations for Netflix originals title cards must be included in full and forced subtitle streams. Refer to the Originals Credit Translation document provided by Netflix. Time captions to match the exact duration of the on-screen Original credit if possible.

Title cards/dedications: caption plot-pertinent and other relevant information that isn’t covered in the dialogue and/or redundant in the target language. 

scenic landscape of home and sunset with caption that says based on a true story

Currency: Any mention of money in the dialogue should remain in its original currency. Do not convert currency in subtitle files.

Brand names: handle brand names in the following ways:

  • If the brand is widely known and used in that territory, use the same English-language brand name
  • Use a generic term for the product
  • Don’t swap one brand for another company’s trademarked item

Translator credits: use the translator credit as the last event in the caption file. It should occur after the end of the main program during the copyright disclaimer card. Ensure that it abides by the approved translations provided in the Original Credits Translations document, as mentioned above.

Netflix allows content creators to credit only one individual translator per asset, and the credit should be in the target language of the caption file. The credit should be timed for reading speed, with a duration of between 1 and 5 seconds. It shouldn’t appear on-screen at the same time as the Netflix Ident.

For subtitles for the Deaf and hard of hearing (SDH) files, include translator credits only if translating from the original language. If you’re transcribing the original or dubbed audio, do not include translator credits. When translating from multiple source languages, more than one translator can be mentioned in the same credit.

Forced narrative files should credit the subtitle translations for episode titles as long as there are translations in the file other than the Netflix provided translations for episode titles and the approved Netflix Original Credits Translations. 

Credits may only be omitted if the translator has submitted a formal waiver. 

Technical aspects: all TTML files must adhere to the following technical specifications:

  • Only use percentage values, not pixel values
  • Use tts:textAlign and tts:displayAlign for positioning along with static values for tts:extent and tts:origin
  • tts:fontSize should be defined as 100%. Don’t use pixel values
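One way to catch pixel values before delivery is to scan a TTML file's styling attributes. The sketch below uses Python's standard xml.etree.ElementTree module; the function name, the choice of attributes to check, and the sample document are all assumptions made for illustration, not a Netflix-provided validator.

```python
import re
import xml.etree.ElementTree as ET

TTS_NS = "http://www.w3.org/ns/ttml#styling"  # TTML styling namespace
CHECKED_ATTRS = ("fontSize", "origin", "extent")
PIXEL_VALUE = re.compile(r"\d+(\.\d+)?px")


def find_pixel_values(ttml_text):
    """Return (tag, attribute, value) for styling attributes that use pixels."""
    problems = []
    root = ET.fromstring(ttml_text)
    for element in root.iter():
        for name in CHECKED_ATTRS:
            value = element.get(f"{{{TTS_NS}}}{name}")
            if value and PIXEL_VALUE.search(value):
                local_tag = element.tag.split("}")[-1]
                problems.append((local_tag, f"tts:{name}", value))
    return problems


# Hypothetical sample: the second region uses pixel values and is flagged.
SAMPLE = """<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <head>
    <styling>
      <style xml:id="s1" tts:fontSize="100%" tts:textAlign="center"/>
    </styling>
    <layout>
      <region xml:id="bottom" tts:origin="10% 10%" tts:extent="80% 80%"
              tts:displayAlign="after"/>
      <region xml:id="bad" tts:origin="64px 36px" tts:extent="80% 80%"/>
    </layout>
  </head>
  <body>
    <div>
      <p begin="00:00:01.000" end="00:00:03.000" region="bottom" style="s1">Hello.</p>
    </div>
  </body>
</tt>"""

print(find_pixel_values(SAMPLE))
```

A check like this only covers the percentage rule; it does not confirm that a file meets the rest of the specification.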

Netflix English Timed Text Style Guide

If you’re submitting English-specific captions, adhere to the English Timed Text Style Guide in order for your content to be approved by Netflix.

Accuracy of content:

  • Include as much of the original content as possible
  • Don’t simplify or water down the original dialogue
  • Shortening the original dialogue should be limited to instances where reading speed and synchronicity to the audio are an issue
  • Transcription of the source language should follow the word choice and sentence order of the spoken dialect. Slang and other dialectal features should not be changed

Character limitation: only 42 characters per line
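A simple length check can show when a caption has to be split. The sketch below uses Python's textwrap module to break text greedily at word boundaries; that is only an approximation and an assumption of this example, since it ignores the line treatment rules about where a break should fall. The function name and the two-line cap are likewise illustrative.

```python
import textwrap

MAX_CHARS_PER_LINE = 42
MAX_LINES = 2  # assumption: a caption is broken into at most two lines


def split_caption(text):
    """Keep text on one line if it fits, otherwise wrap it at 42 characters."""
    if len(text) <= MAX_CHARS_PER_LINE:
        return [text]
    lines = textwrap.wrap(text, width=MAX_CHARS_PER_LINE)
    if len(lines) > MAX_LINES:
        raise ValueError("text needs more than two lines; split it into separate captions")
    return lines


for line in split_caption("Had I known what was going to happen, I would not have called you."):
    print(len(line), line)
```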

Continuity: Do not use ellipses or dashes when an ongoing sentence is split between two or more continuous captions:

older couple sitting on couch. Husband wraps hands around wife as they engage in a conversation. Caption 1 says I always knew and caption 2 says you would agree with me

Use ellipses to indicate a pause or dialogue trailing off. If there is a pause and the sentence continues in the next subtitle, do not use ellipses at the beginning of the second caption:

woman says to man in caption had I known...and in caption 2 I wouldn't have called you

Use hyphens to indicate abrupt interruptions:

man cuts off cohorts by telling them to be quiet

Use ellipses followed by a space when there is a significant pause:

woman looks off to side in hesitation. Caption says she hesitated about accepting the job

Use ellipses without a space to indicate that the caption is starting mid-sentence:

two women sit on the floor talking to each other, suddenly one says have signed an agreement

Documentary: for TV/movie clips, all audible lines should be transcribed. If the audio interferes with the dialogue, the plot-pertinent content takes precedence. If the speaker is on-screen for at least part of the scene, do not italicize. Leave italics for off-screen narrators.

Foreign Dialogue: If foreign dialogue is translated, use [in language]. For example, if a film is in Spanish, use “[in Spanish]”. If the foreign dialogue is not meant to be understood by the viewer, use [speaking language]. For example, “[speaking Spanish]”. Accents or dialects require the same treatment. If it is a Spanish accent, use “[in Spanish accent]”. Be sure to research the language being spoken. Netflix does not accept “[speaking foreign language]” for foreign dialogue. 

Julia Child holding plate a food. The caption reads bon appetit

All proper names, such as foreign locations or company names, should not be italicized. Foreign words that are used in a mostly English line of dialogue don’t require translation but should be italicized. Be sure to verify spelling, accents, and punctuation for all foreign dialogue. Additionally, familiar foreign words and phrases that are in Webster’s dictionary do not need to be italicized (e.g., bon appétit, rendezvous, doppelgänger).

Italics: italicize the following in captions: 

voice overs, song lyrics, off-screen dialogue, titles of books and other works of art, unfamiliar foreign words, obviously emphasized speech, and dialogue heard through electronic media

Numbers: All numbers from 1 to 10 should be written out: one, two, three, etc. Numbers above 10 should be written numerically: 11, 12, 13, etc. If a number begins a sentence, it should be spelled out regardless of its value.
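The number rules can be expressed as a small helper. This is a sketch under stated assumptions: the function names are hypothetical, only whole numbers from 0 to 99 are handled, and hyphenating compound numbers such as "forty-two" follows common English usage rather than anything stated above.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def spell_out(n):
    """Spell out a whole number from 0 to 99 in lowercase English."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only handles 0 to 99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def format_number(n, starts_sentence=False):
    """Apply the caption number rules: words for 1 to 10, numerals above 10."""
    if starts_sentence:
        return spell_out(n).capitalize()  # always spelled out at sentence start
    if 1 <= n <= 10:
        return spell_out(n)
    return str(n)


print(format_number(7), format_number(11), format_number(42, starts_sentence=True))
```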

Use the following rules for the time of day:

  • Use the numerical value when exact times are emphasized: 9:30 a.m.
  • Use lowercase a.m. (ante meridiem) and p.m. (post meridiem) when the time is mentioned in dialogue
  • Spell out words or phrases that do not include actual numbers: half past, quarter of, midnight, noon, etc.
  • When “o’clock” is mentioned in dialogue, always spell out the number: eleven o’clock in the morning

Quotes: use English grammatical rules when using quotations. 

Reading speed:

  • Adult programs should be up to 20 characters per second
  • Children’s programs should be up to 17 characters per second
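Reading speed is simply the number of characters divided by the time the caption stays on screen. The sketch below treats the two figures above as ceilings and counts every character, including spaces and punctuation; both choices, along with the function names, are assumptions made for illustration.

```python
ADULT_CPS = 20     # characters per second for adult programs
CHILDREN_CPS = 17  # characters per second for children's programs


def reading_speed(text, duration_seconds):
    """Characters per second, ignoring line breaks inside the caption."""
    return len(text.replace("\n", "")) / duration_seconds


def within_reading_speed(text, duration_seconds, children=False):
    """Check a caption against the limit for its audience."""
    limit = CHILDREN_CPS if children else ADULT_CPS
    return reading_speed(text, duration_seconds) <= limit


caption = "I always knew you would agree with me."  # 38 characters
print(reading_speed(caption, 2.0))
print(within_reading_speed(caption, 2.0), within_reading_speed(caption, 2.0, children=True))
```

A caption that reads too fast can either be given a longer duration, within the timing rules, or be shortened, as described under accuracy of content.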

Songs: Caption and italicize all audible song lyrics that do not interfere with dialogue. Use song title identifiers when applicable. Song titles should be in quotes, for example [“Forever Your Girl” playing]. Use the name of a musical number or classical piece only if widely known, for example [“The Nutcracker Suite” plays]. Album titles should be in italics. All song lyrics should be enclosed with a music note at the beginning and the end of each subtitle. Use an uppercase letter at the beginning of each line. Use ellipses when a song continues in the background but is no longer captioned to give precedence to dialogue. Only question marks and exclamation marks should be used at the end of a line, not periods. Commas can be used within the lyrics line, if necessary.

Speaker ID/Sound effects:

  • Use brackets [ ] to enclose speaker IDs or sound effects
  • Use all lowercase, except for proper nouns
  • Only use speaker IDs or sound effects when they cannot be visually identified
  • When a speaker ID is required for a character who has yet to be identified by name, use [man] or [woman], or [male voice] or [female voice], so as not to provide information that is not yet present in the narrative. If the same identifier is used multiple times in one scene, numbers should be added to distinguish them, for example [man 1]
  • Use a generic ID to indicate and describe ambient music, for example [rock music playing over stereo]
  • Use objective descriptions that describe genre or mood identifiers for atmospheric non-lyrical music, for example [menacing electronic music plays]
  • Sound effects should be plot-pertinent
  • Sound effects that interrupt dialogue should be treated as follows:

man says however I've lately I've been and interrupts by coughing and sniffing. He finishes his sentence and says seeing a lot more of this

  • Never italicize speaker IDs or sound effects, even when the spoken information is italicized, such as in a voice-over

Why Netflix’s Captioning Specs Are So High

For many years, captions were seen as a secondary asset. The reality is that millions of people rely on closed captions to enjoy their favorite programs. Captions are useful for individuals who speak English as a second language, watch shows in sound-sensitive environments, or who are deaf or hard of hearing. It’s paramount to provide captions, not just to get them approved by Netflix, but because everyone’s viewing needs and preferences should be accommodated.

In June 2011, the National Association of the Deaf (NAD) filed a lawsuit against Netflix for failing to provide closed captions for streaming video, alleging that the company violated the Americans with Disabilities Act (ADA). Title III of the ADA states that “places of public accommodation” must be accessible to people with disabilities. The ADA was passed in 1990, before the internet became mainstream, when public places were thought of as physical brick-and-mortar locations. Although Netflix doesn’t have a physical location, millions of Americans use the service to enjoy shows and films with their loved ones. If captions weren’t provided for online video, many people would be excluded from enjoying this experience.

Judge Ponsor ruled that it would be irrational to conclude that “places of public accommodation” are limited to actual physical structures.

“In a society in which business is increasingly conducted online, excluding businesses that sell services through the internet from the ADA would run afoul of the purpose of the ADA. It would severely frustrate Congress’s intent that individuals with disabilities fully enjoy the goods, services, privileges, and advantages available indiscriminately to other members of the general public.”

Keep in mind that the NAD v. Netflix case was not a U.S. Supreme Court decision, so it’s not the law of the land. However, this case sent a strong message to digital businesses: captions are a primary asset.

Netflix has high standards for closed captions. Accurate captions are required in order for files to be accepted. In fact, more files are rejected for poor translations than any other type of error. When captions are inaccurate or flow awkwardly, they disrupt the viewing experience. Netflix expects publishers to review caption files before submitting them for approval.

Netflix continues to update and improve the film and television experiences for subscribers.

At 3Play, our transcriptionists are certified to create captions that meet Netflix’s specs. We adhere to Netflix’s standards and have a specific SMPTE-TT output that will incorporate all requirements.

Learn more about our captioning services!
