Transcription vs. Captioning – What’s the Difference?
Updated: March 30, 2021
Have you ever thought about transcription vs. captioning and how they compare to each other? We’re here to tell you the disparity goes beyond mere dictionary definitions. Both transcription and captioning come with their own uses, benefits, and legal requirements. While they have individual properties and work uniquely on their own, they can also work together to create more accessible and more user-friendly content.
Transcription vs. Captioning
Transcription and captioning are each a separate process with their own output product. We’ll explain the basic definitions of both transcription and captioning and how and why they differ, which will help you gain a solid understanding of the purpose of each.
Transcription is the process in which speech or audio is converted into a written text document, whereas captioning divides transcript text into time-coded chunks, known as “caption frames.” While transcription forms the basis for captioning, they each have different use cases. Transcription can be used to make audio-only content accessible, but accurate captioning is legally required to make videos accessible. Both transcription and captioning help to boost video and audio SEO.
Transcription is the process in which speech or audio is converted into a written, plain text document. Transcripts are the output of transcription, and because they are plain text, they carry no time information.
There are two main transcription practices: verbatim and clean read. Verbatim transcription captures the audio word for word and includes all utterances and sound effects, which makes it a good fit for scripted speech like a TV show, movie, or skit. Clean read transcription edits the text to read more fluidly, which suits unscripted content like interviews and recorded speaking events.
Captioning is a process that involves dividing transcript text into chunks, known as “caption frames,” and time-coding each frame to synchronize with the audio of a video. The output of captioning is a set of captions, which are typically located at the bottom of a video screen. Captions allow viewers to follow the content through the audio, the captions, or both.
Closed captions should depict speech and sound effects, as well as identify various speakers. Captions must account for any sound that is not visually apparent, and assume that the viewer cannot hear the video at all. After all, d/Deaf and hard of hearing people often rely on captions to consume video media.
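As a rough illustration of the framing step described above, the Python sketch below groups words that already carry timestamps into caption frames and writes them out as WebVTT cues. The `Word` class, the function names, the 32-character frame limit, and the choice of WebVTT as the output format are all assumptions made for this example, not the method of any particular captioning tool. Real captioning workflows also handle line breaks, speaker identification, and sound effects.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with hypothetical start/end times in seconds."""
    text: str
    start: float
    end: float


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_caption_frames(words, max_chars=32):
    """Group timed words into frames of at most max_chars characters.

    Returns a list of (start, end, text) tuples. The character limit is an
    illustrative assumption, not a captioning standard.
    """
    groups = []
    current = []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        if current and len(candidate) > max_chars:
            # The frame is full: close it and start a new one with this word.
            groups.append(current)
            current = [word]
        else:
            current.append(word)
    if current:
        groups.append(current)
    # Each frame spans from its first word's start to its last word's end.
    return [(g[0].start, g[-1].end, " ".join(w.text for w in g)) for g in groups]


def to_webvtt(frames):
    """Serialize (start, end, text) frames as the text of a WebVTT file."""
    lines = ["WEBVTT", ""]
    for start, end, text in frames:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")
    return "\n".join(lines)
```

Calling `to_webvtt(build_caption_frames(words))` on a list of `Word` objects returns the text of a caption file. A plain transcript, by contrast, would simply be the words joined together with no timing at all.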
The Benefits of Transcription vs. Captioning
Both transcription and captioning offer their own benefits, and knowing the role they each play in video and audio will help you determine how to utilize them.
Benefits of Transcription
Transcription has many benefits, and is a great supplement to your video and audio content. First and foremost, transcription is a useful accessibility tool. If you’re not using a professional transcription and captioning service, then transcripts are the perfect segue into creating closed captions in-house.
Do you have a podcast or radio show? Do you listen to them regularly? Transcripts make radio shows accessible, improve comprehension for ESL listeners, allow for increased user interaction, and can even help with SEO. This American Life (TAL), one of the most popular podcasts, was able to increase their unique visitors through organic search results by over 6% and increase inbound links by nearly 4%, all due to transcribing 100% of their audio archive.
Transcription not only improves SEO for radio shows, but improves SEO for video content, too. Search engines cannot watch a video, so they have no way of ranking video content on anything beyond its metadata. Transcripts, along with caption files, allow search engines to read and “view” the true contents of a video, thus making it possible to index and rank them.
Benefits of Captioning
Captioning offers several benefits, as long as the captions are accurate. Captions are necessary to make video content accessible to d/Deaf and hard of hearing viewers. On top of that, they provide assistance to English as a second language (ESL) speakers, and help viewers with learning disabilities or attention deficits more easily maintain their concentration.
Captions have been shown to help people comprehend dialogue, and that’s a good thing, since 41% of video is incomprehensible without sound or captions. Captions also allow viewers to watch video in a sound-sensitive environment like a quiet office or on public transportation. All these things combined lead to more people watching your videos; a Facebook video advertising study found that captions increased video view time by 12%.
Accessibility Requirements: Transcription vs. Captioning
Making video and audio content accessible is important for many reasons, mainly because it allows all viewers the opportunity to consume media and because it’s the law.
When you’re considering transcription vs. captioning for your video content, it’s important to think about what’s required to ensure your content is fully compliant with the law.
Is Transcription Enough to Meet Accessibility Laws?
The short answer is no. Let’s talk about why:
The Americans with Disabilities Act (ADA) requires that an equivalent experience be provided for d/Deaf and hard of hearing people. Transcription is not time-coded, so it does not allow d/Deaf or hard of hearing viewers to follow along in real time with the content. Captions must be included in order to provide that crucial element of an equivalent experience.
In summary, transcription and transcripts are not enough to make video accessible in compliance with the law.
Transcription, Captioning, and Accessibility
Just because transcription alone isn’t enough to legally comply with accessibility standards, that doesn’t mean it has no place in making your content accessible. Here’s how transcripts contribute to accessibility:
- the first step in creating captions for video
- the best way to make radio shows accessible
- simple to translate into other languages for global accessibility
- helpful for ESL users and for people with autism, attention deficits, and learning disabilities