How Much Does It Cost to Do Closed Captioning In-House?

February 4, 2019 BY ELISA LEWIS
Updated: October 3, 2023

When it comes to closed captioning, cost is usually the main concern. Oftentimes, an organization will look to its internal resources to tackle the task. Maybe an intern or a grad student will be willing and able to transcribe and caption videos to make them accessible.

The assumption is that professional captioning services are expensive, and that it is more cost-effective to caption in-house.

Is this really true?

Is it always cheaper and better to caption your own videos instead of sending them to a professional captioning company?

Let’s find out.

In-House Captioning Cost Calculation

Let’s walk through each step in the in-house captioning workflow and estimate the cost.

First, we should define the requirements for a successfully captioned video file. For most web-based video content, a video can be captioned using a small, external file that does not require any additional encoding or authoring of the video itself. That caption file is essentially a transcript that is broken up into caption frames with timecodes to denote when each caption frame should appear.

There are four main components in creating captions for video content: transcribing the video, synchronizing the text, controlling quality, and managing the overall process.

1. Video Transcription

Let’s start with the first, and most time-consuming, task: video transcription. Traditionally, it takes a trained transcriptionist four to five hours to transcribe one hour of normal audio or video content.

If this task is to be done in-house, only a large corporation will be able to afford to hire and manage professional transcriptionists. More likely, for higher education or government, a student or intern will be available to work on video transcription part-time.

Not only will it take a student longer to transcribe than a professional transcriptionist, but the student will also need training and oversight in order to maintain consistent quality.

A conservative estimate for the transcription portion of our captioning exercise will be five hours. And let’s assume we pay our students $15 per hour.

That’s $75 to transcribe one hour of content.

2. Synchronization

Once you have a video transcript, it needs to be broken up into timed caption frames. There are a number of ways this can be accomplished.

There are free tools that allow a user to create caption frames and transcribe directly into the open fields. Alternatively, you can load a transcript into the tool and pick time points to break up lines.

Automated solutions also exist and can save time, but they depend heavily on the quality of the audio and the quality of the transcript to properly match and sync the text to the video. YouTube offers this for free for any video you upload and have a transcript for.

For analysis purposes, let’s assume the synchronization effort adds 20% to the time requirement. In this case, that would be one more hour, or $15.

We’re now up to $90 per hour of content.

3. Quality Control

Finally, management and quality control are key factors for an ongoing captioning operation. In order for closed captions to be ADA compliant, they need to be as accurate as possible. If your captions are inaccurate, all that time and money invested in captioning will come up short.

Quality comes into play in two ways: training workers on correct captioning standards and reviewing/checking errors after a file is complete. If only a couple of videos need to be captioned, these issues may not be as apparent since someone can provide a bit more care and attention without severely driving up costs. However, a continuous workflow absolutely requires these quality considerations.

The main factors that dictate QA difficulty include:

  • Poor audio quality
  • Complex or niche content
  • Speakers with accents
  • Fast speakers
  • Overlapping speakers

When any of the above is true, correcting a caption file becomes more time-consuming. The key here is that errors in your caption file are most likely to occur because one (or more) of the above factors is true. This means that the person QAing your captions has to listen, pause, rewind, listen again, and correct the error(s). In really difficult cases, they may have to listen to the same section many times over, increase the volume, watch the speakers on screen, and research terms or names they’re not familiar with.

While reviewing for textual errors, the QA process should also include a review of caption timings. Changing the time codes of your caption files is time-consuming because it needs to be precise. This requires moving the playback bar incrementally to get the exact timing for each caption frame.

It’s also important to be on the lookout for words that sound right but are actually incorrect, like there/their/they’re or your/you’re. Similarly, confusing terms like “can” and “can’t” changes the meaning of the file, and such errors are very easy to miss if the QA worker is not paying close attention.

For a proper review process, it is safe to say that a quality check will take more than the duration of the actual content. So let’s say one and a half hours for the one hour of content. This will likely be done by another student, but at least at the same $15 per hour rate.

That adds $22.50, bringing us up to $112.50 per hour.
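The running total so far can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the walkthrough; the constant and function names are our own, and the rates and hours are the assumptions stated above, so swap in your own numbers.

```python
# Assumptions taken from the walkthrough above; adjust them for your own team.
HOURLY_RATE = 15.00        # student wage, dollars per hour
TRANSCRIPTION_HOURS = 5.0  # labor hours to transcribe one hour of content
SYNC_OVERHEAD = 0.20       # synchronization adds 20% to transcription time
QA_HOURS = 1.5             # review time for one hour of content


def labor_hours_per_content_hour() -> float:
    """Total labor hours needed to caption one hour of content."""
    sync_hours = TRANSCRIPTION_HOURS * SYNC_OVERHEAD
    return TRANSCRIPTION_HOURS + sync_hours + QA_HOURS


def labor_cost_per_content_hour() -> float:
    """Direct labor cost (no management or training) per hour of content."""
    return labor_hours_per_content_hour() * HOURLY_RATE


print(labor_hours_per_content_hour())  # 7.5
print(labor_cost_per_content_hour())   # 112.5
```

Keeping the inputs as named constants makes it easy to see how a higher wage or a slower transcriptionist changes the result.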

4. Operations Management


The last question of management time largely depends on how much content needs to be captioned. That, in turn, will determine how many students or interns require training and scheduling oversight.

Let’s assume a student or intern can work 20 hours per week. If the fully loaded time to caption one hour of content is 7.5 hours (transcription plus synchronization plus QA), then one person can’t even caption 3 hours of video in a week. Someone has to oversee this growing staff.

Let’s assume we’re dealing with 100 hours of content per month so we can figure out what the management costs might be. One hundred hours per month would require 750 labor hours to complete.

At 20 hours a week, we need 10 people working to complete the task. A single supervisor can likely oversee this group of 10, maybe even 12 to provide some overlap. At $25 per hour for 40 hours per week, a supervisor will cost $16,000 for every 4-month stint – the equivalent of one semester or term of an intern.
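Here is a rough sketch of that staffing math. The variable names are ours, and two simplifications are assumed that the scenario only implies: a month is treated as 4 weeks and a 4-month stint as 16 weeks.

```python
import math

# Workload assumptions from the scenario above.
content_hours_per_month = 100
labor_hours_per_content_hour = 7.5
student_hours_per_week = 20
weeks_per_month = 4  # simplifying assumption

labor_hours_per_month = content_hours_per_month * labor_hours_per_content_hour
student_capacity_per_month = student_hours_per_week * weeks_per_month
# Round up, since you cannot hire a fraction of a student.
students_needed = math.ceil(labor_hours_per_month / student_capacity_per_month)

# Supervisor cost for one term.
supervisor_rate = 25            # dollars per hour
supervisor_hours_per_week = 40
weeks_per_term = 16             # simplifying assumption for a 4-month stint
supervisor_cost_per_term = supervisor_rate * supervisor_hours_per_week * weeks_per_term

print(labor_hours_per_month)     # 750.0
print(students_needed)           # 10
print(supervisor_cost_per_term)  # 16000
```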

The one last piece of management that we haven’t discussed is training.

Transcription and captioning each have a long list of standards that must be followed to produce a consistent output. These standards cover issues such as how to transcribe someone’s false start to a sentence, how to represent numbers and math formulae, and how to identify speaker changes.

Closed captioning has rules about timing and number of characters per line and lines per frame. Student transcriptionists must be trained well in these standards in order to produce adequate captions. A conservative estimate of training cost per student worker is $500. Plus, it is likely that a new group of students or interns is coming in every four months and will need training. Total training costs are now $10,000 for two shifts of 10 people.

If we just look at 8 months of the year (one academic year), management and training costs will be $42,000 to cover 800 hours of content. Labor fees for the actual transcription and captioning total $90,000.

The total cost of captioning per hour of content is now $165.
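To show how the pieces combine, here is a minimal sketch of the academic-year total. The names are ours and every figure is one of the scenario's assumptions, not a measured cost.

```python
# Scenario assumptions for one academic year (two four-month terms).
TERMS = 2
STUDENTS_PER_TERM = 10
SUPERVISOR_COST_PER_TERM = 16_000
TRAINING_COST_PER_STUDENT = 500
CONTENT_HOURS = 800                   # 100 hours per month for 8 months
LABOR_COST_PER_CONTENT_HOUR = 112.50  # transcription + synchronization + QA

supervision = SUPERVISOR_COST_PER_TERM * TERMS
training = TRAINING_COST_PER_STUDENT * STUDENTS_PER_TERM * TERMS
overhead = supervision + training     # management and training
labor = LABOR_COST_PER_CONTENT_HOUR * CONTENT_HOURS

overhead_per_content_hour = overhead / CONTENT_HOURS
total_per_content_hour = (overhead + labor) / CONTENT_HOURS

print(overhead)                   # 42000
print(labor)                      # 90000.0
print(overhead_per_content_hour)  # 52.5
print(total_per_content_hour)     # 165.0
```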

Cost Estimate for Closed Captioning Videos In-House

Let’s add up our total costs in the hypothetical scenario outlined above:

In-house captioning costs per hour of footage: Transcription: $75; Synchronization: $15; QA: $23; Management: $52

A conservative estimate for an in-house, large-scale captioning operation averages $165 per hour of video. That’s more than double the cost of professional closed captioning and WAY more hassle.

This model assumes that everything goes smoothly – that 7.5 hours per hour is accurate and that little to no support is required beyond the creation of the files. For example, if it ends up taking 10 hours per hour of content, the cost per hour balloons above $200. At higher scale, management costs also quickly rise.
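A quick way to test that sensitivity is to vary the labor hours while holding everything else fixed. The function below is a hypothetical helper of our own, and it assumes the management and training overhead stays at $52.50 per hour of content, which is optimistic given that those costs also rise at scale.

```python
def cost_per_content_hour(
    labor_hours: float,
    hourly_rate: float = 15.0,
    overhead_per_content_hour: float = 52.50,
) -> float:
    """Estimated in-house cost to caption one hour of content.

    labor_hours is the total student time (transcription, synchronization,
    and QA) spent on each hour of content.
    """
    return labor_hours * hourly_rate + overhead_per_content_hour


for hours in (7.5, 10):
    print(hours, cost_per_content_hour(hours))
# 7.5 165.0
# 10 202.5
```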

George Mason University: In-House vs. Outsourced Cost Comparison

Edtech professional Korey Singleton examined the cost of in-house vs. outsourced captioning at his organization, George Mason University. Korey oversaw an in-house captioning process, followed by a hybrid model with professional outsourcing. He presented his economic analysis of both methods in the webinar In-House Captioning Workflows and Economic Analysis.

When Korey initiated the in-house captioning program at George Mason University in FY12, they hired several grad students solely for transcription work. This made it easy for Korey to see exactly how much time and labor cost went into captioning.

(Note: in subsequent years, Korey didn’t have such precise data on in-house labor because the work was later distributed among several people).

The grad student transcriptionists climbed a steep and variable learning curve.

The average non-professional transcriptionist takes at least 4 times the duration of content to transcribe audio from scratch. Some students excelled more than others, but all of the in-house transcriptionists required a ramp-up time to get familiar with the process and editorial standards.

Orchestrating the whole captioning process internally drained the department of more resources than expected, and the resulting cost of in-house captioning was quite high.

In-house vs. Outsourced cost comparison (FY12 George Mason University)
                       In-house       Outsourced
Hours of Content       38.92          18.63
Jobs                   171            24
Cost/min to caption    $5.87          $2.94
Total Cost             $13,723.45     $3,286.33

In FY12, the cost-per-minute of in-house captioning was nearly double that of the professional captioning company. And because the majority of content was captioned in-house, that means a lot more was spent on captioning.

Korey admitted, “Honestly if we had just gone ahead and outsourced those minutes, we would have saved close to $7,000.”
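That savings figure can be sanity-checked from the FY12 numbers. The sketch below uses our own variable names and assumes the in-house minutes would have been billed at the same $2.94 per minute as the outsourced jobs.

```python
# FY12 figures from the George Mason University comparison.
in_house_hours = 38.92
in_house_total_cost = 13_723.45
outsourced_rate_per_min = 2.94  # assumed to apply to the in-house minutes too

in_house_minutes = in_house_hours * 60
hypothetical_outsourced_cost = in_house_minutes * outsourced_rate_per_min
savings = in_house_total_cost - hypothetical_outsourced_cost

print(f"Cost if outsourced: ${hypothetical_outsourced_cost:,.2f}")
print(f"Estimated savings:  ${savings:,.2f}")
```

Under that assumption the estimated savings come to roughly $6,860, which lines up with "close to $7,000."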

You can really see cost savings with outsourcing at scale.

George Mason University increased its hours of captioned content annually as it ramped up its captioning process, and each increase in content drove down the price-per-minute.

Outsourcing at Scale: FY12-FY14 Outsourced Captioning Costs at George Mason University
                              FY12          FY13           FY14
Total Minutes                 3,453         7,309          16,419
Total Hours                   57.55         121.82         278.4
Total Jobs                    195           371            1034
Hours (Outsourced)            18.63         68.97          222.55
Jobs (Outsourced)             24            177            901
Total Cost (Outsourced)       $3,286.33     $11,297.00     $31,379.55
Avg. Cost/Min (Outsourced)    $2.94         $2.73          $2.35

Between FY12 and FY14, George Mason saw their captioning costs decrease from $2.94/min to $2.35/min. This is because some closed captioning vendors offer discounts for bulk orders.
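The average cost per minute in the table follows directly from the outsourced hours and total cost. Here is a small sketch of that calculation, with the dictionary layout and function name being our own choices:

```python
# (outsourced hours, total outsourced cost) per fiscal year, from the table.
outsourced = {
    "FY12": (18.63, 3_286.33),
    "FY13": (68.97, 11_297.00),
    "FY14": (222.55, 31_379.55),
}


def cost_per_minute(hours: float, total_cost: float) -> float:
    """Average cost per minute of captioned content."""
    return total_cost / (hours * 60)


rates = {
    year: round(cost_per_minute(hours, cost), 2)
    for year, (hours, cost) in outsourced.items()
}
print(rates)  # {'FY12': 2.94, 'FY13': 2.73, 'FY14': 2.35}
```

Going from $2.94 to $2.35 per minute is a drop of about 20%.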

Korey also saw greater efficiency in outsourcing longer content to professionals. That prevented the in-office captioners from getting backed up with big projects and kept the process moving.

What About In-House QA?

In our 2017 State of Captioning survey, 95% of respondents said they review their caption files for accuracy before publishing at least some of the time. But how much time does it take to review and correct caption files, and what is the cost of that time?

Several factors impact the time it takes to QA captions in-house, including:

  • Poor audio quality
  • Complex or niche content
  • Speakers with accents
  • Fast speakers
  • Overlapping speakers

The cost of QA will vary depending on who is reviewing and correcting files. If student workers are doing most of the work, we can assume an hourly rate of $15 per hour. However, if staff members are QAing your caption files, the hourly rate will be much higher. Because of these possibilities, let’s assume a range of $15-$40 per hour. Similarly, student workers may only work 20 hours per week, but full-time employees work 40 hours. We’ll look at this range, as well.

Cost Comparison for a 1-hour Video

The graph below shows a cost comparison for a 1-hour video at 99% accuracy, 95% accuracy, and automatic captions. The graph includes a $15/hr labor rate compared with a $40/hr labor rate.

  • We’d estimate that QAing a 99% accurate file would take about 1.3x real-time, which is a reasonable task to manage in-house. For a one hour video, this QA job would cost $19.50-$52.
  • We’d estimate that QAing a 95% accurate file would take about 4x real-time. For a one hour video, this QA job would cost $60-$160.
  • We’d estimate that QAing an automatic caption file would take about 6x real-time. For a one hour video, this QA job would cost $90-$240.
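These ranges are simply the real-time multiplier times the video length times the hourly rate. The sketch below reproduces them; the multipliers are the estimates above, and the names are ours.

```python
# Estimated QA effort as a multiple of video duration (from the list above).
QA_MULTIPLIERS = {
    "99% accurate": 1.3,
    "95% accurate": 4.0,
    "automatic captions": 6.0,
}
HOURLY_RATES = (15, 40)  # student worker vs. staff member, dollars per hour


def qa_cost_range(multiplier: float, video_hours: float = 1.0) -> tuple:
    """Return (low, high) QA cost in dollars for a video of the given length."""
    low_rate, high_rate = min(HOURLY_RATES), max(HOURLY_RATES)
    qa_hours = multiplier * video_hours
    return (round(qa_hours * low_rate, 2), round(qa_hours * high_rate, 2))


for label, multiplier in QA_MULTIPLIERS.items():
    print(label, qa_cost_range(multiplier))
# 99% accurate (19.5, 52.0)
# 95% accurate (60.0, 160.0)
# automatic captions (90.0, 240.0)
```

Pass a different `video_hours` value to scale the estimate to a longer or shorter video.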

QA cost increases dramatically as caption accuracy falls. [Figure: line graph of the QA cost for 99% accurate captions, 95% accurate captions, and automatic captions.]


Compromising on quality from the get-go can lead to expensive internal costs. If your original captions are consistently below 99% accuracy, these costs have to be considered in your budget.

At scale, these costs can multiply quickly and can make consistently delivering high-quality captions difficult.


At lower quantities, in-house captioning may be a good way to save a few dollars. Estimate the time and labor cost of captioning content in-house and compare it to the price of professional closed captioning services.

But at scale, the cost of in-house captioning skyrockets, while quality, consistency, and efficiency become harder to maintain.

Compare your in-house costs to professional closed captioning services: download our pricing and discounts form.


This post was originally published on January 28, 2016, by Emily Griffin, and has since been updated. 