Are you paying too much for your captions?
Updated: January 4, 2018
Slides, notes, and a transcript are available for the presentation In-House Captioning Workflows and Economic Analysis. You can also read a digest of the presentation and the Q&A.
A look at the true cost of in-house captioning services.
When it comes to captioning, cost is almost always a concern. Oftentimes, an organization will look within to tackle the task. Maybe an intern, or some students, will be willing and able to help attain the goal of accessibility. The general intuition is that outside services are expensive, and that it is more cost-effective to use internal resources. So let’s take a look at what the fully loaded cost looks like.
First, we should define the requirements for a successfully captioned video file. For most web-based video content, a video can be captioned using a small, external file that does not require any additional encoding or authoring of the video itself. That caption file is essentially a transcript that is broken up into caption frames with timecodes to denote when each caption frame should show up.
There are three main components in creating captions for video content: transcribing the video, synchronizing the text, and then managing the overall process.
Let’s start with the first, and most time-consuming, task: transcription. Traditionally, it takes a trained transcriptionist four to five hours to transcribe one hour of normal audio or video content. But if this task is to be done in-house, only a large corporation will be able to afford to hire and manage trained transcriptionists. More likely, for higher education or government, a student or intern will be available to work part-time on the task. Not only will the time required to complete the work be on the higher end, but training and management also become more critical in order to maintain consistent quality and turnaround.
A conservative estimate for the transcription portion of our captioning exercise will be five hours. And let’s assume we pay our students $15 per hour. That’s $75 to transcribe one hour of content.
Now, let’s discuss the synchronization step. There are a number of ways this can be accomplished. There are several free tools that allow a user to create caption frames and transcribe directly into the open fields. Alternatively, you can load a transcript into the tool and pick time points to break up lines. Automated solutions also exist and can save time, but they depend heavily on the quality of the audio and of the transcript to properly match and sync the text to the video. YouTube offers this for free for any video you upload, as long as you supply a transcript. For analysis purposes, let’s assume the synchronization effort adds 20% to the time requirement. In this case, that would be one more hour, or $15. We’re now up to $90 per hour.
3. Operations Management
Finally, management and quality control are key factors for an ongoing captioning operation. Quality comes into play in two ways: up-front training on transcription and captioning standards, and review and error checking after a file is complete. If only a couple of videos need to be captioned, these issues may not be as apparent, since someone can provide a bit more care and attention without driving up cost too severely. But a continuous workflow absolutely requires these quality considerations in order to provide an acceptable level of service and output.
For a proper review process, it is safe to say that a quality check will take more than the duration of the actual content. So let’s say one and a half hours for the one hour of content. This will likely be done by another student, at the same $15 per hour rate. We now add $22.50 to get to $112.50 as our running tab for the hour of content.
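The running tab so far can be reproduced with a few lines of Python. This is only a sketch of the article's own arithmetic: the hours and the $15 rate come from the text above, while the variable names are made up for illustration.

```python
# Direct labor for ONE hour of video content, using the article's figures.
HOURLY_RATE = 15.00  # student/intern wage assumed in the article

TRANSCRIPTION_HOURS = 5.0                 # conservative transcription estimate
SYNC_HOURS = 0.20 * TRANSCRIPTION_HOURS   # synchronization adds 20% to the time
REVIEW_HOURS = 1.5                        # quality check of the finished file

hours = {
    "transcription": TRANSCRIPTION_HOURS,
    "synchronization": SYNC_HOURS,
    "review": REVIEW_HOURS,
}

# Cost of each step at the flat hourly rate.
costs = {step: h * HOURLY_RATE for step, h in hours.items()}

total_hours = sum(hours.values())
total_cost = sum(costs.values())

for step in hours:
    print(f"{step:16s} {hours[step]:4.1f} h  ${costs[step]:7.2f}")
print(f"{'total':16s} {total_hours:4.1f} h  ${total_cost:7.2f}")
```

Changing HOURLY_RATE or any of the hour estimates shows how sensitive the per-hour figure is to local wages and to how quickly untrained staff actually work.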
The last question of management time largely depends on how much content needs to be captioned. That in turn will determine how many students or interns require training and scheduling oversight. Let’s assume a student or intern can work 20 hours per week. If the fully loaded time to caption one hour of content is 7.5 hours (transcription plus captioning plus QA), then we can’t even get 3 hours of video captioned with one person in a week. Someone has to oversee this growing staff.
Let’s assume we’re dealing with 100 hours of content per month so we can figure out what the management costs might be. 100 hours per month would require 750 labor hours to complete. At 20 hours a week, we need 10 people working to complete the task. A single supervisor can likely oversee this group of 10, maybe even 12 to provide some overlap. At $25 per hour for 40 hours per week, a supervisor will cost $16,000 for every 4-month stint – the equivalent of one semester or term of an intern.
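The staffing figures above can be checked with a short sketch. Two inputs are assumptions that the article implies but does not state outright: four working weeks per month, and 16 weeks in a four-month term. The function names are hypothetical.

```python
import math

def staffing(content_hours_per_month,
             labor_hours_per_content_hour=7.5,  # transcription + sync + QA
             weekly_hours_per_worker=20,        # part-time student or intern
             weeks_per_month=4):                # assumption: 4 weeks per month
    """Return (labor hours per month, workers needed to cover them)."""
    labor_hours = content_hours_per_month * labor_hours_per_content_hour
    monthly_capacity = weekly_hours_per_worker * weeks_per_month
    workers = math.ceil(labor_hours / monthly_capacity)
    return labor_hours, workers

def supervisor_cost(rate=25, hours_per_week=40, weeks=16):
    """Supervisor pay for one term; 16 weeks per 4-month term is an assumption."""
    return rate * hours_per_week * weeks

labor_hours, workers = staffing(100)
print(f"{labor_hours:.0f} labor hours per month, {workers} workers")
print(f"Supervisor cost per term: ${supervisor_cost():,}")
```

Rounding up with math.ceil matters here: 750 labor hours against 80 hours of capacity per person works out to a little over nine people, so ten are needed in practice.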
The one last piece of management that we haven’t discussed is training. Transcription and captioning each have a long list of standards that must be followed to produce a consistent output. These standards cover issues such as how to transcribe someone’s false start to a sentence, how to represent numbers and math formulae, and how to identify speaker changes. Captioning has rules about timing and number of characters per line and lines per frame. All these things have to be made systematic up front to reduce ongoing support costs. A conservative estimate of the training cost per student worker is $500. Plus, it is likely that a new group of students or interns is coming in every four months and will need training. Total training costs are now $10,000 for two shifts of 10 people.
If we just look at 8 months of the year (one academic year), management and training costs will be $42,000 to cover 800 hours of captioning. Labor fees for the actual transcription and captioning total $90,000. The total cost of captioning per hour of content is now $165. This assumes that everything goes smoothly – that 7.5 hours per hour is accurate and that little to no support is required beyond the creation of the files. For example, if it ends up taking 10 hours per hour of content, the cost per hour balloons above $200. At higher scale, management costs also quickly rise.
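As a rough check on these totals, the whole model fits in one small function. The figures are the article's; the function and parameter names are hypothetical, and the sketch holds supervision and training costs fixed when productivity changes, which is a simplifying assumption (slower work would in practice also require more staff).

```python
def cost_per_content_hour(labor_hours_per_content_hour=7.5,
                          hourly_rate=15.0,
                          content_hours=800,            # 100 h/month for 8 months
                          supervisor_cost_per_term=16_000,
                          terms=2,                      # two 4-month terms
                          workers_per_term=10,
                          training_cost_per_worker=500):
    """Fully loaded cost of captioning one hour of content (simplified model)."""
    labor = labor_hours_per_content_hour * hourly_rate * content_hours
    management = supervisor_cost_per_term * terms
    training = training_cost_per_worker * workers_per_term * terms
    return (labor + management + training) / content_hours

baseline = cost_per_content_hour()
slower = cost_per_content_hour(labor_hours_per_content_hour=10)

print(f"At 7.5 labor hours per content hour: ${baseline:.2f}")
print(f"At 10 labor hours per content hour:  ${slower:.2f}")
```

With overhead held constant, the 10-hour scenario already lands just above $200 per hour of content, so the real figure at that productivity level would likely be higher still.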
At lower quantities, in-house captioning may be a good way to save a few dollars. But when scale is required, the costs will most definitely rise, while quality and consistency will almost always suffer, because transcription and captioning just aren’t what a university student or intern is trained to do.
For those who would like to brave the do-it-yourself world, here are some tools to help:
DIY Captioning Tips
Subtitle Horse (web-based)
MAGpie (Windows and Mac)
SubTitle Workshop (Windows only)
For everyone else, here is some information about how we’ve built a scalable transcription service.