Transcription and Custom Manufacturing …. (?!)

May 1, 2009 BY CJ JOHNSON
Updated: January 4, 2018

One of the hotter topics of conversation in the business world is custom manufacturing. The advent of intelligent software and machinery, in concert with innovation in the organization and application of human labor, creates opportunity for applying the benefits of the manufacturing model to new industries.  

One of my favorite up-and-coming companies, Proper Cloth, receives a custom order from their Web UI, sends specs to the “machine shop”, and a few days later, you have yourself a fresh new shirt, fit to your measurements and styling needs.  This process utilizes the best of Web-based IT and smart, precision manufacturing.  On the “About Our Dress Shirts”  page, we learn “The shape of each panel of fabric is computer generated and cut with a robotic cutter to ensure accuracy and precision.”

Brilliant.  The Proper Cloth model combines the best attributes of both mass clothing production (lower cost) and custom tailoring (higher quality).  As a consumer, my alternatives are to buy a department-store shirt that doesn’t fit, at a similar price, or to visit a live tailor and pay a small fortune for a custom fit.

At 3Play Media, we have focused heavily on operations and process management in an industry notorious for its dramatic quality-to-cost tradeoff.  Bear with me as I apply the manufacturing analogy to the transcription process.

A typical transcription shop will resemble a combination of the “workshop system” and the “inside contracting system”.  Contractors use machines at the company headquarters, or else use their own equipment in their own homes, to complete a job from start to finish.  Typical tools are a computer keyboard, mouse, transcription software, and a USB-based foot pedal (used to control playback).  The contractors are responsible for every piece of the output: getting the words and sentences right, as well as the style formatting (bolds, italics, etc.), document formatting (margins, spaces after periods, etc.), output transcript format (Word, HTML, TXT, XML, etc.), and output caption format (RealText, DFXP, SRT, SCC for iOS, etc.).

We take a “mass customization” approach to our work, similar to the one Proper Cloth uses.  This means minimizing the cost of producing each file while still meeting each customer’s requirements for output formatting, which leads to the coveted combination of low-cost, high-quality, consistent production.

In our production process, we rely heavily on software-based “machinery” to automate pieces of the transcript itself, along with the vast majority of output formatting and styling.  Employees are trained to operate one or more of our “machines” to create transcript or caption output.  To reduce cost, we strip down the required peripherals as much as possible, and task operators with the sole responsibility of getting the words and phrasing right.  Meanwhile, admins create custom configurations for style, time-synchronization, etc. to globally dictate the final, polished output for our clients.

The benefit of this approach is twofold.  First, it allows operators to focus solely on getting words right, lowering the “defect rate” for a transcript.  Second, it creates consistency in output, so both our clients and our QA experts can ensure quality on many issues via sampling rather than fine-tuned, close (uber-costly) inspection.

This may sound like overkill or completely unnecessary, but let’s take a real life example of a project we’ve seen.

Client ABC wants transcription and closed captioning services for 75 video files.  For their transcripts, they would like 1.25″ margins, 1.5-spaced paragraphs, and bold all-caps names for speaker changes.  They would like timestamps at the beginning of each paragraph.  Also, for 30 of the files, they want captions.  They would like output for their custom-made Flash player, which accepts DFXP as an input, but it would be really nice if they could get YouTube and iTunes formats as well.

In the case of the standard firm, managers have two options.  

  1. Take the project, but deny the customization requests.  The firm uses standardized practices to create transcripts, and it is up to the end user to make modifications.
  2. Send the job to staff members, but with the special requests attached.  These requests fall outside of standard procedure, so it is up to each staffer to conform to standards they are not used to.  Put me in the manager’s seat: since it is 75 hours of footage, due in a week, I will need to employ 8 staffers to complete the project.  My fingers are crossed that they’ll all successfully operate outside of standard firm practices.  Based on experience, my staff is really good and meets custom standards for an entire file an astounding 95% of the time.  However, being a fan of probability and statistics, I quickly realize that if each of the 75 files is an independent 95% shot, the probability of meeting these standards across the entire project is only about 2% (0.95 raised to the 75th power).  This means I will have to go fishing through every file to ensure quality.  Of course, the only way to justify the added time is to pass on that cost to my client, making the entire project more expensive.
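For anyone who wants to check the arithmetic behind that 2% figure, here is a minimal Python sketch.  It assumes each file meets the custom standards independently and at the same 95% rate, which is a simplification made for illustration; the function name is hypothetical.

```python
def batch_success_probability(per_file_rate: float, num_files: int) -> float:
    """Chance that every file in a batch meets the standards,
    assuming each file succeeds independently at the same rate."""
    return per_file_rate ** num_files


# 95% per-file success rate across the 75 files in the example project.
p_all = batch_success_probability(0.95, 75)
print(f"Chance all 75 files are right: {p_all:.1%}")  # about 2%
```

Under the same assumption, even a 99% per-file rate would leave the manager with less than a coin flip's chance of a clean batch, which is why file-by-file inspection becomes unavoidable in this model.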

This becomes a nightmare to manage, and consistency among the output files becomes nearly impossible to ensure.  For example, imagine being tasked to ensure that there are two spaces after every sentence ending.  Moreover, I need to ensure that I have staffed people who can create caption files for those 30 special requests, and be sure they have the appropriate tools to do so.

Meanwhile, in a mass customization model, we take the stress off the machine operators and instead customize the client and file configurations up front.  The operators process the files as usual, and on the output end, our preset global configurations for all those tiny adjustments apply throughout the entire batch.  We can globally adjust file output standards, styles, margins, etc. for an entire batch with the click of one button.
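To make the split between operators and configuration concrete, here is a rough Python sketch.  It is an illustration only, not a description of our actual software: the class names, field names, and the `**` bold markers are all assumptions.  The operator supplies nothing but speakers, start times, and words; the client configuration decides how they are styled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputConfig:
    # Hypothetical per-client style settings, set once by an admin.
    bold_caps_speakers: bool = False
    paragraph_timestamps: bool = False


@dataclass(frozen=True)
class Paragraph:
    # What the operator produces: who spoke, when, and the words.
    speaker: str
    start_seconds: int
    text: str


def format_timestamp(seconds: int) -> str:
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_paragraph(p: Paragraph, cfg: OutputConfig) -> str:
    # Styling comes entirely from the config, never from the operator.
    speaker = f"**{p.speaker.upper()}**" if cfg.bold_caps_speakers else p.speaker
    stamp = f"[{format_timestamp(p.start_seconds)}] " if cfg.paragraph_timestamps else ""
    return f"{stamp}{speaker}: {p.text}"


def render_transcript(paragraphs: list[Paragraph], cfg: OutputConfig) -> str:
    return "\n\n".join(render_paragraph(p, cfg) for p in paragraphs)


# Client ABC's style requests, expressed once as configuration.
client_abc = OutputConfig(bold_caps_speakers=True, paragraph_timestamps=True)
words = [
    Paragraph("Host", 0, "Welcome to the show."),
    Paragraph("Guest", 75, "Thanks for having me."),
]
rendered = render_transcript(words, client_abc)
print(rendered)
```

The same `words` list rendered with a default `OutputConfig()` would come out with plain speaker names and no timestamps, without the operator doing anything differently.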

Let’s say in passing through the list of requirements, we accidentally used 1″ margins instead of 1.25″ margins on half the files.  Seems like a small change, right? 

In the old model, we would open each of the 75 files individually, change the margins on the ones that needed it, re-save, and send the output again.  This is an additional cost to me, and we all know these types of labor costs end up being built in as higher prices across the board.  

In our process, we anticipate these kinds of issues (hey, we’re only human!).  To change the margins on all 75 files, we simply change one number, press a button, and the changes are made instantly.  For my money, I would much rather make a mistake like that across an entire batch.  It’s much easier to identify it and fix it globally than to have to worry that individual files, paragraphs, or even sentences are inconsistent.
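The “change one number, press a button” idea can be sketched in a few lines of Python.  Again, this is a hedged illustration with made-up names (`BatchConfig`, `export_batch`, the file names); the export function is a stand-in that only records which settings each file would be written with, where a real system would produce documents.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BatchConfig:
    # Hypothetical batch-wide document settings.
    margin_inches: float
    line_spacing: float


def export_batch(file_names: list[str], cfg: BatchConfig) -> dict[str, BatchConfig]:
    # Stand-in for a real document writer: every file in the batch
    # is exported with the same shared settings.
    return {name: cfg for name in file_names}


files = [f"video_{i:02d}.docx" for i in range(1, 76)]  # 75 illustrative names

# First pass goes out with the wrong margin.
wrong = BatchConfig(margin_inches=1.0, line_spacing=1.5)
first_pass = export_batch(files, wrong)

# The fix: change one number and re-export the whole batch.
fixed = replace(wrong, margin_inches=1.25)
second_pass = export_batch(files, fixed)
```

Because every file reads from the same configuration, there is no way for one of the 75 to end up with a different margin than the rest.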

For more information about customized transcription and captioning services, as well as our value-add library browsing tools, please visit our website.


