What Is 99% Accuracy, Really? Why Caption Quality Matters
Updated: February 22, 2023
99% accuracy is the industry standard for caption quality.
Most captioning vendors guarantee 99% accuracy.
But, how true are their claims?
What Does Caption Quality Include?
Accuracy is a critical aspect of the captioning process.
Not only do accurate captions create an equal viewing experience for viewers who are d/Deaf or hard of hearing, but they also ensure your organization is in compliance with major accessibility laws.
The industry standard for captioning accuracy is 99%.
A 99% accuracy rate allows for a 1% error rate, or a leniency of 15 errors per 1,500 words.
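The error allowance implied by an accuracy rate is simple arithmetic. As a minimal sketch (the function name `allowed_errors` is ours, for illustration only), the 15-error figure for a 1,500-word file can be reproduced like this:

```python
def allowed_errors(word_count: int, accuracy_percent: float) -> float:
    """Return how many errors a transcript may contain at a given accuracy rate."""
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    # The error allowance is the share of words NOT covered by the accuracy rate.
    return word_count * (100 - accuracy_percent) / 100


# A 1,500-word file at 99% accuracy leaves room for 15 errors.
print(allowed_errors(1500, 99))  # 15.0
```

The same function makes it easy to see how quickly the allowance grows as the accuracy rate drops.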
Accuracy measures punctuation, spelling, and grammar.
Note: Punctuation errors can be subjective. At 3Play, we measure our accuracy to include punctuation because we believe incorrect punctuation can make a tremendous difference to the comprehension of a file.
In addition to including punctuation, we also have internal standards to meet linguistic and legal requirements for caption quality.
Why is it 99% accuracy and not 100%?
Neither technology nor humans can deliver a truly 100% accurate caption file.
Automatic speech recognition (ASR) is good, but not good enough to remove humans from the process.
ASR doesn’t include speaker identifications or important sound effects. ASR transcripts are notoriously riddled with inconsistencies in spelling and in grammar. That’s why it’s important to have human editors review the transcript to ensure caption quality.
So, how do errors affect your content?
- Decreases reading comprehension
- Poorly reflects your branding
- Breaks compliance with FCC, DCMP, and WCAG standards
- Alters the meaning of your content
- Makes your content inaccessible
Challenging Our Competitors’ Claims on Caption Quality
We decided to challenge 2 major competitors on their 99% accuracy claims.
We submitted several files with:
- Good quality audio 🎧
- Varied subject matter 📚
- Duration of 5+ minutes ⏱️
We measured each file’s accuracy based on spelling errors, grammar errors, and word error rate (WER).
Note: WER is commonly used in the speech recognition community to judge and determine ASR quality.
We also measured accuracy including and excluding punctuation.
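For readers unfamiliar with WER, the standard definition is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of words in the reference. The sketch below is a generic illustration of that definition, not the tooling used in our test; the function names, the ASCII-only punctuation stripping, and the sample sentences (adapted, with a hypothetical dropped comma, from one of the examples later in this post) are all assumptions made for the example.

```python
import string


def tokenize(text: str, keep_punctuation: bool = True) -> list[str]:
    """Split text into words, optionally stripping ASCII punctuation first."""
    if not keep_punctuation:
        text = text.translate(str.maketrans("", "", string.punctuation))
    return text.split()


def word_error_rate(reference: str, hypothesis: str,
                    keep_punctuation: bool = True) -> float:
    """WER = (substitutions + deletions + insertions) / words in the reference."""
    ref = tokenize(reference, keep_punctuation)
    hyp = tokenize(hypothesis, keep_punctuation)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level edit distance, computed one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "Then, my favorite poet E.E. Cummings came along"
hypothesis = "Then my favor poet E.E. Cummings came along"

print(word_error_rate(reference, hypothesis, keep_punctuation=True))   # 0.25
print(word_error_rate(reference, hypothesis, keep_punctuation=False))  # 0.125
```

Accuracy is then 1 minus WER. In this toy case the missing comma counts as an error only when punctuation is included, which is why the two measurements differ.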
Accuracy Claims Shouldn’t Lie
Both competitors we challenged advertise a 99% accuracy rate on all their files.
But in reality, their measured accuracy rate falls between 84.7% and 94.4%.
What Errors Did Our Competitors Make?
We uncovered numerous spelling errors and inconsistencies throughout every single file we submitted.
The most common errors made by our competitors were spelling, punctuation, incorrect wording, and false starts.
Competitors’ Transcript Examples
- “Then, my favor poet E.E. Cummings came along”
- “Bob Nice is a co-inventor of integrated circuits”
- “…expand the coup of influence in more militant ways.”
- “He proposed the law of octanes by analogy with the seven intervals of the music scale”
- “It’s what orients and crowns us.”
- “…and this includes people talking about Eco cities.”
Correct Transcript Examples
- “Then, my favorite poet E.E. Cummings came along”
- “Bob Noyce is a co-inventor of integrated circuits”
- “…expand the scope of influence in more militant ways.”
- “He proposed the Law of Octaves by analogy with the seven intervals of the music scale”
- “It’s what orients and grounds us.”
- “…and this includes people talking about eco-cities.”
Grammar and punctuation are critical for reading comprehension and for compliance with captioning best practices.
Even when we removed grammar and punctuation errors from our calculations, the best accuracy rates we measured were only 92-94.4%. Such low accuracy rates are unacceptable from an industry and accessibility perspective.
Why Did Our Competitors Make These Errors?
Process directly impacts file accuracy.
Both of the competitors we tested follow a similar process.
First, they cut a single file into smaller segments; then they distribute those segments across a pool of transcribers.
Once the transcription is complete, the file is pieced back together.
As a result, your files come back full of inconsistencies and errors.
Furthermore, our competitors skip a critical step: quality assurance. Both competitors have little-to-no quality assurance process, which means the final file is never reviewed before being sent back to the customer.
This results in inconsistencies in spelling and grammar across a single file. For example, “mom” could be spelled “mum,” or “flyer” could be spelled “flier.” These subtle changes mean everything to the comprehension of a file.
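To make the idea of a consistency review concrete, here is a small sketch of an automated check that flags a file using more than one spelling of the same word. It is an illustration only, not a description of any vendor's actual quality assurance process; the `VARIANT_GROUPS` list and the function name are hypothetical, and a real review would cover far more than a fixed word list.

```python
import re

# Hypothetical groups of spellings that should not be mixed within one file.
VARIANT_GROUPS = [("mom", "mum"), ("flyer", "flier")]


def find_inconsistent_spellings(transcript: str,
                                variant_groups=VARIANT_GROUPS) -> list[list[str]]:
    """Return each group of variants where more than one spelling appears."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    conflicts = []
    for group in variant_groups:
        used = [variant for variant in group if variant in words]
        if len(used) > 1:
            conflicts.append(used)
    return conflicts


sample = "My mom printed a flyer. Later my mum handed out the flyer."
print(find_inconsistent_spellings(sample))  # [['mom', 'mum']]
```

A check like this only works when it sees the whole file at once, which is exactly what is lost when segments are transcribed separately and never reviewed together.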
How Can 3Play Guarantee 99% Accuracy?
At 3Play Media, we guarantee at least a 99% accuracy rate on all of your files.
As with our competitors, our process directly impacts the accuracy of our files. We use an innovative approach that combines artificial intelligence and human editing to maximize efficiency and optimize our process.
- Automatic Speech Recognition: Your video will go through ASR to produce a rough transcript
- Human Editing Cleanup: A professional editor will clean up the transcript using our proprietary software
- Human Quality Review: A quality assurance manager will conduct a final review of the transcript and captions
Our model is designed to make the captioning process easier for you. Every file is first put through ASR to produce a rough skeleton for our transcript editors to edit – this step simply helps to speed up the editing process so that we can offer faster turnaround options.
Next, one of our thousands of professional transcript editors takes the rough ASR transcript and begins editing it.
Finally, every file is put through a round of human quality review. A quality assurance manager conducts a thorough review of the final transcript and caption file, making sure grammar and spelling are consistent and accurate.
We keep your files intact and also check them twice!
What is the Cost of Inaccurate Captions?
99% accuracy at rates approaching $1 per minute seems too good to be true – because it is.
When we took into consideration spelling, punctuation, incorrect words, and insertions, our competitors’ accuracy rates ranged from only 84.7% to 94.4%.
For a sentence of 8 words, a 95% accuracy rate means there will be an average of one error every 2.5 sentences.
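As a quick check of the arithmetic behind the 8-word example: at 95% accuracy, 5% of words are wrong, so an 8-word sentence carries 0.4 expected errors, which works out to one error every 2.5 sentences. The sketch below (function names are ours, for illustration) assumes errors are spread evenly across words.

```python
def errors_per_sentence(words_per_sentence: int, accuracy_percent: float) -> float:
    """Expected number of errors in one sentence at a given accuracy rate."""
    return words_per_sentence * (100 - accuracy_percent) / 100


def sentences_per_error(words_per_sentence: int, accuracy_percent: float) -> float:
    """Average number of sentences a reader gets through before hitting an error."""
    wrong_words_per_100_sentences = words_per_sentence * (100 - accuracy_percent)
    if wrong_words_per_100_sentences == 0:
        raise ValueError("no errors are expected at this accuracy rate")
    return 100 / wrong_words_per_100_sentences


print(errors_per_sentence(8, 95))  # 0.4
print(sentences_per_error(8, 95))  # 2.5
```

By the same arithmetic, a 99% accuracy rate stretches that to one error every 12.5 sentences.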
Incorrect grammar and punctuation make captions extremely difficult to follow – and even when we excluded those errors, the best accuracy rates we measured were still only 92-94.4%.
Inaccurate captions mean more work for you. Why should you have to go back and edit your files, if you paid for someone else to professionally caption them?
Captioning your videos isn’t cheap.
When you pay for captioning, your vendor should be thorough and accurate. Inaccurate captions take a lot more work on your end. Instead of focusing on other projects, you or your employees will have to spend additional time editing returned files or spend even more money reprocessing the files.
The Department of Justice submitted a statement of interest agreeing that MIT and Harvard’s inaccurate captions fail to provide equal access to the deaf and hard of hearing.
Inaccurate captions lead to the miscomprehension of content. Misspellings and incorrect words can be detrimental to the comprehension of your content.
This is especially problematic with educational videos. Mistakes in your caption file can easily misinform a student.
Additionally, inaccurate captions hurt your brand. A UK study found that “poor spelling or grammar” damages a customer’s view of a brand.
Grammar errors undermine the credibility of a brand.
In a study by Global Lingo, 59% of respondents said they “would not use a company that had obvious grammatical or spelling mistakes on its website or marketing material.”
It’s imperative to ensure inaccurate captions don’t hurt your brand.
How to Alleviate Caption Quality Concerns
To help alleviate accuracy concerns, look for vendors who:
- Measure their accuracy rate – and can tell you how.
- Don’t crowdsource.
- Train transcriptionists on the standards.
- Have a method for handling difficult content.
- Have an easy way to edit captions without having to reprocess.
And always make sure to ask vendors about their captioning process.