Tuned ASR: How 3Play is Advancing Live Automatic Speech Recognition for Closed Captions

April 6, 2023 BY JENA WALLACE

When it comes to live events, choosing automatic speech recognition (ASR) for captioning may seem like an easy, inexpensive accessibility option. Unfortunately, relying solely on ASR can result in embarrassing captioning moments and viewer criticism. On the other hand, opting for expert human captioners provides higher accuracy but can come with higher costs and scheduling inconsistencies.

Both options pose real challenges for broadcasters when it comes to media accessibility, and there often isn’t much middle ground. That’s why we’re thrilled to share our new managed ASR live captioning solution: Tuned ASR.

Tuned ASR is the best of both captioning worlds: innovative technology and expert human oversight. In this blog, we’ll cover exactly what Tuned ASR is, how it works, and its advantages over standard live captioning workflows.

What is Tuned ASR?

Imagine you’re talking about finance on your live program. You would probably want your live captions to show numbers as numerals, such as “3” instead of “three.” Standard ASR solutions hear the word “three,” so they transcribe “three.” And while that’s not wrong, it’s not exactly what you want for your audience.

Live professional captioners can do these kinds of common-sense conversions on every live event they work on, but when budgets and timelines are tight, you may not have the opportunity to use that kind of solution.

With this in mind, we decided to see if we could bottle the common sense and context cues that live professional captioners use in their work and apply it to ASR captions. The result? Tuned ASR.

Tuned ASR is an enhanced ASR captioning service that pairs our speech modeling expertise with top-quality 3Play ASR. Our managed ASR solution provides highly accurate captions at a fraction of the cost of standard live human captioning. 

With Tuned ASR, we mold our speech engine to give you captions that are most relevant to your audience. This can include providing the ASR system with correct spellings, conversion of spelled-out numbers to numerals, and a variety of other formatting options.
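To make these kinds of conversions concrete, here is a minimal Python sketch of a post-processing step that applies preferred spellings and turns number words into numerals. It is an illustration only, not a description of how 3Play's speech engine works: the lookup tables, their entries, and the function name `tune_caption` are all hypothetical.

```python
import re

# Hypothetical lookup tables; a real setup would be curated per program.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10",
}
CUSTOM_SPELLINGS = {
    "three play": "3Play",  # assumed example of a brand-name correction
    "i cap": "iCap",
}

def tune_caption(text: str) -> str:
    """Apply preferred spellings first, then convert number words to numerals."""
    for heard, preferred in CUSTOM_SPELLINGS.items():
        text = re.sub(rf"\b{re.escape(heard)}\b", preferred, text,
                      flags=re.IGNORECASE)

    # Match whole words only, so "someone" is not rewritten as "some1".
    pattern = r"\b(" + "|".join(NUMBER_WORDS) + r")\b"

    def to_numeral(match: re.Match) -> str:
        return NUMBER_WORDS[match.group(0).lower()]

    return re.sub(pattern, to_numeral, text, flags=re.IGNORECASE)

print(tune_caption("shares rose three percent"))  # shares rose 3 percent
```

A rule this simple would also rewrite "no one knows" as "no 1 knows", which is exactly the kind of context problem that calls for human judgment when tuning.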

Using this information and existing sample content, we train the ASR system to build your customized speech engine model and fine-tune it prior to going into production. 

But Tuned ASR isn’t a solution where we set it and forget it once it’s fine-tuned enough for production. We keep updating the model daily, weekly, or monthly, depending on your content’s evolving needs.

How Tuned ASR Works

Tuned ASR is a dynamic approach to live ASR captioning powered by technology and human experts. But how does Tuned ASR work? There are three key steps: preparation, captioning, and re-tuning.

Preparation 

3Play’s speech model experts fine-tune our ASR engine using a sample of your standard content. Our team continues to curate the speech model by building out dictionaries and vocabularies and by configuring settings to support a high level of live captioning accuracy and quality.

Captioning 

Your customized Tuned ASR is ready to go live! Any pre-scheduled live events are programmed to ensure Tuned ASR captioning coverage for defined start and end times. We then push your content through the live production process and deliver captions via any broadcast captioning encoder, including iCap, Modem, or Telnet. 
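As a rough illustration of what scheduling coverage for defined start and end times could look like, the Python sketch below models each event as a time window and checks which events need captions at a given moment. The `ScheduledEvent` class, its fields, and `events_needing_captions` are hypothetical names for this example, not part of any 3Play API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ScheduledEvent:
    """Hypothetical record for one pre-scheduled live event."""
    name: str
    start: datetime
    end: datetime

    def is_live(self, now: datetime) -> bool:
        # Coverage runs from the start time up to, but not including, the end.
        return self.start <= now < self.end

def events_needing_captions(events, now):
    """Return the names of events whose coverage window contains `now`."""
    return [event.name for event in events if event.is_live(now)]

schedule = [
    ScheduledEvent("Morning News",
                   datetime(2023, 4, 6, 6, 0, tzinfo=timezone.utc),
                   datetime(2023, 4, 6, 7, 0, tzinfo=timezone.utc)),
    ScheduledEvent("Market Report",
                   datetime(2023, 4, 6, 9, 0, tzinfo=timezone.utc),
                   datetime(2023, 4, 6, 9, 30, tzinfo=timezone.utc)),
]
now = datetime(2023, 4, 6, 6, 30, tzinfo=timezone.utc)
print(events_needing_captions(schedule, now))  # ['Morning News']
```

Using timezone-aware timestamps throughout keeps comparisons unambiguous when programs air across regions.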

That doesn’t mean that we press a button and leave you on your own for the duration of your event. A dedicated project manager is available for support throughout your engagement, ensuring you are set up for success.

After your event, caption files and transcripts are made available for download and can be upgraded to meet any post-production content needs.

Re-tuning

Tuned ASR is not a one-and-done solution; it’s always learning. Our experts continuously re-tune your speech model, with updates monthly, weekly, or even daily, depending on your content’s needs.

Sounds great! Can I push all of my content through Tuned ASR?

Tuned ASR is great! But it’s not for all content needs. Breaking events, for example, are not ideal candidates for Tuned ASR use at this time. That being said, when you use Tuned ASR for your content, you gain access to 3Play’s 24/7 captioning support, ensuring you have live professional captioning coverage for those unplanned, non-standard breaking events.

Why Tuned ASR?

Tuned ASR captioning has quite a few advantages over traditional ASR caption solutions and human captioning workflows.

Accurate ASR Captions 

Tuned ASR is a solution that leverages the best of humans and technology to increase the accuracy of ASR captions. 3Play’s expertise in curating speech models, combined with our best-in-class ASR, means that Tuned ASR can provide live captions at higher accuracy levels for live broadcast television.

Standard ASR solutions on broadcast television are somewhat infamous for producing inaccurate captions and providing little ability to fine-tune the output. By introducing Tuned ASR into the equation, you gain the peace of mind that your captions meet accuracy standards for live captions.
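For readers who want to put a number on caption accuracy, one widely used measure for ASR output is word error rate (WER): the substitutions, deletions, and insertions needed to turn the ASR output into a reference transcript, divided by the number of words in the reference. The Python sketch below computes it with a standard edit-distance table. This post does not say which metric 3Play uses, so treat this as a general illustration; the function name is our own.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)

print(word_error_rate("shares rose 3 percent today",
                      "shares rose three percent today"))  # 0.2
```

Note that the example scores "three" against "3" as an error even though a listener heard the same thing, so formatting choices affect the score just as misrecognized words do.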

Cost-effective ASR Captions in a Changing Industry

Live professional captioning workflows tend to be more costly, which is tough to balance against declining ad revenues and a growing number of platform destinations. Tuned ASR is a budget-friendly captioning solution that keeps humans in the picture and enables you to get accurate live captions at a fraction of the cost of live professional captioning workflows.

Safeguarding Brands

Live captioning bloopers can be damaging to networks by drawing the ire of viewers and the FCC. Our Tuned ASR dictionaries are pre-built with attention to sensitive words and auto-detect areas needing punctuation and speaker changes. But Tuned ASR doesn’t stop there. 

When you need to ensure captions are right on point, our experts work with you to re-tune your speech model to suit your program’s needs. And in the event that your live program needs more than Tuned ASR can provide, you can easily pivot to live professional captions.

3Play Media’s Tuned ASR is an innovative managed ASR solution that offers a middle ground between traditional live ASR captioning and live human captioning workflows.

Tuned ASR offers highly accurate live captions at a fraction of the cost of traditional live captioning workflows. Plus, Tuned ASR offers continuous learning and fine-tuning, so you can be sure that your captions are always relevant for your viewers. 

With the ability to safeguard your brand and provide an enhanced live viewing experience for all audiences, Tuned ASR is a valuable solution for broadcasters looking to provide accessible content to their viewers.
