Google brings buzz to captions like never before
Just the other day Google announced its intention to automatically generate closed caption files for a select group of YouTube videos. The story quickly made it to the NY Times and all over the blogosphere, as it rightfully should. The idea is to eventually roll out the capability across YouTube for all users to test. With 20 hours of video being uploaded to YouTube every minute, that’s a lot of text being created!
At its core, this is a brilliant move by Google to improve YouTube search (and advertising) capabilities. But Google’s announcement, largely because it’s Google, also puts the accessibility issue in front of the entire country for a change. Captions are mandated for much of television, but they are only beginning to get some attention on the internet. Well, until now. Representative Ed Markey, the same Congressman who made the original push for closed captioning on television, introduced H.R. 3101, the Twenty-first Century Communications and Video Accessibility Act of 2009, during this session of Congress, and it currently has 19 co-sponsors. This is actually the second attempt at getting a bill passed that would mandate an improved user experience for the hearing impaired.
Thanks to one of the most talked-about technology companies of our time, closed captioning is getting attention all over the internet. Anyone who works with online video is now paying attention to closed captioning. Not only are we empowering the hearing impaired, but in a virtual world that seems to be driven by search and discovery, video can now be made more “accessible” than ever.
So for a business that is centered on providing high-quality, time-synchronized transcripts, what does this announcement mean?
Well, it could mean a lot of things. First, let’s look into this new Google service. Google will deploy the same technology that powers Google Voice across YouTube to enable the creation of text. This means they will be using automatic speech recognition (ASR) to create the caption files. Using ASR on audio and video is not a new concept, but it’s new at this scale. We’ve commented on our experiences with ASR capabilities in the past. In fact, we’ve even played with the very engine that will be front and center for the YouTube initiative.
We’ve spoken with many people who have tested ASR solutions. Usually, if they are talking to us, they weren’t satisfied! The truth of the matter is that ASR will be good enough for some people, and it won’t be good enough for others. 80% accuracy (at its best and in studio-quality recording conditions) leaves a lot to be desired. In fact, Google even admits that results can be somewhat amusing when they’re off. On the search front, the most critical keywords tend to be the most unique and, therefore, the least likely to be recognized accurately. Google’s announcement does not change that; it just makes an ASR solution easier to use and free to consume. In many cases, Google has likely given people who may never have put captions on their video the ability to do so with very little effort. Google has also made the search benefits of captions glaringly obvious.
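To make the 80% figure concrete, here is a minimal sketch of one common way to score a transcript: word accuracy, taken as one minus the word error rate against a human-checked reference. This is an assumption about how such a number might be measured, not a description of how Google or anyone else arrives at theirs. The function name and both sample sentences are hypothetical, made up purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; not any vendor's scoring method.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref) if ref else 0.0


# Made-up sample: ten reference words, two of them misrecognized.
reference = "closed captions make online video searchable and accessible to everyone"
hypothesis = "closed captains make online video surgical and accessible to everyone"

accuracy = 1.0 - word_error_rate(reference, hypothesis)
print(f"word accuracy: {accuracy:.0%}")
```

In this made-up sample, eight of the ten words come through correctly, yet the two that are lost ("captions" and "searchable") are exactly the terms someone would search for. A single accuracy percentage says nothing about which words were missed.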
Ultimately, the organizations that require (or believe in) high-quality output for captions and search will be willing to pay for cleaned-up text. There are significant benefits to the high-quality approach, whether it be accurate search results or truly legible transcripts. Branding is also a critical issue for many organizations that add a text component to their video offering.
We at 3Play Media will continue building high-quality solutions that make multimedia more accessible for everyone. More people than ever are now aware of the benefits of captions and time-synchronized transcripts. We have some new product launches on the way that will build on these very benefits, and we can’t wait to show the world how their online video experience can be changed forever.