Speech Recognition Gaffe of the Week: Jamaican Vacation Caption Fail
Welcome to the weekend! It’s time for our weekly installment of automatic speech recognition gaffes and goofs.
One of my colleagues brought the following hilarious video to my attention. Two comedians, Rhett and Link, take a common approach to humor, examining how communication misfires can lead to laughs, but with a twist. They act out a script and then put it through YouTube’s automatic caption generator, which yields a new, albeit somewhat flawed, script. They then act that out verbatim. Finally, they repeat the process once more, by which point the new script barely resembles the original. (See all of Rhett and Link’s Caption Fail Videos)
It’s clear that some words may have been chosen for comic effect. Obviously, a phrase like “Jamaican vacation” would be hard for YouTube’s automatic speech recognition software to understand. Still, I think we can agree this is a fun angle on caption technology.
It is Google’s priority to create tools that are genuinely useful to people and that enrich their lives. They’ve made great strides in captioning, but the YouTube technology isn’t perfect yet, and that’s okay. In the enormous task of captioning the web’s video, Google and captioning advocates understand that the onus lies with content creators to upload high-quality transcripts and captions created by companies like us. Google accessibility engineer Naomi Black made this point during our webinar with Google and Adobe on accessibility strategies.