Video Search, Schema, and Benefits of Transcription [Interview]
Updated: January 4, 2018
Last week at the Brightcove Play conference, I had the great opportunity to be interviewed by Andy Plesser from BeetTV to discuss the benefits of transcription for online video, our integration with Brightcove, and techniques for implementing transcripts for maximum SEO impact. Below is a full transcript of the interview.
Transcript of Interview
Our mission is to make video accessible, searchable, more engaging, and SEO friendly through innovative captioning, transcription, and translation solutions. The beauty of having video and text together is that you can reach a very large audience. People who are deaf or hard-of-hearing can consume the video. People who want to follow along at their own pace using the transcript can do that. It’s easier to navigate and find what you’re looking for, plus you get all the SEO benefits of having all of that text content indexed by search engines.
So a little background. Traditionally, these services were done either internally or by mom-and-pop shops in a very manual and labor-intensive way. And I think that works well for a few videos here and there. But with the proliferation of online video pretty much everywhere, the need for a scalable and reliable solution has increased greatly. And that’s where we fit in.
Unless you transcribe a video, it simply cannot be indexed by search engines, apart from maybe the title and some tags that you associate with the video. Once you transcribe a video, there are a few ways of getting that content indexed by search engines. One way is to just have a plain transcript on the page. Or if it’s a longer transcript, it’s probably even better to put it on a separate page and paginate it, so that a large transcript is broken up into multiple pages to maximize SEO.
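As a rough illustration of the pagination idea, here is a minimal Python sketch that splits a long transcript into pages of a fixed word count. The function name `paginate_transcript` and the default of 300 words per page are assumptions made for this example; the interview does not specify a page size.

```python
def paginate_transcript(text: str, words_per_page: int = 300) -> list[str]:
    """Split a transcript into pages of at most `words_per_page` words.

    Pages break on word boundaries only; a real implementation might
    prefer to break on sentence or paragraph boundaries instead.
    """
    if words_per_page <= 0:
        raise ValueError("words_per_page must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + words_per_page])
        for i in range(0, len(words), words_per_page)
    ]


if __name__ == "__main__":
    pages = paginate_transcript("one two three four five", words_per_page=2)
    for number, page in enumerate(pages, start=1):
        print(f"Page {number}: {page}")
```

Each resulting page could then be published at its own URL so that every part of the transcript is indexable on its own.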
But coming back to your question about how search engines index that content, the key is that the transcript needs to appear in the source code of the page for it to be readable by search engines. One way to do that is to put it inside noscript tags. Another way, which is actually a new standard, is to use the video schema, VideoObject. There’s actually a property there for a transcript, so you can wrap a video player with the video schema and include the transcript in a form that is readable by search engines.
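To make the noscript approach concrete, here is a small Python sketch that renders a transcript as HTML inside a `<noscript>` element so the text is present in the page source. The helper name `transcript_noscript` and the choice of one `<p>` per blank-line-separated paragraph are assumptions for illustration only.

```python
from html import escape


def transcript_noscript(transcript: str) -> str:
    """Wrap a plain-text transcript in a <noscript> block.

    Paragraphs are assumed to be separated by blank lines. Text is
    HTML-escaped so characters like < and & do not break the markup.
    """
    paragraphs = [p.strip() for p in transcript.split("\n\n") if p.strip()]
    body = "\n".join(f"  <p>{escape(p)}</p>" for p in paragraphs)
    return f"<noscript>\n{body}\n</noscript>"


if __name__ == "__main__":
    print(transcript_noscript("Welcome to the show.\n\nToday: captions & SEO."))
```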
This is a new standard that just came out last year. It was developed by Google in collaboration with Bing and Yahoo, I think. It uses microdata, and it allows you to wrap not just video players, but actually any object. In this case, you can use VideoObject to wrap a video player, and you can associate a variety of different metadata with the video. One of the properties is a transcript, so you can actually include a transcript there. Search engines will read that transcript and associate it with that video, so they can learn more about the spoken content of that video.
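The wrapping described here might look roughly like the following Python sketch, which generates microdata markup using the schema.org `VideoObject` type and its `transcript` property. The function name, the choice of `name` and `description` as the other metadata fields, and the `player_html` placeholder are assumptions; the actual player embed code depends on your video platform.

```python
from html import escape


def video_object_microdata(
    name: str, description: str, player_html: str, transcript: str
) -> str:
    """Wrap a video player in schema.org VideoObject microdata.

    `player_html` is inserted as-is (it is assumed to be trusted embed
    markup); all other values are HTML-escaped.
    """
    return (
        '<div itemscope itemtype="https://schema.org/VideoObject">\n'
        f'  <meta itemprop="name" content="{escape(name)}">\n'
        f'  <meta itemprop="description" content="{escape(description)}">\n'
        f"  {player_html}\n"
        f'  <div itemprop="transcript">{escape(transcript)}</div>\n'
        "</div>"
    )


if __name__ == "__main__":
    print(
        video_object_microdata(
            name="Benefits of Transcription",
            description="Interview about captions and SEO.",
            player_html="<!-- video player embed code goes here -->",
            transcript="Our mission is to make video accessible...",
        )
    )
```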
Our primary goal is really to make the process of captioning, transcription, and translation as easy as possible and to eliminate barriers. And whether it’s enterprise or education or media companies, what we’ve seen is that the resources for managing media are very scarce. For example, Brian Hudson from Sutter Health in a session yesterday talked about how his organization is a $9 billion company, yet there are only eight people who are tasked with producing, managing, and publishing its video. So people really just don’t have time to deal with manual workflows and compatibility issues.
And to that end, we have developed an enterprise solution that provides a user-friendly account system, flexible APIs, and integrations with a multitude of video players, platforms, and event capture systems. Brightcove is our favorite integration. Really, if you’re using Brightcove, coupled with 3Play Media, the process of creating captions and translations really can’t get any easier.
For example, if you want to caption a video, all you have to do is add a 3Play tag from your Brightcove account and everything else happens automatically. Brightcove sends the video to us. We process it, create captions, and send those captions back to Brightcove. They get re-associated with the video files, and then they just show up.
The way our process works is when we receive a video from Brightcove or from any source, we first put it through speech recognition. And that gets it to the point where it’s about 70% accurate. Subsequently, we have a professional transcriptionist who will go through and clean up the mistakes left behind by the computer. And the last step is that we have a QA person who will go through, research difficult words, and make sure all the grammar and punctuation are correct. So by the time we’re done with it, it’s pretty much flawless. It’s typically 99.5% accurate and time-synchronized word for word. And so that’s sort of our core output.
And from that, we produce a variety of derivative outputs. So we produce captions in many different formats. We produce transcripts in a variety of formats. But those are things that if you’re using Brightcove, you don’t even really have to worry about because the workflow is all automated. It all happens in the back end.
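To show how a caption file can be derived from a word-level, time-synchronized transcript, here is a minimal Python sketch that groups timed words into SubRip (SRT) cues. The input shape (tuples of word, start seconds, end seconds) and the seven-words-per-cue limit are assumptions for illustration; this is not a description of 3Play Media's actual pipeline or output formats.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[tuple[str, float, float]], max_words: int = 7) -> str:
    """Group (word, start, end) tuples into numbered SRT cues."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(word for word, _, _ in chunk)
        cues.append(
            f"{len(cues) + 1}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}"
        )
    return "\n\n".join(cues) + ("\n" if cues else "")


if __name__ == "__main__":
    timed_words = [("Hello", 0.0, 0.4), ("world", 0.5, 1.0)]
    print(words_to_srt(timed_words))
```

A production captioner would also break cues on punctuation and pauses and cap line length, which this sketch leaves out.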
The standard turnaround is four business days. But there are expedited options as well. You can select two-day or one-day turnaround. And we recently rolled out a same-day service so you can have your transcripts back within a matter of hours.