HTML5 Video: How Captioning Works
HTML is the markup language used to render almost every page on the web. HTML5 is the latest version, and it’s replete with useful features, including a universal video standard that lets developers add video to a web page without third-party plugins like Flash. The new standard also makes it much easier to publish accessible video with closed captioning.
How HTML5 Improves Video Accessibility on the Web
Most browsers have adopted the basic video features offered by HTML5, as have popular cloud-based media players like Video.js and JW Player. Some browsers offer better video accessibility than others.
Best of all, HTML5 video offers a standardized caption format that works on all desktop devices and browsers.
Why Was Video Captioning in HTML so Difficult?
In older versions of HTML, there was no standard for rendering a video on a web page. Almost all videos were shown through plugins, like Flash, QuickTime, Silverlight, or RealPlayer.
Without standardization, publishers ran into compatibility conflicts across different browsers and devices. Although they tried to build in redundancies and fallback provisions to maximize compatibility, it was practically impossible to publish video that worked universally.
As a consequence, publishing closed captions was difficult and unreliable, because both the caption format and the encoding method depended on the video publishing technology used.
How HTML5 Captioning Works
HTML5 is a major step forward for standardizing video across web browsers and devices, and thus simplifying closed captioning. The idea is that web video will be based on an open, universal standard that works everywhere.
HTML5 natively supports video without the need for third-party plugins. A video can be added to a web page using the <video> element, which makes it almost as simple as adding an image.
The <track> element can be used to display closed captions, subtitles, audio descriptions, chapter markers, or other timed-text data.
The HTML code below shows how these elements work:
<video width="320" height="240">
  <source type="video/mp4" src="my_video_file.mp4">
  <track src="captions_file.vtt" label="English captions" kind="captions" srclang="en-us" default>
</video>
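The attributes of the track element work like this:
src: the URL of the timed-text file, here a WebVTT file.
label: the title of the track, which the player can show in its caption menu.
kind: the type of track: captions, subtitles, descriptions, chapters, or metadata.
srclang: the language of the track text, given as a language tag such as en-us.
default: marks the track that should be enabled unless the user's preferences indicate otherwise.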
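When a page offers several caption or subtitle files, the same markup can be generated on the server. The following is a minimal Python sketch, not part of any particular framework: the function names render_track and render_video, the controls attribute, and the Spanish file subtitles_es.vtt are assumptions made for illustration.

```python
from html import escape


def render_track(src, label, srclang, kind="captions", default=False):
    """Return one <track> element as an HTML string."""
    attrs = [
        f'src="{escape(src, quote=True)}"',
        f'kind="{escape(kind, quote=True)}"',
        f'srclang="{escape(srclang, quote=True)}"',
        f'label="{escape(label, quote=True)}"',
    ]
    if default:
        # "default" is a boolean attribute: its presence alone marks the track.
        attrs.append("default")
    return "<track " + " ".join(attrs) + ">"


def render_video(video_src, tracks, width=320, height=240):
    """Return a <video> element with one <source> and any number of tracks.

    `tracks` is a list of dicts holding render_track() keyword arguments.
    """
    # The "controls" attribute is an assumption; it asks the browser to show
    # its built-in player controls.
    lines = [f'<video width="{width}" height="{height}" controls>']
    lines.append(
        f'  <source type="video/mp4" src="{escape(video_src, quote=True)}">'
    )
    for track in tracks:
        lines.append("  " + render_track(**track))
    lines.append("</video>")
    return "\n".join(lines)


markup = render_video(
    "my_video_file.mp4",
    [
        {"src": "captions_file.vtt", "label": "English captions",
         "srclang": "en-us", "default": True},
        # Hypothetical second track, added only to show a different kind.
        {"src": "subtitles_es.vtt", "label": "Español",
         "srclang": "es", "kind": "subtitles"},
    ],
)
print(markup)
```

Only one track should carry the default attribute, so the sketch leaves it off unless the caller asks for it.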
What is the Standard Caption Format for HTML5 Video?
For several years, two caption formats competed for dominance in HTML5 video. In part, this is because there are two groups collaborating on HTML5: The Web Hypertext Application Technology Working Group (WHATWG) and the World Wide Web Consortium (W3C).
WHATWG developed and proposed the WebVTT (Web Video Text Tracks) caption format, a user-friendly text format that consists of optional cue identifiers (often simple line numbers), start and end timestamps, and caption text with formatting options. WebVTT is similar to the widely established SRT format, but accommodates text formatting, positioning, and rendering options (pop-up, roll-on, paint-on).
Eventually, W3C decided to support WebVTT. WebVTT is now the accepted standard format for HTML5 video.
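Because WebVTT and SRT are so similar, a basic SRT file can usually be converted with a few text substitutions. The Python sketch below is an illustration under stated assumptions rather than a complete converter: the function name srt_to_webvtt is hypothetical, and it handles only plain cues (a number, a timing line, and text), not SRT-specific styling.

```python
import re

# Matches an SRT timing line such as "00:00:13,000 --> 00:00:16,100".
SRT_TIMING = re.compile(
    r"^(\d{2,}:\d{2}:\d{2}),(\d{3}) --> (\d{2,}:\d{2}:\d{2}),(\d{3})",
    re.MULTILINE,
)


def srt_to_webvtt(srt_text):
    """Convert SRT caption text to WebVTT text (basic cues only)."""
    # Normalize line endings and drop a byte-order mark if present.
    text = srt_text.replace("\r\n", "\n").replace("\r", "\n")
    text = text.lstrip("\ufeff").strip()
    # WebVTT puts a period before the milliseconds; SRT uses a comma.
    # Only timing lines are touched, so commas in caption text survive.
    text = SRT_TIMING.sub(r"\1.\2 --> \3.\4", text)
    # A WebVTT file starts with the WEBVTT header and a blank line.
    return "WEBVTT\n\n" + text + "\n"


sample_srt = """1
00:00:13,000 --> 00:00:16,100
I'll start and then turn it over to you.

2
00:00:16,100 --> 00:00:20,100
It's so critically important that parents be actively engaged
"""
print(srt_to_webvtt(sample_srt))
```

The numeric cue identifiers from the SRT file are kept as they are, since WebVTT allows an identifier line before each cue's timing line.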
WebVTT Caption Format
The WebVTT caption format is a text file with a .vtt extension. The file begins with the header “WEBVTT” (optionally followed by a space and descriptive text, as in “WEBVTT FILE”), then a blank line, followed by cues and their corresponding text.
There are several parameters that allow you to control the line position, text position, and alignment. You can also add styling to the text within the cue itself. The example below demonstrates a bold <b> element.
WEBVTT

1
00:00:13.000 --> 00:00:16.100
<b>ARNE DUNCAN:</b> I'll start and then turn it over to you.

2
00:00:16.100 --> 00:00:20.100
It's so critically important that parents be actively engaged
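To make the file structure concrete, here is a minimal Python sketch that reads the header, splits the file into cue blocks, and converts each timestamp to seconds. It is a simplified reader, not a spec-complete parser: the names parse_timestamp and parse_webvtt are hypothetical, cue settings after the end timestamp are ignored, and the align:center setting in the sample is added only for illustration.

```python
import re

# hh:mm:ss.ttt, where the hours part is optional (mm:ss.ttt is also valid).
TIMESTAMP = re.compile(r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})")


def parse_timestamp(value):
    """Convert a WebVTT timestamp string to seconds as a float."""
    match = TIMESTAMP.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid WebVTT timestamp: {value!r}")
    hours, minutes, seconds, millis = match.groups()
    total = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    return total + int(millis) / 1000


def parse_webvtt(text):
    """Return a list of (identifier, start, end, payload) tuples."""
    blocks = re.split(r"\n{2,}", text.replace("\r\n", "\n").strip())
    if not blocks[0].startswith("WEBVTT"):
        raise ValueError("file must start with the WEBVTT header")
    cues = []
    for block in blocks[1:]:
        lines = block.split("\n")
        # The identifier line is optional; the timing line contains "-->".
        identifier = None
        if "-->" not in lines[0]:
            identifier = lines.pop(0)
        if not lines or "-->" not in lines[0]:
            continue  # skip NOTE or STYLE blocks and anything that is not a cue
        start, _, rest = lines[0].partition("-->")
        end_parts = rest.split()  # cue settings may follow the end timestamp
        if not end_parts:
            raise ValueError(f"missing end timestamp: {lines[0]!r}")
        cues.append((
            identifier,
            parse_timestamp(start.strip()),
            parse_timestamp(end_parts[0]),
            "\n".join(lines[1:]),
        ))
    return cues


sample = """WEBVTT

1
00:00:13.000 --> 00:00:16.100
<b>ARNE DUNCAN:</b> I'll start and then turn it over to you.

2
00:00:16.100 --> 00:00:20.100 align:center
It's so critically important that parents be actively engaged
"""
for cue in parse_webvtt(sample):
    print(cue)
```

A malformed timestamp such as 00:00:016.100 raises ValueError in this sketch, which makes it a quick way to catch timing typos before publishing a caption file.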