[Blog] An introduction to OTT: What is OTT anyway?
IBC2017 is around the corner and with it comes a lot of preparation, whether that is familiarizing yourself with the layout, setting up your schedule or double-checking the requirements and specifications of that special something you'll need to make your project complete. One of the most troubling aspects of the OTT branch is that it is loaded with jargon and covers a tremendously broad spectrum of possible activities, which makes it easy to get confused. With all the possible uses of OTT, we thought it would be a good idea to discuss the basics of any OTT project. Luckily, our CTO Jaron gave a short presentation at last year's IBC2016 in an attempt to make the core basics of OTT a bit clearer. Below is a transcript of his presentation.
What is OTT anyway?
OTT literally stands for Over The Top, which doesn't really tell you much about what it actually is. To clarify: it’s anything that is not delivered through traditional cable or receivers, so non-traditional television. To put it simply: video over the internet.
Now I’m going to be using the word media throughout this presentation instead of video as it could also be audio data, metadata or other data similar to that.
To add to that, Internet protocols, set-top boxes, HbbTV and similar solutions are also technically examples of OTT, even though they are associated with traditional cable providers and may use a traditional cable receiver to deliver it.
We will generalize this by saying OTT is internet based delivery.
Some of the important topics are Codecs, Containers and Transport.
A codec is a way to encode video or audio data so that it takes less space when sent over the internet, as sending raw data is just not doable.
Containers are methods to store that encoded data and put it in something that can be sent over the internet.
Transport is the method you send it over the internet with.
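To put a rough number on why raw data is not doable, here is a minimal sketch of the arithmetic. The resolution, frame rate, pixel format and the 5 Mbit/s encoded bitrate are assumptions picked for illustration; they are not figures from the presentation.

```python
def raw_video_bitrate_bps(width, height, fps, bits_per_pixel=12):
    """Bits per second needed for uncompressed video.

    bits_per_pixel=12 assumes 8-bit video with 4:2:0 chroma subsampling
    (an assumption for this example; other pixel formats need more).
    """
    return width * height * bits_per_pixel * fps


raw_bps = raw_video_bitrate_bps(1920, 1080, 25)
encoded_bps = 5_000_000  # assumed encoded bitrate, for comparison only

print(f"raw:     {raw_bps / 1e6:.1f} Mbit/s")
print(f"encoded: {encoded_bps / 1e6:.1f} Mbit/s")
print(f"ratio:   {raw_bps / encoded_bps:.0f}x")
```

Under these assumptions raw 1080p video comes out at roughly 622 Mbit/s, far more than most internet connections can carry, which is the gap a codec has to close.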
Again: Codecs are the method to encode media for storage/transport.
There are several popular ones. There are more, obviously, but I’ll list some of the popular ones here. For video you have:
H264, currently the most popular choice.
H265, better known as HEVC, which is the up-and-coming one; many people are already switching to it.
VP8/9, which are the ones Google has been working on. Kind of a competitor to HEVC.
They all make video smaller, but have their individual differences.
For audio you have:
AAC, currently the most popular.
Opus, which I personally think is the holy grail of audio codecs; it can do anything.
MP3, which is on the way out, but everyone knows it, which is why it’s mentioned.
Dolby/DTS, popular choices for surround sound, which are not often used over the internet as most computers are not connected to a surround sound installation.
For subtitles you have:
SubRip, which is the format usually taken from DVDs.
WebVTT, which is more or less the Apple equivalent of this.
There are more, but there are so many that it’s impossible to list them all.
There are also upcoming video codecs. AV1 is basically a mixture of VP10, Daala and Thor, all codecs in development that were merged together in what should be the holy grail of video codecs. Since they’ve decided to merge the projects together, it’s unclear how fast development is going. I expect them to be finished 2017 - 2019’ish.
So how do you pick a codec?
The main reason to pick a codec is convenience. It could be that it’s already encoded in that format or it’s easy to switch to it.
Another big reason is the bitrate. Newer codecs generally have a better quality per bit, and as internet connections usually have a maximum speed it’s really important to make sure you can send the best quality possible in the least amount of data.
Hardware support is another big reason. Since encoding and decoding are really processor-intensive operations, you will want to have hardware acceleration for them. For example, watching an H265 HD video would melt any smartphone without hardware support.
Container/transport compatibility is another reason, and a really convenient one in a way. Some containers or transports can only support a certain set of codecs, which means you’re stuck with picking one of those.
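The reasons above can be sketched as a simple filter: start from the codecs you would prefer, then drop everything the container cannot carry or the device cannot decode in hardware. The compatibility table below is a partial, illustrative assumption rather than a complete reference, and the device and preference order are hypothetical.

```python
# Partial, illustrative compatibility table (an assumption for this sketch,
# not a complete reference of what each container supports).
CONTAINER_CODECS = {
    "MPEG-TS": {"H264", "H265", "AAC", "MP3"},
    "WebM": {"VP8", "VP9", "Opus"},
    "FLV": {"H264", "AAC", "MP3"},
}


def usable_codecs(container, hardware_decoders, candidates):
    """Keep the candidates (in order) that the container can carry
    and that the device can decode in hardware."""
    allowed = CONTAINER_CODECS[container]
    return [c for c in candidates if c in allowed and c in hardware_decoders]


# Hypothetical device with hardware decoding for H264 and VP9 only.
device = {"H264", "VP9"}
# Hypothetical order of preference, most preferred first.
preferred = ["H265", "VP9", "H264", "VP8"]

print(usable_codecs("MPEG-TS", device, preferred))  # ['H264']
print(usable_codecs("WebM", device, preferred))     # ['VP9']
```

In this sketch the container decides the outcome as much as the codec does: the same device ends up with a different codec depending on what it has to be packed into.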
Which brings us to Containers.
Containers dictate how you can mix codecs together in a single stream or file. Some of the popular choices are:
MPEG TS, which is often used for traditional broadcast
ISO MP4, I think everyone is familiar with this one.
MKV/WebM, which enthusiasts of Japanese series usually use as it has excellent subtitle support
Flash, which I consider a container even though it’s not technically one, because FLV and RTMP, which are the Flash formats, share the same limitations and limit what you can pick as well.
(S)RTP, which I consider a container even though it’s technically a transport method, because it’s common among different transport methods as well.
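To make "mixing codecs together in a single stream" a bit more concrete, here is a minimal sketch of the interleaving step a container performs: packets from separate audio and video tracks are merged into one sequence ordered by timestamp. The `Packet` type and the `interleave` helper are hypothetical names for this example, and real containers add headers, track descriptions and their own framing on top.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Packet:
    track: str         # e.g. "video" or "audio"
    timestamp_ms: int  # presentation time of this packet
    payload: bytes     # encoded data produced by the codec


def interleave(*tracks):
    """Merge per-track packet lists (each already sorted by time)
    into a single list ordered by timestamp."""
    return list(heapq.merge(*tracks, key=lambda p: p.timestamp_ms))


# Dummy tracks: video every 40 ms, audio roughly every 21 ms.
video = [Packet("video", t, b"v") for t in (0, 40, 80)]
audio = [Packet("audio", t, b"a") for t in (0, 21, 43, 64)]

for packet in interleave(video, audio):
    print(packet.timestamp_ms, packet.track)
```

A player reading this single stream front to back gets the audio and video it needs at roughly the same moment, which is the point of putting them in one container.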
That brings us to transport methods.
These say how codecs inside their container are transported over the internet. This is the main thing that has an impact on what the quality of your delivery will be.
I’ve split this into three different types of streaming.
True streaming protocols: RTSP, RTMP, WebRTC. What these do is what I consider actual streaming. You connect to them over something proprietary, because none of these protocols are integrated by default in players or devices yet, though WebRTC should be in the future. As a pro, they have a really fast start time and really low latency, so they’re great for live. However, they need a media server or web server extension to work and they usually, though not always, have trouble breaking through firewalls. Technically the best choice for live, but there are a lot of buts in there.
Pseudo streaming protocols: this is when you take a media file and deliver it bit for bit, not all at once, but streamed to the end delivery point. Doing so gives you the advantage of low latency and high compatibility (it can pretend to be a file download). On the other hand, you still need a media server or web server extension to deliver this format. It’s slightly easier though, and there are no firewall problems.
Segmented HTTP, which is currently the dominant way to deliver media. You see all the current buzzwords of HLS, DASH and fragmented MP4 in there. They are a folder of video file segments, and each of these segments contains a small section of the file. This has a lot of advantages: it’s extremely easy to proxy and you can use a web server for delivery. But they have the really big disadvantage of a slow start-up time and really high latency. For example, HLS in practice has between 20 and 40 seconds of delay, which is unacceptable for some types of media. All segmented HTTP transport methods have the same kind of delay. Some are a little faster, but you’ll never get subsecond with these.
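As a rough illustration of the segmented approach, the sketch below builds a minimal HLS-style playlist that lists a few segment files, and estimates the delay that comes from a player waiting for several whole segments before it starts. The file names, the 10 second segment length and the three buffered segments are assumptions for this example, and the estimate ignores encoder, upload and network delays.

```python
import math


def media_playlist(segment_names, segment_seconds, ended=True):
    """Build a minimal HLS-style media playlist for equally long segments."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{math.ceil(segment_seconds)}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for name in segment_names:
        lines.append(f"#EXTINF:{segment_seconds:.3f},")
        lines.append(name)
    if ended:
        # Marks a finished stream; a live playlist leaves this out
        # and keeps getting new segments appended.
        lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


def estimated_delay_seconds(segment_seconds, buffered_segments):
    """Rough delay when a player buffers this many whole segments first."""
    return segment_seconds * buffered_segments


names = [f"segment{i}.ts" for i in range(3)]  # hypothetical file names
print(media_playlist(names, 10))
print(estimated_delay_seconds(10, 3), "seconds of delay")
```

With these assumed values the estimate lands at 30 seconds, which sits inside the 20 to 40 second range mentioned above. Shorter segments reduce the delay, but each segment still has to be complete before it can be fetched.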
End of presentation