18 Apr 2017

[Blog] Stream Latency

Hi readers, as promised by Balder, this blog post will be about latency.

When streaming live footage, latency is the amount of time that passes between what happens on camera, and the time that it is shown on the stream where it is watched. And while there are cases where artificial latency is induced into the stream to allow for error correction and selecting the right camera to display at the correct time, in general you want your latency to be as small as possible. Apart from this artificial latency, I will cover some major causes of latency encountered when handling live streaming, and the available options for reduction of latency in these steps.

The three main categories where latency is introduced are the following:

Encoding latency

The encoding step is the first in the process when we follow our live footage from the camera towards the viewer. Due to the wide availability of playback capabilities, H.264 is the most common used codec to encode video for consumer-grade streams, and I will therefore mostly focus on this codec.

While encoders are becoming faster at a rapid pace, the basic settings for most of them are geared towards optimization for VoD assets. To reduce size on disk, and through this reduce the bandwidth needed to stream over a network, most encoders will generate an in-memory buffer of several packets before sending out any. The codec allows for referencing frames both before and after the current for its data, which allows for better compression, as when the internal buffer is large enough, the encoder can pick which frames to reference in order to obtain the smallest set of relative differences to obtain it. Turning off the option for these so-called bi-predictive frames, or B-frames as they are commonly called, decreases latency in exchange for a somewhat higher bandwidth requirement.

The next bottleneck that can be handled in the encoding step is the keyframe interval. When using a codec based on references between frames, sending a 'complete' set of data on a regular interval helps with decreasing the bandwidth necessary, and is therefore employed widely when switching between different camera's on live streams. It is easily overlooked however, that these keyframe intervals also affect the latency on a large scale, as new viewers can not start viewing the stream unless they have received such a full frame — they have no data to base the different references on before this keyframe. This either causes new viewers to have to wait for the stream to be viewable, or, more often, causes new viewers to be delayed by a couple of seconds, merely because this was the latest available keyframe at the time they start viewing.

Playback latency

The protocol used both to the server hosting the stream and from the server to the viewers has a large amount of control over the latency in the entire process. With many vendors switching towards segment based protocols in order to allow for using widely available caching techniques, the requirement to buffer an entire segment before being able to send it to the viewer is introduced. In order to evade bandwidth overhead, these segments are usually multiple seconds in length, but even when considering smaller segment sizes, the different buffering regulations for these protocols and the players capable of displaying them causes an indeterminate factor of latency in the entire process.

While the most effective method of decreasing the latency introduced here is to avoid the use of these protocols where possible, on some platforms using segmented protocols is the only option available. In these cases, setting the correct segment size along with tweaking the keyframe interval is the best method to reduce the latency as much as possible. This segment size is configurable through the API in MistServer; even mid-stream if required.

Processing latency

Any processing done on the machine serving the streams introduces latency as well, though often to increase the functionality of your stream. A transmuxing system, for example, processes the incoming streams into the various different protocols needed to support all viewers, and to this purpose must maintain an internal buffer of some size in order to facilitate this. Within MistServer, this buffer is configurable through the API.

On top of this, for various protocols, MistServer employs some tricks to keep the stream as live as possible. To do this we monitor the current state of each viewer, and skip ahead in the live stream when they are falling behind. This ensures that your viewers observe as little latency as possible, regardless of their available bandwidth.

In the near future, the next release of MistServer will contain a rework of the internal communication system, removing the need to wait between data becoming available on the server itself, and the data being available for transmuxing to the outputs, reducing the total server latency introduced even further.

Our next post will be by Jaron, providing a deep technical understanding of our trigger system and the underlying processes behind it.

— Erik