In this article, I am going to cover one of the most common questions in the multimedia field: what is a standard and what does it cover? Once you have read this article, you will notice how often people, even the most influential ones, use approximate wording. This confusion leads to a lot of misunderstanding, so it’s good to review the subject.
A recurring source of confusion is the relationship between open-source software and open standards. In this article, I will take the AVC/H264 codec as an example.
Besides the scope of a standard, I’d like to explain how standards relate to one another. I will use the examples of MPEG-DASH and the ISOBMF/MP4 container.
Then I will show what the MPEG-DASH standard is really about.
What is a standard?
A standard is a common ground of understanding. According to Wikipedia, it takes the form of:
“a formal document that establishes uniform engineering or technical criteria, methods, processes and practices.”
Officially, the word “standard” refers to a formal document ratified by a standards organization. There are not that many of them (ISO, IEC, ITU). Other groups such as IETF or W3C publish recommendations, which do not have the same political recognition (e.g. some government agencies cannot use them).
In the broadcast multimedia field, a standard usually describes what a client (e.g. a decoder) should expect as its input, and sometimes how it shall react to this input. The standard does not describe the behaviour of the server (e.g. an encoder). A compliant implementation of a standard is a client which behaves correctly for any syntax described in the standard.
Note: in other fields, a standard may describe things differently. For example, IETF (Internet standards) defines the behaviour of the server. And W3C/HTML5 describes the expected behaviour of a decoder even for non-compliant content.
Typically, a standard should only focus on one layer of the multimedia system. A standard should not force you to combine a transport layer with a specific codec or network protocol. For example, MPEG-DASH is not bound to HTTP as a transport protocol or to H264/AVC as a codec. I’ll explain below how other documents take care of linking technologies together.
The standard should not impose any rule on the client implementation. But it is common to find informative parts or annexes, outside the normative text, that give guidelines and use-cases. For example, MPEG-DASH annex A is called “Example DASH client behaviour” and starts with “The information on client behaviour is purely informative and does not imply any normative procedures”. Annex G is called “(informative) MPD Examples and MPD Usage”.
A standard often comes with reference software and conformance streams. The exact policy depends on the standardization body.
A standard is open when it is made publicly available, is open to participation and doesn’t restrict you from implementing it (you may want to read this alternate review on openness). MPEG/ISO standards are either freely available or can be bought at the ISO Store. They are considered open. Some technologies from Adobe are not open and come without any specification: they are de facto standards due to their prominent position on the market, and implementing them requires reverse-engineering. Obviously, these are just examples.
Regardless of their openness, open standards may be covered by patents. Each standardization body has its own policy regarding patents. W3C requires its members to license the patents that are essential to its standards on a royalty-free basis. MPEG makes unofficial calls for patent holders, so that they can come forward or gather in so-called “patent pools”. The most famous patent pool is called MPEG-LA (note: MPEG-LA is an independent firm, not affiliated with MPEG).
Since MPEG standards are open, any open-source project can implement them. GPAC implements about 40 open standards. x264 implements H264/AVC (MPEG-4 Part 10), which simply means that x264’s output complies with the H264 syntax. You won’t be surprised when I tell you that the H264/AVC standard doesn’t say anything about the encoding process, which is implementation-specific.
More generally, open-source multimedia software doesn’t come from reverse-engineering, nor does it implement a free version of a standard: it conforms to the open standards. Some projects, such as x264, FFmpeg or GPAC, are even good at it!
Link between standards
Modern standards are layered. A technology often comes as a core standard, which usually does not depend on other standards.
Then there are two other types of documents:
- Extensions or complementary standards.
- Recommendations or profiles.
Extensions or complementary standards explain how to extend the standard for specific use-cases, or how to add new capabilities. Let’s take an example: what we call ‘MP4’ is built on a core standard called the ISO Base Media File Format (often referred to as ISOBMF or ISOBMFF). For storage, ISOBMF is in turn extended by other standards for MP4 files, 3GP files, Motion J2K, H264/AVC files, etc.
Note: you probably noticed that when you name a standard (e.g. MP4), you often refer to a set of standards (MPEG-4 Parts 12, 14 and 15, respectively ISOBMF, MP4 File Format and AVC File Format). That’s a convenient shortcut in life, but unfortunately it creates shortcuts in the understanding too.
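To make this layering a bit more concrete: the core ISOBMF layer defines a generic “box” structure (a 32-bit big-endian size followed by a four-character type), and the other parts define specific boxes on top of it. The Python sketch below only walks the top-level boxes of a byte buffer. The function names and the hand-built sample bytes are made up for this illustration; the sample is not a playable file.

```python
import struct


def iter_top_level_boxes(data: bytes):
    """Yield (box_type, payload) for each top-level box found in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if offset + 16 > len(data):
                raise ValueError(f"truncated box at offset {offset}")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError(f"corrupt box at offset {offset}")
        payload = data[offset + header:offset + size]
        yield box_type.decode("ascii", "replace"), payload
        offset += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a plain 32-bit size (illustration helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Hand-built sample: an 'ftyp' box (major brand, minor version,
# compatible brands) followed by an empty 'free' box.
ftyp_payload = b"isom" + struct.pack(">I", 0) + b"isom" + b"avc1"
sample = make_box(b"ftyp", ftyp_payload) + make_box(b"free", b"")

boxes = list(iter_top_level_boxes(sample))
box_types = [box_type for box_type, _ in boxes]
major_brand = boxes[0][1][:4].decode("ascii")
print(box_types, major_brand)
```

The walker knows nothing about what an ‘ftyp’ box means: that knowledge belongs to the layers above, which is exactly the point of having a core standard and extensions.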
Contrary to extensions, recommendations and profiles restrict the use of the standard. For example, let’s consider an MP4 file you want to play on an Apple device. MP4Box has a special ‘-ipod’ switch which organizes the ISOBMF container to be compatible with the Apple restrictions. Another example is to force the use of the H264/AVC codec with MPEG-DASH to ensure interoperability. ATSC and DVB also publish recommendations.
The MPEG-DASH case study
I recently watched a 45-minute video about MPEG-DASH. The speaker talked brilliantly about the MPEG-DASH structure with an XML playlist, periods, representations, adaptation sets, encryption, media segments, fragments, ISOBMF and MPEG2-TS, H264/AVC profiles and audio… This is not a description of MPEG-DASH; it is rather a description of test-cases for MPEG-DASH.
The MPEG-DASH specification (ISO/IEC 23009-1) mainly specifies the syntax of the XML playlist, called the MPD (Media Presentation Description); don’t worry, I cover the other points below. Because so much is left outside the playlist syntax, the list of possible test-cases is almost infinite.
As a consequence, I have good news: a compliance checker for MPEG-DASH is essentially just an XML checker. The other good news is that the XML Schema comes with the standard, which eases MPD validation.
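As a rough illustration of what “an XML checker” means here, the following Python sketch checks that a document is well-formed XML, that its root is an MPD element in the DASH namespace, and that a few required items are present. This is only a hand-picked subset: Python’s standard library cannot validate against an XML Schema, so a real check against the schema shipped with the standard needs a third-party validator. The function name and the sample MPD are made up for this example.

```python
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"

# Made-up minimal MPD used only to exercise the check below.
GOOD_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     profiles="urn:mpeg:dash:profile:isoff-on-demand:2011"
     minBufferTime="PT1.5S" mediaPresentationDuration="PT30S">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="v1" bandwidth="500000" codecs="avc1.42c01e"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def basic_mpd_check(mpd_text: str) -> list:
    """Return a list of problems; an empty list means the subset passed."""
    try:
        root = ET.fromstring(mpd_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    if root.tag != f"{{{DASH_NS}}}MPD":
        return [f"unexpected root element: {root.tag}"]
    problems = []
    # Only a subset of what the schema requires, chosen for illustration.
    for attr in ("profiles", "minBufferTime"):
        if attr not in root.attrib:
            problems.append(f"missing MPD attribute: {attr}")
    if root.find(f"{{{DASH_NS}}}Period") is None:
        problems.append("no Period element")
    return problems


print(basic_mpd_check(GOOD_MPD))
```

Note that nothing in this check looks at segments, codecs or the network: at this level, conformance is purely a matter of document syntax.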
To be clear, MPEG-DASH doesn’t even mandate the use of HTTP. Some people may want to implement it over other protocols on high-latency networks where the TCP connection would be reset (and they could write a complementary standard). Or one can write a new standard to explain how the WebM container is supposed to integrate with MPEG-DASH.
As stated above, the core of MPEG-DASH doesn’t deal with any network protocol, any codec, etc. However, the MPEG-DASH standard is a little more complex: besides its core functionality (describing the MPD), it also contains extensions (i.e. how to use it with HTTP, or with the MPEG2-TS and MP4 containers, etc.), plus some profiles/restrictions (Full, ISOBMF On Demand, ISOBMF Live, etc.). This leads to confusion in people’s minds.
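The profiles an MPD claims to follow are declared in the profiles attribute of its root element, as a comma-separated list of URNs. Here is a minimal Python sketch that reads them. The helper name and the sample MPD are made up for this example, and the label table only covers the three profiles named in this article.

```python
import xml.etree.ElementTree as ET

# Labels for the three profiles named in this article only.
PROFILE_LABELS = {
    "urn:mpeg:dash:profile:full:2011": "Full",
    "urn:mpeg:dash:profile:isoff-on-demand:2011": "ISOBMF On Demand",
    "urn:mpeg:dash:profile:isoff-live:2011": "ISOBMF Live",
}

# Made-up sample; the second URN is a hypothetical third-party profile.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     profiles="urn:mpeg:dash:profile:isoff-on-demand:2011, urn:example:hypothetical-profile"
     minBufferTime="PT2S" mediaPresentationDuration="PT10S">
  <Period/>
</MPD>"""


def declared_profiles(mpd_text: str) -> list:
    """Return a readable label for each profile URN declared by the MPD."""
    root = ET.fromstring(mpd_text)
    urns = [u.strip() for u in root.get("profiles", "").split(",") if u.strip()]
    return [PROFILE_LABELS.get(urn, f"unknown ({urn})") for urn in urns]


labels = declared_profiles(SAMPLE_MPD)
print(labels)
```

A profile identifier is only a claim made by the content author; checking that the content really respects the corresponding restrictions is a separate job.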
Since the MPEG-DASH standard was left open, a group of people from the industry (now called the DASH Industry Forum) defined their own profiles to establish an interoperability point for the broadcast world. Their documents constrain the use of the container, of the codec, of the level/profile of codecs, … Paradoxically, restrictions imply a more complex conformance checking process, because the restrictions themselves must be checked. Who said the devil is in the details?
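To see why restrictions make checking harder, here is a Python sketch of a restriction check that comes on top of the syntax check. The rule set (MP4 container and H264/AVC video only) is hypothetical and invented for this illustration: it is not the actual content of any published profile. The function name and the sample MPD are made up as well.

```python
import xml.etree.ElementTree as ET

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}

# HYPOTHETICAL restrictions, for illustration only.
ALLOWED_MIME_TYPES = {"video/mp4"}
ALLOWED_CODEC_PREFIXES = ("avc1.",)

# Made-up sample with one conforming and one non-conforming representation.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     profiles="urn:example:hypothetical-profile"
     minBufferTime="PT2S" mediaPresentationDuration="PT10S">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="v1" bandwidth="500000" codecs="avc1.42c01e"/>
    </AdaptationSet>
    <AdaptationSet mimeType="video/webm">
      <Representation id="v2" bandwidth="400000" codecs="vp09.00.10.08"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def check_restrictions(mpd_text: str) -> list:
    """Return one message per violation of the hypothetical rule set."""
    root = ET.fromstring(mpd_text)
    violations = []
    for aset in root.iterfind(".//dash:AdaptationSet", NS):
        for rep in aset.iterfind("dash:Representation", NS):
            rep_id = rep.get("id", "?")
            # These attributes may be set on the Representation or
            # inherited from the enclosing AdaptationSet.
            mime = rep.get("mimeType") or aset.get("mimeType")
            codecs = rep.get("codecs") or aset.get("codecs")
            if mime not in ALLOWED_MIME_TYPES:
                violations.append(f"{rep_id}: container {mime} not allowed")
            if not codecs or not codecs.startswith(ALLOWED_CODEC_PREFIXES):
                violations.append(f"{rep_id}: codec {codecs} not allowed")
    return violations


violations = check_restrictions(SAMPLE_MPD)
print(violations)
```

The sample MPD is perfectly valid as far as the MPD syntax goes; it only fails the extra rules. And this sketch stops at the playlist: a real check of such restrictions would also have to open the media segments themselves.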
The GPAC team and standardization
Taking part in standardization allows us to provide the first implementations and to gain a deep understanding of the standards. This is part of what makes GPAC such interoperable software.
Please leave a comment or contact us if you have any questions.