Video coding standards

Digital video has become mainstream and is used in a wide range of applications, including video streaming, digital TV and video storage. These applications rely on different video coding standards, which use efficient compression algorithms and techniques based on temporal and spatial redundancy, as I explained in the previous post. Nowadays the most widely used video standards are MPEG-2 for digital television and H.264 for video streaming. The new standard H.265, also called HEVC, will be used soon.

MPEG-2

MPEG-2 is a generic video and audio coding standard used for SDTV and HDTV broadcasting, among other applications. It uses three frame types to encode the video. A group of pictures (GOP) setting defines the pattern in which the three frame types are used:

  •  I-frame (intra coded): a reference picture that is coded independently of other pictures. Each GOP begins with an I-frame.
  •  P-frame (predictive coded): contains motion-compensated difference information from the preceding I or P-frame.
  •  B-frame (bidirectionally predictive coded): contains difference information from the preceding and following I or P-frame.
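
To make the GOP pattern more concrete, here is a minimal sketch in Python. The function names and the two parameters `n` (GOP length) and `m` (distance between I/P anchor frames) are my own choices for illustration, not something defined in this post. The first function builds the frame types in display order. The second shows one possible coding order, in which every B-frame comes after the two anchor frames it depends on.

```python
def gop_pattern(n: int, m: int) -> list[str]:
    """Frame types in display order for a GOP of n frames,
    with an anchor frame (I or P) every m frames."""
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive")
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")      # each GOP begins with an I-frame
        elif i % m == 0:
            types.append("P")      # predicted from the preceding I or P-frame
        else:
            types.append("B")      # predicted from the anchors on both sides
    return types


def coding_order(types: list[str]) -> list[int]:
    """Reorder display indices so that each B-frame follows both of its anchors.

    Simplification (assumption): trailing B-frames with no later anchor in
    this GOP are left at the end; in a real stream they would wait for the
    first I-frame of the next GOP.
    """
    order = []
    pending_b = []
    for idx, frame_type in enumerate(types):
        if frame_type == "B":
            pending_b.append(idx)
        else:
            order.append(idx)        # the anchor is coded first
            order.extend(pending_b)  # then the B-frames displayed before it
            pending_b.clear()
    order.extend(pending_b)
    return order


if __name__ == "__main__":
    pattern = gop_pattern(7, 3)
    print("".join(pattern))          # IBBPBBP
    print(coding_order(pattern))     # [0, 3, 1, 2, 6, 4, 5]
```

The reordering is the reason a decoder has to buffer frames: the P-frame at display position 3 must be decoded before the two B-frames that are shown before it.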

The elementary stream (ES) carries the video content and is organized in the following hierarchy:

  • Sequence: group of GOPs. The header contains the resolution, aspect ratio and bit rate. The optional sequence extension defines the profile and level, the type of frames and the chrominance format.
  • GOP: series of frames.
  • Frame: series of consecutive slices.
  • Slice: resynchronization unit which contains a series of consecutive macroblocks.
  • Macroblock: motion compensation unit which contains a group of blocks (four 8×8 luminance blocks covering 16×16 pixels, plus the chrominance blocks).
  • Block: DCT unit which contains 8×8 pixels.
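
The last two levels of this hierarchy can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names are hypothetical, only the luminance part of the macroblock is handled, and the DCT is the direct, unoptimized textbook formula rather than the fast version a real encoder would use.

```python
import math

N = 8  # size of the DCT block


def split_macroblock(mb):
    """Split a 16x16 luminance macroblock (16 rows of 16 values)
    into four 8x8 blocks: top-left, top-right, bottom-left, bottom-right."""
    if len(mb) != 16 or any(len(row) != 16 for row in mb):
        raise ValueError("expected a 16x16 macroblock")
    blocks = []
    for by in (0, 8):
        for bx in (0, 8):
            blocks.append([row[bx:bx + N] for row in mb[by:by + N]])
    return blocks


def dct_8x8(block):
    """2-D DCT-II of an 8x8 block with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


if __name__ == "__main__":
    flat = [[100] * 16 for _ in range(16)]   # a uniform grey macroblock
    coeffs = dct_8x8(split_macroblock(flat)[0])
    print(round(coeffs[0][0], 3))            # DC coefficient
```

For a perfectly flat block all the energy ends up in the DC coefficient and the remaining 63 coefficients are (numerically) zero, which is why smooth areas of a picture compress so well.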

MPEG-2 has different profiles and levels. The profiles define the subset of features, such as compression algorithm, chroma format, etc. The levels define the subset of quantitative capabilities, such as maximum bit rate, maximum frame size, etc. Each application uses a different profile and level; for example, the DVB standard uses the Main Profile at Main Level for TV broadcasting.
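
As a rough illustration of how a level constrains a stream, the sketch below picks the lowest level whose limits a given video fits into. The table holds the upper bounds commonly quoted for the MPEG-2 Main Profile; treat these numbers as an assumption to be checked against the specification, and note that the real constraints include more parameters (for example the luminance sample rate) than this simplified check. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelLimits:
    max_width: int
    max_height: int
    max_fps: int
    max_bitrate_mbps: int


# Commonly quoted Main Profile limits (assumption; verify against the spec).
# Ordered from the lowest to the highest level.
MAIN_PROFILE_LEVELS = {
    "Low": LevelLimits(352, 288, 30, 4),
    "Main": LevelLimits(720, 576, 30, 15),
    "High-1440": LevelLimits(1440, 1152, 60, 60),
    "High": LevelLimits(1920, 1152, 60, 80),
}


def lowest_level(width, height, fps, bitrate_mbps):
    """Return the name of the lowest level that fits, or None if none does."""
    for name, lim in MAIN_PROFILE_LEVELS.items():
        if (width <= lim.max_width and height <= lim.max_height
                and fps <= lim.max_fps
                and bitrate_mbps <= lim.max_bitrate_mbps):
            return name
    return None


if __name__ == "__main__":
    # A PAL standard-definition broadcast at 6 Mbit/s
    print(lowest_level(720, 576, 25, 6))
```

With these figures, a 720×576 broadcast at 25 frames per second falls into the Main Level, which matches the DVB choice mentioned above.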

MPEG-4 AVC/H.264

MPEG-4 was aimed primarily at low bit-rate video communications. However, it evolved into a multimedia coding standard with an object-oriented approach. It improves coding efficiency over MPEG-2 and adds error resilience to enable robust transmission.

H.264 uses both intra and inter prediction, with motion compensation blocks ranging from 16×16 down to 4×4 samples. The elementary stream uses the same structure as MPEG-2, from a DCT block up to a sequence, but it defines two new types of slices in addition to the I, P and B slices:

  •  SP slices contain I and P macroblocks.
  •  SI slices contain SI macroblocks (a special type of intra-coded macroblock).

H.264 defines the following profiles and layers:

  •  Baseline profile: used in live video.
  •  Main profile: used for standard-definition digital TV broadcasts.
  •  Extended profile: used for Internet multimedia applications.
  •  High profile: used for broadcast and disk storage applications.
  •  VCL (Video Coding Layer): it is designed to efficiently represent the video content.
  •  NAL (Network Abstraction Layer): it encapsulates the VCL representation of video with header information in such a way that a variety of transport layers and storage media can easily adopt compressed contents.
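
To show what the NAL header information looks like in practice, here is a small Python sketch that decodes the one-byte header at the start of every H.264 NAL unit. The three fields and their bit widths (1, 2 and 5 bits) come from the H.264 specification; the function names are my own, and the type table lists only a few common values rather than the full set.

```python
from typing import NamedTuple

# A few common nal_unit_type values (not the full table from the standard).
NAL_UNIT_TYPES = {
    1: "coded slice (non-IDR)",
    5: "coded slice (IDR)",
    6: "SEI",
    7: "sequence parameter set",
    8: "picture parameter set",
    9: "access unit delimiter",
}


class NalHeader(NamedTuple):
    forbidden_zero_bit: int  # 1 bit, must be 0 in a valid stream
    nal_ref_idc: int         # 2 bits, 0 means not used as a reference
    nal_unit_type: int       # 5 bits, identifies the payload


def parse_nal_header(first_byte: int) -> NalHeader:
    """Decode the first byte of an H.264 NAL unit."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    return NalHeader(
        forbidden_zero_bit=first_byte >> 7,
        nal_ref_idc=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


def is_vcl(header: NalHeader) -> bool:
    """Types 1 to 5 carry coded slice data, i.e. the VCL payload."""
    return 1 <= header.nal_unit_type <= 5


if __name__ == "__main__":
    for byte in (0x67, 0x68, 0x65, 0x41):
        h = parse_nal_header(byte)
        name = NAL_UNIT_TYPES.get(h.nal_unit_type, "other")
        print(f"0x{byte:02X}: {name}, VCL={is_vcl(h)}")
```

The bytes 0x67, 0x68 and 0x65 used in the example are the header values often seen at the start of an H.264 stream: the two parameter sets followed by the first IDR slice.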

H.265 – HEVC

H.265 is a draft video compression standard and the successor to H.264/MPEG-4 AVC. It aims to improve video quality and compression efficiency, at the cost of higher computational complexity. Some of the new features are:

  •  Intra prediction with 34 directional modes.
  •  Better motion vector compensation.
  •  Parallel entropy coding.
  •  Multiview video coding (3D).
  •  Large block structures.
  •  Alternative transforms.
  •  In-loop filtering.