Evolution of video interfaces

The existence of many different audio and video standards necessitates the definition of hardware interfaces, which define the physical characteristics of the connections between pieces of equipment.

A few years ago, composite video was the most common interface in homes, and component video in professional environments. However, these interfaces have been replaced by digital ones, with or without compression. The differences between the video interfaces are shown in the following tables, which compare their use, types of signals and other parameters.

Analog video interfaces

Digital video interfaces

Compressed video interfaces

Television signaling standards

Transmitters and receivers use signaling information to share information about the network, the frequencies of the multiplexes, the services, the channel guide… Each digital television broadcast standard uses different signaling tables and descriptors to provide information to the receiver, though all the signaling data is based on the MPEG-2 standard.

PSI (MPEG-2)

The Program Specific Information (PSI) contains metadata about the programs (channels) and is part of the MPEG-2 TS. The PSI data comprises the following tables:

  • PAT (Program Association Table, PID=0x00): location of all the programs contained in the TS. It shows the association between each program number and its PMT PID.
  • PMT (Program Map Table, tableId=0x02): PID numbers of the elementary streams (ES) associated with the program and information about the type of each ES (audio, video or data).
  • CAT (Conditional Access Table, PID=0x01, tableId=0x01): used for conditional access and provides the association with the EMM stream.
  • NIT (Network Information Table): information about the multiplexes and transport streams on a given network.
  • TSDT (Transport Stream Descriptor Table): information about services and associated MPEG-2 descriptors in the TS. Descriptors are used for additional information.
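
As a concrete illustration, here is a minimal sketch of how a receiver could parse a PAT section once the PSI section has been reassembled from TS packets. It follows the section layout from ISO/IEC 13818-1; `parse_pat` is a hypothetical helper name, and real code would also strip the TS packet header and pointer_field and verify the CRC32.

```python
# Minimal sketch of parsing an MPEG-2 PAT section (ISO/IEC 13818-1).
# Input is assumed to be one complete, already-reassembled PSI section.

def parse_pat(section: bytes) -> dict:
    table_id = section[0]                                 # 0x00 for a PAT
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    transport_stream_id = (section[3] << 8) | section[4]
    programs = {}
    # Program loop: from byte 8 to the end of the section minus the
    # 4 CRC32 bytes, 4 bytes per entry (program_number + PMT PID).
    for i in range(8, 3 + section_length - 4, 4):
        program_number = (section[i] << 8) | section[i + 1]
        pid = ((section[i + 2] & 0x1F) << 8) | section[i + 3]
        # program_number 0 maps to the NIT PID rather than a PMT.
        programs[program_number] = pid
    return {"table_id": table_id,
            "transport_stream_id": transport_stream_id,
            "programs": programs}
```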

SI (DVB)

The Service Information (SI) tables are additional tables used in the DVB standard to identify the services and the associated events contained in the TS. The most relevant signaling tables are:

  • NIT (Network Information Table): Information about the physical network such as network provider name, transmission parameters…
  • BAT (Bouquet Association Table): describes the program structure of several physical channels.
  • SDT (Service Description Table): describes the program structure of one physical channel.
  • EIT (Event Information Table): contains the electronic program guide (EPG). There are 4 EITs to cover a period of 12 hours.
  • TDT (Time and Date Table) and TOT (Time Offset Table): information about time and date.
  • RST (Running Status Table): allows rapid updating of the running status of one or more events.
  • ST (Stuffing Table): used to stuff (invalidate) sections during packetization.
  • Other tables:
    • DIT (Discontinuity Information Table) and SIT (Selection Information Table) are used in storage environments.
    • AIT (Application Information Table) is used in interactive applications.
    • INT (IP/MAC Notification Table) is used in transmission.
    • UNT (Update Notification Table) is used for System Software Updates.
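
To make the relationships concrete, here is a small sketch mapping the standard DVB PID and table_id values (from ETSI EN 300 468) to the SI tables above; the dictionary covers only a few common cases.

```python
# Sketch: identifying a DVB SI table from its PID and table_id
# (values taken from ETSI EN 300 468).
DVB_SI_TABLES = {
    (0x0010, 0x40): "NIT (actual network)",
    (0x0010, 0x41): "NIT (other network)",
    (0x0011, 0x42): "SDT (actual TS)",
    (0x0011, 0x46): "SDT (other TS)",
    (0x0011, 0x4A): "BAT",
    (0x0012, 0x4E): "EIT (present/following, actual TS)",
    (0x0013, 0x71): "RST",
    (0x0014, 0x70): "TDT",
    (0x0014, 0x73): "TOT",
}

def identify_si_table(pid: int, table_id: int) -> str:
    return DVB_SI_TABLES.get((pid, table_id), "unknown/private")

print(identify_si_table(0x0011, 0x42))   # SDT (actual TS)
```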

PSIP (ATSC)

PSIP is the protocol used in the ATSC television standard in the United States, Canada and Mexico. It is based on MPEG-2 to encode the content but defines new signaling tables:

  • STT (System Time Table): current time.
  • MGT (Master Guide Table): data pointers to other PSIP tables.
  • VCT (Virtual Channel Table): defines each virtual channel and enables EITs to be associated with the channel.
  • RRT (Rating Region Table): content ratings for each country or region.
  • EIT (Event Information Table): titles and program guide data.
  • ETT (Extended Text Table): detailed descriptions of channels and aired events.
  • DCCT (Directed Channel Change Table): allows the broadcaster to direct the receiver to change channels automatically.
  • DCCSCT (Directed Channel Change Selection Code Table): updates the states, counties and program genres used in DCCT tables.

SI/PSI (ISDB)

ISDB is the Japanese digital TV broadcast standard. It is based on MPEG-2 and uses some of the PSI tables, but it also defines new ones:

  •  PSI: PMT, CAT and PAT with specific descriptors.
  •  Equivalents to the DVB SI tables with specific descriptors: NIT, SDT, BAT, EIT, TDT, RST, TOT and ST.
  •  ISDB-Tb:
    • PCAT (Partial Content Announcement Table): conveys partial content announcements for data broadcasting.
    • BIT (Broadcaster Information Table): used to submit broadcaster information on the network.
    • NBIT (Network Board Information Table): board information on the network, such as a guide.
    • LDT (Linked Description Table): used to link various descriptions from other tables.

The Japanese ISDB standard also uses extended SI information to describe local events and their information: the LIT (Local Information Table), the ERT (Event Relation Table) and the ITT (Index Transmission Table). It also defines new descriptors to add new functionalities.

Signal tables in the DTMB standard

DTMB is the standard used in China, Hong Kong and some Middle East countries. It is similar to DVB in terms of service information, but there are some differences in the transmission parameters, as you can see in the previous post.

Digital television broadcast standards

Nowadays many countries are replacing analog broadcast television with digital television. Digital standards use narrower-bandwidth signal transmission, which makes it possible to fit more channels into a given range of frequencies and to achieve higher resolutions. Different regions of the world are at different stages of adoption and are implementing different broadcasting standards. DVB is the suite of open standards used in Europe, with variations used in Japan (ISDB) and the USA (ATSC). Other countries, like China (DTMB), developed their own standards.

DVB

DVB defines a suite of standards using different coding and modulation techniques to allow the transmission of the signal in different environments and conditions. The standard was developed in Europe and is internationally accepted in most countries. The suite of standards for digital television includes:

  • DVB-S: DVB for satellite television. It provides error protection to the MPEG-2 TS and adapts it to the channel characteristics. DVB-S uses TDMA with a single carrier and QPSK modulation.
  • DVB-T: DVB for digital terrestrial television. It transmits the MPEG-2 TS using COFDM modulation to reduce the ISI and the fading effects that appear in terrestrial communications. Several parameters can be chosen for a DVB-T transmission channel, such as the bandwidth (6, 7 or 8 MHz) and the operation mode (2K or 8K).
  • DVB-C: DVB over cable. It is similar to DVB-S, with 64QAM modulation and without adding error protection to the MPEG-2 TS, due to the channel characteristics.
  • DVB-H: DVB for handhelds. It uses the 4K COFDM mode to balance the low energy consumption of handheld devices with robustness to mobility.
  • DVB-SH: DVB satellite services to handhelds. The satellite downlink guarantees rural coverage and the terrestrial downlink is used in urban environments.
  • DVB-S2: DVB-Satellite 2nd generation. It is the successor to the DVB-S system. It includes enhanced modulation schemes (QPSK, 8PSK, 16APSK and 32APSK) and higher bitrates.
  • DVB-T2: DVB-Terrestrial 2nd generation. It is the extension of the DVB-T standard with higher bitrates and better usage of the spectrum. It uses OFDM with a large number of sub-carriers and several modes.
  • DVB-T2 Lite: a new profile of DVB-T2 for very low capacity applications such as mobile broadcasting. It is based on a limited subset of the modes of the T2 profile, avoiding the modes that require the most complexity and memory.

ISDB

ISDB-T is the Japanese standard for digital TV. It is an extension of the DVB-T standard: it uses MPEG-2 and the same coding and COFDM modulation as DVB-T. This standard enables hierarchical transmission, which allows partial reception for mobile TV.

ATSC

ATSC is a set of standards for digital television used in the USA, Canada and Mexico. It also uses MPEG-2 coding like DVB, but with new modulation techniques. The stream can be modulated with 8VSB (terrestrial) or 16VSB (cable TV), which consists of modulating a sinusoidal carrier to one of eight or sixteen levels, allowing high spectral efficiency and impulse-noise immunity. ATSC signals use a 6 MHz bandwidth and achieve a throughput of 19.4 Mbps.
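
To make the 8VSB idea concrete, here is a rough sketch of just the symbol-mapping stage, taking 3 bits at a time and selecting one of eight amplitude levels. The real ATSC pipeline also includes randomization, Reed-Solomon coding, interleaving, trellis coding, a pilot and VSB filtering, all omitted here.

```python
# Sketch: the 8VSB symbol-mapping idea only. Each 3-bit group selects
# one of eight carrier amplitude levels.

LEVELS = [-7, -5, -3, -1, 1, 3, 5, 7]

def map_8vsb(bits):
    """Map a bit list (length a multiple of 3) to 8VSB levels."""
    symbols = []
    for i in range(0, len(bits), 3):
        value = (bits[i] << 2) | (bits[i + 1] << 1) | bits[i + 2]
        symbols.append(LEVELS[value])
    return symbols

print(map_8vsb([1, 1, 0, 0, 0, 1]))   # [5, -5]
```

At 3 bits per symbol and roughly 10.76 Msymbols/s in a 6 MHz channel, this is where the 19.4 Mbps payload figure comes from once the coding overhead is subtracted.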

DTMB

DTMB is the TV standard used in China, Hong Kong and the Middle East. The system uses advanced technologies such as a pseudo-random noise code, low-density parity-check (LDPC) coding to protect against errors, and TDS-OFDM modulation. The system gives flexibility to the services offered: different modes and parameters can be chosen depending on the type of service and network.

Video coding standards

Digital video has become mainstream and is used in a wide range of applications, including video streaming, digital TV and video storage. These applications use different video coding standards, which rely on efficient video compression algorithms and techniques based on temporal and spatial redundancy, as I explained in the previous post. Nowadays the most used video standards are MPEG-2 for digital television and H.264 for video streaming. The new H.265 standard, also called HEVC, will be used soon.

MPEG-2


MPEG-2 is the generic video and audio coding standard for TV broadcasting of SDTV and HDTV, among other applications. It uses three frame types to group the video. A group of pictures (GOP) setting defines the pattern of the three frame types used:

  •  I-frame (intra coded): the reference picture, which is independent of other pictures. Each GOP begins with an I-frame.
  •  P-frame (predictive coded): contains motion-compensated difference information from the preceding I- or P-frame.
  •  B-frame (bidirectionally predictive coded): contains difference information from the preceding and following I- or P-frame.
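
Because a B-frame needs both its past and future anchor decoded before it can be reconstructed, the transmission (decode) order of a GOP differs from its display order: each I- or P-frame must be sent before the B-frames that precede it in display order. A minimal sketch of the reordering, assuming a simple GOP pattern:

```python
# Sketch: reordering a GOP from display order to decode/transmission
# order. Each B-frame needs both surrounding anchors (I/P) decoded
# first, so anchors are sent ahead of the B-frames that precede them
# in display order.

display_order = list("IBBPBBPBB")      # a common GOP pattern

def to_decode_order(gop):
    decode, pending_b = [], []
    for frame in gop:
        if frame in ("I", "P"):        # anchor: emit it, then the
            decode.append(frame)       # B-frames that were waiting
            decode += pending_b
            pending_b = []
        else:
            pending_b.append(frame)
    # Trailing B-frames of an open GOP really wait for the next GOP's
    # I-frame; they are appended here for simplicity.
    return decode + pending_b

print(to_decode_order(display_order))
# ['I', 'P', 'B', 'B', 'P', 'B', 'B', 'B', 'B']
```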

The elementary stream (ES) contains the video content and is organized into the following layers:

  • Sequence: group of GOPs. The header contains the resolution, aspect ratio and bitrate. The optional sequence extension defines the profile and level, the type of frames and the chrominance format.
  • GOP: series of frames.
  • Frame: series of consecutive slices.
  • Slice: resynchronization unit which contains a series of consecutive macroblocks.
  • Macroblock: motion compensation unit which contains a group of 4 blocks (16×16 pixels).
  • Block: DCT unit of 8×8 pixels.
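
As a quick worked example of how these layers map onto a frame, assuming a 720×576 (PAL) picture with 4:2:0 chrominance:

```python
# Worked example: layer sizes for a 720x576 (PAL) frame.
width, height = 720, 576
mb_cols, mb_rows = width // 16, height // 16     # 45 x 36 macroblocks
macroblocks = mb_cols * mb_rows                  # 1620 macroblocks/frame
luma_blocks = macroblocks * 4                    # 4 8x8 Y blocks each
chroma_blocks = macroblocks * 2                  # 1 Cb + 1 Cr in 4:2:0
print(macroblocks, luma_blocks + chroma_blocks)  # 1620 9720
```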

MPEG-2 has different profiles and levels. The profiles define a subset of features such as the compression algorithm, chroma format, etc. The levels define a subset of quantitative capabilities such as maximum bit rate, maximum frame size, etc. Each application uses a different profile and level; for example, the DVB standard uses the Main Profile and Main Level for TV broadcasting.

MPEG-4 AVC/H.264


MPEG-4 was aimed primarily at low bit-rate video communications. However, it evolved into a multimedia coding standard with an object-oriented approach. It improves coding efficiency over MPEG-2 and adds error resilience to enable robust transmission.

H.264 uses both intra and inter prediction with motion compensation blocks ranging from 16×16 down to 4×4 samples. The elementary stream uses the same structure as MPEG-2, from the DCT block up to the sequence, but it defines two new types of slices in addition to the I, P and B slices:

  •  SP slices: contain P and/or I macroblocks and facilitate switching between coded streams.
  •  SI slices: contain SI macroblocks (a special type of intra-coded macroblock) and serve the same switching purpose.

H.264 defines the following profiles and layers:

  •  Baseline profile: used in live video.
  •  Main profile: used for standard-definition digital TV broadcasts.
  •  Extended profile: used for Internet multimedia applications.
  •  High profile: used for broadcast and disk storage applications.
  •  VCL (Video Coding Layer): it is designed to efficiently represent the video content.
  •  NAL (Network Abstraction Layer): it encapsulates the VCL representation of video with header information in such a way that a variety of transport layers and storage media can easily adopt compressed contents.
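
As an illustration of how little a transport needs to inspect, here is a sketch of parsing the one-byte NAL unit header defined by H.264: a 2-bit importance field and a 5-bit type let the transport handle the payload without parsing the VCL data. The type-name table shows only a small subset of the defined types.

```python
# Sketch: parsing the first byte of an H.264 NAL unit.

NAL_TYPES = {1: "non-IDR slice", 5: "IDR slice", 7: "SPS", 8: "PPS"}

def parse_nal_header(first_byte: int) -> dict:
    return {
        "forbidden_zero_bit": first_byte >> 7,       # must be 0
        "nal_ref_idc": (first_byte >> 5) & 0x3,      # 0 = discardable
        "nal_unit_type": first_byte & 0x1F,
        "type_name": NAL_TYPES.get(first_byte & 0x1F, "other"),
    }

print(parse_nal_header(0x67))   # 0x67 -> nal_ref_idc 3, type 7 (SPS)
```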

H.265 – HEVC

H.265 is a draft video compression standard and the successor to H.264/MPEG-4 AVC. H.265 improves video quality and coding efficiency at the cost of higher complexity. Some of the new features are:

  •  Intra prediction with 34 directional modes.
  •  Better motion vector compensation.
  •  Parallel entropy coding.
  •  Multiview video coding (3D).
  •  Large block structures.
  •  Alternative transforms.
  •  In-loop filtering.

Video coding techniques

Video takes up a lot of space: one minute of uncompressed video can take tens of gigabytes. Because of that, video must be compressed. Usually lossy compression is more useful, because a relatively large amount of data can be discarded before you start noticing a difference. However, there are many factors to take into consideration, such as the video quality, the quantity of data needed, the robustness to data losses and errors, the complexity of the encoding and decoding algorithms…
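
A quick back-of-the-envelope check of that claim, assuming uncompressed 8-bit RGB at 1080p and 30 frames per second; 4K resolutions or higher frame rates push the figure several times higher:

```python
# Storage needed for one minute of uncompressed 1080p30 RGB video.
width, height, fps = 1920, 1080, 30
bytes_per_pixel = 3                           # 8-bit R, G and B samples
bytes_per_minute = width * height * bytes_per_pixel * fps * 60
print(bytes_per_minute / 1e9)                 # ~11.2 GB per minute
```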

There are several techniques used to reduce the temporal and spatial redundancy of a video sequence:

  • Reduce the number of bits used for quantization.
  • Reduce the color resolution. Usually 4:2:2 or 4:2:0 chroma subsampling is used.
  • Compress the image in the frequency domain using the DCT.
  • Motion compensation techniques.
  • Variable-length codes based on the probability of occurrence of each possible source symbol.
  • Use predictive coding techniques that exploit the similarity of adjacent data.


Spatial coding

Every frame is compressed in the frequency domain. Once each frame is compressed, the series of frames is further compressed using predictive coding and motion compensation techniques. The steps to encode each frame can be summarized as:

  1. The frame is divided into blocks, usually 8×8 pixel blocks.
  2. Every block is separately encoded in the frequency domain using the DCT, a domain where the representation is more compact. The result is an 8×8 transform coefficient array in which the top-left element is the DC component (zero frequency) and entries with increasing vertical and horizontal index values represent higher vertical and horizontal spatial frequencies.
  3. Every 8×8 coefficient array is quantized using quantization tables. A different quantization step is applied to each transform coefficient, based on visual perception. As a result, the high-frequency area (bottom-right) contains many coefficients set to zero, which can be compressed efficiently.
  4. The 8×8 quantized blocks are scanned using zigzag scanning.
  5. The coefficients are transmitted using run-length encoding or Huffman coding (see the sketch below).
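
Here is a minimal sketch of steps 4 and 5, assuming the block has already been transformed and quantized: `zigzag` orders the coefficients so the trailing zeros cluster together, and `run_length` encodes the AC coefficients as (zero run, value) pairs. Both helper names are illustrative, not from any particular codec.

```python
# Sketch: zigzag scan of an 8x8 quantized DCT block followed by
# run-length encoding of the AC coefficients.

def zigzag(block):
    n = len(block)
    coords = [(i, j) for i in range(n) for j in range(n)]
    # Walk the anti-diagonals, alternating direction on each one.
    coords.sort(key=lambda c: (c[0] + c[1],
                               c[0] if (c[0] + c[1]) % 2 else -c[0]))
    return [block[i][j] for i, j in coords]

def run_length(coeffs):
    out, run = [], 0
    for v in coeffs[1:]:              # coeffs[0] is the DC coefficient
        if v == 0:
            run += 1
        else:
            out.append((run, v))      # (zeros skipped, non-zero value)
            run = 0
    out.append("EOB")                 # end of block: the rest is zeros
    return out

# A mostly-zero quantized block, as produced by step 3.
block = [[16, -3, 0, 0, 0, 0, 0, 0],
         [-2, 1, 0, 0, 0, 0, 0, 0]] + [[0] * 8 for _ in range(6)]
print(run_length(zigzag(block)))      # [(0, -3), (0, -2), (1, 1), 'EOB']
```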


Temporal coding

Video sequences present a large amount of temporal redundancy. Usually neighboring frames (the previous and the next frame) are almost the same, so a frame can be predicted using the closest frames. Different prediction techniques are used:

  • Intra mode prediction: the image is coded using spatial coding with DCT coefficients.
  • Inter mode prediction: the motion in the scene is estimated using previously decoded images.

The motion estimation process consists of:

  1. Dividing the image into macroblocks of 16×16 pixels.
  2. Searching for each macroblock in the reference image. This process is known as block matching. If backward prediction is used (prediction from an earlier frame), all pixels in the current image have a reference in the previous one, but if forward prediction is used, some pixels in the current image may not have a reference in the previous one.
  3. Extracting the motion vectors and the DCT error coefficients. Motion vectors provide an offset from the coordinates of the macroblock in the decoded picture to the coordinates in the reference frame. The difference between the macroblock in the decoded picture and the reference is coded using the DCT; the result is the DCT error coefficients.
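
Below is a minimal sketch of step 2 (block matching) for a single macroblock, assuming grayscale frames stored as NumPy arrays: it exhaustively scans a ±8 pixel window in the reference frame and keeps the candidate with the lowest sum of absolute differences (SAD). Real encoders use much faster, suboptimal search strategies.

```python
# Sketch: exhaustive block matching for one 16x16 macroblock.
import numpy as np

def best_motion_vector(cur, ref, top, left, size=16, search=8):
    block = cur[top:top + size, left:left + size].astype(np.int32)
    best_mv, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > ref.shape[0] \
                    or x + size > ref.shape[1]:
                continue               # candidate falls outside the frame
            cand = ref[y:y + size, x:x + size].astype(np.int32)
            sad = np.abs(block - cand).sum()
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    # The residual (block minus best candidate) is what step 3
    # encodes with the DCT as error coefficients.
    return best_mv, best_sad
```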


All the video coding standards, such as MPEG-2 or MPEG-4, use both temporal and spatial coding combined with advanced coding techniques. The most used video standards are detailed in the next post.

Control of quality in the control center

The control center is the core of an audiovisual production studio. Its main functions consist of routing all the audio and video signals, synchronization, and the analysis and quality control of all audio and video signals. It includes the direction control equipment, audio control equipment, editing systems, post-production systems, continuity control and control center equipment.

The equipment of the control center is:

  •  Switching matrix: connects N inputs with M outputs.
  •  Quality monitoring systems: vectorscope and waveform monitor to check the signals.
  •  Processor and standards conversion: digital processing to change one type of TV signal to another (50 Hz to 60 Hz, 625 lines to 525 lines, 4:3 to 16:9).
  •  Network equipment: satellite receivers and other receiver systems to code, modulate and send the signals.

Measures of the video signal

Several testing instruments are used to verify the content of the video signal:

  • Waveform monitor: checks the level of each video signal component (luminance) with respect to time. It also uses the black burst to synchronize the signal.
  • Vectorscope: displays an X-Y plot of the B-Y and R-Y signals to visualize the chrominance. It is used to check hue and saturation variations.
  • Lightning display: checks the gain and delays between R-Y and B-Y, which cause hue variations.
  • Diamond display: detects illegal colors outside the RGB color gamut.
  • Arrowhead display: shows a representation of the Y signal and the modulated R-Y and B-Y signals. It is used to check the color values after modulation in the PAL standard.

Measure of transmission parameters

When the signal is transmitted through the equipment and outside the control center, the signals degrade. Because of this degradation and the limitations of the system, some parameters have to be verified with different measurements:

  • Eye pattern, measured on a waveform monitor, to analyze the duration, amplitude, synchronization with the system clock, noise, BER… of the digital signal.
  • Jitter measurement, which gives a value of the phase modulation caused by temporal instability.
  • BER measurements using the auxiliary data from lines 5 and 318.

Audio measurements

Like the video signal, the audio signal can suffer distortions and degradations. Testing equipment is used to normalize the audio levels from different sources, to detect distortions and to measure the sound pressure:

  • VU meter: measures the sound pressure.
  • PPM: detects peaks of the sound pressure that can produce distortions in the audio signal.
  • Loudness measurement: normalizes the audio levels across digital TV, radio, program changes, commercials… The EBU R 128 standard normalizes loudness in broadcasting systems.


Microsoft Kinect in healthcare

Microsoft Kinect has transformed the way people interact with technology. The combination of the camera, the depth sensor and the microphone enables new applications and functionalities such as 3D motion capture or speech recognition. New applications are emerging.

In the medical field, the Kinect sensor offers great possibilities for the treatment and prevention of disease, illness and injury. Most applications in the medical field, especially in eHealth (healthcare through the use of technology), use the Kinect to track the patient's movements for rehabilitation or monitoring.

Physical Therapy and Rehabilitation

Red Hill Studios, together with the University of California, is researching how Kinect can help people with Parkinson's disease. They have developed specialized games to improve the gait and balance of people with functional impairments and diseases.

The Home Training System for Rehab Patients (HTSRP) monitors and captures the physical exercises of the patient using a Kinect. It presents a 3D version of the person, analyzes the movements to detect whether the physical exercises are performed correctly, and gives visual feedback on the screen. Jintronix, like HTSRP and other companies, performs rehabilitation in a virtual environment.

Telemedicine

Medical care and diagnosis can be performed remotely for people who live far from the hospital. A few months ago Microsoft launched InAVate, a platform which allows group therapy sessions using avatars.

The Collaboration and Annotation of Medical Images (CAMI) project of the University of Arkansas uses Microsoft Kinect to capture the data, which can be checked on a Windows device such as a tablet, phone or computer.

The University of South Florida developed a robotic telemedicine system that uses the Kinect sensor to map the environment and plan a path and trajectory. The telemedicine platform also provides video communication between doctor and patient through the Kinect's camera and microphone.

Remote patient monitoring

Remote patient monitoring and telehealth are used for chronic diseases like heart disease, diabetes or asthma. Cognitiont's Global Technology is a remote health monitoring solution for multiple sclerosis physiotherapy at home.

Medical applications

Microsoft Kinect is also used as a hands-free tool to control medical imaging equipment. Toronto's Sunnybrook Hospital uses Kinect to move and zoom X-ray and MRI images without touching the screen or leaving the sterile area, which is very useful for surgeons and interventional radiologists. Getstix has also developed a touchless gestural interface for surgeons and interventional radiologists.

Researchers from the University of Konstanz have taken a step forward with the NAVI system. They use the Kinect camera and depth sensor with a laptop, a vibrating belt and a Bluetooth headset to help blind people avoid obstacles.