Parametric stereo

History

Background

Advanced Audio Coding Low Complexity (AAC LC) combined with Spectral Band Replication (SBR) and Parametric Stereo (PS) is defined as HE-AAC v2. An HE-AAC v1 decoder produces only mono output when decoding an HE-AAC v2 bitstream. Parametric Stereo performs sparse coding in the spatial domain, somewhat as SBR does in the frequency domain.

An HE-AAC v2 bitstream is produced by downmixing the stereo audio to mono at the encoder and transmitting it along with 2–3 kbit/s of side information (the Parametric Stereo information), which describes the spatial intensity stereo generation and ambience regeneration to be performed at the decoder. From the mono audio stream and this side information, the decoder (player) can regenerate a faithful spatial approximation of the original stereo panorama at very low bitrates. Because only one audio channel is transmitted, almost all of the bitrate is spent on that single mono channel, so a signal coded at 24 kbit/s with Parametric Stereo has substantially better perceived quality than discrete stereo coded by conventional means at a similar bitrate.

However, the technique is useful only at the lowest bitrates (approximately 16–48 kbit/s, and down to 14.4 kbit/s in xHE-AAC as used in DRM), where it gives a good stereo impression. It generally does not achieve transparency, because the parametric simulation of the stereo dynamics is inherently limited and introduces some degradation regardless of the bitrate.
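
The bitrate argument above can be made concrete with some simple arithmetic. The following is a minimal sketch, not a model of any real encoder: it assumes a 24 kbit/s total, 3 kbit/s of PS side information (the upper end of the range quoted above), and an even split between the two channels in the discrete case, which real stereo coders do not necessarily use. The function name `core_bitrate_per_channel` is hypothetical.

```python
def core_bitrate_per_channel(total_kbps, side_info_kbps, coded_channels):
    """Bitrate (kbit/s) left for each coded audio channel after side info."""
    return (total_kbps - side_info_kbps) / coded_channels

# Discrete stereo: two coded channels, no PS side information.
# Assumption: the budget is split evenly between the channels.
discrete = core_bitrate_per_channel(24.0, 0.0, 2)

# Parametric stereo: one coded mono channel plus 3 kbit/s of side information.
parametric = core_bitrate_per_channel(24.0, 3.0, 1)

print(f"discrete stereo:   {discrete} kbit/s per channel")
print(f"parametric stereo: {parametric} kbit/s for the mono channel")
```

Under these assumptions the mono channel receives 21 kbit/s, against 12 kbit/s per channel for discrete stereo, which is the intuition behind the quality gain at very low bitrates.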

Development

Parametric Stereo was developed out of the need to further improve audio coding efficiency for low-bandwidth stereo media. It has gone through various iterations and improvements, but it was first standardized as an algorithm when it was included in the feature set of MPEG-4 Audio.^[1] Parametric Stereo was originally developed in Stockholm, Sweden, by the companies Philips and Coding Technologies, and was first unveiled in Naples, Italy, in 2004 at the 7th International Conference on Digital Audio Effects (DAFx'04).^[2]

Approaches

The implementation in MPEG-4 specifies, for each frequency band of the downmixed mono audio, the relative level, delay, and correlation (coherence) of the left and right channels. Transient signals receive special handling, as the approach would otherwise cause unacceptable delays. Compared to intensity stereo coding, which does not record delay or correlation, PS can provide more ambience.^[2]
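
To illustrate the three kinds of parameters named above, here is a toy analysis of a single band. It is a sketch under simplifying assumptions: the band is represented by real-valued time-domain samples, the delay is an integer number of samples found by brute-force cross-correlation, and nothing is quantized. The standardized tool's filterbank, parameter quantization, and bitstream syntax are not reproduced here, and the names `ps_parameters` and `downmix` are hypothetical.

```python
import math

def ps_parameters(left, right, max_lag=8):
    """Toy per-band analysis: (level difference in dB, lag in samples, coherence).

    A positive lag means the right channel is delayed relative to the left.
    """
    eps = 1e-12  # guards against division by zero for silent bands
    e_l = sum(x * x for x in left)
    e_r = sum(x * x for x in right)
    level_db = 10.0 * math.log10((e_l + eps) / (e_r + eps))

    n = len(left)
    norm = math.sqrt((e_l + eps) * (e_r + eps))
    best_lag, best_corr = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping region.
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        corr = acc / norm
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return level_db, best_lag, best_corr

def downmix(left, right):
    """Passive mono downmix: the average of the two channels."""
    return [0.5 * (l + r) for l, r in zip(left, right)]

# Test signals: a sine band, a half-amplitude copy, and a delayed half-amplitude copy.
left = [math.sin(2 * math.pi * i / 16) for i in range(64)]
right_scaled = [0.5 * v for v in left]
right_delayed = [0.0] * 3 + [0.5 * v for v in left[:-3]]

print(ps_parameters(left, right_scaled))
print(ps_parameters(left, right_delayed))
```

An encoder along these lines would transmit `downmix(left, right)` as the coded audio and the three values per band as side information. Intensity stereo, by comparison, corresponds to keeping only the level difference.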

Modifications to PS continue to be proposed.

A 2006 conference report describes ways to mitigate the loss of amplitude in downmixing.^[3]
A 2009 paper adds pilot-based coding to PS.^[4]
A 2011 conference paper describes the use of additional "residual information" to record and eliminate PS artifacts.^[5]
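
The amplitude loss in downmixing mentioned in the first item can be shown with a small example. This sketch only demonstrates the problem for a passive average downmix, which is an assumption made for illustration. It does not reproduce the mitigation described in the 2006 report, and the helper names are hypothetical.

```python
import math

def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)

def passive_downmix(left, right):
    """Average the two channels sample by sample."""
    return [0.5 * (l + r) for l, r in zip(left, right)]

n = 64
left = [math.sin(2 * math.pi * i / 16) for i in range(n)]

# Three right channels with the same energy as the left but different phase.
same_phase = list(left)
quadrature = [math.cos(2 * math.pi * i / 16) for i in range(n)]
opposite_phase = [-v for v in left]

e_same = energy(passive_downmix(left, same_phase))
e_quad = energy(passive_downmix(left, quadrature))
e_opposite = energy(passive_downmix(left, opposite_phase))

print(energy(left), e_same, e_quad, e_opposite)
```

Although each input channel has the same energy in all three cases, the downmix keeps all of it when the channels are in phase, about half when they are 90 degrees apart, and none when they are in opposite phase. A decoder cannot restore signal content that cancelled in the mono downmix, which is why such losses have to be addressed at the encoder.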

MPEG Surround uses a technique related to PS.
