Immersive Audio Coding

New Technologies Demand New Codecs

New concepts in spatial audio acquisition and processing require new concepts for the representation and data-efficient transmission of spatial audio. The traditional way to describe spatial audio for multimedia content is to use channel-based formats such as stereo, 5.1 or 7.1. The content producer has to render the content for a fixed playback setup before it is delivered to the consumer, as indicated by cut (c) in the figure below. Unfortunately, this concept does not meet the needs of today’s heterogeneous range of playback devices: the same content may be listened to on a smartphone, a traditional stereo hi-fi system, a surround sound system or headphones.

To overcome this problem, the producer may leave the loudspeaker rendering to the receiver, as shown by cut (a) in the figure. This concept is known as Spatial Audio Object Coding (SAOC). SAOC requires not only each source signal to be transmitted but also a parametric audio object description that tells the decoder how to render each sound source. As a consequence, the receiver can adapt to the playback setup very flexibly. However, a parametric description of the sound scene is not always possible, e.g., for recorded scenes, and may even be undesired.
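As a rough illustration of receiver-side rendering, the following Python sketch pans one audio object to whichever playback setup the receiver happens to have. It is not the SAOC algorithm itself: the function names, the stereo layout with loudspeakers at ±30°, and the constant-power pan law are all assumptions chosen for illustration.

```python
import math

def stereo_gains(azimuth_deg):
    """Constant-power gains for a hypothetical stereo pair at +/-30 degrees.

    Positive azimuth means 'towards the left loudspeaker'; values outside
    the loudspeaker span are clamped.
    """
    az = max(-30.0, min(30.0, azimuth_deg))
    # Map [-30, 30] degrees to a pan angle in [0, pi/2].
    theta = (az + 30.0) / 60.0 * (math.pi / 2.0)
    return math.sin(theta), math.cos(theta)  # (left, right)

def render_object(samples, azimuth_deg, layout):
    """Render one mono object for the receiver's layout (illustrative only)."""
    if layout == "mono":
        # A single loudspeaker cannot convey direction; pass the signal through.
        return [list(samples)]
    if layout == "stereo":
        g_left, g_right = stereo_gains(azimuth_deg)
        return [[g_left * s for s in samples],
                [g_right * s for s in samples]]
    raise ValueError(f"unsupported layout: {layout}")
```

The same transmitted pair of signal and azimuth metadata yields one output channel on a mono device and two on a stereo system, which is the flexibility described above.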

Higher Order Ambisonics (HOA) combines the advantages of both concepts by introducing an abstract sound field description that is independent of both the source and the loudspeaker signals. In the figure, this is illustrated by cut (b). HOA is based on a transformation (a spherical harmonic decomposition) of the directions of incidence of sound at a reference point. It is not only a powerful transform domain for versatile spatial audio processing but also a data-efficient and scalable spatial sound representation.
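To make the scalability concrete, here is a minimal Python sketch. A full 3D HOA representation of order N uses (N+1)^2 coefficient channels, and a lower-order version can be obtained by discarding the higher-order channels. The first-order encoder below assumes ACN channel ordering with SN3D normalization and an azimuth measured counter-clockwise from the front; other conventions exist, and all function names are hypothetical.

```python
import math

def hoa_channel_count(order):
    """Number of coefficient channels of a full 3D HOA signal of this order."""
    return (order + 1) ** 2

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one sample of a plane wave into first-order coefficients.

    Assumed convention: ACN ordering (W, Y, Z, X) with SN3D normalization.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                  # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)    # left-right
    z = sample * math.sin(el)                   # up-down
    x = sample * math.cos(az) * math.cos(el)    # front-back
    return [w, y, z, x]

def truncate_order(coefficients, order):
    """Scale the representation down by keeping only channels up to 'order'.

    This relies on ACN ordering, where channels are sorted by order.
    """
    return coefficients[:hoa_channel_count(order)]
```

The coefficients depend only on the direction of incidence, not on any loudspeaker layout; decoding to a particular setup is left to the receiver.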

Versatile 3D Audio Coding with MPEG-H

Audio codecs must serve a wide range of applications, including spatial recordings, gaming, video, teleconferencing and many more. To offer the greatest flexibility, modern immersive audio codecs such as MPEG-H 3D Audio have to support all three of the spatial audio representations presented above.

References

Herre, J., Hilpert, J., Kuntz, A. and Plogsties, J.:
MPEG-H 3D Audio: The New Standard for Coding of Immersive Spatial Audio
in: IEEE Journal of Selected Topics in Signal Processing, 2015, Vol. 9, No. 5.

Meltzer, S., Neuendorf, M., Sen, D. and Jax, P.:
MPEG-H 3D Audio – The Next Generation Audio
in: IBC Conference, Amsterdam, Netherlands, Sep. 2014.