ISO/IEC 23003-2:2010
Information technology — MPEG audio technologies — Part 2: Spatial Audio Object Coding (SAOC)
ISO/IEC 23003-2:2010 specifies the reference model of MPEG Spatial Audio Object Coding (SAOC): an efficient parametric coding technology designed to encode, transmit, and interactively render multiple audio objects for playback with various kinds of channel configurations (mono, stereo, 5.1, headphones/binaural). Rather than performing a discrete coding of the individual audio input signals, MPEG SAOC captures the perceptually relevant properties of audio signals into a compact set of parameters that are used to synthesize a flexibly rendered audio scene from a transmitted downmix signal.
MPEG SAOC extends MPEG Surround in a way that provides several significant advantages in terms of additional functionality available to users. It allows the user on the decoding side to interactively control the multi-channel rendering of each individual audio object on different kinds of sound reproduction setups. In addition, MPEG SAOC inherits many advantages of MPEG Surround technology, such as the backward-compatible transmission of complex multi-object audio content at bitrates not much higher than those required for its mono or stereo downmix. MPEG SAOC processing effectively reuses the multi-channel rendering functionality of MPEG Surround in a computationally efficient manner.
Therefore, MPEG SAOC technology can be directly used to extend MPEG Surround and upgrade existing distribution infrastructures for stereo or mono audio content (teleconferencing systems, music downloads, Internet streaming, etc.) towards the delivery of audio content while retaining full compatibility with existing receivers. Rendering can be interactively controlled by the end-user and is independent of the playback system setup.
Key features of MPEG SAOC are:
- interactive rendering of audio objects on the decoder/receiver side;
- the transmitted SAOC bit stream is independent of the loudspeaker (or headphones) configuration;
- low-power processing mode (e.g. for applications on portable devices);
- low-delay processing mode (e.g. for communication applications);
- flexibly selectable bitrate overhead, allowing scalability from low bitrate applications such as Internet streaming to high-quality applications such as custom remix of music;
- it can be applied to audio coded with any coding scheme;
- backward compatibility: the default downmix is always available for legacy playback devices.
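As a rough illustration of the parametric idea in the abstract (a transmitted downmix plus a compact set of parameters, instead of discretely coded objects), the Python sketch below mixes several mono objects into one downmix and keeps a single relative level per object. The function name `encode_frame`, the plain-sum downmix and the one-value-per-frame level are assumptions made for this example only; the normative SAOC processing is considerably more elaborate and is not reproduced here.

```python
def encode_frame(objects):
    """Hypothetical, simplified encoder step for one frame.

    objects: list of mono objects, each a list of float samples of equal length.
    Returns (downmix, rel_levels): the mono downmix and one level per object,
    normalised so that the strongest object has level 1.0.
    """
    if not objects:
        raise ValueError("at least one audio object is required")
    n = len(objects[0])
    if any(len(obj) != n for obj in objects):
        raise ValueError("all objects must have the same frame length")

    # Transmitted signal: a plain sum of all objects (assumed downmix rule).
    downmix = [sum(obj[t] for obj in objects) for t in range(n)]

    # Side information: energy of each object relative to the strongest one.
    energies = [sum(s * s for s in obj) for obj in objects]
    peak = max(energies) or 1.0  # avoid division by zero for silent frames
    rel_levels = [e / peak for e in energies]
    return downmix, rel_levels


downmix, rel_levels = encode_frame([[1.0, -1.0], [0.5, 0.5]])
```

Whatever the number of objects, only the downmix samples and one small number per object leave this toy encoder, which is the property the abstract attributes to the real parameter set.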
Technologies de l'information — Technologies audio MPEG — Partie 2: Codage d'objet audio spatial (SAOC)
Standards Content (Sample)
INTERNATIONAL STANDARD
ISO/IEC 23003-2
First edition
2010-10-01
Information technology —
MPEG audio technologies —
Part 2:
Spatial Audio Object Coding (SAOC)
Technologies de l'information — Technologies audio MPEG —
Partie 2: Codage d'objet audio spatial (SAOC)
Reference number
© ISO/IEC 2010
© ISO/IEC 2010
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO/IEC 2010 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols, notation and abbreviated terms
4.1 Notation
4.2 Operations
4.3 Constants
4.4 Variables
4.5 Abbreviated terms
5 SAOC overview
5.1 Introduction
5.2 Basic structure of the SAOC transcoder/decoder
5.3 Tools and functionality
5.4 Delay and synchronization
5.5 SAOC Profiles and Levels
6 Syntax
6.1 Payloads for SAOC
6.2 Definition
7 SAOC processing
7.1 Compressed data stream decoding and dequantization of SAOC data
7.2 Compressed data stream encoding and quantization of MPS data
7.3 Time/frequency transforms
7.4 Post(processing) downmix compensation
7.5 Signals and parameters
7.6 Transcoding modes
7.7 Decoding modes
7.8 EAO processing
7.9 DCU processing
7.10 MBO processing
7.11 MCU Combiner
7.12 Effects
7.13 Low Power SAOC processing
7.14 Low Delay SAOC processing
8 Transport of SAOC side information
8.1 Overview
8.2 Transport and signalling in an MPEG environment
8.3 Transport of SAOC data over PCM channels
9 Transport of predefined rendering information
9.1 Introduction
9.2 Rendering information description file format
Annex A (normative) Tables
Annex B (normative) Low Delay MPEG Surround
Annex C (informative) Effects processing
Annex D (informative) Encoder
Annex E (informative) Guidelines for rendering matrix specification
Annex F (informative) MCU Combiner
Annex G (informative) Patent statement
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
ISO/IEC 23003-2 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
ISO/IEC 23003 consists of the following parts, under the general title Information technology — MPEG audio
technologies:
— Part 1: MPEG Surround
— Part 2: Spatial Audio Object Coding (SAOC)
Introduction
The International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC)
draw attention to the fact that it is claimed that compliance with this document may involve the use of patents.
ISO and IEC take no position concerning the evidence, validity and scope of these patent rights.
The holders of these patent rights have assured ISO and IEC that they are willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this respect,
the statements of the holders of these patent rights are registered with ISO and IEC. Information may be
obtained from the companies listed in Annex G.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights other than those identified in Annex G. ISO and IEC shall not be held responsible for identifying any or
all such patent rights.
INTERNATIONAL STANDARD ISO/IEC 23003-2:2010(E)
Information technology — MPEG audio technologies —
Part 2:
Spatial Audio Object Coding (SAOC)
1 Scope
This part of ISO/IEC 23003 specifies the reference model of the Spatial Audio Object Coding (SAOC)
technology that is capable of recreating, modifying and rendering a number of audio objects based on a
smaller number of transmitted channels and additional parametric data. In the preferred modes of operating
the SAOC system, the transmitted signal can be either mono or stereo. The audio objects can be represented
by a mono or stereo signal or have the MPEG Surround (MPS) Multi-channel Background Object (MBO)
format. The additional parametric data exhibits a significantly lower data rate than required for transmitting all
objects individually, making the coding very efficient. At the same time this ensures compatibility of the
transmitted signal with legacy devices.
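To make the notion of "recreating, modifying and rendering" objects from fewer transmitted channels plus parametric data more concrete, here is a deliberately simplified Python sketch. It assumes a mono downmix equal to the sum of uncorrelated objects, one power value per object as side information, and a user-chosen gain per object and output channel. The names `render_from_downmix`, `object_powers` and `render_gains` are hypothetical, and the energy-matching rule is a stand-in chosen for illustration, not the processing the standard specifies.

```python
def render_from_downmix(downmix, object_powers, render_gains):
    """Hypothetical energy-matching renderer (illustration only).

    downmix:       list of mono downmix samples (the transmitted signal).
    object_powers: one non-negative power value per object (side information).
    render_gains:  render_gains[ch][i] is the gain the user assigns to
                   object i in output channel ch.
    Returns one list of samples per output channel.
    """
    total = sum(object_powers)  # downmix power if objects are uncorrelated
    if total <= 0:
        return [[0.0] * len(downmix) for _ in render_gains]

    outputs = []
    for gains in render_gains:
        # Power this channel would have if the objects were mixed directly
        # with the requested gains (again assuming uncorrelated objects).
        target = sum(g * g * p for g, p in zip(gains, object_powers))
        scale = (target / total) ** 0.5
        # Scale the downmix so that the channel power matches the target.
        outputs.append([scale * s for s in downmix])
    return outputs


# Object 0 goes to the left channel only, object 1 to the right channel only.
left, right = render_from_downmix(
    downmix=[1.0, -1.0],
    object_powers=[3.0, 1.0],
    render_gains=[[1.0, 0.0], [0.0, 1.0]],
)
```

The objects themselves never reach the renderer: changing `render_gains` changes the output scene while the transmitted downmix and side information stay the same, which is why the same stream can serve different playback setups.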
When a multi-channel rendering setup (e.g. a 5.1 loudspeaker setup) is required, the SAOC system acts as a
transcoder, converting the additional parametric data to MPS parameters, and interfaces to the MPS decoder
that acts as rendering device. For certain rendering setups (e.g. a binaural or plain stereo setup), the SAOC
system behaves as a decoder, using its own rendering engine. Another key feature is that the SAOC
parametric data from different streams can be merged at parameter level to allow for the combination of
SAOC streams, similar to the functionality of a Multi-point Control Unit (MCU).
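The split between transcoder and decoder operation described above can be summarised in a few lines of Python. This is only a sketch: the function name `select_saoc_mode` and the string labels for playback setups are invented for this example, and it covers only the setups the text explicitly names.

```python
def select_saoc_mode(playback_setup):
    """Return 'transcoder' or 'decoder' for a playback setup label.

    The labels are hypothetical; only setups named in the Scope are mapped.
    """
    # Multi-channel setups: SAOC converts its parametric data to MPS
    # parameters and hands rendering over to an MPS decoder.
    transcoder_setups = {"5.1"}
    # Setups for which the SAOC system uses its own rendering engine.
    decoder_setups = {"binaural", "stereo"}

    key = playback_setup.strip().lower()
    if key in transcoder_setups:
        return "transcoder"
    if key in decoder_setups:
        return "decoder"
    raise ValueError(f"setup not covered by this sketch: {playback_setup!r}")


mode_for_surround = select_saoc_mode("5.1")
mode_for_headphones = select_saoc_mode("binaural")
```

Raising an error for unlisted setups is a deliberate choice here, since the Scope gives only examples and does not enumerate every supported configuration.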
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO/IEC 13818-7:2006, Information technology — Generic coding of moving pictures and associated audio
information — Part 7: Advanced Audio Coding (AAC)
ISO/IEC 14496-3:2009, Information technology — Coding of audio-visual objects — Part 3: Audio
ISO/IEC 23003-1:2007, Information technology — MPEG audio technologies — Part 1: MPEG Surround
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
3.1
audio object
input audio signal consisting of one, two or multiple channels, including Multi-channel Background Object
(MBO)
3.2
frame
time segment to which SAOC processing is applied according to the data conveyed in the corresponding
SAOCFrame() syntax element
...