Information technology — Coding of audio-visual objects — Part 10: Advanced Video Coding

ISO/IEC 14496-10:2003 was developed jointly with the ITU-T in response to the growing need for higher compression of moving pictures for various applications such as digital storage media, television broadcasting, Internet streaming and real-time audiovisual communication. It is also designed to enable the use of the coded video representation in a flexible manner for a wide variety of network environments. It is designed to be generic in the sense that it serves a wide range of applications, bit rates, resolutions, qualities and services. The use of ISO/IEC 14496-10:2003 allows motion video to be manipulated as a form of computer data and to be stored on various storage media, transmitted and received over existing and future networks and distributed on existing and future broadcasting channels.

In the course of creating ISO/IEC 14496-10:2003, various requirements from typical applications have been considered, necessary algorithmic elements have been developed, and these have been integrated into a single syntax. Hence, ISO/IEC 14496-10:2003 will facilitate video data interchange among different applications.

The coded representation specified in the syntax is designed to enable a high compression capability for a desired image quality. The algorithm is not lossless, as the exact source sample values are typically not preserved through the encoding and decoding processes. A number of techniques are defined that may be used to achieve highly efficient compression. The expected encoding algorithm (not specified in ISO/IEC 14496-10:2003) selects between inter and intra coding for block-shaped regions of each picture. Inter coding uses motion vectors for block-based inter prediction to exploit temporal statistical dependencies between different pictures. Intra coding uses various spatial prediction modes to exploit spatial statistical dependencies in the source signal for a single picture. Motion vectors and intra prediction modes may be specified for a variety of block sizes in the picture.

The prediction residual is then further compressed using a transform to remove spatial correlation inside the transform block before it is quantised, producing an irreversible process that typically discards less important visual information while forming a close approximation to the source samples. Finally, the motion vectors or intra prediction modes are combined with the quantised transform coefficient information and encoded using either variable length codes or arithmetic coding.

Annexes A through E contain normative requirements and are an integral part of ISO/IEC 14496-10:2003. Annex A defines three profiles (Baseline, Main and Extended), each being tailored to certain application domains, and defines the levels of capability within each profile. Annex B specifies syntax and semantics of a byte stream format for delivery of coded video as an ordered stream of bytes or bits. Annex C specifies the Hypothetical Reference Decoder and its use to check bitstream and decoder conformance. Annex D specifies syntax and semantics for Supplemental Enhancement Information message payloads. Annex E specifies syntax and semantics of the Video Usability Information parameters of the sequence parameter sets of coded video sequences.
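As a rough illustration of the prediction, transform and quantisation steps summarised in the abstract above, the following Python sketch encodes and reconstructs a single 4x4 block. It is a toy model under stated assumptions, not the normative decoding process: the function names (`encode_block`, `decode_block` and the helpers) are hypothetical, the flat prediction stands in for whichever intra or inter predictor an encoder might choose, the transform is a 4x4 integer matrix with orthogonal rows of the kind used for this sort of block coding, and the uniform quantiser with a single step size is a simplification. Entropy coding is left out entirely.

```python
# Toy sketch of predict -> residual -> transform -> quantise -> reconstruct.
# NOT the normative ISO/IEC 14496-10 process; all function names are hypothetical.

# 4x4 integer transform matrix with mutually orthogonal rows (chosen for illustration).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]
ROW_NORM_SQ = [sum(v * v for v in row) for row in CF]  # squared length of each row


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def forward_transform(block):
    """Y = CF * X * CF^T, which concentrates a smooth residual into few coefficients."""
    return matmul(matmul(CF, block), transpose(CF))


def inverse_transform(coeffs):
    """Exact inverse: X = CF^T * (D Y D) * CF, with D = diag(1 / row norm squared)."""
    scaled = [[coeffs[i][j] / (ROW_NORM_SQ[i] * ROW_NORM_SQ[j]) for j in range(4)]
              for i in range(4)]
    return matmul(matmul(transpose(CF), scaled), CF)


def encode_block(source, prediction, step):
    """Return quantised transform levels for the prediction residual."""
    residual = [[s - p for s, p in zip(srow, prow)]
                for srow, prow in zip(source, prediction)]
    coeffs = forward_transform(residual)
    # Uniform quantiser (an assumption): this rounding is the irreversible step.
    return [[round(c / step) for c in row] for row in coeffs]


def decode_block(levels, prediction, step):
    """Rescale the levels, invert the transform, and add the prediction back."""
    coeffs = [[lv * step for lv in row] for row in levels]
    residual = inverse_transform(coeffs)
    # Clip to the 8-bit sample range assumed in this sketch.
    return [[min(255, max(0, p + round(r))) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]


if __name__ == "__main__":
    source = [[52, 55, 61, 66], [63, 59, 55, 90], [62, 59, 68, 113], [63, 58, 71, 122]]
    prediction = [[60] * 4 for _ in range(4)]  # flat, DC-style prediction (assumed)
    for step in (1, 20):
        levels = encode_block(source, prediction, step)
        recon = decode_block(levels, prediction, step)
        err = max(abs(s - r) for srow, rrow in zip(source, recon)
                  for s, r in zip(srow, rrow))
        print(f"step={step}: max reconstruction error = {err}")
```

With a step size of 1 the sketch reproduces the source block exactly, because the integer coefficients survive rounding unchanged. A larger step size yields smaller levels that would be cheaper to entropy code, at the cost of a bounded reconstruction error. That trade-off is the sense in which the abstract calls the process irreversible.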

Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé

General Information

Status: Withdrawn
Publication Date: 26-Nov-2003
Withdrawal Date: 26-Nov-2003
Current Stage: 9599 - Withdrawal of International Standard
Start Date: 28-Sep-2004
Completion Date: 19-Apr-2025
Ref Project

Relations

Standard
ISO/IEC 14496-10:2003 - Information technology -- Coding of audio-visual objects
English language
262 pages

Standards Content (Sample)


INTERNATIONAL STANDARD
ISO/IEC 14496-10
First edition
2003-12-01
Information technology — Coding of
audio-visual objects —
Part 10:
Advanced video coding
Technologies de l'information — Codage des objets audiovisuels —
Partie 10: Codage visuel avancé

Reference number
©
ISO/IEC 2003

©  ISO/IEC 2003
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

CONTENTS
Foreword
0 Introduction
0.1 Prologue
0.2 Purpose
0.3 Applications
0.4 Profiles and levels
0.5 Overview of the design characteristics
0.6 How to read this specification
1 Scope
2 Normative references
3 Definitions
4 Abbreviations
5 Conventions
5.1 Arithmetic operators
5.2 Logical operators
5.3 Relational operators
5.4 Bit-wise operators
5.5 Assignment operators
5.6 Range notation
5.7 Mathematical functions
5.8 Variables, syntax elements, and tables
5.9 Text description of logical operations
5.10 Processes
6 Source, coded, decoded and output data formats, scanning processes, and neighbouring relationships
6.1 Bitstream formats
6.2 Source, decoded, and output picture formats
6.3 Spatial subdivision of pictures and slices
6.4 Inverse scanning processes and derivation processes for neighbours
7 Syntax and semantics
7.1 Method of describing syntax in tabular form
7.2 Specification of syntax functions, categories, and descriptors
7.3 Syntax in tabular form
7.4 Semantics
8 Decoding process
8.1 NAL unit decoding process
8.2 Slice decoding process
8.3 Intra prediction process
8.4 Inter prediction process
8.5 Transform coefficient decoding process and picture construction process prior to deblocking filter process
8.6 Decoding process for P macroblocks in SP slices or SI macroblocks
8.7 Deblocking filter process
9 Parsing process
9.1 Parsing process for Exp-Golomb codes
9.2 CAVLC parsing process for transform coefficient levels
9.3 CABAC parsing process for slice data
Annex A (normative) Profiles and levels
A.1 Requirements on video decoder capability
A.2 Profiles
A.3 Levels
Annex B (normative) Byte stream format
B.1 Byte stream NAL unit syntax and semantics
B.2 Byte stream NAL unit decoding process
B.3 Decoder byte-alignment recovery (informative)
Annex C (normative) Hypothetical reference decoder
C.4 Operation of coded picture buffer (CPB)
C.5 Operation of the decoded picture buffer (DPB)
C.6 Bitstream conformance
C.7 Decoder conformance
Annex D (normative) Supplemental enhancement information
D.8 SEI payload syntax
D.9 SEI payload semantics
Annex E (normative) Video usability information
E.10 VUI syntax
E.11 VUI semantics
Annex F (informative) Patent Rights

LIST OF FIGURES
Figure 6-1 – Nominal vertical and horizontal locations of 4:2:0 luma and chroma samples in a frame
Figure 6-2 – Nominal vertical and horizontal sampling locations of samples top and bottom fields
Figure 6-3 – A picture with 11 by 9 macroblocks that is partitioned into two slices
Figure 6-4 – Partitioning of the decoded frame into macroblock pairs
Figure 6-5 – Macroblock partitions, sub-macroblock partitions, macroblock partition scans, and sub-macroblock partition scans
Figure 6-6 – Scan for 4x4 luma blocks
Figure 6-7 – Neighbouring macroblocks for a given macroblock
Figure 6-8 – Neighbouring macroblocks for a given macroblock in MBAFF frames
Figure 6-9 – Determination of the neighbouring macroblock, blocks, and partitions (informative)
Figure 7-1 – The structure of an access unit not containing any NAL units with nal_unit_type equal to 0, 7, 8, or in the range of 12 to 31, inclusive
Figure 8-1 – Intra_4x4 prediction mode directions (informative)
Figure 8-2 – Example for temporal direct-mode motion vector inference (informative)
Figure 8-3 – Directional segmentation prediction (informative)
Figure 8-4 – Integer samples (shaded blocks with upper-case letters) and fractional sample positions (un-shaded blocks with lower-case letters) for quarter sample luma interpolation
Figure 8-5 – Fractional sample position dependent variables in
...
