Information technology — Coding of audio-visual objects — Part 10: Advanced video coding

This document specifies advanced video coding for coding of audio-visual objects.

Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé

General Information

Status: Withdrawn
Publication Date: 15-Dec-2020
Current Stage: 9599 - Withdrawal of International Standard
Completion Date: 08-Nov-2022

Standard
ISO/IEC 14496-10:2020 - Information technology -- Coding of audio-visual objects
English language, 859 pages
Draft
ISO/IEC FDIS 14496-10:Version 26-sep-2020 - Information technology -- Coding of audio-visual objects
English language, 859 pages

Standards Content (Sample)

INTERNATIONAL STANDARD
ISO/IEC 14496-10
Ninth edition
2020-12
Information technology — Coding of audio-visual objects — Part 10: Advanced video coding
Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé
Reference number: ISO/IEC 14496-10:2020(E)
© ISO/IEC 2020

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents
Foreword
0 Introduction
0.1 Prologue
0.2 Purpose
0.3 Applications
0.4 Publication and versions of this document
0.5 Profiles and levels
0.6 Overview of the design characteristics
0.6.1 General
0.6.2 Predictive coding
0.6.3 Coding of progressive and interlaced video
0.6.4 Picture partitioning into macroblocks and smaller partitions
0.6.5 Spatial redundancy reduction
0.7 How to read this document
0.8 Patent declarations
1 Scope
2 Normative references
3 Terms and definitions
3.1 General terms related to advanced video coding
3.2 Terms related to scalable video coding (Annex F)
3.3 Terms related to multiview video coding (Annex G)
3.4 Terms related to multiview and depth video coding (Annex H)
3.5 Terms related to multiview and depth video with enhanced non-base view coding (Annex I)
4 Abbreviated terms
5 Conventions
5.1 Arithmetic operators
5.2 Logical operators
5.3 Relational operators
5.4 Bit-wise operators
5.5 Assignment operators
5.6 Range notation
5.7 Mathematical functions
5.8 Order of operation precedence
5.9 Variables, syntax elements, and tables
5.10 Text description of logical operations
5.11 Processes
6 Source, coded, decoded and output data formats, scanning processes, and neighbouring relationships
6.1 Bitstream formats
6.2 Source, decoded, and output picture formats
6.3 Spatial subdivision of pictures and slices
6.4 Inverse scanning processes and derivation processes for neighbours
6.4.1 Inverse macroblock scanning process
6.4.2 Inverse macroblock partition and sub-macroblock partition scanning process
6.4.3 Inverse 4x4 luma block scanning process
6.4.4 Inverse 4x4 Cb or Cr block scanning process for ChromaArrayType equal to 3
6.4.5 Inverse 8x8 luma block scanning process
6.4.6 Inverse 8x8 Cb or Cr block scanning process for ChromaArrayType equal to 3
6.4.7 Inverse 4x4 chroma block scanning process
6.4.8 Derivation process of the availability for macroblock addresses
6.4.9 Derivation process for neighbouring macroblock addresses and their availability
6.4.10 Derivation process for neighbouring macroblock addresses and their availability in MBAFF frames
6.4.11 Derivation processes for neighbouring macroblocks, blocks, and partitions
6.4.12 Derivation process for neighbouring locations
6.4.13 Derivation processes for block and partition indices
7 Syntax and semantics
7.1 Method of specifying syntax in tabular form
7.2 Specification of syntax functions, categories, and descriptors
7.3 Syntax in tabular form
7.3.1 NAL unit syntax
7.3.2 Raw byte sequence payloads and RBSP trailing bits syntax
7.3.3 Slice header syntax
7.3.4 Slice data syntax
7.3.5 Macroblock layer syntax
7.4 Semantics
7.4.1 NAL unit semantics
7.4.2 Raw byte sequence payloads and RBSP trailing bits semantics
7.4.3 Slice header semantics
7.4.4 Slice data semantics
7.4.5 Macroblock layer semantics
8 Decoding process
8.1 NAL unit decoding process
8.2 Slice decoding process
8.2.1 Decoding process for picture order count
8.2.2 Decoding process for macroblock to slice group map
8.2.3 Decoding process for slice data partitions
8.2.4 Decoding process for reference picture lists construction
8.2.5 Decoded reference picture marking process
8.3 Intra prediction process
8.3.1 Intra_4x4 prediction process for luma samples
8.3.2 Intra_8x8 prediction process for luma samples
8.3.3 Intra_16x16 prediction process for luma samples
8.3.4 Intra prediction process for chroma samples
8.3.5 Sample construction process for I_PCM macroblocks
8.4 Inter prediction process
8.4.1 Derivation process for motion vector components and reference indices
8.4.2 Decoding process for Inter prediction samples
8.4.3 Derivation process for prediction weights
8.5 Transform coefficient decoding process and picture construction process prior to deblocking filter process
8.5.1 Specification of transform decoding process for 4x4 luma residual blocks
8.5.2 Specification of transform decoding process for luma samples of Intra_16x16 macroblock prediction mode
8.5.3 Specification of transform decoding process for 8x8 luma residual blocks
8.5.4 Specification of transform decoding process for chroma samples
8.5.5 Specification of transform decoding process for chroma samples with ChromaArrayType equal to 3
8.5.6 Inverse scanning process for 4x4 transform coefficients and scaling lists
8.5.7 Inverse scanning process for 8x8 transform coefficients and scaling lists
8.5.8 Derivation process for chroma quantization parameters
8.5.9 Derivation process for scaling functions
8.5.10 Scaling and transformation process for DC transform coefficients for Intra_16x16 macroblock type
8.5.11 Scaling and transformation process for chroma DC transform coefficients
8.5.12 Scaling and transformation process for residual 4x4 blocks
8.5.13 Scaling and transformation process for residual 8x8 blocks
8.5.14 Picture construction process prior to deblocking filter process
8.5.15 Intra residual transform-bypass decoding process
8.6 Decoding process for P macroblocks in SP slices or SI macroblocks
8.6.1 SP decoding process for non-switching pictures
8.6.2 SP and SI slice decoding process for switching pictures
8.7 Deblocking filter process
8.7.1 Filtering process for block edges
8.7.2 Filtering process for a set of samples across a horizontal or vertical block edge
9 Parsing process
9.1 Parsing process for Exp-Golomb codes
9.1.1 Mapping process for signed Exp-Golomb codes
9.1.2 Mapping process for coded block pattern
9.2 CAVLC parsing process for transform coefficient levels
9.2.1 Parsing process for total number of non-zero transform coefficient levels and number of trailing ones
9.2.2 Parsing process for level information
9.2.3 Parsing process for run information
9.2.4 Combining level and run information
9.3 CABAC parsing process for slice data
9.3.1 Initialization process
9.3.2 Binarization process
9.3.4 Arithmetic encoding process
Annex A (normative) Profiles and levels
Annex B (normative) Byte stream format
Annex C (normative) Hypothetical reference decoder
Annex D (normative) Supplemental enhancement information
Annex E (normative) Video usability information
Annex F (normative) Scalable video coding
Annex G (normative) Multiview video coding
Annex H (normative) Multiview and depth video coding
Annex I (normative) Multiview and depth video with enhanced non-base view coding
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work.
The procedures used to develop this document and those intended for its further maintenance are described in
the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of
document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights. Details of any
patent rights identified during the development of the document will be in the Introduction and/or on the ISO
list of patent declarations received (see www.iso.org/patents) or the IEC list of patent declarations received
(see http://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration
with ITU-T. The technically identical text is published as ITU-T H.264 (06/2019).
This ninth edition cancels and replaces the eighth edition (ISO/IEC 14496-10:2014), which has been
technically revised. It also incorporates the Amendments ISO/IEC 14496-10:2014/Amd. 1:2015 and
ISO/IEC 14496-10:2014/Amd. 3:2016.
The main changes compared to the previous edition are as follows:
— specification of an additional profile (the Progressive High 10 profile);
— additional colour-related video usability information codepoint identifiers;
— additional supplemental enhancement information messages;
— minor corrections and clarifications throughout the document.
A list of all parts in the ISO/IEC 14496 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.


0 Introduction
0.1 Prologue
As the costs for both processing power and memory have reduced, network support for coded video data has diversified,
and advances in video coding technology have progressed, the need has arisen for an industry standard for compressed
video representation with substantially increased coding efficiency and enhanced robustness to network environments.
Toward these ends the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group
(MPEG) formed a Joint Video Team (JVT) in 2001 for development of a new Recommendation | International Standard.
0.2 Purpose
This Recommendation | International Standard was developed in response to the growing need for higher compression of
moving pictures for various applications such as videoconferencing, digital storage media, television broadcasting,
internet streaming, and communication. It is also designed to enable the use of the coded video representation in a
flexible manner for a wide variety of network environments. The use of this Recommendation | International Standard
allows motion video to be manipulated as a form of computer data and to be stored on various storage media, transmitted
and received over existing and future networks and distributed on existing and future broadcasting channels.
0.3 Applications
This Recommendation | International Standard is designed to cover a broad range of applications for video content
including but not limited to the following:
– CATV: cable TV on optical networks, copper, etc.
– DBS: direct broadcast satellite video services.
– DSL: digital subscriber line video services.
– DTTB: digital terrestrial television broadcasting.
– ISM: interactive storage media (optical disks, etc.).
– MMM: multimedia mailing.
– MSPN: multimedia services over packet networks.
– RTC: real-time conversational services (videoconferencing, videophone, etc.).
– RVS: remote video surveillance.
– SSM: serial storage media (digital VTR, etc.).
0.4 Publication and versions of this document
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 1 refers to the first approved version of this Recommendation |
International Standard.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 2 refers to the integrated text containing the corrections specified in the
first technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 3 refers to the integrated text containing both the first technical
corrigendum (2004) and the first amendment, which is referred to as the "Fidelity range extensions".
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 4 refers to the integrated text containing the first technical corrigendum
(2004), the first amendment (the "Fidelity range extensions"), and an additional technical corrigendum (2005).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 5 refers to the integrated version 4 text with its specification of the
High 4:4:4 profile removed.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 6 refers to the integrated version 5 text after its amendment to support
additional colour space indicators.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 7 refers to the integrated version 6 text after its amendment to define five
new profiles intended primarily for professional applications (the High 10 Intra, High 4:2:2 Intra, High 4:4:4 Intra,
CAVLC 4:4:4 Intra, and High 4:4:4 Predictive profiles) and two new types of supplemental enhancement information
(SEI) messages (the post-filter hint SEI message and the tone mapping information SEI message).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 8 refers to the integrated version 7 text after its amendment to specify
scalable video coding in three profiles (Scalable Baseline, Scalable High, and Scalable High Intra profiles).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 9 refers to the integrated version 8 text after applying the corrections
specified in a third technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 10 refers to the integrated version 9 text after its amendment to specify a
profile for multiview video coding (the Multiview High profile) and to define additional SEI messages.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 11 refers to the integrated version 10 text after its amendment to define a
new profile (the Constrained Baseline profile) intended primarily to enable implementation of decoders supporting only
the common subset of capabilities supported in various previously-specified profiles.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 12 refers to the integrated version 11 text after its amendment to define a
new profile (the Stereo High profile) for two-view video coding with support of interlaced coding tools and to specify an
additional SEI message specified as the frame packing arrangement SEI message. The changes for versions 11 and 12
were processed as a single amendment in the ISO/IEC approval process.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 13 refers to the integrated version 12 text with various minor corrections
and clarifications as specified in a fourth technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 14 refers to the integrated version 13 text after its amendment to define a
new level (Level 5.2) supporting higher processing rates in terms of maximum macroblocks per second and a new profile
(the Progressive High profile) to enable implementation of decoders supporting only the frame coding tools of the
previously-specified High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 15 refers to the integrated version 14 text with miscellaneous corrections
and clarifications as specified in a fifth technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 16 refers to the integrated version 15 text after its amendment to define
three new profiles intended primarily for communication applications (the Constrained High, Scalable Constrained
Baseline, and Scalable Constrained High profiles).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 17 refers to the integrated version 16 text after its amendment to define
additional supplemental enhancement information (SEI) message data, including the multiview view position SEI
message, the display orientation SEI message, and two additional frame packing arrangement type indication values for
the frame packing arrangement SEI message (the 2D content and tiled arrangement type indication values).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 18 refers to the integrated version 17 text after its amendment to specify
the coding of depth signals, including the specification of an additional profile, the Multiview Depth High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 19 refers to the integrated version 18 text after incorporating a correction
to the sub-bitstream extraction process for multiview video coding.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 20 refers to the integrated version 19 text after its amendment to specify
the combined coding of video view and depth enhancement, including the specification of an additional profile, the
Enhanced Multiview Depth High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 21 refers to the integrated version 20 text after its amendment to specify
additional colorimetry identifiers and an additional model type in the tone mapping information SEI message.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 22 refers to the integrated version 21 text after its amend
...

FINAL DRAFT INTERNATIONAL STANDARD
ISO/IEC FDIS 14496-10
ISO/IEC JTC 1/SC 29
Secretariat: JISC
Voting begins on: 2020-10-02
Voting terminates on: 2020-11-27
Information technology — Coding of audio-visual objects — Part 10: Advanced video coding
Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number: ISO/IEC FDIS 14496-10:2020(E)
© ISO/IEC 2020

---------------------- Page: 1 ----------------------
ISO/IEC FDIS 14496-10:2020(E)

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
ii © ISO/IEC 2020 – All rights reserved

---------------------- Page: 2 ----------------------
ISO/IEC FDIS 14496-10:2020(E)
Contents
Foreword vi
0 Introduction vii
0.1 Prologue . vii
0.2 Purpose . vii
0.3 Applications . vii
0.4 Publication and versions of this document . vii
0.5 Profiles and levels . ix
0.6 Overview of the design characteristics . ix
0.6.1 General . ix
0.6.2 Predictive coding . x
0.6.3 Coding of progressive and interlaced video . x
0.6.4 Picture partitioning into macroblocks and smaller partitions . x
0.6.5 Spatial redundancy reduction . xi
0.7 How to read this document . xi
0.8 Patent declarations . xi
1 Scope 1
2 Normative references 1
3 Terms and definitions 1
3.1 General terms related to advanced video coding . 1
3.2 Terms related to scalable video coding (Annex F) . 16
3.3 Terms related to multiview video coding (Annex G) . 22
3.4 Terms related to multiview and depth video coding (Annex H) . 27
3.5 Terms related to multiview and depth video with enhanced non-base view coding (Annex I) . 28
4 Abbreviated terms 28
5 Conventions 29
5.1 Arithmetic operators . 29
5.2 Logical operators . 30
5.3 Relational operators . 30
5.4 Bit-wise operators . 30
5.5 Assignment operators . 31
5.6 Range notation. 31
5.7 Mathematical functions . 31
5.8 Order of operation precedence . 32
5.9 Variables, syntax elements, and tables . 33
5.10 Text description of logical operations . 34
5.11 Processes . 35
6 Source, coded, decoded and output data formats, scanning processes, and neighbouring relationships 35
6.1 Bitstream formats . 35
6.2 Source, decoded, and output picture formats . 35
6.3 Spatial subdivision of pictures and slices . 40
6.4 Inverse scanning processes and derivation processes for neighbours . 41
6.4.1 Inverse macroblock scanning process . 41
6.4.2 Inverse macroblock partition and sub-macroblock partition scanning process . 42
6.4.3 Inverse 4x4 luma block scanning process . 43
6.4.4 Inverse 4x4 Cb or Cr block scanning process for ChromaArrayType equal to 3 . 44
6.4.5 Inverse 8x8 luma block scanning process . 44
6.4.6 Inverse 8x8 Cb or Cr block scanning process for ChromaArrayType equal to 3 . 44
6.4.7 Inverse 4x4 chroma block scanning process . 44
6.4.8 Derivation process of the availability for macroblock addresses . 44
6.4.9 Derivation process for neighbouring macroblock addresses and their availability . 45
6.4.10 Derivation process for neighbouring macroblock addresses and their availability in MBAFF frames . 45
6.4.11 Derivation processes for neighbouring macroblocks, blocks, and partitions . 46
6.4.12 Derivation process for neighbouring locations . 51
6.4.13 Derivation processes for block and partition indices . 54
7 Syntax and semantics 55
7.1 Method of specifying syntax in tabular form . 55
© ISO/IEC 2020 – All rights reserved iii

---------------------- Page: 3 ----------------------
ISO/IEC FDIS 14496-10:2020(E)
7.2 Specification of syntax functions, categories, and descriptors . 56
7.3 Syntax in tabular form . 58
7.3.1 NAL unit syntax . 58
7.3.2 Raw byte sequence payloads and RBSP trailing bits syntax . 59
7.3.3 Slice header syntax . 67
7.3.4 Slice data syntax . 72
7.3.5 Macroblock layer syntax . 73
7.4 Semantics . 80
7.4.1 NAL unit semantics . 80
7.4.2 Raw byte sequence payloads and RBSP trailing bits semantics . 90
7.4.3 Slice header semantics . 104
7.4.4 Slice data semantics . 115
7.4.5 Macroblock layer semantics. 116
8 Decoding process 129
8.1 NAL unit decoding process . 130
8.2 Slice decoding process . 131
8.2.1 Decoding process for picture order count . 131
8.2.2 Decoding process for macroblock to slice group map . 135
8.2.3 Decoding process for slice data partitions . 138
8.2.4 Decoding process for reference picture lists construction . 139
8.2.5 Decoded reference picture marking process . 146
8.3 Intra prediction process . 150
8.3.1 Intra_4x4 prediction process for luma samples . 150
8.3.2 Intra_8x8 prediction process for luma samples . 157
8.3.3 Intra_16x16 prediction process for luma samples . 164
8.3.4 Intra prediction process for chroma samples. 167
8.3.5 Sample construction process for I_PCM macroblocks . 171
8.4 Inter prediction process . 172
8.4.1 Derivation process for motion vector components and reference indices . 174
8.4.2 Decoding process for Inter prediction samples . 186
8.4.3 Derivation process for prediction weights . 195
8.5 Transform coefficient decoding process and picture construction process prior to deblocking filter process . 197
8.5.1 Specification of transform decoding process for 4x4 luma residual blocks . 197
8.5.2 Specification of transform decoding process for luma samples of Intra_16x16 macroblock prediction
mode . 198
8.5.3 Specification of transform decoding process for 8x8 luma residual blocks . 199
8.5.4 Specification of transform decoding process for chroma samples . 200
8.5.5 Specification of transform decoding process for chroma samples with ChromaArrayType equal to 3 . 201
8.5.6 Inverse scanning process for 4x4 transform coefficients and scaling lists . 202
8.5.7 Inverse scanning process for 8x8 transform coefficients and scaling lists . 203
8.5.8 Derivation process for chroma quantization parameters . 204
8.5.9 Derivation process for scaling functions . 205
8.5.10 Scaling and transformation process for DC transform coefficients for Intra_16x16 macroblock type . 206
8.5.11 Scaling and transformation process for chroma DC transform coefficients . 207
8.5.12 Scaling and transformation process for residual 4x4 blocks . 209
8.5.13 Scaling and transformation process for residual 8x8 blocks . 211
8.5.14 Picture construction process prior to deblocking filter process . 215
8.5.15 Intra residual transform-bypass decoding process . 217
8.6 Decoding process for P macroblocks in SP slices or SI macroblocks . 217
8.6.1 SP decoding process for non-switching pictures . 218
8.6.2 SP and SI slice decoding process for switching pictures . 220
8.7 Deblocking filter process . 222
8.7.1 Filtering process for block edges . 226
8.7.2 Filtering process for a set of samples across a horizontal or vertical block edge . 228
9 Parsing process 234
9.1 Parsing process for Exp-Golomb codes . 234
9.1.1 Mapping process for signed Exp-Golomb codes . 236
9.1.2 Mapping process for coded block pattern . 236
9.2 CAVLC parsing process for transform coefficient levels . 239
9.2.1 Parsing process for total number of non-zero transform coefficient levels and number of trailing ones . 240
iv © ISO/IEC 2020 – All rights reserved

---------------------- Page: 4 ----------------------
ISO/IEC FDIS 14496-10:2020(E)
9.2.2 Parsing process for level information . 243
9.2.3 Parsing process for run information . 245
9.2.4 Combining level and run information . 248
9.3 CABAC parsing process for slice data . 248
9.3.1 Initialization process . 249
9.3.2 Binarization process . 273
9.3.4 Arithmetic encoding process . 303
Annex A (normative) Profiles and levels 310
Annex B (normative) Byte stream format 333
Annex C (normative) Hypothetical reference decoder 336
Annex D (normative) Supplemental enhancement information 357
Annex E (normative) Video usability information 445
Annex F (normative) Scalable video coding 465
Annex G (normative) Multiview video coding 689
Annex H (normative) Multiview and depth video coding 755
Annex I (normative) Multiview and depth video with enhanced non-base view coding 804
Bibliography 859

© ISO/IEC 2020 – All rights reserved v

---------------------- Page: 5 ----------------------
ISO/IEC FDIS 14496-10:2020(E)
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work.
The procedures used to develop this document and those intended for its further maintenance are described in
the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of
document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights. Details of any
patent rights identified during the development of the document will be in the Introduction and/or on the ISO
list of patent declarations received (see www.iso.org/patents) or the IEC list of patent declarations received
(see http://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration
with ITU-T. The technically identical text is published as ITU-T H.264 (06/2019).
This ninth edition cancels and replaces the eighth edition (ISO/IEC 14496-10:2014), which has been
technically revised. It also incorporates the Amendments ISO/IEC 14496-10:2014/Amd. 1:2015 and
ISO/IEC 14496-10:2014/Amd. 3:2016.
The main changes compared to the previous edition are as follows:
— specification of an additional profile (the Progressive High 10 profile);
— additional colour-related video usability information codepoint identifiers;
— additional supplemental enhancement information messages;
— minor corrections and clarifications throughout the document.
A list of all parts in the ISO/IEC 14496 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.


0 Introduction
0.1 Prologue
As the costs of both processing power and memory have fallen, network support for coded video data has diversified,
and video coding technology has advanced, the need has arisen for an industry standard for compressed
video representation with substantially increased coding efficiency and enhanced robustness to network environments.
Toward these ends, the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group
(MPEG) formed a Joint Video Team (JVT) in 2001 for development of a new Recommendation | International Standard.
0.2 Purpose
This Recommendation | International Standard was developed in response to the growing need for higher compression of
moving pictures for various applications such as videoconferencing, digital storage media, television broadcasting,
internet streaming, and communication. It is also designed to enable the use of the coded video representation in a
flexible manner for a wide variety of network environments. The use of this Recommendation | International Standard
allows motion video to be manipulated as a form of computer data and to be stored on various storage media, transmitted
and received over existing and future networks and distributed on existing and future broadcasting channels.
0.3 Applications
This Recommendation | International Standard is designed to cover a broad range of applications for video content
including but not limited to the following:
• CATV: cable TV on optical networks, copper, etc.
• DBS: direct broadcast satellite video services.
• DSL: digital subscriber line video services.
• DTTB: digital terrestrial television broadcasting.
• ISM: interactive storage media (optical disks, etc.).
• MMM: multimedia mailing.
• MSPN: multimedia services over packet networks.
• RTC: real-time conversational services (videoconferencing, videophone, etc.).
• RVS: remote video surveillance.
• SSM: serial storage media (digital VTR, etc.).
0.4 Publication and versions of this document
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 1 refers to the first approved version of this Recommendation |
International Standard.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 2 refers to the integrated text containing the corrections specified in the
first technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 3 refers to the integrated text containing both the first technical
corrigendum (2004) and the first amendment, which is referred to as the "Fidelity range extensions".
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 4 refers to the integrated text containing the first technical corrigendum
(2004), the first amendment (the "Fidelity range extensions"), and an additional technical corrigendum (2005).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 5 refers to the integrated version 4 text with its specification of the
High 4:4:4 profile removed.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 6 refers to the integrated version 5 text after its amendment to support
additional colour space indicators.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 7 refers to the integrated version 6 text after its amendment to define five
new profiles intended primarily for professional applications (the High 10 Intra, High 4:2:2 Intra, High 4:4:4 Intra,
CAVLC 4:4:4 Intra, and High 4:4:4 Predictive profiles) and two new types of supplemental enhancement information
(SEI) messages (the post-filter hint SEI message and the tone mapping information SEI message).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 8 refers to the integrated version 7 text after its amendment to specify
scalable video coding in three profiles (Scalable Baseline, Scalable High, and Scalable High Intra profiles).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 9 refers to the integrated version 8 text after applying the corrections
specified in a third technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 10 refers to the integrated version 9 text after its amendment to specify a
profile for multiview video coding (the Multiview High profile) and to define additional SEI messages.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 11 refers to the integrated version 10 text after its amendment to define a
new profile (the Constrained Baseline profile) intended primarily to enable implementation of decoders supporting only
the common subset of capabilities supported in various previously-specified profiles.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 12 refers to the integrated version 11 text after its amendment to define a
new profile (the Stereo High profile) for two-view video coding with support of interlaced coding tools and to specify an
additional SEI message (the frame packing arrangement SEI message). The changes for versions 11 and 12
were processed as a single amendment in the ISO/IEC approval process.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 13 refers to the integrated version 12 text with various minor corrections
and clarifications as specified in a fourth technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 14 refers to the integrated version 13 text after its amendment to define a
new level (Level 5.2) supporting higher processing rates in terms of maximum macroblocks per second and a new profile
(the Progressive High profile) to enable implementation of decoders supporting only the frame coding tools of the
previously-specified High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 15 refers to the integrated version 14 text with miscellaneous corrections
and clarifications as specified in a fifth technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 16 refers to the integrated version 15 text after its amendment to define
three new profiles intended primarily for communication applications (the Constrained High, Scalable Constrained
Baseline, and Scalable Constrained High profiles).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 17 refers to the integrated version 16 text after its amendment to define
additional supplemental enhancement information (SEI) message data, including the multiview view position SEI
message, the display orientation SEI message, and two additional frame packing arrangement type indication values for
the frame packing arrangement SEI message (the 2D content and tiled arrangement type indication values).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 18 refers to the integrated version 17 text after its amendment to specify
the coding of depth signals, including the specification of an additional profile, the Multiview Depth High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 19 refers to the integrated version 18 text after incorporating a correction
to the sub-bitstream extraction process for multiview video c
...
