Information technology - Generic coding of moving pictures and associated audio information - Part 2: Video

ISO/IEC 13818-2:2013 specifies a coded representation of video data and the decoding process required to reconstruct pictures. It provides a generic video coding scheme which serves a wide range of applications, bit rates, picture resolutions and qualities. Its basic coding algorithm is a hybrid of motion compensated prediction and discrete cosine transform (DCT). Pictures to be coded can be either interlaced or progressive. Necessary algorithmic elements are integrated into a single syntax, and a limited number of subsets are defined in terms of Profile (functionalities) and Level (parameters) to facilitate practical use of this generic video coding International Standard.

Technologies de l'information - Codage générique des images animées et du son associé - Partie 2: Données vidéo

Informacijska tehnologija - Splošno kodiranje gibljivih slik in pripadajočih avdio informacij - 2. del: Video

This Recommendation | International Standard specifies the coded representation of picture information for digital storage media and digital video communication, and specifies the decoding process. The representation supports constant bit rate transmission, variable bit rate transmission, random access, channel hopping, scalable decoding, bitstream editing, and special functions such as fast forward playback, fast reverse playback, slow motion, pause and still pictures. This Recommendation | International Standard is forward compatible with ISO/IEC 11172-2 and upward or downward compatible with EDTV, HDTV and SDTV formats.
This Recommendation | International Standard is primarily applicable to digital storage media, video broadcast and communication. The storage medium may be connected to the decoder directly or via communication means such as buses, local area networks (LANs) or telecommunication links.

General Information

Status: Published
Publication Date: 20-Aug-2018
Current Stage: 6060 - National Implementation/Publication (Adopted Project)
Start Date: 25-Jul-2018
Due Date: 29-Sep-2018
Completion Date: 21-Aug-2018

Standard
ISO/IEC 13818-2:2018
English language
235 pages
Standard
ISO/IEC 13818-2:2013 - Information technology - Generic coding of moving pictures and associated audio information - Part 2: Video (Released: 9/27/2013)
English language
225 pages

Standards Content (Sample)


SLOVENIAN STANDARD
01-September-2018
Supersedes:
SIST ISO/IEC 13818-2:2005
SIST ISO/IEC 13818-2:2005/Amd 1:2010
SIST ISO/IEC 13818-2:2005/Amd 2:2010
SIST ISO/IEC 13818-2:2005/Amd 3:2010
Informacijska tehnologija - Splošno kodiranje gibljivih slik in pripadajočih avdio informacij - 2. del: Video
Information technology - Generic coding of moving pictures and associated audio information - Part 2: Video
Technologies de l'information - Codage générique des images animées et du son associé - Partie 2: Données vidéo
This Slovenian standard is identical to: ISO/IEC 13818-2:2013
ICS:
35.040.40 Coding of audio, video, multimedia and hypermedia information
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.

INTERNATIONAL STANDARD ISO/IEC 13818-2
Third edition
2013-10-01
Information technology - Generic coding of moving pictures and associated audio information - Part 2: Video
Technologies de l'information - Codage générique des images animées et du son associé - Partie 2: Données vidéo
© ISO/IEC 2013
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 13818-2 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration with
ITU-T. The identical text is published as ITU-T Rec. H.262 (2012).
This third edition cancels and replaces the second edition (ISO/IEC 13818-2:2000), which has been
technically revised. It also incorporates the Amendments ISO/IEC 13818-2:2000/Amd.1:2001,
ISO/IEC 13818-2:2000/Amd.2:2007 and ISO/IEC 13818-2:2000/Amd.3:2010, and the Technical Corrigenda
ISO/IEC 13818-2:2000/Cor.1:2002 and ISO/IEC 13818-2:2000/Cor.2:2007.
ISO/IEC 13818 consists of the following parts, under the general title Information technology - Generic
coding of moving pictures and associated audio information:
- Part 1: Systems
- Part 2: Video
- Part 3: Audio
- Part 4: Conformance testing
- Part 5: Software simulation
- Part 6: Extensions for DSM-CC
- Part 7: Advanced Audio Coding (AAC)
- Part 9: Extension for real time interface for systems decoders
- Part 10: Conformance extensions for Digital Storage Media Command and Control (DSM-CC)
- Part 11: IPMP on MPEG-2 systems


CONTENTS
Introduction
1 Scope
2 Normative references
3 Definitions
4 Abbreviations and symbols
4.1 Arithmetic operators
4.2 Logical operators
4.3 Relational operators
4.4 Bitwise operators
4.5 Assignment
4.6 Mnemonics
4.7 Constants
5 Conventions
5.1 Method of describing bitstream syntax
5.2 Definition of functions
5.3 Reserved, forbidden and marker_bit
5.4 Arithmetic precision
6 Video bitstream syntax and semantics
6.1 Structure of coded video data
6.2 Video bitstream syntax
6.3 Video bitstream semantics
7 The video decoding process
7.1 Higher syntactic structures
7.2 Variable length decoding
7.3 Inverse scan
7.4 Inverse quantization
7.5 Inverse DCT
7.6 Motion compensation
7.7 Spatial scalability
7.8 SNR scalability
7.9 Temporal scalability
7.10 Data partitioning
7.11 Hybrid scalability
7.12 Output of the decoding process
8 Profiles and levels
8.1 ISO/IEC 11172-2 compatibility
8.2 Relationship between defined profiles
8.3 Relationship between defined levels
8.4 Scalable layers
8.5 Parameter values for defined profiles, levels and layers
8.6 Compatibility requirements on decoders
9 Registration of copyright identifiers
9.1 General
9.2 Implementation of a Registration Authority (RA)
Annex A Inverse discrete cosine transform
Annex B Variable length code tables
B.1 Macroblock addressing
B.2 Macroblock type
B.3 Macroblock pattern
B.4 Motion vectors
B.5 DCT coefficients
Annex C Video buffering verifier
Annex D Frame packing arrangement signalling for stereoscopic 3D content
Annex E Profile and level restrictions
E.1 Syntax element restrictions in profiles
E.2 Permissible layer combinations
Annex F Features supported by the algorithm
F.1 Overview
F.2 Video formats
F.3 Picture quality
F.4 Data rate control
F.5 Low delay mode
F.6 Random access/channel hopping
F.7 Scalability
F.8 Compatibility
F.9 Differences between this Specification and ISO/IEC 11172-2
F.10 Complexity
F.11 Editing encoded bitstreams
F.12 Trick modes
F.13 Error resilience
F.14 Concatenated sequences
Annex G Registration procedure
G.1 Procedure for the request of a Registered Identifier (RID)
G.2 Responsibilities of the Registration Authority
G.3 Responsibilities of parties requesting an RID
G.4 Appeal procedure for denied applications
Annex H Registration application form
H.1 Contact information of organization requesting a Registered Identifier (RID)
H.2 Statement of an intention to apply the assigned RID
H.3 Date of intended implementation of the RID
H.4 Authorized representative
H.5 For official use only of the Registration Authority
Annex I Registration authority - diagram of administration structure
Annex J 4:2:2 Profile test results
J.1 Introduction
J.2 Test sequences
J.3 Test procedures
J.4 Subjective assessment
J.5 Test results
Annex K The impact of practices for non-progressive sequence bitstreams in consideration of progressive-scan display
K.1 Progressive and non-progressive encoding
K.2 Video source timing information syntax
K.3 Content generation practices
K.4 Post-encoding editing of the progressive frame flag in video bitstreams
K.5 Post-processing for systems with progressive scan displays
K.6 Use of capture timecode information
Annex L Bibliography
Introduction
Intro. 1 Purpose
This Part of this Recommendation | International Standard was developed in response to the growing need for a generic
coding method of moving pictures and of associated sound for various applications such as digital storage media,
television broadcasting and communication. The use of this Specification means that motion video can be manipulated as
a form of computer data and can be stored on various storage media, transmitted and received over existing and future
networks and distributed on existing and future broadcasting channels.
Intro. 2 Application
The applications of this Specification cover, but are not limited to, such areas as listed below:
BSS Broadcasting Satellite Service (to the home)
CATV Cable TV Distribution on optical networks, copper, etc.
CDAD Cable Digital Audio Distribution
DSB Digital Sound Broadcasting (terrestrial and satellite broadcasting)
DTTB Digital Terrestrial Television Broadcasting
EC Electronic Cinema
ENG Electronic News Gathering (including SNG, Satellite News Gathering)
FSS Fixed Satellite Service (e.g. to head ends)
HTT Home Television Theatre
IPC Interpersonal Communications (videoconferencing, videophone, etc.)
ISM Interactive Storage Media (optical disks, etc.)
MMM Multimedia Mailing
NCA News and Current Affairs
NDB Networked Database Services (via ATM, etc.)
RVS Remote Video Surveillance
SSM Serial Storage Media (digital VTR, etc.)
Intro. 3 Profiles and levels
This Specification is intended to be generic in the sense that it serves a wide range of applications, bit rates, resolutions,
qualities and services. Applications should cover, among other things, digital storage media, television broadcasting and
communications. In the course of creating this Specification, various requirements from typical applications have been
considered, necessary algorithmic elements have been developed, and they have been integrated into a single syntax.
Hence, this Specification will facilitate the bitstream interchange among different applications.
Considering the practicality of implementing the full syntax of this Specification, however, a limited number of subsets
of the syntax are also stipulated by means of "profile" and "level". These and other related terms are formally defined in
clause 3.
A "profile" is a defined subset of the entire bitstream syntax that is defined by this Specification. Within the bounds
imposed by the syntax of a given profile it is still possible to require a very large variation in the performance of
encoders and decoders depending upon the values taken by parameters in the bitstream. For instance, it is possible to
specify frame sizes as large as (approximately) 2^14 samples wide by 2^14 lines high. It is currently neither practical nor
economic to implement a decoder capable of dealing with all possible frame sizes.
In order to deal with this problem, "levels" are defined within each profile. A level is a defined set of constraints imposed
on parameters in the bitstream. These constraints may be simple limits on numbers. Alternatively they may take the form
of constraints on arithmetic combinations of the parameters (e.g. frame width multiplied by frame height multiplied by
frame rate).
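The two kinds of level constraint described above (simple limits, and limits on an arithmetic combination of parameters) can be sketched as follows. This is a minimal illustration only: the `Level` class, the `violations` function and every numeric limit in `DEMO_LEVEL` are hypothetical placeholders, not values or names taken from this Specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Level:
    """Hypothetical level: a set of upper bounds on bitstream parameters."""
    name: str
    max_width: int         # samples per line
    max_height: int        # lines per frame
    max_frame_rate: float  # frames per second
    max_sample_rate: int   # bound on width * height * frame_rate


def violations(level: Level, width: int, height: int, frame_rate: float) -> List[str]:
    """Return the names of the constraints broken by the given parameters."""
    problems = []
    # Simple limits on individual numbers.
    if width > level.max_width:
        problems.append("width")
    if height > level.max_height:
        problems.append("height")
    if frame_rate > level.max_frame_rate:
        problems.append("frame_rate")
    # Limit on an arithmetic combination of the parameters.
    if width * height * frame_rate > level.max_sample_rate:
        problems.append("sample_rate")
    return problems


# Placeholder limits chosen for illustration only; not taken from clause 8.
DEMO_LEVEL = Level("demo", max_width=1000, max_height=600,
                   max_frame_rate=30.0, max_sample_rate=12_000_000)

print(violations(DEMO_LEVEL, 640, 480, 30.0))   # within every limit
print(violations(DEMO_LEVEL, 1000, 600, 30.0))  # each simple limit met, product too large
```

The second call shows why the combined constraint matters: each parameter sits at or below its individual bound, yet the product of the three exceeds the placeholder sample-rate bound.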
Bitstreams complying with this Specification use a common syntax. In order to achieve a subset of the complete syntax,
flags and parameters are included in the bitstream that signal the presence or otherwise of syntactic elements that occur
later in the bitstream. In order to specify constraints on the syntax (and hence define a profile), it is thus only necessary
to constrain the values of these flags and parameters that specify the presence of later syntactic elements.
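The idea that a flag signals the presence of a later syntactic element, and that a profile can be defined by constraining such flags, can be illustrated with a toy parser. The header layout below (a 1-bit `extra_present` flag, an optional 4-bit `extra_value`, and a 3-bit `mode`) is invented for this sketch and does not correspond to any syntax element of this Specification.

```python
class BitReader:
    """Reads unsigned integers, most significant bit first, from a byte string."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            if self.pos >= 8 * len(self.data):
                raise EOFError("bitstream exhausted")
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_toy_header(data: bytes) -> dict:
    """Parse the made-up header; the flag decides whether extra_value is present."""
    reader = BitReader(data)
    header = {"extra_present": reader.read(1)}
    if header["extra_present"]:
        header["extra_value"] = reader.read(4)
    header["mode"] = reader.read(3)
    return header


def allowed_in_toy_profile(header: dict) -> bool:
    """A toy 'profile' defined purely by constraining the flag to zero."""
    return header["extra_present"] == 0


with_extra = parse_toy_header(bytes([0b11010011]))     # flag=1, extra=1010, mode=011
without_extra = parse_toy_header(bytes([0b01010000]))  # flag=0, mode=101
print(with_extra, allowed_in_toy_profile(with_extra))
print(without_extra, allowed_in_toy_profile(without_extra))
```

A decoder limited to the toy profile never needs code for `extra_value`, because the constrained flag guarantees the element is absent.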
Intro. 4 The scalable and the non-scalable syntax
The full syntax can be divided into two major categories. One is the non-scalable syntax, which is structured as a
superset of the syntax defined in ISO/IEC 11172-2. The main feature of the non-scalable syntax is the extra compression tools
for interlaced video signals. The second is the scalable syntax, the key property of which is to enable the reconstruction
of useful video from pieces of a total bitstream. This is achieved by structuring the total bitstream in two or more layers,
starting from a standalone base layer and adding a number of enhancement layers. The base layer can use the non-
scalable syntax, or in some situations conform to the ISO/IEC 11172-2 syntax.
Intro. 4.1 Overview of the non-scalable syntax
The coded representation defined in the non-scalable syntax achieves a high compression ratio while preserving good
image quality. The algorithm is not lossless as the exact sample values are not preserved during coding. Obtaining good
image quality at the bit rates of interest demands very high compression, which is not achievable with intra picture
coding alone. The need for random access, however, is best satisfied with pure intra picture coding. The choice of the
techniques is based on the need to balance a high image quality and compression ratio with the requirement to make
random access to the coded bitstream.
A number of techniques are used to achieve high compression. The algorithm first uses block-based motion
compensation to reduce the temporal redundancy. Motion compensation is used both for causal prediction of the current
picture from a previous picture, and for non-causal, interpolative prediction from past and future pictures. Motion vectors
are defined for each 16-sample by 16-line region of the picture. The prediction error is further compressed using the
Discrete Cosine Transform (DCT) to remove spatial correlation before it is quantized in an irreversible process that
discards the less important information. Finally, the motion vectors are combined with the quantized DCT information,
and encoded using variable length codes.
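The transform and quantization steps described above can be sketched in a few lines. The sketch assumes an 8 by 8 block, uses a direct orthonormal DCT-II, and replaces the quantizer of this Specification with a plain uniform quantizer; the function names, the ramp-shaped residual and the step size are all illustrative assumptions. Motion compensation and variable length coding are not shown.

```python
import math

N = 8  # block size assumed for this sketch


def _scale(k):
    """Normalization factor that makes the transform orthonormal."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(n, k):
    """Cosine basis value for sample position n and frequency index k."""
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Forward 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Inverse of dct_2d."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs, step):
    """Uniform quantizer (a simplification): this is the irreversible step."""
    return [[int(round(value / step)) for value in row] for row in coeffs]


def dequantize(levels, step):
    """Map quantized levels back to approximate coefficient values."""
    return [[level * step for level in row] for row in levels]


# Hypothetical prediction error: a smooth ramp, as might remain after motion compensation.
residual = [[4 * (x + y - 7) for y in range(N)] for x in range(N)]

STEP = 16  # illustrative quantizer step size
levels = quantize(dct_2d(residual), STEP)
reconstructed = idct_2d(dequantize(levels, STEP))

nonzero = sum(1 for row in levels for level in row if level != 0)
max_error = max(abs(residual[x][y] - reconstructed[x][y])
                for x in range(N) for y in range(N))
print(f"non-zero levels: {nonzero} of {N * N}")
print(f"largest reconstruction error: {max_error:.2f}")
```

For a smooth block most quantized levels are zero, which is what makes the subsequent variable length coding effective, and the non-zero reconstruction error shows why the algorithm is not lossless.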
Intro. 4.1.1 Temporal processing
Because of the conflicting requirements of random access and highly efficient compression, three main picture types are
defined. Intra-coded pictures (I-pictures) are coded without reference to other pictures. They provide access points to the
coded sequence where decoding can begin, but are coded with only moderate compression. Predictive coded pictures (P-
pictures) are coded more efficiently using motion compensated prediction from a past intra or predictive coded picture
and are generally used as a reference for further prediction. Bidirectionally-predictive coded pictures (B-pictures)
provide the highest degree of compression but require both past and future reference pictures for motion compensation.
Bidirectionally-predictive coded pictures are never used as references for prediction (except in the case that the resulting
picture is used as a reference in a spatially scalable enhancement layer). The organization of the three picture types in a
sequence is very flexible. The choice is left to the encoder and will depend on the requirements of the application. Figure
Intro. 1 illustrates an example of the relationship among the three different picture types.

Figure Intro.1 – Example of temporal picture structure
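Because a B-picture needs a future reference picture, a decoder must receive that reference before the B-pictures that depend on it. The sketch below reorders a sequence of picture types from display order into one possible coded order under that assumption. The function name `coded_order` and the example sequence are illustrative, and the sketch ignores fields, scalable layers and any other ordering rules of this Specification.

```python
def coded_order(display_types):
    """Reorder picture types from display order to a possible coded order.

    Each B-picture is held back until the next I- or P-picture, which serves
    as its future reference, has been emitted.
    """
    ordered = []
    pending_b = []
    for index, picture_type in enumerate(display_types):
        if picture_type not in ("I", "P", "B"):
            raise ValueError(f"unknown picture type: {picture_type!r}")
        label = f"{picture_type}{index}"
        if picture_type == "B":
            pending_b.append(label)
        else:
            # I- and P-pictures are references: emit them first, then the
            # B-pictures that were waiting for a future reference.
            ordered.append(label)
            ordered.extend(pending_b)
            pending_b = []
    # Trailing B-pictures have no future reference in this toy input;
    # they are simply appended so that no picture is lost.
    ordered.extend(pending_b)
    return ordered


print(coded_order("IBBPBBP"))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The numbers in the labels are display positions, so the output makes the difference between display order and coded order directly visible.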
Intro. 4.1.2 Coding interlaced video
Each frame of interlaced video consists of two fields wh
...


INTERNATIONAL ISO/IEC
STANDARD 13818-2
Third edition
2013-10-01
Information technology — Generic coding
of moving pictures and associated audio
information —
Part 2:
Video
Technologies de l'information — Codage générique des images
animées et du son associé —
Partie 2: Données vidéo
Reference number
©
ISO/IEC 2013
©  ISO/IEC 2013
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO/IEC 2013 – All rights reserved

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 13818-2 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration with
ITU-T. The identical text is published as ITU-T Rec. H.262 (2012).
This third edition cancels and replaces the second edition (ISO/IEC 13818-2:2000), which has been
technically revised. It also incorporates the Amendments ISO/IEC 13818-2:2000/Amd.1:2001,
ISO/IEC 13818-2:2000/Amd.2:2007 and ISO/IEC 13818-2:2000/Amd.3:2010, and the Technical Corrigenda
ISO/IEC 13818-2:2000/Cor.1:2002 and ISO/IEC 13818-2:2000/Cor.2:2007.
ISO/IEC 13818 consists of the following parts, under the general title Information technology — Generic
coding of moving pictures and associated audio information:
- Part 1: Systems
- Part 2: Video
- Part 3: Audio
- Part 4: Conformance testing
- Part 5: Software simulation
- Part 6: Extensions for DSM-CC
- Part 7: Advanced Audio Coding (AAC)
- Part 9: Extension for real time interface for systems decoders
- Part 10: Conformance extensions for Digital Storage Media Command and Control (DSM-CC)
- Part 11: IPMP on MPEG-2 systems


CONTENTS
Introduction
1 Scope
2 Normative references
3 Definitions
4 Abbreviations and symbols
4.1 Arithmetic operators
4.2 Logical operators
4.3 Relational operators
4.4 Bitwise operators
4.5 Assignment
4.6 Mnemonics
4.7 Constants
5 Conventions
5.1 Method of describing bitstream syntax
5.2 Definition of functions
5.3 Reserved, forbidden and marker_bit
5.4 Arithmetic precision
6 Video bitstream syntax and semantics
6.1 Structure of coded video data
6.2 Video bitstream syntax
6.3 Video bitstream semantics
7 The video decoding process
7.1 Higher syntactic structures
7.2 Variable length decoding
7.3 Inverse scan
7.4 Inverse quantization
7.5 Inverse DCT
7.6 Motion compensation
7.7 Spatial scalability
7.8 SNR scalability
7.9 Temporal scalability
7.10 Data partitioning
7.11 Hybrid scalability
7.12 Output of the decoding process
8 Profiles and levels
8.1 ISO/IEC 11172-2 compatibility
8.2 Relationship between defined profiles
8.3 Relationship between defined levels
8.4 Scalable layers
8.5 Parameter values for defined profiles, levels and layers
8.6 Compatibility requirements on decoders
9 Registration of copyright identifiers
9.1 General
9.2 Implementation of a Registration Authority (RA)
Annex A Inverse discrete cosine transform
Annex B Variable length code tables
B.1 Macroblock addressing
B.2 Macroblock type
B.3 Macroblock pattern
B.4 Motion vectors
B.5 DCT coefficients
Annex C Video buffering verifier
Annex D Frame packing arrangement signalling for stereoscopic 3D content
Annex E Profile and level restrictions
E.1 Syntax element restrictions in profiles
E.2 Permissible layer combinations
Annex F Features supported by the algorithm
F.1 Overview
F.2 Video formats
F.3 Picture quality
F.4 Data rate control
F.5 Low delay mode
F.6 Random access/channel hopping
F.7 Scalability
F.8 Compatibility
F.9 Differences between this Specification and ISO/IEC 11172-2
F.10 Complexity
F.11 Editing encoded bitstreams
F.12 Trick modes
F.13 Error resilience
F.14 Concatenated sequences
Annex G Registration procedure
G.1 Procedure for the request of a Registered Identifier (RID)
G.2 Responsibilities of the Registration Authority
G.3 Responsibilities of parties requesting an RID
G.4 Appeal procedure for denied applications
Annex H Registration application form
H.1 Contact information of organization requesting a Registered Identifier (RID)
H.2 Statement of an intention to apply the assigned RID
H.3 Date of intended implementation of the RID
H.4 Authorized representative
H.5 For official use only of the Registration Authority
Annex I Registration authority – diagram of administration structure
Annex J 4:2:2 Profile test results
J.1 Introduction
J.2 Test sequences
J.3 Test procedures
J.4 Subjective assessment
J.5 Test results
Annex K The impact of practices for non-progressive sequence bitstreams in consideration of progressive-scan display
K.1 Progressive and non-progressive encoding
K.2 Video source timing information syntax
K.3 Content generation practices
K.4 Post-encoding editing of the progressive frame flag in video bitstreams
K.5 Post-processing for systems with progressive scan displays
K.6 Use of capture timecode information
Annex L Bibliography
Rec. ITU-T H.262 (02/2012)
Introduction
Intro. 1 Purpose
This Part of this Recommendation | International Standard was developed in response to the growing need for a generic
coding method of moving pictures and of associated sound for various applications such as digital storage media,
television broadcasting and communication. The use of this Specification means that motion video can be manipulated as
a form of computer data and can be stored on various storage media, transmitted and received over existing and future
networks and distributed on existing and future broadcasting channels.
Intro. 2 Application
The applications of this Specification cover, but are not limited to, such areas as listed below:
BSS Broadcasting Satellite Service (to the home)
CATV Cable TV Distribution on optical networks, copper, etc.
CDAD Cable Digital Audio Distribution
DSB Digital Sound Broadcasting (terrestrial and satellite broadcasting)
DTTB Digital Terrestrial Television Broadcasting
EC Electronic Cinema
ENG Electronic News Gathering (including SNG, Satellite News Gathering)
FSS Fixed Satellite Service (e.g. to head ends)
HTT Home Television Theatre
IPC Interpersonal Communications (videoconferencing, videophone, etc.)
ISM Interactive Storage Media (optical disks, etc.)
MMM Multimedia Mailing
NCA News and Current Affairs
NDB Networked Database Services (via ATM, etc.)
RVS Remote Video Surveillance
SSM Serial Storage Media (digital VTR, etc.)
Intro. 3 Profiles and levels
This Specification is intended to be generic in the sense that it serves a wide range of applications, bit rates, resolutions,
qualities and services. Applications should cover, among other things, digital storage media, television broadcasting and
communications. In the course of creating this Specification, various requirements from typical applications have been
considered, necessary algorithmic elements have been developed, and they have been integrated into a single syntax.
Hence, this Specification will facilitate the bitstream interchange among different applications.
Considering the practicality of implementing the full syntax of this Specification, however, a limited number of subsets
of the syntax are also stipulated by means of "profile" and "level". These and other related terms are formally defined in
clause 3.
A "profile" is a defined subset of the entire bitstream syntax that is defined by this Specification. Within the bounds
imposed by the syntax of a given profile it is still possible to require a very large variation in the performance of
encoders and decoders depending upon the values taken by parameters in the bitstream. For instance, it is possible to
specify frame sizes as large as (approximately) 2^14 samples wide by 2^14 lines high. It is currently neither practical nor
economic to implement a decoder capable of dealing with all possible frame sizes.
In order to deal with this problem, "levels" are defined within each profile. A level is a defined set of constraints imposed
on parameters in the bitstream. These constraints may be simple limits on numbers. Alternatively they may take the form
of constraints on arithmetic combinations of the parameters (e.g. frame width multiplied by frame height multiplied by
frame rate).
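The two kinds of level constraint described above can be sketched as follows. This is a minimal illustration only: the class name, the function name and the numeric limits are assumptions made for the example and are not quoted from clause 8, which holds the normative parameter values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelLimits:
    """Upper bounds a level places on bitstream parameters (illustrative)."""
    max_width: int         # samples per line
    max_height: int        # lines per frame
    max_frame_rate: float  # frames per second
    max_sample_rate: int   # samples per second (width * height * frame rate)


# Placeholder numbers for illustration only; see clause 8 for real values.
EXAMPLE_LEVEL = LevelLimits(720, 576, 30.0, 10_368_000)


def violations(width, height, frame_rate, limits):
    """Return the names of the constraints that the parameters exceed."""
    failed = []
    # Simple limits on individual numbers.
    if width > limits.max_width:
        failed.append("width")
    if height > limits.max_height:
        failed.append("height")
    if frame_rate > limits.max_frame_rate:
        failed.append("frame_rate")
    # Constraint on an arithmetic combination of parameters.
    if width * height * frame_rate > limits.max_sample_rate:
        failed.append("sample_rate")
    return failed


if __name__ == "__main__":
    print(violations(720, 576, 25.0, EXAMPLE_LEVEL))
    print(violations(720, 576, 30.0, EXAMPLE_LEVEL))
    print(violations(2**14, 2**14, 25.0, EXAMPLE_LEVEL))
```

In this sketch the second call passes every individual limit yet still fails the combined limit, which is why a level may need both kinds of constraint.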
Bitstreams complying with this Specification use a common syntax. In order to achieve a subset of the complete syntax,
flags and parameters are included in the bitstream that signal the presence or otherwise of syntactic elements that occur
later in the bitstream. In order to specify constraints on the syntax (and hence define a profile), it is thus only necessary
to constrain the values of these flags and parameters that specify the presence of later syntactic elements.
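The mechanism described above, where a flag signals whether a later syntactic element is present and a profile is defined by constraining that flag, can be illustrated with a toy header. The layout (a 1-bit flag, a 4-bit field and an optional 3-bit field) and all names below are hypothetical and do not correspond to any syntax element of this Specification.

```python
class BitReader:
    """Reads unsigned integers from a byte string, most significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_toy_header(data: bytes) -> dict:
    """Parse a hypothetical header whose flag gates a later element."""
    reader = BitReader(data)
    header = {
        "has_extension": reader.read(1),  # flag signalling a later element
        "base_field": reader.read(4),     # always present
    }
    if header["has_extension"]:
        # Present in the bitstream only when the flag is set.
        header["extension_field"] = reader.read(3)
    return header


def conforms_to_toy_profile(header: dict) -> bool:
    """A toy 'profile' that excludes the extension by constraining the flag."""
    return header["has_extension"] == 0


if __name__ == "__main__":
    for raw in (bytes([0xAE]), bytes([0x28])):
        parsed = parse_toy_header(raw)
        print(parsed, conforms_to_toy_profile(parsed))
```

Note that the toy profile never inspects the optional element itself; constraining the flag is enough to rule the element out.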
Intro. 4 The scalable and the non-scalable syntax
The full syntax can be divided into two major categories. The first is the non-scalable syntax, which is structured as a
superset of the syntax defined in ISO/IEC 11172-2. The main feature of the non-scalable syntax is the extra compression tools
for interlaced video signals. The second is the scalable syntax, the key property of which is to enable the reconstruction
of useful video from pieces of a total bitstream. This is achieved by structuring the total bitstream in two or more layers,
starting from a standalone base layer and adding a number of enhancement layers. The base layer can use the non-
scalable syntax, or in some situations conform to the ISO/IEC 11172-2 syntax.
Intro. 4.1 Overview of the non-scalable syntax
The coded representation defined in the non-scalable syntax achieves a high compression ratio while preserving good
image quality. The algorithm is not lossless as the exact sample values are not preserved during coding. Obtaining good
image quality at the bit rates of interest demands very high compression, which is not achievable with intra picture
coding alone. The need for random access, however, is best satisfied with pure intra picture coding. The choice of
techniques is based on the need to balance high image quality and compression ratio against the requirement for
random access to the coded bitstream.
A number of techniques are used to achieve high compression. The algorithm first uses block-based motion
compensation to reduce the temporal redundancy. Motion compensation is used both for causal prediction of the current
picture from a previous picture, and for non-causal, interpolative prediction from past and future pictures. Motion vectors
are defined for each 16-sample by 16-line region of the picture. The prediction error is further compressed using the
Discrete Cosine Transform (DCT) to remove spatial correlation before it is quantized in an irreversible process that
discards the less important information. Finally, the motion vectors are combined with the quantized DCT information,
and encoded using variable length codes.
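The transform and quantization steps described above can be sketched for a single 8 by 8 block. This is a simplified illustration under stated assumptions: the motion compensated prediction is taken as given, a textbook orthonormal DCT is used, and quantization uses one uniform step size chosen for the example. The function names are hypothetical; the normative inverse quantization and inverse DCT are specified in 7.4 and 7.5.

```python
import math

N = 8  # block size used in this sketch


def prediction_error(current, predicted):
    """Sample-wise difference between the current block and its prediction."""
    return [[current[x][y] - predicted[x][y] for y in range(N)]
            for x in range(N)]


def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of an N x N block (direct form)."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2 / N) * scale(u) * scale(v) * total
    return out


def quantize(coefficients, step):
    """Irreversible step: divide by a step size and round to an integer."""
    return [[round(value / step) for value in row] for row in coefficients]


if __name__ == "__main__":
    current = [[108] * N for _ in range(N)]
    predicted = [[100] * N for _ in range(N)]
    levels = quantize(dct_2d(prediction_error(current, predicted)), step=16)
    print(levels[0])
```

For a flat prediction error such as the one in the usage lines, all the energy lands in the single DC coefficient, which is the kind of compaction that makes the subsequent variable length coding effective.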
Intro. 4.1.1 Temporal processing
Because of the conflicting requirements of random access and highly efficient compression, three main picture types are
defined. Intra-coded pictures (I-pictures) are coded without reference to other pictures. They provide access points to the
coded sequence where decoding can begin, but are coded with only moderate compression. Predictive coded pictures (P-
pictures) are coded more efficiently using motion compensated prediction from a past intra or predictive coded picture
and are generally used as a reference for further prediction. Bidirectionally-predictive coded pictures (B-pictures)
provide the highest degree of compression but require both past and future reference pictures for motion compensation.
Bidirectionally-predictive coded pictures are never used as references for prediction (except in the case that the resulting
picture is used as a reference in a spatially scalable enhancement layer). The organization of the three picture types in a
sequence is very flexible. The choice is left to the encoder and will depend on the requirements of the application. Figure
Intro. 1 illustrates an example of the relationship among the three different picture types.

Figure Intro.1 – Example of temporal picture structure
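Because a B-picture needs its future reference picture for motion compensation, that reference has to reach the decoder before the B-pictures that depend on it, so the order of pictures in the bitstream can differ from display order. The following minimal sketch shows one way to derive such an order. The function name and the single-letter picture labels are assumptions made for the example, and field pictures and scalable layers are ignored.

```python
def display_to_coded_order(picture_types):
    """Reorder pictures so each B-picture follows both of its references.

    picture_types is a string or list of 'I', 'P' and 'B' in display order.
    Returns a list of (display_index, type) pairs in a possible coded order.
    """
    coded = []
    pending_b = []  # B-pictures waiting for their future reference
    for index, ptype in enumerate(picture_types):
        if ptype == "B":
            pending_b.append((index, ptype))
        else:
            # An I- or P-picture is the future reference of the pending
            # B-pictures, so it is emitted first.
            coded.append((index, ptype))
            coded.extend(pending_b)
            pending_b.clear()
    # Trailing B-pictures have no future reference in this input; the
    # sketch simply appends them.
    coded.extend(pending_b)
    return coded


if __name__ == "__main__":
    order = display_to_coded_order("IBBPBBP")
    print(" ".join(f"{ptype}{index}" for index, ptype in order))
```

For the display order I B B P B B P the sketch yields I0 P3 B1 B2 P6 B4 B5, where the number is the position of each picture in display order.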
Intro. 4.1.2 Coding interlaced video
Each frame of interlaced video consists of two fields wh
...
