ISO/IEC 10646:2020
Information technology — Universal coded character set (UCS)
This document:
- specifies the architecture of the UCS;
- defines terms used for the UCS;
- describes the general structure of the UCS codespace;
- specifies the assigned planes of the UCS: the Basic Multilingual Plane (BMP), the Supplementary Multilingual Plane (SMP), the Supplementary Ideographic Plane (SIP), the Tertiary Ideographic Plane (TIP), and the Supplementary Special-purpose Plane (SSP);
- defines a set of graphic characters used in scripts and the written form of languages on a world-wide scale;
- specifies the names for the graphic characters and format characters of the BMP, SMP, SIP, TIP, SSP and their coded representations within the UCS codespace;
- specifies the coded representations for control characters and private use characters;
- specifies three encoding forms of the UCS: UTF-8, UTF-16, and UTF-32;
- specifies seven encoding schemes of the UCS: UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, and UTF-32LE;
- specifies the management of future additions to this coded character set.
NOTE The determination of suitability of these characters for use as identifiers in programming languages is not specified by this document but can be found in an external reference. See Annex U.
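The planes and encoding schemes named in the scope can be illustrated with a short Python sketch. It is not taken from the standard: the helper names `plane_of` and `encode_all` and the table `SCHEME_CODECS` are hypothetical, and the sketch assumes a codespace of 0 to 10FFFF divided into planes of 65,536 code points. The codec names are those of Python's standard library, whose behaviour for the unmarked UTF-16 and UTF-32 schemes (always writing a byte order mark, in the platform's byte order) is one permitted choice rather than the only one.

```python
# Plane abbreviations as listed in the scope; other planes are reported by
# number. Assumption: codespace 0..10FFFF, 65,536 code points per plane.
PLANE_NAMES = {0: "BMP", 1: "SMP", 2: "SIP", 3: "TIP", 14: "SSP"}

# Hypothetical mapping from the seven scheme names in the scope to the
# corresponding Python standard-library codec names.
SCHEME_CODECS = {
    "UTF-8": "utf-8",
    "UTF-16": "utf-16",        # Python prepends a byte order mark
    "UTF-16BE": "utf-16-be",
    "UTF-16LE": "utf-16-le",
    "UTF-32": "utf-32",        # Python prepends a byte order mark
    "UTF-32BE": "utf-32-be",
    "UTF-32LE": "utf-32-le",
}


def plane_of(code_point: int) -> str:
    """Return the plane abbreviation (or 'plane N') for a code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"outside the codespace: {code_point:#x}")
    plane = code_point >> 16  # each plane spans 0x10000 code points
    return PLANE_NAMES.get(plane, f"plane {plane}")


def encode_all(text: str) -> dict:
    """Serialize text in each of the seven encoding schemes."""
    return {name: text.encode(codec) for name, codec in SCHEME_CODECS.items()}


if __name__ == "__main__":
    sample = "\U0001F600"  # one character outside the BMP
    print(plane_of(ord(sample)))
    for name, octets in encode_all(sample).items():
        print(f"{name:9} {octets.hex()}")
```

For U+1F600 the sketch reports the SMP, the UTF-8 octets F0 9F 98 80, and the UTF-16BE octets D8 3D DE 00 (a surrogate pair), which shows how a single code point maps to different octet sequences under different schemes.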
Technologies de l'information — Jeu universel de caractères codés (JUC)
Standards Content (Sample)
INTERNATIONAL STANDARD
ISO/IEC 10646
Sixth edition
2020-12
Information technology — Universal coded character set (UCS)
Technologies de l'information — Jeu universel de caractères codés (JUC)
© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
CONTENTS
1 Scope
2 Normative references
3 Terms and definitions
4 Conformance
4.1 General
4.2 Conformance of information interchange
4.3 Conformance of devices
5 Electronic data attachments
6 General structure of the UCS
7 Basic structure and nomenclature
7.1 Structure
7.2 Coding of characters
7.3 Types of code points
7.4 Naming of characters
7.5 Short identifiers for code points (UIDs)
7.6 UCS Sequence Identifiers
7.7 Octet sequence identifiers
8 Revision and updating of the UCS
9 Subsets
9.1 General
9.2 Limited subset
9.3 Selected subset
10 UCS encoding forms
10.1 General
10.2 UTF-8
10.3 UTF-16
10.4 UTF-32
11 UCS encoding schemes
11.1 General
11.2 UTF-8
11.3 UTF-16BE
11.4 UTF-16LE
11.5 UTF-16
11.6 UTF-32BE
11.7 UTF-32LE
11.8 UTF-32
12 Use of control functions with the UCS
13 Declaration of identification of features
13.1 Purpose and context of identification
13.2 Identification of a UCS encoding scheme
13.3 Identification of subsets of graphic characters
13.4 Identification of control function set
13.5 Identification of the coding system of ISO/IEC 2022
14 Structure of the code charts and lists
15 Block and collection names
15.1 Block names
15.2 Collection names
16 Mirrored characters in bidirectional context
16.1 Mirrored characters
16.2 Directionality of bidirectional text
17 Special characters
17.1 General
17.2 Space characters
17.3 Currency symbols
17.4 Format characters
17.5 Ideographic description characters
17.6 Variation selectors and variation sequences
18 Presentation forms of characters
19 Compatibility characters
20 Order of characters
21 Combining characters
21.1 Order of combining characters
21.2 Combining class and canonical ordering
21.3 Appearance in code charts
21.4 Alternate coded representations
21.5 Multiple combining characters
21.6 Collections containing combining characters
21.7 Combining Grapheme Joiner
22 Normalization forms
23 Special features of individual scripts and symbol repertoires
23.1 Hangul syllable composition method
23.2 Features of scripts used in India and some other South Asian countries
23.3 Byzantine musical symbols
23.4 Source references for pictographic symbols
24 Source references for CJK ideographs
24.1 List of source references
24.2 Source references file for CJK ideographs
24.3 Source reference presentation for CJK Unified ideographs
24.4 Source references presentation for CJK Compatibility ideographs
25 Source references for Tangut ideographs
25.1 List of source references
25.2 Source reference file for Tangut ideographs
25.3 Source reference presentation for Tangut ideographs
26 Source references for Nüshu characters
26.1 List of source references
26.2 Source reference file for Nüshu characters
27 Character names and annotations
27.1 Entity names
27.2 Name formation
27.3 Single name
27.4 Name immutability
27.5 Name uniqueness
27.6 Character names for CJK ideographs
27.7 Character names for Tangut ideographs
27.8 Character names for Nüshu characters
27.9 Character names for Khitan Small Script characters
27.10 Character names for Hangul syllables
28 Named UCS Sequence Identifiers
29 Structure of the Basic Multilingual Plane
...