ISO 24613-2:2020
Language resource management — Lexical markup framework (LMF) — Part 2: Machine-readable dictionary (MRD) model
This document describes the machine-readable dictionary (MRD) model, a metamodel for representing data stored in a variety of electronic dictionary subtypes, ranging from direct support for human translators to support for machine processing.
Gestion des ressources linguistiques — Cadre de balisage lexical (LMF) — Partie 2: Modèle de dictionnaire lisible par ordinateur (MRD)
Upravljanje jezikovnih virov - Ogrodje za označevanje leksikonov (LMF) - 2. del: Model za strojno berljiv slovar (MRD)
Standards Content (Sample)
SLOVENIAN STANDARD
1 March 2021
Upravljanje jezikovnih virov - Ogrodje za označevanje leksikonov (LMF) - 2. del: Model za strojno berljiv slovar (MRD)
Language resource management -- Lexical markup framework (LMF) -- Part 2: Machine Readable Dictionary (MRD) model
Gestion de ressources linguistiques -- Cadre de balisage lexical -- Partie 2: Titre manque
This Slovenian standard is identical to: ISO 24613-2:2020
ICS:
01.020 Terminology (principles and coordination)
01.140.20 Information sciences
35.240.30 IT applications in information, documentation and publishing
2003-01. Slovenian Institute for Standardization. Reproduction of all or part of this standard is not permitted.
INTERNATIONAL STANDARD ISO 24613-2
First edition
2020-07
Language resource management — Lexical markup framework (LMF) — Part 2: Machine-readable dictionary (MRD) model
Gestion des ressources linguistiques — Cadre de balisage lexical (LMF) — Partie 2: Modèle de dictionnaire lisible par ordinateur (MRD)
© ISO 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Key standards used by LMF
5 The machine-readable dictionary (MRD) model
5.1 General
5.2 MRD class model
5.2.1 Set of classes
5.2.2 Class selection and multiplicity
5.2.3 Generalization
5.2.4 Object realization
5.3 Data category selection and class population
5.4 CrossREF allocation
5.5 Form subclasses
5.5.1 WordForm class
5.5.2 Lemma class
5.5.3 Stem class
5.5.4 WordPart class
5.5.5 RelatedForm class
5.6 FormRepresentation class
5.7 TextRepresentation class
5.8 Translation class
5.9 Example class
5.10 SubjectField class
5.11 Bibliography class
5.12 Multiword Expression (MWE) Analysis
Annex A (informative) Data category examples
Annex B (informative) Machine-readable dictionary examples
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology,
Subcommittee SC 4, Language resource management.
This first edition of ISO 24613-2, together with ISO 24613-1:2019, ISO 24613-3¹⁾, ISO 24613-4¹⁾,
ISO 24613-5¹⁾, ISO 24613-6²⁾ and ISO 24613-7²⁾, cancels and replaces ISO 24613:2008, which has been
divided into several parts and technically revised.
The main changes compared to the previous edition are as follows.
This edition merges two normative annexes from the previous edition, Annex A, Morphology extension,
and Annex C, Machine-readable dictionary extension, providing a more cohesive description of the
key structures (classes and associations) found in that edition. The cross-reference (CrossREF) model
introduced in Part 1, Core model, of this edition, provides a new capability for correlating lexical
features across different form and sense classes. In addition, the CrossREF model has replaced the
ListOfComponents and Component classes, enabling a more extensible and flexible capability for
managing multiword expressions. The metamodel of generalization by typing introduced in Part 1
provides a more rigorous and unambiguous framework for applying LMF modelling mechanisms in
ways that enable greater editorial freedom and support the comparison of different LMF conformant
designs. This edition has kept most of the informative examples found in the previous edition (deleting
only a few redundant examples) and has added new examples to illustrate new modelling features.
There have been some class name changes (e.g. OrthographicRepresentation for Representation and
Translation for Equivalent), but no changes in the underlying concepts of the previously existing classes.
A list of all parts in the ISO 24613 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
1) Under preparation.
2) Planned.
Introduction
The ISO 24613 series is based upon the definition of an implementation-independent metamodel
combining a core model and additional models that onomasiological (concept-oriented) and semasiological
(form-oriented) lexical content can take.
It provides guidelines for various implementation use cases, and where appropriate describes LMF
compliant serializations that fit various application contexts.
This document extends ISO 24613-1, the LMF core model, through the use of the processes and
mechanisms described in ISO 24613-1. The objective is to enable flexible design methods to support
the development of machine-readable dictionaries for different purposes while enabling
cross-comparisons of different designs and a basis for developing assessments of standards conformance.
The scope of supported design goals ranges from simple to complex human-oriented MRDs, both
monolingual and bilingual, lexicons that support conceptual-lexical systems through links with
ontological resources, rigorously constrained lexicons for supporting machine processes, and lexicons
that provide an extensional description of the morphology of lexical entries. Since this document is
based on ISO 24613-1, the LMF core model, it is designed to interchange data with other parts of the
ISO 24613 series where applicable.
INTERNATIONAL STANDARD ISO 24613-2:2020(E)
IMPORTANT — The electronic file of this document contains colours which are considered to be
useful for the correct understanding of the document. Users should therefore consider printing
this document using a colour printer.
1 Scope
This document describes the machine-readable dictionary (MRD) model, a metamodel for representing
data stored in a variety of electronic dictionary subtypes, ranging from direct support for human
translators to support for machine processing.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 24613-1, Language resource management — Lexical markup framework (LMF) — Part 1: Core model
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 24613-1 apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
4 Key standards used by LMF
The key standards applicable to this document are described in ISO 24613-1, the LMF core model.
5 The machine-readable dictionary (MRD) model
5.1 General
The MRD model is represented by UML classes, associations among the classes (the structure), sets
of data categories (attribute-value pairs), and links (cross-references). Subclauses 5.2 through 5.12
describe each of these features, their interdependencies, and their implementation.
Figure 1 — MRD class model
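As a rough illustration of the four ingredients named in 5.1 (classes, associations, data categories as attribute-value pairs, and cross-reference links), the following Python sketch shows one way they could be held in memory. It is not part of the standard and is not an LMF serialization: the names LmfObject, CrossRef, children_of, ref_type and "inflectedFormOf", as well as the data category names and values used, are assumptions made up for this example. Only the class names Lemma and WordForm are taken from the table of contents above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CrossRef:
    """Hypothetical link (cross-reference) from one object to another."""
    target: LmfObject
    ref_type: str  # illustrative label, not a value defined by the standard


@dataclass
class LmfObject:
    """Hypothetical instance of a class in a lexicon design."""
    class_name: str                                            # the UML class
    features: dict[str, str] = field(default_factory=dict)     # data categories
    children: list[LmfObject] = field(default_factory=list)    # associations
    cross_refs: list[CrossRef] = field(default_factory=list)   # links


def children_of(parent: LmfObject, class_name: str) -> list[LmfObject]:
    """Return the associated child objects that instantiate the given class."""
    return [c for c in parent.children if c.class_name == class_name]


# One entry with a lemma and an inflected word form (illustrative data only).
lemma = LmfObject("Lemma", {"writtenForm": "clergyman"})
word_form = LmfObject(
    "WordForm",
    {"writtenForm": "clergymen", "grammaticalNumber": "plural"},
)
entry = LmfObject(
    "LexicalEntry",
    {"partOfSpeech": "commonNoun"},
    children=[lemma, word_form],
)

# A cross-reference relates the word form back to its lemma.
word_form.cross_refs.append(CrossRef(target=lemma, ref_type="inflectedFormOf"))
```

Keeping the data categories in a plain dictionary, rather than as fixed fields, reflects the idea that a design selects its own set of data categories per class. A stricter implementation might validate the keys against a chosen selection.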
5.2 MRD class model
5.2.1 Set of classes
The classes defined in
...