IEC 60268-16:2003
Sound system equipment - Part 16: Objective rating of speech intelligibility by speech transmission index
IEC 60268-16:2003 defines objective methods for rating the transmission quality of speech with respect to intelligibility. The four methods, which are closely related, are referred to as the "STI", the "STITEL", the "STIPA" and the "RASTI" methods (see Clause 3).
INTERNATIONAL IEC
STANDARD
60268-16
Third edition
2003-05
Sound system equipment –
Part 16:
Objective rating of speech intelligibility
by speech transmission index
Equipements pour systèmes électroacoustiques –
Partie 16:
Evaluation objective de l'intelligibilité de la parole
au moyen de l'indice de transmission de la parole
IEC 2003 Copyright - all rights reserved
No part of this publication may be reproduced or utilized in any form or by any means, electronic or
mechanical, including photocopying and microfilm, without permission in writing from the publisher.
International Electrotechnical Commission, 3, rue de Varembé, PO Box 131, CH-1211 Geneva 20, Switzerland
Telephone: +41 22 919 02 11 Telefax: +41 22 919 03 00 E-mail: inmail@iec.ch Web: www.iec.ch
– 2 – 60268-16 IEC:2003(E)
CONTENTS
FOREWORD
1 Scope
2 Normative references
3 Definitions and abbreviations
4 Description of the methods
4.1 General
4.2 The STI method
4.3 The STITEL method
4.4 The STIPA method
4.5 The RASTI method
4.6 Methods of measurement
5 Methods of determining intelligibility
5.1 Word tests
5.2 Modified rhyme tests
5.3 Speech Intelligibility Index
5.4 Articulation loss of consonants
Annex A (normative) Speech transmission index (STI) and revised (STIr) methods
A.1 Background
A.2 The STI method
A.3 The test signals
Annex B (informative) The STITEL method
Annex C (informative) The STIPA method
Annex D (informative) The RASTI method
Annex E (informative) Qualification of the STI and relation with some subjective intelligibility measures
Bibliography
Figure 1 – Modulation transfer function: input/output comparison
Figure 2 – Relationship between the theoretical STI by the RASTI method and the STI measured by a proprietary equipment with a measurement time of 12 s approximately
Figure 3 – Conditions under which RASTI results do not differ by more than 0,05
Figure A.1 – Envelope function (panel A) of a 10 s speech signal for the 250 Hz octave band and corresponding envelope spectrum (panel B)
Figure A.2 – Theoretical expression of the MTF
Figure A.3 – The measurement system and frequencies for the STI method
Figure A.4 – Auditory masking strength of octave band (k – 1) on that above (k)
Figure A.5 – The relationship between effective signal-to-noise ratio and transmission index for a shift of 15 dB and a range of 30 dB
Figure D.1 – Illustration of a practical RASTI test signal
Figure E.1 – Qualification of the STI and relation with some subjective intelligibility measures
Table A.1 – Octave level specific slope of masking and corresponding auditory masking factor (amf)
Table A.2 – STIr octave band specific male and female weighting factors
Table A.3 – Octave band levels (dB) relative to the A-weighted long-term speech level
Table B.1 – STITEL: modulation frequencies for the seven octave bands
Table C.1 – STIPA: modulation frequencies for the seven octave bands
Table C.2 – STIr octave band specific male and female weighting factors adopted to STIPA
INTERNATIONAL ELECTROTECHNICAL COMMISSION
____________
SOUND SYSTEM EQUIPMENT –
Part 16: Objective rating of speech intelligibility
by speech transmission index
FOREWORD
1) The IEC (International Electrotechnical Commission) is a worldwide organization for standardization comprising
all national electrotechnical committees (IEC National Committees). The object of the IEC is to promote
international co-operation on all questions concerning standardization in the electrical and electronic fields. To
this end and in addition to other activities, the IEC publishes International Standards. Their preparation is
entrusted to technical committees; any IEC National Committee interested in the subject dealt with may
participate in this preparatory work. International, governmental and non-governmental organizations liaising
with the IEC also participate in this preparation. The IEC collaborates closely with the International
Organization for Standardization (ISO) in accordance with conditions determined by agreement between the
two organizations.
2) The formal decisions or agreements of the IEC on technical matters express, as nearly as possible, an
international consensus of opinion on the relevant subjects since each technical committee has representation
from all interested National Committees.
3) The documents produced have the form of recommendations for international use and are published in the form
of standards, technical specifications, technical reports or guides and they are accepted by the National
Committees in that sense.
4) In order to promote international unification, IEC National Committees undertake to apply IEC International
Standards transparently to the maximum extent possible in their national and regional standards. Any
divergence between the IEC Standard and the corresponding national or regional standard shall be clearly
indicated in the latter.
5) The IEC provides no marking procedure to indicate its approval and cannot be rendered responsible for any
equipment declared to be in conformity with one of its standards.
6) Attention is drawn to the possibility that some of the elements of this International Standard may be the subject
of patent rights. The IEC shall not be held responsible for identifying any or all such patent rights.
International Standard IEC 60268-16 has been prepared by IEC technical committee 100:
Audio, video and multimedia systems and equipment.
This third edition cancels and replaces the second edition, published in 1998. This third
edition constitutes a technical revision.
The text of this standard is based on the following documents:
FDIS            Report on voting
100/650/FDIS    100/677/RVD
Full information on the voting for the approval of this standard can be found in the report on
voting indicated in the above table.
This publication has been drafted in accordance with the ISO/IEC Directives, Part 2.
The committee has decided that the contents of this publication will remain unchanged until
2005. At this date, the publication will be
• reconfirmed;
• withdrawn;
• replaced by a revised edition, or
• amended.
A bilingual edition of this standard may be issued at a later date.
SOUND SYSTEM EQUIPMENT –
Part 16: Objective rating of speech intelligibility
by speech transmission index
1 Scope
This part of IEC 60268 defines objective methods for rating the transmission quality of speech
with respect to intelligibility. The four methods, which are closely related, are referred to as
the “STI”, the “STITEL”, the “STIPA” and the “RASTI” methods (see Clause 3). The methods
are intended for rating speech transmission with or without sound systems.
A survey of other methods of determining or predicting speech intelligibility is also included,
together with a method of correlating the results of different methods of determination.
2 Normative references
The following referenced documents are indispensable for the application of this document.
For dated references, only the edition cited applies. For undated references, the latest edition
of the referenced document (including any amendments) applies.
ISO 4870:1991, Acoustics – The construction and calibration of speech intelligibility tests
ITU-T Recommendation P.51:1996, Artificial mouth
3 Definitions and abbreviations
For the purpose of this document, the following definitions apply.
3.1
speech transmission index (STI)
physical quantity representing the transmission quality of speech with respect to intelligibility
3.2
speech transmission index for telecommunication systems (STITEL)
index obtained by a condensed version of the STI method but still responsive to distortions
found in communication systems
3.3
speech transmission index for public address systems (STIPA)
index obtained by a condensed version of the STI method but still responsive to distortions
found in room acoustics including public address systems
3.4
room acoustics speech transmission index (RASTI)
index obtained by a condensed version of the STI method, to be used for screening
purposes and focused on direct communication between persons without making use of a
communication system. RASTI accounts for noise interference and distortions in the time
domain (echoes, reverberation)
4 Description of the methods
4.1 General
The methods can be used to compare speech transmission quality at various positions and for
various conditions within the same listening space, in particular for assessing the effect of
changes in the acoustic properties. This includes effects from the presence of an audience or
of changes in any sound system [1] (see footnote 1). The methods are also able to predict the absolute rating
of the speech transmission quality with respect to intelligibility when comparing different
listening spaces under similar conditions or assessing a speech communication channel.
Annex A provides a more detailed description of the basis of the speech transmission index.
The determination of the transmission quality of speech with respect to intelligibility is based
on the reduction of the modulation index m_i of a test signal, simulating the speech
characteristics of a real talker, when sounded in a room or through a communication channel.
The test signal is transmitted by a sound source situated at the talker's position to a
microphone at any listener's position, where the modulation index is m_o.
For the sound source, the important characteristics are the physical size, the directivity, the
position and the sound pressure level.
The typical test signal consists of a noise carrier with a speech-shaped frequency spectrum
and a sinusoidal intensity modulation with modulation frequency F (see Figure 1).
[Figure 1: an input signal with intensity Ī_i (1 + m_i cos 2πFt) passes through a channel with echoes, reverberation and noise, giving an output signal with intensity Ī_o (1 + m_o cos 2πF(t + τ)). Both signals are shown against time with modulation period 1/F. Lower panel: modulation transfer function m(F) = m_o/m_i (vertical axis, 0 to 1,0) against modulation frequency F in Hz (horizontal axis, 0,5 to 16).]
NOTE m_i and m_o are the modulation indices of the input and the output signals, respectively. I_i and I_o are the
input and output intensities, the intensities being equal to the square of the sound pressure levels (p²).
Figure 1 – Modulation transfer function: input/output comparison
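The test signal described above can be sketched in a few lines. The following is only an illustration under stated assumptions, not the procedure of the standard: the carrier is white Gaussian noise rather than speech-shaped noise, and the function names `modulated_noise` and `estimate_modulation_index` are hypothetical. The modulation index is recovered from the squared signal (the intensity) by taking its Fourier component at the modulation frequency F.

```python
import cmath
import math
import random


def modulated_noise(m, mod_freq_hz, duration_s, sample_rate_hz, seed=0):
    """Noise carrier whose intensity follows 1 + m*cos(2*pi*F*t).

    Assumption: white Gaussian noise stands in for the speech-shaped carrier.
    """
    rng = random.Random(seed)
    n = int(duration_s * sample_rate_hz)
    samples = []
    for k in range(n):
        t = k / sample_rate_hz
        # Amplitude envelope is the square root of the intensity envelope.
        envelope = math.sqrt(1.0 + m * math.cos(2.0 * math.pi * mod_freq_hz * t))
        samples.append(envelope * rng.gauss(0.0, 1.0))
    return samples


def estimate_modulation_index(samples, mod_freq_hz, sample_rate_hz):
    """Estimate m as 2*|Fourier component of the intensity at F| / mean intensity."""
    total = 0.0
    component = 0j
    for k, s in enumerate(samples):
        intensity = s * s
        total += intensity
        phase = -2j * math.pi * mod_freq_hz * k / sample_rate_hz
        component += intensity * cmath.exp(phase)
    return 2.0 * abs(component) / total


# 10 s of signal at F = 2 Hz covers a whole number of modulation periods.
signal = modulated_noise(m=0.5, mod_freq_hz=2.0, duration_s=10.0, sample_rate_hz=4000)
m_hat = estimate_modulation_index(signal, mod_freq_hz=2.0, sample_rate_hz=4000)
```

Because the carrier is random, `m_hat` scatters around the true value of 0.5. Applying the same estimator to the signal received at the listener's position would give m_o.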
____________
1) Figures in square brackets refer to the bibliography.
The reduction in the modulation index is quantified by the modulation transfer function m(F)
which is determined by

$$m(F) = \frac{m_o}{m_i}$$

and is interpreted in terms of an apparent signal-to-noise ratio (SNR), irrespective of the cause
of the reduction which can be reverberation, echoes, non-linear distortion components or
interfering noise, determined by

$$\mathrm{SNR}_{\mathrm{App}} = 10 \lg \frac{m(F)}{1 - m(F)}$$

The values of the apparent signal-to-noise ratio are limited to the range ±15 dB. Values less
than –15 dB are given the value of –15 dB and values greater than 15 dB are given the value
of 15 dB.
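As a minimal sketch of the two steps above, the conversion from a modulation transfer value to an apparent signal-to-noise ratio, followed by limiting to ±15 dB, might look as follows. The function name `apparent_snr_db` is hypothetical, and the handling of m(F) values at or beyond 0 and 1 (mapped directly to the limits) is an assumption made to avoid taking the logarithm of zero or dividing by zero.

```python
import math


def apparent_snr_db(m, limit_db=15.0):
    """Apparent SNR in dB from a modulation transfer value m(F), limited to +/- limit_db."""
    if m <= 0.0:
        return -limit_db  # assumption: no remaining modulation maps to the lower limit
    if m >= 1.0:
        return limit_db   # assumption: fully preserved modulation maps to the upper limit
    snr = 10.0 * math.log10(m / (1.0 - m))
    return max(-limit_db, min(limit_db, snr))


half = apparent_snr_db(0.5)    # m/(1-m) = 1, so 0 dB
high = apparent_snr_db(0.99)   # about 20 dB before limiting
low = apparent_snr_db(0.01)    # about -20 dB before limiting
```

A value of m(F) = 0,5 corresponds to an apparent signal-to-noise ratio of 0 dB, while values close to 1 or 0 run into the +15 dB and –15 dB limits.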
4.2 The STI method
4.2.1 General
The STI method, described in Annex A, is based on the determination of the modulation
transfer function m(F) for 98 data points, obtained for 14 modulation frequencies at one-third
octave intervals ranging from 0,63 Hz up to and including 12,5 Hz and for seven octave bands
with centre frequencies ranging from 125 Hz up to and including 8 kHz (see Figure A.3).
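The 98 data points can be pictured as a grid of octave bands against modulation frequencies. The sketch below only enumerates that grid. The text gives the end points and the counts, so the intermediate modulation frequencies listed here are an assumption, taken as the nominal one-third-octave values between 0,63 Hz and 12,5 Hz. The authoritative values are those of Figure A.3.

```python
# Assumption: nominal one-third-octave values from 0.63 Hz to 12.5 Hz.
MODULATION_FREQUENCIES_HZ = [
    0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
    3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5,
]

# Seven octave bands with centre frequencies from 125 Hz to 8 kHz.
OCTAVE_BAND_CENTRES_HZ = [125 * 2 ** k for k in range(7)]

# One m(F) value is determined for every (octave band, modulation frequency) pair.
MTF_GRID = [
    (band_hz, mod_hz)
    for band_hz in OCTAVE_BAND_CENTRES_HZ
    for mod_hz in MODULATION_FREQUENCIES_HZ
]
```

The grid has 7 × 14 = 98 entries, which matches the number of data points quoted for the STI method.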
4.2.2 Precision of the STI method
Because the test signal is band-limited random or pseudo-random noise, repetition of
measurement does not normally produce identical results, even under conditions of steady
interference. The results centre on a mean with a certain standard deviation. This depends,
amongst other factors, on the number of discrete measurements of the modulation transfer
function (usually 98 for the STI method) and the measuring time involved. Typically, the value
of the standard deviation is about 0,02 for a measuring time of 10 s for each m(F) and with
stationary noise interference. With fluctuating noise (for example, a babble of voices), higher
standard deviations may be found possibly with a systematic error. This can be checked by
carrying out a measurement in the absence of the test signal. This should result in a residual
STI value less than 0,20. An estimate of the standard deviation should be made by repeating
measurements for at least a restricted set of conditions.
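The two checks described above, estimating the spread from repeated measurements and verifying the residual STI without the test signal, could be sketched as follows. The readings are invented purely for illustration, and the function names `summarize_repeats` and `residual_sti_acceptable` are hypothetical.

```python
import statistics


def summarize_repeats(sti_values):
    """Return the mean and the sample standard deviation of repeated STI readings."""
    return statistics.mean(sti_values), statistics.stdev(sti_values)


def residual_sti_acceptable(residual_sti, limit=0.20):
    """Check the STI measured in the absence of the test signal against the 0,20 limit."""
    return residual_sti < limit


# Illustrative readings only, not measured data.
readings = [0.52, 0.50, 0.49, 0.53, 0.51]
mean_sti, std_sti = summarize_repeats(readings)
```

For these made-up readings the standard deviation comes out a little under 0,02, which is of the order the text quotes for stationary noise and a measuring time of 10 s per m(F).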
4.2.3 Limitations of the STI method
Due to the form of the test signals and the analysis, the types of distortion not accounted for
are frequency shifts (such as those found with devices for preventing acoustic feedback and
with single sideband radio transmissions), frequency multiplication (for example, analogue
tape recordings played at incorrect speed) and systems such as vocoders that encode speech
fragments (for example, linear predictive coding which might use code-book related synthesis
or the introduction of errors related to voiced/unvoiced speech fragments and pitch errors).
The method should not be used for transmission channels
a) which introduce frequency shifts or frequency multiplication, or
b) which include vocoders (i.e. linear predictive speech coder (LPC), code-excited linear
predictive coder (CELP), residually excited linear predictive coder (RELP), etc.).
Without specific corrections, the STI method is not a reliable prediction measure of the
intelligibility of speech for hearing-impaired listeners [17] or for wearers of ear defenders.
4.3 The STITEL method
4.3.1 General
A simplification can be applied to the test signal if the uncorrelated (speech-like) modulations,
required for the correct interpretation of non-linear distortions, are omitted. This opens up the
possibility of modulating and parallel processing all seven frequency bands simultaneously,
thus reducing measuring time. The STITEL method, described in Annex B, employs this
simplification and takes 10 s to 15 s for a measurement.
4.3.2 Precision of the STITEL method
As with the STI method (see 4.2.2), results are mean values with a certain standard deviation,
due to the randomness of noise. The standard deviation depends on the number of discrete
measurements of the modulation transfer function (typically seven for the STITEL method)
and the measuring time involved. The standard deviation should be estimated by performing
repeated measurements, at least for a restricted number of conditions.
4.3.3 Limitations of the STITEL method
The STITEL method should not be used for transmission channels
a) which introduce frequency shifts or frequency multiplication;
b) which include vocoders (i.e. LPC, CELP, RELP, etc.);
c) which introduce strong non-linear distortion components;
d) for which reverberation time is strongly frequency-dependent. Over the range of centre
frequencies 125 Hz to 8 kHz, the uniformities of the octave-band early decay times and
signal-to-noise ratios should fall within the permitted area shown in Figure 3;
e) having echoes stronger than –10 dB referred to the primary signal;
f) if the background noise has audible tones and/or marked peaks or troughs in the octave-
band spectrum;
g) if the background noise is impulsive and/or the space is not substantially free of discrete
echoes, particularly flutter echoes whose repetition frequency is an integral multiple of
one or more of the modulation frequencies [2].
If any of c), d) or e) applies, or possibly applies, the STI method should be used instead, or
used to verify the results obtained by the STITEL method.
4.4 The STIPA method
4.4.1 General
A simplification can be applied to the test signal if the uncorrelated (speech-like) modulations,
required for the correct interpretation of non-linear distortions, are omitted. This opens up the
possibility of modulating and parallel processing of all frequency bands simultaneously, thus
reducing measuring time. For each frequency band the modulation transfer is determined
for two modulation frequencies. The STIPA method, described in Annex C, employs this
simplification and takes 10 s to 15 s for a measurement.
4.4.2 Precision of the STIPA method
As with the STI method (see 4.2), results are mean values with a certain standard deviation,
due to the randomness of noise. The standard deviation depends on the number of discrete
measurements of the modulation transfer function (typically 12 for the STIPA method) and the
measuring time involved. The standard deviation should be estimated by performing repeated
measurements, at least for a restricted number of conditions.
4.4.3 Limitations of the STIPA method
The STIPA method should not be used for public address systems
a) which introduce frequency shifts or frequency multiplication;
b) which include vocoders (i.e. LPC, CELP, RELP, etc.);
c) if the background noise is impulsive;
d) which introduce strong non-linear distortion components.
If d) applies, or possibly applies, the STI method should be used instead or used to verify
the results obtained by the STIPA method.
4.5 The RASTI method
4.5.1 General
Another simplification that can be applied is a reduction in the number of octave bands. This
is the case with the RASTI method, described in Annex D, in which the analysis is restricted
to only two octave bands with centre frequencies 500 Hz and 2 kHz, and to only four and five
modulation frequencies, respectively, in these bands. This implies that bandpass limiting and
background noise with an irregular spectrum are not accounted for correctly, nor is the effect
of non-linear distortion included. The RASTI method can, however, be used as a screening
approach for most person-to-person communications in room acoustic applications. As with
the STI method, certain distortions, particularly those from reverberation, if smooth and
monotonic, are accounted for correctly [8].
4.5.2 Precision of the RASTI method
As with the STI method (see 4.2), results are mean values with a certain standard deviation,
due to the randomness of noise. The standard deviation depends upon the measuring time
involved, amongst other factors. The standard deviation should be estimated by performing
repeated measurements, at least for a restricted number of conditions. In practice, a
measuring time of 10 s is a useful compromise between speed and accuracy. Figure 2
illustrates the accuracy obtainable with a measuring time of that order.
[Figure 2 plot: measured RASTI index (vertical axis, 0 to 1,0) against theoretical RASTI index (horizontal axis, 0 to 1,0); a second horizontal scale gives the signal-to-noise ratio from –15 dB to +15 dB.]
Figure 2 – Relationship between the theoretical STI by the RASTI method
and the STI measured by proprietary equipment with a measurement time
of approximately 12 s
4.5.3 Limitations of the RASTI method
The application of the RASTI method is limited by factors concerned with speech
transmission, background noise and reverberation. Therefore, its use should be restricted to
cases where the following conditions are met.
a) No frequency shifts or frequency multiplication.
b) No use of vocoders (i.e. LPC, CELP, RELP, etc.).
c) Essentially linear speech transmission (any amplitude compression or expansion limited to
1 dB) and no peak clipping of a sinusoidal signal giving the same sound pressure level at
the measuring position as the test signal.
d) Overall system frequency response between the octave bands centred on 125 Hz and
8 kHz is uniform, i.e. the difference in transmission between any two adjacent octave
bands should not exceed 5 dB.
e) Background noise is free of audible tones and of marked peaks or troughs in the octave-
band spectrum.
f) Background noise is not impulsive and the space is substantially free of discrete echoes,
particularly flutter echoes whose repetition frequency is an integral multiple of one or more
of the modulation frequencies [2].
g) Reverberation time is not strongly frequency-dependent. Over the range of centre
frequencies 125 Hz to 8 kHz, the uniformities of the octave-band early decay times (first
5 dB) and signal-to-noise ratios should fall within the permitted area shown in Figure 3.
h) Background noise does not vary substantially with time.
[Figure 3 plot: the acceptable area, shown as early decay time mean ratio (vertical axis, 0,6 to 1,6) against aggregate deviation of the signal-to-noise ratio in dB (horizontal axis, 0 to 10).]
NOTE 1 The absolute aggregate deviation of the signal-to-noise ratio is the algebraic sum of the differences of
the octave-band signal-to-noise ratios in the five octave bands centred on 125 Hz, 250 Hz, 1 kHz, 4 kHz and 8 kHz
from the arithmetic mean of the signal-to-noise ratios of the 500 Hz and 2 kHz octave bands. Signal-to-noise ratios
which exceed ±15 dB are set to +15 dB or –15 dB, respectively.
NOTE 2 The early decay time mean ratio is the average early decay time over the octave bands centred on
125 Hz, 250 Hz, 1 kHz, 4 kHz and 8 kHz, divided by the average early decay time of the 500 Hz and 2 kHz octave
bands.
Figure 3 – Conditions under which RASTI results
do not differ by more than 0,05
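The two quantities defined in NOTE 1 and NOTE 2 can be computed as sketched below. This is a hedged illustration: the function names are hypothetical, and taking the absolute value of the algebraic sum is one reading of "absolute aggregate deviation", adopted here as an assumption. The boundary of the acceptable area itself is given only graphically in Figure 3 and is not reproduced.

```python
REFERENCE_BANDS_HZ = (500, 2000)                 # the two RASTI octave bands
OTHER_BANDS_HZ = (125, 250, 1000, 4000, 8000)    # the remaining five octave bands


def clip_snr(snr_db, limit_db=15.0):
    """Limit a signal-to-noise ratio to the range +/- limit_db."""
    return max(-limit_db, min(limit_db, snr_db))


def aggregate_snr_deviation(snr_db_by_band):
    """NOTE 1: sum of differences from the mean SNR of the 500 Hz and 2 kHz bands.

    Assumption: the absolute value of the algebraic sum is returned.
    """
    clipped = {band: clip_snr(value) for band, value in snr_db_by_band.items()}
    reference = sum(clipped[b] for b in REFERENCE_BANDS_HZ) / len(REFERENCE_BANDS_HZ)
    return abs(sum(clipped[b] - reference for b in OTHER_BANDS_HZ))


def edt_mean_ratio(edt_s_by_band):
    """NOTE 2: mean early decay time of the five other bands over that of the two RASTI bands."""
    others = sum(edt_s_by_band[b] for b in OTHER_BANDS_HZ) / len(OTHER_BANDS_HZ)
    reference = sum(edt_s_by_band[b] for b in REFERENCE_BANDS_HZ) / len(REFERENCE_BANDS_HZ)
    return others / reference


# Illustrative values only, not measured data.
snr_example = {125: 10, 250: 12, 500: 14, 1000: 16, 2000: 12, 4000: 8, 8000: 5}
edt_example = {125: 1.2, 250: 1.1, 500: 1.0, 1000: 0.9, 2000: 0.8, 4000: 0.7, 8000: 0.6}
deviation_db = aggregate_snr_deviation(snr_example)
ratio = edt_mean_ratio(edt_example)
```

In the example, the 1 kHz value of 16 dB is first limited to 15 dB, the reference mean is 13 dB, and the aggregate deviation is 15 dB. The early decay time mean ratio is 1,0.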
The STI, STITEL and STIPA methods may be considered where conditions c), d) or e) are
not met. Where f) or g) is not met, the STI and STIPA methods should be used.
In cases where conditions a), b) or h) are not met, it is necessary to use another method such
as subjective test methods. These can be based on words in a carrier phrase, such as phonetically
balanced or equally balanced words, or on modified rhyme tests.
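The guidance above amounts to a small decision rule, sketched here under stated assumptions. The function name `alternatives_to_rasti` is hypothetical, and the order of precedence when conditions from several groups fail at once (a/b/h first, then f/g, then c/d/e) is an assumption, chosen so that the most restrictive recommendation wins.

```python
def alternatives_to_rasti(unmet_conditions):
    """Suggest methods given the RASTI conditions (letters 'a' to 'h') that are not met."""
    unmet = set(unmet_conditions)
    unknown = unmet - set("abcdefgh")
    if unknown:
        raise ValueError(f"unknown condition letters: {sorted(unknown)}")
    if not unmet:
        return ["RASTI"]
    if unmet & {"a", "b", "h"}:
        # Frequency shifts, vocoders or time-varying noise: another method is necessary.
        return ["subjective test"]
    if unmet & {"f", "g"}:
        return ["STI", "STIPA"]
    # Only c), d) or e) remain unmet at this point.
    return ["STI", "STITEL", "STIPA"]


only_c = alternatives_to_rasti({"c"})
c_and_f = alternatives_to_rasti({"c", "f"})
a_and_d = alternatives_to_rasti({"a", "d"})
```

The limitations listed for each alternative method in 4.2.3, 4.3.3 and 4.4.3 still apply and are not checked by this sketch.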
4.6 Methods of measurement
Any compression or non-linear amplitude or non-stationary frequency or temporal processing
should be bypassed before carrying out RASTI measurements, but it is essential to ensure
that any consequent effects on the sound pressure levels produced by the system under test
are compensated.
Generally, in a listening space, speech intelligibility depends upon the directivity of the
source; therefore, a mouth simulator having similar directivity characteristics to those of the
human head/mouth (see ITU-T Recommendation P.51) should be used for the highest
accuracy when assessing the intelligibility of unamplified talkers. When speech is relayed
through a sound system, a simulator is not normally required unless a close talking or noise
cancelling microphone is involved.
4.6.1 Method of measurement using an acoustic excitation signal
...







