Communications and speech intelligibility
Speech intelligibility is usually expressed as a percentage of words, sentences or phonemes (speech sounds making up words) correctly identified by a listener or group of listeners when spoken by a talker or a number of talkers. It is an important measure of the effectiveness or adequacy of a communication system or of the ability of people to communicate in noisy environments.
Speech intelligibility can be measured directly. A number of talkers will speak words or sentences and a number of listeners will indicate what they hear. Often tape recordings of the speakers will be used rather than live speech, so that different communications systems can be compared with exactly the same speech material. There are a number of different intelligibility tests and ISVR Consulting has used phonemically balanced (PB) wordlists, 'Harvard' sentence tests and the Diagnostic Rhyme Test (DRT). Intelligibility tests are time consuming and consequently expensive. Sometimes there is no alternative to full intelligibility testing but often, at least in the more straightforward cases, it is possible to estimate speech intelligibility from physical measurements and to dispense with the need for listeners and talkers.
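As a rough illustration of how a direct test is scored, the sketch below computes the percentage of words correctly identified by one listener. The function name, the case-insensitive exact-match rule and the example words are assumptions made for illustration only; real PB, Harvard or DRT procedures each define their own scoring rules.

```python
def percent_correct(spoken, heard):
    """Return the percentage of items a listener identified correctly.

    `spoken` and `heard` are equal-length lists of words. Matching is
    case-insensitive and ignores surrounding whitespace (an assumption
    of this sketch, not a rule from any particular test standard).
    """
    if len(spoken) != len(heard):
        raise ValueError("spoken and heard must have the same length")
    if not spoken:
        raise ValueError("at least one test item is required")
    correct = sum(
        s.strip().lower() == h.strip().lower()
        for s, h in zip(spoken, heard)
    )
    return 100.0 * correct / len(spoken)


if __name__ == "__main__":
    # Hypothetical word list and listener responses.
    spoken = ["ship", "beat", "fan", "dime"]
    heard = ["ship", "heat", "fan", "time"]
    print(percent_correct(spoken, heard))
```

In practice the scores would be averaged over all talkers and listeners in the panel.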
Speech communications can be divided into 3 categories:
- unamplified speech, normally face to face,
- amplified speech or communications where the speech waveform is transmitted, and
- vocoded or synthetic speech where the waveform is not transmitted.
Unamplified unaided speech is important in offices, workshops, vehicles and many other situations. Many factors influence the intelligibility, but the background noise is usually the most important to consider. The most widely used physical measure is the Speech Interference Level or SIL. The SIL is calculated very simply as the arithmetic mean of the noise levels measured in the 500 Hz, 1 kHz, 2 kHz and 4 kHz octave bands. (Sometimes only 3 frequency bands are used, but all current standards specify the four.) Using a graph in ANSI Standard S3.14-1977 or a table in ISO TR 3352:1974 it is simple to look up the maximum distance between talkers and listeners at which 'just reliable' communication is possible. The SIL is purely a measure of background noise. Speech is not measured but it is assumed that speakers will automatically adopt a vocal effort which is appropriate in the noise. A more involved procedure has recently been published as an International Standard, ISO 9921-1:1996. This standard allows for additional circumstances, such as when a talker is wearing hearing protection and consequently uses a lower than normal vocal effort.
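The four-band SIL calculation described above can be sketched as follows. The function name and the example noise levels are hypothetical, and the look-up of maximum communication distance from the standards is deliberately not reproduced here.

```python
SIL_BANDS_HZ = (500, 1000, 2000, 4000)


def speech_interference_level(band_levels_db):
    """Arithmetic mean of the octave-band noise levels (dB) at
    500 Hz, 1 kHz, 2 kHz and 4 kHz.

    `band_levels_db` maps octave-band centre frequency in Hz to the
    measured noise level in dB.
    """
    missing = [f for f in SIL_BANDS_HZ if f not in band_levels_db]
    if missing:
        raise ValueError(f"missing octave bands: {missing}")
    return sum(band_levels_db[f] for f in SIL_BANDS_HZ) / len(SIL_BANDS_HZ)


if __name__ == "__main__":
    # Hypothetical octave-band noise levels for a workshop, in dB.
    noise = {250: 65.0, 500: 60.0, 1000: 58.0, 2000: 55.0, 4000: 51.0}
    print(speech_interference_level(noise))  # (60 + 58 + 55 + 51) / 4 = 56.0
```

Bands outside the four SIL bands, such as the 250 Hz entry in the example, are simply ignored.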
With face to face speech it is also possible to estimate intelligibility from the speech-to-noise ratio, where both the speech and noise are A-weighted. This method is not always reliable, depending upon the noise spectrum.
The Articulation Index, AI, can be used with face to face conversation but is more complicated than SIL and although possibly more reliable, the extra reliability does not warrant the extra complication. The Articulation Index is frequently used with amplified speech where the SIL is not appropriate, and is described below.
A more recent measure is the Speech Transmission Index, or STI, usually implemented in a simplified version known as 'RASTI' - RApid Speech Transmission Index. This method uses a transmitter to broadcast a special modulated noise test signal from a loudspeaker at the talker's location. A receiver with a microphone gives a direct read out of the RASTI value at the receiver position. The receiver is light and portable and is generally moved around to survey the whole of the possible listening area, identifying any difficult locations. The RASTI method can also take account of the effects of reverberation, as well as background noise, on intelligibility. A RASTI value can be in the range 0 to 1. Generally a RASTI value above 0.75 is regarded as excellent, 0.6 to 0.75 as good, 0.45 to 0.6 as fair, 0.3 to 0.45 as poor, and below 0.3 as unsatisfactory. These values can only be a guide: what is regarded as good in some circumstances with trained talkers, experienced listeners and a limited number of possible messages may be poor in other circumstances with inexperienced talkers or listeners, or where messages are unpredictable.
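The guide ratings quoted above can be expressed as a simple look-up. This is only a sketch: the function name is hypothetical, and assigning a value that falls exactly on a boundary (for example 0.6) to the higher category is an assumption, since the text does not say which side the boundaries belong to.

```python
def rate_rasti(value):
    """Map a RASTI value (0 to 1) to the guide rating quoted in the text."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("RASTI values lie in the range 0 to 1")
    if value > 0.75:
        return "excellent"
    if value >= 0.6:
        return "good"
    if value >= 0.45:
        return "fair"
    if value >= 0.3:
        return "poor"
    return "unsatisfactory"


if __name__ == "__main__":
    # Hypothetical survey readings at several listening positions.
    for position, reading in [("front", 0.8), ("middle", 0.52), ("rear", 0.28)]:
        print(position, reading, rate_rasti(reading))
```

As the text notes, these labels are a guide only and should be interpreted in the light of the talkers, listeners and message set involved.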
Amplified speech systems include public address systems, telephones, radio links, intercoms and the like. For systems subject mainly to noise at the talker's and/or listener's position, or to bandwidth limitations or some analogue distortions, the simplest alternative to intelligibility testing is the Articulation Index, or AI. The AI is calculated by measuring the speech-to-noise ratios in a number of frequency bands in the speech frequency range. The speech-to-noise ratio in each band is multiplied by a weighting factor according to how important that band is to speech intelligibility, then the weighted values for all the bands are added up to give a single number from 0 to 1. Generally an AI greater than 0.7 is excellent, an AI between 0.5 and 0.7 is good, an AI between 0.3 and 0.5 is acceptable for some applications, and less than 0.3 is generally unsatisfactory. These values again can only be a guide: what is good in some circumstances may be poor in other circumstances.
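The weighted-sum structure of the AI calculation can be sketched as below. This is a simplified illustration, not the standardised procedure: the function name is hypothetical, the equal band weights in the example are placeholders rather than published importance weights, and limiting each band's speech-to-noise ratio to a 0 to 30 dB range before scaling is an assumption made here so that the result falls between 0 and 1.

```python
def articulation_index(snr_db, weights, dynamic_range_db=30.0):
    """Weighted sum of per-band speech-to-noise ratios, scaled to 0..1.

    snr_db           -- speech-to-noise ratio in each band, in dB
    weights          -- relative importance of each band (normalised here)
    dynamic_range_db -- assumed useful range of SNR per band (sketch only)
    """
    if len(snr_db) != len(weights):
        raise ValueError("snr_db and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    ai = 0.0
    for snr, weight in zip(snr_db, weights):
        # Bands with negative SNR contribute nothing; bands above the
        # assumed dynamic range contribute their full weight.
        clipped = min(max(snr, 0.0), dynamic_range_db)
        ai += (weight / total_weight) * (clipped / dynamic_range_db)
    return ai


if __name__ == "__main__":
    # Hypothetical four-band measurement with placeholder equal weights.
    print(articulation_index([30.0, 15.0, 0.0, -5.0], [1, 1, 1, 1]))
```

A real assessment would use the band definitions and importance weights from the relevant standard, and the resulting value would then be read against the guide ranges given above.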
The RASTI method is increasingly used for amplified speech systems though care must be taken to match the level of the RASTI test signal to levels of normal speech used with the system. There are some circumstances under which the RASTI method cannot be used reliably, so it cannot always be used in place of the AI.
Vocoded or analysis-synthesis speech systems, and some digital radio systems, cannot be tested using RASTI, AI or similar straightforward physical measurements. Instead, full-scale intelligibility testing has to be carried out with panels of talkers and listeners. The reason is that the acoustic features of this unnatural speech which listeners use to discriminate between phonemes may be fewer than with real speech and differently distributed in the spectrum. Generally word lists are used rather than sentence lists as speech material, because sentences provide grammatical and contextual clues enabling listeners to guess words. Sometimes, however, when investigating the effects of some types of radio interference on intelligibility, sentence lists are the most appropriate.
Examples of applications
ISVR Consulting has used AI and RASTI to evaluate Passenger Public Address systems in helicopters and fixed-wing aircraft, including measurements to the Civil Aviation Authority's Specification No 15, which is a requirement for certification in the UK and some Commonwealth countries. We have also carried out intelligibility tests of Private Mobile Radio (PMR) systems subjected to various forms of interference for the Radiocommunications Agency of the DTI, and Diagnostic Rhyme Tests of low bit rate communication systems. ISVR Consulting has also generated and supplied recorded speech material for use by others in their own test programmes.