Mean Opinion Score

In multimedia (audio, voice telephony, or video), especially when codecs are used to reduce the bandwidth requirement (for example, of a digitized voice connection from the standard 64 kbit/s PCM rate), the mean opinion score (MOS) provides a numerical indication of the perceived quality of received media after compression and/or transmission. The MOS is expressed as a single number in the range 1 to 5, where 1 is the lowest perceived quality and 5 is the highest.

MOS tests for voice are specified by ITU-T recommendation P.800.

The MOS is generated by averaging the results of a set of standard, subjective tests in which a number of listeners rate the audio quality of test sentences read aloud by both male and female speakers over the communications medium being tested. Each listener gives each sentence a rating using the following scheme:


Mean opinion score (MOS)

MOS   Quality     Impairment
5     Excellent   Imperceptible
4     Good        Perceptible but not annoying
3     Fair        Slightly annoying
2     Poor        Annoying
1     Bad         Very annoying

The MOS is the arithmetic mean of all the individual scores, and can range from 1 (worst) to 5 (best).
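
The averaging step can be illustrated with a minimal Python sketch. The function name `mean_opinion_score` and the sample ratings are hypothetical and chosen only for illustration; the sketch assumes each listener submits an integer rating from the five-point scheme above.

```python
# Labels from the five-point rating scheme
LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def mean_opinion_score(ratings):
    """Return the arithmetic mean of ratings on the 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for r in ratings:
        if r not in LABELS:
            raise ValueError(f"rating {r!r} is not on the 1-5 scale")
    return sum(ratings) / len(ratings)


# Hypothetical ratings from eight listeners for one test condition
ratings = [4, 5, 3, 4, 4, 5, 3, 4]
print(f"MOS = {mean_opinion_score(ratings):.2f}")
```

Because individual ratings are whole numbers but the mean is not rounded, a MOS such as 3.92 is a property of the listener panel as a whole rather than of any single rating.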

Compressor/decompressor (codec) systems and digital signal processing (DSP) are commonly used in voice communications, and can be configured to conserve bandwidth, but there is a trade-off between voice quality and bandwidth conservation. The best codecs provide the most bandwidth conservation while producing the least degradation of voice quality. Bandwidth can be measured quantitatively, but voice quality requires human interpretation, although estimates of voice quality can be made by automatic test systems.

A similar process can be used to evaluate subjective video quality.

As an example, the following are mean opinion scores for one implementation of different codecs:


Codec          Data rate [kbit/s]   Mean opinion score (MOS)
G.711 (ISDN)   64                   4.3
iLBC           15.2                 4.14
AMR            12.2                 4.14
G.729          8                    3.92
G.723.1 r63    6.3                  3.9
GSM EFR        12.2                 3.8
G.726 ADPCM    32                   3.8
G.729a         8                    3.7
G.723.1 r53    5.3                  3.65
GSM FR         12.2                 3.5
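
To make the trade-off between bandwidth and quality concrete, the following Python sketch picks the highest-scoring codec from the table above that fits within a given data-rate budget. The helper `best_codec_within` and the tie-breaking rule (prefer the lower data rate when scores are equal) are assumptions made for illustration; the figures are copied from the table and apply only to the one implementation it describes.

```python
# (codec, data rate in kbit/s, MOS), copied from the table above
CODECS = [
    ("G.711 (ISDN)", 64.0, 4.3),
    ("iLBC", 15.2, 4.14),
    ("AMR", 12.2, 4.14),
    ("G.729", 8.0, 3.92),
    ("G.723.1 r63", 6.3, 3.9),
    ("GSM EFR", 12.2, 3.8),
    ("G.726 ADPCM", 32.0, 3.8),
    ("G.729a", 8.0, 3.7),
    ("G.723.1 r53", 5.3, 3.65),
    ("GSM FR", 12.2, 3.5),
]


def best_codec_within(budget_kbits, codecs=CODECS):
    """Return the highest-MOS entry whose data rate fits the budget, or None."""
    candidates = [c for c in codecs if c[1] <= budget_kbits]
    if not candidates:
        return None
    # Prefer the higher MOS; break ties with the lower data rate
    return max(candidates, key=lambda c: (c[2], -c[1]))


for budget in (64, 16, 8):
    print(budget, best_codec_within(budget))
```

With these figures, a 16 kbit/s budget selects AMR over iLBC, since both score 4.14 and AMR uses less bandwidth.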

A drawback of obtaining MOS ratings is that the process can be time-consuming and expensive, because it requires hiring experts to make the assessments. When a voice coding system is under development, or a developer has to test and compare several audio systems, it is important to have a way to run a quick check.

ITU-T recommendation P.800 suggests English-language phrases such as the following for determining a MOS:

  • You will have to be very quiet.
  • There was nothing to be seen.
  • They worshipped wooden idols.
  • I want a minute with the inspector.
  • Did he need any money?

See also

  • Subjective video quality
  • MUSHRA (ITU-R recommendation BS.1534)
  • PSQM (Perceptual Speech Quality Measure), ITU-T P.861, withdrawn and replaced by PESQ (ITU-T P.862)
  • PESQ (Perceptual Evaluation of Speech Quality), a mechanism for the automated assessment of the speech quality experienced by the user of a telephone system, standardised as ITU-T recommendation P.862 (02/01)
  • PEVQ (Perceptual Evaluation of Video Quality), a measurement algorithm for the automated assessment of video quality
  • PEAQ (Perceptual Evaluation of Audio Quality), a measurement algorithm for the automated assessment of audio quality

External links