Vocal Quality

So far we have been looking at pitch and loudness. Now we'll take a look at vocal quality. As a review, the diagram below lists the three perceptual attributes of voice (pitch, loudness, and quality) along with the acoustic and physiologic correlate of each.


Voice Characteristics

Perceptual Attribute | Acoustic Correlate | Primary Physiologic Correlate
Pitch | Fundamental Frequency | Rate of Vocal Fold Vibration
Loudness | Intensity or Sound Pressure Level | Subglottal Pressure
Quality | Spectral Distribution | Mode of Vocal Fold Vibration


Voice quality is a perceptual construct, studied in the realm of voice disorders. Acoustically, it is measured by spectral distribution; physiologically, it has to do with the synchrony of vocal fold movement. We can examine spectral distribution with the graphs in the figure below, which show various ways to produce the sound [a].

[Figure: tau.jpg]

In this figure, T represents tau, the period of a repeating waveform. If the periods are fairly steady or regular, we get a smooth-sounding voice. If the periods are exactly the same, we get a computerized voice. The figure shows several types of voices.


Breathy: This voice has lost the richness of the waveform. The vocal folds are probably vibrating but are not making good contact. The breathy waveform looks like the glottal, or triangular, waveform. The air flowing between the vocal folds is turbulent and noisy.


Laryngealized: This would be a harsh or rough voice. T2 is slightly longer than T1. A rough or hoarse voice can be the first indication of a voice problem. The vocal folds are vibrating in a less periodic way, and this adds noise to the signal, which is perceived as hoarseness or roughness.


Whispered: There is no voicing here, so there is no periodicity. Basically, the vocal folds stay apart from each other at the midline. This shape is noise.


Pulsated: This shape is glottal fry. The closure of the vocal folds is loose, air bubbles up through the glottis, and the sound of the voice is gravelly.


Voiceless: The voiceless graph really looks like silence. A truly voiceless sound would show some wiggles in the line, because sound is still being produced.
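
The period T in the figure connects directly to the acoustic correlate of pitch: fundamental frequency is the reciprocal of the period. The short Python sketch below illustrates that relationship. The period values are made up for illustration and are not measurements from the figure.

```python
def fundamental_frequency_hz(period_s):
    """Return the fundamental frequency (Hz) for a period T given in seconds."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s


# Hypothetical periods for four consecutive cycles of a sustained [a].
example_periods_s = [0.0080, 0.0081, 0.0080, 0.0079]

for period_s in example_periods_s:
    f0 = fundamental_frequency_hz(period_s)
    print(f"T = {period_s * 1000:.1f} ms  ->  F0 = {f0:.1f} Hz")
```

A period of 8 ms corresponds to 125 Hz; a slightly longer period gives a slightly lower frequency.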


Here are some more terms associated with voice quality:


Jitter vs. shimmer: Jitter is cycle-to-cycle perturbation in frequency, and shimmer is cycle-to-cycle perturbation in amplitude. We know that the laryngealized voice in the figure is harsh or rough. The difference between T1 and T2 is jitter: a cycle-to-cycle fluctuation in frequency. For a smooth voice, you don't want too much jitter or shimmer. You want frequency and amplitude to be similar from peak to peak.
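
To make the two definitions concrete, here is a minimal Python sketch of one common way to quantify them: the mean absolute difference between consecutive cycles, expressed as a percentage of the mean (often called local jitter and local shimmer). Jitter is computed on periods here, since frequency is just the reciprocal of the period. The function name and all measurement values are hypothetical and chosen only for illustration.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference, as a percentage of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    differences = [abs(later - earlier) for earlier, later in zip(values, values[1:])]
    mean_difference = sum(differences) / len(differences)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_difference / mean_value


# Hypothetical per-cycle measurements (not taken from the figure).
periods_s = [0.0080, 0.0084, 0.0080, 0.0084]   # alternating T1, T2
peak_amplitudes = [1.00, 0.90, 1.00, 0.90]     # arbitrary units

jitter_percent = local_perturbation_percent(periods_s)    # applied to periods
shimmer_percent = local_perturbation_percent(peak_amplitudes)  # applied to amplitudes
print(f"jitter:  {jitter_percent:.2f} %")
print(f"shimmer: {shimmer_percent:.2f} %")
```

A perfectly regular list of periods, such as the "computerized" voice, would give a jitter of zero with this measure.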



Vibrato: This is a feature of the singing voice, a controlled or purposeful variation in frequency and usually intensity. It is usually at a rate of about 5 Hz, which means 5 times per second. For example, if [a] is spoken, the cycles are constant in amplitude and frequency. But if it is sung, we can look at hundreds of these cycles and see them wax and wane. With vibrato, this increase and decrease would occur about 5 times per second.
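
One simple way to picture vibrato is as a slow sine wave riding on top of the sung pitch. The sketch below models only the frequency variation, under that assumption. The center frequency (220 Hz) and the extent of the variation (6 Hz up and down) are hypothetical values, and only the 5 Hz rate comes from the text.

```python
import math


def vibrato_f0_hz(time_s, center_hz=220.0, extent_hz=6.0, rate_hz=5.0):
    """F0 at a given time for a sinusoidal vibrato around a center frequency."""
    return center_hz + extent_hz * math.sin(2.0 * math.pi * rate_hz * time_s)


# One vibrato cycle at 5 Hz lasts 1 / 5 = 0.2 s; sample it every 50 ms.
for step in range(5):
    time_s = step * 0.05
    print(f"t = {time_s:.2f} s  F0 = {vibrato_f0_hz(time_s):.1f} Hz")
```

Over one 0.2-second vibrato cycle the frequency rises above the center, falls below it, and returns, which is the waxing and waning described above.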


Tremor: We see this with neurologic impairment or in an older voice. Unlike vibrato, tremor is NOT controlled and NOT purposeful, and has to do with the characteristics of the nervous system. It is a bit faster than vibrato, about 8 Hz. So, over 100 cycles, you would see the waxing and waning at about 8 Hz.
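
The rate of a slow fluctuation like this can be estimated from a track of fundamental frequency over time. The sketch below is one rough approach, not a clinical tool: it counts how often the track rises through its own average. The contours are synthetic, and the 200 Hz center, 4 Hz extent, and 100 values per second are all assumptions made for the example. Note that the rate alone says nothing about whether the variation is purposeful.

```python
import math


def modulation_rate_hz(f0_contour_hz, frames_per_second):
    """Estimate how many times per second the F0 contour rises through its mean."""
    mean_f0 = sum(f0_contour_hz) / len(f0_contour_hz)
    centered = [f0 - mean_f0 for f0 in f0_contour_hz]
    upward_crossings = sum(
        1 for earlier, later in zip(centered, centered[1:]) if earlier < 0 <= later
    )
    duration_s = len(f0_contour_hz) / frames_per_second
    return upward_crossings / duration_s


def synthetic_contour(rate_hz, frames_per_second=100, duration_s=2.0):
    """Hypothetical F0 contour: 200 Hz center with a +/- 4 Hz cosine modulation."""
    frame_count = int(frames_per_second * duration_s)
    return [
        200.0 + 4.0 * math.cos(2.0 * math.pi * rate_hz * frame / frames_per_second)
        for frame in range(frame_count)
    ]


for label, rate_hz in [("vibrato-like", 5.0), ("tremor-like", 8.0)]:
    estimate = modulation_rate_hz(synthetic_contour(rate_hz), frames_per_second=100)
    print(f"{label}: about {estimate:.1f} cycles per second")
```

Counting mean crossings is a deliberately simple choice; on a real, noisy voice recording the contour would need smoothing first.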


For vocal quality, the synchrony of the vocal folds is important. If one vocal fold is paralyzed, for example, the voice will be breathy. The non-paralyzed fold CAN cross the midline to some degree, and it is common for a paralyzed fold to rest somewhere in the middle. You also get asynchrony if there is a tumor in one of the vocal folds.


The tightness of vocal fold adduction 

We also need to consider the tightness of the adduction for vocal quality. Tightness is on a continuum. During a whistle, there is no oscillation, that is, no vibration of the vocal folds. During quiet breathing, the vocal folds are far apart, so there is no sound. A phenomenon called stridor is seen in infants: the vocal folds are too close together, and you can hear the infant breathe. Stridor is a noisy turbulence. This is in contrast to a whisper, where there is no voicing. The vocal folds stay apart from each other at the midline. This shape is noise. It has no fundamental frequency, no harmonic structure, and can't be easily inflected. Whispering is not an efficient way to use the breath supply. We can phonate for 30 seconds, but we can whisper for only 10 seconds before having to take a breath.
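
The 30-second versus 10-second comparison implies something about airflow. As a rough worked example, assume (this is not stated in the text) that the same usable air volume V is spent in both cases, and let U with a bar denote mean airflow:

```latex
\bar{U}_{\text{phonation}} = \frac{V}{30\ \text{s}}, \qquad
\bar{U}_{\text{whisper}} = \frac{V}{10\ \text{s}}, \qquad
\frac{\bar{U}_{\text{whisper}}}{\bar{U}_{\text{phonation}}} = \frac{30}{10} = 3
```

Under that assumption, whispering spends air about three times as fast as phonation, which is one way to read the statement that it is not an efficient use of the breath supply.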


Two terms, hypoadduction and hyperadduction, are used to designate the tightness of vocal fold adduction. In a normal voice, the vocal folds come together, but not excessively, so there is not a lot of strain on the vocal folds.

With hypoadduction, we get a breathy voice. As you recall, during a breathy voice, the vocal folds are probably vibrating, but not making good contact.  We don't get a rich waveform, and it looks a little like the triangular glottal waveform.  The air flowing between the folds is turbulent and noisy. 


With hyperadduction, the folds come together very tightly, and we get a strained voice. In the disorder spasmodic (or spastic) dysphonia, there is zero airflow when the vocal folds are adducted. On occasion, a glottal stop will be produced. The /o/ in "owe" starts with a glottal stop. But in /ho/, the glottis is open and the folds gradually start vibrating as they come together. If there is too much adduction, one technique in therapy is to have clients start words with /h/.


Now that you've learned about various voices, take a short quiz about glottal size.



Match the items.

a. small or medium glottis

b. no glottis

c. large glottis