Acoustic Phonetics I: Basic Acoustics
Allard Jongman | The University of Kansas
Antônio R. M. Simões | The University of Kansas

Phonetics, the scientific study of human speech sounds, is a multifaceted field that can be broadly categorized into three fundamental domains: Acoustic Phonetics decomposes the speech signal into physical parameters such as frequency and amplitude; Articulatory Phonetics investigates the physiological aspects involved in creating speech sounds, exploring the gestures and mechanisms of the vocal tract; and Auditory Phonetics explores how individuals perceive and interpret speech. This overview takes a closer look at Acoustic Phonetics and its role in the broader context of Phonetics.

Acoustic Phonetics: Imagine your voice as a symphony. Acoustic phonetics explores the physics of this symphony: the production of sound waves and how they travel. Articulatory Phonetics: Picture your voice as an instrument. Articulatory phonetics studies how your "voice instrument" is played, that is, how your vocal folds, tongue, and mouth move to create different sounds. Auditory Phonetics: Now, envision your voice as a story. Auditory phonetics probes how this story is received and interpreted, exploring the auditory journey from sound waves to comprehension.

The human voice is like a complex musical instrument, consisting of three primary components. Power Supply: Comparable to the energy source of an instrument, the lungs provide the necessary power to generate sounds. Oscillator: Just as a guitar string creates music, the vocal folds act as the oscillator, generating the initial sounds. Resonator: Similar to adjusting an instrument for the perfect pitch, the larynx, pharynx, and mouth shape and refine these sounds into the melodies of speech.

The Dance of Sounds: Periodic vs. Aperiodic

Sounds can be classified into two types based on whether or not the vocal folds are vibrating. Periodic (voiced) Sounds: These sounds are created by the cyclic opening and closing of the vocal folds, akin to the steady vibration of a guitar string. Vowel sounds in speech are classic examples of periodic sounds. Aperiodic (voiceless) Sounds: These sounds do not involve vocal fold vibration and therefore have no distinct pattern or cycle, just like the random rustling of leaves in the wind. Voiceless consonants such as [s] and [f] fall under this category.

So far, we have contextualized Acoustic Phonetics within the wider realm of Phonetics. We have kept technical terminology to a minimum in these sections, but discussing Phonetics, or any specialized field, without its precise terminology risks a loss of accuracy. With this in mind, the following sections aim to remain accessible to a general audience while introducing the common terminology of Acoustic Phonetics.

Acoustic phonetics is the study of the acoustic characteristics of speech, that is, the study of the physical attributes of speech sounds. In essence, speech involves intricate variations in air pressure resulting from the disturbance of air molecules during the exhalation of air from the lungs. This airflow sets the air molecules in a rhythmic motion, alternately congregating and dispersing, resulting in oscillations that translate to periodic fluctuations in air pressure. These changes in pressure are then conveyed as a sound wave, transmitting the message from speaker to listener. Understanding sound waves requires a grasp of four fundamental physical properties: cycle, period, frequency, and amplitude. These concepts can be demonstrated by examining a simple wave corresponding to a pure tone, such as the two pure tones depicted in Figure 1. Pure tones are periodic, as their repetitive pattern illustrates.

Figure 1. Two pure tones of different amplitudes, but identical frequencies. (Adapted from Simões, 2022)

Figure 1 illustrates fundamental acoustic parameters, including cycle, period, amplitude, and, indirectly, frequency. The solid horizontal line indicates average atmospheric air pressure, and the red and blue waves show periodic increases and decreases in pressure. Going from left to right, for both tones the pressure increases from atmospheric pressure to maximum pressure, decreases to atmospheric pressure again, decreases further to minimum pressure, and then increases to reach atmospheric pressure again. This is known as one cycle of the wave. Figure 1 shows three complete cycles. A typical tone (or periodic speech sound) would have many more cycles. The duration of one cycle is known as the period of the wave, expressed in seconds or milliseconds (ms), while frequency is the number of cycles completed in one second, expressed in hertz (Hz). Frequency is therefore the reciprocal of the period: a wave with a period of 10 ms (0.01 s) has a frequency of 100 Hz. Perceptually, a higher frequency generally corresponds to a higher perceived pitch. Amplitude, in turn, refers to the magnitude of the vibrations: larger vibrations result in greater amplitude. Note that the blue wave has a greater amplitude than the red wave. A greater amplitude will typically result in a perceived increase in loudness.
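
The relationship between two tones of identical frequency but different amplitude can be sketched in a few lines of Python. This is only an illustration: the 100 Hz frequency, the amplitudes of 0.5 and 1.0, the 8000 Hz sampling rate, and the function name pure_tone are assumptions chosen for the example, not values taken from Figure 1.

```python
import math

def pure_tone(frequency_hz, amplitude, n_cycles, sample_rate_hz=8000):
    """Sample a sine wave: pressure deviation from the resting (atmospheric) level."""
    # Each cycle lasts one period, and the period is 1 / frequency.
    duration_s = n_cycles / frequency_hz
    n_samples = round(duration_s * sample_rate_hz)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# Two hypothetical tones: same frequency (100 Hz), different amplitudes,
# three complete cycles each.
soft_tone = pure_tone(frequency_hz=100, amplitude=0.5, n_cycles=3)
loud_tone = pure_tone(frequency_hz=100, amplitude=1.0, n_cycles=3)

period_ms = 1000 / 100  # a 100 Hz tone has a period of 10 ms
print(f"period: {period_ms} ms, samples per tone: {len(soft_tone)}")
print(f"peak amplitudes: {max(soft_tone):.2f} vs {max(loud_tone):.2f}")
```

Both lists cover the same three cycles and cross the resting level at the same moments; only the height of the peaks differs, which is what distinguishes the two waves in terms of amplitude.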

Pure tones are seldom encountered in the natural environment. Speech sounds, in contrast, consist of complex waves that include many frequencies and amplitudes. Nonetheless, as first shown by the French mathematician Fourier in the early 19th century, any complex wave can be represented as a combination of simple waves. This is illustrated in Figure 2. Fourier analysis applied to the complex red waveform on the left reveals the three simple sine waves on the right (and their amplitudes) that make up the complex wave.

Figure 2. Complex wave on the left and the three sine waves on the right that make up the complex wave, as revealed by Fourier analysis. (Adapted from Simões, 2022)
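
The idea behind Figure 2 can be tried out numerically. The sketch below adds three sine waves into one complex wave and then recovers their frequencies and amplitudes with a plain discrete Fourier transform. The component values (100, 200, and 300 Hz with amplitudes 1.0, 0.5, and 0.25), the sampling rate, and all function names are hypothetical choices for illustration; they are not the components shown in the figure, and a real analysis would normally use a fast Fourier transform from a numerical library.

```python
import cmath
import math

SAMPLE_RATE_HZ = 1000  # assumed sampling rate
N_SAMPLES = 1000       # one second of signal, so DFT bin k corresponds to k Hz

# Hypothetical components of the complex wave: (frequency in Hz, amplitude).
COMPONENTS = [(100, 1.0), (200, 0.5), (300, 0.25)]

def complex_wave(components, n_samples=N_SAMPLES, sample_rate_hz=SAMPLE_RATE_HZ):
    """Synthesis: add the simple sine waves together, sample by sample."""
    return [
        sum(a * math.sin(2 * math.pi * f * n / sample_rate_hz) for f, a in components)
        for n in range(n_samples)
    ]

def amplitude_spectrum(samples, max_bin):
    """Analysis: naive DFT giving the amplitude in bins 0..max_bin.

    The 2/N scaling recovers sine amplitudes for bins strictly between
    0 and N/2, which is all this example needs.
    """
    n_samples = len(samples)
    spectrum = []
    for k in range(max_bin + 1):
        total = sum(
            x * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, x in enumerate(samples)
        )
        spectrum.append(2 * abs(total) / n_samples)
    return spectrum

def find_components(spectrum, threshold=0.01):
    """Keep only the bins whose amplitude clearly exceeds zero."""
    return {k: round(a, 3) for k, a in enumerate(spectrum) if a > threshold}

wave = complex_wave(COMPONENTS)
recovered = find_components(amplitude_spectrum(wave, max_bin=400))
print(recovered)
```

Because each component completes a whole number of cycles within the analysed second, its energy falls into exactly one bin, and the recovered dictionary maps each of the three frequencies back to its original amplitude.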

Periodic complex waves repeat at a consistent rate, referred to as the fundamental frequency (F0). In voiced speech, F0 corresponds to the rate (frequency) of vocal fold vibration, and changes in F0 contribute to variations in perceived voice pitch. Likewise, changes in the number of individual simple wave components and in their amplitude relations yield distinctions in perceived timbre or quality.
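
A small experiment can make the distinction between F0 and timbre concrete. The sketch below builds two complex waves from the same three harmonics of 125 Hz but with different amplitude relations, and then estimates the repetition rate of each. The estimate uses autocorrelation (finding the time shift at which the wave best matches itself), which is one common approach and is not a method described in this overview. The harmonic amplitudes, the 8000 Hz sampling rate, the 75 to 400 Hz search range, and the function names are all assumptions made for the example.

```python
import math

SAMPLE_RATE_HZ = 8000  # assumed sampling rate

def harmonic_wave(f0_hz, harmonic_amplitudes, duration_s=0.1,
                  sample_rate_hz=SAMPLE_RATE_HZ):
    """Sum sine waves at f0, 2*f0, 3*f0, ... with the given amplitudes."""
    n_samples = round(duration_s * sample_rate_hz)
    return [
        sum(
            amp * math.sin(2 * math.pi * (k + 1) * f0_hz * n / sample_rate_hz)
            for k, amp in enumerate(harmonic_amplitudes)
        )
        for n in range(n_samples)
    ]

def estimate_f0(samples, sample_rate_hz=SAMPLE_RATE_HZ,
                f0_min_hz=75, f0_max_hz=400):
    """Estimate the repetition rate via autocorrelation.

    The search range is an assumption: it must contain the true period
    but not twice the period, which would match the wave equally well.
    """
    min_lag = int(sample_rate_hz / f0_max_hz)
    max_lag = int(sample_rate_hz / f0_min_hz)
    window = len(samples) - max_lag  # same number of products for every lag

    def autocorrelation(lag):
        return sum(samples[n] * samples[n + lag] for n in range(window))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorrelation)
    return sample_rate_hz / best_lag

# Same F0, different amplitude relations among the harmonics (different timbre).
wave_a = harmonic_wave(125, [1.0, 0.5, 0.25])
wave_b = harmonic_wave(125, [0.3, 1.0, 0.6])

print(estimate_f0(wave_a), estimate_f0(wave_b))
```

The two waveforms look different sample by sample, yet both repeat at the same rate, so the estimated F0 is the same for both. Real speech is only approximately periodic, so estimates on recorded vowels are less clean than in this synthetic case.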

Fourier’s theorem enables us to represent speech sounds in terms of the frequency and amplitude of each of their constituent simple waves. Such a representation, depicted in Figure 3, is known as the spectrum of a sound. A spectrum is visually displayed as a plot of frequency vs. amplitude, with frequency represented from low to high along the horizontal axis and amplitude from low to high along the vertical axis.

Panel (a) of Figure 3 shows the waveform of a pure tone or simple wave. As indicated, the period of this waveform is 5 ms (the wave repeats itself every 5 ms), and its frequency, the number of cycles in one second (1000 ms), is 200 Hz. Panel (b) shows its corresponding spectrum. Note that a simple wave has only one frequency; the spectrum shows this as a single component (vertical line) at a frequency of 200 Hz.

Panel (c) shows the waveform of a brief stretch of the vowel [i]. The waveform is clearly more complex than that in panel (a). This is because in this complex waveform, many more frequency components are present. Yet, there is a basic pattern of repetition: the wave repeats itself every 8 ms, corresponding to a frequency of 125 Hz. Note that this 125 Hz frequency component is one of many that are present in this vowel. This is clear in the corresponding spectrum in panel (d), which shows a large number of frequencies (bumps).

Figure 3. Examples of waveforms and spectra for a simple wave/pure tone (a, b) and a complex wave (vowel) (c, d). The spectra illustrate the Fourier analysis: sounds are decomposed into their constituent frequencies, one in (b), and many in (d).
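
The two period-to-frequency conversions described for Figure 3 follow from the same arithmetic: one second contains 1000 ms, so the frequency in hertz is 1000 divided by the period in milliseconds. A minimal sketch, with function names chosen here purely for illustration:

```python
def period_to_frequency(period_ms):
    """Frequency in Hz of a wave that repeats every `period_ms` milliseconds."""
    return 1000.0 / period_ms

def frequency_to_period(frequency_hz):
    """Period in ms of a wave that repeats `frequency_hz` times per second."""
    return 1000.0 / frequency_hz

print(period_to_frequency(5))    # pure tone in panel (a): 200.0
print(period_to_frequency(8))    # vowel [i] in panel (c): 125.0
print(frequency_to_period(200))  # back to the period: 5.0
```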

Further Reading

Ladefoged, P. (1996). Elements of Acoustic Phonetics. Chicago: The University of Chicago Press.

Reetz, H., and Jongman, A. (2020). Phonetics: Transcription, Production, Acoustics, and Perception. Oxford: Wiley-Blackwell.

Simões, A.R.M. (2022). Spanish and Brazilian Portuguese Pronunciation - The Mainstream Pronunciation of Spanish and Brazilian Portuguese: From Sound Segments to Speech Melodies. Singapore: Springer, Series: Prosody, Phonology and Phonetics.