Phonetics vs. Phonology

Phonetics: the physical manifestation of language in sound waves; how these sounds are articulated and perceived.
Phonology: the mental representation of sounds as part of a symbolic cognitive system; how abstract sound categories are manipulated in the processing of language.

The sound structure of language encompasses quite a lot of topics, including the following.
the anatomy, physiology, and acoustics of the human vocal tract;
the nomenclature for the vocal articulations and sounds used in speech, as represented by the International Phonetic Alphabet;
hypotheses about the nature of phonological features and their organization into segments, syllables and words;
the often-extreme changes in the sound of morphemes in different contexts;
the way that knowledge of language sound structure unfolds as children learn to speak;
the variation in sound structure across dialects and across time.
Instead of giving a whirlwind tour of the whole of phonetics and phonology, this lecture has two more limited goals. The first goal is to put language sound structure in context.

Why do human languages have a sound structure about which we need to say anything more than that vocal communication is based on noises made with the eating and breathing apparatus?
What are the apparent "design requirements" for this system, and how are they fulfilled?
The second goal is to give you a concrete sense of what the sound systems of languages are like. In order to do this, we will go over examples of sound alternations in various languages. Along the way, a certain amount of the terminology and theory of phonetics and phonology will emerge.

Phonetics: the sounds of language

While our discussion will range back and forth somewhat between the two subdisciplines, we will essentially be progressing from the nuts and bolts mechanics of speech sounds through their classification and representation and on to their systematic organization within a given language. Thus we can divide up the lecture into a more or less phonetic half and a more or less phonological half.

Vocal tract anatomy

The vocal tract is what we use to articulate sounds. It includes the oral cavity (essentially the mouth), the nasal cavity (inside the nose), and the pharyngeal cavity (in the throat, behind the tongue). For most speech sounds, the airstream that passes through this tract is generated by the lungs.

A number of anatomical features of humans that originated for quite different functions have been recruited to serve the purposes of language. Many of these same recruitments have been made by other animals for vocalization.

Organ        Survival function                                  Speech function
Lungs        exchange oxygen and carbon dioxide                 supply airstream
Vocal cords  prevent food and liquids from entering the lungs   produce vibration in resonating cavity
Tongue       move food within the mouth                         articulate sounds
Teeth        break up food                                      provide passive articulator and acoustic baffle
Lips         seal oral cavity                                   articulate sounds

In some cases the anatomy seems to have evolved specifically to serve language, independently of (and even contrary to) the original function.

For example, the vocal cords in humans are more muscular and less fatty than in other primates such as chimps and gorillas. This permits greater control over their precise configuration.

Strikingly, the lowering of the larynx, which permits a greater variety of articulations with the tongue, has the consequence of making it much easier for humans to choke. These X-rays and diagrams show the vocal tracts of the gorilla, chimp, and human, highlighting the tongue, larynx, and air sacs (the last for the apes only).

The longer vocal tract (seen behind the tongue in the human) separates the soft palate and epiglottis, so that airflow between the larynx and the nose cannot avoid passing through the oral cavity. This is why humans choke more easily than other primates. Obviously the selective advantage of increased articulatory ability must have been quite strong to justify the increase in the likelihood of choking.

The following illustration is called a midsagittal section: it's what the head would look like if you cut it in half along the front-back dimension.

From the Ultimate Visual Dictionary, p. 245

This diagram includes many detailed anatomical features that you certainly don't need to learn, but it should give you an idea of the complex context in which speech sounds are articulated.

Here is a less detailed diagram showing the most important parts of the vocal tract.

From Language Files (7th ed.), p. 40

We'll be referring to these places in the vocal tract when describing the way various sounds are produced.

Basic sounds: buzz, hiss, and pop.

There are three basic modes of sound production in the human vocal tract that play a role in speech: the buzz of vibrating vocal cords, the hiss of air pushed past a constriction, and the pop of a closure released.

Laryngeal buzz

The larynx is a rather complex little structure of cartilage, muscle and connective tissue, sitting on top of the trachea (windpipe). It is what lies behind the "Adam's apple," the protrusion in the front of the throat (usually more prominent in males). The original role of the larynx is to seal off the airway, in order to keep food, liquid and other unwanted things out of the lungs, and also to permit the torso to be pressurized (by holding in air) to provide a more rigid framework for heavy lifting and pushing. Part of the airway-sealing system in the larynx is a pair of muscular flaps, the vocal folds (also called "vocal cords"), which can be brought together to form a seal, or moved apart to permit free motion of air in and out of the lungs. Here are the vocal cords seen when they are open to allow free passage of air. The front of the body is toward the top of the photo; we're looking down into the dark trachea.

Now for a little aerodynamics. When any elastic seal is not quite strong enough to resist the pressurized air it restricts, the result is an erratic release of the pressure through the seal, creating a sound.

Some homely examples of a similar sound source are the raspberry, where the leaky seal is provided by the lips; the burp, where the opening of the esophagus provides the leaky seal; and the fart sounds you can make with your hands under your armpits.

The mechanism of this sound production is very simple and general:

the air pressure forces an opening, through which air begins to flow;
the flow of air generates a so-called Bernoulli force at right angles to the flow (which in other circumstances helps airplanes to fly);
this force combines with the elasticity of the tissue to close the opening again;
and then the cycle repeats, as air pressure again forces an opening.

In many such sounds, the pattern of opening and closing is irregular, producing a belch-like sound without a clear pitch -- think of the air being released from a balloon.

However, if the circumstances are right, a regular oscillation can be set up, giving a periodic sound that we perceive as having a pitch. Many animals have developed their larynxes so as to be able to produce particularly loud sounds, often with a clear pitch that they are able to vary for expressive purposes.

When the vocal cords are vibrating regularly in this manner, we say that the sound is voiced. Without the vibration, the sound is voiceless (or equivalently, unvoiced). This is exactly the property that distinguishes many sounds in English and other languages. A few examples:

Voiceless   Voiced
s           z
f           v
p           b

If you hold your hand to your throat, you will feel vibration for sounds like [z] but not for [s]. You will also feel it for nasals like [m, n] and for vowels like [a]; they are all voiced. (There is another difference between sounds like [p] and sounds like [b]. The former are accompanied by the puff of breath called aspiration, while the latter are not.)

The hiss of turbulent flow

Another source of sound in the vocal tract -- for humans and for other animals -- is the hiss generated when a volume of air is forced through a passage that is too small to permit it to flow smoothly. The result is turbulence, a complex pattern of swirls and eddies at a wide range of spatial and temporal scales. We hear this turbulent flow as some sort of hiss.

In the vocal tract, this turbulent flow can be created at many points of constriction. For instance, the upper teeth can be pressed against the lower lip -- if air is forced past this constriction, it makes the sound [f], the one associated with the letter "f."

When this kind of turbulent flow is used in speech, phoneticians call it frication, and sounds that involve frication are called fricatives. Some English examples are the sounds written "f, v, s, z, sh, th."

The pop of closure and release

When a constriction somewhere in the vocal tract is complete, so that air can't get past it as the speaker continues to breathe out, pressure is built up behind the constriction. If the constriction is abruptly released, the sudden release of pressure creates a sort of a pop. When this kind of closure and release is used as a speech sound, phoneticians call it a stop (focusing on the closure) or a plosive (focusing on the release).

As with frication, a plosive constriction can be made anywhere along the vocal tract, from the lips to the larynx. Three common examples:

It is difficult to make a firm enough seal in the pharyngeal region to produce a stop, although a narrow fricative constriction in the pharynx is possible.

The phonetic alphabet

The human vocal apparatus can produce a great variety of sounds. As we look at words in other languages -- and study the sounds of English in more detail -- we need a way to write these sounds down. That's what phonetic alphabets are for.

Historical background

In the mid-19th century, Melville Bell invented a writing system that he called "Visible Speech." Bell was a teacher of the deaf, and he intended his writing system to be a teaching and learning tool for helping deaf students learn spoken language. However, Visible Speech was more than a pedagogical tool for deaf education -- it was the first system for notating the sounds of speech independent of the choice of particular language or dialect. This was an extremely important step -- without this step, it is nearly impossible to study the sound systems of human languages in any sort of general way.

In the 1860s, Melville Bell's three sons -- Melville, Edward and Alexander -- went on a lecture tour of Scotland, demonstrating the Visible Speech system to appreciative audiences. In their show, one of the brothers would leave the auditorium, while the others brought volunteers from the audience to perform interesting bits of speech -- words or phrases in a foreign language, or in some non-standard dialect of English. These performances would be notated in Visible Speech on a blackboard on stage.

When the absent brother returned, he would imitate the sounds produced by the volunteers from the audience, solely by reading the Visible Speech notations on the blackboard. In those days before the phonograph, radio or television, this was interesting enough that the Scots were apparently happy to pay money to see it!

There are some interesting connections between the "visible speech" alphabet and the later career of one of the three performers, Alexander Graham Bell, who began following in his father's footsteps as a teacher of the deaf, but then went on to invent the telephone. Look especially at the discussion of Bell's "Ear Phonautograph" and artificial vocal tract.

After Melville Bell's invention, notations like Visible Speech were widely used in teaching students (from the provinces or from foreign countries) how to speak with a standard accent. This was one of the key goals of early phoneticians like Henry Sweet (said to have been the model for Henry Higgins, who teaches Eliza Doolittle to speak "properly" in Shaw's Pygmalion and its musical adaptation My Fair Lady).


The International Phonetic Association (IPA) was founded in 1886 in Paris, and has ever since been the official keeper of the International Phonetic Alphabet (also IPA), the modern equivalent of Bell's Visible Speech. Although the IPA's emphasis has shifted in a more descriptive direction, there remains a lively tradition in Great Britain of teaching "received pronunciation" using explicit training in the IPA.

While other phonetic alphabetic notations are in use, the IPA alphabet is the most widely used by linguists.


Many of these symbols have their familiar value, but don't confuse spelling with pronunciation. When we write a phonetic transcription, i.e. how a sound or word is pronounced, we'll enclose it in [square brackets] so we know to interpret the symbols in the phonetic alphabet. Notice that the chart (like the main IPA chart) is organized along two main dimensions. Only terms needed for English are listed here.

In addition, the obstruent sounds (stops, affricates, fricatives) come in voiced and voiceless varieties. The sonorant sounds (nasals, liquids, glides) are normally voiced.

The glottal stop, which is written as [ʔ], has a limited role in English. It is the catch in the throat between the two vowels in uh-oh.

The patterning of sounds in languages generally depends on the "natural classes" of sounds defined by these articulatory labels. For example, in English, the plural suffix spelled "(e)s" is realized in three different ways, depending on the preceding sound.

voiceless fricative [s] following another voiceless sound

final sounds: p, t, k, f, θ
examples: caps, hats, rocks, reefs, births

voiced fricative [z] following another voiced sound (including vowels)

final sounds: b, d, g, v, ð, m, n, ŋ, l, r, w, y
examples: tabs, rods, dogs, caves, lathes, drums, pins, songs, pills, cars, cows, eyes

voiced, but with a vowel inserted before it when it follows a "sibilant", i.e. an alveolar or palatal fricative or affricate.

final sounds: s, z, č, ǰ, š, ž
examples: kisses, gazes, churches, judges, wishes, rouges

So the rule determining how you pronounce the plural suffix makes reference to the classes voiced, voiceless and sibilant, not to specific sounds like [b], [p] and [s].
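To make the class-based character of this rule concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names plural_suffix, VOICELESS and SIBILANT are invented for the sketch, the input is the final sound of the stem in the notation used above (not its spelling), and [əz] is simply one way of writing "the voiced suffix with a vowel inserted before it" (the lecture does not say which vowel).

```python
# Hypothetical sketch: choose the form of the English plural suffix from the
# class of the stem-final sound. Any sound not listed as voiceless (including
# every vowel) is treated as voiced.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "č", "š"}
SIBILANT = {"s", "z", "č", "ǰ", "š", "ž"}


def plural_suffix(final_sound: str) -> str:
    """Return the suffix form for a stem ending in final_sound."""
    if final_sound in SIBILANT:
        return "əz"  # assumed notation for "vowel inserted before [z]"
    if final_sound in VOICELESS:
        return "s"
    return "z"


# Stem-final sounds of a few of the example words above.
for word, final in [("cap", "p"), ("dog", "g"), ("church", "č"), ("eye", "y")]:
    print(word, "->", plural_suffix(final))
```

Note that the sibilant test has to come first: [s], [č] and [š] are voiceless as well, so checking voicelessness first would wrongly give them plain [s].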
Similarly, the past-tense suffix spelled "ed" is realized in three different ways, again depending on the preceding sound.

voiceless stop [t] following another voiceless sound

final sounds: p, k, f, θ, s, č, š
examples: hopped, kicked, riffed, frothed, kissed, reached, wished

voiced stop [d] following another voiced sound (including vowels)

final sounds: b, g, v, ð, z, ǰ, ž, m, n, ŋ, l, r, w, y
examples: robbed, rigged, raved, bathed, razed, raged, rouged, hummed, sinned, longed, filled, marred, plowed, eyed

voiced, but with a vowel inserted before it when it follows an alveolar stop (t or d).

final sounds: t, d
examples: hated, rented, belted, loaded, grounded, welded

For both suffixes, the inserted vowel serves to separate similar sounds (i.e. it occurs when the stem ends in a consonant similar to the suffixal consonant).
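The parallel between the two suffixes can be sketched as a single procedure that differs only in which class counts as "too similar" to the suffix. The Python below is a hedged illustration, not an analysis given in the lecture: treating the voiced forms [z] and [d] as basic, writing the inserted vowel as [ə], and all of the names (realize_suffix, DEVOICED, and so on) are assumptions of the sketch.

```python
# Hypothetical sketch: one procedure for both the plural and the past tense.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "č", "š"}
SIBILANTS = {"s", "z", "č", "ǰ", "š", "ž"}   # too similar to the plural's [z]
ALVEOLAR_STOPS = {"t", "d"}                   # too similar to the past tense's [d]
DEVOICED = {"z": "s", "d": "t"}               # voiceless counterpart of each suffix


def realize_suffix(final_sound: str, base: str, too_similar: set) -> str:
    """Return the surface form of suffix `base` after a stem ending in final_sound."""
    if final_sound in too_similar:
        return "ə" + base          # insert a vowel to separate similar sounds
    if final_sound in VOICELESS:
        return DEVOICED[base]      # agree in voicing with the preceding sound
    return base


def plural(final_sound: str) -> str:
    return realize_suffix(final_sound, "z", SIBILANTS)


def past_tense(final_sound: str) -> str:
    return realize_suffix(final_sound, "d", ALVEOLAR_STOPS)


for word, final in [("hop", "p"), ("rob", "b"), ("hate", "t"), ("kiss", "s")]:
    print(word, plural(final), past_tense(final))
```

The same stem can trigger vowel insertion for one suffix and not the other: a stem ending in [t] takes plain [s] in the plural but the vowel-inserted form in the past tense, while a stem ending in [s] does the reverse.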
As Pinker discusses, these generalizations can extend to new sounds borrowed from other languages. These German words, which end in voiceless fricatives not found in English (velar and palatal), follow the patterns just discussed when the final consonant is pronounced in the German way.

He out-Bachs Bach (with voiceless [s])

She out-Bached Bach (with voiceless [t])

The extension of patterns in this way confirms that what speakers understand about these processes is not the arbitrary list of sounds that cause a pattern to arise, but rather the class of sounds -- which could contain members not yet heard in the language.
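One way to picture the difference between a list of sounds and a class of sounds is to state the rules over properties rather than over symbols. The following Python sketch is an illustration only: the Sound type, its three yes/no properties, the function names, and the use of [ə] for the inserted vowel are all assumptions made for the example. The point is that the German velar fricative [x] of "Bach" is handled correctly even though the rules never mention it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sound:
    """A hypothetical description of a sound by the classes the rules refer to."""
    symbol: str
    voiced: bool
    sibilant: bool = False
    alveolar_stop: bool = False


def plural(final: Sound) -> str:
    if final.sibilant:
        return "əz"
    return "z" if final.voiced else "s"


def past_tense(final: Sound) -> str:
    if final.alveolar_stop:
        return "əd"
    return "d" if final.voiced else "t"


# A sound borrowed from German: voiceless, not a sibilant, not an alveolar stop.
x = Sound("x", voiced=False)
print("out-Bachs:", plural(x))       # voiceless form of the plural/3sg suffix
print("out-Bached:", past_tense(x))  # voiceless form of the past-tense suffix
```

Because the functions look only at class membership, no change is needed to accommodate a new sound; describing its properties is enough.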