Acoustic Vowels-Reference

2026-02-28 - Volker Schubert

For each vowel (IPA/APhA), a set of audio samples and corresponding vocal tract spectra is provided.
Vowel realizations are ordered by the first two formants and further differentiated by spectral profiles.

The table below is an F1–F2 grid with cells subdivided into 4 quadrants for spectral profiles.
Click the table cell quadrants to see and hear the vowel realizations!
Hide this header by clicking on it or by clicking the button in the menu bar!

In the table below, the rows represent F1 center frequencies and the columns represent F2 center frequencies. Each F1-F2 table cell is split into quarters, each representing a particular pattern of the higher formants (F3-F6).
Click the table cells to see and hear the vowels! Headphones are highly recommended.

On small displays, vowel selection is split into two independent selections:
- the pattern of the higher formants, via the radio buttons (cell quarters), and
- the F1-F2 pair, via the table cell.

Hide this header by clicking it (or the visibility toggle button in the menu bar)!

Guide

Which vowels are considered is determined by the choice of vowel set or alphabet. The alphabets IPA and APhA are available. The alphabet also defines the symbol used to represent each vowel.

For each vowel, audio examples of different vowel realizations are provided. The realizations differ only in technical criteria, not in speaker or speaking context. In fact, the audio examples are synthetically generated, so there is only a single (synthetic) speaker.

To systematize the vowels, they are not simply listed but arranged according to ordering criteria. More precisely, it is the vowel realizations of the synthetic speaker that are arranged. The main criterion is the peak frequencies of the first two formants, F1 and F2, which is why the reference takes the form of a table.
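
As a rough illustration of this ordering, the sketch below snaps a pair of F1 and F2 peak frequencies to the nearest row and column centers of a grid. The center frequencies and the names `nearest_index` and `grid_cell` are hypothetical and chosen only for the example; the actual grid spacing of the reference is not stated here.

```python
# Hypothetical center frequencies in Hz; the actual grid of the reference may differ.
F1_CENTERS = [300, 400, 500, 600, 700]        # table rows
F2_CENTERS = [800, 1200, 1600, 2000, 2400]    # table columns


def nearest_index(value, centers):
    """Return the index of the center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - value))


def grid_cell(f1_hz, f2_hz):
    """Map a pair of formant peak frequencies to a (row, column) table cell."""
    return nearest_index(f1_hz, F1_CENTERS), nearest_index(f2_hz, F2_CENTERS)


row, col = grid_cell(310, 2250)
print(F1_CENTERS[row], F2_CENTERS[col])  # prints "300 2400"
```

Snapping to the nearest center is only one plausible way to assign a realization to a cell; it is used here because it keeps the example short.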

Another criterion is the spectral shape of the vowel; the many possible shapes are classified into a few spectral profiles. In the table, each F1-F2 cell is divided into 4 quadrants, each representing a particular spectral profile of that cell. The possible spectral profiles are not global but depend on the surrounding F1-F2 region. They are shown in the row above the table, depending on the cursor position.

Clicking a quadrant selects the corresponding realization of the displayed vowel: the audio sample plays, and the spectral power distribution of the vocal tract model is displayed.
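
The selection can be pictured as a lookup keyed by F1 center, F2 center, and quadrant. The following is a minimal sketch under assumed names: the `Realization` record, its fields, the file names, the profile labels, and the frequency values are all hypothetical and do not describe the actual data model of the reference.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Realization:
    vowel: str        # symbol in the chosen alphabet
    profile: str      # spectral profile label (hypothetical)
    audio_file: str   # name of the audio sample (hypothetical)


# Key: (F1 center in Hz, F2 center in Hz, quadrant index 0..3).
Key = Tuple[int, int, int]

# Hypothetical entries; not every quadrant of every cell needs to be filled.
TABLE: Dict[Key, Realization] = {
    (300, 2400, 0): Realization("i", "profile-a", "i_300_2400_a.wav"),
    (300, 2400, 1): Realization("i", "profile-b", "i_300_2400_b.wav"),
    (700, 1200, 0): Realization("a", "profile-a", "a_700_1200_a.wav"),
}


def select(f1_center: int, f2_center: int, quadrant: int) -> Optional[Realization]:
    """Return the realization for a clicked quadrant, or None if it is empty."""
    if quadrant not in range(4):
        raise ValueError("quadrant must be 0, 1, 2, or 3")
    return TABLE.get((f1_center, f2_center, quadrant))
```

Because the profile label is stored per entry rather than per quadrant index, the same quadrant position can stand for different spectral profiles in different F1-F2 regions, as described above.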