Phonetic transcription of English (US)

The V2T systems work with a list of words (a lexicon) in which each word is given its pronunciation. For example, the word five is pronounced [fajv]. For some words it is convenient to list several pronunciations, such as Muslim [mazlEm, muslEm, muzlEm] or either [ajDEr, íDEr].

Words are written in the lexicon using letters, digits, or symbols. Their pronunciations are transcribed using phonemes, and each phoneme is always represented by a single character. For example, the word M1, which is composed of the letter M and the numeral 1, is pronounced [emwan], a transcription made up of five phonemes.
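The lexicon idea can be sketched as a small Python mapping. This is only an illustration: the names LEXICON, pronunciations and phoneme_count are hypothetical and not part of the V2T system, and the entries are the examples quoted above. Because each phoneme is a single character, the number of phonemes equals the length of the pronunciation string (after Unicode normalization, so that accented symbols such as í count as one character).

```python
import unicodedata

# Hypothetical in-memory lexicon: each word maps to one or more pronunciations.
LEXICON = {
    "five": ["fajv"],
    "Muslim": ["mazlEm", "muslEm", "muzlEm"],
    "either": ["ajDEr", "íDEr"],
    "M1": ["emwan"],
}


def pronunciations(word: str) -> list[str]:
    """Return all pronunciations stored for a word (empty list if unknown)."""
    return LEXICON.get(word, [])


def phoneme_count(pronunciation: str) -> int:
    """One character per phoneme, so the count is the normalized string length."""
    return len(unicodedata.normalize("NFC", pronunciation))


print(pronunciations("either"))
print(phoneme_count("emwan"))
```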

The purpose of phonetic transcription is to convert a word entry into the corresponding pronunciation, expressed in phonemes. Because a word in the lexicon may contain digits and other special symbols, phonetic transcription is defined only over a limited set of graphemes, namely the letters of the given alphabet.

It is easy to convert a text entry into graphemes: write the word the way it is pronounced, using only the characters of the given alphabet. The graphemic transcription of a common word is often identical to its normal spelling. Foreign words, abbreviations, and words containing numerals or symbols, by contrast, need attention.

The phonetic transcription of a word or phrase takes place in two steps:

  Step 1: The word is transcribed using graphemes (alphabet characters) in the way it is pronounced.

  Step 2: The graphemic transcription is converted into phonemes, either by applying the phonetic transcription rules (described briefly at the end of the manual) or by using an appropriate automatic tool (the G2P converter).
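The two steps can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the system's implementation: GRAPHEMIC_OVERRIDES, to_graphemes, transcribe and toy_g2p are hypothetical names, the hand-written spellings are taken from the examples in this manual, and the G2P converter is passed in as a plain function because its interface is not described here.

```python
from typing import Callable

# Hypothetical hand-written spellings for entries that are not plain words.
# The values are taken from the examples in this manual.
GRAPHEMIC_OVERRIDES = {
    "3D": "three d",
    "25th": "twenty fifth",
    "7x": "seven times",
    "B52": "b fifty two",
    "Click4Sky": "click for sky",
}


def to_graphemes(entry: str) -> str:
    """Step 1: spell the entry with alphabet letters, as it is pronounced."""
    return GRAPHEMIC_OVERRIDES.get(entry, entry.lower())


def transcribe(entry: str, g2p: Callable[[str], str]) -> str:
    """Step 2: hand the graphemic form to a G2P converter."""
    return g2p(to_graphemes(entry))


def toy_g2p(graphemes: str) -> str:
    """Stand-in for a real G2P converter: a tiny lookup, not a rule engine."""
    known = {"three d": "Trídí", "seven times": "sevntajmz"}
    return known[graphemes]


print(transcribe("3D", toy_g2p))
```

Keeping step 1 as a separate function makes it easy to review the graphemic spelling by hand before any phonemes are generated.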

Examples of graphemic and phonetic transcription of native and foreign words, acronyms, abbreviations, numbers, etc.

| Word type | Example | Graphemic transcription | Phonetic transcription |
|---|---|---|---|
| Native word | meeting | --- | mítiN |
| | about | --- | Ebaut |
| | actor | --- | AktEr |
| Foreign word | mascarpone | --- | mAskErpounej |
| | Grenada | --- | grEnejdE |
| | Macron | --- | mAkron |
| Acronym | UNESCO | --- | juneskou |
| | ZIP | --- | zip |
| | Click4Sky | click for sky | klikfErskaj |
| Abbreviation | USA | --- | júesej |
| | ABC | --- | ejbísí |
| | 3D | three d | Trídí |
| Multi-word expression | The Beatles | --- | DEbítls |
| | World War II | --- | wErldwórtú |
| Numeral expressions | 25th | twenty fifth | twentififT |
| | 7x | seven times | sevntajmz |
| | B52 | b fifty two | bífiftitú |

The list of phonemes and corresponding alphabet characters along with explanatory notes is given below, followed by rules of phonetic transcription from graphemes to phonemes (G2P).

| Phoneme | Grapheme(s) | Example - text | Example - phonetic transcription | IPA equivalent(s) |
|---|---|---|---|---|
| a | a, u, o, ... | run | r a n | ʌ |
| á | a, au, o, ua, ... | father | f á D r | ɑ/ɑː |
| A | a | cat | k A t | æ |
| b | b, bb | blue | b l ú | b |
| č | ch, tch, tu, ti, ... | chat | č A t | tʃ |
| d | d, dd, ed | day | d e j | d |
| D | th | then | D e n | ð |
| e | e, ea, ie, ai, a, ... | bed | b e d | ɛ |
| f | f, ff, ph, gh, ... | film | f i l m | f |
| g | g, gg, gh, ... | globe | g l o u b | g |
| h | h, wh | house | h a u s | h |
| Č | j, ge, g, ... | joke | Č o u k | dʒ |
| i | i, e, o, u, y, ... | sit | s i t | i/ɪ |
| í | e, ee, ea, ey, ... | see | s í | iː |
| j | y, ... | yes | j e s | j |
| k | k, c, ch, ck, ... | cake | k e j k | k |
| l | l, ll | line | l a j n | l |
| m | m, mm, mb, ... | money | m a n i | m |
| n | n, nn, kn, gn, ... | nail | n e j l | n |
| N | ng, n, nk | sing | s i N | ŋ |
| o | o, uo | boy | b o j | ɔ/o |
| ó | a, au, ou, aw, ... | four | f ó r | ɔː/ɒː |
| p | p, pp | place | p l e j s | p |
| r | r, rr, wr, rh | rose | r o u z | r |
| s | s, ss, c, sc, ... | saw | s ó | s |
| š | sh, ce, s, ci, ... | shop | š á p | ʃ |
| t | t, tt, th, ed | time | t a j m | t |
| T | th | thick | T i k | θ |
| u | u, oo, w, ou, ... | put | p u t | ʊ |
| ú | o, oo, ue, oe, ... | lose | l ú z | uː |
| v | v, ve, f, ph | very | v e r i | v |
| w | w, wh, u, o | wine | w a j n | w |
| z | z, zz, s, x, ... | zone | z o u n | z |
| ž | s, si, z, ... | pleasure | p l e ž E r | ʒ |
| E | a, e, er, i, ar, or, ... | the | D E | ə/ɜ |
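Since every phoneme is a single character from the list above, a pronunciation can be checked mechanically before it is added to a lexicon. The following is a minimal sketch; PHONEMES and invalid_symbols are hypothetical names, and the noise and silence models mentioned elsewhere in this manual are deliberately left out.

```python
import unicodedata

# The 35 phoneme symbols from the list above, one character each.
# NFC normalization keeps accented symbols such as "á" as single characters.
PHONEMES = frozenset(
    unicodedata.normalize("NFC", "aáAbčdDefghČiíjklmnNoóprsštTuúvwzžE")
)


def invalid_symbols(pronunciation: str) -> list[str]:
    """Return the characters of a pronunciation that are not phoneme symbols."""
    normalized = unicodedata.normalize("NFC", pronunciation)
    return [ch for ch in normalized if ch not in PHONEMES]


print(invalid_symbols("fajv"))  # every symbol is a known phoneme
print(invalid_symbols("fayv"))  # "y" is a grapheme, not a phoneme symbol
```

The check is case-sensitive on purpose, because upper-case and lower-case symbols (for example d and D, or t and T) denote different phonemes.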

In addition to the phonemes described above, the system also works with other acoustic models that represent a variety of noises and silence. Some of them can be used to create word pronunciations; see the [list](noises.html).

A set of allowed graphemes that can be converted into phonemes:

English alphabet letters (both upper-case and lower-case):

a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z

Other characters (foreign alphabet characters or digits) must be converted to one of the graphemes above before phonetic transcription. For example, the German ü in Zürich is transcribed as its closest pronunciation neighbour, the grapheme u.
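A rough first pass at this conversion can be automated. The sketch below is an assumption-laden illustration: strip_diacritics and disallowed_characters are hypothetical helpers, and removing diacritics only approximates the "closest pronunciation neighbour" (letters without a Unicode decomposition, and all digits, still need a manual spelling). Spaces are tolerated here because the graphemic transcriptions in this manual contain them.

```python
import string
import unicodedata

# Allowed graphemes: upper-case and lower-case English letters.
ALLOWED = frozenset(string.ascii_letters)


def strip_diacritics(text: str) -> str:
    """Decompose accented letters and drop the combining marks (ü -> u)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def disallowed_characters(text: str) -> list[str]:
    """List characters that still need a manual graphemic spelling."""
    return [ch for ch in text if ch not in ALLOWED and not ch.isspace()]


print(strip_diacritics("Zürich"))
print(disallowed_characters("B52"))
```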

Phonetic transcription rules (grapheme to phoneme = G2P rules)

