Introduction to Cantonese


Cantonese, one of the most widely spoken languages in China, uses the same Han characters as the other Chinese varieties. Like the other spoken dialects in China, it is monosyllabic: one character, one sound.

Let's look at one sound, or one syllable: '-aa1'. In this notation, the leading '-' indicates that there is no consonant, 'aa' is the symbol for the vowel, and '1' is the tone level, which can be described as the highest-pitched tone.
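The notation can be taken apart mechanically. The following is a minimal sketch, not a full romanization parser: the function name `parse_syllable` is hypothetical, and it assumes that the tone is a single trailing digit and that the letters a, e, i, o, u and y are the vowel letters.

```python
VOWEL_LETTERS = set("aeiouy")  # assumption: 'y' counts as a vowel letter, as in 'oey'

def parse_syllable(text):
    """Split a syllable such as '-aa1' into (consonant, rest, tone)."""
    body, tone = text[:-1], int(text[-1])  # the last character is the tone digit
    if body.startswith("-"):
        # A leading '-' marks a syllable with no consonant.
        return "", body[1:], tone
    # Otherwise the consonant is every letter before the first vowel letter.
    i = 0
    while i < len(body) and body[i] not in VOWEL_LETTERS:
        i += 1
    if i == len(body):
        # No vowel letter at all (for example a bare 'm' or 'ng'): keep it whole.
        return "", body, tone
    return body[:i], body[i:], tone

print(parse_syllable("-aa1"))
print(parse_syllable("ji1"))
```

Here '-aa1' splits into an empty consonant, the vowel 'aa' and tone 1, while 'ji1' splits into 'j', 'i' and tone 1.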

Let's take a look at the waveform of this sound:

This syllable spans about 500 milliseconds, or half a second.

Let's zoom in to examine the waveform at a much finer scale:

It looks like this syllable has a mix of various amplitudes and frequencies.

Next, we look at its audio spectrum to see the magnitude of each frequency it contains:
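For readers who want to reproduce this kind of spectrum, here is a rough sketch using a plain discrete Fourier transform from the Python standard library. It does not use the actual recording: the signal is a synthetic stand-in, and the sample rate, the 50 ms frame length and the three component frequencies are all invented for illustration.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
FRAME_MS = 50       # analyse a 50 ms slice of the syllable (assumed)

def synth_frame():
    """Stand-in signal: a 200 Hz tone plus two weaker components (invented values)."""
    n = SAMPLE_RATE * FRAME_MS // 1000
    partials = [(200, 1.0), (400, 0.5), (800, 0.25)]  # (frequency in Hz, amplitude)
    return [
        sum(a * math.sin(2 * math.pi * f * t / SAMPLE_RATE) for f, a in partials)
        for t in range(n)
    ]

def magnitude_spectrum(samples):
    """Return (frequency_hz, magnitude) pairs for bins from 0 Hz up to half the sample rate."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        # Scaling by 2/n turns the bin magnitude into the sine amplitude
        # (valid for bins other than 0 Hz and the top bin).
        spectrum.append((k * SAMPLE_RATE / n, abs(coeff) * 2 / n))
    return spectrum

spectrum = magnitude_spectrum(synth_frame())
peaks = sorted(spectrum, key=lambda fm: fm[1], reverse=True)[:3]
for freq, mag in sorted(peaks):
    print(f"{freq:6.0f} Hz  magnitude {mag:.2f}")
```

The three strongest bins fall at the three frequencies that were mixed into the signal. A real syllable would show a richer pattern, and a library FFT would be much faster than this direct sum.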

We do the same for the syllable 'ji1', or just the vowel 'i':

The Principal Vowels

'aa' and 'i' are 2 of the 7 principal vowels. The principal vowels are much like the base colours, which can be used to create all other colours:

In the realm of colour, the prime colours are R (red), G (green) and B (blue), and they are mixed to produce many different colours.

In Cantonese, written with the alphabet, the prime, or principal, vowels are .

Represented with Chinese characters, these vowels are:

These 7 principal vowels are much like the 7 colours .


The Diphthongs

By mixing the principal vowels, much as we mix the prime colours, we get 10 more vowels, called diphthongs or compound vowels. They are:

aai, aau, ai, au, iu, ui, ou, oi, ei, oey. The diphthong oey closely resembles oei: i and y are very much the same, except that y is closer to u. The vowel y is like iu, except that we write jiu to distinguish iu from y (or yu). The vowel oe is pronounced with the mouth rounded, which makes it closer to yu than to i, so oey is more appropriate than oei. This is not much of a problem, though, and we can treat the two as one (oey = oei).
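The list of diphthongs can be written down as data. The sketch below is only an illustration: the helper names `normalize` and `split_diphthong` are hypothetical, and cutting each diphthong at its last letter is a mechanical assumption about the spelling, not a claim about how the sounds are produced.

```python
# The 10 diphthongs listed above.
DIPHTHONGS = ["aai", "aau", "ai", "au", "iu", "ui", "ou", "oi", "ei", "oey"]

def normalize(vowel):
    """Treat the alternative spelling 'oei' as 'oey' (oey = oei)."""
    return "oey" if vowel == "oei" else vowel

def split_diphthong(vowel):
    """Split a diphthong into (first part, closing vowel) at its last letter."""
    vowel = normalize(vowel)
    if vowel not in DIPHTHONGS:
        raise ValueError(f"not one of the 10 diphthongs: {vowel!r}")
    return vowel[:-1], vowel[-1]

for d in DIPHTHONGS:
    print(d, "=", " + ".join(split_diphthong(d)))
```

With this split, every diphthong closes on i, u or y, and 'oei' is handled exactly like 'oey'.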

In Chinese characters, we have:

 ╉ 羕 稼 竬 肮 緿 玸 召 店


To see the relationship, we have this table:

The Endings

The principal vowels can mix with each other to form diphthongs. They can also be closed with one of the following ending sounds, written with these letters: m, n, ng, p, k, t.

Ending with m is like . (-am)

Ending with n is like . (gan)

Ending with ng is like . (ngang)

Ending with p is like . (-aap)

Ending with k is like . (baak)

Ending with t is like . (caat)

Note: m and ng can also form a syllable on their own, with just the ending. In that role these two are special vowels and do not mix with any other vowels or consonants.

The letters m, n, ng, p, k and t are also used to represent consonants. Furthermore, endings with p, k and t are called the “entering sound” in Chinese literature. They finish the syllable sharply and rapidly.
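The six endings and the entering-sound rule can be checked with a few lines of Python. This is a minimal sketch under the assumption that the syllable is given without its tone digit, as in the examples above; the function names `ending_of` and `is_entering` are hypothetical.

```python
ENDINGS = ("m", "n", "ng", "p", "k", "t")  # the six endings listed above
ENTERING = {"p", "k", "t"}                 # endings of the "entering sound"

def ending_of(syllable):
    """Return the ending of a syllable such as 'baak', or '' if it has none."""
    for ending in ENDINGS:
        if syllable.endswith(ending):
            return ending
    return ""

def is_entering(syllable):
    """True when the syllable ends with p, k or t."""
    return ending_of(syllable) in ENTERING

for s in ["-am", "gan", "ngang", "-aap", "baak", "caat"]:
    print(s, "ends with", ending_of(s), "| entering sound:", is_entering(s))
```

Of the six example syllables, only -aap, baak and caat are reported as entering sounds.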

The “entering sound”, a syllable ending with p, k or t, has only 3 tone levels, called 7, 8 and 9, which are much like levels 1, 3 and 6 of the normal 6 tone levels. Levels 1, 2 and 3 are the high notes, and levels 4, 5 and 6 are the low notes. In this regard, 1, 2 and 3 are high 1, high 2 and high 3, and 4, 5 and 6 are low 1, low 2 and low 3. However, using 1, 2, 3 for both high and low can be confusing. The tones are better called flat, rising and going: level 1 is high flat, level 2 is high rising, level 3 is high going, level 4 is low flat, level 5 is low rising and level 6 is low going. More discussion will follow in the TONE session.
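The tone numbering can be summarised as a small lookup table. In this sketch the names `TONE_NAMES`, `ENTERING_TO_NORMAL` and `describe_tone` are hypothetical, and pairing 7, 8 and 9 with 1, 3 and 6 in that order is an assumption read from the wording above.

```python
# Names of the 6 normal tone levels, as given in the text.
TONE_NAMES = {
    1: "high flat", 2: "high rising", 3: "high going",
    4: "low flat", 5: "low rising", 6: "low going",
}

# Entering-sound levels and the normal level each one resembles
# (assumed to pair up in the order listed: 7-1, 8-3, 9-6).
ENTERING_TO_NORMAL = {7: 1, 8: 3, 9: 6}

def describe_tone(level):
    """Describe a tone level from 1 to 9 in words."""
    if level in ENTERING_TO_NORMAL:
        base = ENTERING_TO_NORMAL[level]
        return f"entering, like level {base} ({TONE_NAMES[base]})"
    return TONE_NAMES[level]

for level in range(1, 10):
    print(level, describe_tone(level))
```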


  1. Listen and name the vowel

  2. Given the vowel, speak or pronounce the syllable

  3. Identify the principal vowels in alphabets and with their Chinese characters

  4. Identify the diphthong vowels in alphabets and with their Chinese characters

  5. Identify the vowels with special endings in alphabets.