Language on the Orient Express: A Guide to Mandarin for English Teachers

Foreigners living in 1920s Shanghai apparently counselled new arrivals that “those who learn Chinese go mad” (Kane, 2006, p. 17). Certainly there has been a long tradition of outsiders looking on the Chinese language as a confusing mass of chicken-scratch writing and syncopated bursts of rhyming syllables. In turn, Chinese people themselves were often portrayed by Westerners as aloof, inscrutable and incomprehensible to outsiders. The reasons for surmounting such racist representations are pressing for both English-speaking students of Chinese and teachers of English to Chinese, particularly in light of the crucial role China will play on the global stage in the twenty-first century. With this in mind, this article provides a general overview of Mandarin Chinese geared towards teachers and students who have had limited exposure to the language. This overview is, by necessity, cursory and in several areas I have had to simplify what in other contexts would be questions of great complexity, but I hope this article can at least offer an entry point to those new to the language.

Mandarin Chinese, called putonghua (common speech) in mainland China and guoyu (national language) in Taiwan—although differences exist between the two national varieties—is the most widely spoken language in China today. Although many consider Mandarin and Cantonese to be dialects of the same language, this is analogous to calling French and Italian dialects of “European.” Most linguists argue that we should describe the linguistic situation in China as consisting of a host of spoken Chinese languages, including at the very least Mandarin, Cantonese, Wu, Hakka, Fujianese and a few others (Ramsey, 2002). At the same time though, all more-or-less share a common written script.

The Script

The modern Chinese script has its origins in a logographic system of writing that was incised on turtle shells and animal bones during the Bronze Age as a means of divination (DeFrancis, 1984). The earliest oracle bone writing consisted of pictograms representing objects and animals, but these eventually came to represent words and concepts. Some modern Chinese characters derive from these early forms, such as ma (马) “horse”, which is a stylized drawing of the animal, and ri (日) “day”, representing an image of the sun. The majority of characters today, however, are not pictographic representations but can be more accurately described as picto-phonetic; that is, they typically feature both semantic and phonetic components (Yin and Rohsenow, 1994, p. 21). The character bo (波) “wave”, for instance, is composed of two parts (or radicals). On the left side of the character is a semantic component, three strokes representing “water”, which is also found in words like piao (漂) “float” and jiang (江) “river”. On the right side is a phonetic component, indicating that this word is pronounced like other characters that share its form, including bo (跛) “lame” and bo (菠) “spinach”. These similarities originally emerged as rebus associations among similar sounding words, but as certain pronunciations have changed over the centuries, phonetic radicals now often represent a group of related sounds; related to the characters above is the word po (坡) “slope”.
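The semantic-plus-phonetic structure described above can be sketched as a small lookup table. This is only an illustration in Python: the component glyphs (氵, 皮 and so on), the table COMPONENTS and the function group_by are assumptions introduced for the example, not a standard dictionary format.

```python
from collections import defaultdict

# Hypothetical mini-lexicon for illustration only:
# character -> (pinyin, gloss, semantic component, phonetic component)
COMPONENTS = {
    "波": ("bo", "wave", "氵", "皮"),
    "坡": ("po", "slope", "土", "皮"),
    "跛": ("bo", "lame", "足", "皮"),
    "江": ("jiang", "river", "氵", "工"),
    "漂": ("piao", "float", "氵", "票"),
}


def group_by(part: str) -> dict:
    """Group characters by their 'semantic' or 'phonetic' component."""
    index = {"semantic": 2, "phonetic": 3}[part]
    groups = defaultdict(list)
    for char, entry in COMPONENTS.items():
        groups[entry[index]].append(char)
    return dict(groups)


# Characters sharing the "water" component belong to one semantic field ...
print(group_by("semantic")["氵"])
# ... while characters sharing 皮 have similar (bo/po) pronunciations.
print(group_by("phonetic")["皮"])
```

Grouping on the semantic component collects the water-related characters, while grouping on the phonetic component collects the bo/po family, which is roughly how a reader can guess at an unfamiliar character.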

There are several different Chinese scripts currently in use. Hong Kong, Taiwan, Macao, and many overseas communities use a form of Chinese characters called fantizi (“complicated script”), which are largely unchanged from the imperial period, while in Mainland China a script reform movement in the late 1950s resulted in the adoption of jiantizi (“simplified script”). The other main scripts are the various systems of romanization, such as the older Wade-Giles system and the now standard pinyin, both designed for Mandarin. The lack of congruity between these ways of representing Chinese languages alphabetically often leads to several different renderings of the same word, such as the name of China’s capital city: it is Peking in earlier systems of romanization and Beijing in pinyin.


English teachers may wish to note that the phonemic system of Mandarin does not include many common English sounds, so these often present difficulties for Chinese students. Notable among these are the vowel in words like ship that for Chinese students is often articulated more like sheep, the v which is often realized as a w, and the voiced and unvoiced th in words like there and think respectively. None of these are part of the Mandarin phonemic system. Mandarin also has a relatively limited set of vowel sounds: five in comparison to English’s fourteen. There are also, however, a few semi-vowels (or glides) and many of the vowels combine into diphthongs and triphthongs.

Nevertheless, the Mandarin Chinese phonological system and its pinyin orthography are remarkably similar to English, with a few exceptions. First, whereas paired English consonants are typically differentiated on the basis of whether they are voiced or unvoiced (compare, for instance, the initial consonants in tin and din), in Chinese the key difference is whether they are aspirated or not (Wiedenhof, 2015, p. 34). The Chinese word da features an unaspirated initial consonant while ta features an aspirated one; both consonants are unvoiced, meaning that they do not involve vibration of the vocal cords.

Three consonant minimal pairs also present difficulty for native English speakers, and their orthographic representation is not entirely obvious to the uninitiated. The three contrasts as they appear in pinyin are:

  • ch q
  • zh j
  • sh x

The three sets of consonants contrast in terms of place of articulation. Ch-, zh-, and sh- sounds are all unvoiced retroflex consonants, produced by placing the tip of the tongue just behind the alveolar ridge (i.e., the ridge behind the upper teeth). Ch- and zh- contrast as aspirated and unaspirated affricates, while sh- is a fricative. The q-, j-, and x- sounds are all unvoiced palatal consonants, produced with the blade of the tongue touching the hard palate (i.e., the roof of the mouth). Q- and j- contrast as aspirated and unaspirated affricates, while x- is a fricative. To put it more simply, ch- and sh- are pronounced much as they would be in English. In contrast, x- and q- represent palatalized versions of the same sounds, pronounced slightly further back in the mouth. J- is pronounced much as it would be in English, while zh- is a retroflex version of the same sound. Many novice learners of Mandarin have trouble differentiating these contrasts as they are not perceived as salient in English phonology; thus the initial consonants in ju (居) “reside” and zhu (猪) “pig” may be perceived as allophones by a non-speaker but are meaningfully differentiated by the palatal-retroflex contrast in Mandarin. One other phoneme not shared with English is represented orthographically with the letter c, indicating an unvoiced aspirated alveolar affricate pronounced much like the ts in tsunami or cats.
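For readers who find a table easier than prose, the contrasts just described can be written out as a small feature table. The following Python sketch simply restates the classification given above; the names INITIALS, FEATURES and contrast are hypothetical, chosen for this illustration.

```python
# Feature table for the initials discussed above: (place, manner, aspirated).
# Aspiration is recorded as None for fricatives, where the contrast does not apply.
INITIALS = {
    "zh": ("retroflex", "affricate", False),
    "ch": ("retroflex", "affricate", True),
    "sh": ("retroflex", "fricative", None),
    "j": ("palatal", "affricate", False),
    "q": ("palatal", "affricate", True),
    "x": ("palatal", "fricative", None),
    "c": ("alveolar", "affricate", True),
}
FEATURES = ("place", "manner", "aspirated")


def contrast(a: str, b: str) -> dict:
    """Return the features on which two initials differ, as (a, b) pairs."""
    return {
        name: (fa, fb)
        for name, fa, fb in zip(FEATURES, INITIALS[a], INITIALS[b])
        if fa != fb
    }


print(contrast("j", "zh"))   # the ju / zhu pair differs only in place
print(contrast("ch", "zh"))  # this pair differs only in aspiration
```

Each of the three pinyin pairs listed earlier (ch/q, zh/j, sh/x) comes out as differing in place of articulation alone, which is why English speakers tend to hear them as the same sound.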

Mandarin has a relatively limited syllable system, generally consisting of either consonant + vowel (e.g., bu, ge, pai) or consonant + vowel + nasal consonant (e.g., lan, rong, jiang).1 It is therefore common for Mandarin speakers to add a schwa after a final consonant or to insert a vowel in long consonant clusters when speaking English. Because of the limited set of syllables Mandarin provides for constructing words, tonality allows the language to differentiate four forms of each syllable. The four tones, often described as high (1st tone), rising (2nd tone), falling-rising (3rd tone) and falling (4th tone), can be represented orthographically in pinyin by placing a diacritic over the main vowel in each syllable.2 Each diacritic is an iconic representation of the tone (i.e., 1st marked by a bar above the vowel, 2nd by a rising slash, 3rd by what looks like a v, and 4th by a falling slash). For example, the word lao (pronounced similarly to the first three letters in the word loud) can have the following four meanings according to tone:3

  • lāo (捞) “dredge”
  • láo (劳) “labour”
  • lǎo (老) “old”
  • lào (烙) “bake”; “iron”
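For teachers who type pinyin with tone numbers (lao3 rather than lǎo), the diacritics can be added mechanically. The sketch below is a minimal Python illustration, not a full pinyin library. It assumes the usual placement convention, which is not spelled out in this article: the mark goes on a or e if present, on the o of ou, and otherwise on the last vowel. The function name add_tone is hypothetical.

```python
import unicodedata

# Combining diacritics for the four tones: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiouü"


def add_tone(syllable: str, tone: int) -> str:
    """Return the pinyin syllable with the diacritic for tones 1-4.

    Any other tone value (e.g. 5 for the neutral tone) leaves it unmarked.
    """
    if tone not in TONE_MARKS:
        return syllable
    # Assumed placement convention: 'a' or 'e' first, then the 'o' of 'ou',
    # otherwise the last vowel in the syllable.
    if "a" in syllable:
        idx = syllable.index("a")
    elif "e" in syllable:
        idx = syllable.index("e")
    elif "ou" in syllable:
        idx = syllable.index("o")
    else:
        positions = [i for i, ch in enumerate(syllable) if ch in VOWELS]
        if not positions:
            return syllable
        idx = positions[-1]
    marked = syllable[: idx + 1] + TONE_MARKS[tone] + syllable[idx + 1 :]
    # NFC composes the base vowel and combining mark into a single character.
    return unicodedata.normalize("NFC", marked)


for tone in (1, 2, 3, 4):
    print(add_tone("lao", tone))
```

Run on lao with tones 1 to 4, this reproduces the four forms in the list above.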

Despite the ability of the tone system to expand the potential semantic identities of Mandarin syllables, each character still usually has at least a few homonyms, and many jokes rely as much on soundplay as they do on wordplay. Homophony also allows Chinese internet users to locate information on banned topics by searching for homonyms, essentially substituting other sound-alike words for sensitive names or topics. The word dāng (裆), for instance, means “crotch” or “seat of the pants,” but is a popular online search string because it sounds like dǎng (党) meaning “political party” or, more specifically, “the Communist Party of China.” Another example is the seemingly innocuous phrase “grass-mud-horse” (cǎonímǎ) that sounds like one of the crassest insults in Mandarin. When the dissident artist Ai Weiwei posted a picture of himself online dancing naked with a stuffed horse covering his crotch, it was widely understood to be an insult to the Chinese government.


A common assumption among non-Chinese speakers is that characters are monosyllables and thus can act as a kind of mystical shorthand for broad ideas or feelings. In practice, single characters rarely work this way. For instance, although ai (爱) can be used as a verb meaning “love,” as in wo ai ni (“I love you”), when referring to the concept of “love,” ai is combined with qing (情), meaning “emotion,” to form the compound word aiqing (爱情). Most Chinese “words” are, in fact, compound words composed of two characters that may have related or relatively broad meanings in isolation. In its simplest form, this can consist of a syllable followed by zi (子), indicating an object or noun. Dictionaries define the character mao (帽) as “hat”, but in actual speech a hat is referred to as maozi (帽子). Most words, however, combine two separate morphological syllables to generate exact meanings, such as huanjing (环境) “environment”, which brings together huan “encircle” and jing “area”. Note that to use “encircle” as a verb in actual spoken Mandarin, one would typically say huanrao (环绕), where rao means “coil” or “move in a circle”. In each case, the monosyllabic characters are semantic elements on their own, but most words exist as disyllabic compounds.

Mandarin can be classified as an isolating language, meaning one that avoids inflectional affixes to express pluralization, tense, gender, and so forth. Many linguists characterize the language as having almost no morphological structure (Li & Thompson, 1989). Instead, Mandarin employs a class of function words that indicate these qualities at the level of sentence structure. For instance, questions are not formed through the rearrangement of words as in English (e.g., I am dreaming → Am I dreaming?), but through the addition of the interrogative auxiliary ma at the end of a sentence. The statement ta you (“he has [it]”) can be transformed into a question in this way: ta you ma? (“does he have [it]?”). We can also see in this example how, where they can be inferred from context, Mandarin speakers frequently omit the subject or object of a sentence.
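The difference between the two question strategies can be made concrete with a toy Python sketch. Both functions are hypothetical and deliberately simplistic: english_question only handles statements of the form subject + auxiliary + rest, and mandarin_question assumes the statement is already in pinyin.

```python
def english_question(statement: str) -> str:
    """Toy subject-auxiliary inversion, e.g. 'I am dreaming' -> 'Am I dreaming?'."""
    subject, auxiliary, *rest = statement.rstrip(".").split()
    # Keep the pronoun 'I' capitalized; lowercase any other subject.
    subject = subject if subject == "I" else subject.lower()
    return " ".join([auxiliary.capitalize(), subject, *rest]) + "?"


def mandarin_question(statement: str) -> str:
    """Form a yes/no question by appending 'ma'; word order is unchanged."""
    return statement.rstrip(". ") + " ma?"


print(english_question("I am dreaming"))
print(mandarin_question("ta you"))
```

The English function has to reorder words, whereas the Mandarin one only appends a function word, which is the point the paragraph above makes about isolating languages.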

Mandarin does not possess a tense system, but speakers can indicate the temporal quality of actions in two main ways. One is through the use of function words to mark grammatical aspect, such as completion (the auxiliary le) or continuation (the auxiliary zhe) of actions. In practice these operate much as past and progressive tenses do in English. The other way time is indicated is through topic-fronting, which positions the timeframe of the action at the beginning of a sentence. Teachers will note that Chinese students often begin English sentences this way, using words such as yesterday or in the future to start a sentence rather than placing them at the end as is more common in spoken English.

Discourse and Social Aspects of Use

Mandarin relies heavily on idiomatic sayings and expressions called chengyu. These pithy phrases, oftentimes drawn from literature or historical sources, act as evaluative statements on the situation at hand, allowing speakers to refer to their interpretations obliquely rather than directly. “East one sentence, west one sentence,” refers to incoherence. “A donkey’s lips do not match a horse’s mouth,” means an answer that is not relevant to the question. One chengyu dictionary from Taiwan contains over thirty thousand entries (Allen, 2011, p. 73).

Chengyu arise regularly in conversation and often laminate contemporary language with classical expression. By alluding to this past and to Chinese culture as a whole, chengyu are not merely poetic aphorisms but also tools of persuasion and argument. Consequently, Chinese students often overuse idioms as a rhetorical technique when writing in English.

Turning to names, in Mandarin surnames come before given names; asking a Chinese person for their “first name” can therefore elicit confusion. Names usually take the form of a single character (and thus syllable) surname followed by a one- or two-character given name. Almost all surnames and given names are characters with other semantic referents. It is common for people to have many different names that facilitate everyday social interactions. Parents often refer to their children with a diminutive or “small name,” such as Lili or Baobao. Friends may employ a nickname or append a descriptor to a person’s real name, such as Little Wang or Old Peng. Similarly, at work people are often referred to by their position, such as Teacher Xu or Department Head Zhang. And many people adopt kinship terms to refer to particular types of relationship in addressing others; the most senior female teacher at a school, for instance, might be addressed by her co-workers as “elder sister”. I have heard Canadian teachers wonder why their Chinese students adopt English “nicknames” or “aliases,” and it is often assumed that this is motivated either by a desire to fit in or a fear that their “real” names cannot be pronounced. In effect though, English names are an extension of this logic. An English name represents a Chinese person’s global persona and is no less authentic than any of the other names they carry.

Finally, although I have focused largely on standard Mandarin, it should be noted that there is a great deal of linguistic diversity across China. Every region, province and city has its own idiosyncratic variety of the language, and these can diverge almost to the point of mutual incomprehension. The difference between urban and rural speakers is especially marked, and urbanites often told me that they had to read subtitles when rural citizens were interviewed on television. Local dialects are a matter of some pride to the area, and are generally used informally to develop feelings of belonging and camaraderie. Nevertheless, there is a heavy emphasis on correctness, and consequently a stigma on local dialects, in formal settings such as at work or school. Many people will shift between local and standard Mandarin as they travel from home to the office or from a casual lunch to the classroom.


Mandarin Chinese is a rich, complex language that offers challenges for English speakers, who are often more familiar with European languages. Chinese characters, in particular, present a radically different approach to literacy from that of an alphabetic script. Nevertheless, I hope this outline of the language can serve to demystify some of the common elements of Mandarin and make it more accessible to a new generation of learners.


1 A few syllables in Mandarin lack an initial consonant, such as the words ai ‘love’ and en ‘to press with the finger,’ but these are relatively rare.
2 Although common for pedagogical purposes, in many texts the tone diacritics are omitted and I have included them here only where relevant.
3 Actually, my basic Chinese dictionary lists ten meanings, because each of these tonal syllables has multiple homophonous characters (see the next paragraph) – I have merely chosen four here as representative examples.



