|Pronunciation||IPA: [pʰiːəsaː kʰmaːe]|
|Native to||Cambodia, Vietnam, Thailand|
|Ethnicity||Khmer, Northern Khmer|
|Native speakers||16 million (2007); 1 million L2 speakers (no date)|
|Dialects||Khmer Krom (Southern Khmer)|
|Writing system||Khmer script (abugida)|
|Official language in||Cambodia|
|ISO 639-3||khm – Central Khmer; kxm – Northern Khmer|
Khmer (ភាសាខ្មែរ, IPA: [pʰiːəsaː kʰmaːe]; or more formally, ខេមរភាសា, IPA: [kʰeɛmaʔraʔ pʰiːəsaː]), or Cambodian, is the language of the Khmer people and the official language of Cambodia. With approximately 16 million speakers, it is the second most widely spoken Austroasiatic language (after Vietnamese). Khmer has been considerably influenced by Sanskrit and Pali, especially in the royal and religious registers, through the vehicles of Hinduism and Buddhism. It is also the earliest recorded and earliest written language of the Mon–Khmer family, predating Mon and, by a significant margin, Vietnamese. The Khmer language has influenced, and also been influenced by, Thai, Lao, Vietnamese, Chinese and Cham, all of which, due to geographical proximity and long-term cultural contact, form a sprachbund in peninsular Southeast Asia.
Khmer is primarily an analytic, isolating language. There are no inflections, conjugations or case endings. Instead, particles and auxiliary words are used to indicate grammatical relationships. General word order is subject–verb–object. Many words conform to the typical Mon–Khmer pattern of a "main" syllable preceded by a minor syllable.
The Khmer language is written with an abugida known in Khmer as អក្សរខ្មែរ (IPA: [aʔksɑː kʰmaːe]), "Khmer script". Khmer differs from neighboring languages such as Thai, Burmese, Lao and Vietnamese in that it is not a tonal language.
Khmer is a member of the Austroasiatic language family, the most archaic family in an area that stretches from the Malay Peninsula through Southeast Asia to East India. Austroasiatic, which also includes Mon, Vietnamese and Munda, has been studied since 1856 and was first proposed as a language family in 1907. Despite the amount of research, there is still doubt about the internal relationship of the languages of Austroasiatic. Most classifications place Khmer in the eastern branch of a Mon-Khmer sub-grouping. In these classification schemes Khmer's closest genetic relatives are the Bahnaric and Pearic languages. More recent classifications doubt the validity of the Mon-Khmer sub-grouping and place the Khmer language as its own branch of Austroasiatic equidistant from the other 12 branches of the family.
Khmer is spoken by some 13 million people in Cambodia, where it is the official language. It is also a second language for most of the minority groups and indigenous hill tribes there. Additionally there are a million speakers of Khmer native to southern Vietnam (1999) and 1.4 million in northeast Thailand (2006).
Khmer dialects, although mutually intelligible, are sometimes quite marked. Notable variations are found in speakers from Phnom Penh (the capital city), the rural Battambang area, the areas of Northeast Thailand adjacent to Cambodia such as Surin province, the Cardamom Mountains, and southern Vietnam. The dialects form a continuum running roughly north to south. Standard Cambodian Khmer is mutually intelligible with the others, but a Khmer Krom speaker from Vietnam, for instance, may have great difficulty communicating with a Khmer native to Sisaket Province in Thailand.
Standard Khmer, or Central Khmer, the language as taught in schools and used by the media, is based on the Battambang dialect spoken throughout the plains of the northwest and central provinces.
Northern Khmer (called Khmer Surin in Khmer) refers to the dialects spoken by many in several border provinces of present-day Northeast Thailand. After the fall of the Khmer Empire in the early 15th century, the Dongrek Mountains served as a natural border leaving the Khmer north of the mountains under the sphere of influence of the Kingdom of Lan Xang. The conquests of Cambodia by Naresuan the Great for Ayutthaya furthered their political and economic isolation from Cambodia proper, leading to a dialect that developed relatively independently from the midpoint of the Middle Khmer period. This has resulted in a distinct accent influenced by the surrounding tonal languages, Lao and Thai, lexical differences, and phonemic differences in both vowels and distribution of consonants. Additionally, syllable-final /r/, which has become silent in other dialects of Khmer, is still pronounced in Northern Khmer. Some linguists classify Northern Khmer as a separate, but closely related language rather than a dialect.
Western Khmer, also called Cardamom Khmer or Chanthaburi Khmer, is spoken by a very small, isolated population in the Cardamom mountain range, which extends from western Cambodia into eastern Central Thailand. Although little studied, it is unique in that it maintains a definite system of vocal register that has all but disappeared in other dialects of modern Khmer.
Phnom Penh Khmer is spoken in the capital and surrounding areas. This dialect is characterized by merging or complete elision of syllables, considered by speakers from other regions to be a "relaxed" pronunciation. For instance, "Phnom Penh" will sometimes be shortened to "m'Penh". Another characteristic of Phnom Penh speech is observed in words with an "r" either as an initial consonant or as the second member of a consonant cluster (as in the English word "bread"). The "r", trilled or flapped in other dialects, is either pronounced as a uvular trill or not pronounced at all. This alters the quality of any preceding consonant, causing a harder, more emphasized pronunciation. Another unique result is that the syllable is spoken with a low-rising or "dipping" tone much like the "hỏi" tone in Vietnamese. For example, some people pronounce /trəj/ (meaning "fish") as /təj/: the "r" is dropped, and the vowel begins by dipping much lower in tone than in standard speech and then rises, effectively doubling its length. Another example is the word /riən/ ("study, learn"), which is pronounced /ʀiən/, with the uvular "r" and the same intonation described above.
Khmer Krom or Southern Khmer is spoken by the indigenous Khmer population of the Mekong Delta, formerly controlled by the Khmer Empire but part of Vietnam since 1698. Khmers are persecuted by the Vietnamese government for using their native language and, since the 1950s, have been forced to take Vietnamese names. Consequently, very little research has been published regarding this dialect. It has been influenced by Vietnamese for three centuries and accordingly displays a pronounced accent, a tendency toward monosyllabic words and lexical differences from the standard.
Linguistic study of the Khmer language divides its history into four periods, one of which, the Old Khmer period, is subdivided into pre-Angkorian and Angkorian. Pre-Angkorian Khmer, the language after its divergence from Proto-Mon–Khmer until the ninth century, is only known from words and phrases in Sanskrit texts of the era. Old Khmer (or Angkorian Khmer) is the language as it was spoken in the Khmer Empire from the 9th century until the weakening of the empire sometime in the 13th century. Old Khmer is attested by many primary sources and has been studied in depth by a few scholars, most notably Saveros Pou, Phillip Jenner and Heinz-Jürgen Pinnow. Following the end of the Khmer Empire, the language lost the standardizing influence of being the language of government and accordingly underwent a turbulent period of change in morphology, phonology and lexicon. The language of this transition period, from about the 14th to 18th centuries, is referred to as Middle Khmer and saw borrowing from Thai, Lao and, to a lesser extent, Vietnamese. The changes during this period are so profound that the rules of Modern Khmer cannot be applied to correctly understand Old Khmer. The language became recognizable as Modern Khmer, spoken from the 19th century until today.
The following table shows the conventionally accepted historical stages of Khmer.
|Pre- or Proto-Khmer||Before 600 CE|
|Pre-Angkorian Old Khmer||600–800 CE|
|Angkorian Old Khmer||800 to mid-1300s|
|Middle Khmer||Mid-1300s to 1700s|
Just as modern Khmer was emerging from the transitional period represented by Middle Khmer, Cambodia fell under the influence of French colonialism. In 1887 Cambodia was fully integrated into French Indochina which brought in a French-speaking aristocracy. This led to French becoming the language of higher education and the intellectual class. Many native scholars in the early 20th century, led by a monk named Chuon Nath, resisted the French influence on their language and championed Khmerization, using Khmer roots (and Pali and Sanskrit) to coin new words for modern ideas, instead of French. Nath cultivated modern Khmer-language identity and culture, overseeing the translation of the entire Pali Buddhist canon into Khmer and creating the modern Khmer language dictionary that is still in use today, thereby ensuring that Khmer would survive, and indeed flourish, during the French colonial period.
|Plosive||p (pʰ)||t (tʰ)||c (cʰ)||k (kʰ)||ʔ|
|Implosive||ɓ ~ b||ɗ ~ d|
Khmer is frequently described as having aspirated stops. However, these may be analyzed as consonant clusters, /ph, th, ch, kh/, as infixes can occur between the stop and the aspiration (phem, p⟨an⟩hem), or as non-distinctive phonetic detail in other consonant clusters, such as the khm in Khmer. [b] and [d] are occasional allophones of the implosives.
In addition, the consonants /f/, /ʃ/, /z/ and /ɡ/ may occasionally occur in recent loan words in the speech of Cambodians familiar with French and other languages. These non-native sounds are not represented in the Khmer script, although combinations of letters otherwise unpronounceable are used to represent them when necessary. In the speech of those who are not bilingual, these sounds are approximated with natively occurring phonemes:
|Foreign Sound (IPA)||Khmer Representation||Khmer Approximation (IPA)|
|/f/||ហ្វ||/h/ or /pʰ/|
Various researchers have proposed slightly different analyses of the vowels. This may be in part because political centralization has not yet been achieved, so standard Khmer does not prevail throughout Cambodia. Additionally, the Cambodian Civil War resulted in massive internal population upheaval. As such, many speakers of even the same community may have different phonological inventories. Two proposals follow. The first is Huffman's analysis of Standard Khmer, and the second is Wayland's analysis of Battambang Khmer, the dialect upon which the standard is based.
|Treated as long vowels||iəj||iəw||ɨəj||əːj||aːj||aoj||uəj|
|Treated as short vowels||ɨw||əw||aj||aw||uj|
The precise number and the phonetic value of vowel nuclei vary from dialect to dialect. Short and long vowels of equal quality are distinguished solely by duration.
Khmer words are predominantly either monosyllabic or sesquisyllabic, with stress falling on the final syllable. There are two possible clusters of three consonants at the beginning of syllables, /str skr/, and 85 possible two-consonant clusters:
Syllables begin with one of these consonants or consonant clusters, followed by one of the vowel nuclei. The aspiration in some clusters is allophonic. When the vowel nucleus is short, there has to be a final consonant. /p, t, c, k, ʔ, m, n, ɲ, ŋ, l, h, j, ʋ/ can exist in a syllable coda, while /h/ and /ʋ/ approach [ç] and [w] respectively. The stops /p, t, c, k/ have no audible release when occurring as syllable finals.
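The coda constraints described above can be sketched in Python. This is only an illustration under stated assumptions: the function names `is_valid_rhyme` and `realize_coda` are hypothetical, the vowel nucleus is reduced to a short/long flag, and the realizations of /h/ and /ʋ/ are treated as fixed even though the text says only that they approach [ç] and [w].

```python
from typing import Optional

# Consonants that may close a syllable, as listed in the text.
CODAS = {"p", "t", "c", "k", "ʔ", "m", "n", "ɲ", "ŋ", "l", "h", "j", "ʋ"}
# Final stops have no audible release.
UNRELEASED_STOPS = {"p", "t", "c", "k"}
NO_AUDIBLE_RELEASE = "\u031a"  # IPA diacritic for "no audible release"
C_CEDILLA = "\u00e7"           # IPA [ç]


def is_valid_rhyme(vowel_is_short: bool, coda: Optional[str]) -> bool:
    """Check a vowel-plus-coda combination against the stated constraints."""
    if coda is None:
        # A short vowel nucleus must be followed by a final consonant.
        return not vowel_is_short
    return coda in CODAS


def realize_coda(coda: str) -> str:
    """Return an approximate phonetic realization of a final consonant."""
    if coda not in CODAS:
        raise ValueError(f"{coda!r} is not a possible coda")
    if coda in UNRELEASED_STOPS:
        return coda + NO_AUDIBLE_RELEASE
    if coda == "h":
        return C_CEDILLA  # simplification: the text says /h/ approaches [ç]
    if coda == "ʋ":
        return "w"        # simplification: the text says /ʋ/ approaches [w]
    return coda
```

For instance, `is_valid_rhyme(True, None)` is false because a short nucleus needs a final consonant, while `realize_coda("k")` returns the stop followed by the no-audible-release diacritic.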
The most common word structure in Khmer is a full syllable as described above, which may be preceded by an unstressed, "minor" syllable that has a consonant-vowel structure of CV-, CrV-, CVN- or CrVN- (N is any nasal in the Khmer inventory). The vowel in these preceding syllables is usually reduced in conversation to [ə]; in careful or formal speech, however, and on television and radio, it is always clearly articulated.
Words with three or more syllables exist, particularly those pertaining to science, the arts, and religion. However, these words are loanwords, usually derived from Pali, Sanskrit, or more recently, French.
Khmer once had a phonation distinction in its vowels, which was indicated in writing by choosing between two sets of letters for the preceding consonant according to the historical source of the phonation. However, phonation has been lost in all but the most archaic dialect of Khmer (Western Khmer). For example, Old Khmer distinguished voiced and unvoiced pairs as in *kaa vs *ɡaa. The vowels after voiced consonants became breathy voiced and diphthongized: *kaa, *ɡe̤a. When consonant voicing was lost, the distinction was maintained by the vowel: *kaa, *ke̤a, and later the phonation disappeared as well: [kaː], [kiə].
Stress in Khmer is non-phonemic (does not distinguish different meanings) and thus is considered to depend entirely on syllable structure. Owing in part to the sesquisyllabic nature of Khmer, syllabic stress is also highly predictable. In native disyllabic words, the first syllable is always a minor syllable and the second syllable is stressed. Loan words and reduplications also tend to follow this pattern:
|សំពត់||/sɑm ˈpoə̯̆t/||"type of sarong"|
In words of more than two syllables, primary stress is always on the final syllable with secondary stress on every second syllable. Thus, in a three-syllable word, the first syllable exhibits secondary stress, while primary stress is on the third syllable:
|រតនា||/ˌreə̯̆ʔ taʔ ˈnaː/||"gem, jewel, precious stone"|
|រដ្ធបាល||/ˌroə̯̆t tʰaʔ ˈbaːl/||"caretaker of the government"|
|កម្ពុជា||/ˌkam puʔ ˈciə̯/||"Kampuchea"|
Compound words of three syllables, however, follow the stress of the constituent words:
|អំបិលម្ទេស||/ɑm ˌbɨl ˈmteːh/||"a dry dipping powder of peppers and salt"|
|អន្ទាក់រូត||/ɑn ˌteə̯̆ʔ ˈruːt/||"a snare"|
|សំបុកចាប||/sɑm ˌbok ˈcaːp/||"a kind of cookie" (lit. "bird's nest")|
In a four-syllable word, the second syllable carries secondary stress and primary stress is on the fourth syllable:
|សីមារេខា||/səj ˌmaː reə̯̆ʔ ˈkʰaː/||"boundary line"|
|សកម្មភាព||/saʔ ˌkam meə̯̆ʔ ˈpʰiə̯p/||"physical activity"|
|រណូងរណាង||/rɔ ˌnoːŋ rɔ ˈnaːŋ/||"hasty, hastily, angrily" (an example of reduplication)|
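The alternating pattern described above for non-compound words can be sketched as follows. This is a minimal Python illustration, not a full account of Khmer stress: the function name `mark_stress` is hypothetical, the word is assumed to be already divided into syllables, compounds (which keep the stress of their constituent words) are not handled, and the sketch is limited to words of up to four syllables, the range for which the rule is stated.

```python
PRIMARY = "\u02c8"    # IPA primary stress mark
SECONDARY = "\u02cc"  # IPA secondary stress mark


def mark_stress(syllables):
    """Prefix stress marks to the syllables of a non-compound word.

    Primary stress goes on the final syllable; counting back from it,
    every second syllable receives secondary stress.
    """
    n = len(syllables)
    if not 1 <= n <= 4:
        raise ValueError("this sketch covers words of one to four syllables")
    marked = []
    for index, syllable in enumerate(syllables):
        distance = n - 1 - index  # number of syllables before the end
        if distance == 0:
            marked.append(PRIMARY + syllable)
        elif distance % 2 == 0:
            marked.append(SECONDARY + syllable)
        else:
            marked.append(syllable)
    return marked


# Three syllables: secondary stress on the first, primary on the third.
print(" ".join(mark_stress(["saʔ", "kam", "meə"])))
```

Applied to a four-syllable list, the same function leaves the first and third syllables unmarked and places secondary stress on the second, matching the examples in the table.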
Words of five or more syllables are exceedingly rare in everyday conversation but do occur in academic, governmental, and religious contexts. These words are all derived from Sanskrit or Pali roots but follow Khmer pronunciation and stress patterns:
|មេឃវាហន||/meː ˌkʰeə̯̆ʔ viə̯ ˌhaʔ ˈnaʔ/||"A name of Indra" (lit. from Sanskrit "having clouds for a vehicle")|
As Khmer is primarily an analytic pro-drop language, intonation (non-phonemic pitch variation throughout clauses) often conveys semantic context. The intonation pattern of a typical Khmer declarative phrase is a steady rise throughout followed by an abrupt drop on the last syllable.
|ខ្ញុំមិនចង់បានទេ||/↗kʰɲom mɨn caŋ baːn | ↘teː/||"I don't want it."|
Other intonation contours signify a different type of phrase, such as the "full doubt" interrogative (similar to "yes-no" questions in English) or the exclamatory phrase. Full doubt interrogatives remain fairly even in tone throughout but rise sharply towards the end.
|អ្នកចង់ទៅលេងសៀមរាបទេ||/↗neaʔ caŋ | ↗tɨw leːŋ siəm riəp | ꜛteː/||"Do you want to go to Siem Reap?"|
Exclamatory phrases follow the typical steadily rising pattern, but rise sharply on the last syllable instead of falling.
|សៀវភៅនេះថ្លៃណាស់||/↗siəw pʰɨw nih| ↗tʰlaj | ꜛnahs/||"This book is expensive!"|
Khmer is generally a subject–verb–object (SVO) language with prepositions. Although Khmer is primarily an analytic language, lexical derivation by means of prefixes and infixes does occur, but it is not always productive in the modern language.
Adjectives, demonstratives and numerals follow the noun they modify. Adverbs likewise follow the verb. Morphologically, adjectives and adverbs are not distinguished, with many words often serving either function. Similar to other languages of the region, intensity can be expressed by reduplication.
ស្រីស្អាតនោះ /srəj sʔaːt nuh/ (girl pretty that) = that pretty girl
ស្រីស្អាតស្អាត /srəj sʔaːt sʔaːt/ (girl pretty pretty) = a very pretty girl
As Khmer sentences rarely use a copula, adjectives are also employed as verbs. Comparatives are formed by the use of ciəng: "A X ciəng B" (A is more X than B). The most common way to express the idea of superlatives is the construction "A X ciəng kee" (A is X-est of all).
The noun has no grammatical gender or singular/plural distinction and is uninflected. Technically there are no articles, but indefiniteness is often expressed by the word for "one" following the noun. Plurality can be marked by postnominal particles, numerals, or reduplicating the adjective, which, although similar to intensification, is usually not ambiguous due to context.
ឆ្កែធំ /cʰkae tʰom/ (dog large) = large dog
ឆ្កែធំធំ /cʰkae tʰom tʰom/ (dog large large) = a very large dog or large dogs
ឆ្កែធំណាស់ /cʰkae tʰom nah/ (dog large very) = very large dog
ឆ្កែពីរ /cʰkae piː/ (dog two) = two dogs
Classifying particles for use between numerals and nouns exist, although they are not always obligatory as they are in, for example, Thai. Pronouns are subject to a complicated system of social register, the choice of pronoun depending on the perceived relationships between speaker, audience and referent (see Social registers below). Kinship terms, nicknames and proper names are often used as pronouns (including for the first person) among intimates. Frequently, subject pronouns are dropped in colloquial conversation.
As is typical of most East Asian languages, the verb does not inflect at all; tense and aspect can be shown by particles and adverbs or understood from context. Most commonly, time words such as "yesterday", "earlier" and "tomorrow" indicate tense when it cannot be inferred from context. There is no participle form. Progressive aspect is expressed with /kəmpɔːŋ/: "A /kəmpɔːŋ/ V" (A is in the process of V). Serial verb construction is quite common. Negation is achieved by putting /min/ before the verb and /teː/ at the end of the sentence or clause. In normal speech verbs can also be negated without an ending particle by putting /ʔɑt/ before them.
ខ្ញុំជឿ /kʰɲom cɨə/ – I believe
ខ្ញុំមិនជឿទេ /kʰɲom min cɨə teː/ – I don't believe
ខ្ញុំឥតជឿ /kʰɲom ʔɑt cɨə/ – I don't believe
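The two negation patterns illustrated above can be sketched as simple word-order templates. This is a hedged Python illustration only: the function name `negate` and its parameters are hypothetical, the clause is assumed to consist of a subject followed by a verb phrase, and real usage involves choices of register that the sketch ignores.

```python
NEGATOR = "min"          # placed before the verb
FINAL_PARTICLE = "teː"   # closes the negated sentence or clause
COLLOQUIAL_NEGATOR = "ʔɑt"  # placed before the verb, no final particle


def negate(subject, verb_phrase, colloquial=False):
    """Return the words of a negated clause as a list of transcriptions."""
    if colloquial:
        return [subject, COLLOQUIAL_NEGATOR, *verb_phrase]
    return [subject, NEGATOR, *verb_phrase, FINAL_PARTICLE]


# "I believe" -> "I don't believe", in both patterns
print(" ".join(negate("kʰɲom", ["cɨə"])))
print(" ".join(negate("kʰɲom", ["cɨə"], colloquial=True)))
```

Returning a list rather than a joined string keeps the word boundaries explicit, which makes it easy to see where each particle is inserted.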
The numbers are:
|6||៦||ប្រាំមួយ||(prăm muŏy)||/pram muːə̯j/|
|7||៧||ប្រាំពីរ||(prăm pi)||/pram piː/ (also /pram pɨl/)|
|8||៨||ប្រាំបី||(prăm bei)||/pram ɓəːj/|
|9||៩||ប្រាំបួន||(prăm buŏn)||/pram ɓuːə̯n/|
|100||១០០||មួយរយ||(muŏy rôy)||/muːə̯j rɔːj/|
|1,000||១០០០||មួយពាន់||(muŏy poăn)||/muːə̯j pɔə̯n/|
|10,000||១០០០០||មួយម៉ឺន||(muŏy mœn)||/muːə̯j məɨn/|
|100,000||១០០០០០||មួយសែន||(muŏy sên)||/muːə̯j saːe̯n/|
|1,000,000||១០០០០០០||មួយលាន||(muŏy léan)||/muːə̯j liːə̯n/|
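The romanized forms for 6 to 9 in the table share the element prăm, the word for five, followed by the word for one, two, three or four. The rows for 1 to 5 are not shown above, so the standalone unit words in this sketch are read off the compound forms, and the function name `six_to_nine` is hypothetical. A minimal Python illustration of the composition:

```python
FIVE = "prăm"
# Unit words as they appear after "prăm" in the table rows for 6 to 9.
UNITS = {1: "muŏy", 2: "pi", 3: "bei", 4: "buŏn"}


def six_to_nine(n):
    """Build the romanized word for 6, 7, 8 or 9 as five plus a unit."""
    if n not in (6, 7, 8, 9):
        raise ValueError("this sketch only covers 6 to 9")
    return FIVE + " " + UNITS[n - 5]


for number in range(6, 10):
    print(number, six_to_nine(number))
```

The sketch uses the romanization given in parentheses in the table, not the IPA column, and does not cover the alternative pronunciation listed for 7.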
Khmer employs a system of registers in which the speaker must always be conscious of the social status of the person spoken to. The different registers, which include those used for common speech, polite speech, speaking to or about royals and speaking to or about monks, employ alternate verbs, names of body parts and pronouns. To foreigners the registers can appear to be separate languages and, in fact, isolated villagers are often unsure how to speak with royals, while royals raised completely within the court do not feel comfortable speaking the common register. As an example, the word for "to eat" used between intimates or in reference to animals is /siː/. Used in polite reference to commoners, it is /ɲam/. When used of those of higher social status, it is /pisa/ or /tɔtuəl tiən/. For monks the word is /cʰan/ and for royals, /saoj/. Another result is that the pronominal system is complex and full of honorific variations, just a few of which are shown in the table below.
|Situational usage||"I, me"||IPA||"you"||IPA||"he, she, it"||IPA|
|Intimate or addressing an inferior||អញ||/ʔaɲ/||ឯង||/ʔaɛ̯ŋ/||វា||/ʋiə̯/|
(or kinship term, title or rank)
|Layperson to/about Buddhist clergy||ខ្ញុំព្រះករុណា||/kʰɲom preə̯̆h kaʔruʔnaː/||ព្រះតេជព្រះគុណ||/preə̯̆h daɛ̯c preə̯̆h kun/||ព្រះអង្គ||/preə̯̆h ɑŋ/|
|Buddhist clergy to layperson||អាត្មា or
|ញោមស្រី (to female)
ញោមប្រុស (to male)
|ឧបាសក (to male)
ឧបាសិកា (to female)
|When addressing royalty||ខ្ញុំព្រះបាទអម្ចាស់ or ទូលបង្គុំ (male), ខ្ញុំម្ចាស់ (female)||/kʰɲom preə̯̆h baːt aʔmcah/||ព្រះករុណា||/preə̯̆h kaʔruʔnaː/||ទ្រង់||/truə̯̆ŋ/|
Khmer is written with the Khmer script, an abugida developed from the Pallava script of India before the 7th century when the first known inscription appeared. Written left-to-right with vowel signs that can be placed after, before, above or below the consonant they follow, the Khmer script is similar in appearance and usage to Thai and Lao, both of which were based on the Khmer system. The Khmer script is also distantly related to the Mon script, the ancestor of the modern Burmese script. Khmer numerals, which were inherited from Indian numerals, are used more widely than Hindu-Arabic numerals. Within Cambodia, literacy in the Khmer alphabet is estimated at 77.6%.
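Because Khmer numerals are a positional decimal system like Hindu-Arabic numerals, converting between the two is a digit-by-digit substitution. The following Python sketch assumes the Unicode encoding of the Khmer digits (U+17E0 for zero through U+17E9 for nine); the function names are hypothetical.

```python
KHMER_ZERO = 0x17E0  # Unicode code point of the Khmer digit zero


def to_khmer_digits(number):
    """Write a non-negative integer with Khmer digits."""
    if number < 0:
        raise ValueError("only non-negative integers are handled")
    return "".join(chr(KHMER_ZERO + int(digit)) for digit in str(number))


def from_khmer_digits(text):
    """Read a string of Khmer digits as an integer."""
    value = 0
    for char in text:
        digit = ord(char) - KHMER_ZERO
        if not 0 <= digit <= 9:
            raise ValueError(f"{char!r} is not a Khmer digit")
        value = value * 10 + digit
    return value


print(to_khmer_digits(2007))
print(from_khmer_digits(to_khmer_digits(2007)))
```

Python's built-in `int()` also accepts Unicode decimal digits directly, so `from_khmer_digits` is spelled out here only to make the substitution visible.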
Consonant symbols in Khmer are divided into two groups, or series. The first series carries the inherent vowel /ɑː/ while the second series carries the inherent vowel /ɔː/. The Khmer names of the series, /aʔkʰosaʔ/ ("voiceless") and /kʰosaʔ/ ("voiced"), respectively, indicate that the second series consonants were used to represent the voiced phonemes of Old Khmer. As the voicing of stops was lost, however, the contrast shifted to the phonation of the attached vowels which, in turn, evolved into a simple difference of vowel quality, often by diphthongization. This process has resulted in the Khmer alphabet having two symbols for most consonant phonemes and each vowel symbol having two possible readings, depending on the series of the initial consonant:
|ត + ា||= តា||/ta/||"grandfather"|
|ទ + ា||= ទា||/tiə/||"duck"|
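The way a consonant's series selects the reading of a vowel sign can be sketched as a two-step lookup. This Python illustration covers only the two consonants and the one vowel sign from the example above; the table names and the function `read_syllable` are hypothetical, and a real reader of the script would need the full consonant and vowel inventories, which are not given here.

```python
# Series membership of the two example consonants (Unicode escapes, with
# the letters shown in comments).
SERIES = {
    "\u178f": 1,  # ត, first series
    "\u1791": 2,  # ទ, second series
}
# Both letters represent the same consonant phoneme in the modern language.
CONSONANT_SOUND = {"\u178f": "t", "\u1791": "t"}
# Inherent vowel of each series, used when no vowel sign is attached.
INHERENT_VOWEL = {1: "ɑː", 2: "ɔː"}
# Reading of each vowel sign, keyed by the series of the initial consonant.
VOWEL_SIGN_READINGS = {
    "\u17b6": {1: "a", 2: "iə"},  # ា, readings as transcribed in the table
}


def read_syllable(text):
    """Transcribe a consonant letter with an optional vowel sign."""
    consonant, sign = text[0], text[1:]
    series = SERIES[consonant]
    if sign:
        vowel = VOWEL_SIGN_READINGS[sign][series]
    else:
        vowel = INHERENT_VOWEL[series]
    return CONSONANT_SOUND[consonant] + vowel


print(read_syllable("\u178f\u17b6"))  # first-series consonant plus ា
print(read_syllable("\u1791\u17b6"))  # second-series consonant plus ា
```

The same vowel sign yields two different transcriptions because the lookup is keyed on the series of the consonant, which is the mechanism the paragraph above describes.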