Native speakers: 360 million (2010)
L2 speakers: 430 million or more (2003)
Writing system: Latin script (English alphabet)
Signed forms: Manually coded English
Official language in
27 non-sovereign entities
English is a West Germanic language that was first spoken in early medieval England and is now a global lingua franca. It is spoken as a first language by the majority populations of several sovereign states, including the United Kingdom, the United States, Canada, Australia, Ireland, New Zealand and a number of Caribbean nations; and it is an official language of almost 60 sovereign states. It is the third-most-common native language in the world, after Mandarin Chinese and Spanish. It is widely learned as a second language and is an official language of the European Union, many Commonwealth countries and the United Nations, as well as of many other world organisations.
English arose in the Anglo-Saxon kingdoms of England and what is now southeast Scotland. Following the extensive influence of Great Britain and the United Kingdom from the 17th to mid-20th centuries through the British Empire, it has been widely propagated around the world. Through the spread of American-dominated media and technology, English has become the leading language of international discourse and the lingua franca in many regions.
Historically, English originated from the fusion of closely related dialects, now collectively termed Old English, which were brought to the eastern coast of Great Britain by Germanic settlers (Anglo-Saxons) by the 5th century; the word English is derived from the name of the Angles, and ultimately from their ancestral region of Angeln (in what is now Schleswig-Holstein). The language was also influenced early on by the Old Norse language through Viking invasions in the 9th and 10th centuries.
The Norman conquest of England in the 11th century gave rise to heavy borrowings from Norman French, and the vocabulary and spelling conventions of what had by then become Middle English began to give the appearance of a close relationship with those of Latin-derived Romance languages (though English is not itself a Romance language). The Great Vowel Shift that began in the south of England in the 15th century is one of the historical events that mark the emergence of Modern English from Middle English.
In addition to its Anglo-Saxon and Norman French roots, a significant number of English words are constructed on the basis of roots from Latin, because Latin in some form was the lingua franca of the Christian Church and of European intellectual life and remains the wellspring of much modern scientific and technical vocabulary.
Owing to the assimilation of words from many other languages throughout history, modern English contains a very large vocabulary, with complex and irregular spelling, particularly of vowels. Modern English has assimilated words not only from other European languages but from all over the world. The Oxford English Dictionary lists more than 250,000 distinct words, not including many technical, scientific, and slang terms.
The word English derives from the eponym Angle, the name of a Germanic tribe thought to originate from the Angeln area of Jutland, now in northern Germany. For possible etymologies of these words, see the articles Angeln and Angles.
Modern English, sometimes described as the first global lingua franca, is the dominant language or in some instances even the required international language of communications, science, information technology, business, seafaring, aviation, entertainment, radio, and diplomacy. Its spread beyond the British Isles began with the growth of the British Empire, and by the late 19th century its reach was global. Following British colonisation from the 16th to 19th centuries, it became the dominant language in the United States, Canada, Australia and New Zealand. The growing economic and cultural influence of the US and its status as a global superpower since World War II have significantly accelerated the spread of the language across the planet. English replaced German as the dominant language of science-related Nobel Prize laureates during the second half of the 20th century. English equalled and may have surpassed French as the dominant language of diplomacy during the second half of the 19th century.
A working knowledge of English has become a requirement in a number of fields, occupations and professions such as medicine and computing; as a consequence, more than a billion people speak English to at least a basic level (see English as a second or foreign language). It is one of six official languages of the United Nations.
One impact of the growth of English is the reduction of native linguistic diversity in many parts of the world. The influence of English continues to play an important role in language attrition. Conversely, the natural internal variety of English, along with creoles and pidgins, has the potential to produce new distinct languages from English over time.
English originated in those dialects of North Sea Germanic that were carried to Britain by Germanic settlers from various parts of what are now the Netherlands, northwest Germany, and Denmark. Up to that point, the native population of Roman Britain is assumed to have spoken Common Brittonic, a Celtic language, alongside the acrolectal influence of Latin that resulted from the 400-year period of Roman rule. One of these incoming Germanic tribes was the Angles, whom Bede believed to have relocated entirely to Britain. The names 'England' (from Engla land "Land of the Angles") and English (Old English Englisc) are derived from the name of this tribe, but Saxons, Jutes and a range of Germanic peoples from the coasts of Frisia, Lower Saxony, Jutland and Southern Sweden also moved to Britain in this era.
Initially, Old English was a diverse group of dialects, reflecting the varied origins of Anglo-Saxon England, but the West Saxon dialect eventually came to dominate, and it is in this dialect that the poem Beowulf is written.
Old English was later transformed by two waves of invasion. The first was by speakers of the North Germanic language branch, when Halfdan Ragnarsson and Ivar the Boneless began the conquest and colonisation of northern parts of the British Isles in the 8th and 9th centuries (see Danelaw). The second was by speakers of the Romance language Old Norman in the 11th century with the Norman conquest of England. Norman developed into Anglo-Norman, and then Anglo-French, and introduced a layer of words especially via the courts and government. As well as extending the lexicon with Scandinavian and Norman words, these two events simplified the grammar and transformed English into a borrowing language, unusually open to accepting new words from other languages.
The linguistic shifts in English following the Norman invasion produced what is now referred to as Middle English; Geoffrey Chaucer's The Canterbury Tales is its best-known work. Throughout this period, Latin in some form was the lingua franca of European intellectual life – first the Medieval Latin of the Christian Church, and later the humanist Renaissance Latin – and those who wrote or copied texts in Latin commonly coined new terms from that language to refer to things or concepts for which there was no native English word.
Modern English, which includes the works of William Shakespeare and the King James Version of the Bible, is generally dated from about 1550, and after the United Kingdom became a colonial power, English served as the lingua franca of the colonies of the British Empire. In the post-colonial period, some of the newly created nations that had multiple indigenous languages opted to continue using English as the lingua franca to avoid the political difficulties inherent in promoting any one indigenous language above the others. As a result of the growth of the British Empire, English was adopted in North America, India, Africa, Australia and many other regions – a trend that was reinforced by the emergence of the United States as a superpower in the mid-20th century.
The English language belongs to the Anglo-Frisian sub-group of the West Germanic branch of the Germanic languages, a member of the Indo-European languages. Modern English is the direct descendant of Middle English, itself a direct descendant of Old English, a descendant of the Proto-Germanic language. Typical of most Germanic languages, English is characterised by the use of modal verbs, the division of verbs into strong and weak classes, and common sound shifts from Proto-Indo-European known as Grimm's law. The closest living relatives of English (besides the English languages and English-based creole languages) are the Frisian languages of the southern fringes of the North Sea in the Netherlands, Germany, and Denmark.
After Frisian come those Germanic languages that are more distantly related: the non-Anglo-Frisian West Germanic languages (Dutch, Afrikaans, Low German, High German, Yiddish), and the North Germanic languages (Swedish, Danish, Norwegian, Icelandic, and Faroese). None of the Continental Germanic languages is mutually intelligible with English, owing in part to divergences in lexis, syntax, semantics, and phonology, and to the isolation afforded to English by the British Isles, although some, such as Dutch, do show strong affinities with English, especially to its earlier stages. Isolation has allowed English (as well as Icelandic and Faroese) to develop independently of the Continental Germanic languages and their influences.
In addition to isolation, lexical differences between English and other Germanic languages exist due to diachronic change, semantic drift, and to substantial borrowing in English of words from other languages, especially Latin and French (though borrowing is in no way unique to English). For example, compare "exit" (Latin) vs. Dutch uitgang and German Ausgang (literally "out-going", though outgang continues to survive dialectally) and "change" (French) vs. Dutch verandering and German Änderung (literally "elsing, othering", i.e. "alteration"); "movement" (French) vs. Dutch beweging and German Bewegung ("beway-ing", i.e. "proceeding along the way"); etc. With the exception of exit (a Modern English borrowing), Middle English had already distanced itself from other Germanic languages, having the terms wharf, schift (="shift"), and wending for "change"; and already by Old English times the word bewegan meant "to cover, envelop", rather than "to move". Preference for one synonym over another also causes differentiation in lexis, even where both words are Germanic, as in English care vs. German Sorge. The two words descend from Proto-Germanic *karō and *surgō respectively, but *karō has become the dominant word in English for "care" while in German, Dutch, and Scandinavian languages, the *surgō root prevailed. *Surgō still survives in English, however, as sorrow.
Despite extensive lexical borrowing, the workings of the English language are resolutely Germanic, and English remains classified as a Germanic language due to its structure and grammar. Borrowed words get incorporated into a Germanic system of conjugation, declension, and syntax, and behave exactly as though they were native Germanic words from Old English. For example, the word reduce is borrowed from Latin redūcere; however, in English one says "I reduce – I reduced – I will reduce" rather than "redūcō – redūxī – redūcam"; likewise, we say: "John's life insurance company" (cf. Dutch "Johns levensverzekeringsmaatschappij" [= leven (life) + verzekering (insurance) + maatschappij (company)]) rather than "the company of insurance life of John", cf. the French: la compagnie d'assurance-vie de John. Furthermore, in English, all basic grammatical particles added to nouns, verbs, adjectives, and adverbs are Germanic. For nouns, these include the normal plural marker -s/-es (apple – apples; cf. Frisian appel – appels; Dutch appel – appels; Afrikaans appel – appels), and the possessive markers -'s (Brad's hat; German Brads Hut; Danish Brads hat) and -s'.
For verbs, these include the third-person present ending -s/-es (e.g. he stands/he reaches), the present participle ending -ing (cf. Dutch -ende; German -end(e)), the simple past tense and past participle ending -ed (Swedish -ade/-ad), and the formation of the English infinitive using to (e.g. "to drive"; cf. Old English tō drīfenne; Dutch te drijven; Low German to drieven; German zu treiben). Adverbs generally receive an -ly ending (cf. German -lich; Swedish -ligt), and adjectives and adverbs are inflected for the comparative and superlative using -er and -est (e.g. hard/harder/hardest; cf. Dutch hard/harder/hardst), or through a combination with more and most (cf. Swedish mer and mest). These particles append freely to all English words regardless of origin (tsunamis; communicates; to buccaneer; during; calmer; bizarrely) and all derive from Old English. Even the lack or absence of affixes, known as zero or null (-Ø) affixes, derives from endings that previously existed in Old English (usually -e, -a, -u, -o, -an, etc.), that later weakened to -e, and have since ceased to be pronounced and spelt (e.g. Modern English "I sing" = I sing-Ø < I singe < Old English ic singe; "we thought" = we thought-Ø < we thoughte(n) < Old English wē þōhton).
Due to the Viking colonisation and influence of Old Norse on Middle English, English syntax follows a pattern similar to that of North Germanic languages, such as Danish, Swedish, and Icelandic, in contrast with other West Germanic languages, such as Dutch and German. This is especially evident in the order and placement of verbs. For example, English "I will never see you again" = Danish "Jeg vil aldrig se dig igen"; Icelandic "Ég mun aldrei sjá þig aftur", whereas in Dutch and German the main verb is placed at the end (e.g. Dutch "Ik zal je nooit weer zien"; German "Ich werde dich nie wieder sehen", literally, "I will you never again see"). This is also observable in perfect tense constructions, as in English "I have never seen anything in the square" = Danish "Jeg har aldrig set noget på torvet"; Icelandic "Ég hef aldrei séð neitt á torginu", where Dutch and German place the past participle at the end (e.g. Dutch "Ik heb nooit iets op het plein gezien"; German "Ich habe nie etwas auf dem Platz gesehen", literally, "I have never anything in the square seen"). As in most Germanic languages, English adjectives usually come before the noun they modify, even when the adjective is of Latinate origin (e.g. medical emergency, national treasure). English continues to make extensive use of self-explaining compounds (e.g. streetcar, classroom) and nouns that serve as modifiers (e.g. lamp post, life insurance company) – traits inherited from Old English (See also Kenning).
The kinship with other Germanic languages can also be seen in the tensing of English verbs (e.g. English fall/fell/fallen/will or shall fall, West Frisian fal/foel/fallen/sil falle, Dutch vallen/viel/gevallen/zullen vallen, German fallen/fiel/gefallen/werden fallen, Norwegian faller/falt/falt or falne/vil or skal falle), the comparatives of adjectives and adverbs (e.g. English good/better/best, West Frisian goed/better/best, Dutch goed/beter/best, German gut/besser/best), the treatment of nouns (English shoemaker, shoemaker's, shoemakers, shoemakers'; Dutch schoenmaker, schoenmakers, schoenmakers, schoenmakeren; Swedish skomakare, skomakares, skomakare, skomakares), and the large number of cognates (e.g. English wet, Scots weet, West Frisian wiet, Swedish våt; English send, Dutch zenden, German senden; English meaning, Swedish mening, Icelandic meining, etc.).
This kinship occasionally gives rise to false friends (e.g. English time vs Norwegian time, meaning "hour" [i.e. "a specific amount of time"]; English gift vs German Gift, meaning "poison" [i.e. "that which is given, dosage, dose"]), while differences in phonology can obscure words that really are related (tooth vs. German Zahn; compare also Danish tand, North Frisian toth). Sometimes both semantics and phonology are different (German Zeit ("time") is related to English "tide", but the English word, through a transitional phase of meaning "period"/"interval", has come primarily to mean gravitational effects on the ocean by the moon (formerly expressed by ebb), though the original meaning is preserved in forms like tidings and betide, and phrases such as to tide over). However, a few other Germanic languages, more closely related to English than German, also share this same semantic shift, namely Low German (i.e. Low German Tīde = "tide of the sea") and Dutch (Dutch getijde, tij = "tide of the sea").
Some North Germanic words entered English from the settlement of Viking raiders and Danish invasions that began around the 9th century (see Danelaw). Many of these words are common and are often mistaken for being native, which shows how close-knit the relations between the English and the Scandinavian settlers were (see below: Words of Old Norse origin). Dutch and Low German also had a considerable influence on English vocabulary, contributing common everyday terms and many nautical and trading terms (see below: Words of Dutch and Low German origin).
There are also some words in English that are probably of Scandinavian origin, other than Old Norse, but which are difficult to trace more exactly.
English has been forming compound words and affixing existing words separately from the other Germanic languages for more than 1500 years, and so shows different patterns in this regard. For instance, abstract nouns in English may be formed from native words by the suffixes "-hood", "-ship", "-dom" and "-ness". All of these suffixes have cognates in most or all other Germanic languages, but their usage has diverged, as German "Freiheit" and Dutch "vrijheid" vs. English "freedom" (the suffix "-heit"/"-heid" being cognate with English "-hood", while English "-dom" is cognate with German "-tum" and Dutch "-dom"; but note North Frisian fridoem and Norwegian fridom, "freedom", and Dutch vrijdom, "exemption"). The Germanic languages Icelandic and Faroese also follow English in this respect, since, like English, they developed independently of German influences.
Many French words are also intelligible to an English speaker, especially when they are seen in writing (as pronunciations are often quite different), because English absorbed a large vocabulary from Norman and French, via Anglo-Norman after the Norman Conquest, and directly from French in subsequent centuries. As a result, a large portion of English vocabulary is derived from French, with some minor spelling differences (e.g. inflectional endings, use of old French spellings, lack of diacritics, etc.), as well as occasional divergences in meaning of so-called false friends: for example, compare "library" with the French librairie, which means bookstore; in French, the word for "library" is bibliothèque. The pronunciation of most French loanwords in English (with the exception of a handful of more recently borrowed words such as mirage, genre, café; or phrases like coup d'état, rendez-vous, etc.) has become largely anglicised and follows a typically English phonology and pattern of stress (compare English "nature" vs. French nature, "button" vs. bouton, "table" vs. table, "hour" vs. heure, "reside" vs. résider, etc.).
Approximately 375 million people speak English as their first language. English today is probably the third largest language by number of native speakers, after Mandarin Chinese and Spanish. However, when combining native and non-native speakers it is probably the most commonly spoken language in the world, though possibly second to a combination of the Chinese languages (depending on whether distinctions in the latter are classified as "languages" or "dialects").
Estimates that include second language speakers vary greatly from 470 million to over a billion depending on how literacy or mastery is defined and measured. Linguistics professor David Crystal calculates that non-native speakers now outnumber native speakers by a ratio of 3 to 1.
The countries with the highest populations of native English speakers are, in descending order: the United States (226 million), the United Kingdom (61 million), Canada (18.2 million), Australia (15.5 million), Nigeria (4 million), Ireland (3.8 million), South Africa (3.7 million), and New Zealand (3.6 million) in a 2006 Census.
Countries such as the Philippines, Jamaica and Nigeria also have millions of native speakers of dialect continua ranging from an English-based creole to a more standard version of English. Of those nations where English is spoken as a second language, India has the most such speakers (see Indian English). Crystal claims that, combining native and non-native speakers, India now has more people who speak or understand English than any other country in the world.
|Country||Total||Percent of population||First language||As an additional language||Population||Comment|
|United States||267,444,149||95%||225,505,953||41,938,196||280,950,438||Source: American Community Survey: Language Use in the United States: 2007, Table 1. Figures for second language speakers are respondents who reported they do not speak English at home but know it "very well" or "well." Figures are for population age 5 and older.|
|India||125,344,736||12%||226,449||86,125,221 second language speakers; 38,993,066 third language speakers||1,028,737,436||Source: Census 2001. Figures include both those who speak English as a second language and those who speak it as a third language. The figures include English speakers, but not English users.|
|Pakistan||88,690,000||49%||–||88,690,000||180,440,005||Source: Euromonitor International report 2009. "The Benefits of the English Language for Individuals and Societies: Quantitative Indicators from Cameroon, Nigeria, Rwanda, Bangladesh and Pakistan." 'A custom report compiled by Euromonitor International for the British Council'.|
|Nigeria||79,000,000||53%||4,000,000||75,000,000+||148,000,000||Figures are for speakers of Nigerian Pidgin, an English-based pidgin or creole. Ihemere gives a range of roughly 3 to 5 million native speakers; the midpoint of the range is used in the table. Ihemere, Kelechukwu Uchechukwu (2006). "A Basic Description and Analytic Treatment of Noun Clauses in Nigerian Pidgin". Nordic Journal of African Studies 15 (3): 296–313.|
|United Kingdom||59,600,000||98%||58,100,000||1,500,000||60,000,000||Source: Crystal (2005), p. 109.|
|Philippines||43,994,000||52%||20,000||43,974,000||84,566,000||Ethnologue lists 3.4 million native speakers with 52% of the population speaking it as an additional language.|
|Canada||25,246,220||85%||17,694,830||7,551,390||29,639,030||Source: 2001 Census – Knowledge of Official Languages and Mother Tongue. The native speakers figure comprises 122,660 people with both French and English as a mother tongue, plus 17,572,170 people with English and not French as a mother tongue.|
|Australia||18,172,989||92%||15,581,329||2,591,660||19,855,288||Source: 2006 Census. The figure shown in the first language English speakers column is actually the number of Australian residents who speak only English at home. The additional language column shows the number of other residents who claim to speak English "well" or "very well". Another 5% of residents did not state their home language or English proficiency.|
|South Africa||–||9.6%||4,892,623||–||51,770,560||Source: 2011 Census. Native speakers = people speaking English at home|
|Ireland||–||94%||–||–||4,588,252||Source: 2011 Census|
|New Zealand||3,673,626||91.2%||3,008,058||665,568||4,027,947||Source: 2006 Census. The figures are people who can speak English with sufficient fluency to hold an everyday conversation. The figure shown in the first language English speakers column is actually the number of New Zealand residents who reported to speak English only, while the additional language column shows the number of New Zealand residents who reported to speak English as one of two or more languages.|
|Note: Total = First language + Other language; Percentage = Total / Population|
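The two formulas in the note can be checked against rows of the table. The following is a minimal Python sketch rather than part of the source data: the names ROWS, total_speakers and percent_of_population are illustrative assumptions, and the figures are copied from the United States, Canada and New Zealand rows.

```python
# Rows copied from the table: (first language, additional language, population).
# The dictionary and function names are illustrative, not from the source.
ROWS = {
    "United States": (225_505_953, 41_938_196, 280_950_438),
    "Canada": (17_694_830, 7_551_390, 29_639_030),
    "New Zealand": (3_008_058, 665_568, 4_027_947),
}


def total_speakers(first_language: int, additional_language: int) -> int:
    """Total = First language + Other language (per the table note)."""
    return first_language + additional_language


def percent_of_population(total: int, population: int) -> float:
    """Percentage = Total / Population, expressed in percent."""
    return 100.0 * total / population


if __name__ == "__main__":
    for country, (first, additional, population) in ROWS.items():
        total = total_speakers(first, additional)
        share = percent_of_population(total, population)
        print(f"{country}: total={total:,} share={share:.1f}%")
```

For these three rows the computed totals equal the Total column, and the shares round to the listed 95%, 85% and 91.2%. Not every row follows the note exactly, since several rows draw their figures from different sources or leave cells blank.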
English is the primary language in Anguilla, Antigua and Barbuda, Australia, the Bahamas, Barbados, Belize, Bermuda, the British Indian Ocean Territory, the British Virgin Islands, Canada, the Cayman Islands, Dominica, the Falkland Islands, Gibraltar, Grenada, Guam, Guernsey, Guyana, Ireland, the Isle of Man, Jamaica, Jersey, Montserrat, Nauru, New Zealand, Pitcairn Islands, Saint Helena, Ascension and Tristan da Cunha, Saint Kitts and Nevis, Saint Vincent and the Grenadines, Singapore, South Georgia and the South Sandwich Islands, Trinidad and Tobago, the Turks and Caicos Islands, the United Kingdom and the United States.
In some countries where English is not the most spoken language, it is an official language; these countries include Botswana, Cameroon, the Federated States of Micronesia, Fiji, Gambia, Ghana, Hong Kong, India, Kenya, Kiribati, Lesotho, Liberia, Malta, the Marshall Islands, Mauritius, Namibia, Nigeria, Pakistan, Palau, Papua New Guinea, the Philippines (Philippine English), Rwanda, Saint Lucia, Samoa, Seychelles, Sierra Leone, the Solomon Islands, Sri Lanka, Sudan, South Sudan, Swaziland, Tanzania, Uganda, Zambia, and Zimbabwe. There are also countries where English became a co-official language in part of the territory, e.g. Colombia's San Andrés y Providencia and Nicaragua's Mosquito Coast, as a result of the influence of British colonisation in the area.
English is one of the 11 official languages that are given equal status in South Africa (South African English). It is also the official language in current dependent territories of Australia (Norfolk Island, Christmas Island and Cocos Island) and of the United States (American Samoa, Guam, Northern Mariana Islands, Puerto Rico (in Puerto Rico, English is co-official with Spanish), and the US Virgin Islands), and the former British colony of Hong Kong. (See List of countries where English is an official language for more details.)
Although the United States federal government has no official languages, English has been given official status by 30 of the 50 state governments. Although falling short of official status, English is also an important language in several former colonies and protectorates of the United Kingdom, such as Bahrain, Bangladesh, Brunei, Cyprus, Malaysia, and the United Arab Emirates.
Because English is so widely spoken, it has often been referred to as a "world language", the lingua franca of the modern era, and while it is not an official language in most countries, it is currently the language most often taught as a foreign language. It is, by international treaty, the official language for aeronautical and maritime communications. English is one of the official languages of the United Nations and many other international organisations, including the International Olympic Committee.
English is the foreign language most often studied in the European Union, and among Europeans the perception of the usefulness of foreign languages favours English at 67%, ahead of German at 17% and French at 16% (as of 2012). Among some of the non-English-speaking EU countries, the following percentages of the adult population claimed to be able to converse in English in 2012: 90% in the Netherlands, 89% in Malta, 86% in Sweden and Denmark, 73% in Cyprus and Austria, 70% in Finland, and over 50% in Greece, Luxembourg, Slovenia and Germany. In 2012, excluding native speakers, 38% of Europeans considered that they could speak English, but only 3% of Japanese people did.
Books, magazines, and newspapers written in English are available in many countries around the world, and English is the most commonly used language in the sciences with Science Citation Index reporting as early as 1997 that 95% of its articles were written in English, even though only half of them came from authors in English-speaking countries.
English-language literature predominates considerably, with 28% of all volumes published in the world [leclerc 2011] and 30% of web content in 2011 (down from 50% in 2000).
This increasing use of the English language globally has had a large impact on many other languages, leading to language shift and even language death, and to claims of linguistic imperialism. English itself has become more open to language shift as multiple regional varieties feed back into the language as a whole.
English has been subject to a large degree of regional dialect variation for many centuries. Its global spread now means that a large number of dialects and English-based creole languages and pidgins can be found all over the world.
Several educated native dialects of English have wide acceptance as standards in much of the world. In the United Kingdom much emphasis is placed on Received Pronunciation, an educated dialect of South East England. General American, which is spread over most of the United States and much of Canada, is more typically the model for the American continents and areas (such as the Philippines) that have had either close association with the United States, or a desire to be so identified. In Oceania, the major native dialect of Australian English is spoken as a first language by the vast majority of the inhabitants of the Australian continent, with General Australian serving as the standard accent. The English of neighbouring New Zealand and that of South Africa have to a lesser degree been influential native varieties of the language.
Aside from these major dialects, there are numerous other varieties of English, which include, in most cases, several subvarieties, such as Cockney, Scouse and Geordie within British English; Newfoundland English within Canadian English; and African American Vernacular English ("Ebonics") and Southern American English within American English. English is a pluricentric language, without a central language authority like France's Académie française; and therefore no one variety is considered "correct" or "incorrect" except in terms of the expectations of the particular audience to which the language is directed.
Scots has its origins in early Northern Middle English and developed and changed during its history with influence from other sources. However, following the Acts of Union 1707 a process of language attrition began, whereby successive generations adopted more and more features from Standard English. Whether Scots is now a separate language or is better described as a dialect of English (i.e. part of Scottish English) is in dispute, although the UK government accepts Scots as a regional language and has recognised it as such under the European Charter for Regional or Minority Languages. There are a number of regional dialects of Scots, and pronunciation, grammar and lexis of the traditional forms differ, sometimes substantially, from other varieties of English.
English speakers have many different accents, which often signal the speaker's native dialect or language. For the most distinctive characteristics of regional accents, see Regional accents of English, and for a complete list of regional dialects, see List of dialects of the English language. Within England, variation is now largely confined to pronunciation rather than grammar or vocabulary. At the time of the Survey of English Dialects, grammar and vocabulary differed across the country, but a process of lexical attrition has led most of this variation to die out.
Just as English itself has borrowed words from many different languages over its history, English loanwords now appear in many languages around the world, indicative of the technological and cultural influence of its speakers. Several pidgins and creole languages have been formed on an English base, such as Jamaican Patois, Nigerian Pidgin, and Tok Pisin. There are many words in English coined to describe forms of particular non-English languages that contain a very high proportion of English words.
It is well established that informal speech registers tend to be made up predominantly of words of Anglo-Saxon or Germanic origin, whereas the proportion of the vocabulary that is of Latinate origin is likely to be higher in legal, scientific, and otherwise scholarly or academic texts.
Child-directed speech, which is an informal speech register, also tends to rely heavily on vocabulary rich in words derived from Anglo-Saxon. The speech of mothers to young children has a higher percentage of native Anglo-Saxon verb tokens than speech addressed to adults. In particular, in parents' child-directed speech the clausal core is built for the most part from Anglo-Saxon verbs: almost all tokens of the grammatical relations subject-verb, verb-direct object and verb-indirect object that young children are presented with are constructed with native verbs. The Anglo-Saxon verb vocabulary consists of short verbs, but its grammar is relatively complex. Syntactic patterns specific to this sub-vocabulary in present-day English include periphrastic constructions for tense, aspect, questioning and negation, and phrasal lexemes functioning as complex predicates, all of which also occur in child-directed speech.
The historical origin of vocabulary items affects the order of acquisition of various aspects of language development in English-speaking children. Latinate vocabulary is in general a later acquisition in children than the native Anglo-Saxon one. Young children almost exclusively use the native verb vocabulary in constructing basic grammatical relations, apparently mastering its analytic aspects at an early stage.
A version of the language almost universally agreed upon by educated English-speakers around the world is called formal written English. It takes virtually the same form regardless of where it is written, in contrast to spoken English, which differs significantly between dialects, accents, and varieties of slang and of colloquial and regional expressions. Local variations in the formal written version of the language are quite limited, being restricted largely to minor spelling, lexical and grammatical differences between national varieties of English (e.g. British, American, Indian, Australian and South African).
Artificially simplified versions of the language have been created that are easier for non-native speakers to read. Basic English is a constructed language, with a restricted number of words, created by Charles Kay Ogden and described in his book Basic English: A General Introduction with Rules and Grammar (1930). Ogden said that it would take seven years to learn English, seven months for Esperanto, and seven weeks for Basic English. Thus, Basic English may be employed by companies that need to make complex books for international use, as well as by language schools that need to impart some knowledge of English in a short time.
Ogden did not include any words in Basic English that could be said instead with a combination of other words already in the Basic English lexicon, and he worked to make the vocabulary suitable for speakers of any other language. He put his vocabulary selections through a large number of tests and adjustments. Ogden also simplified the grammar but tried to keep it normal for English users. Although it was not built into a program, similar simplifications were devised for various international uses.
Simplified English is a controlled language originally developed for aerospace industry maintenance manuals. It employs a carefully limited and standardised subset of English. Simplified English has a lexicon of approved words and those words can only be used in certain ways. For example, the word close can be used in the phrase "Close the door" but not "do not go close to the landing gear".
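The idea of an approved lexicon whose words may be used only in certain ways can be illustrated with a minimal sketch. The lexicon entries, the part-of-speech labels and the function name `find_violations` below are hypothetical and are not taken from any actual Simplified English dictionary; the input is assumed to be already tagged with parts of speech.

```python
# Hypothetical approved lexicon: each word maps to the parts of speech
# in which it may be used. These entries are illustrative assumptions,
# not taken from any real Simplified English dictionary.
APPROVED = {
    "close": {"verb"},
    "the": {"article"},
    "door": {"noun"},
    "do": {"verb"},
    "not": {"adverb"},
    "go": {"verb"},
    "to": {"preposition"},
    "landing": {"noun"},
    "gear": {"noun"},
}


def find_violations(tagged_sentence):
    """Return the (word, part_of_speech) pairs not allowed by APPROVED."""
    violations = []
    for word, pos in tagged_sentence:
        allowed = APPROVED.get(word.lower(), set())
        if pos not in allowed:
            violations.append((word, pos))
    return violations


# "Close the door": close is used as a verb, which is approved.
ok = [("Close", "verb"), ("the", "article"), ("door", "noun")]

# "do not go close to the landing gear": close is used as an adverb.
bad = [("do", "verb"), ("not", "adverb"), ("go", "verb"),
       ("close", "adverb"), ("to", "preposition"), ("the", "article"),
       ("landing", "noun"), ("gear", "noun")]

print(find_violations(ok))
print(find_violations(bad))
```

In this sketch the first sentence produces no violations, while the second is flagged only for its use of close as an adverb. A real checker would also need a way of assigning parts of speech, which is outside the scope of this illustration.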
Other constructed varieties of English include:
The phonology (sound system) of English differs between dialects. The descriptions below are most closely applicable to the standard varieties known as Received Pronunciation (RP) and General American. For information concerning a range of other varieties, see IPA chart for English dialects.
The table below shows the system of consonant phonemes that functions in most major varieties of English. The symbols are from the International Phonetic Alphabet (IPA), and are used in the pronunciation keys of many dictionaries. For more detailed information see English phonology: Consonants.
Where consonants are given in pairs (as with "p b"), the first is voiceless, the second is voiced. Most of the symbols represent the same sounds as they normally do when used as letters (see Writing system below), but /j/ represents the initial sound of yacht. The symbol /ʃ/ represents the sh sound, /ʒ/ the middle sound of vision, /tʃ/ the ch sound, /dʒ/ the sound of j in jump, /θ/ and /ð/ the th sounds in thing and this respectively, and /ŋ/ the ng sound in sing. The voiceless velar fricative /x/ is not a regular phoneme in most varieties of English, although it is used by some speakers in Scots/Gaelic words such as loch or in other loanwords such as Chanukah.
Some of the more significant variations in the pronunciation of consonants are these:
The system of vowel phonemes and their pronunciation is subject to significant variation between dialects. The table below lists the vowels found in Received Pronunciation (RP) and General American, with examples of words in which they occur. The vowels are represented with symbols from the International Phonetic Alphabet; those given for RP are in relatively standard use in British dictionaries and other publications. For more detailed information see English phonology: Vowels.
Some points to note:
English is a strongly stressed language. In content words of any number of syllables, as well as function words of more than one syllable, there will be at least one syllable with lexical stress. An example of this is civilization, in which the first and fourth syllables carry stress, and the other syllables are unstressed. The position of stress in English words is not predictable. English also has strong prosodic stress: typically the last stressed syllable of a phrase receives extra emphasis, but this may also occur on words to which a speaker wishes to draw attention. As regards rhythm, English is classed as a stress-timed language: one in which there is a tendency for the time intervals between stressed syllables to become equal, and therefore to shorten unstressed syllables. It is uncertain when English became stress-timed, but as most other surviving Germanic languages are stress-timed as well, the feature may date to before the breakup of proto-West Germanic.
Stress in English is sometimes phonemic; that is, capable of distinguishing words. In particular, many words used as verbs and nouns have developed different stress patterns for each use: for example, increase is stressed on the first syllable as a noun, giving INcrease, but on the second syllable as a verb, giving inCREASE; see also Initial-stress-derived noun. Closely related to stress in English is the process of vowel reduction; for example, in the noun contract the first syllable is stressed and contains the vowel /ɒ/ (in RP), whereas in the verb contract the first syllable is unstressed and its vowel is reduced to /ə/ (schwa). The same process applies to certain common function words like of, which are pronounced with different vowels depending on whether or not they are stressed within the sentence. For more details, see Reduced vowels in English. Despite these practices, phonemic stress in English is generally a convention rather than essential to distinguish homophones: in both these examples, whether the word is being used as a noun or verb should normally be clear from context.
As concerns intonation, the pitch of the voice is used syntactically in English; for example, to convey whether the speaker is certain or uncertain about the polarity: most varieties of English use falling pitch for definite statements, and rising pitch to express uncertainty, as in yes–no questions. There is also a characteristic change of pitch on strongly stressed syllables, particularly on the "nuclear" (most strongly stressed) syllable in a sentence or intonation group. For more details see Intonation (linguistics): Intonation in English.
English grammar has minimal inflection compared with most other Indo-European languages. For example, Modern English, unlike Modern German or Dutch and the Romance languages, lacks grammatical gender and adjectival agreement. Case marking has almost disappeared from the language and mainly survives in pronouns. The patterning of strong (e.g. speak/spoke/spoken) versus weak verbs (e.g. love/loved or kick/kicked) inherited from its Germanic origins has declined in importance in modern English, and the remnants of inflection (such as plural marking) have become more regular.
At the same time, the language has become more analytic, and has developed features such as modal verbs and word order as resources for conveying meaning. Auxiliary verbs mark constructions such as questions, negative polarity, the passive voice and progressive aspect. English word order has moved from the Germanic V2 word order to being almost exclusively subject-verb-object; as English makes extensive use of auxiliary verbs, this will often create clusters of two or more verbs at the centre of the sentence, such as "he had hoped to try to open it". The long literary history of English has also created many conventions regarding the use of techniques such as verbal nouns and relative clauses to express complex ideas in formal writing.
English vocabulary has changed considerably over the centuries.
Like many languages deriving from Proto-Indo-European (PIE), many of the most common words in English can trace back their origin (through the Germanic branch) to PIE. Such words include the basic pronouns I, from Old English ic, (cf. German Ich, Gothic ik, Latin ego, Greek ego, Sanskrit aham), me (cf. German mich, mir, Gothic mik, mīs, Latin mē, Greek eme, Sanskrit mam), numbers (e.g. one, two, three, cf. Dutch een, twee, drie, Gothic ains, twai, threis (þreis), Latin ūnus, duo, trēs, Greek oinos "ace (on dice)", duo, treis), common family relationships such as mother, father, brother, sister etc. (cf. Dutch moeder, Greek meter, Latin mater, Sanskrit matṛ; mother), names of many animals (cf. German Maus, Dutch muis, Sanskrit mus, Greek mus, Latin mūs; mouse), and many common verbs (cf. Old High German knājan, Old Norse kná, Greek gignōmi, Latin gnoscere, Hittite kanes; to know).
Germanic words (generally words of Old English or to a lesser extent Old Norse origin) tend to be shorter than Latinate words, are more common in ordinary speech, and include nearly all the basic pronouns, prepositions, conjunctions, modal verbs etc. that form the basis of English syntax and grammar. The shortness of the words is generally due to syncope in Middle English (e.g. OldEng hēafod > ModEng head, OldEng sāwol > ModEng soul) and to the loss of final syllables due to stress (e.g. OldEng gamen > ModEng game, OldEng ǣrende > ModEng errand), not because Germanic words are inherently shorter than Latinate words. The lengthier, higher-register words of Old English were largely forgotten following the subjugation of English after the Norman Conquest, and most of the Old English lexis devoted to literature, the arts, and sciences ceased to be productive when it fell into disuse; only the shorter, more direct words of Old English tended to pass into the modern language.
Consequently, those words which tend to be regarded as elegant or educated in Modern English are usually Latinate. However, the excessive use of Latinate words is considered at times to be either pretentious or an attempt to obfuscate an issue. George Orwell's essay "Politics and the English Language", considered an important scrutinisation of the English language, is critical of this, as well as other perceived misuses of the language.
An English speaker is in many cases able to choose between Germanic and Latinate synonyms: come or arrive; sight or vision; freedom or liberty. In some cases, there is a choice between a Germanic derived word (oversee), a Latin derived word (supervise), and a French word derived from the same Latin word (survey); or even Germanic words derived from Norman French (e.g., warranty) and Parisian French (guarantee), and even choices involving multiple Germanic and Latinate sources are possible: sickness (Old English), ill (Old Norse), infirmity (French), affliction (Latin). Such synonyms harbour a variety of different meanings and nuances. Yet the ability to choose between multiple synonyms is not a consequence of French and Latin influence, as this same richness existed in English prior to the extensive borrowing of French and Latin terms. Old English was extremely resourceful in its ability to express synonyms and shades of meaning on its own, in many respects rivaling or exceeding that of Modern English (synonyms numbering in the thirties for certain concepts were not uncommon).
Take for instance the various ways to express the word "astronomer" or "astrologer" in Old English: tunglere, tungolcræftiga, tungolwītega, tīdymbwlātend, tīdscēawere. In Modern English, however, the roles of such synonyms have largely been replaced by equivalents taken from Latin, French, and Greek, as English has taken the position of a diminished reliance upon native elements and resources for the creation of new words and terminologies. Familiarity with the etymology of groups of synonyms can give English speakers greater control over their linguistic register. See: List of Germanic and Latinate equivalents in English, Doublet (linguistics).
A commonly noted area where Germanic and French-derived words coexist is that of domestic or game animals and the meats produced from them. The nouns for meats are often different from, and unrelated to, those for the corresponding animals, the animal commonly having a Germanic name and the meat having a French-derived one. Examples include: deer and venison; cow and beef; swine/pig and pork; and sheep/lamb and mutton. This is assumed to be a result of the aftermath of the Norman conquest of England, where an Anglo-Norman-speaking elite were the consumers of the meat, produced by lower classes, which happened to be largely Anglo-Saxon, although a similar duality can also be seen in other languages like French, which did not undergo such linguistic upheaval (e.g. boeuf "beef" vs. vache "cow"). With the exception of beef and pork, the distinction today is gradually becoming less and less pronounced (venison is commonly referred to simply as deer meat, mutton is lamb, and chicken is both the animal and the meat over the more traditional term poultry. Use of the term mutton, however, remains, especially when referring to the meat of an older sheep, distinct from lamb; and poultry remains when referring to the meat of birds and fowls in general.)
There are Latinate words that are used in everyday speech. These words no longer appear Latinate and oftentimes have no Germanic equivalents. For instance, the words mountain, valley, river, aunt, uncle, move, use, and push are Latinate. Likewise, the inverse can occur: acknowledge, meaningful, understanding, mindful, lavish, behaviour, forbearance, behoove, forestall, allay, rhyme, starvation, embodiment come from Anglo-Saxon, and allegiance, abandonment, debutant, feudalism, seizure, guarantee, disregard, wardrobe, disenfranchise, disarray, bandolier, bourgeoisie, debauchery, performance, furniture, gallantry are of Germanic origin, usually through the Germanic element in French, so it is oftentimes impossible to know the origin of a word based on its register.
English easily accepts technical terms into common usage and often imports new words and phrases. Examples of this phenomenon include contemporary words such as cookie, Internet and URL (technical terms), as well as genre, über, lingua franca and amigo (imported words/phrases from French, German, Italian, and Spanish, respectively). In addition, slang often provides new meanings for old words and phrases. In fact, this fluidity is so pronounced that a distinction often needs to be made between formal forms of English and contemporary usage.
The vocabulary of English is undoubtedly very large, but assigning a specific number to its size is more a matter of definition than of calculation – and there is no official source to define accepted English words and spellings in the way that the French Académie française and similar bodies do for other languages.
Archaic, dialectal, and regional words might or might not be widely considered as "English", and neologisms are continually coined in medicine, science, technology and other fields, along with new slang and adopted foreign words. Some of these new words enter wide usage while others remain restricted to small circles.
The General Explanations at the beginning of the Oxford English Dictionary state:
The Vocabulary of a widely diffused and highly cultivated living language is not a fixed quantity circumscribed by definite limits... there is absolutely no defining line in any direction: the circle of the English language has a well-defined centre but no discernible circumference.
The current FAQ for the OED further states:
How many words are there in the English language? There is no single sensible answer to this question. It's impossible to count the number of words in a language, because it's so hard to decide what actually counts as a word.
The Oxford English Dictionary, 2nd edition (OED2) includes over 600,000 definitions, following a rather inclusive policy:
It embraces not only the standard language of literature and conversation, whether current at the moment, or obsolete, or archaic, but also the main technical vocabulary, and a large measure of dialectal usage and slang (Supplement to the OED, 1933).
The editors of Webster's Third New International Dictionary, Unabridged include 475,000 main headwords, but in their preface they estimate the true number to be much higher. Comparisons of the vocabulary size of English to that of other languages are generally not taken very seriously by linguists and lexicographers. Besides the fact that dictionaries will vary in their policies for including and counting entries, what is meant by a given language and what counts as a word do not have simple definitions. Also, a definition of word that works for one language may not work well in another, with differences in morphology and orthography making cross-linguistic definitions and word-counting difficult, and potentially giving very different results. Linguist Geoffrey K. Pullum has gone so far as to compare concerns over vocabulary size (and the notion that a supposedly larger lexicon leads to "greater richness and precision") to an obsession with penis length.
In December 2010 a joint Harvard/Google study found the language to contain 1,022,000 words and to expand at the rate of 8,500 words per year. The findings came from a computer analysis of 5,195,769 digitised books. Others have estimated a rate of growth of 25,000 words each year.
One of the consequences of the French influence is that the vocabulary of English is, to a certain extent, divided between those words that are Germanic (mostly West Germanic, with a smaller influence from the North Germanic branch) and those that are "Latinate" (derived directly from Latin, or through Norman French or other Romance languages). The situation is further complicated because French, particularly Old French and Anglo-French, also contributed significant numbers of Germanic words to English, mostly from the Frankish element in French (see List of English Latinates of Germanic origin).
The majority (estimates range from roughly 50% to more than 80%) of the thousand most common English words are Germanic. However, the majority of more advanced words in subjects such as the sciences, philosophy and mathematics come from Latin or Greek, with Arabic also providing many words in astronomy, mathematics, and chemistry.
[Table, data rows not preserved. Columns: 1st 100, 1st 1,000, 2nd 1,000, Subsequent. Source: Nation 2001, p. 265]
Numerous sets of statistics have been proposed to demonstrate the proportionate origins of English vocabulary. None, as yet, is considered definitive by most linguists.
A computerised survey of about 80,000 words in the old Shorter Oxford Dictionary (3rd ed.) was published in Ordered Profusion by Thomas Finkenstaedt and Dieter Wolff (1973) that estimated the origin of English words as follows:
Many words of Old Norse origin have entered the English language, primarily from the Viking colonisation of eastern and northern England between 800 and 1000 during the Danelaw. These include common words such as anger, awe, bag, big, birth, blunder, both, cake, call, cast, cosy, cross, cut, die, dirt, drag, drown, egg, fellow, flat, flounder, gain, get, gift, give, guess, guest, gust, hug, husband, ill, kid, law, leg, lift, likely, link, loan, loose, low, mistake, odd, race (running), raise, root, rotten, same, scale, scare, score, seat, seem, sister, skill, skin, skirt, skull, sky, stain, steak, sway, take, though, thrive, Thursday, tight, till (until), trust, ugly, want, weak, window, wing, wrong, the pronoun they (and its forms), and even the verb are (the present plural form of to be) through a merger of Old English and Old Norse cognates. More recent Scandinavian imports include angstrom, fjord, geyser, kraken, litmus, nickel, ombudsman, saga, ski, slalom, smorgasbord, and tungsten.
A large portion of English vocabulary is of French or Langues d'oïl origin, and was transmitted to English via the Anglo-Norman language spoken by the upper classes in England in the centuries following the Norman Conquest. Words of Norman French origin include competition, mountain, art, table, publicity, role, pattern, joust, choice, and force. As a result of the length of time they have been in use in English, these words have been anglicised to fit English rules of phonology, pronunciation and spelling.
Some French words were adopted during the 17th to 19th centuries, when French was the dominant language of Western international politics and trade. These words can normally be distinguished because they retain French rules for pronunciation and spelling, including diacritics, are often phrases rather than single words, and are sometimes written in italics. Examples include police, routine, machine, façade, table d'hôte and affaire de cœur. These words and phrases retain their French spelling and pronunciation because historically their French origin was emphasised to mark the speaker as educated or well-travelled at a time when education and travelling were still restricted to the middle and upper classes, and so their use implied a higher social status in the user. (See also: French phrases used by English speakers).
Many words describing the navy, types of ships, and other objects or activities on the water are of Dutch origin. Yacht, skipper, cruiser, flag, freight, furlough, breeze, hoist, iceberg, boom, duck ("fabric, cloth"), and maelstrom are examples. Other words pertain to art and daily life: easel, etch, slim, staple (Middle Dutch stapel "market"), slip (Middle Dutch slippen), landscape, cookie, curl, shock, aloof, boss, brawl (brallen "to boast"), smack (smakken "to hurl down"), shudder, scum, peg, coleslaw, waffle, dope (doop "dipping sauce"), slender (Old Dutch slinder), slight, gas, pump. Dutch has also contributed to English slang, e.g. spook, and the now obsolete snyder (tailor) and stiver (small coin).
Words from Low German include bluster, cower, dollar, drum, geek, grab, lazy, mate, monkey, mud, ogle, orlop, paltry, poll, poodle, prong, scurvy, smug, smuggle, trade.
Since around the 9th century, English has been written in the Latin script, which replaced Anglo-Saxon runes. The modern English alphabet contains 26 letters of the Latin script: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z (which also have majuscule, capital or uppercase forms: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z). Other symbols used in writing English include the ligatures, æ and œ (though these are no longer common). There is also some usage of diacritics, mainly in foreign loanwords (like the acute accent in café and exposé), and in the occasional use of a diaeresis to indicate that two vowels are pronounced separately (as in naïve, Zoë). For more information see English terms with diacritical marks.
The spelling system, or orthography, of English is multilayered, with elements of French, Latin and Greek spelling on top of the native Germanic system; further complications have arisen through sound changes with which the orthography has not kept pace. This means that, compared with many other languages, English spelling is not a reliable indicator of pronunciation and vice versa (it is not, generally speaking, a phonemic orthography).
Though letters and sounds may not correspond in isolation, spelling rules that take into account syllable structure, phonetics, and accents are 75% or more reliable. Some phonics spelling advocates claim that English is more than 80% phonetic. However, English has fewer consistent relationships between sounds and letters than many other languages; for example, the letter sequence ough can be pronounced in 10 different ways. The consequence of this complex orthographic history is that reading can be challenging. It takes longer for students to become completely fluent readers of English than of many other languages, including French, Greek, and Spanish. English-speaking children have been found to take up to two years longer to learn to read than children in 12 other European countries.
As regards the consonants, the correspondence between spelling and pronunciation is fairly regular. The letters b, d, f, h, j, k, l, m, n, p, r, s, t, v, w, z represent, respectively, the phonemes /b/, /d/, /f/, /h/, /dʒ/, /k/, /l/, /m/, /n/, /p/, /r/, /s/, /t/, /v/, /w/, /z/ (as tabulated in the Consonants section above). The letters c and g normally represent /k/ and /ɡ/, but there is also a soft c pronounced /s/, and a soft g pronounced /dʒ/. Some sounds are represented by digraphs: ch for /tʃ/, sh for /ʃ/, th for /θ/ or /ð/, ng for /ŋ/ (also ph is pronounced /f/ in Greek-derived words). Doubled consonant letters (and the combination ck) are generally pronounced as single consonants, and qu and x are pronounced as the sequences /kw/ and /ks/. The letter y, when used as a consonant, represents /j/. However, this set of rules is not applicable without exception; many words have silent consonants or other cases of irregular pronunciation.
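The consonant correspondences described above can be written out as a small lookup, given here as a rough sketch rather than a real grapheme-to-phoneme system. The function name `consonant_phonemes` is hypothetical, and the sketch makes several simplifying assumptions: c and g always take their hard values, th is always rendered as /θ/, y counts as a consonant only at the start of a word, and vowel letters are skipped entirely.

```python
# Single consonant letters and the phonemes they regularly represent.
SINGLE = {
    "b": "b", "d": "d", "f": "f", "h": "h", "j": "dʒ", "k": "k",
    "l": "l", "m": "m", "n": "n", "p": "p", "r": "r", "s": "s",
    "t": "t", "v": "v", "w": "w", "z": "z",
    "c": "k", "g": "ɡ",  # assumption: hard values only, soft c and g ignored
    "x": "ks",
}

# Two-letter combinations that are pronounced as a unit.
DIGRAPHS = {
    "ch": "tʃ", "sh": "ʃ", "ng": "ŋ", "ph": "f", "ck": "k", "qu": "kw",
    "th": "θ",  # assumption: the voiced value /ð/ is not modelled
}


def consonant_phonemes(word):
    """Apply the regular consonant rules to a spelling, skipping vowels."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            phonemes.append(DIGRAPHS[pair])
            i += 2
        elif len(pair) == 2 and pair[0] == pair[1] and pair[0] in SINGLE:
            # A doubled consonant letter is pronounced as a single consonant.
            phonemes.append(SINGLE[pair[0]])
            i += 2
        elif word[i] == "y" and i == 0:
            # Assumption: y is treated as a consonant only word-initially.
            phonemes.append("j")
            i += 1
        elif word[i] in SINGLE:
            phonemes.append(SINGLE[word[i]])
            i += 1
        else:
            i += 1  # vowel letters and anything unrecognised are skipped
    return phonemes


print(consonant_phonemes("ship"))
print(consonant_phonemes("quick"))
print(consonant_phonemes("knight"))
```

For regular spellings such as ship and quick the rules give the expected consonants. For knight the sketch returns five consonants although only /n/ and /t/ are pronounced, which illustrates the exceptions caused by silent letters.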
With the vowels, however, correspondences between spelling and pronunciation are even more irregular. As can be seen under Vowels above, there are many more vowel phonemes in English than there are vowel letters (a, e, i, o, u, y). This means that diphthongs and other long vowels often need to be indicated by combinations of letters (like the oa in boat and the ay in stay), or using a silent e or similar device (as in note and cake). Even these devices are not used consistently, so vowel pronunciation remains the main source of irregularity in English orthography.
Wikipedia's India estimate of 350 million includes two categories: "English Speakers" and "English Users". Users can only read English words, whereas Speakers can read English, understand spoken English and form their own sentences to converse in English. The distinction becomes clear when one considers the figures for China, which has an estimated 200 to 350 million users who can read English words but only a handful of million who are English speakers.
Hence we exclude all words that had become obsolete by 1150 [the end of the Old English era]... Dialectal words and forms which occur since 1500 are not admitted, except when they continue the history of the word or sense once in general use, illustrate the history of a word, or have themselves a certain literary currency.