The SNP mutation R-M420 was discovered after R-M17 (R1a1a1). Its discovery led to a reorganization of the lineage, in particular establishing a new paragroup (designated R-M420*) for the relatively rare lineages which are not in the R-SRY10831.2 (R1a1) branch leading to R-M17.
The modern distribution of R1a1a has two widely separated areas of high frequency: South Asia and Eastern Europe. Some researchers have claimed that South Asian populations had the highest STR diversity within R1a1a. Other studies have proposed Eastern European, Central Asian and even West Asian origins for R1a1a.
According to Pamjav et al. (2012), "Inner and Central Asia is an overlap zone for the R1a1-Z280 and R1a1-Z93 lineages [which] implies that an early differentiation zone of R1a1-M198 conceivably occurred somewhere within the Eurasian Steppes or the Middle East and Caucasus region as they lie between South Asia and Eastern Europe." A large 2014 study by Peter A. Underhill et al., using 16,244 individuals from over 126 populations from across Eurasia, concluded that there was compelling evidence that "the initial episodes of haplogroup R1a diversification likely occurred in the vicinity of present-day Iran."
The R1a family tree now has three major levels of branching, with the largest number of defined subclades within the dominant and best known branch, R1a1a (which will be found with various names; in particular, as "R1a1" in relatively recent but not the latest literature.)
R1a, distinguished by several unique markers including the M420 mutation, is a subclade of Haplogroup R-M173 (previously called R1). R1a has the sister-subclades Haplogroup R1b-M343, and the paragroup R-M173*.
R-M420, defined by the mutation M420, has two branches: R-SRY1532.2, defined by the mutation SRY1532.2, which makes up the vast majority; and R-M420*, the paragroup, defined as M420 positive but SRY1532.2 negative. (In the 2002 scheme, this SRY1532.2 negative minority was one part of the relatively rare group classified as the paragroup R1*.) Mutations understood to be equivalent to M420 include M449, M511, M513, L62, and L63.
R1a1 is defined by SRY1532.2 or SRY10831.2 (understood to always include SRY10831.2, M448, L122, M459, and M516). This family of lineages is dominated by M17 and M198. In contrast, paragroup R-SRY1532.2* lacks both the M17 and M198 markers.
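The branching just described can be sketched as a small decision procedure. This is only an illustration, not a published classification method: the function name `classify_r1a`, the dictionary input format, and the choice to treat untested markers as negative are all assumptions made for the example. It also assumes that SRY1532.2 and SRY10831.2 name the same marker, and that a sample positive for either M17 or M198 belongs to the dominant R1a1a family.

```python
def classify_r1a(markers):
    """Place a sample on the upper R1a tree from its SNP test results.

    `markers` maps a marker name to True (positive) or False (negative).
    Untested markers are treated as negative, a simplifying assumption
    of this sketch (real data would need an "unknown" state).
    """
    def positive(*names):
        # True if any of the listed (equivalent) markers tested positive.
        return any(markers.get(name, False) for name in names)

    if not positive("M420"):
        return "not R1a (M420 negative)"
    if not positive("SRY1532.2", "SRY10831.2"):
        # M420 positive but SRY1532.2 negative: the rare paragroup.
        return "R-M420*"
    if not positive("M17", "M198"):
        # SRY1532.2 positive but neither M17 nor M198.
        return "R-SRY1532.2*"
    return "R1a1a (R-M17 / R-M198)"


if __name__ == "__main__":
    sample = {"M420": True, "SRY1532.2": True, "M17": False, "M198": False}
    print(classify_r1a(sample))
```

Checking markers from the root of the tree downward mirrors the way the paragroups are defined: each one is "positive for the parent marker, negative for the marker of the next branch."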
The R-SRY1532.2* paragroup is apparently less rare than R1*, but still relatively unusual, though it has been tested in more than one survey. Underhill et al. (2009) reported 1/51 in Norway, 3/305 in Sweden, 1/57 Greek Macedonians, 1/150 Iranians, 2/734 ethnic Armenians, and 1/141 Kabardians. Sahoo et al. (2006) reported R-SRY1532.2* for 1/15 Himachal Pradesh Rajput samples.
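To make these counts easier to compare, they can be converted to percentages. The sketch below uses only the counts quoted above; the names `COUNTS` and `percent` are introduced for illustration. With samples this small (a single carrier in most cases), the resulting percentages are very rough estimates.

```python
# (carriers, sample size) for R-SRY1532.2*, as quoted in the text above.
COUNTS = {
    "Norway": (1, 51),
    "Sweden": (3, 305),
    "Greek Macedonians": (1, 57),
    "Iranians": (1, 150),
    "Ethnic Armenians": (2, 734),
    "Kabardians": (1, 141),
    "Himachal Pradesh Rajputs": (1, 15),
}


def percent(carriers, sample_size):
    """Return the observed frequency as a percentage."""
    return 100.0 * carriers / sample_size


if __name__ == "__main__":
    for population, (carriers, sample_size) in COUNTS.items():
        print(f"{population}: {carriers}/{sample_size} = "
              f"{percent(carriers, sample_size):.2f}%")
```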
R1a1a1 (R-M417) is the most widely found subclade, in two variations which are found respectively in Europe (R1a1a1b1 (R-Z282); [R1a1a1a*] (R-Z282) in Underhill 2014/2015) and in Central and South Asia (R1a1a1b2 (R-Z93); [R1a1a2*] (R-Z93) in Underhill 2014/2015).
R-M458 is a mainly Slavic SNP, characterized by its own mutation, and was first called cluster N. Underhill et al. (2009) found it to be present in modern European populations roughly between the Rhine catchment and the Ural Mountains and traced it to "a founder effect that [...] falls into the early Holocene period, 7.9±2.6 KYA." M458 was found in one skeleton from a 14th-century grave field in Usedom, Mecklenburg-Vorpommern, Germany. The paper by Underhill et al. (2009) also reports a surprisingly high frequency of M458 in some Northern Caucasian populations (for example 27.5% among Karachays and 23.5% among Balkars, 7.8% among Karanogays and 3.4% among Abazas).
R1a1a1b1a1a (R-L260), commonly referred to as West Slavic or Polish, is a subclade of the larger parent group R-M458, and was first identified as an STR cluster by Pawlowski 2002 and then by Gwozdz 2009. Thus, R-L260 was what Gwozdz 2009 called cluster "P." In 2010 it was verified to be a haplogroup identified by its own mutation (SNP). It apparently accounts for about 8% of Polish men, making it the most common subclade in Poland. Outside of Poland it is less common (Pawlowski 2002). In addition to Poland, it is mainly found in the Czech Republic and Slovakia, and is considered "clearly West Slavic." The founding ancestor of R-L260 is estimated to have lived between 2000 and 3000 years ago, i.e. during the Iron Age, with significant population expansion less than 1,500 years ago.
R-M334 ([R1a1a1g1], a subclade of [R1a1a1g] (M458), also named R1a1a1b1a1 (M458)) was found by Underhill et al. (2009) only in one Estonian man and may define a very recently founded and small clade.
R1a1a1b1a2b3* (M417+, Z645+, Z283+, Z282+, Z280+, CTS1211+, CTS3402, Y33+, CTS3318+, Y2613+) (Gwozdz's Cluster K) is an STR-based group that is R-M17(xM458). This cluster is common in Poland but not exclusive to Poland.
This large subclade appears to encompass most of the R1a1a found in Asia.
R1a1a1b2 [R1a1a2* (Underhill (2014))] (R-Z93) is most common (>30%) in the South Siberian Altai region of Russia, cropping up in Kyrgyzstan (6%) and in all Iranian populations (1–8%).
R1a1b2a* (R-Z2125): This subgroup occurs at highest frequencies in Kyrgyzstan and in Afghan Pashtuns (>40%). At a frequency of >10% it is also observed in other Afghan ethnic groups and in some populations in the Caucasus and Iran.
The table shows only positive sets, from N = 3667 individuals sampled across 60 Eurasian populations.
R-M434 is a subclade of Z2125. It was detected in 14 people (out of 3667 people tested) all in a restricted geographical range from Pakistan to Oman. This likely reflects a recent mutation event in Pakistan (Underhill 2009).
R1a1b2a1* (R-M560) is very rare and was only observed in four samples: two Burushaski speakers (north Pakistan), one Hazara (Afghanistan), and one Iranian Azerbaijani.
R1a1b2a2* (R-M780) occurs at high frequency in South Asia: India, Pakistan, Afghanistan, and the Himalayas. The group also occurs at >3% in some Iranian populations and is present at >30% in Roma from Croatia and Hungary.
Besides these, studies show high percentages in regionally diverse groups, from Manipuris (50%) (Underhill 2009) in the extreme North East to Punjab (47%) (Kivisild 2003) in the extreme North West.
Wells 2001 noted that Iranians in the western part of the country show low R1a1a levels, while males from eastern parts of Iran carried up to 35% R1a1a. Nasidze 2004 found R1a1a in approximately 20% of Iranian males from the cities of Tehran and Isfahan. Regueiro 2006, in a study of Iran, noted much higher frequencies in the south than in the north.
Further to the north of these Middle Eastern regions, on the other hand, R1a1a levels start to increase in the Caucasus, once again in an uneven way. Several populations studied have shown no sign of R1a1a, while the highest levels so far discovered in the region appear to belong to speakers of the Karachay-Balkar language, among whom about one quarter of men tested so far are in haplogroup R1a1a (Underhill 2009).
Haplogroup R1a1a was found at elevated levels among a sample of the Israeli population who self-designated as Levites and Ashkenazi Jews (Levites comprise approximately 4% of Jews). Behar reported R1a1a to be the dominant haplogroup in Ashkenazi Levites (52%), although rare in Ashkenazi Cohanim (1.3%) (Behar 2003).
The remains of a father and his two sons, from an archaeological site discovered in 2005 near Eulau (in Saxony-Anhalt, Germany) and dated to about 2600 BCE, tested positive for the Y-SNP marker SRY10831.2. The Ysearch number for the Eulau remains is 2C46S. The ancestral clade was thus present in Europe at least 4600 years ago, in association with one site of the widespread Corded Ware culture (Haak 2008).
The question of the origins of R1a1a is relevant to the ongoing debate concerning the urheimat of the proto-Indo-European people, and may also be relevant to the origins of the Indus Valley Civilisation. R1a shows a strong correlation with Indo-European languages of western Asia and eastern Europe, being most prevalent in Poland, Russia, and Ukraine and also observed in Pakistan, India and central Asia. The connection between Y-DNA R-M17 and the spread of Indo-European languages was first noted by T. Zerjal and colleagues in 1999. Ornella Semino and colleagues proposed a postglacial spread of the R1a1 gene during the Late Glacial Maximum, subsequently magnified by the expansion of the Kurgan culture into Europe and eastward. Spencer Wells suggests that the distribution and age of R1a1 point to an ancient migration corresponding to the spread of the Kurgan people in their expansion from the Eurasian steppe.
According to Underhill et al. (2014/2015) the diversification of Z93 and the "early urbanization within the Indus Valley also occurred at [5,600 years ago] and the geographic distribution of R1a-M780 (Figure 3d) may reflect this." Poznik et al. (2016) note that 'striking expansions' occurred within R1a-Z93 at ~4,500-4,000 years ago, which "predates by a few centuries the collapse of the Indus Valley Civilisation."
Mascarenhas et al. (2015) note that the expansion of Z93 from Transcaucasia into South Asia is compatible with "the archeological records of eastward expansion of West Asian populations in the 4th millennium BCE culminating in the so-called Kura-Araxes migrations in the post-Uruk IV period."
According to Lazaridis et al. (2016), "farmers related to those from Iran spread northward into the Eurasian steppe; and people related to both the early farmers of Iran and to the pastoralists of the Eurasian steppe spread eastward into South Asia." They further note that ANI "can be modelled as a mix of ancestry related to both early farmers of western Iran and to people of the Bronze Age Eurasian steppe."
Bryan Sykes in his book Blood of the Isles gives imaginative names to the founders or "clan patriarchs" of major British Y haplogroups, much as he did for mitochondrial haplogroups in his work The Seven Daughters of Eve. He named R1a1a in Europe the "clan" of a "patriarch" Sigurd, reflecting the theory that R1a1a in the British Isles has Norse origins.
The historic naming system commonly used for R1a was inconsistent in different published sources, because it changed often; this requires some explanation.
In 2002, the Y Chromosome Consortium (YCC) proposed a new naming system for haplogroups (YCC 2002), which has now become standard. In this system, names with the format "R1" and "R1a" are "phylogenetic" names, aimed at marking positions in a family tree. Names of SNP mutations can also be used to name clades or haplogroups. For example, as M173 is currently the defining mutation of R1, R1 is also R-M173, a "mutational" clade name. When a new branching in a tree is discovered, some phylogenetic names will change, but by definition all mutational names will remain the same.
The widely occurring haplogroup defined by mutation M17 was known by various names in the older naming systems, such as "Eu19" (Semino 2000). The 2002 YCC proposal assigned the name R1a to the haplogroup defined by mutation SRY1532.2. This included Eu19 (i.e. R-M17) as a subclade, so Eu19 was named R1a1. (SRY1532.2 is also known as SRY10831.2.) The discovery of M420 in 2009 caused a reassignment of these phylogenetic names (Underhill 2009 and ISOGG 2012). R1a is now defined by the M420 mutation: in this updated tree, the subclade defined by SRY1532.2 has moved from R1a to R1a1, and Eu19 (R-M17) from R1a1 to R1a1a.
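The difference between the two kinds of name can be illustrated with a small lookup. This is a sketch only: the dictionaries `TREE_2002` and `TREE_2009` and both helper functions are hypothetical names, and they contain just the assignments stated in the text above. The "2009" label refers to the tree as revised after the discovery of M420.

```python
# Phylogenetic name assigned to each defining mutation in each scheme,
# restricted to the assignments described in the text.
TREE_2002 = {"M173": "R1", "SRY1532.2": "R1a", "M17": "R1a1"}
TREE_2009 = {"M173": "R1", "M420": "R1a", "SRY1532.2": "R1a1", "M17": "R1a1a"}


def mutational_name(snp):
    """Mutational names depend only on the SNP, so they never change."""
    return f"R-{snp}"


def phylogenetic_name(snp, tree):
    """Phylogenetic names depend on the tree version; None if undefined."""
    return tree.get(snp)


if __name__ == "__main__":
    for snp in ("M420", "SRY1532.2", "M17"):
        print(mutational_name(snp),
              phylogenetic_name(snp, TREE_2002),
              phylogenetic_name(snp, TREE_2009))
```

Looking up the same SNP in both dictionaries shows why older papers that say "R1a1" usually mean the clade now labelled R1a1a, while "R-M17" refers to the same clade in either scheme.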
More recent updates recorded at the ISOGG reference webpage involve branches of R-M17, including one major branch, R-M417.
Contrasting family trees for R1a, showing the evolution of understanding of this clade
Van Oven M, Van Geystelen A, Kayser M, Decorte R, Larmuseau HD (2014). "Seeing the wood for the trees: a minimal reference phylogeny for the human Y chromosome". Human Mutation 35 (2): 187–91. doi:10.1002/humu.22468. PMID 24166809.
T. Zerjal et al., The use of Y-chromosomal DNA variation to investigate population history: recent male spread in Asia and Europe, in S.S. Papiha, R. Deka and R. Chakraborty (eds.), Genomic diversity: applications in human population genetics (1999), pp. 91–101.
Ornella Semino, Giuseppe Passarino, Peter J. Oefner, Alice A. Lin, Svetlana Arbuzova, Lars E. Beckman, Giovanna De Benedictis, Paolo Francalacci, Anastasia Kouvatsi, Svetlana Limborska, Mladen Marciki, Anna Mika, Barbara Mika, Dragan Primorac, A. Silvana Santachiara-Benerecetti, L. Luca Cavalli-Sforza, Peter A. Underhill, The Genetic Legacy of Paleolithic Homo sapiens sapiens in Extant Europeans: A Y Chromosome Perspective, Science, vol. 290 (10 November 2000), pp. 1155–1159.
R.S. Wells et al., The Eurasian Heartland: A continental perspective on Y-chromosome diversity, Proceedings of the National Academy of Sciences of the USA, vol. 98 no. 18 (2001), pp. 10244–10249.
Keyser, Christine; Bouakaze, Caroline; Crubézy, Eric; Nikolaev, Valery G.; Montagnon, Daniel; Reis, Tatiana; Ludes, Bertrand (2009). "Ancient DNA provides new insights into the history of south Siberian Kurgan people". Human Genetics 126 (3): 395–410. doi:10.1007/s00439-009-0683-0. ISSN 0340-6717.
Ricaut, F.; et al. (2004). "Genetic Analysis of a Scytho-Siberian Skeleton and Its Implications for Ancient Central Asian Migrations". Human Biology 76: 1.
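Корниенко И. В., Водолажский Д. И. Использование нерекомбинантных маркеров Y-хромосомы в исследованиях древних популяций (на примере поселения Танаис) // Материалы Донских антропологических чтений. Ростов-на-Дону, Ростовский научно-исследовательский онкологический институт, 2013. [Kornienko I. V., Vodolazhsky D. I. The use of non-recombinant Y-chromosome markers in studies of ancient populations (the example of the Tanais settlement). Proceedings of the Don Anthropological Readings. Rostov-on-Don, Rostov Research Institute of Oncology, 2013.]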
Kim, Kijeong; Brenner, Charles H.; Mair, Victor H.; Lee, Kwang-Ho; Kim, Jae-Hyun; Gelegdorj, Eregzen; Batbold, Natsag; Song, Yi-Chung; Yun, Hyeung-Won; Chang, Eun-Jeong; Lkhagvasuren, Gavaachimed; Bazarragchaa, Munkhtsetseg; Park, Ae-Ja; Lim, Inja; Hong, Yun-Pyo; Kim, Wonyong; Chung, Sang-In; Kim, Dae-Jin; Chung, Yoon-Hee; Kim, Sung-Su; Lee, Won-Bok; Kim, Kyung-Yong (2010). "A western Eurasian male is found in 2000-year-old elite Xiongnu cemetery in Northeast Mongolia". American Journal of Physical Anthropology 142 (3): 429–440. doi:10.1002/ajpa.21242. ISSN 0002-9483. PMID 20091844.
Bouakaze, C.; Keyser, C; Amory, S; Crubézy, E; Ludes, B (2007). "First successful assay of Y-SNP typing by SNaPshot minisequencing on ancient DNA". International Journal of Legal Medicine 121 (6): 493–9. doi:10.1007/s00414-007-0177-3. PMID 17534642.
Cordaux, Richard; Aunger, R; Bentley, G; Nasidze, I; Sirajuddin, SM; Stoneking, M (2004). "Independent Origins of Indian Caste and Tribal Paternal Lineages". Current Biology 14 (3): 231–235. doi:10.1016/j.cub.2004.01.024. PMID 14761656.
Flores, Carlos; Maca-Meyer, N; Larruga, JM; Cabrera, VM; Karadsheh, N; Gonzalez, AM (2005). "Isolates in a corridor of migrations: a high-resolution analysis of Y-chromosome variation in Jordan". Journal of Human Genetics 50 (9): 435–441. doi:10.1007/s10038-005-0274-4. PMID 16142507.
Hammer, Michael F.; Behar, Doron M.; Karafet, Tatiana M.; Mendez, Fernando L.; Hallmark, Brian; Erez, Tamar; Zhivotovsky, Lev A.; Rosset, Saharon; Skorecki, Karl (2009). "Response" (PDF). Human Genetics 126 (5): 725–726. doi:10.1007/s00439-009-0747-1.
Mukherjee, Namita; Nebel, Almut; Oppenheim, Ariella; Majumder, Partha P. (2001). "High-resolution analysis of Y-chromosomal polymorphisms reveals signatures of population movements from central Asia and West Asia into India". Journal of Genetics (December 2001) 80 (3): 125–135. doi:10.1007/BF02717908. PMID 11988631.
Passarino, G; Semino, Ornella; Magria, Chiara; Al-Zahery, Nadia; Benuzzi, Giorgia; Quintana-Murci, Lluis; Andellnovic, Slmun; Bullc-Jakus, Floriana; et al. (2001). "The 49a,f haplotype 11 is a new marker of the EU19 lineage that traces migrations from northern regions of the black sea". Hum. Immunol. 62 (9): 922–932. doi:10.1016/S0198-8859(01)00291-9. PMID 11543894.
Saha, Anjana; Sharma, S; Bhat, A; Pandit, A; Bamezai, R (2005). "Genetic affinity among five different population groups in India reflecting a Y-chromosome gene flow". Journal of Human Genetics 50 (1): 49–51. doi:10.1007/s10038-004-0219-3. PMID 15611834.
Sanchez, J; Børsting, C; Hallenberg, C; Buchard, A; Hernandez, A; Morling, N (2003). "Multiplex PCR and minisequencing of SNPs—a model with 35 Y chromosome SNPs". Forensic Sci Int 137 (1): 74–84. doi:10.1016/S0379-0738(03)00299-8. PMID 14550618.
Völgyi, Antónia; Zalán, Andrea; Szvetnik, Enikő; Pamjav, Horolma (2008). "Hungarian population data for 11 Y-STR and 49 Y-SNP markers". Forensic Science International: Genetics 3 (2): e27–8. doi:10.1016/j.fsigen.2008.04.006. PMID 19215861.
Wang, Wei; Wise, Cheryl; Baric, Tom; Black, Michael L.; Bittles, Alan H. (2003). "The origins and genetic structure of three co-resident Chinese Muslim populations: The Salar, Bo'an and Dongxiang". Human Genetics 113 (3): 244–52. doi:10.1007/s00439-003-0948-y. PMID 12759817.