Saturday, December 14, 2024

Attempt to redo the models from recent papers.

 


Modeling Armenia LBA as a simple mixture of Kura-Araxes and Catacomb fails: the p-value is too low (pic. 1).

Adding an Anatolian population such as İkiztepe_C improves the p-value. This approach was used in Skourtanioti et al. (2024), and the model becomes feasible (p ≈ 0.2).

Using Çamlıbel Tarlası_LateC instead of İkiztepe_C further improves the p-value. This was done in Yediay et al. (2024), yielding p > 0.3.

Replacing the Anatolian component with Leyla Tepe (from what is now Azerbaijan) improves the model much more (p > 0.7). However, the standard errors increase substantially because Kura-Araxes and Leyla Tepe are too close genome-wide. To reduce the standard errors, I need to adjust the settings, specifically the set of right (reference) populations. That will require some time, but I think it is already quite clear what is happening here.
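The effect of closely related sources on the standard errors can be illustrated with a toy least-squares fit. This is only a sketch under stated assumptions: qpAdm works on f4-statistics with a block jackknife, not on the simple per-marker regression used here, and the function names `two_source_fit` and `mix`, the frequency vectors, and the noise level are all made up for illustration.

```python
import math

def two_source_fit(target, src_a, src_b, noise_sd):
    """Fit target ≈ w*src_a + (1-w)*src_b by least squares.

    Returns (w, se), where se is the standard error of w under
    independent noise of standard deviation noise_sd per marker.
    """
    diffs = [a - b for a, b in zip(src_a, src_b)]
    denom = sum(d * d for d in diffs)  # shrinks as the two sources converge
    w = sum((t - b) * d for t, b, d in zip(target, src_b, diffs)) / denom
    return w, noise_sd / math.sqrt(denom)

def mix(w, src_a, src_b):
    """Build a noise-free target with weight w on src_a."""
    return [w * a + (1 - w) * b for a, b in zip(src_a, src_b)]

# Hypothetical allele frequencies at four markers (not real data).
base = [0.10, 0.20, 0.30, 0.40]
distant = [0.30, 0.05, 0.45, 0.20]   # clearly different from base
close = [0.11, 0.19, 0.31, 0.39]     # almost identical to base

w_far, se_far = two_source_fit(mix(0.6, distant, base), distant, base, 0.01)
w_close, se_close = two_source_fit(mix(0.6, close, base), close, base, 0.01)
print(f"distant sources: w={w_far:.2f}, SE={se_far:.3f}")
print(f"close sources:   w={w_close:.2f}, SE={se_close:.3f}")
```

Both fits recover the same mixture weight, but the standard error scales with the inverse of the distance between the two sources, so the near-identical pair gives a far less precise estimate.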

Finally, when I add Ukraine Neolithic hunter-gatherers (Ukraine_N HG) as an additional source, the p-value rises to ~0.9, i.e., the model fits the data very well. The standard errors also improve slightly, although they still require further refinement.

Conclusions

The populations that contributed to the formation of the Trialeti–Vanadzor culture (TVC) likely came from what is now the Krasnodar region, where they acquired a minor Ukraine_N HG component. They then moved via Dagestan, mixed with older South Caucasus Chalcolithic groups in the Kura-Araxes lowlands, and subsequently split into two branches:

  • One branch moved upstream into the Kura–Debed region and formed the TVC.

  • The other moved into the Urmia and Van basins and formed the Van–Urmia culture, although their genetic impact there appears to have been lower than in the TVC.

The Y-DNA associated with these groups should therefore be sought in the Krasnodar region, especially lineages such as L584, I2a2b, and PF331. Identifying Y4364 will be more difficult. It is even possible that the true homeland of Proto-Yamnaya was also located there.






Wednesday, December 11, 2024

Comments on the Chart Before Reviewing Yediay et al. (2024)


Before reviewing the recent Yediay et al. (2024) preprint, a few comments about this chart are necessary.

Armenia Middle and Late Bronze Age samples clearly show ancestry from three main sources.

1. Kura–Araxes ancestry

The first component comes from Armenia EBA, associated with the Kura–Araxes culture. This is expected and does not require further explanation.

2. “Anatolia_C” ancestry

The second component is labeled Anatolia_C. At first glance this may seem surprising, but it actually has a straightforward explanation.

The Anatolia_C component largely derives from Neolithic and Chalcolithic populations of historic Armenia, associated with the Chaff-Faced Ware (CFW) cultural horizon. These groups lived before the rise of the Kura–Araxes culture and were also present in what is now Azerbaijan, where they are known archaeologically as the Leyla Tepe culture.

When the Caucasus hunter-gatherer–shifted Kura–Araxes culture expanded from its homeland in the South Caucasus, it did not completely replace these earlier populations. In many areas, they continued to live alongside the Kura–Araxes communities.

When groups carrying steppe ancestry arrived from the north about 4,500 years ago, they most likely entered the region through what is now Azerbaijan, where they encountered the remaining populations associated with the Leyla Tepe culture. After mixing with them and acquiring this Anatolia_C-like ancestry, they moved toward the upper Kura–Debed river region, where they mixed with populations related to Armenia EBA.

Another possible source of Anatolia_C-like ancestry may have been populations living in the southern parts of the Araxes plain.

This pattern was also noted in Skourtanioti et al. (2024). However, the authors interpreted it incorrectly, proposing two separate migrations occurring during the same period—one from Anatolia and another from the steppe. Genetic bloggers had already pointed out earlier that this explanation is unlikely, and Davidski even opened a dedicated discussion thread on the issue.

3. Apparent CWC ancestry

The third component is the so-called CWC ancestry. This is almost certainly not a real signal.

The Corded Ware culture (CWC) is strongly associated with R1a, yet no R1a lineages have been found in Middle and Late Bronze Age samples from the South Caucasus (including Armenia and Georgia).

The reason CWC appears in the model is likely that the steppe groups who migrated into the South Caucasus carried a small amount of WHG-related ancestry. This additional WHG / Ukraine Neolithic hunter-gatherer affinity was already present east of the Azov region even before the formation of the Yamnaya culture.

This additional component could also explain the appearance of I2a2b in ancient Armenia.

It is important to note that CWC is genetically similar to Yamnaya, but it contains roughly:

  • 10% additional UNHG-related ancestry

  • 20% Euro-Anatolian farmer ancestry

Even a very small 1% WHG introgression into a Yamnaya-like population can create a statistical signal resembling 10% CWC ancestry in modeling. Given that MLBA samples already contain excess Anatolian ancestry, it is not surprising that the calculator interpreted this mixture as CWC rather than Yamnaya.
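The arithmetic behind this can be sketched as follows. The sketch assumes that the WHG and UNHG excess behave interchangeably in the model, that the steppe source must be either Yamnaya or CWC, and that the rough percentages given above hold. The function name `apparent_cwc_share` is hypothetical.

```python
def apparent_cwc_share(extra_hg, hg_excess_in_cwc=0.10):
    """Share of CWC a model needs in order to absorb a given
    hunter-gatherer excess, if CWC is the only source carrying it."""
    return extra_hg / hg_excess_in_cwc

# 1% extra hunter-gatherer ancestry in a Yamnaya-like population.
cwc_share = apparent_cwc_share(0.01)

# That CWC share also brings along farmer ancestry (about 20% of CWC),
# which the existing Anatolian excess in MLBA samples can easily hide.
farmer_carried = cwc_share * 0.20

print(f"apparent CWC share: {cwc_share:.0%}")
print(f"farmer ancestry implied by that share: {farmer_carried:.0%}")
```

Under these assumptions the apparent CWC share comes out at about 10%, with only about 2% farmer ancestry attached to it.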

Thus, the CWC component in this case is an artifact of the modeling, although it reflects real underlying genetic processes.

Urartian period samples

A similar situation can be observed in Urartu-period samples, where the Anatolia_C component is higher. This indicates a stronger Neolithic-derived ancestry, while the steppe component is lower.

There is one exception—an outlier individual, who appears to have been a migrant from Etiuni.


Sunday, December 1, 2024

A Greek Sample in the Armenian Genetic Cluster


One of the Greek samples in Hovhannisyan et al. (2024) falls within the modern Armenian cluster on the PCA. I was unable to find detailed information about this individual, except that the Greek DNA used in the paper was taken from Lazaridis et al. (2014, Nature).

Based on its position on the PCA, the sample does not appear to belong to Cappadocian Greeks. It also does not resemble Trabzon Greeks. The most likely possibility is that the individual belongs to the Urum population from southwestern Georgia.

The Urums settled in the Tsalka region of Georgia after the Russo-Turkish wars of the 19th century. We have one Urum individual in our group whose DNA appears similar to that sample. Of course, this remains only a hypothesis and requires further verification.

It should also be noted that Urums should not be confused with the Urumu tribe of the Iron Age. The word Urum derives from Rome / the Roman Empire, whereas the name Urumu has an uncertain origin, although it may be related to Aramu.

In the second PCA, the main modern Greek cluster can be observed, clearly separate from modern Armenians. Compared with Mycenaean-period Greeks, the modern Greek cluster appears somewhat shifted toward the north.

Another Greek sample is almost certainly of Anatolian origin, as it plots roughly midway between the Armenian and Greek clusters.



Friday, November 29, 2024

A new alphabetic system was apparently discovered in north Syria.


A new alphabetic writing system has apparently been discovered in northern Syria. It may represent the oldest known alphabet, dated to around 2400 BCE.

Until now, it was generally assumed that the first alphabet developed from Egyptian hieroglyphs, with later modifications appearing in Sinai and spreading from there to the Levant and Phoenicia.

However, this newly discovered script from ancient Syria appears to be older than the Proto-Sinaitic alphabet. If confirmed, this finding could significantly change our understanding of how and when alphabetic writing systems first emerged and evolved.

Wednesday, November 27, 2024

A Map Explaining the Formation of Modern Armenian Genetics


I created this map to illustrate how modern Armenian genetics formed. The map represents the genetic situation during the Middle Bronze Age, roughly 4,000 years ago. I deliberately chose these colors to emphasize the clinal nature of the genetic landscape.

The yellow area represents the Trialeti–Vanadzor / Lchashen cultural sphere, which shows high levels of steppe ancestry. In this context, these populations are usually associated with Etiuni.

The orange region has lower steppe ancestry, approximately comparable to that of modern Armenians. We have a few samples from this zone, including Van–Urartu.

The red region shows little or no steppe ancestry and instead has a stronger affinity to Levantine Bronze Age populations. It is notable that these areas were historically inhabited by Hurrians. We have some samples from Şırnak and Batman, although they are not recent enough to fully represent the situation during the Middle and Late Bronze Age. The Dinkha Tepe 2 sample dates to the Middle Bronze Age, but it comes from northwestern Iran, so it is not exactly representative of the red zone.

Further south, the Levantine lowlands were inhabited by populations genetically similar to the red region, but with a more pronounced southern shift. Numerous samples from sites such as Alalakh and Ebla illustrate this pattern.

Modern Armenians derive ancestry from all three regions—orange, yellow, and red. For most Armenians, the largest contribution comes from the orange region. Eastern Armenians show additional ancestry from the yellow zone, while Armenians from southwestern regions have significant orange ancestry but also some contribution from the red zone.

An important point to understand is that the orange region itself can be modeled as a mixture of yellow and red. In theory, this would allow us to reduce the number of colors used in the model, but doing so risks oversimplifying the situation. In practice, some alleles typical of the red region appear among eastern Armenians, while Armenians from southern and western areas also carry some alleles associated with the yellow region. Overall, these overlapping contributions cause all Armenian groups to cluster closely together on PCA plots.
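The reduction from three colors to two can be sketched as simple bookkeeping. The weights below and the 50/50 composition of orange are hypothetical placeholders, not results, and `collapse_orange` is a name invented for this illustration.

```python
def collapse_orange(yellow, orange, red, yellow_share_of_orange):
    """Re-express a yellow/orange/red model using only yellow and red,
    given the fraction of orange that is itself yellow-derived."""
    y = yellow + orange * yellow_share_of_orange
    r = red + orange * (1 - yellow_share_of_orange)
    return y, r

# Hypothetical three-way model for one Armenian group (not a real result).
y, r = collapse_orange(yellow=0.25, orange=0.60, red=0.15,
                       yellow_share_of_orange=0.5)
print(f"yellow={y:.2f}, red={r:.2f}")
```

The two-color version preserves the totals but no longer says how much of the yellow and red ancestry arrived already mixed through the orange zone, which is the information the simplification loses.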

Another key point is that modern Armenians do not show any significant additional ancestry from outside these colored regions. Of course, some sporadic influences occurred during later historical periods, but these are generally negligible and can usually be ignored in population-level calculations. Armenians who settled outside these regions sometimes acquired local ancestry, but such cases are historically documented and can be easily identified.

A reasonable question arises: why are samples from these three regions not directly used to model Armenians?

The issue likely relates to how modeling tools operate. When very closely related populations are used as sources, the standard errors increase, whereas using more distant populations often reduces them. Despite some exaggerated perceptions, the populations represented by these three colors are actually quite close genetically. For this reason, it can sometimes be easier to choose a more distant source from south of the red zone and obtain statistically feasible models. There may also be other technical factors involved that I am not fully aware of.

However, the real issue is not the models themselves. For example, Lazaridis also used Levant_N as a distal source and argued that its contribution increased after 600 BCE, yet this did not lead to sensationalist interpretations in the media. The real problem is the lack of historical interpretation accompanying many genetic models. When genetic results are not interpreted in the context of known historical processes, it is unsurprising that others interpret them according to their own narratives.

In this case, the relevant historical events are well known. One is the existence of a Hurrian cultural belt across the southern regions of historic Armenia, which likely had a more southern genetic profile. Another is the formation and expansion of the Urartian Empire. These two factors alone are sufficient to explain the main features of the modern Armenian genetic profile, although other events may also have played a role.

Hopefully, our paper with Armen Petrosyan will soon be published in English. In it, we discuss this period of genetic shift in eastern Armenia, and I hope it will help those who want to better understand this complex historical process.

PS: In the comments below, you can see a model mixing yellow and orange, with high standard errors, made by Nareg Asatrian.


Tuesday, November 26, 2024

Sasun Armenians in Hovhannisyan et al. (2024)

 


Hovhannisyan et al. (2024) published, for the first time, five genome-wide DNA samples of Sasun Armenians. Until now, we only had Y-DNA studies of Sasun Armenians, which showed that their Y-DNA pool differs somewhat from that of other Armenian subgroups (see picture 2). Various theories have been proposed to explain this difference based on historical records and local traditions.

The new paper examined this issue and found little difference between the autosomes of Sasun Armenians and other Armenian subgroups. This can be seen on the PCA, where the Sasun samples (marked S) plot close to other Armenians (marked E, W, and C). All five Sasun samples fall on the southern side of the Armenian cluster, which corresponds well with their geographic location.

When the G25 coordinates of these samples become available, we will be able to examine them more closely.

Y-DNA Peculiarities

Understanding the distinct Y-DNA composition of Sasun Armenians will be difficult without ancient DNA from the region.

Haplogroup T likely had a homeland near or overlapping with the Sasun region. Meanwhile, the presence of R2 in Sasun may reflect a founder effect. Haplogroup R2 was prominent among Zagros Neolithic farmers and has recently also been identified among South Caucasus Neolithic populations.

Historical Context

The Y-DNA profile of Sasun may also be connected with the specific historical background of the region.

Assyrian sources mention a kingdom called Shubria in this area. The name of this kingdom derives from the older Sumerian term Subir. Very little is known about the Subir people, but later sources use the term Subarean language to refer to a Hurrian language. In the Iron Age, several Hurrian royal names are attested in this region. However, this does not necessarily mean that the earlier Subir populations were Hurrian as well.

The southern lowlands of Sasun had a Semitic presence, while in the north, in the Mush region, the Urumu tribes are attested. The Urumu, later known as Urme, were almost certainly an Armenian-speaking tribe.

Around 400 BCE, Xenophon described the Centrites River (modern Botan River) as the southern boundary of Armenian territory. Sasun lies north of this river, placing it clearly within the Armenian satrapy.

Conclusion

To fully understand the complex genetic history of Sasun and its surrounding regions, additional ancient DNA samples will be necessary.




Sunday, November 24, 2024

The Distribution of EHG Ancestry Today


The Eastern Hunter-Gatherer (EHG) genetic profile appears in Eastern Europe after the Last Glacial Maximum (around 20,000 years before present). Before that time, the region was inhabited by different populations that apparently disappeared due to extremely cold climatic conditions.

EHG samples are found across a wide geographic area, ranging from the North Caucasus to Karelia in the far north of Eastern Europe. Various maps on the internet attempt to illustrate the global distribution of EHG ancestry today. However, these maps require some clarification (see the link in the comment section).

Two Ways to Measure EHG Ancestry

There are two main ways to estimate the amount of EHG ancestry remaining in modern populations.

The first approach ignores the fact that much of the EHG ancestry was dispersed through the expansions of Yamnaya and Corded Ware populations. This method is commonly used, but it can be misleading. Because EHG constituted roughly half of the Yamnaya genetic profile, people may mistakenly assume that higher EHG levels automatically imply greater Yamnaya ancestry, which is not necessarily correct.

The second approach attempts to separate Yamnaya and Corded Ware ancestry from the total EHG signal, in order to identify the amount of “pure” EHG ancestry that remained independent of those migrations.
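The difference between the two approaches can be sketched with simple arithmetic. The sketch assumes that EHG makes up about half of the Yamnaya profile, as stated above. The two populations and their values are hypothetical, and `total_ehg` is a name invented for this illustration.

```python
def total_ehg(pure_ehg, yamnaya_share, ehg_in_yamnaya=0.5):
    """Total EHG-related ancestry as the first approach would report it:
    independent ("pure") EHG plus the EHG carried inside Yamnaya-related
    ancestry."""
    return pure_ehg + yamnaya_share * ehg_in_yamnaya

# Two hypothetical populations (not real estimates).
northeastern = total_ehg(pure_ehg=0.25, yamnaya_share=0.10)
northwestern = total_ehg(pure_ehg=0.05, yamnaya_share=0.50)

print(f"northeastern total EHG: {northeastern:.2f}")
print(f"northwestern total EHG: {northwestern:.2f}")
```

Both populations end up with the same total, even though one has five times the Yamnaya-related ancestry of the other. This is why the first approach cannot be read as a measure of Yamnaya ancestry, and why the second approach models the steppe-derived source separately.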

Modeling Method

To do this, I selected Corded Ware samples as a source population, since Yamnaya itself never moved into northern Europe—only Corded Ware groups derived from Yamnaya did.

I also included Ancient North Eurasian (ANE) samples from Siberia in order to avoid a pseudo-EHG signal, and used Karelia hunter-gatherers as a reference for pure EHG.

All modern populations were included in the analysis.

Results: Pure EHG

The highest levels of pure EHG ancestry not associated with Yamnaya migrations are found among:

  • Mari

  • Chuvash (a Turkic-speaking group)

  • Saami

  • some northern Russians

  • Udmurts

The highest value reaches about 33%, but most of these populations have less than 25%.

This indicates that relatively little pure EHG ancestry survives today outside the context of Yamnaya or Corded Ware expansions. It is mostly preserved in northeastern Europe, which makes sense because Corded Ware pastoralists never settled extensively in that region. The harsh climate likely made herding and early agriculture difficult, limiting their expansion there.

Corded Ware / Yamnaya Ancestry

The second chart shows where Corded Ware ancestry is highest today.

The peak levels occur in northern Europe, particularly among Germanic-speaking populations in Scandinavia, reaching about 53%.

Using Yamnaya instead of Corded Ware as a source produces essentially the same pattern. In other words, Yamnaya-related ancestry is highest in northwestern Europe.

This has a simple explanation: northern Europe, especially Scandinavia, had relatively low population density in prehistoric times, whereas southern Europe, West Asia, and South Asia had much denser populations. Migrating groups therefore left a larger genetic impact in sparsely populated regions.
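As a rough illustration of this reasoning, the share of migrant ancestry after a single mixing event depends only on the relative sizes of the two groups. The population sizes below are invented, and the sketch assumes a one-off admixture with equal reproductive success on both sides. `migrant_share` is a hypothetical helper.

```python
def migrant_share(migrants, local_population):
    """Expected migrant ancestry after one round of random mixing."""
    return migrants / (migrants + local_population)

# The same hypothetical migrant group entering two different regions.
sparse_region = migrant_share(10_000, 10_000)
dense_region = migrant_share(10_000, 90_000)

print(f"sparsely populated region: {sparse_region:.0%}")
print(f"densely populated region:  {dense_region:.0%}")
```

With these numbers the same group of migrants accounts for half of the ancestry in the sparse region but only a tenth in the dense one.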

Linguistic Implications

What does this distribution suggest about the language spoken by the northern EHG populations?

Since the highest levels of pure EHG are found only among a subset of Uralic-speaking groups, it is unlikely that the northern EHG originally spoke a Uralic language.

Moreover, many eastern Uralic-speaking populations have little or no EHG ancestry, although they do possess Yamnaya-related ancestry. The defining genetic feature of eastern Uralic speakers in Europe is the presence of Siberian / Nganasan-related ancestry, while their most frequent Y-DNA haplogroup (N1) also originates from Siberia.

In addition, these northern populations virtually lack Y-DNA lineages associated with EHG. Any R1a present among them derives from Corded Ware expansions, not from earlier hunter-gatherer populations.

Taken together, this evidence suggests that the language spoken by the northern EHG populations is now extinct.

The Uralic-speaking populations likely arrived from Siberia sometime after 1500 BCE, while Indo-European groups in northern Europe—such as Balto-Slavic and Germanic speakers—descend largely from Corded Ware populations that expanded into the region after 2800 BCE.