Phylogeography of SARS-CoV-2 pandemic in Spain: a story of multiple introductions, micro-geographic stratification, founder effects, and super-spreaders

Alberto Gómez-Carballa; Xabier Bello; Jacobo Pardo-Seco; María Luisa Pérez del Molino; Federico Martinón-Torres; Antonio Salas

doi:10.24272/j.issn.2095-8137.2020.217

Citation:

Alberto Gómez-Carballa, Xabier Bello, Jacobo Pardo-Seco, María Luisa Pérez del Molino, Federico Martinón-Torres, Antonio Salas. 2020. Phylogeography of SARS-CoV-2 pandemic in Spain: a story of multiple introductions, micro-geographic stratification, founder effects, and super-spreaders. Zoological Research, 41(6): 605-620. DOI: 10.24272/j.issn.2095-8137.2020.217

Abstract

Spain has been one of the main global pandemic epicenters for coronavirus disease 2019 (COVID-19). Here, we analyzed >41 000 genomes (including >26 000 high-quality (HQ) genomes) downloaded from the GISAID repository, including 1 245 (922 HQ) sampled in Spain. The aim of this study was to investigate genome variation of novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and reconstruct phylogeographic and transmission patterns in Spain. Phylogeographic analysis suggested at least 34 independent introductions of SARS-CoV-2 to Spain at the beginning of the outbreak. Six lineages spread very successfully in the country, probably favored by super-spreaders, namely, A2a4 (7.8%), A2a5 (38.4%), A2a10 (2.8%), B3a (30.1%), and B9 (8.7%), which accounted for 87.9% of all genomes in the Spanish database. One distinct feature of the Spanish SARS-CoV-2 genomes was the higher frequency of B lineages (39.3%, mainly B3a+B9) than that found in any other European country. While B3a and B9 (and an important sub-lineage of A2a5, namely A2a5c) most likely originated in Spain, the other three haplogroups were imported from other European locations. The B3a strain may have originated in the Basque Country from a B3 ancestor of uncertain geographic origin, whereas B9 likely emerged in Madrid. The time of the most recent common ancestor (TMRCA) of SARS-CoV-2 suggested that the first coronavirus entered the country around 11 February 2020, as estimated from the TMRCA of B3a, the first lineage detected in the country. Moreover, earlier claims that the D614G mutation is associated with higher transmissibility are not consistent with the very high prevalence of COVID-19 in Spain when compared to other countries with lower disease incidence but a much higher frequency of this mutation (56.4% in Spain vs. 82.4% in the rest of Europe). Instead, the data support a major role of genetic drift in shaping the micro-geographic stratification of virus strains across the country, as well as the role of SARS-CoV-2 super-spreaders.
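
As a quick arithmetic check on the lineage frequencies quoted in the abstract, the following minimal Python sketch sums the rounded percentages. The names `LINEAGE_PCT` and `share` are illustrative and not taken from the study; only the percentage values come from the abstract. Because the inputs are already rounded to one decimal place, a total computed this way may differ slightly from the reported 87.9%.

```python
# Lineage frequencies (%) among Spanish genomes, as quoted in the abstract.
# `LINEAGE_PCT` and `share` are hypothetical names used for illustration only.
LINEAGE_PCT = {"A2a4": 7.8, "A2a5": 38.4, "A2a10": 2.8, "B3a": 30.1, "B9": 8.7}


def share(lineages):
    """Sum the rounded percentages for the given lineages, to one decimal."""
    return round(sum(LINEAGE_PCT[name] for name in lineages), 1)


total = share(LINEAGE_PCT)      # all lineages listed in the abstract
b_share = share(["B3a", "B9"])  # the two B lineages listed

print(f"listed lineages: {total}%")
print(f"B3a + B9: {b_share}%")
```

The B3a + B9 sum of 38.8% is in line with the statement that the B lineages (39.3% overall) are mainly B3a and B9.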