TY - JOUR
T1 - Identification of transposable elements and satellite DNA in the Neotropical species Drosophila amaguana from the Ecuadorian Andean Forests
AU - Coba-Males, Manuel Alejandro
AU - Orozco-Arias, Simon
AU - Guyot, Romain
AU - Vela, Doris
N1 - Publisher Copyright: © 2025 Coba-Males et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2025/12
Y1 - 2025/12
N2 - Genome size variation in eukaryotic species is largely influenced by repetitive DNA sequences such as transposable elements (TEs), simple repeats, and satellite DNAs (satDNAs), which do not necessarily correlate with organismal complexity. In insects, TEs are crucial to evolutionary processes and are correlated with variations in genome size. In this study, we describe, for the first time, the mobilome and satellitome of Drosophila amaguana, an Ecuadorian Neotropical species with a large, unexplored genome, to assess the contribution of these repetitive DNA sequences to its genome composition. Using a draft genome assembly of approximately 455.5 Mb, generated from Illumina short-read sequences obtained from 10 wild specimens of D. amaguana collected at the Refugio de Vida Silvestre Pasochoa, we employed a de novo approach to create a manually curated TE library of 737 consensus sequences. We identified 716 novel TE families, 20 TEs previously characterized in other Drosophila species, and one DNA transposon previously described in the Lepeophtheirus genus. The total TE content in the D. amaguana genome was 21.54%, distributed as follows: 6.35% Helitrons (1 superfamily), 5.13% LTR retrotransposons (5 superfamilies), 3.63% TIRs (9 superfamilies), 3.61% LINEs (7 superfamilies), 1.17% MITEs, 0.94% Maverick, 0.67% PLE, 0.02% SINEs, and 0.01% DIRS. Simple repeats made up 11.8% of the genome. Additionally, we estimated the satDNA content using Illumina raw reads and identified 16 satDNA families, all unique to the Drosophila genus, which comprise 4.90% of the genome. Overall, our results based on short-read data suggest that the large genome size of D. amaguana may not result from a high TE or satDNA content. Instead, it could be attributed to other factors (e.g., noncoding DNA occupying substantial portions of the genome or a high percentage of duplicated genes) that remain to be determined in future studies using long reads to overcome the limitations of short reads. These findings may offer valuable insights into the adaptive and evolutionary processes of the mesophragmatica species group in the Andean forests.
UR - https://www.scopus.com/pages/publications/105024311949
U2 - 10.1371/journal.pone.0337390
DO - 10.1371/journal.pone.0337390
M3 - Article
C2 - 41370328
AN - SCOPUS:105024311949
SN - 1932-6203
VL - 20
JO - PLoS ONE
JF - PLoS ONE
IS - 12
M1 - e0337390
ER -