Annotation of transposable elements in the Spodoptera frugiperda (fall armyworm) genome using three methods
Comparison of three annotation methods for characterizing transposable elements in the Spodoptera frugiperda genome
张春辉 (Zhang Chunhui), 王磊 (Wang Lei), 刘运 (Liu Yun), 彭长军 (Peng Changjun), 岳碧松 (Yue Bisong), 李静 (Li Jing)
DOI:10.7679/j.issn.2095-1353.2021.069
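Author affiliations: Key Laboratory of Bio-Resources and Eco-Environment of the Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610064, China; Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China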
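Chinese keywords: 转座子 (transposable element); Repbase; ArTEdb; 从头预测 (de novo prediction); 草地贪夜蛾 (fall armyworm, Spodoptera frugiperda)
English keywords: transposable element; Repbase; ArTEdb; de novo; Spodoptera frugiperda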
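Chinese abstract (translated):
[Objective] Transposable elements (TEs) are an important component of insect genomes, and basic features such as TE composition, proportion of the genome, and transposition activity differ greatly among insect taxa. This study aimed to evaluate how well different methods annotate TEs in the fall armyworm, Spodoptera frugiperda, and to characterize its TEs at the genome level. [Methods] Three methods were used to predict TEs in the S. frugiperda genome: homology-based prediction against the Repbase and ArTEdb databases, and de novo prediction based on the properties and structure of repetitive sequences. [Results] TEs identified by the ArTEdb method and the de novo method accounted for 21.48% and 27.26% of the genome, respectively. LINE elements had the highest copy number and distribution density, followed by DNA elements. The divergence-rate distributions of the TEs predicted by the two methods peaked at about 10%, and TEs with divergence rates <10% were mainly DNA transposons and LINEs. Comparing the three methods, the Repbase method had low sensitivity and predicted far fewer TEs than the other two. The ArTEdb method annotated more TEs but performed poorly at identifying known superfamilies. The de novo method annotated a large number of TEs, assigned them to different superfamilies, and even identified TE superfamilies not included in the Repbase Lepidoptera library. [Conclusion] LINEs and DNA elements are the main TE types in the S. frugiperda genome, the genome contains many young transposons, and S. frugiperda differs from other lepidopteran species in TE family composition. De novo prediction annotated TEs in the S. frugiperda genome better than the other two methods. These results lay a foundation for further study of TE function and of the contribution of TEs to genomic diversity in S. frugiperda.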
English abstract:
[Objectives] Transposable elements (TEs) are an important component of insect genomes and vary widely among insect taxa in content, family composition and transposition activity. This study aimed to find the best method for annotating and characterizing TEs in the Spodoptera frugiperda genome. [Methods] Three annotation methods were used to identify S. frugiperda TEs: homology-based prediction against Repbase, homology-based prediction against ArTEdb, and RepeatModeler-based de novo annotation. [Results] The ArTEdb and de novo methods predicted that TEs make up 21.48% and 27.26% of the genome, respectively. LINEs were the dominant TEs in both copy number and density. TE divergence rates peaked at around 10%, indicating that the majority of TEs appeared in the S. frugiperda genome recently. TEs with divergence rates <10% were mainly LINEs and DNA transposons. In addition, several superfamilies that are rare or lacking in other lepidopteran species were abundant in S. frugiperda. Of the three annotation methods, Repbase had the lowest sensitivity and identified the fewest TEs. ArTEdb identified abundant TEs and had good sensitivity to DNA elements, but failed to classify TEs further into superfamilies. The de novo method not only identified the most TEs but also classified them into different superfamilies. It also identified superfamilies not currently included in the lepidopteran Repbase database. [Conclusion] We characterized TE content and composition in S. frugiperda and found that the de novo method was superior to the Repbase and ArTEdb methods in both identifying and categorizing TEs. These findings improve our understanding of the TEs of S. frugiperda and should benefit further studies on the functional significance of TEs and their contribution to genomic diversity.
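The two summary statistics reported in the results, the percentage of the genome occupied by each TE class and the position of the divergence-rate peak, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes annotation hits have already been parsed into hypothetical `(te_class, start, end, divergence_pct)` tuples on a single sequence with half-open coordinates, and the records in `toy_hits` are invented purely for illustration.

```python
from collections import defaultdict


def merged_length(intervals):
    """Bases covered by half-open (start, end) intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged run (if any) and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def te_fraction_by_class(hits, genome_size):
    """Percentage of the genome covered by each TE class."""
    by_class = defaultdict(list)
    for te_class, start, end, _divergence in hits:
        by_class[te_class].append((start, end))
    return {
        te_class: 100.0 * merged_length(intervals) / genome_size
        for te_class, intervals in by_class.items()
    }


def divergence_peak(hits, bin_width=5):
    """Lower edge of the divergence bin that holds the most annotated bases."""
    bases_per_bin = defaultdict(int)
    for _te_class, start, end, divergence in hits:
        bin_start = int(divergence // bin_width) * bin_width
        bases_per_bin[bin_start] += end - start
    return max(bases_per_bin, key=bases_per_bin.get)


# Toy records (assumption: invented values, not data from the study).
toy_hits = [
    ("LINE", 0, 100, 8.0),
    ("LINE", 50, 170, 12.0),  # overlaps the previous LINE hit
    ("DNA", 200, 260, 3.5),
    ("LTR", 400, 420, 22.0),
]

fractions = te_fraction_by_class(toy_hits, genome_size=1000)
peak = divergence_peak(toy_hits, bin_width=5)
print(fractions)
print(peak)
```

Overlapping hits of the same class are merged before the percentage is computed so that no base is counted twice. A real genome has many scaffolds, so intervals would need to be grouped by sequence name before merging.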