Distribution of microsatellites in the genome of Spodoptera frugiperda
Author's Workplace:Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, China; Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu 610064, China
Key Words:Spodoptera frugiperda; microsatellite; genome; chromosome; functional annotation
Abstract:
[Objectives] To analyze the distribution of perfect SSRs (P-SSRs) across the whole genome of Spodoptera frugiperda, and the GO functions of genes containing P-SSRs in their exon regions, in order to provide baseline data for the development of microsatellite markers and for further functional studies. [Methods] Krait v0.10.2 was used to identify P-SSRs and analyze their diversity, after which a Python script was used to locate the P-SSRs so that their distribution in different regions of the genome could be analyzed. Finally, BLAST, the Swiss-Prot protein database and the R package clusterProfiler v3.14.3 were combined to analyze the functions of genes containing P-SSRs in exon regions. [Results] A total of 64 025 P-SSRs were identified. Of these, mononucleotide SSRs (28 782) were the most abundant, followed by tetranucleotide SSRs (19 278), trinucleotide SSRs (7 685), dinucleotide SSRs (5 734), pentanucleotide SSRs (1 639) and hexanucleotide SSRs (907). The most common repeat motifs in these 6 types of P-SSRs were A, AC, ATC, AAAT, AAAAT and AAAGTC, respectively. The number of repeats of P-SSRs gradually decreased as the length of the repeat motif increased. Except on chromosomes 23, 28 and 30, the distribution of the 6 types of P-SSRs on each chromosome was consistent with their overall distribution. Of the P-SSRs identified, 47 714 were located in intergenic regions and 16 311 in genes, including 16 103 in intron regions and 208 in exon regions. P-SSRs in exon regions were distributed among 196 genes, 132 of which were annotated by GO analysis. There were 357 GO terms, of which 122 belonged to cellular component, 117 to molecular function and 118 to biological process.
GO enrichment terms were mainly related to molecular functions such as phosphatase and kinase activity, and to biological processes such as RNA polymerase II-mediated transcription. [Conclusion] A preliminary analysis of the distribution of P-SSRs in the S. frugiperda genome was completed. Although the overall relative abundance of P-SSRs is low, P-SSRs are widely distributed in the intergenic regions of the genome. Apart from mononucleotide SSRs, tetranucleotide and trinucleotide SSRs have relatively high relative abundance and could provide abundant candidate loci for microsatellite molecular markers.
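The two steps summarized in the Methods (identifying P-SSRs, then locating them in exon, intron or intergenic regions) can be illustrated with a minimal Python sketch. This is not Krait's algorithm or the authors' script: the regular-expression search, the minimum repeat thresholds in MIN_REPEATS, the function names, the toy sequence and the gene and exon coordinates are all assumptions made for illustration. It also assumes that an SSR overlapping an exon by even one base counts as exonic, which the abstract does not specify.

```python
import re

# Minimum repeat counts per motif length (1 to 6 bp).
# Illustrative assumptions only, not the thresholds used in the study.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeats) tuples, 0-based half-open."""
    seq = seq.upper()
    hits = []
    for k, n_min in min_repeats.items():
        # A k-bp unit followed by at least n_min - 1 exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n_min - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' runs already found as 'A'
                hits.append((m.start(), m.end(), motif, len(m.group(0)) // k))
    return sorted(hits)


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one position."""
    return a_start < b_end and b_start < a_end


def classify_region(start, end, genes, exons):
    """Label an SSR as 'exon', 'intron' or 'intergenic' (exon takes priority)."""
    if any(overlaps(start, end, s, e) for s, e in exons):
        return "exon"
    if any(overlaps(start, end, s, e) for s, e in genes):
        return "intron"
    return "intergenic"


# Toy sequence and hypothetical annotation coordinates.
toy_seq = "GGC" + "A" * 12 + "TCG" + "AC" * 7 + "GTT"
toy_genes = [(16, 40)]
toy_exons = [(16, 18), (34, 40)]

for start, end, motif, repeats in find_perfect_ssrs(toy_seq):
    region = classify_region(start, end, toy_genes, toy_exons)
    print(start, end, motif, repeats, region)
```

On real data the sequence would come from a genome FASTA file and the gene and exon intervals from an annotation file, handled one chromosome at a time. A linear scan over all intervals, as used here, would likely need to be replaced by a sorted or indexed lookup at genome scale.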