Identification and analysis of SNP and InDel loci in Nosema ceranae
Author of the article: ZHANG Wen-De; CAI Zong-Bing; LONG Qi; WU Ying; SUN Ming-Hui; KANG Yu-Xin; HU Ying; ZHAO Xiao; CHEN Da-Fu; GUO
Author's Workplace: College of Animal Sciences (College of Bee Science), Fujian Agriculture and Forestry University, Fuzhou 350002, China; Apitherapy Research Institute, Fujian Agriculture and Forestry University, Fuzhou 350002, China
Key Words: Nosema ceranae; single nucleotide polymorphism (SNP); insertion and deletion (InDel); transcriptome; molecular marker
Abstract:
[Objectives] Nosema ceranae is a widespread, unicellular fungal pathogen that exclusively infects
the midgut epithelial cells of honeybees. The objective of this study is to
identify and analyze single nucleotide polymorphism (SNP) and insertion and
deletion (InDel) loci in N. ceranae using high-quality transcriptome data
obtained from clean N. ceranae spores, with the aim of developing novel
molecular markers. [Methods] SNP and InDel loci
were detected using GATK software. SnpEff software was
used to predict the genomic regions in which mutation sites were located and the effects of those
mutations. Genes containing SNP or InDel loci were
aligned against the GO and KEGG databases to annotate their
corresponding functions and pathways. [Results] A total
of 28 195 SNP loci were identified in N. ceranae, 21 403 of which were transition loci and 6 792
of which were transversion loci. These SNP loci comprised 12 mutation types, the
most abundant of which was C/T. SNP loci were mainly distributed in the CDS
region, followed by the intergenic region, upstream region, downstream region
and introns. In addition, the most common codon-level effect of SNP loci
was synonymous mutation. Genes containing SNP loci were annotated to 43 GO
terms, including metabolic process, cellular process and catalytic activity,
and 85 KEGG pathways, such as metabolic pathways, ribosome and biosynthesis of
secondary metabolites. A total of 2 831 InDel loci were identified, most of which were
distributed in the intergenic region, with the fewest found in the CDS region. In
addition, the most abundant type of codon mutation was frameshift mutation.
Genes containing InDel loci were annotated to 38 GO terms, including cellular
process, cell and binding, and 73 KEGG pathways, including metabolic pathways,
biosynthesis of secondary metabolites and ribosome. [Conclusion] There are high numbers of SNP and InDel loci
in N. ceranae and, as in other
species, the most common type of SNP mutation is transition. The distribution
of SNP loci across genomic functional elements and their mutation types differ clearly
from those of InDel loci. Genes containing SNP and InDel loci are
potentially involved in the adaptation of N. ceranae to the intracellular environment and the proliferation process
of this pathogen.
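
The transition/transversion split and the 12 mutation types reported above follow directly from the four DNA bases: a substitution between two purines (A, G) or two pyrimidines (C, T) is a transition, any other substitution is a transversion, and there are 4 x 3 = 12 ordered reference/alternate pairs. The following is a minimal sketch of that classification, not the authors' pipeline (which used GATK and SnpEff). The function names `classify_snp` and `summarize` and the toy SNP list are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import permutations

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")
BASES = PURINES | PYRIMIDINES

# All ordered single-base substitutions (ref/alt): 4 bases x 3 alternatives = 12 types.
ALL_TYPES = [f"{ref}/{alt}" for ref, alt in permutations("ACGT", 2)]


def classify_snp(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    pair = {ref, alt}
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def summarize(snps):
    """Count substitution classes and ref/alt types for (ref, alt) pairs."""
    classes = Counter(classify_snp(ref, alt) for ref, alt in snps)
    types = Counter(f"{ref.upper()}/{alt.upper()}" for ref, alt in snps)
    return classes, types


# Toy input (hypothetical, NOT data from the study).
toy_snps = [("C", "T"), ("C", "T"), ("A", "G"), ("G", "T"), ("T", "A")]
classes, types = summarize(toy_snps)

print(len(ALL_TYPES))        # number of possible substitution types
print(dict(classes))         # transition / transversion counts for the toy input
print(types.most_common(1))  # most abundant substitution type in the toy input
```

Applied to the counts reported in the abstract, the same split gives 21 403 + 6 792 = 28 195 SNP loci and a transition-to-transversion ratio of roughly 3.15.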