Journal of Nanjing Forestry University (Natural Sciences Edition) ›› 2020, Vol. 44 ›› Issue (2): 75-83. doi: 10.3969/j.issn.1000-2006.201907017

• Research Article •

SSR analysis of the slash pine transcriptome and development of EST-SSR markers

YI Min1, ZHANG Lu1, LEI Lei1, CHENG Zishan1, SUN Shiwu2, LAI Meng1,*

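  1. 2011 Collaboration Innovation Center of Jiangxi Typical Trees Cultivation and Utilization, Jiangxi Agricultural University, Nanchang, Jiangxi 330045, China
  2. Baiyun Mountain Forest Farm, Qingyuan District, Ji’an, Jiangxi 343062, China
  • Received: 2019-07-12 Revised: 2019-09-18 Online: 2020-03-30 Published: 2020-04-01
  • Corresponding author: LAI Meng
  • Funding:
    Jiangxi Provincial Youth Science Foundation (20181BAB214015); Jiangxi Provincial Youth Science Foundation (20161BAB214176); National Natural Science Foundation of China (31860220); Forestry Science and Technology Innovation Special Project of the Jiangxi Provincial Department of Forestry (201811)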

Analysis of SSR information in transcriptome and development of EST-SSR molecular markers in Pinus elliottii Engelm.

YI Min1, ZHANG Lu1, LEI Lei1, CHENG Zishan1, SUN Shiwu2, LAI Meng1,*

  1. 2011 Collaboration Innovation Center of Jiangxi Typical Trees Cultivation and Utilization, Jiangxi Agricultural University, Nanchang 330045, China
  2. Baiyun Mountain Forest Farm in Qingyuan District, Ji’an 343062, China
  • Received: 2019-07-12 Revised: 2019-09-18 Online: 2020-03-30 Published: 2020-04-01
  • Contact: LAI Meng

Abstract (Chinese version):

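【Objective】Slash pine is a high-quality resin-producing tree species given priority for planting in southern China. However, its molecular genetic foundation is weak and genomic sequence information is scarce, which has hindered in-depth genomic research on the species. At present, the SSR markers used in molecular studies of slash pine come mainly from related species or were developed from the limited gene sequence resources in public databases, and they show poor polymorphism and transferability. To address this problem, we developed EST-SSR loci from slash pine transcriptome sequencing data and characterized their distribution types and features in the transcriptome sequences, laying a foundation for marker-assisted breeding of slash pine.【Methods】SSRs in the transcriptome sequences were identified, and their distribution was analyzed, with MISA. The search criteria were at least 10 repeats for mononucleotide motifs, at least 6 repeats for dinucleotide motifs, and at least 5 repeats for tri-, tetra-, penta- and hexanucleotide motifs. Primers were designed with Primer3.0 from the conserved regions flanking the SSR loci, and 120 primer pairs were selected at random. Agarose gel electrophoresis and capillary electrophoresis were used to analyze genetic diversity in 113 family individuals from the United States and Ji’an and thereby determine primer polymorphism.【Results】A total of 3 818 SSR loci were found in 79 574 unigene sequences, an occurrence frequency of 4.80% and an average of one SSR per 18.27 kb. SSRs were present in 3 373 unigenes, giving an SSR incidence (number of sequences containing SSRs/total number of sequences searched) of 4.24%; 2 980 sequences contained a single SSR and 393 contained more than one. Among the 3 818 SSRs, mononucleotide repeats were the most abundant, followed by dinucleotide and trinucleotide repeats, accounting for 63.54%, 19.15% and 16.27% of the total, respectively, whereas tetra-, penta- and hexanucleotide repeats were rare (0.52%, 0.13% and 0.31%, respectively). Repeat numbers ranged from 5 to 22. Among the 1 391 SSRs other than mononucleotide repeats, those with 5 repeats were the most numerous (498; 35.80%), followed by those with 6 repeats (417; 29.98%) and 7 repeats (198; 14.23%); only 38 (2.73%) had more than 10 repeats. Among the 731 dinucleotide SSRs, the most frequent motif was AT/AT (491; 12.86% of all SSRs), followed by AG/CT (156; 4.09%) and AC/GT (81; 2.12%). Among the 621 trinucleotide SSRs, AAT/ATT was the most frequent motif (139; 3.64% of all SSRs), followed by AAG/CTT (122; 3.20%). The position of 24.59% of the 3 818 SSRs was unknown; the rest were located in untranslated regions (UTRs) or coding sequences (CDS), with counts in the order 3'UTR > 5'UTR > CDS. Of the 120 primer pairs tested, 92 amplified successfully (76.78%) and 24 were polymorphic (20%). The 24 primer pairs (13 dinucleotide, 7 trinucleotide and 4 tetranucleotide repeats) detected 81 alleles in total, with 2 to 9 alleles per locus and a mean of 3.38. The polymorphism information content (PIC) ranged from 0.103 to 0.726, with a mean of 0.349.【Conclusion】Mining the slash pine transcriptome data yielded 3 818 SSR loci, with AT/AT and AAT/ATT as the main repeat motifs; the primers that amplified polymorphic loci mainly targeted di- and trinucleotide repeats. Developing SSR markers from slash pine transcriptome sequences is feasible.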
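
The search criteria stated in the abstract (minimum repeat numbers of 10, 6, 5, 5, 5 and 5 for motif lengths of 1 to 6 nt) can be illustrated with a short Python sketch. This is not MISA and is not the authors' pipeline: the function names `find_ssrs` and `is_primitive` and the toy sequence are hypothetical, and the sketch only finds perfect repeats, ignoring compound or interrupted SSRs.

```python
import re

# Minimum repeat counts per motif length (1-6 nt), as given in the abstract.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """Return True if the motif is not a repetition of a shorter unit.

    For example "AT" is primitive, while "AA" or "ATAT" are not.
    """
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Hypothetical helper for illustration only; start is 0-based.
    """
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One motif of length k followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
                pos = m.end()
            else:
                # Skip false motifs such as (AA)n and retry one base later.
                pos = m.start() + 1
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence: (A)12, (AT)7 and (AAG)5 separated by single bases.
    toy = "GG" + "A" * 12 + "C" + "AT" * 7 + "C" + "AAG" * 5 + "TT"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

Rejecting non-primitive motifs keeps a homopolymer run from being counted again as a di- or trinucleotide repeat; whether the published counts were produced with exactly this rule is an assumption.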

Keywords: slash pine, transcriptome, SSR, primer, EST-SSR marker, mononucleotide, dinucleotide, trinucleotide

Abstract:

【Objective】Slash pine (Pinus elliottii Engelm.) is a high-quality resin-producing species widely distributed in southern China. Despite its economic importance, genomic and transcriptomic data for the species are scarce, which has hampered genomic studies. To date, the SSR markers used in molecular studies of P. elliottii have come mainly from related species or were developed from the limited gene sequence resources in public databases, and they show low polymorphism and poor transferability. To address these problems, we used transcriptome data to develop EST-SSR markers for slash pine and analyzed the distribution patterns and characteristics of the markers in the transcriptome sequences, laying a foundation for molecular marker-assisted selection of P. elliottii.【Method】SSR loci in the transcriptome sequences were identified with MicroSAtellite (MISA), and their distribution and characteristics were analyzed statistically. SSRs were defined as mono-, di-, tri-, tetra-, penta- and hexanucleotide motifs with minimum repeat numbers of 10, 6, 5, 5, 5 and 5, respectively. A total of 120 pairs of EST-SSR primers were designed using Primer 3. Agarose electrophoresis was used for an initial check, and capillary electrophoresis was used to separate and detect the polymorphisms amplified by the primers. To study genetic diversity, 113 family samples were collected from three seed orchards in the United States and from a seed stand in Ji’an, Jiangxi Province.【Result】A total of 3 818 SSR loci were detected in 79 574 unigenes from the slash pine transcriptome. SSR loci occurred at a frequency of 4.80% (number of SSRs/number of sequences searched), with an average of one SSR per 18.27 kb. A total of 3 373 unigenes contained SSRs, a frequency of 4.24% (number of sequences with SSRs/number of sequences searched). Of these, 2 980 sequences contained a single SSR and 393 contained more than one. Among the 3 818 potential EST-SSRs, six motif types were identified: mononucleotide repeats were the most frequent (63.54%), followed by dinucleotide (19.15%), trinucleotide (16.27%), tetranucleotide (0.52%), pentanucleotide (0.13%) and hexanucleotide repeats (0.31%). The number of repeats per SSR ranged from 5 to 22. Excluding mononucleotide repeats, SSRs with five repeats were the most frequent (35.80%), followed by those with six repeats (29.98%) and seven repeats (14.23%); only 2.73% had more than 10 repeats. Among the dinucleotide repeats, AT/AT was the most common motif (12.86% of all SSRs), followed by AG/CT (4.09%) and AC/GT (2.12%). Among the trinucleotide repeats, AAT/ATT was the most common motif (3.64% of all SSRs), followed by AAG/CTT (3.20%). The location of 24.59% of the SSRs was unknown; the remainder were located in untranslated regions (5'UTR and 3'UTR) or coding sequences (CDS), with the most SSRs in the 3'UTR, followed by the 5'UTR and CDS. Among the 120 primer pairs, 24 (targeting 13 di-, 7 tri- and 4 tetranucleotide repeats) were polymorphic, accounting for 20% of the total. Eighty-one alleles were detected with the 24 pairs of fluorescent primers; the number of alleles per locus ranged from 2 to 9, with a mean of 3.38. Polymorphism information content ranged from 0.103 to 0.726, with an average of 0.349.【Conclusion】A total of 3 818 SSRs were identified from transcriptome sequencing of P. elliottii, with AT/AT and AAT/ATT the most common repeat motifs. The primers that amplified polymorphic loci mainly targeted dinucleotide and trinucleotide repeats. Developing SSR markers from the P. elliottii transcriptome is feasible, and our results provide new information for genetic diversity analysis and molecular marker-assisted selection of P. elliottii, as well as a basis for SSR marker development in other species.
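
The abstract reports polymorphism information content (PIC) per locus but does not say which formula was used. As a hedged illustration, the sketch below assumes the commonly used definition of Botstein et al., PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i at a locus. The function name `pic` and the allele frequencies in the example are hypothetical and are not data from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus.

    Assumes the Botstein et al. definition:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    freqs: allele frequencies at the locus, which must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    squared = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - squared - cross


if __name__ == "__main__":
    # Hypothetical locus with three alleles (not data from the study).
    print(round(pic([0.6, 0.3, 0.1]), 3))
```

Under this definition a locus with a single allele has PIC = 0, and PIC rises as more alleles occur at more even frequencies, which is why loci with 2 to 9 alleles can span a range as wide as the reported 0.103 to 0.726.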

Key words: Pinus elliottii Engelm. (slash pine), transcriptome, SSR, primer, EST-SSR molecular marker, mononucleotide, dinucleotide, trinucleotide

CLC number: