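Progress on the whole genome sequencing and the application in woody plants

LIU Hailin, YIN Tongming*

Journal of Nanjing Forestry University (Natural Sciences Edition), 2018, Vol. 42, Issue 05: 172-178. DOI: 10.3969/j.issn.1000-2006.201709020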

Review

Abstract

Genome sequences are a fundamental information resource for genetic research. Sequencing technology has advanced rapidly through three generations, the third of which produces long reads, and read lengths have grown from tens of bases to tens of thousands of bases. This increase has greatly improved the completeness and accuracy of genome assemblies. Whole-genome sequences have now been completed for a large number of plant species, including more than 40 woody plants, and sequencing projects for many more tree species are under way. Numerous bioinformatics tools have also been developed for genome assembly and downstream analysis of data from each type of sequencing technology. Covering sequencing technologies, genome assembly methods, and the bioinformatic analysis of whole-genome sequence data, this review lists the woody plants whose genomes have been sequenced to date, describes the development and application of whole-genome sequencing technologies, and introduces bioinformatics software suited to assembling genomes from third-generation sequencing data, as a reference for researchers working on forest tree genomes.

Cite this article

LIU Hailin, YIN Tongming. Progress on the whole genome sequencing and the application in woody plants[J]. Journal of Nanjing Forestry University (Natural Sciences Edition), 2018, 42(05): 172-178. https://doi.org/10.3969/j.issn.1000-2006.201709020

Chinese Library Classification (CLC) number: Q948; S722


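Funding

Received: 2017-09-09. Revised: 2018-03-16. Funding: National Natural Science Foundation of China (31570662). First author: LIU Hailin (lindaliu_njfu@163.com), PhD. *Corresponding author: YIN Tongming (tmyin@njfu.edu.cn), professor, PhD.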
