
findDEG: a software package for differential gene expression analysis based on transcriptome data

Journal of Nanjing Forestry University (Natural Sciences Edition) [ISSN: 1000-2006 / CN: 32-1161/S]

Issue:
2019, Issue 02
Page:
93-99
Column:
Research Article
Publish date:
2019-03-30

Article Info:

Title:
findDEG: an integrated software package for differential gene expression analysis with RNA sequencing data
Article ID:
1000-2006(2019)02-0093-07
Author(s):
WU Jiyan, YAO Dan, WU Hainan, TONG Chunfa*
(College of Forestry, Nanjing Forestry University, Nanjing 210037, China)
Keywords:
gene differential expression analysis; Perl language; poplar; transcript; transcriptome sequencing; findDEG
Classification number:
Q811.4
DOI:
10.3969/j.issn.1000-2006.201806029
Document Code:
A
Abstract:
【Objective】With the rapid development of next-generation sequencing technology, transcriptome sequencing (RNA-seq) is widely used for differential gene expression analysis and gene annotation in many species. A variety of software packages for RNA-seq data analysis are available. In practice, however, the analysis involves several complicated steps and many parameters, making it difficult for most researchers to perform accurately. 【Method】Building on available software packages such as Trinity, TopHat+Cufflinks and HISAT2+StringTie, an integrated package was developed to analyze RNA-seq data, covering different methods for computing gene expression abundance and for hypothesis testing of differential gene expression. Other issues were also considered, including whether a reference genome is available, whether the samples are replicated, and whether the reads are paired-end or single-end. 【Result】An integrated software package called findDEG was developed in Perl for differential gene expression analysis. The software consists of three modules: Trinity, TopHat+Cufflinks, and HISAT2+StringTie. The Trinity module provides three methods for calculating transcript expression abundance and four methods for testing differentially expressed genes, while the TopHat+Cufflinks module lets users choose either the new or the old version of Cufflinks for differential gene expression analysis. The HISAT2+StringTie module offers only one analysis strategy. The software is freely available at http://www.bioseqdata.com/findDEG/findDEG.htm. Using three analytical strategies (the old and new versions of Cufflinks and the Trinity module), we analyzed RNA-seq data from Populus simonii under normal and drought stress conditions. The new and old versions of Cufflinks identified 53 and 33 differentially expressed genes, respectively, with 25 genes in common. Trinity detected up to 1,641 differentially expressed genes, of which 14 and 3 were shared with the results from the new and old versions of Cufflinks, respectively. 【Conclusion】The newly developed software findDEG conveniently provides more than a dozen strategies for differential gene expression analysis of RNA-seq data within a single piece of software that conducts the whole analysis, avoiding many intermediate parameters and results that would otherwise need to be processed manually.
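
The strategy count stated in the abstract can be checked with a short enumeration. The following is a minimal sketch, not findDEG's own code: the abstract gives only the numbers of options (three abundance methods and four tests in the Trinity module, two Cufflinks versions, one HISAT2+StringTie strategy), so the tool names and the function name used here are placeholder assumptions.

```python
from itertools import product

# Placeholder names (assumptions): the abstract states only the counts.
TRINITY_ABUNDANCE_METHODS = ["RSEM", "kallisto", "Salmon"]  # 3 methods
TRINITY_DE_TESTS = ["edgeR", "DESeq2", "voom", "ROTS"]      # 4 tests
CUFFLINKS_VERSIONS = ["old", "new"]                         # 2 versions


def enumerate_strategies():
    """Return (module, option) pairs, one per analysis strategy."""
    # Trinity module: every abundance method paired with every test.
    strategies = [
        ("Trinity", f"{abundance} + {test}")
        for abundance, test in product(TRINITY_ABUNDANCE_METHODS, TRINITY_DE_TESTS)
    ]
    # TopHat+Cufflinks module: one strategy per Cufflinks version.
    strategies += [
        ("TopHat+Cufflinks", f"{version} Cufflinks")
        for version in CUFFLINKS_VERSIONS
    ]
    # HISAT2+StringTie module: a single strategy.
    strategies.append(("HISAT2+StringTie", "default"))
    return strategies


if __name__ == "__main__":
    all_strategies = enumerate_strategies()
    for module, option in all_strategies:
        print(f"{module}: {option}")
    print(len(all_strategies), "strategies in total")
```

With these counts the total is 3 × 4 + 2 + 1 = 15, which is consistent with the abstract's "more than a dozen strategies".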


Last Update: 2019-03-30