JOURNAL OF NANJING FORESTRY UNIVERSITY, 2019, Vol. 43, Issue (02): 93-99. doi: 10.3969/j.issn.1000-2006.201806029


findDEG: an integrated software package for differential gene expression analysis with RNA sequencing data

WU Jiyan, YAO Dan, WU Hainan, TONG Chunfa*

  1. (College of Forestry, Nanjing Forestry University, Nanjing 210037, China)
  • Online: 2019-03-30 Published: 2019-03-30

Abstract: 【Objective】With the rapid development of next-generation sequencing technology, transcriptome sequencing (RNA-seq) is widely used for differential gene expression analysis and gene annotation in many species. A variety of software packages for RNA-seq data analysis are available. In practice, however, the analysis involves several complicated steps and many parameters, which makes it difficult for most researchers to perform accurately. 【Method】Building on available software packages such as Trinity, TopHat+Cufflinks, and HISAT2+StringTie, an integrated package was developed to analyze RNA-seq data, covering different methods for computing gene expression abundance and for hypothesis testing of differential gene expression. The package also accounts for whether a reference genome is available, whether the samples are replicated, and whether the reads are paired-end or single-end. 【Result】An integrated software package called findDEG was developed in Perl for differential gene expression analysis. The software consists of three modules: Trinity, TopHat+Cufflinks, and HISAT2+StringTie. The Trinity module provides three methods for calculating transcript expression abundance and four methods for testing differentially expressed genes, the TopHat+Cufflinks module lets users choose either the new or the old version of Cufflinks, and the HISAT2+StringTie module offers a single analysis strategy. The software is freely available at http://www.bioseqdata.com/findDEG/findDEG.htm. Using three analytical strategies (the old version of Cufflinks, the new version of Cufflinks, and the Trinity module), we analyzed RNA-seq data from Populus simonii under normal and drought stress conditions. The new and old versions of Cufflinks identified 53 and 33 differentially expressed genes, respectively, 25 of which were shared. Trinity detected as many as 1,641 differentially expressed genes, of which 14 and 3 were also identified by the new and old versions of Cufflinks, respectively. 【Conclusion】The newly developed software findDEG conveniently provides more than a dozen strategies for differential gene expression analysis with RNA-seq data. Because a single program conducts the whole analysis, users avoid manually handling many intermediate parameters and results.
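
The phrase "more than a dozen strategies" can be checked against the module description with a short count. The Python sketch below is illustrative only and is not part of findDEG, which is written in Perl. It assumes that every Trinity abundance method can be paired with every Trinity test method, which the abstract does not state explicitly, and it uses placeholder labels because the abstract gives the number of methods but not their names.

```python
from itertools import product

# Placeholder labels (hypothetical): the abstract reports how many options
# each module offers, not what the options are called.
TRINITY_ABUNDANCE = ["abundance_1", "abundance_2", "abundance_3"]
TRINITY_TESTS = ["test_1", "test_2", "test_3", "test_4"]
CUFFLINKS_VERSIONS = ["old", "new"]


def enumerate_strategies():
    """Return one tuple per strategy, assuming all Trinity pairings are valid."""
    strategies = [
        ("Trinity", abundance, test)
        for abundance, test in product(TRINITY_ABUNDANCE, TRINITY_TESTS)
    ]
    strategies += [("TopHat+Cufflinks", version, None) for version in CUFFLINKS_VERSIONS]
    strategies.append(("HISAT2+StringTie", None, None))
    return strategies


strategies = enumerate_strategies()
print(len(strategies))  # 3 * 4 + 2 + 1 = 15 under the stated assumption
```

Under this assumption the three modules yield 15 strategies, which is consistent with the "more than a dozen" stated in the conclusion. Further options such as paired-end versus single-end reads are not counted here.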
