Estimation of Isoform Expression in RNA-Seq Data Using a Hierarchical Bayesian Model
Zengmiao Wang1, Jun Wang1, Changjing Wu2 and Minghua Deng1,2,3,§
1. Center for Quantitative Biology, Peking University, Beijing 100871, PR China.
2. LMAM, School of Mathematical Sciences, Peking University, Beijing 100871, PR China.
3. Center for Statistical Science, Peking University, Beijing 100871, PR China.
§E-mail: dengmh@math.pku.edu.cn
Introduction:
Estimation of gene or isoform expression is a fundamental step in many transcriptome
analysis tasks, such as differential expression analysis, eQTL (or sQTL) studies, and biological
network construction. RNA-seq technology enables us to monitor expression
on a genome-wide scale at single-base-pair resolution and offers the possibility of accurately
measuring expression at the isoform level. However, challenges remain because
of non-uniform read sampling and the presence of various biases in RNA-seq data. In
this article, we present a novel hierarchical Bayesian method to estimate isoform expression.
While most of the existing methods treat gene expression as a by-product, we
incorporate it into our model and explicitly describe its relationship with the corresponding
isoform expression using a Multinomial distribution. In this way, gene and isoform expression
are included in a unified framework, which helps us achieve better performance
than other state-of-the-art algorithms for isoform expression estimation. The effectiveness
of the proposed method is demonstrated using both simulated data with known ground truth and two real RNA-seq datasets from the MAQC project.
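
To make the gene-to-isoform link concrete, here is a minimal Python sketch of the general idea. It is not the authors' R implementation or their full hierarchical model. It assumes a gene-level count is split among isoforms by a Multinomial draw, and it recovers the isoform proportions with a symmetric Dirichlet prior. The Dirichlet prior, the function names, and all numbers are hypothetical choices made for illustration. The sketch also assumes isoform-level counts are observed directly, whereas real RNA-seq reads are often ambiguous between isoforms that share exons.

```python
import random
from collections import Counter


def sample_isoform_counts(gene_count, proportions, rng):
    """Draw isoform-level counts from Multinomial(gene_count, proportions)."""
    # Each of the gene_count units is assigned to one isoform with the
    # given probabilities; tallying the assignments is a Multinomial draw.
    draws = rng.choices(range(len(proportions)), weights=proportions, k=gene_count)
    tally = Counter(draws)
    return [tally.get(i, 0) for i in range(len(proportions))]


def posterior_mean_proportions(isoform_counts, prior=1.0):
    """Posterior mean of the proportions under a symmetric Dirichlet(prior).

    The Dirichlet prior is an assumption of this sketch, chosen because it
    is conjugate to the Multinomial; it is not taken from the source.
    """
    total = sum(isoform_counts) + prior * len(isoform_counts)
    return [(count + prior) / total for count in isoform_counts]


rng = random.Random(0)            # fixed seed so the run is repeatable
gene_count = 1000                 # hypothetical gene-level expression
true_props = [0.6, 0.3, 0.1]      # hypothetical isoform proportions

isoform_counts = sample_isoform_counts(gene_count, true_props, rng)
estimated = posterior_mean_proportions(isoform_counts)
print(isoform_counts, [round(p, 3) for p in estimated])
```

Because the isoform counts always sum to the gene-level count, gene and isoform expression stay consistent by construction, which is the property the unified framework relies on.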
Source code:
The algorithm is implemented in R, and a compressed file containing the source code and test data is
available here.
*********************************************************************************************************************
Last Update:
06/06/2015
For questions, comments, or suggestions, please contact wangzengmiao@pku.edu.cn