transcriptR - An Integrative Tool for ChIP- And RNA-Seq Based Primary
Transcripts Detection and Quantification
The differences in the RNA types being sequenced have an
impact on the resulting sequencing profiles. mRNA-seq data is
enriched with reads derived from exons, while GRO-, nucRNA- and
chrRNA-seq demonstrate a substantial broader coverage of both
exonic and intronic regions. The presence of intronic reads in
GRO-seq type of data makes it possible to use it to
computationally identify and quantify all de novo continuous
regions of transcription distributed across the genome. This
type of data, however, is more challenging to interpret and
less common practice compared to mRNA-seq. One of the
challenges for primary transcript detection concerns the
simultaneous transcription of closely spaced genes, which needs
to be properly divided into individually transcribed units. The
R package transcriptR combines RNA-seq data with ChIP-seq data
of histone modifications that mark active Transcription Start
Sites (TSSs), such as, H3K4me3 or H3K9/14Ac to overcome this
challenge. The advantage of this approach over the use of, for
example, gene annotations is that this approach is data driven
and therefore able to deal also with novel and case specific
events. Furthermore, the integration of ChIP- and RNA-seq data
allows the identification all known and novel active
transcription start sites within a given sample.