An RNA-Seq study involves many steps between raw reads and biological insight. For the alignment step, the Bowtie2-t module can be used. It is built on Bowtie 2, an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. Bowtie 2 is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. It indexes the genome with an FM index to keep its memory footprint small: for the human genome, the footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
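To illustrate why an FM index supports fast exact matching, here is a minimal Python sketch of backward search over a Burrows-Wheeler transform. It is a toy under stated assumptions, not Bowtie 2's implementation: the function names build_fm_index and find_exact are hypothetical, the suffix array is built naively, nothing is compressed or sampled, and mismatches and gaps are not handled.

```python
def build_fm_index(text):
    """Build a toy FM index: suffix array, first-column offsets, occurrence table."""
    text = text + "$"  # sentinel; sorts before A, C, G, T in ASCII
    # Naive suffix array: fine for a toy string, far too slow for a genome.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT = character preceding each sorted suffix (text[-1] is the sentinel).
    bwt = "".join(text[i - 1] for i in suffix_array)
    # first_col[c] = number of characters in the text that sort before c.
    first_col, total = {}, 0
    for c in sorted(set(text)):
        first_col[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in first_col}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return suffix_array, first_col, occ


def find_exact(pattern, index):
    """Return sorted 0-based positions where pattern occurs exactly."""
    suffix_array, first_col, occ = index
    lo, hi = 0, len(suffix_array)  # half-open range of matching suffixes
    for c in reversed(pattern):    # backward search: last character first
        if c not in first_col:
            return []
        lo = first_col[c] + occ[c][lo]
        hi = first_col[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[i] for i in range(lo, hi))


if __name__ == "__main__":
    index = build_fm_index("ACGTACGTTAGC")
    print(find_exact("ACG", index))  # [0, 4]
```

Each pattern character narrows the range of matching suffixes with two table lookups, so the search cost depends on the read length rather than on the genome length.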
Bowtie2-t is a version of the Bowtie 2 algorithm configured to map reads to the transcripts defined in the GTF file. After starting the pipeline, click the “Bowtie2-t” option; a pop-up with an “OK” button will appear. Clicking “OK” will complete the mapping.
Mapping on Transcripts
Modules of this stage create expression profiles across the detected isoforms, specifying the number of expressed copies for each isoform separately. These modules work with alignment files (SAM, or BAM for Cuffmerge) and lists of isoforms (formatted as GTF). To quantify an isoform’s expression, the original mapping of reads to the genome can be used or, better, the reads can be mapped again to the reverse-engineered isoforms. After read mapping, the task is to deconvolute the expression counts of each exon of a gene by assigning sub-counts to the different isoforms of that gene, as depicted in the accompanying figure.
Several TauService modules (rQuant, Per-Pos) perform this deconvolution via regression analysis of the exons’ count profiles. A probabilistic, model-based approach is used in the Cufflinks/Cuffmerge/Cuffdiff and RSEM modules implemented in T-BioInfo. The RSEM probabilistic model (depicted in the accompanying figure) infers a putative “heritage” for each read, that is, the isoform with which the read could be associated. Separating all reads mapped to the exons of a gene in this way allows the expression level of each isoform to be estimated.
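The idea of inferring a “heritage” for each read can be sketched with a small expectation-maximization (EM) loop in Python. This is a deliberately simplified illustration, not the RSEM model itself: it ignores isoform lengths, read positions, and sequencing errors, and the function name em_isoform_abundance and the input layout are hypothetical. Each read is represented only by the set of isoforms it is compatible with.

```python
def em_isoform_abundance(read_compat, isoforms, n_iter=200):
    """Estimate the fraction of reads originating from each isoform.

    read_compat: list of collections; each holds the isoforms a read maps to.
    isoforms:    list of all isoform names for the gene.
    """
    # Start from a uniform guess.
    theta = {iso: 1.0 / len(isoforms) for iso in isoforms}
    for _ in range(n_iter):
        counts = {iso: 0.0 for iso in isoforms}
        # E-step: split each read among its compatible isoforms
        # in proportion to the current abundance estimates.
        for compat in read_compat:
            total = sum(theta[iso] for iso in compat)
            if total == 0:
                continue  # read cannot be explained by current estimates
            for iso in compat:
                counts[iso] += theta[iso] / total
        # M-step: new abundances are the normalized fractional counts.
        n = sum(counts.values())
        if n == 0:
            break  # no assignable reads; keep the previous estimates
        theta = {iso: counts[iso] / n for iso in isoforms}
    return theta


if __name__ == "__main__":
    # Toy gene with two isoforms: 6 reads hit an exon unique to A,
    # 2 reads hit an exon unique to B, 8 reads hit a shared exon.
    reads = [("A",)] * 6 + [("B",)] * 2 + [("A", "B")] * 8
    theta = em_isoform_abundance(reads, ["A", "B"])
    print({iso: round(value, 3) for iso, value in theta.items()})
```

In this toy case the reads from the unique exons fix the ratio at 3:1, so the shared reads are split accordingly and the estimates converge toward 0.75 for A and 0.25 for B.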
Bowtie2-G is used to map the reads to the genome, and Bowtie2-t is used to map them to the transcriptome.
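Outside the platform, a transcriptome mapping of this kind could be approximated with the standard Bowtie 2 command-line tools. The Python sketch below only assembles the commands. How T-BioInfo configures Bowtie2-t internally is not described here, so the helper names, file names, and the assumption that transcript sequences have already been extracted into a FASTA file from the genome and GTF are all hypothetical.

```python
def bowtie2_build_cmd(transcripts_fasta, index_prefix):
    """Command that builds a Bowtie 2 index from transcript sequences."""
    return ["bowtie2-build", transcripts_fasta, index_prefix]


def bowtie2_align_cmd(index_prefix, out_sam, reads, mates=None,
                      threads=1, local=False):
    """Command that aligns reads to the index and writes SAM output.

    reads: FASTQ file (single-end, or mate 1 when mates is given).
    mates: optional FASTQ file with mate 2 for paired-end data.
    """
    cmd = ["bowtie2", "-p", str(threads), "-x", index_prefix]
    if local:
        cmd.append("--local")          # local instead of end-to-end alignment
    if mates is None:
        cmd += ["-U", reads]           # unpaired reads
    else:
        cmd += ["-1", reads, "-2", mates]  # paired-end reads
    cmd += ["-S", out_sam]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(bowtie2_build_cmd("transcripts.fa", "transcripts_idx")))
    print(" ".join(bowtie2_align_cmd("transcripts_idx", "mapped.sam",
                                     "sample_R1.fastq", "sample_R2.fastq",
                                     threads=4)))
```

On a machine where Bowtie 2 is installed, each returned list can be passed to subprocess.run(cmd, check=True) to execute the step.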
Lesson explaining the Bowtie 2 algorithm for RNA-Seq studies: https://learn.omicslogic.com/Learn/course-5-transcriptomics/lesson/04-t1-analysis-of-raw-rna-seq-data-logical-steps