
CellLines: Workshop Files and References
Machine Learning for Transcriptomics Data Workshop. A subset from the "Modeling Precision Medicine Treatment Selection for Patients Based on Multi-Omics Biomarkers" series. For thi...

Data Mining / Post Processing
To use the Data Mining analysis section, the user needs one or more properly formatted tables. It can operate with genomic, transcriptomic, proteomic or metabolomic data that ...

ChIP-Seq Analysis
ChIP-Seq, or chromatin immunoprecipitation sequencing, is a technique for analyzing protein-DNA interactions using data generated by next-generation sequencing technologies or ...

T-BioInfo Interface Overview
When you open the T-BioInfo platform, you will see a list of data types that correspond to different sections of the platform. As an example, we can choose the RNA-seq pipeline by clicking on the ...

RNA-Seq Practical: From NGS expression table to statistical analysis on T-BioInfo Server
RNA-Seq (RNA sequencing) is a sequencing technique which uses NGS (next-generation sequencing) to reveal the presence and quantity of RNA in a biological sample at a given moment, analyzing the con...

What is the fastest and simplest pipeline for RNA-seq?
An RNA-seq study involves many steps before biological insights can be interpreted, but for a simple analysis, the algorithm Bowtie 2-t can be used, which is an ultrafast and memory-efficient tool...

Results from the RNA-seq Pipeline
The real research and struggle begin when the pipeline is complete. The results obtained from the pipeline must be processed, normalized, and then analyzed to yield biological insights. It is...

RNA-Seq on T-BioInfo
RNA-Seq is a technique that performs analysis of transcriptome data generated by next-generation sequencing technologies or by microarrays. Success in analysis of the transcriptome is largely depen...

Basic PCA: making a scatterplot of Principal Component Analysis results in Excel
After we run a pipeline to process raw reads from the FASTQ file, we can study the gene expression table. Working with the gene expression table includes understanding our column and row names, as ...

Using PCA Draw - a different PCA on T-BioInfo
PCA identifies linear combinations of genes such that each combination (called a Principal Component) explains the maximum variance. It's often used to make data easy to explore and visualize. We l...

Quantile Normalization for Gene Expression (RNA-seq) on T-BioInfo
Quantile normalization and other normalization procedures transform the distributions of different samples so that they share the same distribution. Quantile normalization is ...
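As an illustration of the idea in this excerpt, here is a minimal sketch of quantile normalization in plain Python. It is not the T-BioInfo implementation; the function name `quantile_normalize` and the layout of the input (one list of expression values per sample, genes in the same order) are assumptions made for this example. Ties are broken by order of appearance rather than averaged, which is a simplification.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples (hypothetical helper).

    samples: list of lists, one list per sample, all of equal length,
             with genes in the same order in every sample.
    Returns a new list of lists in which every sample has the same
    set of values (the per-rank means), placed according to each
    gene's rank within its own sample.
    """
    n_genes = len(samples[0])
    n_samples = len(samples)

    # Step 1: sort each sample independently.
    sorted_samples = [sorted(sample) for sample in samples]

    # Step 2: average the values found at each rank across samples.
    rank_means = [
        sum(s[rank] for s in sorted_samples) / n_samples
        for rank in range(n_genes)
    ]

    # Step 3: give each gene the mean that belongs to its rank.
    normalized = []
    for sample in samples:
        # Gene indices ordered from lowest to highest expression.
        order = sorted(range(n_genes), key=lambda i: sample[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After normalization, sorting any sample gives the same list of values, which is what "having the same distribution" means here.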

Statistical analysis: T-Test in Excel to find the differences between two groups
The way we operate with data has been changed by computers to the point that they perform certain calculations we often do not completely understand. For example, the p-value that is often used to ...

Differential Gene Expression Analysis & Biological Annotation Pipeline
There are a number of methods/algorithms that can be applied to identify the significant genes from RNA expression data. Depending on whether data is normalized or not, these methods can be a...

Annotation & Pathway Analysis Pipeline
From the DESeq2 and GSEA analysis modules, we can get important information about the biological implications of the results. However, if you simply perform a t-test or use edgeR, you will not get information rega...

Short tutorial on using the T-BioInfo platform to run Gene Set Enrichment Analysis
Gene set enrichment analysis is a method to identify classes of genes or proteins that are over-represented in a large set of genes or proteins, and may have an association with disease phenotypes....

Factor Regression Analysis
When multiple factors are affecting gene expression in your project, you can utilize a regression-based method of finding the relationship between expression values and levels of factors. Regressio...

Principal Component Analysis (PCA): Tutorial
Let's begin our journey into the cell line data with a review of Principal Component Analysis (PCA). PCA is a statistical approach from linear algebra that uses a matrix of covariance to find an ef...
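To make the covariance-matrix idea concrete, here is a small sketch in plain Python that finds the first principal component by power iteration. This is only one of several ways to compute PCA and is not taken from the tutorial; the function names, the row-per-sample input layout, and the choice of power iteration are assumptions made for this example.

```python
import math


def covariance_matrix(rows):
    """Return (covariance matrix, centered rows) for a list of samples.

    rows: list of samples, each a list of feature values
          (for example, expression values of a few genes).
    """
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return cov, centered


def first_principal_component(rows, iterations=200):
    """Approximate the first principal component by power iteration.

    Returns (direction, scores): a unit-length direction vector and
    the projection of every centered sample onto that direction.
    The start vector is all ones; if it happened to be orthogonal to
    the true component, this simple sketch would not converge to it.
    """
    cov, centered = covariance_matrix(rows)
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        # Multiply the covariance matrix by the current vector.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores
```

For samples that lie exactly on a line, such as `[[1, 2], [2, 4], [3, 6], [4, 8]]`, the returned direction points along that line, and the scores give each sample's position along it.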

Transforming the FASTQ files into structured data
Before we can use the methods for advanced analysis, the data found in this collection has to be prepared. The main reason is that each one of these samples can be downloaded as a colle...

Unsupervised Machine Learning (Hierarchical Clustering)
Complex patterns in large datasets are hard to find manually. These types of data show non-linear dependencies and contain noise that makes it hard to find statistically significant differences. Th...

Unsupervised Machine Learning (K-Means Clustering)
Another conventional clustering method is called k-means. In this clustering method, we take a number of clusters k as an input parameter, then randomly select k initial “centroids” in ...
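The procedure described above can be sketched in plain Python as follows. This is a minimal illustration of the standard k-means loop (assign each point to its nearest centroid, then recompute the centroids), not the platform's implementation. The function names, the optional `init` argument, and the `seed` parameter are assumptions added for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, init=None, max_iter=100, seed=0):
    """Cluster points into k groups (hypothetical helper).

    points: list of equal-length tuples of numbers.
    init:   optional list of k starting centroids; if omitted,
            k of the input points are picked at random.
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    if init is not None:
        centroids = [tuple(c) for c in init]
    else:
        centroids = [tuple(p) for p in rng.sample(points, k)]

    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids
```

Because the result depends on the starting centroids, k-means can settle into a poor local solution; running it several times with different seeds and keeping the best result is a common precaution.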

Supervised Machine Learning (Decision Tree and Random Forest)
Supervised machine learning uses algorithms that take in labeled data, typically prepared by people who annotate the dataset. In biomedical projects, the annotation could be...

Supervised Machine Learning (Support Vector Machine (SVM))
Many times, it is not possible to have any linear discrimination and finding a quadratic function to delineate groups is practically impossible, which reduces prediction accuracy. In those cases, w...

Supervised Machine Learning: Feature Selection
Feature Selection Methods: swLDA and RF. Feature selection starts with testing all individual features (i.e., genes) and selects the one that provides the best classification quality (for the train...

Genomic Variation in NGS Data: Practical
What are genomic variations? Genomic variation explains some of the differences among people, such as eye color and blood group, as well as whether a person has a higher or lower risk for gettin...

Phylogenetic Analysis
What is phylogenetic analysis? Phylogenetic analysis is the study of the evolutionary development of a species or a group of organisms or a particular characteristic of an organism. It is importa...

Mutability Analysis & Interpretation
What is mutability? In simple terms, mutability describes the observed rate at which a given position changes (or mutates). Frequency of change is calculated by a job called “Mutation-Call-Binom9...

Differential Mutation Analysis
Differential mutation analysis is a framework that uncovers cancer genes by comparing the mutational profiles of genes across cancer genomes with their natural germl...

Analysis of 16S “Amplicon” Data using DADA2 Pipeline
The gut microbiome contains tens of trillions of microorganisms including at least 1000 different species of known bacteria. The mammalian gut microbiome has co-evolved with its hosts for hundreds ...