Projects and Tutorials

Data Mining & Machine Learning
Document Video

Unsupervised Machine Learning (Hierarchical Clustering)

User Ratings :

Complex patterns in large datasets are hard to find manually. These types of data show non-linear dependencies and contain noise that makes it hard to find statistically significant differences. That’s why we turn to automated learning, also known as “machine learning”. This is when we train an algorithm, or a “machine” to perform tasks with data, looking for patterns. The T-BioInfo platform has several types of clustering methods: P-clustering for big data, hierarchical clustering (H-Clust), K-means, Specc and DBScan. 

 The basic, bottom-up hierarchical clustering algorithm is the H-clust module. As in the other methods we have been using in the course, navigate to the Data Mining Unsupervised Analysis area. 

To learn more about the various clustering algorithms, the outputs they generate and the analysis, visit: https://learn.omicslogic.com/Learn/course-5-transcriptomics/lesson/13-t3-unsupervised-machine-learning-clustering

 

In this course, we will explore hierarchical and K-means clustering, both of which are widely used for analysis of RNA-seq data. After the completion of the pipeline, you would obtain a dendrogram in which you would be able to look at the clusters, distances between clusters and the order of samples