Before we can apply advanced analysis methods, the data in this collection has to be prepared. Each sample is downloaded as a collection of reads that were sequenced from random positions on the mRNA fragments extracted from these cell lines. We cannot analyze these reads directly. Instead, we have to transform each sample into a set of elements (genes), each with a numeric value for its expression in that sample. In other words, we need to convert the raw “read” data into structured data, i.e. prepare a table of expression.
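To make the idea of a table of expression concrete, here is a minimal sketch in Python. It assumes each read has already been assigned to a gene, which in a real pipeline happens after mapping the reads. The sample names, gene names, and the `build_expression_table` helper are hypothetical and serve only as illustration; this is not the pipeline used on the platform.

```python
from collections import Counter

# Hypothetical toy input: for each sample, the gene that each read was
# assigned to. Real pipelines derive this from mapped reads.
read_assignments = {
    "sample_A": ["GENE1", "GENE2", "GENE1", "GENE3", "GENE1"],
    "sample_B": ["GENE2", "GENE2", "GENE3"],
}


def build_expression_table(assignments):
    """Count reads per gene in every sample.

    Returns (genes, samples, table), where table[gene][sample] is the
    number of reads assigned to that gene in that sample.
    """
    counts = {sample: Counter(reads) for sample, reads in assignments.items()}
    samples = sorted(counts)
    genes = sorted({gene for c in counts.values() for gene in c})
    # A Counter returns 0 for a gene that was never seen in a sample.
    table = {g: {s: counts[s][g] for s in samples} for g in genes}
    return genes, samples, table


genes, samples, table = build_expression_table(read_assignments)

# Print the table with genes as rows and samples as columns.
print("gene\t" + "\t".join(samples))
for gene in genes:
    print(gene + "\t" + "\t".join(str(table[gene][s]) for s in samples))
```

The values here are raw read counts. Because samples differ in the total number of reads, such counts would still need to be normalized before samples can be compared.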
To see the processing pipeline and review an explanation of each step, you can run a hands-on demo analysis of this data by visiting the T-BioInfo platform and selecting DEMO: RNA-seq (cell line project) under RNA-Seq/chip parallel analysis of NGS and microarray data. Learn more about cell line data and the need for normalization in the “Cell line Data and Preparation” lesson under the Transcriptomics course on the OmicsLogic Learn portal.
To access the courses, register on :