Genomic Variation in NGS Data: Practical

What Are Genomic Variations?

Genomic variation explains some of the differences among people, such as eye color and blood group, as well as whether a person has a higher or lower risk of developing particular diseases. Differences in DNA sequence among individuals are called genetic variation. Genetic variations in the human genome can take many forms, including:

  • Single nucleotide changes or substitutions
  • Tandem repeats
  • Insertions and deletions (indels)
  • Additions 
  • Deletions 
  • Copy number variations (CNVs)
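
As a rough illustration of how the simpler categories above can be told apart, the sketch below classifies a variant from its reference and alternate alleles, in the style of the REF and ALT columns of a VCF file. The function name classify_variant and the example alleles are hypothetical and are not part of the tutorial pipeline. Tandem repeats and CNVs typically need other evidence (such as repeat counts or read depth) and are not covered here.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant from VCF-style REF and ALT allele strings."""
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"  # single nucleotide substitution
    if len(ref) < len(alt):
        return "insertion"  # ALT carries extra bases
    if len(ref) > len(alt):
        return "deletion"  # ALT is missing bases
    return "multi-nucleotide substitution"  # same length, several bases differ


# Hypothetical example alleles, for illustration only.
examples = [("A", "G"), ("A", "ATT"), ("CTG", "C"), ("AC", "GT")]
for ref, alt in examples:
    print(ref, alt, classify_variant(ref, alt))
# A G SNV
# A ATT insertion
# CTG C deletion
# AC GT multi-nucleotide substitution
```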

 

The task of identifying possible variations in genome or transcriptome sequences with respect to a chosen reference sequence is referred to as “Variant Calling”. In germline variant calling, the reference is the standard reference sequence for the species of interest. In somatic variant calling, the reference is the genome of a chosen control somatic cell sample. In this tutorial, we will analyze Next-Generation Sequencing (NGS) data collected from two individuals with primary ductal carcinoma and perform a somatic variant calling analysis.
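
To make the difference between the two kinds of reference concrete, here is a minimal toy sketch. It assumes the sequences are short, already aligned, and of equal length, so only single-base substitutions can be detected. Real variant callers work from aligned reads and statistical models, so this is an illustration of the idea rather than the method used on the platform. The function call_snvs and all three sequences are hypothetical.

```python
from typing import List, Tuple


def call_snvs(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Toy assumption: both sequences are pre-aligned and have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("toy example assumes equal-length, pre-aligned sequences")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Hypothetical sequences, for illustration only.
species_reference = "ACGTACGTAC"
normal_sample = "ACGTACCTAC"  # differs from the species reference at position 7
tumor_sample = "ACGAACCTAC"   # carries the same change plus one at position 4

# Germline-style call: compare the normal sample with the species reference.
germline = call_snvs(species_reference, normal_sample)

# Somatic-style call: compare the tumor with its matched normal control.
somatic = call_snvs(normal_sample, tumor_sample)

print(germline)  # [(7, 'G', 'C')]
print(somatic)   # [(4, 'T', 'A')]
```

Using the matched normal sample as the reference means the inherited change at position 7 drops out, leaving only the change that is specific to the tumor.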

 

The data files we will use in this tutorial were taken from the Sequence Read Archive (SRA), a repository for biological sequence and alignment data. For each cell line, data are available as paired-end reads, a common format for DNA sequence data. During genomic sequencing, DNA molecules are fragmented and the fragments are sequenced individually; in paired-end sequencing, each fragment is read from both ends, producing two reads per fragment.
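
Paired-end reads are commonly stored as two FASTQ files, one per read direction, with the mates of each fragment appearing in the same order in both files. The sketch below walks two such files in lockstep and checks that the read names agree. It assumes simple four-line FASTQ records and read names that may end in /1 and /2; the function names and the in-memory example data are hypothetical.

```python
import io
from itertools import zip_longest


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return  # end of file (a blank line also ends this toy parser)
        sequence = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip()
        yield header[1:].split()[0], sequence, quality


def base_name(name):
    """Strip a trailing /1 or /2 mate suffix, if present."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def read_pairs(handle_r1, handle_r2):
    """Yield (fragment name, forward sequence, reverse sequence) per read pair."""
    for rec1, rec2 in zip_longest(read_fastq(handle_r1), read_fastq(handle_r2)):
        if rec1 is None or rec2 is None:
            raise ValueError("the two files contain different numbers of reads")
        if base_name(rec1[0]) != base_name(rec2[0]):
            raise ValueError("read names do not match: %s vs %s" % (rec1[0], rec2[0]))
        yield base_name(rec1[0]), rec1[1], rec2[1]


# Hypothetical in-memory FASTQ content standing in for two files on disk.
r1_text = "@read1/1\nACGT\n+\nIIII\n@read2/1\nGGCC\n+\nIIII\n"
r2_text = "@read1/2\nTTGA\n+\nIIII\n@read2/2\nAATT\n+\nIIII\n"

pairs = list(read_pairs(io.StringIO(r1_text), io.StringIO(r2_text)))
for name, forward, reverse in pairs:
    print(name, forward, reverse)
# read1 ACGT TTGA
# read2 GGCC AATT
```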

For stepwise instructions on running the pipeline on the T-Bioinfo Server, and for the theory behind the algorithms used, visit Lesson 5: Genomics on the OmicsLogic Learn Portal:

https://learn.omicslogic.com/Learn/course-3-genomics/lesson/05-genomic-variation-in-ngs-data-practical

In this lesson, we will create a variant calling pipeline on the T-BioInfo platform to process paired-end read data from each pair of matched tumor-normal cell lines and identify variants between the cancerous and noncancerous cell genomes.


Reference publication: https://www.nature.com/articles/jhg201055