0. Installation

Please install R and RStudio. Installing RStudio requires administrator rights, whereas R can be installed in your home directory.

1.1. RNA-seq Data Generation

Short dictionary

A simple 10-minute explanation of TPM / FPKM is here.
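As a rough illustration of how the two normalizations differ, here is a minimal sketch in plain Python. The gene names, read counts and gene lengths are made-up toy values, and the function names `fpkm` and `tpm` are chosen only for this example. FPKM divides each count by the gene length (in kilobases) and by the library size (in millions of fragments); TPM first divides by gene length and then rescales so that the values sum to one million.

```python
# Toy data (hypothetical values, for illustration only)
counts = {"geneA": 100, "geneB": 200, "geneC": 700}         # raw fragment counts
lengths_bp = {"geneA": 1000, "geneB": 4000, "geneC": 7000}  # gene lengths in base pairs


def fpkm(counts, lengths_bp):
    """Fragments per kilobase of gene per million mapped fragments."""
    total = sum(counts.values())
    # count / (length in kb * total in millions) = count * 1e9 / (length in bp * total)
    return {g: counts[g] * 1e9 / (lengths_bp[g] * total) for g in counts}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to a sum of 1e6."""
    rate = {g: counts[g] / lengths_bp[g] for g in counts}
    rate_sum = sum(rate.values())
    return {g: rate[g] / rate_sum * 1e6 for g in counts}


fpkm_values = fpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

for gene in counts:
    print(gene, round(fpkm_values[gene]), round(tpm_values[gene]))
```

In this toy case geneA and geneC get the same normalized value even though their raw counts differ sevenfold, because geneC is seven times longer. TPM values always sum to one million within a sample, which is not guaranteed for FPKM.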

1.2. File Formats

See information about sequencing file formats at NGS Analysis.

1.3. Sequence-based QC: FastQC

FastQC is a simple but widely used Java-based tool for quality control of experiments at the sequence level. It provides a modular set of analyses that give a quick impression of whether your data have any problems you should be aware of before doing any further analysis.

See Examples.

You can combine FastQC results into an interactive dashboard using MultiQC. A short introduction is given in this video.
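To give an idea of what a sequence-level check looks at, the sketch below computes the mean base quality at each read position, which is the kind of summary shown in a per-base quality plot. This is not how FastQC is implemented; it is a minimal plain-Python illustration that assumes 4-line FASTQ records with Phred+33 quality encoding. The two reads and the function name `mean_quality_per_position` are made up for the example.

```python
import io

# Two hypothetical reads in FASTQ format (4 lines per read, Phred+33 qualities)
FASTQ_TEXT = """@read1
ACGT
+
IIII
@read2
ACGT
+
II5!
"""


def mean_quality_per_position(handle):
    """Return the mean Phred quality for each read position."""
    sums, n_reads = [], []
    for i, line in enumerate(handle):
        if i % 4 != 3:  # only every 4th line holds the quality string
            continue
        quals = [ord(ch) - 33 for ch in line.rstrip("\n")]  # decode Phred+33
        for pos, q in enumerate(quals):
            if pos == len(sums):  # first read that reaches this position
                sums.append(0)
                n_reads.append(0)
            sums[pos] += q
            n_reads[pos] += 1
    return [s / n for s, n in zip(sums, n_reads)]


means = mean_quality_per_position(io.StringIO(FASTQ_TEXT))
print(means)  # [40.0, 40.0, 30.0, 20.0]
```

The falling quality towards the end of the reads in this toy input mimics a pattern often seen in real data. For real files, use FastQC itself rather than a hand-written script.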

1.5. Data Repositories

Take-home messages 1

  1. RNA-seq data can be used as raw counts or as normalized values (TPM, FPKM). Check which one a specific algorithm requires!

  2. For QC of your samples at the sequence level, use FastQC. To combine the results, use MultiQC.

  3. Expression-related data in transcriptomics are strongly right-skewed. Therefore:
  • for statistics, either model the counts with an appropriate distribution (negative binomial for RNA-seq) or work with log-transformed data
  • use log-transformed data for exploratory analysis and visualization

  4. Several large data repositories exist. Before planning your experiments, search for existing data.
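The point about right-skewed expression data can be illustrated with a small sketch. The expression values are made up, and the pseudocount of 1 in `log2(x + 1)` is a common convention assumed here, not something prescribed above. For the raw values, one highly expressed gene pulls the mean far above the median; after the log transformation the two are close.

```python
import math
import statistics

# Hypothetical expression values for seven genes in one sample (toy numbers)
raw = [0, 1, 3, 7, 15, 31, 1023]

# log2 transformation with a pseudocount of 1 so that zeros stay defined
logged = [math.log2(x + 1) for x in raw]

raw_mean, raw_median = statistics.mean(raw), statistics.median(raw)
log_mean, log_median = statistics.mean(logged), statistics.median(logged)

print("raw:    mean = %.2f, median = %.2f" % (raw_mean, raw_median))
print("logged: mean = %.2f, median = %.2f" % (log_mean, log_median))
```

A mean much larger than the median is a simple sign of right skew. On the log scale the values are spread more evenly, which is why log-transformed data are easier to plot and explore.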

By Petr Nazarov