RNA-seq Data Analysis Course

From Raw Reads to Biological Insights — A Hands-on Training Program

Dates: 5-9 April 2026 | KAUST | Building 9 | Room 2120

Learning Outcomes

📚
RNA-seq Principles

Understand the biological and technical principles of RNA sequencing, library preparation, and experimental design

💻
Read Processing & QC

Perform quality control, trimming, and pre-processing of raw RNA-seq reads using industry-standard tools

🧬
Alignment & Quantification

Align reads to reference genomes and transcriptomes; quantify gene and transcript expression

📄
Differential Expression

Identify differentially expressed genes using DESeq2 and edgeR and interpret statistical results

🌟
Visualization & Reporting

Produce publication-quality plots, heatmaps, volcano plots, and comprehensive analysis reports

🎓
Functional Analysis

Perform Gene Ontology enrichment and pathway analysis to translate lists of differentially expressed genes (DEGs) into biological insight

🔨
Production Workflows with nf-core

Run a fully reproducible, scalable RNA-seq analysis of real datasets on an HPC cluster using the nf-core/rnaseq pipeline

Program Overview

Module 1

Biological Background

RNA-seq overview, applications, library preparation, and protocols. Experimental design: replication, batch effects, and statistical power. Sequencing technologies: Illumina, Nanopore, and PacBio SMRT.

Module 2

Computational Overview & Data Access

Linux command-line refresher and Ibex HPC orientation. Genomic file formats (FASTA, FASTQ, BAM, GTF). Retrieving reference genomes and public RNA-seq datasets from SRA using the SRA Toolkit and the NCBI Datasets command-line tool.
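As a small illustration of the FASTQ format mentioned above, the following Python sketch reads four-line FASTQ records and computes a mean base quality. It assumes uncompressed, unwrapped records and Phred+33 quality encoding; the function names and the example reads are hypothetical and are not taken from the course material.

```python
import io
from statistics import mean


def read_fastq(handle):
    """Yield (read_id, sequence, quality) tuples from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in: {header!r}")
        yield header[1:], sequence, quality


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    return mean(ord(char) - offset for char in quality)


# Two made-up reads, for illustration only.
example = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n"
records = list(read_fastq(io.StringIO(example)))
for read_id, sequence, quality in records:
    print(read_id, sequence, mean_phred(quality))
```

For real data, dedicated tools such as FastQC and fastp (used in Module 3) are the appropriate choice; the sketch only shows how the record layout and quality encoding fit together.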

Module 3

QC & Preprocessing

Raw data quality evaluation with FastQC, adapter trimming and quality filtering with fastp, ribosomal and contaminant read removal, and aggregated QC reporting with MultiQC.

Module 4

Read Alignment

Splice-aware alignment to GRCh38 with STAR, BAM post-processing with samtools and Picard, alignment QC with RSeQC and Qualimap, and visualization in IGV.

Module 5

Quantification

Transcript quantification with Salmon, post-alignment QC with RSeQC and dupRadar, and exploratory analysis with sample-level QC (PCA, sample-distance heatmap) in Python and R.
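Quantification tools such as Salmon report abundance in transcripts per million (TPM). The sketch below shows the standard TPM calculation from estimated counts and effective transcript lengths. The function name and the example numbers are hypothetical, and the code illustrates the formula only, not Salmon's internal implementation.

```python
def tpm(counts, effective_lengths):
    """Convert per-transcript counts to TPM given effective lengths (in bases)."""
    if len(counts) != len(effective_lengths):
        raise ValueError("counts and effective_lengths must have the same length")
    if any(length <= 0 for length in effective_lengths):
        raise ValueError("effective lengths must be positive")
    # Length-normalized rate for each transcript.
    rates = [count / length for count, length in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Rescale so that the values sum to one million.
    return [rate / total * 1e6 for rate in rates]


# Made-up counts and lengths for three transcripts.
values = tpm([10, 20, 30], [1000, 2000, 500])
print(values)
```

Because TPM values always sum to one million within a sample, they are convenient for comparing transcripts within that sample but are not a substitute for the count-based normalization used in differential expression testing.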

Module 6

Standardized Analysis I: nf-core/rnaseq

Introduction to Nextflow and nf-core. Configure and run the nf-core/rnaseq pipeline on Ibex with the KAUST institutional profile, covering samplesheet setup, key parameters, and interpretation of the MultiQC output.
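The samplesheet setup step can be sketched in Python as follows. The column names (sample, fastq_1, fastq_2, strandedness) follow the nf-core/rnaseq samplesheet format; the sample names, file paths, and the helper function are hypothetical placeholders, so check the pipeline documentation for the version you run.

```python
import csv
import io


def build_samplesheet(samples, strandedness="auto"):
    """Return samplesheet CSV text for (sample, fastq_1, fastq_2) tuples.

    Use None as fastq_2 for single-end libraries.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sample", "fastq_1", "fastq_2", "strandedness"])
    for name, fastq_1, fastq_2 in samples:
        writer.writerow([name, fastq_1, fastq_2 or "", strandedness])
    return buffer.getvalue()


# Hypothetical samples and paths, for illustration only.
sheet = build_samplesheet([
    ("ctrl_rep1", "/path/ctrl_rep1_R1.fastq.gz", "/path/ctrl_rep1_R2.fastq.gz"),
    ("treat_rep1", "/path/treat_rep1_R1.fastq.gz", None),
])
print(sheet)
```

Generating the samplesheet from a script rather than by hand makes it easier to keep sample names consistent with the metadata used later for differential expression.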

Module 7

Differential Expression Analysis

Normalization strategies, statistical testing, experimental design, and contrasts. Differential expression analysis in R using DESeq2, with PCA, volcano plots, MA plots, and heatmaps produced with ggplot2 and pheatmap.
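To make the normalization idea concrete, here is a minimal Python sketch of median-of-ratios size factors, the approach that DESeq2 uses by default. It is a simplified illustration on a made-up count matrix, not a replacement for DESeq2; the function name and data are hypothetical, and genes with a zero count in any sample are simply skipped.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one value per sample).
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if any(count <= 0 for count in gene):
            continue  # the geometric mean is undefined when a count is zero
        log_geo_mean = sum(math.log(count) for count in gene) / n_samples
        for index, count in enumerate(gene):
            log_ratios[index].append(math.log(count) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")
    # Each size factor is the median ratio of a sample to the per-gene geometric mean.
    return [math.exp(median(ratios)) for ratios in log_ratios]


# Made-up matrix: sample 2 was sequenced about twice as deeply as sample 1.
matrix = [[10, 20], [100, 200], [5, 10], [0, 4]]
factors = size_factors(matrix)
normalized = [[count / factor for count, factor in zip(gene, factors)] for gene in matrix]
print(factors)
```

Dividing each sample's counts by its size factor puts samples on a common scale, so the first three genes in this example end up with equal normalized counts in both samples.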

Module 8

Functional Enrichment

Gene Ontology enrichment with clusterProfiler, KEGG pathway analysis, and Gene Set Enrichment Analysis (GSEA) with fgsea, including interpretation of enrichment results and production of publication-quality dot plots.
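Over-representation analysis of a gene set is commonly based on a one-sided hypergeometric test. The sketch below computes that p-value with the Python standard library. The function name, parameter names, and numbers are hypothetical, and the code omits the multiple-testing correction that real enrichment tools apply across gene sets.

```python
from math import comb


def enrichment_pvalue(n_universe, n_in_set, n_selected, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric distribution.

    n_universe: background genes
    n_in_set:   background genes annotated to the gene set
    n_selected: differentially expressed genes
    n_overlap:  differentially expressed genes annotated to the gene set
    """
    upper = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_set, i) * comb(n_universe - n_in_set, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up example: 10 background genes, 3 in the set, 4 selected, 2 overlapping.
p_value = enrichment_pvalue(10, 3, 4, 2)
print(p_value)
```

The choice of background (universe) strongly affects the result; a common recommendation is to use the genes that were actually tested for differential expression rather than the whole genome.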

Module 9

Standardized Analysis II: nf-core/differentialabundance

End-to-end automated differential expression analysis with nf-core/differentialabundance, from count matrices to an interactive Shiny report, covering contrast configuration and output interpretation.

Module 10

Real-world Analysis Capstone

End-to-end analysis of GSE136366 using nf-core/rnaseq and nf-core/differentialabundance. Interpret MultiQC reports, perform differential expression analysis, run enrichment analysis, and present results to the group.


Practical Information

Schedule

Prerequisites

Tools Used

FastQC, fastp, MultiQC, STAR, samtools, Picard, Salmon, DESeq2, edgeR, tximport, clusterProfiler, fgsea, pheatmap, ggplot2, R / Bioconductor, Jupyter, IGV, RSeQC, Nextflow, nf-core/rnaseq, nf-core/differentialabundance, nf-core tools, Singularity

Relevant Resources