Hands-on Labs

This program includes hands-on labs that guide you through a complete RNA-seq analysis workflow. Each lab builds on the previous ones, using a consistent dataset throughout.

Dataset: TDP-43 Knockout Study (GSE136366)

Throughout these labs, we analyze RNA-seq data comparing TDP-43 knockout (KO) vs wildtype rescue (WT) in human HeLa cells. TDP-43 is an RNA-binding protein implicated in ALS and frontotemporal dementia.

Samples: 3 KO + 3 WT replicates (6 total)
Sequencing: Paired-end Illumina HiSeq 2500, 70 bp reads
Workshop subset: Chromosome 11 only (faster processing)

Day 1: Foundations

Lab 1

Warm-up for Large-scale Analysis & Genomic Datasets

Familiarize yourself with the training environment, access public genomics resources, and download example RNA-seq datasets from NCBI SRA.

Tools and resources: SRA Toolkit, wget, awk, NCBI, Ensembl

Lab 2

Hands-on with Transcriptomics Data

Explore RNA-seq datasets and sequencing reads. Learn to navigate and understand FASTQ file structure and content.

Tools: SeqKit, gzip, awk
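
To make the FASTQ structure concrete before you open real files: each read occupies four lines (a header starting with `@`, the sequence, a separator starting with `+`, and a quality string the same length as the sequence). The following is a minimal Python sketch of a reader for that layout, not the lab's own code. The read names and sequences in `EXAMPLE` are made up for illustration.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield one record per four-line FASTQ entry."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input (or a blank line) stops the reader
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# Made-up two-read example, not taken from the workshop dataset.
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nII#I\n"
records = list(read_fastq(io.StringIO(EXAMPLE)))
```

For a compressed file, the same function should work on a handle opened with `gzip.open(path, "rt")`.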

Day 2: Preprocessing Pipeline

After Lab 3, choose one of the two Lab 4 options: complete either Lab 4a or Lab 4b (not both).

Lab 3

Quality Control Hands-on

Run FastQC and fastp to inspect data quality metrics, trim reads, and remove adapters, then compare the results before and after QC.

Tools: FastQC, fastp, MultiQC
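
As background for reading quality reports, here is a small Python sketch of how a FASTQ quality string maps to Phred scores. It assumes the Phred+33 encoding used by current Illumina output. It is an illustration only and does not reproduce how FastQC or fastp compute their metrics.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a quality string to per-base Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def mean_quality(quality: str) -> float:
    """Arithmetic mean of the per-base Phred scores of one read."""
    scores = phred_scores(quality)
    return sum(scores) / len(scores)


def error_probability(q: int) -> float:
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)
```

For example, the character `I` corresponds to Q40 and `#` to Q2, so a read whose quality string ends in a run of `#` has a low-confidence tail that trimming would typically remove.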

Lab 4a

Genome Alignment (STAR + Salmon)

Align RNA-seq reads to the genome with STAR, then quantify with Salmon. Choose this if you need BAM files for visualization or downstream analysis.

Tools: STAR, Salmon, samtools, IGV

Lab 4b

Pseudo-alignment (Salmon Only)

Fast transcript quantification with Salmon pseudo-alignment. Choose this if you only need gene/transcript counts for differential expression analysis.

Tools: Salmon, tximport, R
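
Salmon quantifies transcripts, while differential expression is usually run on genes, so transcript-level estimates have to be collapsed to gene level. The Python sketch below shows only the core idea (summing per-transcript counts through a transcript-to-gene map). The lab itself uses tximport in R, which offers further options that this sketch does not attempt. All identifiers in the example are hypothetical.

```python
from collections import defaultdict
from typing import Dict


def summarize_to_gene(
    tx_counts: Dict[str, float], tx2gene: Dict[str, str]
) -> Dict[str, float]:
    """Sum transcript-level counts into gene-level counts."""
    gene_counts: Dict[str, float] = defaultdict(float)
    for transcript, count in tx_counts.items():
        gene = tx2gene.get(transcript)
        if gene is None:
            continue  # this sketch skips transcripts missing from the map
        gene_counts[gene] += count
    return dict(gene_counts)


# Hypothetical transcript and gene identifiers, for illustration only.
example_tx2gene = {"tx_A1": "gene_A", "tx_A2": "gene_A", "tx_B1": "gene_B"}
example_counts = {"tx_A1": 100.0, "tx_A2": 50.0, "tx_B1": 30.0, "tx_C1": 7.0}
example_genes = summarize_to_gene(example_counts, example_tx2gene)
```

Here `tx_C1` has no entry in the map and is dropped, which is one reason to check that the annotation used for the map matches the transcriptome used for quantification.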

Day 3: Downstream Analysis

Lab 5

Differential Expression Analysis

Run DESeq2 to identify genes differentially expressed between KO and WT.

Tools: DESeq2, R, Jupyter
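
Before comparing KO and WT, counts must be normalized for sequencing depth. DESeq2 does this with median-of-ratios size factors. The Python sketch below illustrates that idea on a toy table. It is a simplified teaching version written for this page, not DESeq2's implementation, and the gene names and counts are invented.

```python
import math
from statistics import median
from typing import Dict, List


def size_factors(counts: Dict[str, List[float]]) -> List[float]:
    """Median-of-ratios size factors; counts maps gene -> per-sample counts."""
    n_samples = len(next(iter(counts.values())))
    log_ratios: List[List[float]] = [[] for _ in range(n_samples)]
    for row in counts.values():
        if any(c <= 0 for c in row):
            continue  # a zero count makes the geometric mean zero; skip the gene
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # Each sample's factor is the median ratio to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


# Invented toy counts: sample 2 is sequenced exactly twice as deeply as sample 1.
toy_counts = {"g1": [10, 20], "g2": [100, 200], "g3": [5, 10], "g4": [0, 3]}
toy_factors = size_factors(toy_counts)
```

Dividing each sample's counts by its size factor puts the two toy samples on the same scale, so the twofold depth difference is not mistaken for differential expression.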

Lab 6

Visualization & QC Plots

Generate publication-quality plots using Python/Jupyter: PCA, volcano plots, MA plots, heatmaps.

Tools: matplotlib, seaborn, Jupyter
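
A volcano plot places each gene at (log2 fold change, -log10 adjusted p-value) and usually colors it by significance. The Python sketch below computes those coordinates and a label for one gene, which can then be passed to any plotting library. The cutoffs (|log2FC| >= 1, padj < 0.05) are common defaults assumed here for illustration, not values prescribed by the lab.

```python
import math
from typing import Tuple


def volcano_point(
    log2_fc: float,
    padj: float,
    fc_cutoff: float = 1.0,     # assumed threshold, adjust as needed
    padj_cutoff: float = 0.05,  # assumed threshold, adjust as needed
) -> Tuple[float, float, str]:
    """Return (x, y, label) for one gene on a volcano plot."""
    # Clamp very small p-values so that log10(0) is never evaluated.
    y = -math.log10(max(padj, 1e-300))
    if padj < padj_cutoff and log2_fc >= fc_cutoff:
        label = "up"
    elif padj < padj_cutoff and log2_fc <= -fc_cutoff:
        label = "down"
    else:
        label = "ns"  # not significant under the chosen cutoffs
    return log2_fc, y, label
```

Mapping the labels to colors (for example via a dictionary passed to a scatter plot) gives the familiar three-color volcano.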

Lab 7

Enrichment Analysis with g:Profiler

Perform GO term and KEGG pathway enrichment analysis on differentially expressed genes using Python and g:Profiler.

Tools: gprofiler-official, matplotlib, Jupyter
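
Over-representation analysis asks whether a gene set (a GO term or pathway) appears among your differentially expressed genes more often than expected by chance, which is commonly scored with a hypergeometric tail probability. The Python sketch below computes that uncorrected probability from four counts. It is a conceptual illustration only. It does not call g:Profiler, and it does not include the multiple-testing correction that a real enrichment tool applies.

```python
from math import comb


def enrichment_p_value(
    population: int, annotated: int, query: int, overlap: int
) -> float:
    """P(X >= overlap) when drawing `query` genes from `population`,
    of which `annotated` belong to the gene set of interest."""
    total = comb(population, query)
    upper = min(annotated, query)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, query - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

For instance, with 10 background genes of which 5 are annotated, a query of 2 genes that are both annotated gives a probability of 10/45, roughly 0.22, so such an overlap would not be surprising.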

Lab Prerequisites

Before starting the labs, make sure you have:

Completed the Environment Setup
Access to a Unix-like terminal (native Linux, macOS, or WSL on Windows)
Activated your conda environment with bioinformatics tools installed
Sufficient disk space (at least 10 GB free recommended)
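
The last two checks can be scripted. The Python sketch below uses only the standard library to test free disk space against the 10 GB recommendation and to report which command-line tools are not on your `PATH`. The executable names you pass in (for example `fastqc`, `fastp`, `salmon`) are assumptions about how the tools are installed in your environment.

```python
import shutil
from typing import Iterable, List


def has_free_space(path: str = ".", required_gb: float = 10.0) -> bool:
    """True if the filesystem holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


def missing_tools(executables: Iterable[str]) -> List[str]:
    """Return the executables that cannot be found on PATH."""
    return [name for name in executables if shutil.which(name) is None]
```

Run both functions after activating your conda environment, since the tools are only on `PATH` while the environment is active.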