Hands-on Labs
This program includes hands-on labs that guide you through a complete RNA-seq analysis workflow. Each lab builds on the previous ones, using a consistent dataset throughout.
Throughout these labs, we analyze RNA-seq data comparing TDP-43 knockout (KO) vs wildtype rescue (WT) in human HeLa cells. TDP-43 is an RNA-binding protein implicated in ALS and frontotemporal dementia.
- Samples: 3 KO + 3 WT replicates (6 total)
- Sequencing: Paired-end Illumina HiSeq 2500, 70 bp reads
- Workshop subset: Chromosome 11 only (faster processing)
Day 1: Foundations
Warm-up for Large-scale Analysis & Genomic Datasets
Familiarize yourself with the training environment, access public genomics resources, and download example RNA-seq datasets from NCBI SRA.
Hands-on with Transcriptomics Data
Explore RNA-seq datasets and sequencing reads. Learn to navigate and understand FASTQ file structure and content.
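As a rough illustration of the FASTQ structure explored in this lab, the sketch below reads four-line records (header, sequence, separator, quality) and converts a quality string to Phred scores. It assumes Phred+33 encoding, which is the usual case for current Illumina data. The names `read_fastq` and `phred_scores` and the example read are made up for illustration.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from an uncompressed FASTQ stream, four lines per read."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (a blank line also stops the reader)
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> list:
    """Convert a quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


# Hypothetical single-read example kept in memory instead of on disk.
example = io.StringIO("@read1\nACGT\n+\nII5#\n")
records = list(read_fastq(example))
print(records[0].name, phred_scores(records[0].quality))
```

For gzip-compressed files, the same function should work on a handle opened with `gzip.open(path, "rt")`.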
Day 2: Preprocessing Pipeline
Lab 4 has two options - complete either Lab 4a or Lab 4b (not both):
Quality Control Hands-on
Run FastQC and fastp to inspect data quality metrics, trim reads, remove adapters, and compare quality before and after QC.
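One possible way to script the trimming step is to assemble the fastp call as an argument list, as sketched below. The flags are standard fastp options for paired-end input, but the file names, the output naming scheme, and the thread count are assumptions for illustration, not the lab's actual settings.

```python
import shlex


def fastp_command(r1, r2, out_prefix, threads=4):
    """Build a paired-end fastp command as an argument list."""
    return [
        "fastp",
        "--in1", r1,
        "--in2", r2,
        "--out1", f"{out_prefix}_R1.trimmed.fastq.gz",
        "--out2", f"{out_prefix}_R2.trimmed.fastq.gz",
        "--detect_adapter_for_pe",  # adapter auto-detection for paired-end reads
        "--thread", str(threads),
        "--html", f"{out_prefix}.fastp.html",  # report for pre/post-QC comparison
        "--json", f"{out_prefix}.fastp.json",
    ]


# Hypothetical sample name; substitute the files used in the lab.
cmd = fastp_command("KO_1_R1.fastq.gz", "KO_1_R2.fastq.gz", "KO_1")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once fastp is installed and the input files exist; printing it first makes it easy to review before running.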
Genome Alignment (STAR + Salmon)
Align RNA-seq reads to the genome with STAR, then quantify with Salmon. Choose this if you need BAM files for visualization or downstream analysis.
Pseudo-alignment (Salmon Only)
Fast transcript quantification with Salmon pseudo-alignment. Choose this if you only need gene/transcript counts for differential expression analysis.
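To show what "gene/transcript counts" can look like in practice, the sketch below sums the `NumReads` column of a Salmon `quant.sf` table per gene. The column names match Salmon's output format; the transcript IDs, the `tx2gene` mapping, and the numbers are invented. A dedicated importer such as tximport is commonly used for this step before DESeq2, so treat this only as an outline of the idea.

```python
import csv
import io
from collections import defaultdict


def gene_counts(quant_sf, tx2gene):
    """Sum Salmon's estimated read counts (NumReads) over each gene's transcripts."""
    totals = defaultdict(float)
    for row in csv.DictReader(quant_sf, delimiter="\t"):
        gene = tx2gene.get(row["Name"])
        if gene is None:
            continue  # transcript without a known gene: skipped in this sketch
        totals[gene] += float(row["NumReads"])
    return dict(totals)


# Hypothetical quant.sf content and transcript-to-gene mapping.
example_quant = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1500\t1350.0\t10.0\t120.0\n"
    "tx2\t900\t750.0\t5.0\t30.5\n"
    "tx3\t2000\t1850.0\t1.0\t8.0\n"
)
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

counts = gene_counts(example_quant, tx2gene)
print(counts)
```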
Day 3: Downstream Analysis
Differential Expression Analysis
Run DESeq2 for differential expression analysis. Identify genes differentially expressed between KO and WT.
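Once DESeq2 results have been exported to CSV, picking out differentially expressed genes might look like the sketch below. The `log2FoldChange` and `padj` column names follow DESeq2's results table; the `gene` column, the example rows, and the cutoffs (adjusted p-value below 0.05, absolute log2 fold change of at least 1) are assumptions chosen for illustration.

```python
import csv
import io


def significant_genes(handle, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Return (gene, direction) pairs that pass both cutoffs."""
    hits = []
    for row in csv.DictReader(handle):
        # DESeq2 reports NA for genes removed by filtering; skip those rows.
        if row["padj"] in ("", "NA") or row["log2FoldChange"] in ("", "NA"):
            continue
        padj = float(row["padj"])
        lfc = float(row["log2FoldChange"])
        if padj < padj_cutoff and abs(lfc) >= lfc_cutoff:
            # A positive value means higher expression in the contrast's
            # numerator condition (assumed here to be KO).
            hits.append((row["gene"], "up" if lfc > 0 else "down"))
    return hits


# Hypothetical results table.
example_results = io.StringIO(
    "gene,baseMean,log2FoldChange,padj\n"
    "geneA,250.0,2.1,0.001\n"
    "geneB,80.0,-1.6,0.02\n"
    "geneC,40.0,0.3,0.40\n"
    "geneD,5.0,3.0,NA\n"
)
hits = significant_genes(example_results)
print(hits)
```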
Visualization & QC Plots
Generate publication-quality plots using Python/Jupyter: PCA, volcano plots, MA plots, heatmaps.
Enrichment Analysis with g:Profiler
Perform GO term and KEGG pathway enrichment analysis on differentially expressed genes using Python and g:Profiler.
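The statistic behind this kind of over-representation analysis is a hypergeometric tail probability, which can be sketched with the standard library alone. This is an illustration of the underlying test, not the g:Profiler API, and it omits the multiple-testing correction that g:Profiler applies. The function name and the gene counts are made up.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, query, overlap):
    """Probability of seeing at least `overlap` annotated genes when `query`
    genes are drawn from `background` genes, `annotated` of which carry the term."""
    upper = min(annotated, query)
    total = comb(background, query)
    tail = sum(
        comb(annotated, k) * comb(background - annotated, query - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 background genes, 5 annotated to a term,
# 4 differentially expressed genes, 3 of which carry the annotation.
p_value = hypergeom_enrichment_p(background=20, annotated=5, query=4, overlap=3)
print(f"{p_value:.4f}")
```

A small p-value suggests the term is over-represented in the gene list relative to the background. Real analyses test many terms at once, so the corrected values reported by the tool should be used rather than raw ones.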
Lab Prerequisites
Before starting the labs, make sure you have:
- Completed the Environment Setup
- Access to a Unix-like terminal (native Linux, macOS, or WSL on Windows)
- Activated your conda environment with bioinformatics tools installed
- Sufficient disk space (at least 10 GB free recommended)
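A quick self-check of the last two items might look like the sketch below, run inside the activated conda environment. The tool list is an assumption based on the programs named in the labs (`fastqc`, `fastp`, `STAR`, `salmon`), and 10 GB is counted here as 10^9 bytes per GB; adjust both to match your setup.

```python
import shutil

# Assumed executable names for the tools mentioned in the labs.
REQUIRED_TOOLS = ["fastqc", "fastp", "STAR", "salmon"]
MIN_FREE_GB = 10


def check_prerequisites(path=".", tools=REQUIRED_TOOLS, min_free_gb=MIN_FREE_GB):
    """Report free disk space at `path` and any tools not found on PATH."""
    free_gb = shutil.disk_usage(path).free / 1e9
    missing = [tool for tool in tools if shutil.which(tool) is None]
    return {
        "free_gb": round(free_gb, 1),
        "enough_disk": free_gb >= min_free_gb,
        "missing_tools": missing,
    }


report = check_prerequisites()
print(report)
```

An empty `missing_tools` list and `enough_disk` set to `True` suggest the environment is ready; otherwise, revisit the Environment Setup.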