Schedule
10/02/25: Introduction to the Course
- What is team-based learning?
- How is the course structured?
17/02/25: A genome owner’s starter pack
Read the following sections from Chapter 1.2 of the "An Owner's Guide to the Human Genome" book:
- A short history.
- The DNA molecule.
- DNA is the world’s greatest data storage device.
- The genomic encoding of biological information.
- Genes and the encoding of proteins.
- The encoding of gene regulation.
- The inheritance of genomes.
You must complete the individual Readiness Assurance Test (iRAT) in Moodle before our scheduled meeting on 17 Feb.
Assignments (link):
- Task 1: Genes, proteins, introns and exons (2 points).
- Task 2: The Human Reference Genome (1 point).
- Task 3: Using the High Performance Computing Center (2 points).
In-class work:
- Complete the Team Readiness Assurance Test in Moodle (20 minutes).
- Discussion of the answers (5-10 minutes).
- In teams, work on Task 1 from Assignment 1 (20 minutes).
- Discussion of the answers (5-10 minutes).
- Read "The Human Reference Genome" section (pages 25-27) from Chapter 1.2 and answer the questions in Task 2 from Assignment 1.
- Start working on Task 3 from Assignment 1.
03/03/25: Human genome variation and why it matters
Read the following sections from Chapter 1.3 of the "An Owner's Guide to the Human Genome" book:
- SNPs
- Genotype frequencies and the Hardy-Weinberg model.
- How many SNPs are there?
- Beyond SNPs: Other types of inherited variation.
- How do SNPs affect the information encoded in genomes?
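As a quick preview of the Hardy-Weinberg model from the reading list above, the sketch below computes the expected genotype frequencies for a biallelic SNP from one allele frequency. The function name and the example frequency of 0.3 are illustrative assumptions and are not taken from the book.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies (AA, Aa, aa) given frequency p of allele A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1.0 - p  # frequency of the other allele, a
    return p * p, 2 * p * q, q * q


# Hypothetical example: allele A has frequency 0.3 in the population.
aa_hom, het, alt_hom = hardy_weinberg(0.3)
print(f"AA: {aa_hom:.2f}, Aa: {het:.2f}, aa: {alt_hom:.2f}")
```

The three frequencies always sum to one, which is a handy sanity check when doing the calculation by hand.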
From Chapter 3.1, read the following section:
- Genetic clustering: PCA
You must complete the individual Readiness Assurance Test (iRAT) in Moodle before our scheduled meeting on 3 Mar.
In-class reading:
- Example: hemophilia in the royal families of Europe.
Optional reading:
- Chromosome inheritance errors: aneuploidy
Assignments:
- Understanding the VCF file format
- Characterising population structure using PCA (using genotype data without imputation).
10/03/25: DNA sequencing: a fundamental tool for studying biology
Read the following sections from Chapter 1.4 of the "An Owner's Guide to the Human Genome" book:
- A short history of sequencing.
- Sequencing applications in human genomics.
- Genome resequencing and polymorphism discovery.
- Low-budget approaches to studying genome variation.
Learn about the Burrows-Wheeler Transform (BWT) and the FM-index from these sources:
- TBD
Assignment:
- Construct an FM-index using pen and paper.
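To check a pen-and-paper FM-index, a small reference implementation can help. The sketch below is a naive version for short strings only: it builds the BWT by sorting all rotations, then derives the two FM-index tables (here called `c_table` and `occ`) and counts pattern occurrences by backward search. All function and variable names are illustrative assumptions, and the code assumes the text does not contain `$` and that every character sorts after `$`.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler Transform via sorted rotations; '$' marks the end."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def fm_index(bwt_string):
    """Build the C table and the Occ table from a BWT string."""
    counts = Counter(bwt_string)
    # c_table[c]: number of characters in the text strictly smaller than c
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[c][i]: number of occurrences of c in bwt_string[:i]
    occ = {ch: [0] * (len(bwt_string) + 1) for ch in counts}
    for i, ch in enumerate(bwt_string):
        for key in occ:
            occ[key][i + 1] = occ[key][i] + (1 if key == ch else 0)
    return c_table, occ


def count_matches(pattern, c_table, occ, n):
    """Count occurrences of pattern using backward search over n BWT rows."""
    lo, hi = 0, n  # half-open range of rows in the sorted rotation matrix
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("banana")
c_table, occ = fm_index(transformed)
print(transformed)
print(count_matches("ana", c_table, occ, len(transformed)))
```

For the text "banana" the transform is "annb$aa", and the pattern "ana" is found twice. Sorting all rotations is quadratic in memory, so this is only suitable for toy inputs of the kind used in the assignment.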
17/03/25: Using RNA-sequencing to measure gene expression
- Revisit the material from Chapter 1.2 to remind yourself what gene expression is and how it is regulated.
- Watch "StatQuest: A gentle introduction to RNA-seq" to learn how RNA sequencing works.
Assignment:
- Perform RNA-seq alignment on the HPC to obtain read counts and visualise the BAM files.
- Bonus: implement the RNA-seq alignment steps as a Nextflow workflow.
24/03/25: Identifying differentially expressed genes
Read sections 6.1-6.10 from Chapter 6 of the "Modern Statistics for Modern Biology" (MSMB) book to refresh your understanding of statistical testing.
Read the following sections from Chapter 8 of the MSMB book to understand the specifics of modelling count data:
- 8.3.1 The challenges of count data
- 8.4 Modeling count data
Assignments:
- Exploratory data analysis with PCA
- Differential gene expression analysis with DESeq2 or PyDESeq2
31/03/25: Linkage, recombination, and LD
Read the following sections from Chapter 2.3 of the "An Owner's Guide to the Human Genome" book:
- A first look at haplotype structure.
- Linkage generates haplotype structure (or equivalently, LD).
- Recombination.
- Measuring LD between pairs of SNPs.
- Strong recombination breaks down LD.
- Recombination and LD in human data.
- Haplotype copying models. Phasing and imputation.
Note: Figure 2.37 briefly mentions the coalescent, which needs to be explained somewhere.
Optional reading:
- PRDM9 and the hotspot paradox.
Assignments:
- Calculate and visualise LD in some specific genetic loci (e.g. LCT vs ARHGEF3).
- Perform genotype imputation for one chromosome using the Michigan Imputation Server.
- Understand how LD and genotype imputation help us to match RNA-seq samples to genotyped individuals.
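As a starting point for the LD assignment, the sketch below computes two common pairwise LD measures, D and r², from phased haplotypes at two biallelic SNPs. It assumes alleles are coded as 0 and 1 and that haplotypes are given as a list of pairs. The function name and the toy data are illustrative assumptions, not part of the assignment materials.

```python
def ld_stats(haplotypes):
    """Return (D, r2) for a list of (allele_at_snp1, allele_at_snp2) pairs coded 0/1."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either SNP is monomorphic in the sample
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Toy data: the two SNPs always carry the same allele (perfect LD).
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
# Toy data: every allele combination is equally common (no LD).
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_stats(linked))
print(ld_stats(unlinked))
```

With real data, the haplotypes would come from a phased VCF file rather than a hand-written list.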
07/04/25: Association testing and fine mapping
14/04/25: Other molecular traits: RNA splicing and chromatin accessibility
21/04/25: Deep learning models for regulatory genomics
Assignment:
- Train a ChromBPNet model and use it to predict variant effects.
28/04/25: Understanding mode of action of genetic variants
05/05/25: Mendelian randomisation
12/05/25: Single-cell technologies
Note: There may be minor changes to the schedule (for example, due to guest lecturers or other unforeseen circumstances).