Schedule
10/02/25: Introduction to the Course
- What is team-based learning?
- How is the course structured?
17/02/25: A genome owner’s starter pack
Read the following sections from Chapter 1.2 of the "An Owner's Guide to the Human Genome" book:
- A short history.
- The DNA molecule.
- DNA is the world’s greatest data storage device.
- The genomic encoding of biological information.
- Genes and the encoding of proteins.
- The encoding of gene regulation.
- The inheritance of genomes.
You must complete the individual readiness assessment test (iRAT) in Moodle before our scheduled meeting on 17 Feb.
Assignments (link):
- Task 1: Genes, proteins, introns and exons (2 points).
- Task 2: The Human Reference Genome (1 point).
- Task 3: Using the High Performance Computing Center (2 points).
In-class work:
- Complete the Team Readiness Assurance Test in Moodle (20 minutes).
- Discussion of the answers (5-10 minutes).
- In teams, work on Task 1 from Assignment 1 (20 minutes).
- Discussion of the answers (5-10 minutes).
- Read "The Human Reference Genome" section (pages 25-27) from Chapter 1.2 and answer the questions in Task 2 of Assignment 1.
- Start working on Task 3 from Assignment 1.
03/03/25: Human genome variation and why it matters
Read the following sections from Chapter 1.3 of the "An Owner's Guide to the Human Genome" book:
- SNPs
- Genotype frequencies and the Hardy-Weinberg model.
- How many SNPs are there?
- Beyond SNPs: Other types of inherited variation.
- How do SNPs affect the information encoded in genomes?
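As a warm-up for the Hardy-Weinberg reading listed above, here is a minimal Python sketch of the textbook model: estimate the allele frequency from observed genotype counts, then compute the genotype counts expected under Hardy-Weinberg proportions (p², 2pq, q²). The function name and the example counts are illustrative assumptions and are not taken from the book or the assignment.

```python
def hardy_weinberg_expected(n_aa, n_ab, n_bb):
    """Return (p, expected_counts) for a biallelic SNP.

    n_aa, n_ab, n_bb are observed counts of the AA, AB and BB genotypes.
    p is the estimated frequency of allele A; expected_counts holds the
    AA, AB and BB counts expected under Hardy-Weinberg proportions.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    # Each AA individual carries two A alleles, each AB individual one.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    return p, expected


# Hypothetical counts, chosen only to illustrate the calculation.
p, expected = hardy_weinberg_expected(25, 50, 25)
print(p, expected)
```

Comparing the observed and expected counts (for example with a chi-squared test) is one common way to check whether a SNP deviates from the model.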
From Chapter 3.1, read the following section:
- Genetic clustering: PCA
You must complete the individual readiness assessment test (iRAT) in Moodle before our scheduled meeting on 3 Mar.
In-class work:
- Complete the Team Readiness Assurance Test in Moodle (10 minutes).
- Read "Example: hemophilia in the royal families of Europe." in Chapter 1.3 and answer the following questions:
How does the royal mutation lead to the production of a nonfunctional Factor IX protein? What is the most common type of mutation that causes hemophilia?
- Start working on Tasks 1-3 in Assignment 2.
Optional reading:
- Chromosome inheritance errors: aneuploidy
Assignments (link):
- Task 1: Understanding the VCF file format (1 point)
- Task 2: Extracting genotypes from a VCF file (2 points)
- Task 3: Exploring allele frequency differences (and population structure) with principal component analysis (PCA) (2 points)
10/03/25: DNA sequencing: a fundamental tool for studying biology
Read the following sections from Chapter 1.4 of the "An Owner's Guide to the Human Genome" book:
- A short history of sequencing.
- Sequencing applications in human genomics.
- Genome resequencing and polymorphism discovery.
- Low-budget approaches to studying genome variation.
Read this high-level introduction to short read alignment:
- Short read alignment algorithms: (link)
OPTIONAL:
- Learn more about the Burrows-Wheeler Transform (BWT) from this entertaining video (link)
- Read this if you would like to understand more of the algorithmic detail behind short read alignment: Short Read Mapping: An Algorithmic Tour
Assignment:
- Construct an FM-index using pen and paper. (link)
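To check a pen-and-paper FM-index, a small Python sketch like the one below may help. It assumes the common textbook formulation: the text ends with a unique sentinel `$` that sorts before all other characters, the BWT is read off the sorted suffixes, and the index consists of a `C` table plus an occurrence table used for backward search. The function names are illustrative, and the naive suffix sort is only suitable for short strings; the assignment itself may use a different layout or notation.

```python
def bwt_via_suffix_array(text):
    """Return (suffix_array, bwt) for a text ending in a unique '$' sentinel."""
    if not text.endswith("$") or text.count("$") != 1:
        raise ValueError("text must end with a single '$' sentinel")
    # Naive construction: sort suffix start positions by the suffix itself.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT character is the one preceding each sorted suffix
    # (text[-1] wraps around to the sentinel for the suffix starting at 0).
    bwt = "".join(text[i - 1] for i in sa)
    return sa, bwt


def build_fm_index(bwt):
    """Return (C, occ) tables for the given BWT string."""
    alphabet = sorted(set(bwt))
    # C[c]: number of characters in the text that sort before c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ


def count_matches(pattern, C, occ, n):
    """Count exact occurrences of pattern using backward search."""
    lo, hi = 0, n  # half-open range of rows in the sorted suffix list
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


sa, bwt = bwt_via_suffix_array("banana$")
C, occ = build_fm_index(bwt)
print(sa, bwt, count_matches("ana", C, occ, len(bwt)))
```

Real aligners avoid storing the full occurrence table and suffix array by sampling them, which this sketch does not attempt.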
17/03/25: Using RNA-sequencing to measure gene expression
- Revisit the material from Chapter 1.2 to remind yourself what gene expression is and how it is regulated.
- Watch "StatQuest: A gentle introduction to RNA-seq" to learn how RNA sequencing works.
Assignments (link):
- Task 1: RNA-seq alignment.
- Bonus task: implement RNA-seq alignment steps as a Nextflow workflow.
In-class work:
- Complete the Team Readiness Assurance Test (TRAT).
- Work in teams to complete Tasks 1A-1C. Focus on conceptual understanding. It is fine if only one of the team members runs the commands on the HPC.
- Review of Assignment 2: compare your answers with those of another member of your group. Are there any discrepancies? Can you resolve the discrepancies in your larger group of 4-5?
24/03/25: Identifying differentially expressed genes
Read Sections 6.1-6.10 from Chapter 6 of Modern Statistics for Modern Biology (MSMB) to refresh your understanding of statistical testing.
Read the following sections from Chapter 8 of the MSMB book to understand the specifics of modelling count data:
- 8.3.1 The challenges of count data
- 8.4 Modeling count data
Assignments (link):
- Task 1: Understanding the gene expression dataset (2 points)
- Task 2: Differential gene expression (2 points)
31/03/25: Linkage, recombination, and LD
Read the following sections from Chapter 2.3 of the "An Owner's Guide to the Human Genome" book:
- A first look at haplotype structure.
- Linkage generates haplotype structure (or equivalently, LD).
- Recombination.
- Measuring LD between pairs of SNPs.
- Strong recombination breaks down LD.
- Recombination and LD in human data.
- Haplotype copying models. Phasing and imputation.
Note: Figure 2.37 briefly mentions the coalescent, which needs to be explained somewhere.
Optional reading:
- PRDM9 and the hotspot paradox.
Assignments:
- Calculate and visualise LD in some specific genetic loci (e.g. LCT vs ARHGEF3).
- Perform genotype imputation for one chromosome using the Michigan Imputation Server.
- Understand how LD and genotype imputation help us to match RNA-seq samples to genotyped individuals.
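For the LD calculation, a minimal Python sketch of the standard pairwise statistics may be a useful reference point. It assumes phased haplotypes coded as 0/1 at two biallelic SNPs and uses the usual definitions D = p_AB - p_A p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)); the book's notation may differ, and the function name and toy haplotypes are illustrative assumptions rather than assignment data.

```python
def ld_stats(haplotypes):
    """Return (D, r2) for two biallelic SNPs.

    haplotypes is a list of (a, b) tuples, where a and b are the 0/1
    alleles carried at SNP A and SNP B on the same phased haplotype.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either SNP is monomorphic in the sample.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Toy haplotypes: the two SNPs always carry the same allele (perfect LD).
print(ld_stats([(1, 1), (1, 1), (0, 0), (0, 0)]))
# Toy haplotypes: all four allele combinations are equally common (no LD).
print(ld_stats([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

With unphased genotype data, r² is usually estimated from genotype correlations or by dedicated tools instead, so this sketch only covers the phased case.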
07/04/25: Association testing and fine mapping
Pre-class work
Watch the first 33 minutes of this lecture on YouTube: MPG Primer: Introduction to fine-mapping (2023)
Additional optional material:
- Genome-wide association studies
- From genome-wide associations to candidate causal variants by statistical fine-mapping
- (PDF files are available here: link)
Assignments (link):
- Task 1: Association testing (2 points)
- Task 2: Statistical fine mapping (2 points)
14/04/25: Connecting GWAS and eQTLs through colocalization
Watch the first 33 minutes of this lecture on YouTube: MPG Primer: Connecting GWAS and eQTLs through colocalization (2024)
Assignments (link):
- Task 1: Colocalisation testing in R (2 points)
- Task 2: Exploring genetic colocalisations in the Open Targets Platform (2 points)
21/04/25: Feedback session for Assignments 1-6 (not mandatory)
28/04/25: Role of RNA splicing in human complex traits
Pre-class work
Read the following sections from the review paper "The Expanding Landscape of Alternative Splicing Variation in Human Populations".
Mandatory sections:
- Introduction
- Technologies for High-Throughput Analysis of Alternative Splicing
- Quantifying Alternative Splicing by Using RNA-Seq Data
- Computational Approaches for Discovering Genetic Associations of Alternative Splicing
- Widespread Variation and Phenotypic Association of Alternative Splicing in Human Populations
- Alternative Splicing Meets Machine Learning
- Conclusion
Assignments (link):
05/05/25: Profiling of chromatin accessibility to understand gene regulatory mechanisms
Read the following sections from the 'Chromatin accessibility profiling methods' review paper (excluding parts of the review focussing on single-cell methods):
- Introduction (from the beginning until the subheading 'Experimentation').
- Experimentation (until the subheading 'MNase-seq').
- Results (until the subheading 'Single-cell data analysis').
- Applications (until the subheading 'Evolution of chromatin accessibility').
12/05/25: Single-cell technologies
Pre-class work
Watch this YouTube video: MPG Primer: Single-Cell Multiome Technology and Analysis Methods
Assignments: link
19/05/25: Mendelian randomisation (MR)
Pre-class work
- Watch these two very short YouTube videos: one, two.
- Read the short primer "Introduction to Mendelian randomisation" (compiled by Ralf Tambets).
- Read the classic 2012 paper (Voight et al.) that used Mendelian randomisation to study the effect of increasing HDL cholesterol ("good cholesterol") on heart disease.
Assignments: link