Bioinformatics (MTAT.03.239) 2024/25 spring

Schedule

10/02/25: Introduction to the Course

  • What is team-based learning?
  • How is the course structured?

17/02/25: A genome owner’s starter pack

Read the following sections from Chapter 1.2 of the "An Owner's Guide to the Human Genome" book:

  1. A short history.
  2. The DNA molecule.
  3. DNA is the world’s greatest data storage device.
  4. The genomic encoding of biological information.
  5. Genes and the encoding of proteins.
  6. The encoding of gene regulation.
  7. The inheritance of genomes.

You must complete the Individual Readiness Assurance Test (iRAT) in Moodle before our scheduled meeting on 17 Feb.

Assignments (link):

  • Task 1: Genes, proteins, introns and exons (2 points).
  • Task 2: The Human Reference Genome (1 point).
  • Task 3: Using the High Performance Computing Center (2 points).

In-class work:

  1. Complete the Team Readiness Assurance Test in Moodle (20 minutes).
  2. Discussion of the answers (5-10 minutes).
  3. In teams, work on Task 1 from Assignment 1 (20 minutes).
  4. Discussion of the answers (5-10 minutes).
  5. Read "The Human Reference Genome" section (pages 25-27) from Chapter 1.2 and answer the questions in Task 2 from Assignment 1.
  6. Start working on Task 3 from Assignment 1.

03/03/25: Human genome variation and why it matters

Read the following sections from Chapter 1.3 of the "An Owner's Guide to the Human Genome" book:

  1. SNPs
  2. Genotype frequencies and the Hardy-Weinberg model.
  3. How many SNPs are there?
  4. Beyond SNPs: Other types of inherited variation.
  5. How do SNPs affect the information encoded in genomes?

From Chapter 3.1, read the following section:

  1. Genetic clustering: PCA

You must complete the Individual Readiness Assurance Test (iRAT) in Moodle before our scheduled meeting on 3 Mar.

In-class work:

  • Complete the Team Readiness Assurance Test in Moodle (10 minutes).
  • Read "Example: hemophilia in the royal families of Europe." in Chapter 1.3 and answer the following questions:

How does the royal mutation lead to the production of the nonfunctional Factor IX protein? What is the most common type of mutation that causes hemophilia?

  • Start working on Tasks 1-3 in Assignment 2.

Optional reading:

  • Chromosome inheritance errors: aneuploidy

Assignments (link):

  • Task 1: Understanding the VCF file format (1 point)
  • Task 2: Extracting genotypes from a VCF file (2 points)
  • Task 3: Exploring allele frequency differences (and population structure) with principal component analysis (PCA) (2 points)
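
As a rough illustration of what working with the VCF format involves, the sketch below parses a single VCF data line, extracts per-sample genotypes from the GT field and computes the alternative allele frequency. The example line, the sample names and the helper functions `parse_genotypes` and `alt_allele_frequency` are made up for illustration and are not taken from the assignment; real files are usually better handled with dedicated tools.

```python
def parse_genotypes(line, sample_names):
    """Return {sample: (allele, allele)} for one VCF data line.

    Assumes GT is listed in the FORMAT column; missing alleles ('.') become None.
    """
    fields = line.rstrip("\n").split("\t")
    # Fixed VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT, then samples
    format_keys = fields[8].split(":")
    gt_index = format_keys.index("GT")
    genotypes = {}
    for name, sample_field in zip(sample_names, fields[9:]):
        gt = sample_field.split(":")[gt_index]
        # '/' separates unphased alleles, '|' separates phased alleles
        alleles = gt.replace("|", "/").split("/")
        genotypes[name] = tuple(None if a == "." else int(a) for a in alleles)
    return genotypes


def alt_allele_frequency(genotypes):
    """Frequency of allele 1 among called alleles (biallelic site assumed)."""
    called = [a for gt in genotypes.values() for a in gt if a is not None]
    return sum(1 for a in called if a == 1) / len(called)


# Hypothetical data line with three samples (the third has a missing genotype)
line = "1\t12345\trs0000\tA\tG\t.\tPASS\t.\tGT:DP\t0|1:12\t1/1:9\t./.:0"
samples = ["S1", "S2", "S3"]
gts = parse_genotypes(line, samples)
print(gts)
print(alt_allele_frequency(gts))
```

The sketch ignores header lines, multi-allelic sites and sites where no genotype is called, all of which a real analysis would need to handle.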

10/03/25: DNA sequencing: a fundamental tool for studying biology

Read the following sections from Chapter 1.4 of the "An Owner's Guide to the Human Genome" book:

  1. A short history of sequencing.
  2. Sequencing applications in human genomics.
  3. Genome resequencing and polymorphism discovery.
  4. Low-budget approaches to studying genome variation.

Read this high-level introduction to short read alignment:

  • Short read alignment algorithms: (link)

Optional:

  • Learn more about the Burrows-Wheeler Transform (BWT) from this entertaining video (link)
  • Read this if you would like to understand more of the algorithmic detail behind short read alignment: Short Read Mapping: An Algorithmic Tour

Assignment:

  • Construct an FM-index using pen and paper. (link)
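
To check a hand-built index afterwards, the following minimal sketch computes the BWT of a short string by sorting its rotations, builds the two tables that make up a basic FM-index (C and Occ) and counts pattern occurrences with backward search. The function names and the example string "banana$" are illustrative assumptions, and the rotation-sorting approach is only practical for toy inputs; real aligners use far more memory-efficient constructions.

```python
def bwt(text):
    """Burrows-Wheeler Transform via sorted rotations (text must end with '$')."""
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)


def build_fm_index(bwt_string):
    """Return the C table and Occ table used for backward search."""
    alphabet = sorted(set(bwt_string))
    # c_table[c]: number of characters in the text strictly smaller than c
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += bwt_string.count(ch)
    # occ[c][i]: number of occurrences of c in bwt_string[:i]
    occ = {ch: [0] for ch in alphabet}
    for symbol in bwt_string:
        for ch in alphabet:
            occ[ch].append(occ[ch][-1] + (symbol == ch))
    return c_table, occ


def count_matches(pattern, c_table, occ, n):
    """Count occurrences of pattern in the original text using backward search."""
    lo, hi = 0, n  # half-open interval of rows in the sorted rotation matrix
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


text = "banana$"
transformed = bwt(text)
c_table, occ = build_fm_index(transformed)
print(transformed)
print(count_matches("ana", c_table, occ, len(text)))
```

Storing the full Occ table, as done here, trades memory for simplicity; practical implementations keep only sampled checkpoints.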

17/03/25: Using RNA-sequencing to measure gene expression

  1. Revisit material from Chapter 1.2 to remind yourself what gene expression is and how it is regulated.
  2. Watch "StatQuest: A gentle introduction to RNA-seq" to learn how RNA sequencing works.

Assignments (link):

  1. Task 1: RNA-seq alignment.
  2. Bonus task: Implement RNA-seq alignment steps as a Nextflow workflow.

In-class work:

  1. Complete the Team Readiness Assurance Test (tRAT).
  2. Work in teams to complete Tasks 1A-1C. Focus on conceptual understanding. It is fine if only one of the team members runs the commands on the HPC.
  3. Review of Assignment 2: compare your answers with those of another member of your group. Are there any discrepancies? Can you resolve the discrepancies in your larger group of 4-5?

24/03/25: Identifying differentially expressed genes

Read sections 6.1-6.10 from Chapter 6 of Modern Statistics for Modern Biology (MSMB) to refresh your understanding of statistical testing.

Read the following sections from Chapter 8 of the MSMB book to understand the specifics of modelling count data:

  1. 8.3.1 The challenges of count data
  2. 8.4 Modeling count data

Assignments (link)

  • Task 1: Understanding the gene expression dataset (2 points)
  • Task 2: Differential gene expression (2 points)

31/03/25: Linkage, recombination, and LD

Read the following sections from Chapter 2.3 of the "An Owner's Guide to the Human Genome" book:

  1. A first look at haplotype structure.
  2. Linkage generates haplotype structure (or equivalently, LD).
  3. Recombination.
  4. Measuring LD between pairs of SNPs.
  5. Strong recombination breaks down LD.
  6. Recombination and LD in human data.
  7. Haplotype copying models. Phasing and imputation.

Note: Figure 2.37 briefly mentions the coalescent, which needs to be explained somewhere.

Optional reading:

  1. PRDM9 and the hotspot paradox.

Assignments

  • Calculate and visualise LD in some specific genetic loci (e.g. LCT vs ARHGEF3).
  • Perform genotype imputation for one chromosome using the Michigan Imputation Server.
  • Understand how LD and genotype imputation help us to match RNA-seq samples to genotyped individuals.
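
For intuition about how LD between a pair of SNPs can be measured, the sketch below computes the standard coefficient D = p_AB - p_A * p_B and the squared correlation r² from phased haplotypes coded as 0/1 alleles. The function name `ld_statistics` and the toy haplotypes are hypothetical and are not part of the assignment; real analyses would start from phased genotype data and typically use dedicated software.

```python
def ld_statistics(haplotypes):
    """Return (D, r_squared) for two biallelic SNPs.

    haplotypes: list of (a, b) tuples, where a and b are the 0/1 alleles
    carried at SNP A and SNP B on the same chromosome copy.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d, d * d / denom


# Toy data: alleles always travel together (perfect LD)
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
# Toy data: every allele combination is equally common (no LD)
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_statistics(linked))
print(ld_statistics(unlinked))
```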

07/04/25: Association testing and fine mapping

Pre-class work

Watch the first 33 minutes of this lecture on YouTube: MPG Primer: Introduction to fine-mapping (2023)

Additional optional material:

  • Genome-wide association studies
  • From genome-wide associations to candidate causal variants by statistical fine-mapping
  • (PDF files are available here: link)

Assignments (link)

  • Task 1: Association testing (2 points)
  • Task 2: Statistical fine mapping (2 points)

14/04/25: Connecting GWAS and eQTLs through colocalization

Watch the first 33 minutes of this lecture on YouTube: MPG Primer: Connecting GWAS and eQTLs through colocalization (2024)

Assignments (link)

  • Task 1: Colocalisation testing in R (2 points)
  • Task 2: Exploring genetic colocalisations in the Open Targets Platform (2 points)

21/04/25: Feedback session for Assignments 1-6 (not mandatory)

28/04/25: Role of RNA splicing in human complex traits

Pre-class work

Read the following sections from the review paper "The Expanding Landscape of Alternative Splicing Variation in Human Populations".

Mandatory sections:

  • Introduction
  • Technologies for High-Throughput Analysis of Alternative Splicing
  • Quantifying Alternative Splicing by Using RNA-Seq Data
  • Computational Approaches for Discovering Genetic Associations of Alternative Splicing
  • Widespread Variation and Phenotypic Association of Alternative Splicing in Human Populations
  • Alternative Splicing Meets Machine Learning
  • Conclusion

Assignments (link)

05/05/25: Profiling of chromatin accessibility to understand gene regulatory mechanisms

Read the following sections from the 'Chromatin accessibility profiling methods' review paper (excluding parts of the review focussing on single-cell methods):

  • Introduction (from the beginning until the subheading 'Experimentation').
  • Experimentation (until the subheading 'MNase-seq').
  • Results (until the subheading 'Single-cell data analysis').
  • Applications (until the subheading 'Evolution of chromatin accessibility').

12/05/25: Single-cell technologies

Pre-class work

Watch this YouTube video: MPG Primer: Single-Cell Multiome Technology and Analysis Methods

Assignments: link

19/05/25: Mendelian randomisation (MR)

Pre-class work

  • Watch these two very short YouTube videos: one, two.
  • Read the short primer "Introduction to Mendelian randomisation" (compiled by Ralf Tambets).
  • Read the classic 2012 paper (Voight et al.) that used Mendelian randomisation to study the effect of increasing HDL cholesterol ("good cholesterol") on heart disease.

Assignments: link

For questions about course content and organisation, please contact the course organisers.
The economic copyright of the study materials belongs to the University of Tartu. The study materials may be used for the purposes and under the conditions of free use of a work as provided in the Copyright Act. When using the study materials, the user is obliged to credit their author.
Use of the study materials for other purposes is permitted only with the prior written consent of the University of Tartu.