Course info
This course introduces the principles and practices of modern data engineering. Building on prior knowledge of Bash, Git, R, Python, and SQL, students develop the practical skills required to design, implement, and maintain data workflows in real-world environments.
The course focuses on core data engineering concepts, including:
- ETL (Extract, Transform, Load) pipelines
- Data workflow automation
- Reproducible data environments
- Version control in data projects
- Data validation and transformation
- Containerised development environments
Through hands-on assignments, students work with realistic datasets and build structured, reproducible data pipelines. Emphasis is placed on practical implementation, systems thinking, and best practices in collaborative data work.
By the end of the course, students will understand the architecture of data engineering systems and be able to design and implement robust data workflows.
Schedule (in Tartu)
- Lectures: Thursday 14.15 - 16.00, Delta - 2048
- Practicals/seminars: Thursday 16.15 - 18.00, Delta - 2048
Contacts
- Lecturer: Priit Adler
Learning objectives
- The student knows and can describe the tasks and role of a data engineer
- The student can explain terms and concepts related to data engineering, for example ETL and OLAP
- The student can independently find and extract relevant data from a data file using the command line, R, or Python
- The student can independently construct and run SQL queries
- The student can describe and, with supervision, implement the ETL process
Grading
- Project: 30%
- Homework: 70%