Grading is 70% project and 30% exam. Bonus points may be available during the course, e.g., by participating in in-class quizzes.
Project
The project should demonstrate an end-to-end data engineering lifecycle. For example, orchestrated ETL processes pull data from various sources into a data warehouse, where the data is integrated, aggregated, and served to a data analytics dashboard or an ML pipeline.
The tools used in the project can be the ones we learn about in class, but students are encouraged to adapt the toolset as they see fit.
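To make the example above concrete, here is a minimal sketch of an orchestrated ETL run using only the Python standard library. Everything in it is an assumption for illustration: the inline CSV and JSON strings stand in for real sources, an in-memory SQLite database stands in for the data warehouse, and `graphlib.TopologicalSorter` stands in for a real orchestration tool. The names `run_pipeline`, `ORDERS_CSV`, and `CUSTOMERS_JSON` are hypothetical.

```python
import csv
import io
import json
import sqlite3
from graphlib import TopologicalSorter

# Hypothetical inline sources standing in for real files or APIs.
ORDERS_CSV = "order_id,customer_id,amount\n1,10,25.0\n2,11,40.0\n3,10,15.5\n"
CUSTOMERS_JSON = '[{"customer_id": 10, "region": "north"}, {"customer_id": 11, "region": "south"}]'


def run_pipeline():
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    state = {}  # intermediate results passed between tasks

    def extract_orders():
        state["orders"] = list(csv.DictReader(io.StringIO(ORDERS_CSV)))

    def extract_customers():
        state["customers"] = json.loads(CUSTOMERS_JSON)

    def load():
        conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
        conn.execute("CREATE TABLE customers (customer_id INTEGER, region TEXT)")
        conn.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(int(r["order_id"]), int(r["customer_id"]), float(r["amount"])) for r in state["orders"]],
        )
        conn.executemany(
            "INSERT INTO customers VALUES (?, ?)",
            [(c["customer_id"], c["region"]) for c in state["customers"]],
        )

    def aggregate():
        # Integrate both sources and aggregate revenue per region.
        rows = conn.execute(
            "SELECT c.region, SUM(o.amount) "
            "FROM orders o JOIN customers c USING (customer_id) "
            "GROUP BY c.region"
        ).fetchall()
        state["revenue_by_region"] = dict(rows)

    tasks = {
        "extract_orders": extract_orders,
        "extract_customers": extract_customers,
        "load": load,
        "aggregate": aggregate,
    }
    # Each task maps to the set of tasks that must finish before it starts.
    dag = {
        "extract_orders": set(),
        "extract_customers": set(),
        "load": {"extract_orders", "extract_customers"},
        "aggregate": {"load"},
    }
    for name in TopologicalSorter(dag).static_order():
        tasks[name]()
    return state["revenue_by_region"]
```

Calling `run_pipeline()` runs the tasks in dependency order and returns the aggregated result that a dashboard or ML pipeline would consume. A real project would replace the loop with a proper orchestrator and add scheduling, retries, and logging.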
The project is done in groups of three.
The final submission for the project is working code in a GitHub repository with a walkthrough of the steps needed to run the project.
In addition, each group must design a poster and present it in a poster session.
At least 50% (35 of 70 points) is needed to pass the project. The minimum passing grade can be achieved by integrating and cleaning two data sources, using an orchestration tool, and modeling the data in a star schema.
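As a rough illustration of what a star schema looks like, the sketch below builds one fact table and two dimension tables in an in-memory SQLite database and runs a typical analytical query against them. The table names, columns, and sample rows are all hypothetical; a real project would populate these tables from its own cleaned data sources.

```python
import sqlite3

# One fact table (fact_sales) referencing two dimensions (dim_product, dim_date).
SCHEMA = """
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category    TEXT NOT NULL
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,  -- e.g. 20240115
    year     INTEGER NOT NULL,
    month    INTEGER NOT NULL
);
CREATE TABLE fact_sales (
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
    quantity    INTEGER NOT NULL,
    revenue     REAL NOT NULL
);
"""


def build_warehouse():
    """Create the schema and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "notebook", "stationery"), (2, "pen", "stationery"), (3, "lamp", "furniture")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240115, 2024, 1), (20240210, 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 20240115, 2, 10.0), (2, 20240115, 5, 7.5), (3, 20240210, 1, 30.0), (1, 20240210, 1, 5.0)],
    )
    return conn


def revenue_by_category_and_month(conn):
    """Join the fact table to both dimensions and aggregate revenue."""
    return conn.execute(
        """
        SELECT p.category, d.year, d.month, SUM(f.revenue)
        FROM fact_sales f
        JOIN dim_product p ON p.product_key = f.product_key
        JOIN dim_date d ON d.date_key = f.date_key
        GROUP BY p.category, d.year, d.month
        ORDER BY p.category, d.year, d.month
        """
    ).fetchall()
```

The fact table holds only keys and numeric measures, while descriptive attributes live in the dimensions, so most analytical questions reduce to joining the fact table to a few dimensions and grouping.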
Exam
The exam will consist of about 10 questions. It covers the course book (Fundamentals of Data Engineering) and the lecture material. Most of the material is covered in the lectures, but some questions may come directly from the book.
At least 50% (15 of 30 points) is needed to pass the exam.
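The thresholds can be checked with a few lines of arithmetic. The sketch below assumes a 100-point scale split 70/30 between project and exam, which matches the 35-point and 15-point thresholds stated here. Bonus points are left out because how they are counted is not specified. The function name `course_result` is hypothetical.

```python
PROJECT_MAX = 70     # project is 70% of the grade
EXAM_MAX = 30        # exam is 30% of the grade
PASS_FRACTION = 0.5  # at least 50% is needed in each component


def course_result(project_points, exam_points):
    """Return the total and whether each component meets its pass threshold."""
    return {
        "total": project_points + exam_points,
        "project_passed": project_points >= PASS_FRACTION * PROJECT_MAX,  # >= 35
        "exam_passed": exam_points >= PASS_FRACTION * EXAM_MAX,           # >= 15
    }
```

For example, `course_result(35, 14)` reports a passed project but a failed exam, since 14 is below the 15-point exam threshold.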