Schedule
Seminars are held every Wednesday at 16.15-18.00 in Delta-1026 (starting from 09.09).
Please register for the course "How to Win a Data Science Competition: Learn from Top Kagglers" using your university email either here or here, and try to solve the practical tasks from the "Introduction & Recap" module to estimate the effort required to complete the course.
Please choose or propose the paper that you will present to the class here.
Our course will be split into two parts. One could call them theoretical and practical, but both are practical, just in different ways.
Part 1
For the first five weeks (September-October), the schedule will be aligned with the Coursera course "How to Win a Data Science Competition: Learn from Top Kagglers". All participants are expected to watch the course videos (~2h) and solve the quiz tasks (~1-2h) at home. During our weekly meetings (1.5h) in class, we will discuss the course materials and related papers (or blog posts or videos, 1-2 per meeting). Each participant will present one of these to the class at least once.
Week 0: slides, recording (log into courses to see link)
- Introductory lecture by Mikhail (logistics, objectives, intro to data science competitions)
Week 1: word2vec slides, colab, recording (log into courses to see link)
- Homework: Coursera Week 1 (Feature preprocessing and generation)
- Seminar: papers according to the schedule
Week 2: slides, recording (log into courses to see link)
- Homework: Coursera Week 2 (Exploratory data analysis, Validation, Data leakages)
- Seminar: papers according to the schedule
Week 3: slides, recording (log into courses to see link)
- Homework: Coursera Week 3 (Metrics Optimization, Advanced Feature Engineering I)
- Seminar: papers according to the schedule
Week 4: slides, recording (log into courses to see link)
- Homework: Coursera Week 4 (Hyperparameter Optimization, Advanced Feature Engineering II)
- Seminar: papers according to the schedule
Week 5: slides, recording (log into courses to see link)
- Homework: Coursera Week 5 (Competitions go through, Final project)
- Seminar: papers according to the schedule
Part 2
In the second part of the course (October-December), we will participate in real data science competitions. The students will form teams and join one of the available competitions on any platform (Kaggle, DrivenData, CodaLab, Zindi, or anywhere else). Each team will present an overview of its selected competition (data, kernels, known problems) during the seminar, along with 1-2 related papers. Teams are encouraged to recommend suitable reading or viewing materials to other students.
Week 6: slides, recording (log into courses to see link)
- Homework: Coursera Final Project
- Seminar: Final Project discussion (short presentations by the authors of top solutions), an overview of ongoing competitions (by Mikhail)
Weeks 7-11:
- Homework: competition solving, video (optional)
- Seminar: an overview of a competition by one of the teams (schedule TBA)
- 04.11: slides, recording (log into courses to see link)
- 11.11: slides (crops, volcanoes), recording (log into courses to see link)
- 25.11: slides, recording (log into courses to see link)
- 02.12
Week 12:
- Homework: finalizing the competition (NB! the deadline for your selected competition could be later, but we expect you to have some results by this time)
- Seminar: competition wrap-up (a short presentation by each team about their progress and results, if available)
One week is reserved for the occasional skip.