Projects
We are committed to teaching you the practical skills needed to successfully complete any computer vision project you might encounter professionally. This usually implies being able to work with data collection and annotations as much as training machine learning models. While the latter is discussed a lot, the former is rarely mentioned in courses. Therefore, we decided to design projects that emphasise data processing skills as much as the ability to build powerful computer vision systems.
You can earn up to 40 points for the project, plus bonus points.
Project key milestones:
- Fill in the profile assessment questionnaire, which will help us assign you to teams. Please make sure to complete the survey by Friday 16.02.2024.
- The team formation deadline is Feb 25 (23:59, Sunday)
- The dataset preparation deadline is March 24 (23:59, Sunday). You should upload a short 2-page report to the course's homework page.
- The Kaggle setup deadline is April 14 (23:59, Sunday). You should send a link to your Kaggle page to the instructors on Slack by the deadline.
- Participate in other teams’ Kaggle competitions and try to improve upon their benchmarks and beat other teams. The deadline is May 12 (23:59, Sunday). Upload a short 2-page report summarising your performance to the courses.cs.ut.ee/dl4cv/homeworks page.
- Present your overall experience during the final class on May 29th.
- Fill out the team’s self-assessment questionnaire to estimate each member’s contribution by May 31 (Friday, 23:59).
What to do?
The project has several milestones to deliver, but worry not: we will guide you through them gradually, and it should all start making sense.
1. Profile assessment (Feb 16 @ 23:59)
Please fill in the following questionnaire that will help us to form the initial teams. This is a very important step, because without it you cannot get a team and without a team you cannot complete the project.
2. Team formation (March 1 @ 23:59)
Objective: you have been automatically assigned to one of the teams. Find out who you will be working with.
Tasks:
- You have been automatically assigned a team based on your responses to the profile assessment questionnaire.
- All you have to do is connect with your team, for example, by creating a separate channel in Slack.
- Ideally, meet, get to know each other, and decide on the roles inside your team.
Find the current team assignment in the following document: https://docs.google.com/spreadsheets/d/14SVjt0fpC6CTKxkigHez8URCVL8I6lwG8Tao2LUzpNQ/edit#gid=0.
Deliverable: there is nothing to deliver for this milestone, so it is worth 0 points out of 40.
3. Dataset preparation (March 24 @ 23:59)
Objective: Walk in the shoes of Fei-Fei Li. Substantially modify an existing computer vision dataset or create a new one.
Tasks:
- Decide on the type of computer vision problem (classification, object detection, or segmentation) and gather relevant data. If you go for object detection or segmentation, your dataset may be smaller than for classification.
- Ensure your dataset is balanced and diverse.
- Preprocess the data (filtering out poor quality data, resizing, etc.). Leave normalisation and augmentation to the teams who will work with your competition.
- If this is a novel dataset, label the data accurately. Document the labeling procedure.
- Split the dataset into training and test subsets.
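One way to keep the class proportions the same in both subsets is a stratified split. The following is a minimal sketch for a classification dataset, using only the Python standard library. The function name `stratified_split`, the 20% test fraction, and the toy labels are illustrative assumptions, not course requirements.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices) with per-class proportions preserved."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Classes with a single sample stay entirely in the training set;
        # otherwise at least one sample per class goes to the test set.
        if len(indices) > 1:
            n_test = max(1, round(len(indices) * test_fraction))
        else:
            n_test = 0
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical toy labels; in practice these come from your annotation file.
labels = ["cat"] * 50 + ["dog"] * 30 + ["bird"] * 20
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

# Per-class counts double as summary statistics and as a class-balance check.
print("all:  ", Counter(labels))
print("train:", Counter(labels[i] for i in train_idx))
print("test: ", Counter(labels[i] for i in test_idx))
```

The per-class counts printed at the end can go straight into the summary statistics of your report. For object detection or segmentation, where one image can contain several classes, a split like this would need adapting.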
Deliverable: A 2-page report (with an appendix that includes images) describing the dataset you have created, your motivation, sources, preprocessing techniques, labeling methods, and any challenges faced. Please state very clearly whether the dataset is new or a modification of an existing dataset. If it is a modification, how was it modified? If it is a new dataset, how was it collected and labeled? Please make it clear what kind of CV problem this dataset is built around (classification, segmentation or object detection). Describe the train and test splits. Add examples of images and labels to your report. Provide relevant summary statistics. Upload this report to the course's homework page by March 24 (23:59, Sunday). This part is worth 10 out of 40 points.
Bonus points. Teams can get:
- up to three bonus points if the team creates a completely new dataset rather than reusing an existing one. Keep in mind that the new dataset should be big enough to train deep-learning models.
- up to two bonus points if the team creates a dataset around object detection and up to two points if it is a segmentation problem.
The above bonus points are additive.
4. Kaggle setup (April 14 @ 23:59)
Objective: Create a Kaggle inClass competition on the basis of the prepared dataset.
Tasks:
- Create a new inClass competition via https://www.kaggle.com/competitions/new (make it public)
- Upload your dataset to Kaggle.
- Define clear competition rules, evaluation criteria, and splits between public and private leaderboards.
- Create a small starter notebook that would serve as a starting point for other teams. This notebook should contain code for generating a random submission file as well as some basic exploratory data analysis for your data.
- Set the start and end dates to April 15 and May 12 (midnight), 2024, respectively.
- Establish a strong baseline by training a model as a benchmark - other teams will try to beat your baseline.
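For the starter notebook, a random submission can be generated with a few lines of standard-library Python. The sketch below assumes a classification competition. The column names `id` and `label`, the file name, and the toy IDs and classes are hypothetical and should be replaced with whatever submission format you define for your own competition.

```python
import csv
import random


def write_random_submission(test_ids, classes, path, seed=0):
    """Write a CSV with one randomly chosen class per test sample."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "label"])  # hypothetical column names
        for sample_id in test_ids:
            writer.writerow([sample_id, rng.choice(classes)])


# Hypothetical test IDs and class names; replace with those of your dataset.
test_ids = [f"img_{i:04d}" for i in range(100)]
classes = ["cat", "dog", "bird"]
write_random_submission(test_ids, classes, "random_submission.csv")
```

Submitting such a file to your own competition is also a quick check that the evaluation metric and the submission format are configured as intended. Its score gives other teams a lower reference point next to your trained baseline.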
Where to get help: check out these guidelines from Kaggle to learn more about hosting your own competition on Kaggle.
Deliverable: Set up your Kaggle competition page with all necessary details and send the link to the instructors on Slack by April 14 (23:59, Sunday). This part is worth 10 out of 40 points.
5. Participation in Kaggle competitions (May 12 @ 23:59)
Objective: Participate in at least 4 other teams' competitions, aiming to improve upon the benchmarks.
Tasks:
- Choose to participate in at least four competitions. There is a chance that we will select those four for each team separately (we will figure this out soon).
- Analyze each competition's problem and dataset.
- Develop and train models to solve the given problems.
- Submit your solutions to the competitions and iterate to improve your results.
Where to get help: Do not stress too much about your team’s final performance. Instead, try to practice as many of the concepts and models that we have studied in the course as possible. If you manage to beat a few benchmarks on the way, all the better 🙂
Deliverable: A 2-page report summarizing your approaches, methodologies, and performance in the competitions. Upload this report to the course's homework page by May 12 (23:59, Sunday). This part is worth 10 out of 40 points.
Bonus points. At this stage, teams can get:
- One bonus point for every benchmark surpassed;
- Two bonus points if your own benchmark was not surpassed.
6. Final presentations (during the final class on May 29th)
Objective: Share your overall experience, learnings, and achievements in the course project via a presentation.
Tasks:
- Prepare a 10-minute presentation summarizing your journey through the project phases.
- Highlight key learnings, challenges faced, and how your team overcame them.
- Share insights from both setting up and participating in competitions.
Deliverable: deliver a presentation on May 29th, during the final class. This part gives the final 10 points out of 40.
Please add your slides to this folder before the event: https://drive.google.com/drive/u/1/folders/1ZkHb2q_CgR_zh6G1H6k7npD1rmEuLVwM?role=writer.
7. Team self-assessment (by May 31 @ 23:59)
Objective: so far, you have been graded as a team. Now it is time to talk about individual contributions.
Task: Fill out a questionnaire assessing your teammates’ involvement in the project's overall success.
Deliverable: A completed Google form, submitted by May 31 (23:59, Friday).
Additional Guidelines
- Employ the best project management practices: make a plan of what should be done and when; assign individual team members to each task; set up weekly meetings to review the progress and solve problems.
- Ethical Considerations: When creating new datasets, ensure ethical data collection practices, especially regarding privacy and consent.
This project is designed to enhance your technical skills in machine learning and computer vision and develop your abilities in managing and processing data effectively – a key skill in any data-driven field. Remember, the quality of your data and your understanding of the problem are just as important as the sophistication of your models.