Homework 2
General Instructions & Submission
Release & Deadline
- Release date: 9 April 2026
- Deadline: 23 April 2026 (23:59)
Submission Requirements
Each student must submit:
1) Code files
- All modified scripts used in the homework
- Must be runnable
2) Report (a single PDF file) including:
- Answers to all questions
- Tables of results
- Short explanations
Task 1: Combine the Codes (6 Points)
Combine Code 1 and Code 2 to create an optimized training script.
1) Requirements
Students must:
- Start from Code 1
- Integrate AMP (Automatic Mixed Precision) from Code 2
- Produce a new file named:
practical5_single.py
2) Implementation Instructions
Your implementation must include:
- `autocast()`
- `GradScaler()`
- `scaler.scale(loss).backward()`
- `scaler.step(optimizer)`
- `scaler.update()`
Hints:
- Keep the gradient accumulation logic from Code 1 unchanged, and integrate AMP inside it.
- Do not modify the training logic structure — only enhance it with AMP.
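The hints above can be sketched as follows. This is a minimal outline, not the graded solution: the model, optimizer, and data names are placeholders for those in Code 1, the DDP wrapping is omitted, and the snippet falls back to full precision when no GPU is present.

```python
import torch
from torch import nn

ACCUM_STEPS = 4
device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"  # AMP only pays off on GPU

# Placeholder model/optimizer/data standing in for Code 1's.
model = nn.Linear(10, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
data = [(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(8)]

optimizer.zero_grad()
for step, (x, y) in enumerate(data):
    x, y = x.to(device), y.to(device)
    # Forward pass runs in mixed precision inside autocast.
    with torch.autocast(device_type=device, enabled=use_amp):
        loss = criterion(model(x), y) / ACCUM_STEPS  # scale for accumulation
    # scaler.scale guards FP16 gradients against underflow before backward.
    scaler.scale(loss).backward()
    if (step + 1) % ACCUM_STEPS == 0:
        scaler.step(optimizer)   # unscales grads; skips the step on inf/NaN
        scaler.update()
        optimizer.zero_grad()
```

Note the gradient-accumulation structure (divide the loss, step only every `ACCUM_STEPS` batches) is kept intact; AMP is layered inside it, as the hints require.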
Task 2: Run and Compare All Versions (6 Points)
1) Requirements
Students must run the following three versions:
- Code 1: DDP + Gradient Accumulation
- Code 2: DDP + AMP
- Code 3: Combined (Accumulation + AMP)
2) Comparison Table
Fill in the table based on your observations:
| Code | Time (s) | Memory Usage (GB) | Stability |
|------|----------|-------------------|-----------|
| Code 1 | | | |
| Code 2 | | | |
| Code 3 | | | |
3) Comparison Table: Modify Code 1 and Code 3
Students must run Code 1 and Code 3 with ACCUM_STEPS = 4, 8, 16, 32.
| Code | ACCUM_STEPS | Time (s) | Memory Usage (GB) | Observations |
|------|-------------|----------|-------------------|--------------|
| Code 1 | 4 | | | |
| Code 1 | 8 | | | |
| Code 1 | 16 | | | |
| Code 1 | 32 | | | |
| Code 3 | 4 | | | |
| Code 3 | 8 | | | |
| Code 3 | 16 | | | |
| Code 3 | 32 | | | |
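To fill the Time and Memory columns consistently across runs, one possible measurement wrapper is sketched below. The `run_training` argument is a hypothetical stand-in for each script's training loop; on a machine without a GPU the memory figure is reported as 0.

```python
import time
import torch

def measure(run_training, *args):
    """Run a training function once; return (wall-clock seconds, peak GPU GB)."""
    if torch.cuda.is_available():
        # Clear the peak counter so we measure only this run.
        torch.cuda.reset_peak_memory_stats()
    start = time.perf_counter()
    run_training(*args)
    elapsed = time.perf_counter() - start
    peak_gb = (torch.cuda.max_memory_allocated() / 1e9
               if torch.cuda.is_available() else 0.0)
    return elapsed, peak_gb
```

For example, `measure(train_one_epoch)` before each table entry keeps the timing and memory methodology identical for Codes 1, 2, and 3.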
Task 3: Analysis Report (3 Points)
A short report (1–2 pages) answering:
- Q1) Compare execution times across the three versions and explain the differences in terms of gradient synchronization and numerical precision.
- Q2) Compare GPU memory usage and explain how gradient accumulation and mixed precision affect memory differently.
- Q3) Explain how gradient accumulation changes the effective batch size. Why must the loss be divided by ACCUM_STEPS?
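A minimal numerical check of the claim behind Q3: accumulating gradients of `loss / ACCUM_STEPS` over the micro-batches reproduces the gradient of the mean loss over the full effective batch. The model, data, and sizes here are illustrative, not taken from the course code.

```python
import torch
from torch import nn

torch.manual_seed(0)
ACCUM_STEPS = 4
x = torch.randn(16, 10)          # effective batch of 16 samples
y = torch.randn(16, 1)
model = nn.Linear(10, 1)
criterion = nn.MSELoss()         # mean reduction over the batch

# Gradient of the mean loss computed on the full batch at once.
model.zero_grad()
criterion(model(x), y).backward()
g_full = model.weight.grad.clone()

# Same gradient accumulated over 4 micro-batches of 4,
# each loss divided by ACCUM_STEPS before backward().
model.zero_grad()
for xb, yb in zip(x.chunk(ACCUM_STEPS), y.chunk(ACCUM_STEPS)):
    (criterion(model(xb), yb) / ACCUM_STEPS).backward()
g_accum = model.weight.grad.clone()

print(torch.allclose(g_full, g_accum, atol=1e-5))  # True
```

Without the division, the accumulated gradient would be `ACCUM_STEPS` times too large, which is equivalent to silently multiplying the learning rate.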