Institute of Computer Science, University of Tartu

Parallelism in Deep Learning (LTAT.06.030), 2025/26 spring


Practical 10

Hybrid Parallelism (DP + MP + Pipeline Concept)


Objective

In this practical session, you will:

  • Run a hybrid parallel training system
  • Identify:
    • Data Parallelism (DP)
    • Model Parallelism (MP)
    • Pipeline concept (micro-batching)
  • Modify the system and observe behavior changes

Background

Students should:

  • Understand:
    • DP, MP, Pipeline (from lecture)
  • Have access to:
    • Multi-GPU machine (≥ 4 GPUs)
    • PyTorch with distributed support

Setup Instructions
  • Step 1 — Allocate GPUs (HPC): srun --partition=gpu --gres=gpu:4 --pty bash
  • Step 2 — Run the code: torchrun --nproc_per_node=2 python/sample.py

Part 1—Run and Observe

1) Use the provided script (python/sample.py; see the Download link on the course page).

2) Questions:

  • Q1) How many processes are running?
  • Q2) Which GPUs does each rank use?
  • Q3) How many micro-batches are processed?
  • Q4) When does synchronization happen?
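To reason about these questions, it can help to model the process/GPU layout in plain Python. The sketch below is a simplification, not the actual python/sample.py; the constants (2 DP ranks, 2 GPUs per rank, 4 micro-batches) are assumptions based on the torchrun command and GPU allocation above, and may differ from the real script:

```python
# Hypothetical layout: 2 data-parallel ranks, each splitting its model
# across 2 GPUs (model parallelism), with micro-batching per step.
WORLD_SIZE = 2      # processes launched by torchrun --nproc_per_node=2
MICRO_BATCHES = 4   # forward/backward passes per training step

def gpu_assignment(local_rank):
    """Mirror a mapping like cuda:{local_rank} / cuda:{local_rank + 2}."""
    return (local_rank, local_rank + WORLD_SIZE)

for rank in range(WORLD_SIZE):
    first, second = gpu_assignment(rank)
    print(f"[Rank {rank}] stage 0 on cuda:{first}, stage 1 on cuda:{second}")

# Each rank runs MICRO_BATCHES forward/backward passes, then gradients are
# averaged across ranks -- one synchronization point per optimizer step.
print(f"micro-batch passes per rank per step: {MICRO_BATCHES}")
```

Comparing this model against the actual log output of python/sample.py is a good way to check your answers.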

Part 2—Modify Micro-batches

1) The script sets:

MICRO_BATCHES = 4

2) Try:

MICRO_BATCHES = 2
MICRO_BATCHES = 8

3) Questions:

  • Q1) What changes in the output?
  • Q2) How many forward/backward steps now?
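Before running the experiment, you can predict the answer with simple arithmetic. This sketch assumes the script splits the global batch evenly across micro-batches (e.g. with torch.chunk); the function name is illustrative, not from python/sample.py:

```python
# Effect of MICRO_BATCHES on work per training step, assuming an even split.
BATCH_SIZE = 64  # assumed default from Part 3

def passes_per_step(micro_batches):
    # One forward and one backward pass per micro-batch.
    micro_batch_size = BATCH_SIZE // micro_batches
    return micro_batches, micro_batch_size

for mb in (2, 4, 8):
    n, size = passes_per_step(mb)
    print(f"MICRO_BATCHES={mb}: {n} fwd/bwd passes of {size} samples each")
```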

Part 3—Change Batch Size

1) The script sets: BATCH_SIZE = 64

2) Try:

BATCH_SIZE = 32
BATCH_SIZE = 128

3) Questions:

  • Q1) Does execution pattern change?
  • Q2) What stays the same?
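A useful way to frame these questions: with MICRO_BATCHES fixed, changing BATCH_SIZE changes how much data each forward/backward pass processes, not how many passes occur. A sketch, again assuming an even split (the helper name is hypothetical):

```python
# With a fixed number of micro-batches, the execution pattern (number of
# fwd/bwd passes, synchronization points) is unchanged; only the per-pass
# workload scales with BATCH_SIZE.
MICRO_BATCHES = 4

def micro_batch_size(batch_size):
    return batch_size // MICRO_BATCHES

for bs in (32, 64, 128):
    print(f"BATCH_SIZE={bs}: {MICRO_BATCHES} micro-batches "
          f"of {micro_batch_size(bs)} samples each")
```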

Part 4—Change Number of Processes (DP)

1) Run: torchrun --nproc_per_node=1 python/sample.py

2) Questions:

  • Q1) What happens to [Rank 1]?
  • Q2) Is synchronization still happening?
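To think about Q2, recall what data-parallel synchronization does: gradients are averaged across all ranks (an all-reduce). A stdlib-only simulation of that averaging, using nested lists in place of tensors, shows what happens when the world size drops to one:

```python
# Simulated all-reduce (mean) across data-parallel ranks.
# per_rank_grads: one gradient list per rank.
def allreduce_mean(per_rank_grads):
    n = len(per_rank_grads)
    return [sum(g) / n for g in zip(*per_rank_grads)]

# Two ranks: gradients are genuinely averaged.
print(allreduce_mean([[1.0, 2.0], [3.0, 4.0]]))  # -> [2.0, 3.0]

# One rank: the collective still runs, but averaging over a single
# participant returns the gradients unchanged.
print(allreduce_mean([[1.0, 2.0]]))              # -> [1.0, 2.0]
```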

Part 5—Break Model Parallelism (Important)

1) Modify code:

  • Replace: device1 = torch.device(f"cuda:{local_rank + 2}")
  • With: device1 = torch.device(f"cuda:{local_rank}")

2) Questions:

  • Q1) What changes in output?
  • Q2) Are multiple GPUs still used?
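You can predict the effect of this change by enumerating which GPU indices each rank touches under the two mappings. The sketch below assumes 2 processes and that device0 is cuda:{local_rank} in both cases (consistent with the snippet above):

```python
WORLD_SIZE = 2  # processes from torchrun --nproc_per_node=2

def devices_used(offset):
    """GPU indices touched by all ranks.

    offset=WORLD_SIZE mirrors device1 = cuda:{local_rank + 2};
    offset=0 mirrors the modified device1 = cuda:{local_rank}.
    """
    used = set()
    for local_rank in range(WORLD_SIZE):
        used.add(local_rank)            # device0 (first model stage)
        used.add(local_rank + offset)   # device1 (second model stage)
    return sorted(used)

print(devices_used(2))  # original mapping -> [0, 1, 2, 3]
print(devices_used(0))  # modified mapping -> [0, 1]
```

With offset 0, both model stages land on the same GPU per rank, so the model-parallel split collapses and only the data-parallel GPUs remain in use.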

Note: This is more than a coding exercise. You are observing how modern distributed training actually works.
