Distributed Data Processing on the Cloud
In recent years, the volume of data that needs to be processed and analyzed has grown significantly. With the advent of cloud computing and the maturing of distributed systems, several new solutions for distributed data processing have emerged, such as MapReduce, in-memory alternatives such as Apache Spark, NoSQL databases, and frameworks based on the Bulk Synchronous Parallel model. This course gives students an overview of cloud computing and of how large-scale data, on the order of a few terabytes or petabytes, can be processed on cloud resources with distributed data processing solutions and frameworks. The course introduces cloud computing, MapReduce, big data solutions such as Pig, Spark and Giraph, and NoSQL solutions such as Riak and MongoDB.
On successful completion of this course, students will be able to:
- Understand the basic principles of distributed data processing and storage
- Apply well-known techniques to process data on the cloud
- Use different distributed data processing tools
- Adapt prominent data processing algorithms to distributed computing models such as MapReduce and Bulk Synchronous Parallel
The lab work in this course involves a significant amount of programming. We will mainly work with the Java (first half of the course) and Python (second half) programming languages, but we will also briefly touch on SQL, R and JavaScript.
Lectures
- Friday at 12.15 - 14.00 in J. Liivi 2 - 122
week 1-16
Lecturers are:
- Satish Srirama - Ülikooli 17, Room - 324 (satish . srirama ät ut . ee)
- Pelle Jakovits - Ülikooli 17, Room - 324 (jakovits ät ut . ee)
Practice sessions
- Monday at 14.15 - 16.00 in Ülikooli 17 - 115
week 2-16
- Tuesday at 12.15 - 14.00 in Ülikooli 17 - 115
week 2-16
NB! Practice sessions start from the second week of the semester
Lab assistants are:
- Pelle Jakovits - Ülikooli 17, Room - 324 (jakovits ät ut . ee)
Examination 1
- Option 1: Friday 04.01.2019 12:00 - 15:00, Ülikooli 17 - 219
- Option 2: Monday 07.01.2019 10:00 - 13:00, Ülikooli 17 - 219
Resit examination
- Monday 21.01.2019 10:00 - 13:00, Ülikooli 17 - 219
Grading rules
The final grade consists of three components:
- Written exam – 50%
- Labs – 45%
- Active participation in the lectures – 5%
NB! You need to collect at least 50% in each grade component to pass the course!
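As an illustration only, the grading rules above can be sketched in Python. The weights and the 50% per-component threshold come from the rules. The function name `final_score`, the assumption that each component is scored as a percentage from 0 to 100, and the assumption that the components combine as a simple weighted sum are hypothetical and not part of the official rules.

```python
# Component weights in percent, taken from the grading rules above.
WEIGHTS = {"exam": 50, "labs": 45, "participation": 5}

# Minimum percentage required in each component to pass the course.
PASS_THRESHOLD = 50


def final_score(scores):
    """Return (weighted_total, passed) for per-component scores.

    Assumption: each score is a percentage between 0 and 100, and the
    final result is a plain weighted sum of the three components.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
    passed = all(scores[name] >= PASS_THRESHOLD for name in WEIGHTS)
    return total, passed


if __name__ == "__main__":
    # A high weighted total does not help if one component is below 50%.
    print(final_score({"exam": 80, "labs": 60, "participation": 100}))
    print(final_score({"exam": 40, "labs": 100, "participation": 100}))
```

In the second call the weighted total is high, but the exam score is below the threshold, so the sketch reports the course as not passed.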