Institute of Computer Science

Big Data Management (LTAT.02.003), 2020/21 spring

Course Content

Introduction to Big Data Analytics, Characteristics of Big Data and Dimensions of Scalability, Data Science: Getting Value out of Big Data, Foundations for Big Data Systems and Programming, Big Data Platforms, Data Store & Processing using Hadoop, Big Data Storage and Analytics, Big Data Analytics ML Algorithms, Recommendation, Clustering, and Classification, Linked Big Data: Graph Computing and Graph Analytics, Handling Streaming Data, Graphical Models and Bayesian Networks, Big Data Visualization, Cognitive Mobile Analytics, Introduction to SQL in Big Data and HiveQL.

Objectives

  • Understand the Big Data Landscape, its challenges, and its technological stack
  • Develop Big Data common sense for recognizing when a specific technology is needed
  • Gain practical experience by experimenting with relevant technologies

Info

Lecture and practice slots are on Zoom

  • Wednesday 12:15 – 14:00 (log in to Courses to see the link)
  • Friday 14:15 – 16:00 (log in to Courses to see the link)

Syllabus

  • Introduction to Big Data
  • Deployment Models
  • Taming Data Volume with Apache Spark
  • Taming Data Velocity with Spark Structured Streaming
  • Taming Data Variety with GraphFrames
  • Gaining Value with Spark MLlib

Grading

  • 80% on mini projects (20% each: 15% for the deliverables, 5% for the presentation)
  • 20% on the mid-term MCQ (there is a bonus grade; the best two are taken)
  • 10% on labs (submit at least 50%)

NOTE: The lecturers reserve the right to call for an individual interview, which can affect the final grade.

Textbooks:

  • Big Data: Principles and Best Practices of Scalable Real-Time Data Systems by Nathan Marz and James Warren 2015.
  • Big Data for Beginners: Understanding SMART Big Data, Data Mining & Data Analytics for Improved Business Performance, Life Decisions & More! by Vince Reynolds 2016.
  • A. Rajaraman, J. Leskovec, and J. D. Ullman – Mining of Massive Datasets, 1st Edition, 2011.

Reference Books:

  • Dirk deRoos, Paul C. Zikopoulos, Roman B. Melnyk, Bruce Brown, Rafael Coss: Hadoop for Dummies Applications (1st Edition) 2014.
  • Big Data and Analytics by Seema Acharya and Subhashini Chellappan 2015.

Institute of Computer Science, Faculty of Science and Technology, University of Tartu

Contact the course organizers with organizational and course content questions.
The proprietary copyrights of educational materials belong to the University of Tartu. The use of educational materials is permitted for the purposes and under the conditions provided for in the copyright law for the free use of a work. When using educational materials, the user is obligated to give credit to the author of the educational materials.
The use of educational materials for other purposes is allowed only with the prior written consent of the University of Tartu.