CS 560: Large-Scale Data Management (HWS 2023)

Organization

  • Lecturer: Prof. Dr. Rainer Gemulla
  • Tutor: Adrian Kochsiek
  • Type of course: Lecture, exercises (6 ECTS points)
  • Prerequisites: Database Systems I or equivalent, programming experience
  • Registration: Enroll in Portal 2

Lectures and tutorials are held in person; both start in the first week. Details will be discussed in the kickoff lecture (Tuesday, Sep 5, 10:15).

Content

This course introduces the fundamental concepts and computational paradigms of large-scale data management and Big Data. This includes methods for storing, updating, querying, and analyzing large datasets as well as for data-intensive computing. The course covers concepts, algorithms, and system issues; accompanying exercises provide hands-on experience. Topics include:

  • Parallel and distributed databases
  • Big Data Processing (including MapReduce, Spark)
  • NoSQL databases
  • Cloud databases
  • Stream processing (tentative)
  • Graph processing (tentative)

Lecture Notes

Lecture notes, exercises, and supplementary material can be found in ILIAS.

Literature

  • H. Garcia-Molina, J. D. Ullman, J. Widom. Database Systems: The Complete Book. Prentice Hall, 2nd ed., 2008
  • M. T. Özsu, P. Valduriez. Principles of Distributed Database Systems. Springer, 4th ed., 2020
  • E. Redmond, J. R. Wilson. Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement. Pragmatic Bookshelf, 2nd ed., 2018
  • L. Wiese. Advanced Data Management: For SQL, NoSQL, Cloud and Distributed Databases. De Gruyter, 2015
  • More in lecture notes