CS 560: Large-Scale Data Management (HWS18)
Organization
- Lecturer: Prof. Dr. Rainer Gemulla
- Tutors: Daniel Ruffinelli
- Type of course: Lecture, exercises (6 ECTS points)
- Prerequisites: Database Systems I or equivalent, programming experience
- Registration: Enroll in ILIAS
Content
This course introduces the fundamental concepts and computational paradigms of large-scale data management and Big Data. This includes methods for storing, updating, querying, and analyzing large datasets, as well as methods for data-intensive computing. The course covers concepts, algorithms, and system issues; accompanying exercises provide hands-on experience. Topics include:
- Parallel and distributed databases
- MapReduce and its ecosystem
- Spark and dataflows
- NoSQL databases
- Stream processing (tentative)
- Graph databases (tentative)
Lecture Notes
Lecture notes, exercises, and supplementary material can be found in ILIAS.
Literature
- H. Garcia-Molina, J. D. Ullman, J. Widom
Database Systems: The Complete Book
Prentice Hall, 2nd ed., 2008
- L. Wiese
Advanced Data Management: For SQL, NoSQL, Cloud and Distributed Databases
De Gruyter, 2015
- M. T. Özsu, P. Valduriez
Principles of Distributed Database Systems
Springer, 3rd ed., 2011
- T. White
Hadoop: The Definitive Guide
O’Reilly, 4th ed., 2015
Additional Literature
- J. Lin, C. Dyer
Data-Intensive Text Processing with MapReduce
Morgan and Claypool, 1st ed., 2010
- E. Redmond, J. R. Wilson
Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement
Pragmatic Bookshelf, 2nd ed., 2018
- P. J. Sadalage, M. Fowler
NoSQL Distilled
Addison-Wesley, 2012
- C. Strauch
NoSQL databases
Stuttgart Media University, 2011
- More in lecture notes