Data Mining (HWS2024)
The course provides an introduction to advanced data analysis techniques as a basis for analyzing business data and providing input for decision support systems. The course will cover the following topics:
- The Data Mining Process
- Data Representation and Preprocessing
- Clustering
- Classification
- Regression
- Association Analysis
The course consists of a lecture, accompanying practical exercises, and student team projects. In the exercises, participants gain initial experience in applying state-of-the-art data mining libraries to realistic data sets. The team projects take place in the last third of the term. In these projects, groups of students carry out more sophisticated data mining projects of their own choice and present their results in a written report and an oral presentation.
Exam Review
The exam review for the former Data Mining II course (FSS2024) will take place on Friday, September 20th, at 9 am. Please contact Ms Ezgi Yilmaz in advance if you want to review your exam. The deadline for registering for the exam review is Tuesday, September 17th, end of business.
Instructors
Time and Location
- Lecture: Monday, 13.45 – 15.15, Room A001 Building B6, 23 Part A (Start: 09.09.2024)
- Exercises: Students should attend one of the three exercise groups. The contents are identical.
  - Thursday, 12.00 – 13.30, Room D007 (2) Building B6, 27 Part D (in the backyard of B6, 23 Part A) (Franz Krause)
  - Thursday, 13.45 – 15.15, Room D007 (2) Building B6, 27 Part D (in the backyard of B6, 23 Part A) (Franz Krause)
  - Thursday, 15.30 – 17.00, Room D007 (2) Building B6, 27 Part D (in the backyard of B6, 23 Part A) (Andreea Iana)
Grading
- 75% written exam (we offer only a single exam and no retake, as the course is offered every semester)
- 25 % project work (20% report, 5% presentation)
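As an illustration of how the weights above combine, here is a minimal sketch. The weights come from the grading scheme; the 0 to 100 score scale, the function name `weighted_score`, and the example scores are assumptions made for illustration, not the official grading procedure.

```python
# Weights from the grading scheme: 75% exam, 20% report, 5% presentation.
# Assumption: every component is scored on a 0-100 scale.
WEIGHTS = {"exam": 0.75, "report": 0.20, "presentation": 0.05}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine per-component scores into a single weighted score."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical example scores, for illustration only.
example = weighted_score({"exam": 80.0, "report": 90.0, "presentation": 100.0})
print(round(example, 2))
```

How the weighted score maps to a final grade is not stated on this page, so the sketch stops at the weighted sum.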
Registration
- To attend the course, please register for the lecture in Portal 2. The course is limited to 90 participants. Places are not allocated on a "first come, first served" basis. Students in higher semesters and students who have previously failed the course are given priority; among equally ranked students, places are drawn randomly.
- You do not have to register for the exercises.
Outline and Course Materials
Week | Monday (Lecture) | Thursday (Exercise) |
02.09.2024 | no lecture | Introduction to Python (13:45–15:15) |
09.09.2024 | Introduction to Data Mining | Intro |
16.09.2024 | Classification 1 | Classification 1 |
23.09.2024 | Classification 2 | Classification 2 |
30.09.2024 | Introduction to the student projects (see Ilias) | Public holiday |
07.10.2024 | Regression | Regression |
14.10.2024 | Preprocessing | Preprocessing |
21.10.2024 | Feedback on project outlines | Project Work |
28.10.2024 | Clustering and Anomalies | Clustering |
04.11.2024 | Association Analysis and Subgroup Discovery | Association Analysis |
11.11.2024 | Project feedback session | Project Work |
18.11.2024 | Project feedback session | Project Work |
25.11.2024 | Project feedback session | Project Work |
02.12.2024 | Project Presentations | Project Presentations |
Important dates for the student projects:
- Sunday, October 13th, 23:59: Submission of project outlines
- Sunday, December 8th, 23:59: Submission of final project reports
For all students who are not familiar with Python/
Literature
Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, Vipin Kumar: Introduction to Data Mining, 2nd Global Edition, Pearson.
Aurélien Géron: Hands-On Machine Learning with Scikit-Learn and TensorFlow. O'Reilly.
Software
Videos and Screen Casts
- Video recordings of the Data Mining I lectures and screen casts of the exercises are available here.