Data Mining (HWS 2022)
The course provides an introduction to advanced data analysis techniques as a basis for analyzing business data and providing input for decision support systems. The course will cover the following topics:
- The Data Mining Process
- Data Representation and Preprocessing
- Clustering
- Classification
- Regression
- Association Analysis
- Text Mining
The course consists of a lecture together with accompanying practical exercises as well as student team projects. In the exercises, the participants will gather initial expertise in applying state-of-the-art data mining tools to realistic data sets. The team projects take place in the last third of the term. Within the projects, students carry out more sophisticated data mining projects of their own choice and present the results in a written report and an oral presentation.
Exam Review
The exam review for HWS 2022 will take place on Thursday, 2 February 2023, starting at 14:00.
To take part, you have to register by e-mail to Bianca Lermer by Tuesday, 31 January 2023.
Instructors
Time and Location
- Lecture: Wednesday, 10.15 – 11.45, Room B6 A1.01
- Exercises: Students should attend one of the three exercise groups. The contents are identical.
  - Thursday, 12.00 – 13.30, Room B6 A 1.04 (Nico Heist)
  - Thursday, 13.45 – 15.15, Room B6 A 1.04 (Alex Brinkmann)
  - Thursday, 15.30 – 17.00, Room B6 A 1.04 (Sven Hertling)
Final exam
- 75% written exam
- 25 % project work (20% report, 5% presentation)
Registration
- To attend the course, please register for the lecture in Portal 2. The course is limited to 90 participants. Places are not allocated on a "first come, first served" basis. Students in higher semesters and students who failed the course in FSS 2022 will be given preference; equally ranked students will be selected by random draw.
- You do not have to register for the exercises.
Lecture Videos, Slides and Exercises
Slides:
- 07.09.2022: Introduction
- 14.09.2022: Clustering
- 21.09.2022: Classification Part 1
- 28.09.2022: Classification Part 2
- 05.10.2022: Introduction to Student Projects
- 12.10.2022: Regression
- 19.10.2022: Text Mining
- 26.10.2022: Association Analysis
Exercises:
- 08.09.2022: Python Intro, Simple Preprocessing
- 15.09.2022: Clustering
- 22.09.2022: Classification Part 1
- 29.09.2022: Classification Part 2
- 13.10.2022: Regression
- 20.10.2022: Text Mining
- 27.10.2022: Association Analysis
Additional material (e.g., exercise solutions) can be found in the ILIAS group of the course.
Outline
Lectures and exercises take place on campus, unless specified otherwise.
Week | Wednesday | Thursday |
05.09.2022 | Lecture: Introduction to Data Mining | Exercise: Python Intro / Preprocessing |
12.09.2022 | Lecture: Clustering | Exercise: Cluster Analysis |
19.09.2022 | Lecture: Classification 1 | Exercise: Classification |
26.09.2022 | Lecture: Classification 2 | Exercise: Classification |
03.10.2022 | Team Project Introduction and Team Building | Project Work (no exercise) |
10.10.2022 | Lecture: Regression | Exercise: Regression |
17.10.2022 | Lecture: Text Mining | Exercise: Text Mining |
24.10.2022 | Lecture: Association Analysis | Exercise: Association Analysis |
31.10.2022 | Project feedback session | Project Work (no exercise) |
07.11.2022 | Project feedback session | Project Work |
14.11.2022 | Project feedback session | Project Work |
21.11.2022 | Project feedback session | Project Work |
28.11.2022 | Project feedback session | Project Work |
05.12.2022 | Project Presentations | Project Presentations |
Important project dates:
- Project Proposal due: Monday, October 10th, 23:59
- Project Report due: Friday, December 9th, 23:59
Literature
Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, Vipin Kumar: Introduction to Data Mining, 2nd Global Edition, Pearson.
Vijay Kotu, Bala Deshpande: Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner. Morgan Kaufmann.
Aurélien Géron: Hands-On Machine Learning with Scikit-Learn and TensorFlow. O'Reilly.
Software
Videos and Screen Casts
- Video recordings of the Data Mining I lectures and screen casts of the exercises are available here.
Course Evaluations