Data Mining (FSS2024)

The course provides an introduction to advanced data analysis techniques as a basis for analyzing business data and providing input for decision support systems. The course will cover the following topics:

  • The Data Mining Process
  • Data Representation and Preprocessing
  • Clustering
  • Classification
  • Regression
  • Association Analysis
  • Text Mining

The course consists of a lecture with accompanying practical exercises and student team projects. In the exercises, participants gain initial experience in applying state-of-the-art data mining libraries to realistic data sets. The team projects take place in the last third of the term. In these projects, groups of students carry out more sophisticated data mining projects of their own choice and present their results in a written report and an oral presentation.

  • Exam Review

    The exam review for FSS2024 will take place on Wednesday, September 25th 2024 at 10 am.

    Please register for the exam review by email to Alexander Brinkmann no later than Friday, September 21st 2024.

  • Instructors

  • Time and Location

    • Lecture: Wednesday, 10.15 – 11.45, Room A5 B144 (Start: 14.2.2024)
    • Exercises: Students should attend one of the three exercise groups. The contents are identical.
      • Thursday, 10.15 – 11.45, Room B6 A104 (Keti)
      • Thursday, 12.00 – 13.30, Room B6 A104 (Alex)
      • Thursday, 13.45 – 15.15, Room B6 A104 (Ralph)
  • Grading

    • 75 % written exam (we offer only a single exam and no re-take as the course is offered every semester)
    • 25 % project work (20% report, 5% presentation)
  • Registration

    • To attend the course, please register for the lecture in Portal 2. The course is limited to 90 participants. Places are not allocated on a first-come, first-served basis. Students in higher semesters and students who failed the course in HWS 2023 will be given preference; places among equally ranked students will be drawn at random.
    • You don't have to register for the Exercise.

Outline and Course Materials


Lecture: Introduction to Data Mining | Tutorial: Introduction to Python (Solution) | Exercise: Preprocessing/Visualization (Solution)
21.02.2024 | Lecture: Cluster Analysis | Exercise: Cluster Analysis (Solution)
28.02.2024 | Lecture: Classification 1 | Exercise: Classification (Solution)
06.03.2024 | Lecture: Classification 2 | Exercise: Classification (Solution)
13.03.2024 | Lecture: Classification 3 | Exercise: Classification (Solution)
20.03.2024 | Lecture: Regression | Exercise: Regression (Solution)
Easter Break
Lecture: Text Mining | Exercise: Text Mining (Solution)
Introduction to the student projects (Example questions) | Preparation of project outline
22.04.2024 | Submission of project outlines (Deadline: 23:59)
24.04.2024 | Lecture: Association Analysis | Exercise: Association Analysis (Solution)
25.04.2024 | Feedback on project outlines
30.04.2024 | Project Work | Feedback on demand
08.05.2024 | Project Work | Feedback on demand
15.05.2024 | Project Work | Feedback on demand
17.05.2024 | Submission of project reports (Deadline: 23:59)
22.05.2024 | Presentation of project results
Final exam


For students who are not familiar with Python/Jupyter Notebooks, we offer an introduction on Wednesday, 14 February 2024, between 15:30 and 17:00 in room A 101, B6.