Data Mining (HWS 2025)

The course provides an introduction to advanced data analysis techniques as a basis for analyzing business data and providing input for decision support systems. The course will cover the following topics:

  • The Data Mining Process
  • Data Representation and Preprocessing
  • Classification
  • Regression
  • Clustering
  • Association Analysis

The course consists of a lecture together with accompanying practical exercises as well as student team projects. In the exercises, the participants will gain initial experience in applying state-of-the-art data mining libraries to realistic data sets. The team projects take place in the last third of the term. Within the projects, groups of students carry out more sophisticated data mining projects of their own choice and present the results in a written report and an oral presentation.

No lecture in the first week

Due to the course assignments, there will be no lecture in the first week (September 1st 2025).

Python Introduction

For all students who are not familiar with Python/Jupyter Notebooks, we offer an introduction on Thursday, 4 September 2025, from 13:45 to 15:15 in exercise room A104 Building B6, 26 Part A.

If you want to join the Python intro but cannot make the specified time slot, please send an email to Ezgi Yilmaz with your preferred time slot (12:00–13:30 or 15:30–17:00) by Wednesday, 3 September, 13:00.

  • Instructors

  • Time and Location

    • Lecture: Monday, 13:45 – 15:15, Room A 001 Building A 5,6 Part A (Start: 8th September 2025)
    • Exercises: Students should attend one of the three exercise groups. The contents are identical.
      • Thursday, 12:00 – 13:30, A104 Building B6, 26 Part A
      • Thursday, 13:45 – 15:15, A104 Building B6, 26 Part A
      • Thursday, 15:30 – 17:00, A104 Building B6, 26 Part A
  • Grading

    • 75 % written exam (we offer only a single exam and no re-take as the course is offered every semester)
    • 25 % project work (20% report, 5% presentation)
  • Registration

    • To attend the course, please register for the lecture in Portal 2. The course is limited to 90 participants. Places are not allocated on a “first come, first served” basis. Students in higher semesters and students who have failed the course are given preference; among equally ranked students, places are drawn randomly.
    • You don't have to register for the exercise.

Outline and Course Materials

Week | Monday (Offline Lecture) | Online Lecture (see Ilias Course) | Thursday (Exercise)
01.09.2025 | no lecture | | Introduction to Python (13:45–15:15)
08.09.2025 | Introduction to Data Mining | | Intro
15.09.2025 | Preprocessing | | Preprocessing
22.09.2025 | Classification 1 | Nearest Centroids | Classification 1
29.09.2025 | Classification 2 | Comparing Classifiers | Classification 2
06.10.2025 | Regression | Ensembles | Regression
13.10.2025 | Clustering and Anomalies | Hierarchical Clustering | Clustering
20.10.2025 | Feedback on project outlines | Time Series | Time Series
27.10.2025 | Association Analysis and Subgroup Discovery | Multi Modal Data | Association Analysis
03.11.2025 | Project feedback session | | Project Work
10.11.2025 | Project feedback session | | Project Work
17.11.2025 | Project feedback session | | Project Work
24.11.2025 | Project feedback session | | Project Work
01.12.2025 | Q&A | | Project Presentations

Important dates for the student projects:

  • Sunday, October 5th, 23:59: Deadline for team formation (all students without a team will be assigned to one afterwards)
  • Tuesday, October 14th, 23:59: Submission of project outlines
  • Sunday, November 30th, 23:59: Submission of final project reports
  • Wednesday, December 3rd, 23:59: Submission of project presentation (PDF)