Data Mining (FSS2024)

The course provides an introduction to advanced data analysis techniques as a basis for analyzing business data and providing input for decision support systems. The course will cover the following topics:

  • The Data Mining Process
  • Data Representation and Preprocessing
  • Clustering
  • Classification
  • Regression
  • Association Analysis
  • Text Mining

The course consists of a lecture with accompanying practical exercises and student team projects. In the exercises, participants gain initial experience in applying state-of-the-art data mining libraries to realistic data sets. The team projects take place in the last third of the term. In these projects, groups of students carry out more sophisticated data mining projects of their own choice and present their results in a written report and an oral presentation.

  • Exam Review

    The exam review for HWS2023 will take place on Thursday, February 29th 2024.

    To attend the exam review, you have to register by emailing Bianca Lermer by Sunday, February 25th 2024. We will then allocate you a time slot for the review.

  • Instructors

  • Time and Location

    • Lecture: Wednesday, 10.15 – 11.45, Room A5 B144 (Start: 14.2.2024)
    • Exercises: Students should attend one of the three exercise groups. The contents are identical.
      • Thursday, 10.15 – 11.45, Room B6 A104 (Keti)
      • Thursday, 12.00 – 13.30, Room B6 A104 (Alex)
      • Thursday, 13.45 – 15.15, Room B6 A104 (Ralph)
  • Grading

    • 75 % written exam (we offer only a single exam and no re-take as the course is offered every semester)
    • 25 % project work (20 % report, 5 % presentation)
  • Registration

    • To attend the course, please register for the lecture in Portal 2. The course is limited to 90 participants. Places are not allocated on a first-come, first-served basis: students in higher semesters and students who failed the course in HWS 2023 are given priority, and places among equally ranked students are drawn at random.
    • You do not have to register for the exercises.

Outline and Course Materials

Week        Wednesday                                    Thursday
14.02.2024  Lecture: Introduction to Data Mining         Exercise: Preprocessing/Visualization (Solution)
            Tutorial: Introduction to Python (Solution)
21.02.2024  Lecture: Cluster Analysis                    Exercise: Cluster Analysis (Solution)
28.02.2024  Lecture: Classification 1                    Exercise: Classification (Solution)
06.03.2024  Lecture: Classification 2                    Exercise: Classification
13.03.2024  Lecture: Classification 3                    Exercise: Classification
20.03.2024  Lecture: Regression                          Exercise: Regression
            - Easter Break -
10.04.2024  Lecture: Text Mining                         Exercise: Text Mining
17.04.2024  Introduction to the student projects         Preparation of project outline
            and group formation
22.04.2024  Submission of project outlines (Deadline: 23:59)
24.04.2024  Lecture: Association Analysis                Exercise: Association Analysis
25.04.2024  Feedback on project outlines
30.04.2024  Project Work                                 Feedback on demand
08.05.2024  Project Work                                 Feedback on demand
15.05.2024  Project Work                                 Feedback on demand
17.05.2024  Submission of project reports (Deadline: 23:59)
22.05.2024  Presentation of project results
XX.06.2024  Final exam

For students who are not familiar with Python or Jupyter Notebooks, we offer an introduction on Wednesday, 14 February 2024, from 15:30 to 17:00 in room A 101, B6.