
Data Mining (HWS2019)

The course provides an introduction to advanced data analysis techniques as a basis for analyzing business data and providing input for decision support systems. The course will cover the following topics:

  • Goals and Principles of Data Mining
  • Data Representation and Preprocessing
  • Clustering
  • Classification
  • Regression
  • Association Analysis
  • Text Mining
  • Systems and Applications (e.g. Retail, Finance, Web Analysis)

The course consists of a lecture with accompanying practical exercises and student team projects. In the exercises, participants gain initial experience in applying state-of-the-art data mining tools to realistic data sets. The team projects take place in the last third of the term. In the projects, students carry out more sophisticated data mining projects of their own choice and present the results in a written report and an oral presentation.

Exam Review

The exam review for the FSS2019 exam will take place on Friday, September 27th, at 14:00 in building B6 26, room C1.01 (use the blue door and go to the first floor). There is no second exam for FSS2019. The next opportunity to retake the project and exam is in HWS2019.

  • Time and Location

    • Lecture: Wednesday, 10.15 - 11.45, Room A5 6, C015
    • Exercise 1: Thursday, 12.00 - 13.30, Room B6 26, A104  Nicolas Heist (RapidMiner)
    • Exercise 2: Thursday, 13.45 - 15.15, Room A5 6, C012  Sven Hertling (Python)
    • Exercise 3: Thursday, 15.30 - 17.00, Room A5 6, C012  Ralph Peeters (Python)

    Note: there are three parallel exercise groups; you are supposed to attend only one.

  • Instructors

  • Final exam

    • 75% written exam
    • 25% project work (20% report, 5% presentation)
  • Registration

    • To attend the course, please register for the lecture in Portal 2. The course is limited to 80 participants. Places are not allocated on a first-come, first-served basis: students in higher semesters are given preference, and places among equally ranked students are drawn at random.
    • We offer three alternative times (Thursdays 12.00, 13.45 and 15.30) for the exercise session. Choose one and attend the exercise at the corresponding time (you don't have to register for it).

Slides and Exercises

Slides:

Exercises:

Solutions and additional material can be found in the ILIAS group of the course.

Outline

For all students who are not familiar with Python/Jupyter Notebooks, we offer an introduction on Wednesday, September 4th, 2019, from 15:30 to 17:00 in room A5 6, C015.

Week | Wednesday | Thursday
02.09.2019 | Introduction to Data Mining; Introduction to Python (see above) | Exercise Preprocessing/Visualization
09.09.2019 | Lecture Clustering | Exercise Clustering
16.09.2019 | Lecture Classification 1 | Exercise Classification
23.09.2019 | Lecture Classification 2 | Exercise Classification
30.09.2019 | Lecture Classification 3 | Holiday (no exercise)
07.10.2019 | Lecture Regression | Exercise Regression
14.10.2019 | Lecture Text Mining | Exercise Text Mining
21.10.2019 | Lecture Association Analysis | Exercise Association Analysis
28.10.2019 | Introduction to Student Projects and Group Formation (Attendance obligatory) | Preparation of Project Outlines
04.11.2019 | Feedback on demand | Project Work
11.11.2019 | Feedback on demand | Project Work
18.11.2019 | Feedback on demand | Project Work
25.11.2019 | Feedback on demand | Presentation of project results
02.12.2019 | Submission of project results