Data Mining (HWS 2018)
The course provides an introduction to advanced data analysis techniques as a basis for analyzing business data and providing input for decision support systems. The course will cover the following topics:
- Goals and Principles of Data Mining
- Data Representation and Preprocessing
- Clustering
- Classification
- Association Analysis
- Text Mining
- Systems and Applications (e.g. Retail, Finance, Web Analysis)
The course consists of a lecture with accompanying practical exercises and student team projects. In the exercises, participants gain initial experience in applying state-of-the-art data mining tools to realistic data sets. The team projects take place in the last third of the term. Within the projects, students carry out more sophisticated data mining projects of their own choice and report on the results in a written report and an oral presentation.
Exam Review
The review for the first and second exam of FSS2018 will take place on Friday, 28 September, at 13:00 in Room B6 C101.
Time and Location
- Lecture: Wednesday, 10.15 – 11.45, Room A5, C0.14
- Exercise 1: Thursday, 12.00 – 13.30, Room B6, A2.04 (Python)
- Exercise 2: Thursday, 13.45 – 15.15, Room B6, A2.04 (RapidMiner)
- Exercise 3: Thursday, 15.30 – 17.00, Room B6, A2.04 (Python)
Note: There are three parallel exercise groups; you should attend only one of them.
Instructors
- Prof. Dr. Heiko Paulheim
- Nicolas Heist (Exercise 1)
- Oliver Lehmberg (Exercise 2)
- Kiril Gashteovski (Exercise 3)
Final exam
- 60 % written exam
- 40 % project work
Registration
To attend the course, please register for the lecture in Portal 2 (link to lecture and exercise). The course is limited to 80 participants. From this semester on, there is a new process for course registration and allocation: places are no longer assigned on a “first come, first served” basis. Students in higher semesters will be preferred, and equally ranked students will be drawn randomly. You can register from 13 August until 29 August.
We offer three alternative times (Thursdays 12.00, 13.45 and 15.30) for the exercise session. Choose one and attend the exercise at the corresponding time (you don't have to register for it).
Outline
Week | Wednesday | Thursday
03.09.2018 | Introduction to Data Mining | --
10.09.2018 | Lecture Clustering | Introduction RapidMiner/Python, Exercise Clustering
17.09.2018 | Lecture Classification 1 | Exercise Classification
24.09.2018 | Lecture Classification 2 | Exercise Classification
01.10.2018 | Holiday | --
08.10.2018 | Introduction to Student Projects and Group Formation (Attendance obligatory) | Work on Project proposals
15.10.2018 | Lecture Classification 3 | Exercise Classification
22.10.2018 | Lecture Text Mining | Exercise Text Mining
29.10.2018 | Lecture Association Analysis | --
05.11.2018 | Feedback on demand | Exercise Association Analysis
12.11.2018 | Feedback on demand | Project Work
19.11.2018 | Feedback on demand | Project Work
26.11.2018 | Feedback on demand | Submission of project results
03.12.2018 | Presentation of project results | Presentation of project results
Literature
Pang-Ning Tan, Michael Steinbach, Vipin Kumar: Introduction to Data Mining, Pearson.
Vijay Kotu, Bala Deshpande: Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner. Morgan Kaufmann.
Videos and Screen Casts
- Video recordings of the Data Mining I lectures and screen casts of the exercises are available here.