SM445/ CS 707: Seminar “Machine Learning on Structured Data” (FSS 2026)
Structured data such as data frames and tables, relational databases, graphs, and knowledge bases are ubiquitous in applications. This seminar explores state-of-the-art approaches for machine learning on such data, including symbolic/
Schedule
The schedule contains all relevant deadlines of this seminar. Please find the schedule here (PDF, 148 kB).
Organization
- This seminar is organized by Prof. Dr. Rainer Gemulla, Simon Forbat, and Julie Naegelen.
- Available for up to 7 Master students (4 ECTS) and up to 5 Bachelor students (5 ECTS).
- Prerequisites: Solid background in machine learning (MSc students), Wirtschaftsinformatik IV (BSc students)
Goals
In this seminar, you will
- Read, understand, and explore scientific literature
- Summarize a current research topic in a concise report (10 single-column pages + references)
- Give two presentations about your topic (3-minute flash presentation, 15-minute final presentation)
- Moderate a scientific discussion about the topic of one of your fellow students
- Review (drafts of) reports of fellow students
Registration
Please register via Portal² by the official deadline.
If you are accepted into the seminar, attend the kickoff (see schedule) and email Julie Naegelen at least 4 preferred topic areas (your own and/or example topic areas; see below) by the deadline (see schedule).
The actual topic assignment takes place soon afterwards; we will notify you via email. Our goal is to assign one of your preferred topic areas to you.
Topic areas and topics
You will be assigned a larger topic area in an active, relevant field of machine learning based on your preferences. Your goals in this seminar are to
- Provide a short, concise overview of this topic area (1/4). A good starting point may be a book chapter, survey paper, or recent research paper. Here you take a bird's-eye view and are expected to discuss the main goals, challenges, and relevance of your topic area. Topic areas are selected at the beginning of the seminar.
- Present a self-selected topic within this topic area in more detail (3/4). A good starting point is a recent or highly influential research paper. Here you dive deep into one particular topic and are expected to discuss and explain the concrete problem statement, the concrete solution or contribution, and your own thoughts. The actual topic is selected before the first tutor meeting.
You are generally free to propose your topic area of interest as long as it aligns with the overall theme and objectives of the seminar.
Topic areas
Topic areas that are especially suitable for BSc students are marked as such.
- Models for tabular data
  - Symbolic approaches (BSc), such as gradient-boosted decision trees
  - Neural approaches, such as TURL
  - Foundation models, such as TabPFN / TabICL
  - LLM-based approaches (BSc), see survey
- Models for relational data
- Implementations
  - GNN frameworks (BSc) such as PyTorch Geometric or Deep Graph Library; must include some experiments
  - Accelerated / efficient GNN implementation, see survey
- Applications
- Propose your own topic
Important Conferences and Journals
Find below a list of some of the most important conferences/
- International Conference on Machine Learning (ICML)
- Conference on Neural Information Processing Systems (NeurIPS)
- International Conference on Learning Representations (ICLR)
- International Conference on Knowledge Discovery and Data Mining (KDD)
- AAAI Conference on Artificial Intelligence (AAAI)
- International Joint Conference on Artificial Intelligence (IJCAI)
- International Conference on Very Large Data Bases (VLDB)
- International Conference on Management of Data (SIGMOD)
- …
The websites of these venues typically let you browse their papers and provide search options.
Supplementary materials and references
- “Giving Conference Talks” (PDF, 1 MB) by Prof. Dr. Rainer Gemulla
- “Three Minute Thesis” by Sean McGraw
- “Writing for Computer Science” by Justin Zobel, Springer, 2014
