SM445 / CS 707: Seminar “Machine Learning on Structured Data” (FSS 2026)
This seminar studies machine learning models for structured data, covering tabular, relational, and graph-based approaches. We will discuss models such as gradient-boosted decision trees, neural architectures for tabular data, relational deep learning, and knowledge graph models. You will learn to explore, read, and understand scientific literature in machine learning and to present your results in a written report and a presentation.
Schedule
TBD.
Organization
- This seminar is organized by Prof. Dr. Rainer Gemulla, Simon Forbat, and Julie Naegelen.
- Available for up to 7 Master's students (4 ECTS) and up to 5 Bachelor's students (5 ECTS).
- Prerequisites: Solid background in machine learning (MSc students), Wirtschaftsinformatik IV (BSc students)
Goals
In this seminar, you will
- Read, understand, and explore scientific literature
- Summarize a current research topic in a concise report (10 single-column pages + references)
- Give two presentations about your topic (3-minute flash presentation, 15-minute final presentation)
- Moderate a scientific discussion about the topic of one of your fellow students
- Review (drafts of) reports of fellow students
Registration
Please register via Portal² by the official deadline.
If you are accepted into the seminar, attend the kickoff (see schedule) and email Julie Naegelen at least 4 preferred topic areas (your own and/or the example topic areas below) by the deadline (see schedule).
The actual topic assignment takes place soon afterwards; we will notify you via email. Our goal is to assign one of your preferred topic areas to you.
Topic areas and topics
You will be assigned a larger topic area in an active, relevant field of machine learning based on your preferences. Your goals in this seminar are to:
- Provide a short, concise overview of this topic area (1/4). A good starting point may be a book chapter, survey paper, or recent research paper. Here you take a bird's-eye view and are expected to discuss the main goals, challenges, and relevance of your topic area. Topic areas are selected at the beginning of the seminar.
- Present a self-selected topic within this topic area in more detail (3/4). A good starting point is a recent or highly influential research paper. Here you dive deep into one particular topic and are expected to discuss and explain the concrete problem statement, the concrete solution or contribution, and your own thoughts. The actual topic is selected before the first tutor meeting.
You are generally free to propose your topic area of interest as long as it aligns with the overall theme and objectives of the seminar.
Potential topic areas and topics:
- Models for tabular data:
- Symbolic approaches (BSc) — gradient-boosted decision trees
- End-to-end neural approaches — TURL
- Foundation models — TabPFN / TabICL
- LLM-based approaches — start with a recent survey, then choose one approach
- Models for relational data:
- Relational deep learning:
- Benchmarks and Feature Engineering (BSc) — RelBench
- Deep Learning Methods (MSc) — RelGNN, GraphSAGE in RelBench
- Knowledge graphs:
- Symbolic approaches — AnyBURL, AMIE
- Transductive approaches — ComplEx
- Inductive approaches — NBFNet, A*Net
- Foundation models — ULTRA, HYPER
- GraphRAG
- Implementations:
- Frameworks:
- Compare GNN frameworks (must include some experiments) (BSc)
- Accelerated GNN implementations:
- Efficiency (MSc)
- Applications
- Propose your own topic.
Important Conferences and Journals
Below is a list of some of the most important conferences and journals:
- International Conference on Machine Learning (ICML)
- Conference on Neural Information Processing Systems (NeurIPS)
- International Conference on Learning Representations (ICLR)
- International Conference on Knowledge Discovery and Data Mining (KDD)
- AAAI Conference on Artificial Intelligence (AAAI)
- International Joint Conference on Artificial Intelligence (IJCAI)
- International Conference on Very Large Data Bases (VLDB)
- International Conference on Management of Data (SIGMOD)
- …
The websites of these venues typically let you browse and search their papers.
Supplementary materials and references
- “Giving Conference Talks” (PDF, 1 MB) by Prof. Dr. Rainer Gemulla
- “Writing for Computer Science” by Justin Zobel, Springer, 2014
