DWS Area: Web-based Systems

Billiger.de Products: Bilingual Entity Matching Benchmark released

We are happy to announce the release of Billiger.de Products, a bilingual benchmark for evaluating entity matching systems on German, English, and cross-language product data. Entity matching is the task of determining whether two records refer to the same real-world entity, for example, whether two ...

Mannheim Data Integration Benchmark (MaDI-Bench) released

We are happy to announce the release of the Mannheim Data Integration Benchmark (MaDI-Bench). Data integration combines heterogeneous data from multiple sources into a single, coherent dataset. Data integration involves a sequence of interdependent tasks including schema matching, value ...

Paper on WebMall Benchmark for Web Agents accepted at SIGIR 2026

We are happy to announce that the paper “WebMall – A Multi-Shop Benchmark for Evaluating Web Agents” by Ralph Peeters, Aaron Steiner, Luca Schwarz, Julian Caspary, and Christian Bizer has been accepted for publication at the 49th International ACM SIGIR Conference on Research and Development in ...

Paper on End-to-End Data Integration using LLMs accepted at Beyond SQL Workshop

We are happy to announce that the paper “Automatic End-to-End Data Integration using Large Language Models” by Aaron Steiner and Christian Bizer has been presented at the workshop “Beyond SQL – AI for Complex Data Management” at the IEEE International Conference on Data Engineering (ICDE). The ...

Paper on Agent Interfaces to the Web accepted at the Web Conference 2026

We are happy to announce that the paper “MCP vs RAG vs NLWeb vs HTML: A Comparison of the Effectiveness and Efficiency of Different Agent Interfaces to the Web” by Aaron Steiner, Ralph Peeters, and Christian Bizer has been accepted for the ACM Web Conference 2026 as a short paper. The Web conference ...

PyDI – Python Data Integration Framework Version 0.2 released

We are happy to announce the release of Version 0.2 of the PyDI Data Integration Framework. The new release features improved data normalization and translation methods, JSON Schema integration, additional use cases, as well as new tutorials and extended documentation.

Comparison of Agent Interfaces to the Web presented at Göttingen AI Developer Meetup

Aaron Steiner presented the results of a series of experiments comparing the effectiveness and efficiency of different interfaces for LLM agents to interact with websites at the Göttingen AI Developer Meetup.

Paper on LLM-based Table Annotation presented at ADBIS 2025

Keti Korini presented our paper “Evaluating Knowledge Generation and Self-refinement Strategies for LLM-Based Column Type Annotation” at the 29th European Conference on Advances in Databases and Information Systems in Tampere, Finland.

Alexander Brinkmann has successfully defended his PhD Thesis

Alexander Brinkmann has successfully defended his PhD thesis titled “Integrating Product Data from the Web using Deep Learning Techniques” today.

WebMall Benchmark for Evaluating LLM-Agents Released

We are happy to announce the initial release of the WebMall benchmark for evaluating the capability of LLM-agents to find and compare product offers across multiple e-shops.