WDC Products: Multi-Dimensional Entity Matching Benchmark released
We are happy to announce the release of the multi-dimensional WDC Products Benchmark for entity matching. WDC Products is based on product data that was extracted in 2020 from 3259 e-shops that mark up product offers within their HTML pages using the schema.org vocabulary. It contains overall ...
WebDataCommons releases 86.4 billion quads Microdata, Embedded JSON-LD, RDFa, and Microformat data originating from 14.2 million websites
The DWS group is happy to announce the new release of the WebDataCommons Microdata, JSON-LD, RDFa and Microformat data corpus.
SOTAB wins Dataset Track of SemTab Challenge at ISWC 2022
We are happy to announce that the Web Data Commons – Schema.org Table Annotation Benchmark (WDC SOTAB) has won the Dataset Track of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab) at the International Semantic Web Conference 2022.
Anna Primpeli has successfully defended her PhD Thesis
Anna Primpeli has successfully defended her PhD thesis titled “Reducing the Labeling Effort for Entity Resolution using Distant Supervision and Active Learning” today.
Team WBSG wins ACM SIGMOD Programming Contest 2022
We are happy to announce that team WBSG, consisting of the PhD students Alexander Brinkmann and Ralph Peeters, has won the ACM SIGMOD Programming Contest 2022. Altogether, 55 teams from all over the world participated in the contest.
Two Papers accepted at ESWC 2022
We are happy to announce that the papers “Impact of the Characteristics of Multi-Source Entity Matching Tasks on the Performance of Active Learning Methods” by Anna Primpeli and Christian Bizer and “Supervised Knowledge Aggregation for Knowledge Graph Completion” by Patrick Betz, Christian ...
Two Posters about Product Matching using Deep Learning accepted at WWW2022 Conference
We are happy to announce that two posters about product matching using deep learning have been accepted at ACM TheWebConf 2022 (WWW2022). The conference takes place from 25 to 29 April 2022. Below you will find the titles and abstracts of both posters as well as links to the arXiv pre-prints.
WebDataCommons releases 82.1 billion quads Microdata, Embedded JSON-LD, RDFa, and Microformat data originating from 14.6 million websites
The DWS group is happy to announce the new release of the WebDataCommons Microdata, JSON-LD, RDFa and Microformat data corpus.
Paper accepted at ISWC 2021
The paper “Graph-boosted Active Learning for Multi-Source Entity Resolution” by Anna Primpeli and Christian Bizer was accepted at ISWC 2021.
Paper accepted for PVLDB 2021
The paper “Dual-Objective Fine-Tuning of BERT for Entity Matching” by Ralph Peeters and Christian Bizer has been accepted for publication in the Proceedings of the VLDB Endowment (PVLDB) 2021. The paper will be presented at the VLDB 2021 conference in Copenhagen, Denmark, in August.