DS4DM – Data Search for Data Mining

The research project Data Search for Data Mining (DS4DM) extends the data mining platform RapidMiner with data search and data integration functionalities.

Project Partners

  • RapidMiner GmbH
  • University of Mannheim, Data and Web Science Group

Key Idea

Analysts increasingly face the problem that data they need for a project is available somewhere on the Web or in the corporate intranet, yet they are unable to find it. The goal of the Data Search for Data Mining (DS4DM) project is to extend the data mining platform RapidMiner with data search and data integration functionalities that enable analysts to find relevant data in potentially very large data corpora and to semi-automatically integrate the discovered data with existing local data.

Expected Outcomes

  • Data search engine providing
    • keyword-based data search methods
    • correspondence-based data search methods
    • correlation-based data search methods
  • Interactive data integration environment
  • Indexing components for intranet and Web data sources
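
To make the three kinds of data search listed above more concrete, the following is a minimal sketch in Python. It is not the DS4DM implementation: the toy corpus, the local table, and the function names `keyword_search`, `correspondence_search`, and `correlation_search` are all hypothetical, and the scoring choices (header keyword hits, Jaccard value overlap, absolute Pearson correlation after a key join) are assumptions made only for illustration.

```python
from math import sqrt

# Hypothetical toy corpus: table name -> {column header -> list of values}.
CORPUS = {
    "country_population": {
        "country": ["Germany", "France", "Italy", "Spain"],
        "population_millions": [83.0, 67.0, 59.0, 47.0],
    },
    "country_capitals": {
        "country": ["Germany", "France", "Italy", "Spain"],
        "capital": ["Berlin", "Paris", "Rome", "Madrid"],
    },
    "city_rainfall": {
        "city": ["Bergen", "Cairo", "Lima"],
        "rainfall_mm": [2250.0, 25.0, 16.0],
    },
}

# Hypothetical local table that the analyst already has.
LOCAL = {
    "country": ["Germany", "France", "Italy", "Spain"],
    "gdp_billions": [4000.0, 2900.0, 2100.0, 1400.0],
}


def keyword_search(corpus, keywords):
    """Rank tables by the number of keywords found in their column headers."""
    scores = {}
    for name, table in corpus.items():
        headers = " ".join(table).lower()
        hits = sum(1 for kw in keywords if kw.lower() in headers)
        if hits:
            scores[name] = hits
    return sorted(scores, key=lambda n: (-scores[n], n))


def correspondence_search(corpus, local, key):
    """Rank tables by the best Jaccard overlap of any column with the local key column."""
    local_values = set(local[key])
    scores = {}
    for name, table in corpus.items():
        best = 0.0
        for values in table.values():
            candidate = set(values)
            union = local_values | candidate
            overlap = len(local_values & candidate) / len(union) if union else 0.0
            best = max(best, overlap)
        if best > 0:
            scores[name] = best
    return sorted(scores, key=lambda n: (-scores[n], n))


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_search(corpus, local, key, target):
    """Rank (table, column, |r|) triples by correlation with the local target column."""
    results = []
    for name, table in corpus.items():
        if key not in table:
            continue  # this table cannot be joined on the chosen key
        row_of = {k: i for i, k in enumerate(table[key])}
        for column, values in table.items():
            if column == key:
                continue
            xs, ys = [], []
            for k, y in zip(local[key], local[target]):
                i = row_of.get(k)
                if i is not None and isinstance(values[i], (int, float)):
                    xs.append(float(values[i]))
                    ys.append(y)
            if len(xs) >= 2:
                results.append((name, column, abs(pearson(xs, ys))))
    return sorted(results, key=lambda r: -r[2])


if __name__ == "__main__":
    print(keyword_search(CORPUS, ["population"]))
    print(correspondence_search(CORPUS, LOCAL, "country"))
    print(correlation_search(CORPUS, LOCAL, "country", "gdp_billions"))
```

In this sketch, keyword search only looks at column headers, correspondence search looks for tables whose values overlap with a local column, and correlation search joins each candidate table to the local table and ranks its numeric columns by how strongly they correlate with a chosen local attribute. A real system would additionally need indexing, schema and entity matching, and handling of noisy Web data.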


The project is funded by the German Federal Ministry of Education and Research (BMBF) under the funding scheme KMU-innovativ with a funding amount of 400k €.

Duration: 2015–2018

Personnel at University of Mannheim

Project Summary

Main Project Website

Additional project resources



  1. Kleppmann, B., Bizer, C., Yaqub, E., Temme, F., Schlunder, P., Arnu, D. and Klinkenberg, R.: Density- and Correlation-based Table Extension. In Gemulla, R., LWDA 2018: Proceedings of the Conference „Lernen, Wissen, Daten, Analysen“, Mannheim, Germany, 22–24 August 2018 (pp. 191–194). CEUR Workshop Proceedings, RWTH: Aachen.
  2. Anna Lisa Gentile, Sabrina Kirstein, Heiko Paulheim and Christian Bizer: Extending RapidMiner with data search and integration capabilities. The Semantic Web: ESWC 2016 Satellite Events, Springer, 2016.
  3. Petar Ristoski, Christian Bizer, Heiko Paulheim: Mining the Web of Linked Data with RapidMiner. Journal of Web Semantics, 2015.
  4. Oliver Lehmberg, Dominique Ritze, Petar Ristoski, Robert Meusel, Heiko Paulheim, Christian Bizer: The Mannheim Search Join Engine. Journal of Web Semantics, 2015 (In Press).
  5. Dominique Ritze, Oliver Lehmberg, Christian Bizer: Matching HTML Tables to DBpedia. 5th International Conference on Web Intelligence, Mining and Semantics (WIMS2015), Limassol, Cyprus, July 2015.