Extending RapidMiner with Data Search and Integration Capabilities

Anna Lisa Gentile, Sabrina Kirstein, Heiko Paulheim and Christian Bizer.

Analysts increasingly find that the data they need for a data mining project exists somewhere on the Web or in an organization’s intranet, but they are unable to locate it. The data mining tools currently on the market offer a wide range of powerful data mining methods, but provide little support for searching for suitable data or for integrating data from multiple sources.

Our demonstration at ESWC2016 showed an extension to RapidMiner, a popular data mining framework, which enables analysts to search for relevant datasets and to integrate the discovered data with data they already know. In particular, we support the iterative extension of data tables with additional attributes. To this end, we propose (1) a data search and integration framework and (2) an initial open source implementation of the framework as a RapidMiner extension.
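
To make the idea of iteratively extending a data table more concrete, the following is a minimal sketch in plain Python. It is not the RapidMiner extension's actual implementation or API: the function names (normalize, extend_table), the key-matching strategy (case-insensitive exact match on one column), and all table contents are assumptions chosen purely for illustration.

```python
def normalize(value):
    """Hypothetical key normalization: trim whitespace and lowercase."""
    if value is None:
        return None
    return str(value).strip().lower()


def extend_table(base_rows, found_rows, key, new_attribute):
    """Return a copy of base_rows with new_attribute added from found_rows.

    Rows are matched on the normalized value of `key`. Base rows without
    a match receive None for the new attribute (a left join).
    """
    lookup = {}
    for row in found_rows:
        k = normalize(row.get(key))
        if k is not None and new_attribute in row:
            # Keep the first value seen for each key.
            lookup.setdefault(k, row[new_attribute])

    extended = []
    for row in base_rows:
        new_row = dict(row)  # do not mutate the analyst's original table
        new_row[new_attribute] = lookup.get(normalize(row.get(key)))
        extended.append(new_row)
    return extended


# Toy data only: the table the analyst already has ...
base = [{"city": "Alpha"}, {"city": "Beta"}, {"city": "Gamma"}]

# ... and two tables assumed to have been returned by a data search step.
found_tables = [
    ("population", [{"city": "alpha ", "population": 10},
                    {"city": "Beta", "population": 20}]),
    ("area", [{"city": "GAMMA", "area": 3.5},
              {"city": "Alpha", "area": 1.0}]),
]

# Iterative extension: each round adds one more attribute to the table.
table = base
for attribute, rows in found_tables:
    table = extend_table(table, rows, "city", attribute)

for row in table:
    print(row)
```

Each pass through the loop corresponds to one extension step: a discovered table contributes a single new column, and rows for which no matching value is found are left empty rather than dropped.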

The demo was one of the 19 demos accepted for presentation at the 13th European Semantic Web Conference (ESWC2016) and won the best demonstration award.

The RapidMiner extension is developed as part of the BMBF-funded research project DS4DM.