RapidMiner Linked Open Data Extension

Winner of the Semantic Web Challenge 2014

The RapidMiner Linked Open Data Extension is an extension to the open source data mining software RapidMiner. It allows data from Linked Open Data to be used both as an input for data mining and for enriching existing datasets with background knowledge. The extension is based on the earlier FeGeLOD framework, which has since been discontinued.

Unlike many related approaches, the RapidMiner Linked Open Data Extension can work in a completely unsupervised fashion, which means that almost no knowledge of the data source used, or of technologies such as RDF and SPARQL, is required to use it.

Download

The RapidMiner Linked Open Data Extension is available from the RapidMiner marketplace.

To install the extension, go to the “Help”->“Updates and Extensions” menu in RapidMiner, and search for “Linked Open Data”.

Operators

The extension provides three main categories of operators:

  1. Data importers that load data from Linked Open Data into RapidMiner for further processing
  2. Linkers that create links from a given dataset to a dataset in Linked Open Data (e.g., linking a CSV file to DBpedia)
  3. Generators that gather data from Linked Open Data and add it as attributes in the data set at hand
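As a rough illustration of what a linker does, the following Python sketch guesses DBpedia URIs from plain labels by filling them into a URI pattern. It is not the extension's implementation: the helper names `link_by_pattern` and `add_links` and the column names `city` and `city_uri` are hypothetical, and a pattern-based guess may point at a missing or ambiguous resource.

```python
from urllib.parse import quote

# Namespace in which DBpedia publishes its resources.
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def link_by_pattern(label: str) -> str:
    """Guess a DBpedia URI for a label (hypothetical helper, not the extension's API).

    Spaces become underscores, following the Wikipedia-style naming that
    DBpedia resource names use; other special characters are percent-encoded.
    """
    return DBPEDIA_RESOURCE + quote(label.strip().replace(" ", "_"))


def add_links(rows, label_key="city", uri_key="city_uri"):
    """Return copies of the rows with an extra column holding the guessed URI."""
    return [{**row, uri_key: link_by_pattern(row[label_key])} for row in rows]


if __name__ == "__main__":
    table = [{"city": "Mannheim"}, {"city": "New York City"}]
    for row in add_links(table):
        print(row)
```

A generator can then use the URI column to look up further data about each row.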

There are different kinds of generators in the extension, such as

  • Adding data attributes, such as population
  • Adding types, such as “G20 country”
  • Adding aggregated relations, such as number of companies located in a city
  • Adding arbitrary data using customizable SPARQL statements

The operators provided by the Linked Open Data Extension can be used in conjunction with built-in RapidMiner operators as well as other extensions to build powerful data mining processes.
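The generator kinds listed above can be pictured as SPARQL queries issued for each linked entity. The Python sketch below only builds such query strings. It is an assumption about how these features could be expressed, not the queries the extension actually sends, and all function names are hypothetical.

```python
def data_attributes_query(uri: str) -> str:
    """Literal-valued properties of an entity, e.g. a population figure."""
    return (
        f"SELECT ?property ?value WHERE {{ <{uri}> ?property ?value . "
        f"FILTER(isLiteral(?value)) }}"
    )


def types_query(uri: str) -> str:
    """Classes the entity belongs to (its rdf:type values)."""
    return f"SELECT DISTINCT ?type WHERE {{ <{uri}> a ?type . }}"


def incoming_relation_counts_query(uri: str) -> str:
    """Aggregated relations: how many resources point at the entity, per property.

    For a city, one of the counted properties might link the companies
    located there to the city.
    """
    return (
        f"SELECT ?property (COUNT(?subject) AS ?count) "
        f"WHERE {{ ?subject ?property <{uri}> . }} GROUP BY ?property"
    )


if __name__ == "__main__":
    entity = "http://dbpedia.org/resource/Mannheim"
    print(data_attributes_query(entity))
    print(types_query(entity))
    print(incoming_relation_counts_query(entity))
```

Each result row would become one or more new attributes of the corresponding example in the dataset.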

Documentation

All operators, as well as example workflows, are described in the user manual.

Publications

The extension itself, as well as the underlying algorithms, are described in:

The following publications discuss various applications that use the RapidMiner LOD Extension (or its predecessor FeGeLOD):

 

Support & Community

If you use the RapidMiner LOD extension, you may want to join the Google Group at

https://groups.google.com/forum/#!forum/rmlod

or contact the user community via its mailing list:

rmlodmail-googlegroups.com

Team

Project lead:

Current team:

Past contributors: