Example: Hybrid Recommender System Using Linked Open Data

This process uses the RapidMiner Linked Open Data extension and the Recommender extension to build a hybrid, Linked Open Data-enabled recommender system for books. It shows how different types of features extracted from Linked Open Data can be combined to build a content-based recommender, which can then be combined with collaborative user-based and item-based recommenders. The corresponding RapidMiner workflow can be downloaded from myExperiment. The data was obtained from the Linked Open Data-enabled Recommender Systems Challenge, and the input data files for the process are also available for download.

Fig. 1 depicts the overall workflow in RapidMiner. First, we read the input CSV file, in which all books are already linked to DBpedia. We use these links to extract additional features from DBpedia with three generators: Direct Types, Specific Relation, and Custom SPARQL Generator. All generators are configured with the standard DBpedia SPARQL endpoint. The Specific Relation generator extracts the direct categories of each book in the dataset.
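To make the feature extraction step more concrete, the following Python sketch builds the kind of SPARQL queries that a direct-types lookup and a single-relation lookup could send to DBpedia. It only constructs query strings and does not contact the endpoint. The function names are hypothetical, and the use of dct:subject for "direct categories" is an assumption; the queries the generators issue internally may differ.

```python
# Public DBpedia SPARQL endpoint (shown for reference; not contacted here).
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"

# Assumption: DBpedia categories are attached to a resource via dct:subject.
DCT_SUBJECT = "http://purl.org/dc/terms/subject"


def direct_types_query(entity_uri: str) -> str:
    """Build a query for the rdf:type values of one entity (hypothetical helper)."""
    return f"SELECT DISTINCT ?type WHERE {{ <{entity_uri}> a ?type . }}"


def specific_relation_query(entity_uri: str, relation_uri: str) -> str:
    """Build a query for the objects of one relation of one entity (hypothetical helper)."""
    return (
        f"SELECT DISTINCT ?value WHERE {{ <{entity_uri}> <{relation_uri}> ?value . }}"
    )


def to_binary_features(prefix: str, values: list[str]) -> dict[str, int]:
    """Turn a list of retrieved URIs into binary feature columns for one book."""
    return {f"{prefix}_{value}": 1 for value in values}


if __name__ == "__main__":
    book = "http://dbpedia.org/resource/Dune_(novel)"
    print(direct_types_query(book))
    print(specific_relation_query(book, DCT_SUBJECT))
```

Each distinct type or category returned for a book would then become one binary attribute of that book, which is what `to_binary_features` illustrates.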

We use the Custom SPARQL Generator to retrieve the genres of each book's author, together with the genres of related authors, i.e. the authors that influenced the current author and the authors that were influenced by the current author. Fig. 2 depicts part of that query.
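A query of this shape can be sketched as follows. This is not the query from Fig. 2: it is an illustrative reconstruction that assumes the DBpedia ontology properties dbo:author, dbo:genre, dbo:influencedBy and dbo:influenced, and the helper name is hypothetical. The Python code only assembles the query text.

```python
PREFIXES = "PREFIX dbo: <http://dbpedia.org/ontology/>\n"


def author_genre_query(book_uri: str) -> str:
    """Build a SPARQL query for author-related genres of one book.

    The three UNION branches cover: the author's own genres, the genres of
    authors who influenced the author, and the genres of authors whom the
    author influenced. Property names are assumptions about DBpedia usage.
    """
    return PREFIXES + f"""SELECT DISTINCT ?genre WHERE {{
  <{book_uri}> dbo:author ?author .
  {{ ?author dbo:genre ?genre . }}
  UNION
  {{ ?author dbo:influencedBy ?other . ?other dbo:genre ?genre . }}
  UNION
  {{ ?author dbo:influenced ?other . ?other dbo:genre ?genre . }}
}}
"""


if __name__ == "__main__":
    print(author_genre_query("http://dbpedia.org/resource/Dune_(novel)"))
```

Using UNION rather than three separate queries keeps all genre features for a book in one result set, at the cost of losing which branch produced each genre.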

The outputs of the generators are joined into a single dataset, which is then converted into the format required by the item attribute based k-NN operator of the Recommender extension. More about the input formats, as well as instructions for using the Recommender extension, can be found in the extension manual. Using the item attribute based k-NN operator, we build a content-based recommender that uses all features generated from DBpedia.
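The idea behind an attribute-based item k-NN recommender can be illustrated with a small, self-contained Python sketch. It assumes binary features stored as sets, cosine similarity between items, and a similarity-weighted average over the k most similar rated items; the operator in the Recommender extension may use a different similarity measure or prediction formula, and all names below are hypothetical.

```python
import math


def join_features(*feature_sets: dict[str, set[str]]) -> dict[str, set[str]]:
    """Join per-generator feature sets into one feature set per book."""
    joined: dict[str, set[str]] = {}
    for features in feature_sets:
        for book, values in features.items():
            joined.setdefault(book, set()).update(values)
    return joined


def cosine(a: set[str], b: set[str]) -> float:
    """Cosine similarity between two binary feature vectors given as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def predict_content_knn(user_ratings, item_features, target, k=2):
    """Predict one user's rating of `target` from that user's k most similar rated items."""
    scored = [
        (cosine(item_features[target], item_features[item]), rating)
        for item, rating in user_ratings.items()
        if item != target
    ]
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None  # no similar rated item, so no prediction
    return sum(sim * rating for sim, rating in top) / total


if __name__ == "__main__":
    types = {"A": {"type_Book"}, "B": {"type_Book"}, "C": {"type_Cookbook"}}
    genres = {"A": {"genre_SciFi"}, "B": {"genre_SciFi"}}
    features = join_features(types, genres)
    print(predict_content_knn({"B": 5, "C": 1}, features, "A"))
```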

Next, we read the training data, which contains user ratings for different books. The data is used to build two collaborative recommenders: a user-based k-NN recommender and an item-based k-NN recommender. Both recommenders are combined with the previously built content-based recommender using the Model Combiner operator. The output of this operator is a single model that can be used to predict ratings of unseen books.
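The text does not state how the Model Combiner merges the individual predictions. The Python sketch below assumes a weighted average over the models that can produce a prediction, which is one common way to build a hybrid; the `combine` function and the stand-in models are hypothetical and not part of the Recommender extension.

```python
from typing import Callable, Optional, Sequence

# A predictor maps (user, item) to a rating, or None if it cannot predict.
Predictor = Callable[[str, str], Optional[float]]


def combine(
    models: Sequence[Predictor],
    weights: Optional[Sequence[float]] = None,
) -> Predictor:
    """Return a single predictor that averages the given models' predictions."""
    if weights is None:
        weights = [1.0] * len(models)

    def predict(user: str, item: str) -> Optional[float]:
        weighted_sum = 0.0
        weight_total = 0.0
        for model, weight in zip(models, weights):
            rating = model(user, item)
            if rating is not None:  # skip models with no prediction
                weighted_sum += weight * rating
                weight_total += weight
        return weighted_sum / weight_total if weight_total else None

    return predict


if __name__ == "__main__":
    # Stand-ins for the content-based, user k-NN and item k-NN recommenders.
    content_based = lambda user, item: 4.0
    user_knn = lambda user, item: 2.0
    item_knn = lambda user, item: None  # e.g. an item with no ratings yet
    hybrid = combine([content_based, user_knn, item_knn])
    print(hybrid("user_1", "book_42"))
```

Skipping models that return no prediction reflects one reason for building a hybrid in the first place: the content-based model can still rate a book that has no ratings yet, where the collaborative models cannot.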