Supplementary material for ICDM 2016 paper “What You Will Gain By Rounding: Theory and Algorithms for Rounding Rank”

This page contains supplementary material for our ICDM 2016 paper “What You Will Gain By Rounding: Theory and Algorithms for Rounding Rank” by Stefan Neumann, Rainer Gemulla, and Pauli Miettinen.

Abstract

When factorizing binary matrices, we often have to make a choice between using expensive combinatorial methods that retain the discrete nature of the data and using continuous methods that can be more efficient but destroy the discrete structure. Alternatively, we can first compute a continuous factorization and subsequently apply a rounding procedure to obtain a discrete representation. But what will we gain by rounding? Will this yield lower reconstruction errors? Is it easy to find a low-rank matrix that rounds to a given binary matrix? Does it matter which threshold we use for rounding? Does it matter if we allow for only non-negative factorizations? In this paper, we approach these and further questions by presenting and studying the concept of rounding rank. We show that rounding rank is related to linear classification, dimensionality reduction, and nested matrices. We also report on an extensive experimental study that compares different algorithms for finding good factorizations under the rounding rank model.
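The "factorize continuously, then round" idea from the abstract can be illustrated with a minimal sketch. It is not one of the algorithms from the paper or from the source code below: it uses a plain truncated SVD as a stand-in continuous factorization, and it assumes that rounding maps an entry to 1 when it is at least a threshold tau (0.5 here) and to 0 otherwise. The function names and the small upper-triangular test matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def round_matrix(a, tau=0.5):
    """Round a real matrix to binary: 1 where the entry is >= tau, else 0 (assumed convention)."""
    return (np.asarray(a) >= tau).astype(int)


def truncated_svd(b, k):
    """Best rank-k approximation of b in Frobenius norm, via NumPy's SVD."""
    u, s, vt = np.linalg.svd(np.asarray(b, dtype=float), full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]


def rounding_error(b, k, tau=0.5):
    """Number of entries where the rounded rank-k approximation disagrees with b."""
    approx = truncated_svd(b, k)
    return int(np.sum(round_matrix(approx, tau) != np.asarray(b)))


if __name__ == "__main__":
    # Illustrative input: an 8 x 8 upper-triangular binary matrix of full rank.
    n = 8
    b = np.triu(np.ones((n, n), dtype=int))
    for k in range(1, n + 1):
        print(f"rank {k}: {rounding_error(b, k)} entries differ after rounding")
```

Running the script prints, for each rank k, how many entries of the rounded approximation differ from the input. A truncated SVD minimizes squared error before rounding, not the error after rounding, so it need not give the smallest rank that rounds to the input exactly.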

Publications

S. Neumann, R. Gemulla, P. Miettinen
What You Will Gain By Rounding: Theory and Algorithms for Rounding Rank [pdf, extended version]
To appear in ICDM, 2016

Resources

Source code and synthetic dataset generators (tar.gz)