The paper was co-authored by Babette Bühler, formerly a student in the Mannheim Master in Data Science and now a PhD student at the Hector Research Institute of Education Sciences and Psychology at the University of Tübingen, and Heiko Paulheim. It proposes a new approach to classifying tables on the Web. Whereas existing approaches rely on explicit features derived from the HTML code, the paper instead renders each table as an image and classifies it with pre-trained neural networks for image classification.
A preprint of the paper will be available soon.
The code and data for the paper are available here.