Version 2.0 of the CaLiGraph knowledge graph was released today. The graph comprises a detailed ontology with over 1 million classes, and a total of over 16 million entities.
Most extraction approaches create one entity per Wikipedia page. CaLiGraph, on the other hand, also applies entity recognition to tables and listings on Wikipedia pages, which leads to much larger coverage. At the same time, CaLiGraph learns the patterns that connect the entities in a listing, which allows it to also extract structured information about the newly discovered entities.
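The listing idea can be illustrated with a toy sketch. This is not CaLiGraph's actual implementation: the function name `infer_listing_facts`, the `min_support` threshold, and the sample data are all hypothetical, and the simple majority-vote rule only stands in for the pattern learning described above. The assumption is that if nearly all already-known entities in a listing share a fact, the unknown entities in the same listing probably share it too.

```python
from collections import Counter


def infer_listing_facts(items, known_facts, min_support=0.8):
    """Toy rule: propagate facts shared by the known entities of one listing.

    items: entity names mentioned in a single listing or table column.
    known_facts: maps already-known entity names to sets of (predicate, object).
    min_support: hypothetical threshold, the fraction of known entities that
        must share a fact before it is assigned to the new entities.
    Returns a dict mapping each previously unknown entity to its inferred facts.
    """
    known = [name for name in items if name in known_facts]
    if not known:
        return {}  # nothing to learn a pattern from

    # Count how many known entities in this listing carry each fact.
    counts = Counter(fact for name in known for fact in known_facts[name])
    shared = {fact for fact, n in counts.items() if n / len(known) >= min_support}

    # Assign the shared facts to entities that were not in the graph yet.
    return {name: set(shared) for name in items if name not in known_facts}


# Made-up example data: two albums are known, one is newly discovered.
listing = ["Album A", "Album B", "Album C"]
facts = {
    "Album A": {("type", "Album"), ("artist", "Band X")},
    "Album B": {("type", "Album"), ("artist", "Band X"), ("genre", "Rock")},
}
inferred = infer_listing_facts(listing, facts)
```

Here "Album C" would receive the type and artist facts that both known albums share, while the genre fact, held by only one of them, falls below the threshold and is not propagated.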
The dataset, as well as more information, can be found online.