This seminar covers the integration of data from large numbers of independent data sources. Topics include large-scale schema matching, identity resolution, data fusion, set completion, data search, and data profiling.
In this seminar, you will
The following books are good starting points for getting an overview of the topic of large-scale data integration:
Explore the list of topics below and select at least three that interest you. Send a ranked list of your selected topics via email to anna
Students are free to suggest additional topics of their choice that are related to large-scale data integration.
The final seminar presentations will take place on Friday, 04.05, and Monday, 07.05, in room A305, B6 Building A.
We have assigned the following timeslots:
|Collective Instance Matching||Friday, 04.05, 10:30 - 10:55|
|Holistic Schema Matching||Friday, 04.05, 10:55 - 11:20|
|Active Learning for Entity Resolution - Blocking||Friday, 04.05, 11:20 - 11:45|
|Active Learning for Entity Resolution - Query Strategies||Friday, 04.05, 11:45 - 12:10|
|Data Search for Table Extension||Friday, 04.05, 12:10 - 12:35|
|Dataspace Profiling||Monday, 07.05, 10:15 - 10:40|
|Wrapper Induction for Knowledge Base Completion||Monday, 07.05, 10:40 - 11:05|
|Set Completion using Semi-Structured Web Data||Monday, 07.05, 11:05 - 11:30|
|Truth Discovery for Knowledge Base Completion||Monday, 07.05, 11:30 - 11:55|