SM445/CS 707: Data and Web Science Seminar (HWS 2022)

The Data and Web Science seminar covers recent topics in data and web science. This term's seminar is for BSc students only and focuses on foundational concepts and algorithms in data mining.

Organization

Goals

In this seminar, you will

  • Read, understand, and explore technical and scientific literature
  • Summarize a data mining topic in a concise report (10 single-column pages + references)
  • Give two presentations about your topic (3-minute flash presentation, 15-minute final presentation)
  • Moderate a scientific discussion about the topic of one of your fellow students
  • Review a fellow student's report (or a draft of it)

Schedule

  • Register as described below.
  • Attend the online kickoff meeting on TBD.
  • Work individually throughout the semester according to the seminar schedule.
  • Meet your advisor for guidance and feedback.

Registration

Register via Portal 2 by Sep 5.

If you are accepted into the seminar, provide at least 4 topics of your preference (your own and/or example topics; see below) by Sep 11 via email to Daniel Ruffinelli. The actual topic assignment takes place soon afterwards; we will notify you via email. Our goal is to assign one of your preferred topics to you.

Topics

Each student explores and presents a fundamental concept or algorithm in the area of data mining. Your presentation and report are typically based on (i) a book chapter (or two) and (ii) additional recent literature. Your goal is to clearly explain the approach and why it is relevant, and to briefly discuss recent work and applications. We provide example topics from Zaki and Meira, Jr.'s 2020 book below. If you want, you may suggest a different topic within the area of data mining (talk to us before topics are assigned).

Example topics

  1. Dimensionality reduction
  2. Itemset mining
  3. Sequence mining
  4. Graph pattern mining
  5. Rule mining
  6. Representative-based clustering
  7. Hierarchical clustering
  8. Density-based clustering
  9. Graph clustering
  10. Linear discriminant analysis
  11. Support vector machines
  12. Decision trees
  13. Multi-layer perceptrons (MLP)

Supplementary materials and references