Data Mining

Instructor: Hossein Jowhari
Semester: Fall 2019 (98-1)
Time: SUN, TUE 9:00-10:30 am
Textbooks:
(1) Data Mining: Concepts and Techniques (3rd Ed). Jiawei Han, Micheline Kamber and Jian Pei (book slides)
(2) Mining Massive Data Sets (2nd Ed). Jure Leskovec, Anand Rajaraman, Jeff Ullman (pdf)

Lectures:

Lecture | Subject | Material | References
1 | General information about the course | slides | -
2 | Introduction to data mining | slides | Textbook (1) - chapter 1, Textbook (2) - chapter 1
3 | Statistical descriptions of data, Some useful inequalities | - | Textbook (1) - chapter 2, Sushant Sachdeva's notes on concentration bounds (pdf)
4 | Finding similar items, Distance measures, minHash | - | Textbook (2) - chapter 3
5 | minHash, Locality Sensitive Hashing (LSH) | - | Textbook (2) - chapter 3
6 | LSH families of functions | - | Textbook (2) - chapter 3
7 | Clustering, K-means algorithm | - | Textbook (1) - chapter 10
8 | K-means algorithm, K-center Clustering | - | Textbook (1) - chapter 10, Sanjoy Dasgupta's notes (pdf), Michael Dinitz's notes (pdf)

Assignments:

  • hw1
  • hw2, data set

Useful Links:

  • Sample plots using matplotlib
  • A tutorial on how to draw box plots using Python