Data Mining
COSC-6412
Fall 2007
York University


Semester: Fall 2007
Course/Sect#: COSC-6412
Time: Tue and Thu, 10am-11:30am
Location: FS 106
Instructor: Aijun An
Office: CSB 2048
Office Hours: Thu 11:30am - 1:00pm
Phone #: 416-736-2100 x44298
e-mail: aan@cse.yorku.ca


Welcome to the Data Mining course, COSC-6412, for Fall 2007. Materials, instructions, and notices for the course will accumulate here over the semester.


Message Board

January 7, 2008
Grades are posted. You can check yours by using "courseInfo 6412 2007-08 F".
December 4, 2007
The final exam is set for Tuesday, December 11, from 10:00am to 12:00 noon in room CB 129 (Chemistry Building, room 129). Please note that both the time and the location have changed.
December 4, 2007
A solution to A1 is posted. Please see Solutions to Q2, Q4, Q5 and Q6 and Solution to Q3.
November 14, 2007
The deadline for submitting A2 is extended to November 18 at 6pm.
November 5, 2007
An FAQ page for A2 is set up. Please see A2 Frequently Asked Questions.
November 1, 2007
Assignment 2 is posted. See the link below in the "Assignments" section.
October 26, 2007
Paper presentation schedule is posted.
October 22, 2007
The reading list for student paper presentations is posted. See the links below in the "Paper Review and Presentation" section for the reading list and requirements for the presentation.
September 30, 2007
Assignment 1 is posted. See the link below under "Assignments".
September 5, 2007
The web site is set up. Welcome to the course!


Description

Data mining, or knowledge discovery from databases (KDD), is one of the most active areas of database research. It lies at the intersection of database systems, statistics, AI/machine learning, and data visualization. This course introduces the concepts of data mining and presents data mining algorithms and applications. Topics include association rule mining, sequential pattern mining, classification models, and clustering.


Prerequisites

  • Required: an introductory course on database systems and an introductory course on probability.
  • Preferred: basic knowledge of statistics.


Reference Books and Materials

  • Jiawei Han and Micheline Kamber, Data Mining -- Concepts and Techniques, Morgan Kaufmann, Second Edition, 2006.
  • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
  • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
  • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
  • Some conference/journal papers (more will be posted over the semester).


Grading Scheme

  • Assignments (20%)
  • Final exam (30%)
  • Paper review and presentation (10%)
  • Course project (30%)
  • Participation (10%)


Lecture Notes


Assignments

  • Assignment 1 (10%) (Due Tuesday October 16 in class)
  • Assignment 2 (10%) (Due Friday November 16 by 5pm, extended to Sunday November 18 by 6pm)


Paper Review and Presentation


Project


Useful On-line Information