Reading List:
Choose one of the papers listed below and email me your selection.
Association Rule and Frequent Pattern Mining
-
Beyond Market Baskets: Generalizing Association Rules to
Correlations, Craig Silverstein, Sergey Brin, Rajeev
Motwani, Data Mining and Knowledge Discovery, 2, 1998,
pp. 39-68
- H-Mine:
Hyper-structure Mining of Frequent Patterns in Large Databases, J.
Pei, J. Han, H. Lu, S. Tang, and D. Yang, Proc. of
the 2001 IEEE International Conference on Data Mining (ICDM'01),
San Jose, California, November 29-December 2, 2001.
- J. Pei and J. Han,
Constrained Frequent Pattern Mining: A Pattern-Growth View, ACM SIGKDD
Explorations (Special Issue on Constraints in Data Mining), June 2002.
- Scalable
Techniques for Mining Causal Structures, Craig
Silverstein, Rajeev Motwani, Sergey Brin, and Jeff D.
Ullman, Proceedings of the 24th International Conference
on Very Large Data Bases (VLDB), 1998
- H. Toivonen, M. Klemettinen, P. Ronkainen, K. Hätönen, H. Mannila,
Pruning and Grouping Discovered Association Rules,
Proceedings of the First International Conference on Knowledge Discovery in Databases
(KDD'95), Montreal, Canada, 1995.
- Brian Lent, Arun Swami and Jennifer Widom,
Clustering Association Rules, Proceedings of ICDE'97, Birmingham, England,
1997.
- Qian Wan and Aijun An,
An Efficient Approach to Mining Indirect Associations, Journal of Intelligent
Information Systems (JIIS), Kluwer Academic Publishers, Vol.27, No.2, 2006.
- Xindong Wu, Chengqi Zhang and Shichao Zhang, Efficient Mining of Both Positive and
Negative Association Rules. ACM Transactions on Information Systems, 22(2004), 3: 381-405.
(SCI).
- Guozhu Dong and Jinyan Li, Efficient Mining of Emerging Patterns: Discovering Trends and
Differences, KDD 1999: 43-52.
Spatial Association Rule Mining
- Koperski, K., and Han, J.,
Discovery of Spatial Association Rules in Geographic
Information Databases, Proc. 4th Int. Symp. Advances in Spatial Databases, 1995.
- Shekhar, S. and Huang, Y.,
Discovering Spatial Co-location Patterns: A Summary of Results, 2001.
- Malerba, D., Esposito, F. and Lisi, F.,
Mining Spatial Association Rules in Census Data, 2002.
Data Stream Mining
- Frequent Sequence Mining
- Approximate Frequency Counts over Data Streams, by Gurmeet
Singh Manku and Rajeev Motwani, in the
International Conference on Very Large Data Bases (VLDB) 2002.
- M.J. Zaki. SPADE: An
Efficient Algorithm for Mining Frequent
Sequences, Machine Learning, Vol.42, No.1/2, 2001.
- Jiong Yang, Wei Wang, Philip S. Yu: Infominer:
mining surprising periodic patterns. KDD 2001: 395-400
- R. C. Agarwal, C. C. Aggarwal, and V. Prasad.
Depth first generation of long patterns. In SIGKDD, 2000.
- J. Pei, J. Han, and W. Wang. Mining Sequential Patterns with
Constraints in Large Databases, Proc. of the 11th International Conference on Information
and Knowledge Management (CIKM'02), McLean, VA, November 4-9, 2002.
- CloSpan:
Mining Closed Sequential Patterns in Large Databases, Xifeng Yan, Jiawei Han, Ramin
Afshar, Proceedings of the Third SIAM International Conference on Data Mining, San Francisco,
CA, USA, May, 2003.
- Finding Recent Frequent Itemsets Adaptively over Online
Data Streams,
by Joong Hyuk Chang, Won Suk Lee, in the ACM International Conference on Knowledge
Discovery and Data Mining (SIGKDD) 2003.
- Classification
- Clustering
Graph Mining
- Efficiently mining frequent
trees in a forest, Mohammed J. Zaki, KDD 2002.
- Frequent Subgraph
Discovery, Michihiro Kuramochi and George Karypis, ICDM, 2001.
- Substructure Similarity Search in Graph Databases. Xifeng Yan, Philip Yu, Jiawei Han,
SIGMOD'05.
-
Frequent Subtree Mining - An Overview, Yun Chi, Siegfried Nijssen, Richard Muntz, Joost Kok, Fundamenta Informaticae Special Issue on
Graph and Tree Mining, 2005.
Decision Tree Learning
- RainForest: A framework for fast decision tree
construction of large datasets, In VLDB'98, pp. 416-427, New York, NY, 1998.
- Bagging,
Boosting, and C4.5, J. R. Quinlan, AAAI'96, pp. 725-730.
-
Learning Trees and Rules with Set-valued
Features, William W. Cohen, Proceedings of the Thirteenth National
Conference on Artificial Intelligence (AAAI-96), 1996.
- Cesar Ferri, Peter Flach and Jose Hernandez-Orallo, Learning Decision Trees Using the Area Under the ROC Curve, Proceedings of the 19th International Conference on Machine Learning, Morgan Kaufmann, July 2002,
pp.139-146.
Decision Rule Learning
Learning from Imbalanced Datasets
- PNrule: A New Framework for Learning Classifier
Models in Data Mining (A Case-Study in Network Intrusion Detection), Ramesh Agarwal and Mahesh V. Joshi,
2001.
- Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., SMOTE:
Synthetic Minority Over-sampling TEchnique, Journal of Artificial
Intelligence Research, 16, 2002, 321-357.
Clustering
-
CACTUS-Clustering Categorical Data Using Summaries, Venkatesh Ganti,
Johannes Gehrke, Raghu Ramakrishnan, Proc. 5th ACM SIGKDD
International Conference on Knowledge Discovery and Data
Mining (KDD-99), 1999 Aug, pp. 73-83.
-
Clustering Large Datasets in Arbitrary Metric Spaces, Venkatesh
Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison L.
Powell, James C. French, Proceedings of the 15th
International Conference on Data Engineering, 23-26 March
1999, Sydney, Australia, IEEE CS Press, 1999, pp.
502-511
- ROCK:
A Robust Clustering Algorithm for Categorical Attributes,
Sudipto Guha, Rajeev Rastogi, Kyuseok Shim, Proceedings
of the 15th International Conference on Data Engineering,
23-26 March 1999, Sydney, Australia, IEEE CS Press,
1999, pp. 512-521.
-
BIRCH: an efficient data clustering method for very large
databases, Tian Zhang, Raghu Ramakrishnan, Miron
Livny, Proceedings of the 1996 ACM SIGMOD international
conference on Management of data, 1996, pp. 103-114.
- CURE:
An Efficient Clustering Algorithm for Large Databases,
Sudipto Guha, Rajeev Rastogi, Kyuseok Shim, Proceedings
of the ACM SIGMOD Conference, 1998.
-
A Density-Based Algorithm for Discovering Clusters in Large
Spatial Databases with Noise, M. Ester, H.-P.
Kriegel, J. Sander, X. Xu, Proc. 2nd Int. Conf. on
Knowledge Discovery and Data Mining (KDD-96), Portland,
OR, 1996, pp. 226-231
Mining XML Documents
- J. W. W. Wan and G. Dobbie, Mining association rules from XML documents
using XQuery, In Proceedings of the second workshop on Australasian
information security, Data Mining and Web Intelligence, and Software
Internationalisation.
- Winkler and Spiliopoulou,
Extraction of Semantic XML DTDs from Texts Using Data Mining Techniques, K-CAP 2001
Workshop on Knowledge Markup and semantic annotation, Victoria, B.C., Canada, 2001.
- Braga, Campi, Ceri, Klemettinen, and Lanzi,
A Tool for Extracting XML Association Rules from XML Documents, ICTAI'02.
Web Mining
- Larry Page, Sergey Brin, R. Motwani, T. Winograd,
The PageRank
Citation Ranking: Bringing Order to the Web, Technical Report, Computer Science Department, Stanford University, 1998.
- J. Kleinberg, Authoritative sources in a hyperlinked environment, In Proc. Ninth Ann. ACM-SIAM Symp. Discrete Algorithms, pages 668-677, ACM Press, New York, 1998.
- Data mining of user navigation patterns,
J. Borges and M. Levene, In Web Usage Analysis and User
Profiling, pp. 92-111. Published by Springer-Verlag as Lecture Notes in
Computer Science, Vol. 1836, 2000.
- A
Framework for Collaborative, Content-Based and Demographic
Filtering, Michael J. Pazzani, Artificial
Intelligence Review.
- Learning to Extract Symbolic Knowledge
from the World Wide Web, M. Craven, D. DiPasquo, D. Freitag, A. McCallum,
T. Mitchell, K. Nigam and S. Slattery, Proceedings of the 15th National
Conference on Artificial Intelligence (AAAI-98), pp. 509-516,
Madison, WI. AAAI Press.
Privacy Preserving Data Mining
- Privacy Preserving Mining of Association Rules,
by A. Evfimievski, R. Srikant, R. Agrawal and J. Gehrke, KDD 2002.
- Using Randomized Response Techniques for
Privacy-Preserving Data Mining, by Wenliang Du and Zhijun Zhan, SIGKDD 2003.
- Privacy-Preserving K-Means Clustering over Vertically
Partitioned Data, by Jaideep Vaidya and Chris Clifton, SIGKDD 2003.
- Collaborative Filtering with Privacy, by John Canny, IEEE S&P 2002.