Data Mining
Concepts and Techniques
- 4th Edition
- Morgan Kaufmann, July 2, 2022
- 752 pages
Chapter 1: Introduction
- 1.1. What is data mining?
- 1.2. Data mining: an essential step in knowledge discovery
- 1.3. Diversity of data types for data mining
- 1.4. Mining various kinds of knowledge
- 1.5. Data mining: confluence of multiple disciplines
- 1.6. Data mining and applications
- 1.7. Data mining and society
- 1.8. Summary
- 1.9. Exercises
- 1.10. Bibliographic notes
- Bibliography

Chapter 2: Data, measurements, and data preprocessing
- 2.1. Data types
- 2.2. Statistics of data
- 2.3. Similarity and distance measures
- 2.4. Data quality, data cleaning, and data integration
- 2.5. Data transformation
- 2.6. Dimensionality reduction
- 2.7. Summary
- 2.8. Exercises
- 2.9. Bibliographic notes
- Bibliography

Chapter 3: Data warehousing and online analytical processing
- 3.1. Data warehouse
- 3.2. Data warehouse modeling: schema and measures
- 3.3. OLAP operations
- 3.4. Data cube computation
- 3.5. Data cube computation methods
- 3.6. Summary
- 3.7. Exercises
- 3.8. Bibliographic notes
- Bibliography

Chapter 4: Pattern mining: basic concepts and methods
- 4.1. Basic concepts
- 4.2. Frequent itemset mining methods
- 4.3. Which patterns are interesting?—Pattern evaluation methods
- 4.4. Summary
- 4.5. Exercises
- 4.6. Bibliographic notes
- Bibliography

Chapter 5: Pattern mining: advanced methods
- 5.1. Mining various kinds of patterns
- 5.2. Mining compressed or approximate patterns
- 5.3. Constraint-based pattern mining
- 5.4. Mining sequential patterns
- 5.5. Mining subgraph patterns
- 5.6. Pattern mining: application examples
- 5.7. Summary
- 5.8. Exercises
- 5.9. Bibliographic notes
- Bibliography

Chapter 6: Classification: basic concepts and methods
- 6.1. Basic concepts
- 6.2. Decision tree induction
- 6.3. Bayes classification methods
- 6.4. Lazy learners (or learning from your neighbors)
- 6.5. Linear classifiers
- 6.6. Model evaluation and selection
- 6.7. Techniques to improve classification accuracy
- 6.8. Summary
- 6.9. Exercises
- 6.10. Bibliographic notes
- Bibliography

Chapter 7: Classification: advanced methods
- 7.1. Feature selection and engineering
- 7.2. Bayesian belief networks
- 7.3. Support vector machines
- 7.4. Rule-based and pattern-based classification
- 7.5. Classification with weak supervision
- 7.6. Classification with rich data type
- 7.7. Potpourri: other related techniques
- 7.8. Summary
- 7.9. Exercises
- 7.10. Bibliographic notes
- Bibliography

Chapter 8: Cluster analysis: basic concepts and methods
- 8.1. Cluster analysis
- 8.2. Partitioning methods
- 8.3. Hierarchical methods
- 8.4. Density-based and grid-based methods
- 8.5. Evaluation of clustering
- 8.6. Summary
- 8.7. Exercises
- 8.8. Bibliographic notes
- Bibliography

Chapter 9: Cluster analysis: advanced methods
- 9.1. Probabilistic model-based clustering
- 9.2. Clustering high-dimensional data
- 9.3. Biclustering
- 9.4. Dimensionality reduction for clustering
- 9.5. Clustering graph and network data
- 9.6. Semisupervised clustering
- 9.7. Summary
- 9.8. Exercises
- 9.9. Bibliographic notes
- Bibliography

Chapter 10: Deep learning
- 10.1. Basic concepts
- 10.2. Improve training of deep learning models
- 10.3. Convolutional neural networks
- 10.4. Recurrent neural networks
- 10.5. Graph neural networks
- 10.6. Summary
- 10.7. Exercises
- 10.8. Bibliographic notes
- Bibliography

Chapter 11: Outlier detection
- 11.1. Basic concepts
- 11.2. Statistical approaches
- 11.3. Proximity-based approaches
- 11.4. Reconstruction-based approaches
- 11.5. Clustering- vs. classification-based approaches
- 11.6. Mining contextual and collective outliers
- 11.7. Outlier detection in high-dimensional data
- 11.8. Summary
- 11.9. Exercises
- 11.10. Bibliographic notes
- Bibliography

Chapter 12: Data mining trends and research frontiers
- 12.1. Mining rich data types
- 12.2. Data mining applications
- 12.3. Data mining methodologies and systems
- 12.4. Data mining, people, and society

Data Mining: Concepts and Techniques, Fourth Edition introduces concepts, principles, and methods for mining patterns, knowledge, and models from various kinds of data for diverse applications. Specifically, it delves into the processes for uncovering patterns and knowledge from massive collections of data, known as knowledge discovery from data, or KDD. It focuses on the feasibility, usefulness, effectiveness, and scalability of data mining techniques for large data sets.
After an introduction to the concept of data mining, the authors explain the methods for preprocessing, characterizing, and warehousing data. They then partition the data mining methods into several major tasks, introducing concepts and methods for mining frequent patterns, associations, and correlations in large data sets; data classification and model construction; cluster analysis; and outlier detection. Concepts and methods for deep learning are introduced systematically in a chapter of their own. Finally, the book covers the trends, applications, and research frontiers in data mining.