CSC 422 - Automated Learning and Data Analysis
Catalog Description: This course provides an introduction to concepts and methods for extracting knowledge or other useful forms of information from data. This activity, also known under names including data mining, knowledge discovery, and exploratory data analysis, plays an important role in modern science, engineering, medicine, business, and government. Students will apply supervised and unsupervised automated learning methods to extract patterns, make predictions, and identify groups from data. Students will also learn about the overall process of data collection and analysis that provides the setting for knowledge discovery, and concomitant issues of privacy and security. Examples and projects introduce the students to application areas including electronic commerce, information security, biology, and medicine. Students cannot get credit for both CSC 422 and CSC 522.
Contact Hours:
- Lecture: 3 hours
Co-requisites: None
Restrictions: None
Coordinator: Dr. Min Chi
Textbook: Introduction to Data Mining
Course Outcomes:
- Identify and contrast the major types of data and data representations with clear examples;
- List and explain the problems arising in preparing data for analysis, and the methods for addressing these problems;
- List and explain representative benefits and dangers of automated learning and data analysis;
- Identify ethical issues in data analysis applications, such as the impacts of data bias;
- Implement and apply various methods for supervised and unsupervised automated learning (e.g. Decision Trees, KNN, Naive Bayes, ANNs, Regression, Clustering);
- Compare the strengths, weaknesses, and prerequisites of automated learning techniques;
- Explain and contrast methods for evaluating the performance of automated learning algorithms (e.g. holdout, k-fold cross-validation, and leave-one-out cross-validation);
- Design a detailed plan of analysis for a realistic data set;
- Apply automated data analysis tools to carry out a data analysis plan;
- Motivate, justify, and qualify conclusions obtained from an analysis.
Topics:
- Data Types
- Data Preparation
- Exploratory Data Analysis
- Decision Trees
- PCA
- Evaluating Classifiers
- KNN
- Ensemble Methods
- Naive Bayes
- Bayes Nets
- Linear Regression
- Logistic Regression
- Artificial Neural Nets
- Deep Learning
- Support Vector Machines
- Clustering
- Association Analysis