PRE-REQUISITE:
Note: Graduate students may be able to use this course to satisfy the
semi-core of Scientific and Statistical Computing.
Course structure
In this course we will focus on the following five topics related to
data mining and data warehousing:
This course is unique in several respects. First, it covers topics
that attempt to bridge statistics and information theory through the
concept of patterns. This provides a framework for applying a
statistical approach to EDA (Exploratory Data Analysis) for data
mining, with information theory used to interpret the meaning of what
EDA discovers. Second, the instructor will share his
multi-disciplinary collaboration experience in a number of related fields
such as statistics and probability, information theory, databases, and
computational intelligence. Third, the instructor will emphasize the
importance of implementation to demonstrate the practicality of the
approach discussed in this course.
Various tools have been implemented and will be made available for
this course. Commercial tools that may be used in this course include
the Insightful I-Miner data mining tool, S-PLUS, and Mathcad. Other tools
developed from our previous research that may also be used in this
course include an Oracle-based integrated environment for data warehousing
and data mining, an ActiveX and/or Java Data Constructor utility,
patent-pending ActiveX and Java software for model discovery and
probabilistic inference, ActiveX Bayesian network software, an S-PLUS
script for discovering significant event association patterns, and a
Mathcad application for change point detection.
Textbook and web resources
Department permission for students registering for CSCI 780;
equivalent background of MATH 241 and CSCI 086 for students
registering for CSCI 381.3