CSE 590d Winter 2004

Computer-Based Learning Environments

Faculty coordinator: Steven Tanimoto

Meeting location: Sieg 322

Meeting time: Tues. 1:30-2:20


This quarter we will focus on mining event logs from educational applications for student modelling, particularly diagnostic assessment.  We'll be looking at papers from both the data mining literature and the educational assessment literature.

Organizational meeting
Please find 2-4 paper suggestions related to the topic for the quarter.  Read the abstracts and skim the papers to confirm their applicability and quality, and email the title, author, a URL, and a short (< 1 paragraph) synopsis of each to cse590d@cs by 1/11 (to give me time to put them all on the website).  We'll base the rest of the quarter's readings on these paper suggestions, so the quality of your seminar experience is in your own hands.  All the suggested readings, including those we don't officially cover, will be listed below.

A Methodology for Evaluating Predictions of Transfer and an Empirical Application to Data from a Web-Based Intelligent Tutoring System: How to Improve Knowledge Tracing in Dialog Based Tutors
Neil T. Heffernan, Ethan A. Croteau
Lincoln Ritter
A Data Clustering Algorithm for Mining Patterns from Event Logs
Risto Vaarandi
Adam Carlson
Leverage Points for Improving Educational Assessment
Robert J. Mislevy, Linda S. Steinberg and Russell G. Almond
Daryl Lawton
Martin, J. & VanLehn, K. (1995). Student assessment using Bayesian nets. International Journal of Human-Computer Studies, 42, pp. 575-591.
Daryl Lawton
INFACT overview

VanLehn, K. & Niu, Z. (2001). Bayesian student modeling, user interfaces and feedback: A sensitivity analysis. International Journal of Artificial Intelligence in Education, 12(2) 154-184.  For background, visit the Andes project page.
Daryl Lawton
Won Ng will talk about her work training HMMs to identify patterns in PixelMath logs.
For a brief introduction to Hidden Markov Models, take a look at:
Rabiner, L. R. (1989). "A tutorial on Hidden Markov Models and selected applications in speech recognition." Proceedings of the IEEE, 77(2): 257-286.
Sections I and II (pp. 257-262) give some concrete examples of HMMs, although the important aspects are the concepts described and not so much the actual terminology or equations.
Won Ng
Bill Winn will talk about the tools he has developed for analyzing sketch data, followed by a brainstorming session on mining this data.
Bill Winn
Final wrapup - be prepared to discuss your idea for a student assessment data mining research project

Paper suggestions:
Using Knowledge Levels with AHA! for Discovering Interesting Relationships
Cristobal Romero, Paul De Bra, Sebastian Ventura, Carlos de Castro

Detecting Emerging Concepts in Textual Data Mining
William M. Pottenger, Ph.D. and David R. Gevry

Mining Complex Models from Arbitrarily Large Databases in Constant Time
Geoff Hulten, Pedro Domingos
Decision trees for large data streams (like web logs).

A General Method for Scaling Up Machine Learning Algorithms and its Application to Clustering
Pedro Domingos, Geoff Hulten
Clustering large-scale datasets.

Catching Up with the Data: Research Issues in Mining Data Streams
Pedro Domingos, Geoff Hulten

A Data Clustering Algorithm for Mining Patterns from Event Logs
Risto Vaarandi
Another clustering algorithm for event logs.

Mining Concept-Drifting Data Streams Using Ensemble Classifiers
Haixun Wang, Wei Fan, Philip S. Yu, Jiawei Han
Data mining for concepts that vary

Model-Based Clustering and Visualization of Navigation Patterns on a Web Site
Igor Cadez, David Heckerman, Christopher Meek, Padhraic Smyth, Steven White
Training and clustering HMMs to represent different types of visitors to a web site.  Could be adapted to clustering educational event logs?

Predicting Student Performance: An Application of Data Mining Methods with the Educational Web-Based System LON-CAPA
Behrouz Minaei-Bidgoli, Deborah A. Kashy, Gerd Kortemeyer, William F. Punch
Using genetic algorithms to optimize the performance of classifiers.

Data Mining at Colorado
http://www.ctlt.org/projects/data_mining/ and http://wwwctlt.org/documents/datamining/course_info/data_mining_summary.pdf
Interesting activity, but not a paper.  It's consistent that there would be so much mining going on in Colorado.

Extracting Experience through Protocol Analysis
PC Matthews, S Ahmed, M Aurisicchio
Lots of interesting hits on "data mining" "protocol analysis". This looks interesting as a type of paper.

Bayesian Modeling for Adaptive Hypermedia Systems
Nicola Henze and Wolfgang Nejdl
Not data mining, but seems to deal with estimating student state relative to a highly developed internal tutorial structure...

There are several papers from the Center for Research on Evaluation, Standards, and Student Testing (CRESST) at the University of California, Los Angeles, that we may also be interested in:

Leverage Points for Improving Educational Assessment
Robert J. Mislevy, Linda S. Steinberg and Russell G. Almond

A Four-Process Architecture for Assessment Delivery, With Connections to Assessment Design
Russell G. Almond, Linda Steinberg and Robert J. Mislevy

Introduction to the Biomass Project: An Illustration of Evidence-Centered Assessment Design and Delivery Capability
Linda S. Steinberg, Robert J. Mislevy, Russell G. Almond, Andrew B. Baird, Cara Cahallan, Louis V. DiBello, Deniz Senturk, Duanli Yan, Howard Chernick, and Ann C. H. Kindfield

Argument Substance and Argument Structure in Educational Assessment
Robert J. Mislevy

On the Structure of Educational Assessments
Robert J. Mislevy

Graphical Models and Computerized Adaptive Testing

Bayes Nets in Educational Assessment: Where do the Numbers come from?

Modeling Conditional Probabilities in Complex Educational Assessments
Robert J. Mislevy, Russell Almond, Lou DiBello, Frank Jenkins, Linda Steinberg, Duanli Yan, and Deniz Senturk

Design and Analysis in Task-Based Language Assessment
Robert J. Mislevy, Linda S. Steinberg and Russell G. Almond

Making Sense of Data from Complex Assessments

A Sample Assessment Using the Four Process Framework

Data Mining and Its Applications in Higher Education
Jing Luan

Data Preparation for Data Mining
Shichao Zhang, Chengqi Zhang and Qiang Yang

Web-log Mining for Quantitative Temporal-Event Prediction
Qiang Yang, Hui Wang and Wei Zhang

H-Mine: Hyper-Structure Mining of Frequent Patterns in Large Databases
Jian Pei, Jiawei Han, Hongjun Lu, Shojiro Nishio, Shiwei Tang and Dongqing Yang

Process Mining: Discovering Workflow Models from Event-Based Data
A.J.M.M. Weijters and W.M.P. van der Aalst