Nov 19 - Lesson Plan

1. Introduction to Learning

Exercise: Write down a learning problem.

Whiteboard: Kinds of learning:

What is the task?
- Learning to classify data ==> classification learning
- Learning a probabilistic model of the world ==> Bayesian learning
- Discovering "interesting" properties of data ==> data mining
- Learning to parse a language ==> grammar learning
- Learning to act in the world ==> reinforcement learning
- Learning to "operationalize" knowledge ==> speed-up learning

Not distinct! Much overlap.

How is the thing being learned represented?
- A set of formulas in first-order logic
- A set of formulas in propositional logic
- A decision tree **
- A policy or universal plan
- A Bayesian network **
- A neural network **

Basic concepts:
- Training data
  - Attributes or features -- how is the data described?
  - Complete or incomplete -- are all attributes of all examples known?
  - Labeled vs. unlabeled -- are categories given, or discovered?
- Test data
- Generalization
- Bias I: prior knowledge
- Bias II: Occam's razor, minimum description length
- Error

Students switch papers. Discuss some of the learning problems in terms of
these concepts.

2. Learning Bayesian Networks

Simplest (but important!) case: parameter learning with complete data.

Given:
- Bayes net structure
- Training set - each item has a value for EVERY variable in the network

Maximum likelihood estimate of the parameters (the values in the CPTs)
== the values of the CPTs that maximize the likelihood of the observations
== the actual statistics of the training data!

3. Assignment: Spam Filtering by Machine Learning

4. Beyond MLE:
(a) Combining data with prior beliefs about the parameters
(b) Parameter learning where the data is incomplete (does not include all
    variables in every example)
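The point in section 2 that the MLE of a CPT entry is just the observed statistics of the training data can be sketched in a few lines. The two-variable network (Cloudy -> Rain), the variable names, and the six examples below are invented for illustration; the counting itself is the standard MLE computation for complete data.

```python
from collections import Counter

# Hypothetical complete training data for the network Cloudy -> Rain:
# every example assigns a value to EVERY variable.
data = [
    {"cloudy": True,  "rain": True},
    {"cloudy": True,  "rain": False},
    {"cloudy": True,  "rain": True},
    {"cloudy": False, "rain": False},
    {"cloudy": False, "rain": False},
    {"cloudy": False, "rain": True},
]

def mle_cpt(data, child, parent):
    """MLE of P(child=True | parent): just count and divide."""
    joint = Counter((ex[parent], ex[child]) for ex in data)
    parent_counts = Counter(ex[parent] for ex in data)
    return {pv: joint[(pv, True)] / parent_counts[pv] for pv in parent_counts}

cpt = mle_cpt(data, "rain", "cloudy")
# P(rain | cloudy=True) = 2/3, P(rain | cloudy=False) = 1/3
```

No optimization is needed: maximizing the likelihood over the CPT entries decomposes into independent per-family problems, each solved by the empirical frequency.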
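A minimal sketch of the spam-filtering assignment in section 3, using a naive Bayes classifier over word counts. The tiny corpus, the 0.5 class priors, and the add-one smoothing constant are all assumptions made for illustration, not part of the assignment's specification.

```python
import math
from collections import Counter

# Toy corpus: invented word lists standing in for tokenized messages.
spam_docs = [["win", "money", "now"], ["free", "money"]]
ham_docs = [["meeting", "tomorrow"], ["project", "report", "tomorrow"]]

def train(docs):
    counts = Counter(w for d in docs for w in d)
    return counts, sum(counts.values())

spam_counts, spam_total = train(spam_docs)
ham_counts, ham_total = train(ham_docs)
vocab = set(spam_counts) | set(ham_counts)

def log_score(words, counts, total, prior):
    # Add-one (Laplace) smoothing keeps unseen words from zeroing the product;
    # log space avoids underflow on long messages.
    score = math.log(prior)
    for w in words:
        score += math.log((counts[w] + 1) / (total + len(vocab)))
    return score

def classify(words):
    s = log_score(words, spam_counts, spam_total, 0.5)
    h = log_score(words, ham_counts, ham_total, 0.5)
    return "spam" if s > h else "ham"
```

For example, `classify(["free", "money"])` scores higher under the spam counts, while `classify(["meeting", "tomorrow"])` scores higher under ham.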
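Point 4(a), combining data with prior beliefs about the parameters, can be previewed with the simplest case: a single binary parameter under a Beta prior, where the prior acts as pseudocounts added to the observed counts. The Beta(2, 2) prior below is an arbitrary choice for illustration.

```python
def posterior_mean(heads, tails, a=2, b=2):
    """Posterior mean of P(heads) under a Beta(a, b) prior:
    the prior contributes a and b as pseudocounts."""
    return (heads + a) / (heads + tails + a + b)

# With no data the estimate is the prior mean a/(a+b) = 0.5;
# as counts grow it approaches the MLE heads/(heads+tails).
```

The same idea generalizes to CPTs via Dirichlet priors: each CPT row gets pseudocounts added before normalizing, which also keeps unseen outcomes from being assigned probability zero.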