The last unit was the introduction to Bayes networks. This week covers Machine Learning, which is the problem of learning the structure (and parameters) of those networks in the first place, when they aren't known. There are two main sub-categories of machine learning.
- Supervised Learning
- Unsupervised Learning
- What is learned? parameters, structure, hidden concepts
- What from? target labels, replacement principles, feedback (reinforcement)
- What for? prediction, diagnostics, summarize, etc.
- How? passive (just observations), active (agent changes environment), online, off-line
- Outputs? classification, regression
- Details? generative, discriminative
Occam's Razor. Everything else being equal, choose the less complex hypothesis. Put another way: make things as simple as possible, but not simpler. The goal is to minimize the generalization error, not the error on the training data.
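This point can be seen with a quick experiment: fit polynomials of increasing degree to noisy data and compare training error against error on held-out points. The sine data, noise level, split, and degrees below are all made-up choices for the sketch:

```python
# Overfitting demo: training error keeps falling as the model gets more
# complex, but held-out (generalization) error eventually gets worse.
# The sine signal, noise level, and degrees are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, x.size)

# alternating train/test split
train_x, train_y = x[::2], y[::2]
test_x, test_y = x[1::2], y[1::2]

def errors(degree):
    """Mean squared error on the training and held-out halves."""
    coeffs = np.polyfit(train_x, train_y, degree)
    mse = lambda xs, ys: float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))
    return mse(train_x, train_y), mse(test_x, test_y)

for degree in (1, 3, 9):
    tr, te = errors(degree)
    print(f"degree {degree}: train mse {tr:.3f}, test mse {te:.3f}")
```

Training error is guaranteed not to increase with degree (each polynomial family contains the simpler ones as special cases), while the held-out error tracks how well the fit actually generalizes.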
Unsupervised learning is mainly about density estimation. There are several approaches, such as clustering and dimensionality reduction. Blind signal (or source) separation is another interesting application of unsupervised learning; the example given is separating a recording of two speakers into two separate streams. The lecture also gives some practical tips for choosing k (the number of clusters) in k-means:
- Add some constant penalty per k to the log-likelihood
- Guess initial k
- Run Expectation Maximization
- Remove unnecessary clusters
- Create new random clusters near poorly represented data
- Repeat from EM step
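As a sketch of the penalty idea (the first two steps and the EM step above; the pruning and respawning steps are left out), here is a minimal 1-D EM for a Gaussian mixture, scored by log-likelihood minus an assumed constant penalty per cluster:

```python
# Choosing k by penalized log-likelihood: fit a Gaussian mixture with EM
# for several k and keep the k with the best (log-likelihood - penalty * k).
# 1-D data, quantile initialization, and the penalty constant are all
# simplifying assumptions for this sketch.
import numpy as np

def em_gmm_1d(x, k, iters=60):
    """Fit a k-component 1-D Gaussian mixture; return the data log-likelihood."""
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)   # spread initial means
    var = np.full(k, np.var(x))
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each cluster for each point
        d = x[:, None] - mu[None, :]
        logp = np.log(pi) - 0.5 * np.log(2 * np.pi * var) - 0.5 * d**2 / var
        logp -= logp.max(axis=1, keepdims=True)      # numerical stability
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances
        n = r.sum(axis=0)
        pi = n / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / n + 1e-9
    d = x[:, None] - mu[None, :]
    p = (pi * np.exp(-0.5 * d**2 / var) / np.sqrt(2 * np.pi * var)).sum(axis=1)
    return float(np.log(p).sum())

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(6, 1, 200)])

penalty = 10.0                                       # assumed cost per cluster
scores = {k: em_gmm_1d(x, k) - penalty * k for k in range(1, 5)}
best_k = max(scores, key=scores.get)
print(scores, "-> best k:", best_k)
```

On this bimodal data the score jumps sharply from k=1 to k=2, and the per-cluster penalty is what discourages splitting further; without it, the raw log-likelihood would keep creeping up with every extra cluster.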
I did actually use Python to answer one of the quiz questions (how many unique words are in a "bag of words"), and like last time I did the arithmetic for Bayes' rule in a spreadsheet. I haven't done any Lisp programming yet.
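For the record, the unique-word count is a one-liner with a set (the sentence below is a stand-in, not the actual quiz text):

```python
# Count total and unique words in a toy "bag of words".
# The sentence is a placeholder, not the quiz's actual text.
text = "the quick brown fox jumps over the lazy dog the fox"
words = text.split()
print(len(words), len(set(words)))  # 11 tokens, 8 unique words
```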