# CSE/NB 528 Optional Homework: Learning in Neurons and Networks

This homework is optional – it contains extra credit problems.

[Extra credit points will be added separately to the course total, so you will not be penalized if you skip extra credit problems.]

The homework is due by midnight on the last day of classes (Friday, June 7, 2013).

Extra Credit Submission Procedure: Create a Zip file called "528-extracredit-lastname-firstname" containing the following:

(1) A document with a write-up specifying the extra credit problem you are attempting, with your answers to any questions asked in the problem, as well as any figures, plots, or graphs supporting your answers,
(2) Your Matlab program files,
(3) Any other supporting material needed to understand/run your solutions in Matlab.

Upload your file by 11:59pm on Friday, June 7, 2013.

1.    Unsupervised Learning (20 points): Write Matlab code to implement Oja's Hebb rule (Equation 8.16 in the Dayan & Abbott textbook) for a single linear neuron (as in Equation 8.2) receiving as input the 2D data provided in c10p1.mat, but with the mean of the data subtracted from each data point. Use "load -ascii c10p1.mat" and type "c10p1" to see the 100 (x,y) data points. You may plot them using "scatter(c10p1(:,1),c10p1(:,2))". Compute and subtract the mean (x,y) value from each (x,y) point. Display the points again to verify that the data cloud is now centered around 0. Implement a discrete-time version (like Equation 8.7) of the Oja rule with alpha = 1. Start with a random w vector and update it according to w(t+1) = w(t) + delta*dw/dt, where delta is a small positive constant (e.g., delta = 0.01) and dw/dt is given by the Oja rule (assume tau_w = 1). In each update iteration, feed in a data point u = (x,y) from c10p1. If you've reached the last data point in c10p1, go back to the first one and repeat. Keep updating w until the change in w, given by norm(w(t+1) - w(t)), is negligible (i.e., below an arbitrary small positive threshold), indicating that w has converged.

a.    To illustrate the learning process, print out figures displaying the current weight vector w and the input data scatterplot on the same graph, for different time points during the learning process.

b.    Compute the principal eigenvector (i.e., the one with the largest eigenvalue) of the zero-mean input correlation matrix (this will be of size 2 x 2). Use the Matlab function "eig" to compute its eigenvectors and eigenvalues. Verify that the learned weight vector w is proportional to the principal eigenvector of the input correlation matrix (read Sections 8.2 and 8.3).
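The homework asks for Matlab; as a language-neutral illustration only, here is a minimal sketch of the same update loop in Python/NumPy. The synthetic Gaussian data stands in for the course-provided c10p1.mat, and the epoch count, threshold, and covariance values are arbitrary choices, not part of the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for c10p1.mat: 100 correlated 2D points (assumed covariance).
U = rng.multivariate_normal([3.0, 2.0], [[2.0, 1.2], [1.2, 1.0]], size=100)
U = U - U.mean(axis=0)               # subtract the mean: cloud centered at 0

delta, alpha = 0.01, 1.0             # learning rate and Oja's alpha
w = rng.standard_normal(2)           # random initial weight vector
for epoch in range(1000):
    w_old = w.copy()
    for u in U:                      # cycle through the data points, then repeat
        v = w @ u                    # linear neuron output (Eq. 8.2)
        w = w + delta * (v * u - alpha * v**2 * w)   # Oja's rule (Eq. 8.16), tau_w = 1
    if np.linalg.norm(w - w_old) < 1e-6:   # change over a full sweep is negligible
        break

# Part (b): compare with the principal eigenvector of the zero-mean correlation matrix.
C = U.T @ U / len(U)
evals, evecs = np.linalg.eigh(C)
e1 = evecs[:, np.argmax(evals)]
print(abs(w @ e1))                   # close to 1: w is (approximately) a unit vector along e1
```

With alpha = 1, Oja's rule drives w toward a unit-norm vector proportional to the principal eigenvector, which is what part (b) asks you to verify.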

2.    Supervised Learning (20 points): In class, we discussed neural networks that use either a threshold or sigmoid activation function. Consider networks whose neurons have linear activation functions, i.e., each neuron’s output is given by g(a) = ba+c, where a is the weighted sum of inputs to the neuron, and b and c are two fixed real numbers.

a.     Suppose you have a single neuron with a linear activation function g as above and input x = [x1,…,xn]T and weights W = [W1,…,Wn]T. Write down the squared error function for this input if the true output is y.

b.    Write down the weight update rule for the neuron based on gradient descent on the above error function.

c.     Now consider a network of linear neurons with one hidden layer of m units, n input units, and one output unit. For a given set of weights wkj in the input-hidden layer and Wj in the hidden-output layer, write down the equation for the output unit as a function of wkj, Wj, and input x (you can write your answer in vector-matrix form or using summations). Show that there is a single-layer linear network with no hidden units that computes the same function.

d.    Given your result in (c), what can you conclude about the computational power of N-hidden-layer linear networks for N = 1, 2, 3, …? Explain your answer.
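This is not a substitute for the derivation asked for in (c), but a quick numerical sanity check of the claim can be run in Python/NumPy (the constants b and c, the layer sizes, and the random weights below are all hypothetical values chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3                      # input and hidden layer sizes (arbitrary)
b, c = 2.0, 0.5                  # fixed constants of g(a) = b*a + c (arbitrary)
w = rng.standard_normal((m, n))  # input-to-hidden weights w_kj
W = rng.standard_normal(m)       # hidden-to-output weights W_j

def net(x):
    """One-hidden-layer network of linear neurons g(a) = b*a + c."""
    h = b * (w @ x) + c          # hidden-unit activities
    return b * (W @ h) + c       # output unit

# Equivalent single-layer affine map: y = Weff @ x + ceff
Weff = b * b * (W @ w)
ceff = b * c * W.sum() + c

x = rng.standard_normal(n)
print(np.isclose(net(x), Weff @ x + ceff))   # True
```

The check works because composing two affine maps yields another affine map, which is the structure the derivation in (c) should make explicit.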


3.    Reinforcement Learning (20 points): Implement actor-critic learning (Equations 9.24 and 9.25 in the Dayan & Abbott textbook) in Matlab for the maze of Figure 9.7, with learning rate epsilon = 0.5 for both the actor and the critic, and beta = 1 for the critic. Starting from zero weights for both the actor and critic, plot learning curves as in Figures 9.8 and 9.9. Next, start from a policy in which the agent is biased to go left at both B and C, with initial probability 0.99 (and with 0.5 initial probability at A). How does this affect learning at A?

(Note: You will need to sample from a discrete distribution to get an action for each location. You can write your own code for this or use code available online such as this (with n = 1).)
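The sampling step mentioned in the note is short to write yourself. A minimal sketch in Python/NumPy is below (in Matlab, the same idea is to compare a uniform random number against the cumulative probabilities); the example probabilities match the biased-left policy in the problem:

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_action(p):
    """Draw one index from the discrete distribution p (probabilities summing to 1)."""
    return int(np.searchsorted(np.cumsum(p), rng.random(), side="right"))

# e.g., action probabilities at a maze location with two actions (left, right),
# biased to go left with probability 0.99 as in the problem
p = np.array([0.99, 0.01])
counts = np.bincount([sample_action(p) for _ in range(10000)], minlength=2)
print(counts[0] > counts[1])     # True: "left" is sampled far more often
```

Under the actor, p would be the softmax of the action values at the current location, recomputed after each weight update.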