CSE/NB 528 Optional Homework: Learning in Neurons and Networks

This homework is optional – it contains extra credit problems.

[Extra credit points will be added separately to the course total, so you will not be penalized if you skip extra credit problems.]

Please turn in your solutions to these extra credit problems by 11:59pm on the last day of classes (Friday, June 7, 2013).

Extra Credit Submission Procedure:

Create a Zip file called "528-extracredit-lastname-firstname" containing the following:

(1) A document with a write-up specifying the extra credit problem you are attempting, your answers to any questions asked in the problem, and any figures, plots, or graphs supporting your answers,

(2) Your Matlab program files,

(3) Any other supporting material needed to understand/run your solutions in Matlab.

Upload your Zip file to this dropbox.

Upload your file by 11:59pm Friday, June 7, 2013.

1. Unsupervised Learning (20 points): Write Matlab code to implement Oja’s Hebb rule (Equation 8.16 in the Dayan & Abbott textbook) for a single linear neuron (as in Equation 8.2) receiving as input the 2D data provided in c10p1.mat, but with the mean of the data subtracted from each data point. Use “load -ASCII c10p1.mat” and type “c10p1” to see the 100 (x,y) data points. You may plot them using “scatter(c10p1(:,1),c10p1(:,2))”. Compute and subtract the mean (x,y) value from each (x,y) point. Display the points again to verify that the data cloud is now centered around 0. Implement a discrete-time version (like Equation 8.7) of the Oja rule with alpha = 1. Start with a random w vector and update it according to w(t+1) = w(t) + delta*dw/dt, where delta is a small positive constant (e.g., delta = 0.01) and dw/dt is given by the Oja rule (assume tau_w = 1). In each update iteration, feed in a data point u = (x,y) from c10p1. If you’ve reached the last data point in c10p1, go back to the first one and repeat. Keep updating w until the change in w, given by norm(w(t+1) - w(t)), is negligible (i.e., below an arbitrary small positive threshold), indicating that w has converged.

a. To illustrate the learning process, print out figures displaying the current weight vector w and the input data scatterplot on the same graph, for different time points during the learning process.

b. Compute the principal eigenvector (i.e., the one with the largest eigenvalue) of the zero-mean input correlation matrix (this will be of size 2 x 2). Use the Matlab function “eig” to compute its eigenvectors and eigenvalues. Verify that the learned weight vector w is proportional to the principal eigenvector of the input correlation matrix (read Sections 8.2 and 8.3).
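The following is a minimal sketch of how the pieces of Problem 1 could fit together, not a reference solution. It assumes synthetic stand-in data in place of c10p1.mat, and the names U, tol, and maxEpochs are hypothetical. One deliberate departure from the text: it compares w across a full pass through the data rather than across a single update, because single-sample changes can fluctuate even when w has settled.

```matlab
% Sketch of the discrete-time Oja rule on synthetic stand-in data.
% Assumption: replace U with the points loaded from c10p1.mat.
rng(0);                                      % fixed seed for repeatability
U = randn(100, 2) * [1 0.5; 0 0.3];          % hypothetical correlated 2D data
U = U - repmat(mean(U, 1), size(U, 1), 1);   % subtract the mean from each point

alpha     = 1;       % normalization constant in the Oja rule
delta     = 0.01;    % small positive step size
tol       = 1e-6;    % convergence threshold (arbitrary choice)
maxEpochs = 1000;    % safety cap (assumption, not part of the problem)

N = size(U, 1);
w = randn(2, 1);
w = w / norm(w);     % random direction, unit length

for epoch = 1:maxEpochs
    wPrev = w;
    for i = 1:N                                   % one pass through the data
        u = U(i, :)';                             % current input (column vector)
        v = w' * u;                               % linear neuron output
        w = w + delta * (v * u - alpha * v^2 * w);  % Oja update with tau_w = 1
    end
    if norm(w - wPrev) < tol                      % change over a full pass
        break;
    end
end

% Compare the learned w with the principal eigenvector of Q.
Q = (U' * U) / N;                 % 2 x 2 zero-mean input correlation matrix
[V, D] = eig(Q);
[~, k] = max(diag(D));            % index of the largest eigenvalue
e1 = V(:, k);
fprintf('|cos(angle)| between w and e1: %.4f\n', abs(w' * e1) / norm(w));
fprintf('norm(w) = %.4f\n', norm(w));
```

If w is proportional to the principal eigenvector, the first printed value should be close to 1. The sign of w relative to e1 is arbitrary, which is why the absolute value is taken.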

2. Supervised Learning (20 points): In class, we discussed neural networks that use either a threshold or sigmoid activation function. Consider networks whose neurons have linear activation functions, i.e., each neuron’s output is given by g(a) = ba+c, where a is the weighted sum of inputs to the neuron, and b and c are two fixed real numbers.

a. Suppose you have a single neuron with a linear activation function g as above and input x = [x₁,…,x_n]^T and weights W = [W₁,…,W_n]^T. Write down the squared error function for this input if the true output is y.

b. Write down the weight update rule for the neuron based on gradient descent on the above error function.

c. Now consider a network of linear neurons with one hidden layer of m units, n input units, and one output unit. For a given set of weights w_kj in the input-hidden layer and W_j in the hidden-output layer, write down the equation for the output unit as a function of w_kj, W_j, and input x (you can write your answer in vector-matrix form or using summations). Show that there is a single-layer linear network with no hidden units that computes the same function.

d. Given your result in (c), what can you conclude about the computational power of linear networks with N hidden layers, for N = 1, 2, 3, …? Explain your answer.
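One way to sanity-check a hand-derived gradient for parts (a) and (b) is to compare it against a finite-difference estimate. The sketch below assumes a squared error with the conventional factor of 1/2, and all numerical values (b, c, x, W, y, h) are hypothetical choices for illustration only.

```matlab
% Finite-difference estimate of dE/dW for a single linear neuron.
b = 2;  c = -1;                 % hypothetical activation constants
x = [0.5; -1.2; 0.3];           % hypothetical input vector
W = [0.1; 0.4; -0.7];           % hypothetical weight vector
y = 1.0;                        % hypothetical true output

g = @(a) b * a + c;                    % linear activation g(a) = ba + c
E = @(Wv) 0.5 * (y - g(Wv' * x))^2;    % squared error (1/2 factor is a convention)

h = 1e-6;                       % finite-difference step
gradNum = zeros(size(W));
for i = 1:numel(W)
    e = zeros(size(W));
    e(i) = 1;                   % unit vector along weight i
    gradNum(i) = (E(W + h * e) - E(W - h * e)) / (2 * h);   % central difference
end
disp(gradNum');
% Compare gradNum with your analytic gradient evaluated at the same W.
```

If the analytic gradient is correct, the two should agree to several decimal places. A gradient descent step then moves W a small distance in the direction opposite to this gradient.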

3. Reinforcement Learning (20 points)

Implement actor-critic learning (equations 9.24 and 9.25 in the Dayan & Abbott textbook) in Matlab for the maze of figure 9.7, with learning rate epsilon = 0.5 for both actor and critic, and beta = 1 for the critic. Starting from zero weights for both the actor and critic, plot learning curves as in figures 9.8 and 9.9. Next, start from a policy in which the agent is biased to go left at both B and C, with initial probability 0.99 (and with 0.5 initial probability at A). How does this affect learning at A?

(Note: You will need to sample from a discrete distribution to get an action for each location. You can write your own code for this or use code available online such as this (with n = 1).)
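If you write your own sampler, one common approach is the inverse-CDF method, sketched below. The probability vector p and the action coding (1 = left, 2 = right) are hypothetical; the loop only checks that the empirical frequencies roughly match p.

```matlab
% Draw action indices from a discrete distribution p (inverse-CDF method).
p = [0.99 0.01];                 % hypothetical P(left), P(right) at one location
cdf = cumsum(p);
cdf = cdf / cdf(end);            % normalize so the last entry is exactly 1

counts = zeros(1, numel(p));
for k = 1:10000
    a = find(rand <= cdf, 1);    % first bin whose cumulative mass covers rand
    counts(a) = counts(a) + 1;
end
disp(counts / sum(counts));      % empirical frequencies, should be close to p
```

Within the learning loop, a single call of the form a = find(rand <= cdf, 1) per location is enough to pick the next action.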