Neural Network Structures

Lateral Inhibition

 

Adapted By: Yash Gad

Written By: Thomas Anastasio

Grade Level: 9 - 12

Subjects: Applied Mathematics – Matrix Algebra; Biology – Neuroscience

Description: A look at one of the most fundamental and commonly seen neural computations involving groups of neurons in a network.

Objectives: Students will be able to analyze the outputs of sets of laterally inhibited neurons in different situations, and represent these using simple functions.

Material: (Optional) Computers with mathematics software, such as Mathematica or MATLAB

 


Background

Lateral inhibition is one of the best-known forms of computation by neural networks. Laterally inhibitory architectures are characterized by inhibition off to the sides of some form of straight-ahead excitation.

Lateral inhibition finds many uses throughout the brain. In this unit, we will study computational models of lateral inhibitory networks that process signals both spatially and temporally.

Spatial processing is a result of the interactions between neighboring input neurons, which will have overlapping fields of activity.

Temporal processing is a result of these interactions occurring over time, with the activity from the output layer being fed back to the input layer through recurrent connections.


Scenario I – Basic Lateral Inhibition

The goal will be to construct a laterally inhibitory network, and assign vectors for the input and output units, and a matrix to declare the connection weights between them.

From this point onwards, the networks will be too large to draw as diagrams. We will instead rely purely on the matrix representations of the units and their connections.

To begin, create 11 input neurons and 11 output neurons. This will be a row of 11 elements for each. For the matrix describing lateral inhibition, make each input unit send a strong excitatory projection to its corresponding output unit (right “below”), and inhibitory projections to the output units on either side. Since we don’t want there to be an imbalance on the edges, we will say that output neurons 1 and 11 are neighbors (we will see how to express this shortly).

When there was only one projection from an input neuron to its output neuron, we could describe the weights with a simple row of terms. For this and future assignments, we will instead dedicate one row of weights to each input neuron. This row will have a number of elements equal to the number of output neurons (which allows us to use more or fewer outputs than inputs).

For this scenario, the weights for Input neuron 1 are given by:

Weight1 = [2  –1   0   0   0   0   0   0   0   0  –1]

Likewise, the weights for Input neuron 2 are given by:

Weight2 = [–1  2  –1   0   0   0   0   0   0   0   0]

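The rest of the connection matrix can be written accordingly, with each row shifted one position to the right of the row above it and wrapping around at the edges, so that we have:

Weights = [ 2  –1   0   0   0   0   0   0   0   0  –1
           –1   2  –1   0   0   0   0   0   0   0   0
            0  –1   2  –1   0   0   0   0   0   0   0
            0   0  –1   2  –1   0   0   0   0   0   0
            0   0   0  –1   2  –1   0   0   0   0   0
            0   0   0   0  –1   2  –1   0   0   0   0
            0   0   0   0   0  –1   2  –1   0   0   0
            0   0   0   0   0   0  –1   2  –1   0   0
            0   0   0   0   0   0   0  –1   2  –1   0
            0   0   0   0   0   0   0   0  –1   2  –1
           –1   0   0   0   0   0   0   0   0  –1   2]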
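If you are working on a computer, the connection matrix does not have to be typed in by hand. The following is a minimal sketch in plain Python (the function name and its parameters are our own choices for illustration, not part of the original unit). It fills in one row per input neuron and uses the remainder operator to make neurons 1 and 11 neighbors.

```python
def lateral_inhibition_matrix(n=11, center=2, flank=-1):
    """Row i holds the weights from input unit i to every output unit."""
    weights = [[0] * n for _ in range(n)]
    for i in range(n):
        weights[i][i] = center            # strong excitation straight ahead
        weights[i][(i - 1) % n] = flank   # inhibition to the left neighbor
        weights[i][(i + 1) % n] = flank   # inhibition to the right neighbor
    return weights


W = lateral_inhibition_matrix()
for row in W:
    print(row)
```

The first two rows printed should match Weight1 and Weight2 above. The `% n` wrap-around is what places a –1 at the far end of the first and last rows.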

Now that we have the basic architecture in place, it is time to actually have it do something. We will feed this network a step function (which is exactly what it sounds like: a function that has some fixed value for a specific range, and is zero elsewhere). For this example, we will introduce a value of 3 to units 4 to 8, and a value of zero to all others. When we do the computation, this is the result we see:

The blue curve (the top curve if you are viewing this in black and white) is the original step that we presented. The slopes on the sides are simply artifacts of the way MATLAB plots graphs; this is still a step function. The green curve (the bottom curve) represents a plot of the activities of the output neurons. For this particular example, we have not set the threshold we described in the first unit. When we look at the output without any kind of filter, we can clearly see the kind of computation being performed spatially.

This particular network configuration has taken the step function and computed its (discrete) second derivative. The output graph is this second derivative, only reflected across the x-axis (that is, multiplied by –1). In a more general sense, the output neurons have become sensitive to changes in the input signal.
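The computation can be checked numerically. The sketch below (plain Python, with helper names that are our own) applies the step input to the Scenario I weights and compares the result with the discrete second difference of the step, using the same wrap-around at the edges.

```python
def lateral_inhibition_matrix(n=11, center=2, flank=-1):
    """One row of weights per input unit, wrapping around at the edges."""
    weights = [[0] * n for _ in range(n)]
    for i in range(n):
        weights[i][i] = center
        weights[i][(i - 1) % n] = flank
        weights[i][(i + 1) % n] = flank
    return weights


def network_output(inputs, weights):
    """Output j is the sum over all inputs i of inputs[i] * weights[i][j]."""
    n_out = len(weights[0])
    return [sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
            for j in range(n_out)]


n = 11
# A value of 3 on units 4 to 8 (counting from 1), zero elsewhere.
step = [3 if 4 <= unit <= 8 else 0 for unit in range(1, n + 1)]
output = network_output(step, lateral_inhibition_matrix(n))

# Discrete second difference of the step, with circular neighbors.
second_diff = [step[(j - 1) % n] - 2 * step[j] + step[(j + 1) % n]
               for j in range(n)]

print(output)       # [0, 0, -3, 3, 0, 0, 0, 3, -3, 0, 0]
print(second_diff)  # [0, 0, 3, -3, 0, 0, 0, -3, 3, 0, 0]
```

The output is nonzero only at the two edges of the step, and it is exactly the second difference with its sign reversed.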

A real world example to relate this to would be a visual scene. If the step represented the presence of the visual scene, the output would be sensitive to the “edges” of the scene.

 

 

 

 

 

 

Scenario II – Difference of Gaussian Weight profiles

The Difference of Gaussian (DOG) profile is a common connectivity profile in neural network models. It is easily constructed as simply the difference of 2 Gaussian curves, with different variances. A short tutorial on Gaussian curves can be found at:

http://mathworld.wolfram.com/NormalDistribution.html

The first Gaussian we will use, g, with a variance (var) of 0.75 can be constructed using the following formula:

 

Another Gaussian, d, with a variance of 1.5, can be made the same way. Both of these will be discrete Gaussians if we evaluate them only at integers (a nice range would be something like –5 to 5). A DOG p can be constructed using:

 

p = g – (0.5 * d)
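As a rough sketch of this construction in plain Python: the Gaussian used here is an assumed, unnormalized form, exp(–x²/(2·var)), which peaks at 1. If your formula for g includes a normalizing factor, the values will be scaled differently, but the overall shape of the profile (a positive center with negative flanks) should be similar.

```python
import math


def gaussian(x, var):
    # Assumed form: unnormalized Gaussian with a peak value of 1 at x = 0.
    return math.exp(-x * x / (2.0 * var))


xs = list(range(-5, 6))                       # the integers -5 to 5
g = [gaussian(x, 0.75) for x in xs]           # narrow Gaussian
d = [gaussian(x, 1.5) for x in xs]            # broad Gaussian
p = [gi - 0.5 * di for gi, di in zip(g, d)]   # p = g - (0.5 * d)

for x, value in zip(xs, p):
    print(x, round(value, 4))
```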

 

Note that to make the DOG profile, the broader Gaussian (the one with the larger variance) is scaled and subtracted from the narrower Gaussian (the one with the smaller variance). To incorporate this profile into the weight matrix, we first need to swap the first and second halves of the profile, so that its central peak moves to the first position and each input lines up with its corresponding output. The result looks something like this:
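One way to turn the profile into a weight matrix is sketched below in plain Python. The function names and the unnormalized Gaussian form are assumptions made for illustration. The two halves of the profile are swapped to form the weights for input neuron 1, and each later row is the same list shifted one more place to the right, wrapping around at the edges just as in Scenario I.

```python
import math


def dog_profile(half_width, var_narrow, var_broad, scale):
    """DOG sampled at the integers -half_width..half_width (assumed Gaussian form)."""
    xs = range(-half_width, half_width + 1)
    return [math.exp(-x * x / (2.0 * var_narrow))
            - scale * math.exp(-x * x / (2.0 * var_broad)) for x in xs]


def dog_weight_matrix(profile):
    """One row of weights per input unit, peaked at the matching output unit."""
    n = len(profile)
    mid = n // 2
    # Swap the two halves so the central peak moves to position 0.
    first_row = profile[mid:] + profile[:mid]
    # Row i is the first row shifted right by i places, wrapping at the edges.
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


profile = dog_profile(5, 0.75, 1.5, 0.5)
weights = dog_weight_matrix(profile)
print([round(w, 3) for w in weights[0]])
```

With this layout the largest weight of every row sits on the diagonal, so each input excites the output directly below it most strongly.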

 

Using the step input you generated before, you can now look at the output of this network with a DOG weight profile.

 

The output appears as a smoothed version of the flipped second derivative that was observed in Scenario I. This makes sense, since the connectivity profile used in this case is a smoothed version of the original profile. We will see an even more pronounced effect when we use Gaussian pairs with larger variances (such as 1.5 and 3.0, or 3.0 and 6.0).
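To produce this output yourself and compare it with the Scenario I result, something like the following plain Python sketch could be used. It assumes an odd number of units and an unnormalized Gaussian form, so the exact numbers may differ from the ones plotted in this unit. Changing the two variances lets you try the broader Gaussian pairs mentioned above.

```python
import math


def dog_weight_matrix(n, var_narrow, var_broad, scale):
    """Circular DOG weight matrix for n units (n odd, assumed Gaussian form)."""
    half = n // 2
    profile = [math.exp(-x * x / (2.0 * var_narrow))
               - scale * math.exp(-x * x / (2.0 * var_broad))
               for x in range(-half, half + 1)]
    first_row = profile[half:] + profile[:half]   # swap the two halves
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


n = 11
weights = dog_weight_matrix(n, 0.75, 1.5, 0.5)

# The same step as in Scenario I: 3 on units 4 to 8, zero elsewhere.
step = [3 if 4 <= unit <= 8 else 0 for unit in range(1, n + 1)]
output = [sum(step[i] * weights[i][j] for i in range(n)) for j in range(n)]

for unit, value in enumerate(output, start=1):
    print(unit, round(value, 3))
```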

In an example that you may want to try on your own, or use as an advanced problem, we built a larger version of the above scenario (using 51 outputs and 51 inputs). When we presented our 3 DOG weight distributions with a spiky input (the top graph in blue), the result was the output seen in the 3 bottom graphs. Effectively what happens is that broader DOG distributions are less able to follow rapid changes in the input pattern.

 

 

 

 

 

 

 

 

 

 

 

 

Scenario III

The lateral inhibition networks we worked with before are quite powerful spatial processors, but to get temporal processing we need recurrent (i.e. feedback) connections. In general, recurrent connections can occur from a layer to a previous layer, or between units in the same layer, or both.

Recurrent lateral inhibitory networks typically employ recurrent connections between the units in the output layer, as well as the “feed-forward” type connections from the input to the output layer that we previously worked with.

As with feed-forward connection weights, recurrent connection weights are represented in a matrix. The states of the output units in such a recurrent network will then be a function of both the feed-forward and the recurrent connections. It is necessary to represent (discrete) time steps in recurrent networks, because the states of the output units at time step (n+1) are functions of the states of the input and output units at time step n. This can be expressed by:

(1)

where u and x are the input and output unit state vectors, and V and W are the feed-forward and recurrent connection weight matrices. y(x) is a nonlinear function meant to represent the real limits on neural firing-rate. It ensures that the elements of x are greater than zero but less than some saturation level a

(2).

When the recurrent connection profile (making up matrix W) is a DOG having both positive and negative values, the recurrent connections form both positive and negative feedback loops. The result of this for network dynamics is that some units can be driven to the saturation limit a while other units are driven to zero. The overall strength of the feedback can be controlled by the value of parameter b. The network is said to relax into an activity pattern that is specific for a given, constant input. Because time in the network is discrete, the state equation describing relaxation in the network (1) can be solved iteratively.
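Solving the state equation iteratively can be sketched as follows in plain Python. The form used here is an assumption made for illustration and may differ from equation (1): x(n+1) = y(V u + b W x(n)), where y clips each element to the range from 0 to a and the outputs start at zero. The three-unit network at the end is a hypothetical toy example, not one from this unit.

```python
def squash(value, a):
    """Assumed form of y: limit a value to the range 0..a."""
    return max(0.0, min(a, value))


def project(states, weights):
    """Element j is the sum over units i of states[i] * weights[i][j]."""
    n_out = len(weights[0])
    return [sum(states[i] * weights[i][j] for i in range(len(states)))
            for j in range(n_out)]


def relax(u, V, W, a, b, steps):
    """Iterate the assumed state equation and return the output at every step."""
    drive = project(u, V)              # feed-forward input, constant over time
    x = [0.0] * len(drive)             # assumed initial state: all outputs zero
    history = []
    for _ in range(steps):
        feedback = project(x, W)
        x = [squash(drive[j] + b * feedback[j], a) for j in range(len(drive))]
        history.append(x)
    return history


# Hypothetical toy network: each unit excites itself and inhibits the others.
V = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
W = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
history = relax([1, 2, 1], V, W, a=10, b=1.0, steps=10)
print(history[0])    # [1, 2, 1]
print(history[-1])   # [0.0, 10, 0.0]
```

In this toy case the unit with the largest input is driven up to the saturation level while the other two are driven to zero.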

Now for an example. Construct a recurrent, lateral inhibitory network. To make things simple, set the feed-forward weight matrix to be the identity matrix, and make the recurrent weight profiles a DOG. Use variances of 3.0 and 15.0 for the Gaussians, and scale the broader curve by 0.3. Remember to swap the two halves of the result, as in Scenario II. Make an input vector which is the positive half-cycle of a sinusoid. Set the saturation level a to 10, and compute the response to this input as the recurrent network relaxes for 20 iterations.

Set rate parameter b to 0.1 for your first experiment, and 1.0 for the second.
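A possible setup for both experiments is sketched below in plain Python for 51 units. Several details are assumptions rather than part of the exercise: the state equation is taken to be x(n+1) = y(u + b W x(n)) with y clipping to the range 0 to a, the Gaussians are unnormalized, the sinusoid has an amplitude of 1, and the outputs start at zero. Your own results may differ in detail if you make different choices.

```python
import math


def dog_weight_matrix(n, var_narrow, var_broad, scale):
    """Circular DOG weight matrix for n units (n odd, assumed Gaussian form)."""
    half = n // 2
    profile = [math.exp(-x * x / (2.0 * var_narrow))
               - scale * math.exp(-x * x / (2.0 * var_broad))
               for x in range(-half, half + 1)]
    first_row = profile[half:] + profile[:half]   # swap the two halves
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def relax(u, W, a, b, steps):
    """Iterate the assumed state equation with V equal to the identity matrix."""
    n = len(u)
    x = [0.0] * n                      # assumed initial state
    history = []
    for _ in range(steps):
        feedback = [sum(x[i] * W[i][j] for i in range(n)) for j in range(n)]
        x = [max(0.0, min(a, u[j] + b * feedback[j])) for j in range(n)]
        history.append(x)
    return history


n = 51
W = dog_weight_matrix(n, 3.0, 15.0, 0.3)
# Positive half-cycle of a sinusoid across the 51 input units.
u = [math.sin(math.pi * i / (n - 1)) for i in range(n)]
a = 10.0

results = {b: relax(u, W, a, b, steps=20) for b in (0.1, 1.0)}
for b, history in results.items():
    print("b =", b)
    print([round(v, 2) for v in history[-1]])
```

Each entry of `results` holds the output vector at every time step, so the individual steps can be plotted on the same axes to watch the network relax.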

The result for an example we did for 51 input and 51 output units is:

The plot on the left shows the first experiment, and the plot on the right shows the second. The blue curve is the initial output (at the first time step), and the other curves represent subsequent time steps. You will probably notice several things:

 

1) The curves seem to converge to a steady level of activity across the network over time

2) The first experiment didn’t really have much of an effect

3) In the second experiment, something really strange happened :)

 

In the second experiment, those output units that received inputs near the peak of the sinusoid were driven to saturation over time. The rest were driven to zero. This is what is called a winner-take-all network. This type of pattern is frequently seen in the brain when only a special subset of the output neurons should respond to a given input.