Multi-Layered Neural Network

view applet and source

So after a fierce battle with my own neurons, I am ready to release part II of my Processing series: “Neural Network! Huah! What is it good for? (Sing it again, now.)”

This example implements a multi-layered neural network that learns via “backpropagation.” It’s specifically trained to solve XOR. In other words, there are two inputs and the desired result is input1 XOR input2.

0,1 –> 1
1,0 –> 1
0,0 –> 0
1,1 –> 0
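As a quick illustration of why a hidden layer is enough for XOR, here is a minimal sketch of a forward pass only (no learning). It assumes a 2-2-1 layout with sigmoid activations, which may not match the network in the applet, and the weights are hand-picked for illustration rather than anything the network learned. The class and method names are hypothetical.

```java
// Forward pass through a hypothetical 2-2-1 network with hand-picked weights.
// These weights are an assumption for illustration, not learned values.
class XorByHand {
  static double sigmoid(double x) {
    return 1.0 / (1.0 + Math.exp(-x));
  }

  static double feedForward(double in1, double in2) {
    double h1 = sigmoid(20 * in1 + 20 * in2 - 10); // behaves roughly like OR
    double h2 = sigmoid(20 * in1 + 20 * in2 - 30); // behaves roughly like AND
    return sigmoid(20 * h1 - 20 * h2 - 10);        // "OR but not AND" = XOR
  }

  // Threshold the continuous output at 0.5 to get a 0/1 answer
  static int predict(double in1, double in2) {
    return feedForward(in1, in2) > 0.5 ? 1 : 0;
  }

  public static void main(String[] args) {
    int[][] inputs = {{0, 1}, {1, 0}, {0, 0}, {1, 1}};
    for (int[] in : inputs) {
      System.out.println(in[0] + "," + in[1] + " -> " + predict(in[0], in[1]));
    }
  }
}
```

Backpropagation has to discover weights that play a similar role on its own, starting from random values.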

The structure looks something like this:

[Figure: diagram of the network structure]

However, I think there might be a flaw in my backpropagation learning algorithm. For whatever reason, with the above neural structure, I can only successfully train my network (starting with random connection weights between -1 and 1) approximately 60% of the time. For the other 40%, the network gets stuck and can’t find the proper solution. If I add two more neurons to the hidden layer, like so. . .

[Figure: diagram of the network with two more hidden neurons]

. . . it trains flawlessly, finding a reasonable solution after a few thousand training iterations 100% of the time (or at least as far as I can reasonably test). What am I missing?

Anyway, a more involved tutorial about the theory, concepts, algorithms, and code behind neural networks is forthcoming. . . at some point. . . after I invent that machine that makes time, that is. . .

If you are downloading the source, note that the code for the nn.jar package is contained in /xor/code/src/nn. Because I’m using a large number of classes in the design of the network, I didn’t want to restrict myself to Processing tabs.

Update (2/08/10): New download link: http://www.shiffman.net/teaching/nature/nn/

Perceptron

view applet and source

Long overdue, I’ve started working on a series of examples that implement neural networks. First up is the simplest, a little Perceptron that learns whether points live on one side of a line (in Cartesian space) or the other.

y = x*0.9-0.2

In this example, the perceptron is trained via an array of known point objects (with known answers), and the resulting “guess” line is displayed in real time. I made the learning constant rather low so that one can see the slow progression of changing weights. I’ve been spending some quality time with Artificial Intelligence, by George Luger. It’s a wonderful book, and even better, it’s free for download online!
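For a sense of what the “known point objects” might look like, here is a minimal sketch that builds labeled training data for the line above. It is not the applet’s code: the class name, the row layout {x, y, bias, answer}, the coordinate range of -1 to 1, and the convention that points on or above the line get +1 are all assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical helper that generates training points with known answers.
class TrainingPoints {
  // The line from the post: y = x*0.9 - 0.2
  static float line(float x) {
    return x * 0.9f - 0.2f;
  }

  // Assumed convention: +1 on or above the line, -1 below it
  static int answer(float x, float y) {
    return y < line(x) ? -1 : 1;
  }

  // Build n random rows of {x, y, bias, answer}, with x and y in [-1, 1)
  static float[][] makePoints(int n, long seed) {
    Random rng = new Random(seed);
    float[][] points = new float[n][];
    for (int i = 0; i < n; i++) {
      float x = rng.nextFloat() * 2 - 1;
      float y = rng.nextFloat() * 2 - 1;
      points[i] = new float[] {x, y, 1, answer(x, y)}; // bias input fixed at 1
    }
    return points;
  }

  public static void main(String[] args) {
    for (float[] p : makePoints(5, 42L)) {
      System.out.println(p[0] + ", " + p[1] + " -> " + (int) p[3]);
    }
  }
}
```

The constant third input is an assumed bias term. Without something like it, the perceptron could only learn lines through the origin, and this line has an offset of -0.2.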

All the code is in the link, but here’s a quick peek at the meat of the matter: a function inside the Perceptron class that adjusts weights according to three input values and their corresponding “known” output. (Note that if the perceptron’s guess matches the desired result, the weights are not changed.)

A more involved write-up will arrive online at some point. . .

// Function to train the Perceptron
// Weights are adjusted based on "desired" answer
void train(float[] vals, int desired) {
  // Compute the weighted sum of the inputs
  float sum = 0;
  for (int i = 0; i < weights.length; i++) {
    sum += vals[i]*weights[i];
  }
  // The result is the sign of the sum
  int result = 1;  // Start with 1
  if (sum < 0) result = -1; // If less than zero, change to -1
  // Compute factor to change weight
  // (DESIRED - RESULT): note this can only be 0, -2, or 2
  // Multiply by learning constant
  float weightChange = c*(desired - result);
  // Adjust weights based on weightChange * input
  for (int i = 0; i < weights.length; i++) {
    weights[i] += weightChange * vals[i];         
  }
}
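
To make the update rule concrete, here is a small self-contained sketch that wraps the same logic in a class and runs a single training step. The class name, the starting weights, the learning constant of 0.01, and the input values are all made up for illustration; only the update rule itself follows the function above.

```java
import java.util.Arrays;

// Hypothetical wrapper around the training rule shown above.
class PerceptronStep {
  float[] weights;
  float c; // learning constant

  PerceptronStep(float[] weights, float c) {
    this.weights = weights;
    this.c = c;
  }

  // The guess is the sign of the weighted sum of the inputs
  int feedforward(float[] vals) {
    float sum = 0;
    for (int i = 0; i < weights.length; i++) {
      sum += vals[i] * weights[i];
    }
    return sum < 0 ? -1 : 1;
  }

  // Nudge each weight by c * (desired - result) * input
  void train(float[] vals, int desired) {
    int result = feedforward(vals);
    float weightChange = c * (desired - result); // 0 when the guess is right
    for (int i = 0; i < weights.length; i++) {
      weights[i] += weightChange * vals[i];
    }
  }

  public static void main(String[] args) {
    PerceptronStep p = new PerceptronStep(new float[] {0.5f, -0.5f, 0.0f}, 0.01f);
    float[] vals = {1, 2, 1}; // sum = 0.5 - 1.0 + 0.0 = -0.5, so the guess is -1
    p.train(vals, 1);         // desired is 1, so weightChange = 0.01 * 2 = 0.02
    System.out.println(Arrays.toString(p.weights));
  }
}
```

With these numbers the guess is wrong, so the weights move by 0.02 times each input, ending up at roughly 0.52, -0.46, and 0.02. Had the guess been right, weightChange would be 0 and nothing would move.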

For related work, check out Aaron Steed’s site.