One of the most under-appreciated types of adaptive system is the one based on Hebbian learning. Such systems tend to get ignored because of their simplicity, but as we shall show, there are practical applications at which they are really good. In Synapse, Hebbian learning is embodied by the Hebbian Layer component together with the Hebbian and Oja's update rules.
Hebbian learning is named after Donald Hebb, a Canadian psychologist who in 1949 published a then revolutionary book, The Organization of Behavior, in which he stated:
"Let us assume that the persistence or repetition of a reverberatory activity (or "trace") tends to induce lasting cellular changes that add to its stability. When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."
The theory can be summarized as "cells that fire together, wire together".
Starting with a matrix of weights W (inputs × outputs), this rule can be expressed mathematically as:

Δw_ij = α · x_i · y_j

where x is the input vector, y is the output vector and α (alpha) is a learning constant.
In plain English, it states that the change in weights equals a learning constant (alpha) times the product of the inputs and the outputs. So when an input fires together with an output, the connection between the two is strengthened. It is extraordinarily simple and has some nice practical properties.
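To make the rule concrete, here is a minimal sketch of a single Hebbian update in NumPy. This is illustrative code only, not Synapse's internal implementation; the names `W`, `x` and `alpha` are placeholders:

```python
import numpy as np

def hebbian_step(W, x, alpha):
    """One Hebbian update on a weight matrix W of shape (inputs, outputs)."""
    y = W.T @ x                   # outputs: weighted sums of the inputs
    W += alpha * np.outer(x, y)   # delta_w[i, j] = alpha * x[i] * y[j]
    return W, y
```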
Anti-Hebbian Novelty Filtering
In practical terms, such a setup means that when a Hebbian layer is presented with an input pattern it has already seen, it will produce a strong response. When it is presented with a pattern that is new, it will produce a weak response. Now, if we invert that behavior, we get a novelty filter: a filter that detects new or anomalous patterns. The practical applications are many: computer security (intrusion detection, network monitoring), machine supervision (breakdown detection), stock market supervision (detecting impending crashes or trading irregularities), fraud detection and bio-monitoring, to name a few.
The best part: it's amazingly simple. To invert the behavior, all we need to do is use a negative alpha constant, and we get what is known as "anti-Hebbian" learning. The constant should be kept small, as the Hebbian update can otherwise diverge.
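As a sketch of the idea (assuming the same simple linear layer as above; this is not Synapse's actual code), an anti-Hebbian novelty filter can be written like this. The output magnitude shrinks for patterns the filter keeps seeing and stays large for unfamiliar ones:

```python
import numpy as np

def novelty_filter(samples, n_outputs=1, alpha=-0.01, seed=0):
    """Run an anti-Hebbian layer over a sequence of input vectors and
    return the response magnitude of each sample as a novelty score."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(samples.shape[1], n_outputs))
    scores = []
    for x in samples:
        y = W.T @ x
        scores.append(np.linalg.norm(y))   # weak for familiar patterns
        W += alpha * np.outer(x, y)        # negative alpha: anti-Hebbian
    return np.array(scores)
```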
So how do we use this little gem in Synapse?
First of all, let's get some data. You can find sample inputs from a plant here:
The first 200 samples represent the plant in normal operation, while the last 100 show the plant in anomalous operation. Here is a plot of the entire data set. Can you see the difference between the normal operation and the abnormal one?
As you can see, it is pretty difficult to spot the difference between the abnormal and the normal plant output right away. The Hebbian layer, however, has no problem with that.
Open the data in Synapse using the CSV File format and make sure to set the validation portion to 50%. That will result in the first 50 samples of the validation set being normal operation data, while the remaining 100 samples will be abnormal operation data. That way, if we plot the validation output of the system, we should be able to see the difference.
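If you want to double-check the arithmetic of that split outside Synapse, it amounts to this (the file name `plant.csv` is a placeholder for the data set above):

```python
import numpy as np

samples = np.loadtxt("plant.csv")   # 300 samples: 200 normal, then 100 anomalous
train, validation = samples[:150], samples[150:]   # 50% validation split
# validation[:50]  -> normal operation
# validation[50:]  -> anomalous operation
```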
Select the Hebbian Layer and in the Setting Browser, set the Learning Forward/"Step" property to -0.01. That's the anti-Hebbian constant (alpha in the equation above).
Select the lower right plot and in the Setting Browser, set its Buffering/"Set" property to "Validation". This instructs the plot to display data from the validation channel. The other two plots keep the default value, which is the training channel.
That's it. Go to training and press the play button. You'll very soon see something like this:
If we zoom in on the lower right plot:
As you can see, the Hebbian Layer has quite correctly detected the abnormal operation that starts at sample 50 of the validation set. And all of this is done by one simple component with one very simple update rule.
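For readers without Synapse at hand, the whole experiment can be approximated with the `novelty_filter` sketch from earlier on synthetic stand-in data. The plant data itself is not reproduced here; the anomaly at sample 200 is invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(300)
# Stand-in signal: a noisy sine whose frequency drifts after sample 200.
signal = np.sin(0.2 * t + np.where(t >= 200, 0.05 * (t - 200), 0.0))
signal += 0.05 * rng.normal(size=t.size)

# Give the filter a short history window so it sees patterns, not points.
window = 10
samples = np.array([signal[i:i + window] for i in range(t.size - window)])

scores = novelty_filter(samples, alpha=-0.01)
print("mean score, normal part   :", scores[:150].mean())
print("mean score, anomalous part:", scores[200:].mean())
```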
If you had any trouble following the steps above, you can download a complete Synapse solution here: novelty.xml
In the second part of this series we will take another look at the Hebbian Layer and see how it can be used to compress data and extract features by performing non-linear/local Principal Component Analysis.