1

 Laws of Association
 Building Associations Into Networks
 The Hebb Rule
 The Delta Rule

2

 Memory has been studied for thousands of years
 Aristotle was interested in chains of memory, and proposed laws of
association to explain successive thoughts
 Aristotle wrote “Acts of recollection happen because one change is of a
nature to occur after another”
 Aristotle considered three different kinds of relationships between the
starting image and its successor: similarity, opposition, and (temporal)
contiguity

3

 Aristotelian theory evolved into British empiricism, and later into
associationist psychology
 Scholars argued over the nature of the laws that permitted two ideas to
be associated, so that the occurrence of one idea would lead naturally
to the subsequent occurrence of the other
 One law that persisted throughout this evolution was some form of the law
of contiguity

4

 One of the key building blocks for a connectionist system is a method
for storing associations between an input pattern and an output pattern
 Let us begin by considering a couple of simple methods by which this
sort of association could be achieved
 We will focus on bringing the law of contiguity to life:
 “When two elementary brain-processes have been active together or in
immediate succession, one of them, on reoccurring, tends to propagate
its excitement into the other” (James, 1890)

7

 “When an axon of cell A is near enough to excite a cell B and repeatedly
or persistently takes part in firing it, some growth process or
metabolic change takes place in one or both cells such that A’s
efficiency, as one of the cells firing B, is increased” (Hebb, 1949)
 Principle of contiguity!

8

 Modern views of Hebb learning involve the strengthening of synapses
(both excitatory and inhibitory) as well as the weakening of synapses
 These two processes have been combined to create many interesting models
of content-addressable memory

9

 “Address”-addressable memory
 Retrieve items by content-independent location

10

 A simple distributed memory system consists of two sets of processors,
and a set of modifiable connections between them

11

 Present two patterns of activity
 Associate the patterns because of their temporal contiguity
 Later, one pattern will cue the other

12

 Make more excitatory the connections between same-state processors
 Make more inhibitory the connections between opposite-state processors
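For units whose states are +1 and −1, both halves of this rule collapse into a single operation: the product of two activities is positive for same-state pairs and negative for opposite-state pairs. A minimal sketch (the unit states here are illustrative):

```python
import numpy as np

# Sketch of the contiguity rule on +1/-1 units: the activity product is
# +1 (more excitatory) where states agree, -1 (more inhibitory) where they differ.
inp = np.array([1, -1, 1])    # state of the first set of processors
out = np.array([-1, 1, 1])    # state of the second set of processors

dW = np.outer(out, inp)       # dW[i][j] = out[i] * inp[j]
print(dW)
```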

13

 To recall, activate processors with the cue
 Their activity sends a signal through existing connections

14

 The network signal should reconstruct the other pattern in the second
set of processing units

15

 Let’s examine the Hebb rule in action
 Let us also determine some conditions in which Hebb learning does not
work very well

17

 Let W(t) be a matrix of connection weights at time t
 Let a and b be two to-be-associated vectors
 Hebb learning becomes:
 W(t+1) = W(t) + a b’
 The outer product defines Hebb learning!
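Sticking with this notation (W(t+1) = W(t) + a b′, with ′ denoting transpose), a minimal NumPy sketch of one learning step and one recall; the vectors are illustrative:

```python
import numpy as np

# A minimal sketch of outer-product Hebb learning (illustrative vectors).
a = np.array([1.0, 0.0, 0.0, 0.0])   # pattern to be recalled
b = np.array([0.0, 1.0, 0.0, 0.0])   # cue pattern (unit length)

# One Hebb step: W(t+1) = W(t) + a b'
W = np.zeros((4, 4))
W += np.outer(a, b)

# Recall: r = Wc = a (b'c); cueing with b itself returns a, since b'b = 1
r = W @ b
print(r)
```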

19

 Recall from memory involves filtering a cue signal through the existing
weights to produce output activity
 r = Wc
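Several associations can be superimposed in one weight matrix. With mutually orthogonal unit cues (an assumption made here for clean recall), each cue retrieves only its own associate; a sketch:

```python
import numpy as np

# Sketch: two associations superimposed in one weight matrix, then
# recalled by filtering each cue through the weights (r = Wc).
c1 = np.array([1.0, 0.0, 0.0]); a1 = np.array([0.2, 0.8, -0.4])
c2 = np.array([0.0, 1.0, 0.0]); a2 = np.array([-0.5, 0.1, 0.9])

W = np.outer(a1, c1) + np.outer(a2, c2)

print(W @ c1)  # recovers a1: the c2 term contributes nothing (c2'c1 = 0)
print(W @ c2)  # recovers a2
```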

21

 We can use linear algebra to reveal some interesting limitations of Hebb
learning
 For instance, what if we relax the mutual orthogonality constraint?
 What if the correlation between c and a is equal to 0.5?
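The failure is easy to demonstrate. In this sketch a response pattern is stored against cue a, and a test cue c is built whose correlation (inner product of unit vectors) with a is 0.5; recall then returns the stored response at only half strength, and with several stored pairs such residue becomes crosstalk between memories:

```python
import numpy as np

# Sketch: Hebb recall with a correlated (non-orthogonal) cue.
a = np.array([1.0, 0.0])             # stored cue (unit length)
b = np.array([0.0, 0.0, 1.0])        # stored response
W = np.outer(b, a)                   # store the cue -> response association

c = np.array([0.5, np.sqrt(3) / 2])  # unit cue with correlation a'c = 0.5
r = W @ c
print(r)                             # 0.5 * b: the response at half strength
```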

23

 We would like to develop a new kind of Hebb learning rule
 This rule would permit the network to correctly recall correlated
patterns
 This rule would also allow the network to improve its performance with
repeated presentations of patterns

25

 The delta rule can be viewed as a Hebb-style association between an
input vector and an (output) error vector
 Repeated applications will reduce error
 The amount of learning depends on the amount of error
 The delta rule can be written as:
 ΔW(t+1) = η (t − o) c′
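A sketch of the rule in action (the learning rate, patterns, and epoch count are all illustrative). The correlated cues that defeat one-shot Hebb learning are mastered here, because each pass shrinks the remaining error (t − o):

```python
import numpy as np

# Sketch of iterative delta-rule learning on correlated cues.
cues = np.array([[1.0, 0.0], [0.7, 0.7]])      # correlated, not orthogonal
targets = np.array([[1.0, 0.0], [0.0, 1.0]])   # desired outputs

W = np.zeros((2, 2))
eta = 0.1                                      # learning rate
for epoch in range(500):
    for c, t in zip(cues, targets):
        o = W @ c                              # obtained output
        W += eta * np.outer(t - o, c)          # delta W = eta (t - o) c'

print(W @ cues[0], W @ cues[1])                # both close to their targets
```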

26

 One vector can be created by combining (adding) two others
 If we have a set of vectors, and none of the vectors can be created by
combining the others, the set of vectors is said to be linearly
independent
 If the vectors are such that one can be created by combining some of the
others, then the set is linearly dependent
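One way to test a set of vectors for linear (in)dependence is the rank of the matrix whose rows are the vectors: full rank means independent. A sketch:

```python
import numpy as np

# Sketch: checking linear (in)dependence with matrix rank.
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])
v3 = v1 + v2                      # built from the others -> dependent set

print(np.linalg.matrix_rank(np.stack([v1, v2])))      # 2: independent
print(np.linalg.matrix_rank(np.stack([v1, v2, v3])))  # 2 < 3: dependent
```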

27

 Let’s examine the delta rule in action
 Let us note some instances in which it serves as an improvement over
Hebb learning
 But let us also note that it is still subject to limitations

28

 How do we move beyond the sorts of limitations that we have noted in the
simple distributed memory?
 First, we need to add nonlinearities into the processing units, letting
them make decisions
 Second, we need methods by which layers of these nonlinear units can be
coordinated
 These will be our topics in later lectures
