University of Alberta Dictionary of Cognitive Science: Learning Rate

In almost every supervised learning rule used to train a connectionist network, the value added to a weight during learning is equal to the activity of the processor at the input end of the connection, multiplied by the activity of the processor at the output end of the connection, multiplied by some fractional value called a learning rate (Dawson, 2004). The learning rate controls the speed of learning, albeit in a non-intuitive manner: a user must choose it as a compromise between ideal mathematics and practical simulation. The smaller the learning rate, the closer the network’s learning is to the learning defined by calculus, because very small learning rates approximate infinitesimal changes in a system. However, the smaller the learning rate, the slower the simulation of learning, because more weight updates are needed to reach a given level of error. One can speed up the simulation by using a larger learning rate, but this moves the simulation further from the continuous descent that the mathematics describes. If the learning rate is too large, each weight change can overshoot, and it is very easy to create a system that oscillates in a state that is not near a minimum error instead of taking the appropriate steps downhill in the error space.
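The trade-off described above can be illustrated with a minimal sketch. It is not taken from Dawson (2004): it assumes a single weight and a toy error surface E(w) = (w - target)^2, and the names `descend` and `error` are hypothetical helpers introduced only for this illustration. Each step moves the weight downhill by the learning rate times the slope of the error.

```python
def error(w, target=0.0):
    """Toy error surface (an assumption for illustration): E(w) = (w - target)**2."""
    return (w - target) ** 2


def descend(learning_rate, steps=50, start=1.0, target=0.0):
    """Run plain gradient descent on the toy error and return the final weight."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * (w - target)   # slope of the error, dE/dw
        w -= learning_rate * gradient   # step downhill, scaled by the learning rate
    return w


# Compare a very small, a moderate, and two overly large learning rates.
for rate in (0.01, 0.4, 1.0, 1.1):
    w = descend(rate)
    print(f"learning rate {rate}: final weight {w:.4g}, error {error(w):.4g}")
```

On this toy surface, the smallest rate is still well short of the minimum after 50 steps, the moderate rate reaches it almost exactly, a rate of 1.0 makes the weight flip between +1 and -1 so the error never falls, and a rate of 1.1 makes the error grow at every step. The specific rates at which these behaviours appear depend on the error surface and would differ in a real network.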

References:

Dawson, M. R. W. (2004). Minds and Machines: Connectionism and Psychological Modeling. Malden, MA: Blackwell Publishing.

(Added April 2011)