7.7 Local Responsibility

Whether a behavior is rewarded or not can depend on local success (e.g., did an agent do its job) or global success (e.g., did the agent doing its job help the system). Reinforcement on the basis of local success is easy to build into a machine. "It is harder to implement a global learning scheme because this requires machinery to find out which agents are connected all the way to the original goal by unbroken chains of subgoals."
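One way to see the difference is a small sketch, which is an illustration for these notes and not anything taken from the book. It assumes agents form a tree in which each agent calls subagents to achieve subgoals, and that each agent simply records whether it succeeded. The `Agent` class, the function names, and the example agents (including the made-up `Detour`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Agent:
    """Hypothetical agent: a name, whether it did its own job, and its subagents."""
    name: str
    succeeded: bool
    subagents: List["Agent"] = field(default_factory=list)


def local_rewards(root: Agent) -> Set[str]:
    """Local scheme: reward every agent that achieved its own goal."""
    rewarded = set()
    stack = [root]
    while stack:
        agent = stack.pop()
        if agent.succeeded:
            rewarded.add(agent.name)
        stack.extend(agent.subagents)
    return rewarded


def global_rewards(root: Agent) -> Set[str]:
    """Global scheme: reward only agents joined to the original goal
    by an unbroken chain of successful subgoals."""
    rewarded = set()
    if not root.succeeded:
        return rewarded  # the original goal failed, so nothing is reinforced
    stack = [root]
    while stack:
        agent = stack.pop()
        rewarded.add(agent.name)
        # Follow only successful subgoals; a failed one breaks the chain.
        stack.extend(sub for sub in agent.subagents if sub.succeeded)
    return rewarded


# Assumed example: Grasp succeeds, but only in service of a failed subgoal.
builder = Agent("Builder", True, [
    Agent("Find", True, [Agent("See", True)]),
    Agent("Detour", False, [Agent("Grasp", True)]),
])

print(sorted(local_rewards(builder)))
print(sorted(global_rewards(builder)))
```

Under these assumptions the local scheme rewards `Grasp` because it did its job, while the global scheme does not, because the chain from `Grasp` back to the original goal is broken at `Detour`. The extra traversal logic in `global_rewards` is a toy version of the "machinery" the quotation refers to.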

When global success is used as a criterion for reinforcement, agents may *not* learn from their experiences! (NB: I think this idea is *very* cool -- it is worth considering for a while, and thinking about the implications for learning!!)

Both approaches to reward have advantages -- maybe a system must be flexible enough to choose which to use, and when to use it.

"The global scheme requires some way to distinguish not only which agents' activities have helped to solve a problem, but also which agents helped with which subproblems." (Question: Is this a version of the credit assignment problem, or does it require a much more refined notion of goals and subgoals? Similarly, when credit assignment is solved by a technique like error backpropagation in a connectionist network, does this focus on local success, global success, or both?)
