number of parameters (the number of components of ) is much less than the number of states, and changing one parameter changes the estimated value of many states
neural network and statistical methods all assume a static training set over which multiple passes are made. In reinforcement learning, however, it is important that learning be able to occur on-line, while interacting with the environment or with a model of the environment.
nonstationary target functions (target functions that change over time)
inputs are states and the target function is the true value function ,
Better approximation at some states can be gained, generally, only at the expense of worse approximation at other states.