[Colloq] Talk - Tuesday, March 23, 2pm - Reinforcement Learning by Policy Search

Mon Mar 22 09:25:35 EST 2004

College of Computer and Information Science Colloquium and the AI Seminar

presents
Leon Peshkin
Harvard University

who will speak on:
Reinforcement Learning by Policy Search

Tuesday, March 23, 2004

2:00pm
149 Cullinane Hall

Northeastern University

ABSTRACT

Teaching is hard, criticizing is easy. This metaphor stands behind the
concept of reinforcement learning as opposed to supervised learning.
Reinforcement learning means learning a policy---a mapping of observations
into actions---based on feedback from the environment. Learning can be
viewed as browsing a set of policies while evaluating them by trial through
interaction with the environment. In this talk I briefly review the
framework of reinforcement learning and present two highlights from my
dissertation.
First, I describe an algorithm which learns by ascending the gradient of
expected cumulative reinforcement. I show what conditions enable experience
re-use in learning. Building on statistical learning theory, I address the
question of sufficient experience for uniform convergence of policy
evaluation and obtain sample complexity bounds. Second, I demonstrate an
application of the proposed algorithm to the complex domain of simulated
adaptive packet routing in a telecommunication network. I conclude by
suggesting how to build an intelligent agent and where to apply
reinforcement learning in computer vision and natural language processing.

Keywords: MDP, POMDP, policy search, gradient methods, reinforcement
          learning, adaptive systems, stochastic control, adaptive behavior.