Projects
Doctoral Dissertation
My dissertation is
titled "Policy-based Exploration for Efficient
Reinforcement Learning". My advisors are Prof.
Charles Isbell and Prof. Andrea Thomaz (Interactive
Computing, Georgia Tech).
Abstract -
Reinforcement Learning (RL) is the field of research
focused on solving sequential decision-making tasks
modeled as Markov Decision Processes. Researchers have
shown RL to be successful at solving a variety of problems
such as system operations (logistics), robot tasks (soccer,
helicopter control) and computer games (Go, backgammon);
however, in general, standard RL approaches do not scale
well with the size of the problem. This scaling problem
arises because RL approaches rely on obtaining samples
that are useful for learning the underlying structure of
the task. In this work we tackle the problem of smart
exploration in RL, both autonomously and through human
interaction. We propose policy-based methods that
effectively bias exploration towards important aspects of
the domain.
Reinforcement Learning agents use function approximation
methods to generalize over large and complex domains. One
of the most well-studied approaches uses linear regression
algorithms to model the value function of the
decision-making problem. We introduce a policy-based
method that uses statistical criteria derived from linear
regression analysis to bias the agent towards exploring
samples that are useful for learning. We show how
exploration policies can be learned autonomously and from
human demonstrations (using concepts of active learning)
to facilitate fast convergence to the optimal policy.
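To make this concrete, here is a minimal sketch of one way such a statistical criterion could drive exploration, assuming the criterion is the prediction variance of a regularized least-squares value estimate; the class name, feature interface and bonus weight below are illustrative assumptions, not the dissertation's actual implementation.

```python
import numpy as np

# A minimal sketch of variance-guided exploration with a linear value model.
# The feature map, bonus weight (beta) and interface are assumptions made
# for illustration, not the method as specified in the dissertation.

class LinearValueExplorer:
    def __init__(self, n_features, beta=1.0, reg=1e-3):
        self.w = np.zeros(n_features)       # value-function weights
        self.A = reg * np.eye(n_features)   # regularized Gram matrix X^T X
        self.b = np.zeros(n_features)       # accumulated X^T y
        self.beta = beta                    # exploration-bonus weight

    def update(self, phi, target):
        """Incorporate one (features, return-target) regression sample."""
        self.A += np.outer(phi, phi)
        self.b += target * phi
        self.w = np.linalg.solve(self.A, self.b)

    def score(self, phi):
        """Value estimate plus a bonus from the regression's prediction variance."""
        variance = phi @ np.linalg.solve(self.A, phi)
        return phi @ self.w + self.beta * np.sqrt(variance)

    def act(self, candidate_features):
        """Prefer the action whose features the regression is least certain about."""
        return int(np.argmax([self.score(phi) for phi in candidate_features]))
```

Under this reading, samples the regression is least confident about receive the largest bonuses, so the agent's exploration is steered towards the parts of the feature space where new data most improves the value fit.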
We then tackle the problem of human-guided exploration in
RL. We present a probabilistic method that combines human
evaluations, instantiated as policy signals, with Bayesian
RL. We show how this approach provides performance
speedups while remaining robust to noisy, suboptimal human
signals.
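As a rough illustration of combining noisy policy signals with a Bayesian action posterior, the following sketch is written in the spirit of policy shaping; the consistency parameter, per-action delta counts and combination rule are assumptions made for the example, not the method as specified in the dissertation.

```python
import numpy as np

# A minimal sketch of folding noisy human policy signals into an action
# posterior, in the spirit of policy shaping. The consistency parameter C
# and the multiplicative combination rule are illustrative assumptions.

def human_policy_posterior(deltas, consistency=0.8):
    """P(action is optimal | human signals), where deltas[a] is the number of
    'good' minus 'bad' labels on action a, and `consistency` is the assumed
    probability that each individual label is correct."""
    C = consistency
    odds = C ** deltas / (C ** deltas + (1 - C) ** deltas)
    return odds / odds.sum()

def combine(agent_policy, deltas, consistency=0.8):
    """Multiply the agent's own (Bayesian) policy with the human-signal
    posterior, so suboptimal or noisy feedback reweights the agent's
    beliefs rather than overriding its learning."""
    human = human_policy_posterior(np.asarray(deltas, dtype=float), consistency)
    combined = agent_policy * human
    return combined / combined.sum()
```

Because the human's reliability enters only through the consistency parameter, a noisy or suboptimal teacher simply flattens the human posterior instead of derailing the agent, which is the robustness property claimed above.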
We also present an approach that exploits the inherent
structure in exploratory human demonstrations to help
Monte Carlo RL overcome its limitations and efficiently
solve large-scale problems.
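One plausible reading of using demonstration structure to assist Monte Carlo RL is to seed the agent's return estimates from demonstration trajectories before its own rollouts begin; the sketch below follows that assumption, and the environment interface (reset, step, actions) is hypothetical.

```python
import random
from collections import defaultdict

# A minimal sketch of seeding every-visit Monte Carlo control with human
# demonstrations. Initializing return estimates from demonstrations is an
# illustrative reading of "exploiting their structure"; the environment
# interface is assumed, not taken from the dissertation.

def mc_control(env, demos, episodes=1000, gamma=0.99, epsilon=0.1):
    Q = defaultdict(float)   # action-value estimates
    N = defaultdict(int)     # visit counts for incremental averaging

    def update(trajectory):
        """Every-visit Monte Carlo update over one (state, action, reward) list."""
        G = 0.0
        for s, a, r in reversed(trajectory):
            G = r + gamma * G
            N[(s, a)] += 1
            Q[(s, a)] += (G - Q[(s, a)]) / N[(s, a)]

    for demo in demos:          # seed value estimates from demonstrations
        update(demo)

    for _ in range(episodes):   # then improve with epsilon-greedy rollouts
        traj, s, done = [], env.reset(), False
        while not done:
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda a: Q[(s, a)])
            s2, r, done = env.step(a)
            traj.append((s, a, r))
            s = s2
        update(traj)
    return Q
```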
We implement our methods on popular arcade games and
highlight the improvements achieved using our approach. We
argue that using humans to help agents efficiently explore
sequential decision-making tasks is an important and
necessary step towards applying Reinforcement Learning to
complex problems.
You can find a copy of the dissertation here.
Master's Dissertation
My thesis is titled "HELP - Human-assisted Efficient
Learning Protocols". My advisors are Prof. Michael Littman
(CS department, Rutgers University) and Prof. Zoran Gajic
(ECE department, Rutgers University).
Abstract -
In recent years, there has been growing attention towards the
development of artificial agents that can naturally communicate
and interact with humans. The focus has primarily been on creating
systems that can unify advanced learning algorithms with various
natural forms of human interaction (such as providing advice,
guidance, motivation and punishment). However, despite the
progress made, interactive systems are still directed towards
researchers and scientists; consequently, the everyday user is
unable to exploit the potential of these systems. Another
shortcoming is that, in most cases, the interacting human is
required to communicate with the artificial agent a large number
of times, which often leaves the human fatigued. To improve these
systems, this thesis extends prior work and introduces novel
approaches via Human-assisted Efficient Learning Protocols (HELP).
Three case studies are presented that detail distinct aspects of
HELP - a) the representation of the task to be learned and its
associated constraints, b) the efficiency of the learning
algorithm used by the artificial agent and c) previously
unexplored "natural" modes of human interaction. The case studies
show how an artificial agent can efficiently learn and perform
complex tasks using only a limited number of interactions with a
human. Each of these studies involves human subjects interacting
with a real robot and/or a simulated agent to learn a particular
task. The focus of HELP is to show that a machine can learn better
from humans when it can take advantage of the knowledge provided
by an interacting human partner or teacher.
You can find a copy of the dissertation here.