
Fundamental Goals for Online Learning
Dr. Martin Zinkevich
Abstract In online learning, there is no specific training and test period: learners are evaluated on how they perform on every problem they face. Moreover, there is not explicit assumption that new problems will be similar to old ones. Such an assumption is invalidated when the examples the learner observes are influenced by another intelligent agent which is influenced in turn by the learner. In some domains, there is no good guarantee one can achieve; in others, performance similar to the offline setting is achievable. The goal of online learning is to design objectives that smoothly interpolate between these two extremes. In the past, online formulations have been developed for a variety of problems, such as regressions, classification, and repeated games. In this talk, I show a unified regret framework in which one simple algorithm can be used that obtains a guarantee in all these settings. Then, I show a second framework which addresses the drawbacks of traditional regret guarantees in the repeated prisoner's dilemma; this framework also sets a novel fundamental goal for reinforcement learning.
Short BioMartin Zinkevich was a NSF Graduate Research Fellow at Carnegie Mellon University where he earned his Ph.D. in Computer Science under Avrim Blum. At CMU, his research focused on the foundations of multi-agent learning. Afterwards, he worked with Amy Greenwald at Brown University and Michael Littman at Rutgers University on understanding multi-agent Q-learning. At present, he is an AICML Postdoctoral Fellow at the University of Alberta, where, among other projects, he is considering how to apply multi-agent techniques to the analysis and play of poker. |