Structured Learning in Sequential Selection Problems

4/28/22 | 4:15pm | E25-111





Assaf Zeevi

Kravis Professor of Business

Abstract: In this talk I will describe two vignettes of sequential decision making problems arising in the OR literature. The first is the classical “house selling” problem (or optimal stopping):  given a random sequence of independent observations revealed one at a time over some finite horizon of play, the objective is to design an algorithm that “stops’’ this sequence to maximize the expected value of the ``stopped” observation. The second vignette is the worker assignment problem (or sequential stochastic assignment):  arriving items (“jobs”) with independent stochastic attributes need to be sequentially matched to a pool of awaiting recipients (“workers”).  Once each job/worker are matched they are no longer admissible for further assignment, and the objective is to maximize the expected cumulative value of the resulting matchings. Both problems date back more than 50 years and can be solved in a relatively straightforward manner using dynamic  programming; both have been used as modeling constructs in numerous applications including recent online marketplaces. Somewhat surprisingly, neither has been studied extensively when the underlying information on the stochastic primitives is unknown a priori. While it is possible to analyze this incomplete information setting with generic multi-purpose learning theoretic methods, these  tend to be quite inefficient when  further “structure” is present in the problem and can be exploited to construct more customized algorithms. The main focus of this talk is to broadly describe possible learning theoretic formulations of the two vignettes, and illustrate some “design principles” for constructing near-optimal policies.

Bio: Assaf Zeevi is Professor and holder of the Kravis chair at the Graduate School of Business, Columbia University. His research and teaching interests lie at the intersection of Operations Research, Statistics, and Machine Learning. In particular, he has been developing theory and algorithms for reinforcement learning, Bandit problems, stochastic optimization,  statistical learning and stochastic networks. Assaf's work has found applications in online retail, healthcare analytics, dynamic pricing, recommender systems, and social learning in online marketplaces.

Assaf received his B.Sc. and M.Sc. (Cum Laude) from the Technion, in Israel, and subsequently his Ph.D. from Stanford University. He spent time as a visitor at Stanford University, the Technion and Tel Aviv University. He is the recipient of several teaching and research awards including a CAREER Award from the National Science Foundation, an IBM Faculty Award, Google Research Award, as well as several best paper awards including the 2019 Lanchester Prize.   Assaf has recently served a term as Vice Dean at Columbia Business School and Editor-in-Chief of Stochastic Systems (the flagship journal of INFORMS' Applied Probability Society). He also serves on various other editorial boards and program committees in the Operations Research and Machine Learning communities, as well as scientific advisory boards for startup companies in the high technology sector.

Event Time: 

2022 - 16:15