Given by Shipra Agrawal at the 2019 INFORMS Annual Meeting in Seattle, WA.
This tutorial discusses recent advances in sequential decision-making models that build upon the basic multi-armed bandit (MAB) setting to greatly expand its purview. Specifically, it covers progress in algorithm design and analysis techniques for three models: (a) contextual bandits, (b) combinatorial bandits, and (c) bandits with long-term constraints and non-additive rewards. It also surveys applications in domains such as online advertising, recommendation systems, crowdsourcing, healthcare, network routing, assortment optimization, revenue management, and resource allocation.
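For readers new to the basic MAB setting that these models build upon, the following is a minimal sketch (not taken from the tutorial itself) of Thompson Sampling on a Bernoulli bandit; the arm means, horizon, and function name are illustrative assumptions.

```python
import numpy as np

def thompson_sampling(true_means, horizon, seed=0):
    """Thompson Sampling on a Bernoulli K-armed bandit (illustrative sketch).

    true_means : success probability of each arm (unknown to the learner).
    horizon    : number of rounds T.
    Returns cumulative regret against the best fixed arm in hindsight.
    """
    rng = np.random.default_rng(seed)
    k = len(true_means)
    successes = np.ones(k)  # Beta(1, 1) uniform prior for each arm
    failures = np.ones(k)
    best = max(true_means)
    regret = 0.0
    for _ in range(horizon):
        # Sample a plausible mean from each arm's posterior, play the argmax.
        samples = rng.beta(successes, failures)
        arm = int(np.argmax(samples))
        reward = rng.binomial(1, true_means[arm])
        successes[arm] += reward
        failures[arm] += 1 - reward
        regret += best - true_means[arm]
    return regret

print(thompson_sampling([0.3, 0.5, 0.7], horizon=5000))
```

The three models in the tutorial extend this loop: contextual bandits condition the arm choice on observed side information, combinatorial bandits select subsets of arms, and constrained bandits couple rounds through shared budgets or non-additive objectives.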