Introduction to Bandits in Recommender Systems

by Andrea Barraza-Urbina (NUI Galway) and Dorota Glowacka (University of Helsinki)

Abstract

The multi-armed bandit problem models an agent that simultaneously attempts to acquire new knowledge (exploration) and to optimize its decisions based on existing knowledge (exploitation). The agent must balance these competing tasks in order to maximize the total value it accumulates over the period considered. The bandit model has many practical applications, such as clinical trials, adaptive routing, and portfolio design. Over the last decade, interest has grown in developing bandit algorithms for specific problems in recommender systems, such as news and ad recommendation, the cold-start problem, personalization, collaborative filtering with bandits, and combining social networks with bandits to improve product recommendation.
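To make the exploration-exploitation trade-off concrete, the following is a minimal sketch of epsilon-greedy, one classic bandit strategy, and not necessarily the approach used in the tutorial materials. It assumes a hypothetical setting in which each arm is an item with a fixed, unknown click probability; the function name `run_epsilon_greedy`, the parameter `click_probs`, and the example values are illustrative assumptions.

```python
import random


def run_epsilon_greedy(click_probs, epsilon=0.1, n_rounds=5000, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (hypothetical click probabilities)."""
    rng = random.Random(seed)
    n_arms = len(click_probs)
    counts = [0] * n_arms    # number of times each arm was recommended
    values = [0.0] * n_arms  # running mean reward observed for each arm
    total_reward = 0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Exploration: recommend a uniformly random item.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: recommend the item with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: values[a])

        # Simulated user feedback: 1 for a click, 0 otherwise.
        reward = 1 if rng.random() < click_probs[arm] else 0

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


if __name__ == "__main__":
    # Assumed click probabilities, chosen only for illustration.
    counts, values, total = run_epsilon_greedy([0.1, 0.5, 0.9])
    print("pulls per arm:", counts)
    print("estimated click rates:", [round(v, 3) for v in values])
    print("total clicks:", total)
```

With `epsilon=0` the agent only exploits and can lock onto a poor item it happened to try first; with `epsilon=1` it only explores and never uses what it has learned. Intermediate values trade some immediate reward for better estimates.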

The aim of this tutorial is to give participants basic knowledge of the following concepts: (a) the exploration-exploitation dilemma and its connection to learning through interaction; (b) the framing of the recommender systems problem as an interactive sequential decision-making task that needs to balance exploration and exploitation; (c) the fundamentals of bandit approaches that address the exploration-exploitation dilemma; and (d) a general picture of the state of the art in bandit-based recommender systems. With this tutorial we hope to enable participants to start working on bandit-based recommender systems and to provide a framework that empowers them to develop more advanced approaches.

The tutorial is divided into three sections: (1) general motivation and an introduction to classic bandit approaches; (2) a hands-on session built around a simple synthetic recommendation task that represents a bandit problem with linear rewards; and (3) an overview of applications of bandit algorithms in recommender systems, summarizing the current state of the field and outlining the challenges of applying bandit algorithms in recommender systems.

This introductory tutorial is aimed at an audience with a background in computer science, information retrieval, or recommender systems and a general interest in the application of machine learning techniques to recommender systems. The prerequisites are basic familiarity with machine learning and basic knowledge of statistics and probability theory. The tutorial will provide practical examples based on Python code and Jupyter Notebooks.
