Subgaussian Importance Sampling for Off-Policy Evaluation and Learning

7 March 2022
kmlcube

Authors

Alberto Maria Metelli, Alessio Russo, Marcello Restelli

Abstract

Importance Sampling (IS) is a widely used building block for a large variety of off-policy estimation and learning algorithms. However, empirical and theoretical studies have progressively shown that vanilla IS yields poor estimates whenever the behavioral and target policies are too dissimilar. In this paper, we analyze the theoretical properties of the IS estimator by deriving a probabilistic deviation lower bound that formalizes the intuition behind this undesired behavior. Then, we propose a class of IS transformations, based on the notion of power mean, that can achieve a subgaussian concentration rate under certain circumstances. Unlike existing methods such as weight truncation, our estimator remains differentiable in the target distribution.
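To make the idea concrete, here is a minimal sketch of a vanilla IS estimate next to a power-mean-transformed one. The abstract does not state the exact transformation, so the form used here, a weighted power mean of exponent `s` between the importance weight and 1 with mixing parameter `lam`, is an assumption, as are the Gaussian behavioral and target distributions, the function names, and the parameter values.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def power_mean_weight(w, s, lam):
    """Weighted power mean (exponent s) between the weight w and 1.

    Assumed form: ((1 - lam) * w**s + lam) ** (1 / s), with lam in [0, 1].
    lam = 0 returns the vanilla weight; s = 0 is the geometric-mean limit.
    """
    if lam == 0.0:
        return w
    if s == 0.0:
        return w ** (1.0 - lam)
    if w == 0.0 and s < 0.0:
        return 0.0  # limit of the power mean as w -> 0 for negative s
    return ((1.0 - lam) * w ** s + lam) ** (1.0 / s)


def is_estimate(samples, f, target_pdf, behavior_pdf, s=-1.0, lam=0.0):
    """Average of (transformed) importance weight times f over the samples."""
    total = 0.0
    for x in samples:
        w = target_pdf(x) / behavior_pdf(x)  # vanilla importance weight
        total += power_mean_weight(w, s, lam) * f(x)
    return total / len(samples)


# Hypothetical setup: samples come from the behavioral distribution N(0, 1),
# and we estimate the mean of f(x) = x under the target distribution N(1, 1).
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]


def target(x):
    return gaussian_pdf(x, 1.0, 1.0)


def behavior(x):
    return gaussian_pdf(x, 0.0, 1.0)


vanilla = is_estimate(samples, lambda x: x, target, behavior)
corrected = is_estimate(samples, lambda x: x, target, behavior, s=-1.0, lam=0.1)
print(vanilla, corrected)
```

With `s = -1` the transformed weight reduces to `w / ((1 - lam) + lam * w)`, which never exceeds `1 / lam` and is smooth in `w`. Truncation, `min(w, M)`, also bounds the weight but has a kink at `w = M`. Bounding the weights introduces bias, so `lam` trades bias against variance.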
