Self-Normalized Off-Policy Estimators for Ranking

Ben London, Alexander Buchholz, Giuseppe Di Benedetto, Jan Malte Lichtenberg, Yannik Stein, Thorsten Joachims

September 2023

Abstract

We propose two new estimators for off-policy evaluation of ranking policies, based on the idea of self-normalization. Importantly, these estimators are parameter-free and asymptotically unbiased. Experiments with synthetic data demonstrate that our estimators can be more accurate than other importance weighting estimators, owing to their ability to control variance, while adding minimal bias. From this, we conclude that self-normalization offers an optimal balance of accuracy and practicality for off-policy ranker evaluation.
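As a rough illustration of the self-normalization idea the abstract refers to, the sketch below contrasts a plain importance-weighted (IPS) estimate with its self-normalized counterpart for a generic single-action logged-feedback setting. It is not the pair of ranking estimators proposed in the paper; the function names, argument layout, and toy numbers are all assumptions made for illustration.

```python
def importance_weights(target_probs, logging_probs):
    """Per-sample ratios pi_target(a|x) / pi_logging(a|x) (hypothetical helper)."""
    return [t / l for t, l in zip(target_probs, logging_probs)]


def ips_estimate(rewards, target_probs, logging_probs):
    """Plain importance-weighted estimate: divide the weighted sum by the sample count."""
    weights = importance_weights(target_probs, logging_probs)
    return sum(w * r for w, r in zip(weights, rewards)) / len(rewards)


def self_normalized_estimate(rewards, target_probs, logging_probs):
    """Self-normalized estimate: divide the weighted sum by the sum of the weights."""
    weights = importance_weights(target_probs, logging_probs)
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("all importance weights are zero")
    return sum(w * r for w, r in zip(weights, rewards)) / total_weight


# Toy logged data (made up for illustration only).
rewards = [1.0, 0.0, 1.0, 0.0]
target_probs = [0.5, 0.5, 0.25, 0.25]
logging_probs = [0.25, 0.5, 0.5, 0.25]

ips_value = ips_estimate(rewards, target_probs, logging_probs)
snips_value = self_normalized_estimate(rewards, target_probs, logging_probs)
```

Because the self-normalized form is a weighted average of the observed rewards, its value always stays within the range of those rewards, which is one intuition for the variance control described above; the price is a small bias that vanishes as the sample grows.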

Type

Workshop

Publication

CONSEQUENCES Workshop – RecSys