Counterfactual Accuracies of Alternative Models. ML-IRL: Machine Learning in Real Life Workshop at ICLR 2020.
Abstract: Typically, we fit a model by optimizing performance on training data. Here we focus on the case of a binary classifier that predicts ‘yes’ or ‘no’ for any given test point. We explore a notion of confidence in a particular prediction by asking: if we were to fit an alternative classifier from the same model class to the same training data, how much training accuracy would we have to give up for the prediction at the test point to change?
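
The question posed in the abstract can be illustrated with a small sketch. It is not the paper's method: it assumes a deliberately tiny model class (one-dimensional threshold classifiers, enumerated exhaustively), and the names `stump_predict`, `train_accuracy`, and `accuracy_gap` are hypothetical helpers introduced only for this example. The gap it returns is the training accuracy one would have to give up to obtain a classifier from this class that predicts the opposite label at the test point.

```python
def stump_predict(threshold, sign, x):
    # Assumed model class: sign=+1 predicts 1 ("yes") when x > threshold,
    # sign=-1 predicts 1 when x <= threshold.
    return int((x > threshold) == (sign == 1))


def train_accuracy(threshold, sign, xs, ys):
    # Fraction of training points classified correctly by this stump.
    correct = sum(stump_predict(threshold, sign, x) == y for x, y in zip(xs, ys))
    return correct / len(xs)


def accuracy_gap(xs, ys, x_test):
    # Candidate thresholds: one below all points, the midpoints between
    # neighbouring points (test point included), and one above all points.
    pts = sorted(set(xs) | {x_test})
    thresholds = (
        [pts[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
        + [pts[-1] + 1.0]
    )
    scored = [
        (train_accuracy(t, s, xs, ys), t, s)
        for t in thresholds
        for s in (1, -1)
    ]

    # The fitted model is the one with the highest training accuracy
    # (ties are broken arbitrarily by tuple ordering).
    best_acc, best_t, best_s = max(scored)
    pred = stump_predict(best_t, best_s, x_test)

    # Best training accuracy among alternatives that flip the prediction.
    # The two constant classifiers guarantee this set is never empty.
    alt_acc = max(
        acc for acc, t, s in scored if stump_predict(t, s, x_test) != pred
    )
    return pred, best_acc, best_acc - alt_acc


xs = [0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 1]
print(accuracy_gap(xs, ys, 2.5))  # test point well inside the 'yes' region
print(accuracy_gap(xs, ys, 1.6))  # test point near the decision boundary
```

In this toy setting, a larger gap means that every alternative classifier disagreeing with the prediction fits the training data noticeably worse, while a gap of zero means an equally accurate alternative predicts the opposite label.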