Model Selection as a Multiple-Objective Programming Problem

Abstract

In this paper, we address the statistical problem of model selection, i.e., choosing a model from a set of candidates based on a sample of data. The problem is most commonly framed as a response to overfitting, that is, the tendency of a model to fit the training data too closely, thereby failing to generalize well to new data or to make reliable out-of-sample predictions. A standard approach to mitigating overfitting involves balancing model goodness-of-fit, typically measured by the objective function used in estimation, against model complexity, often quantified by the number of parameters. While most model selection methods aim to approximate or correct certain theoretical quantities in finite samples (e.g., the Kullback–Leibler divergence in the case of AIC), we propose an alternative approach. We formulate model selection as a multi-criteria optimization problem and apply the weighted-sum method to balance the competing objectives. We identify desirable asymptotic properties for model selection procedures and derive necessary and sufficient conditions on the weights that ensure these properties are satisfied. Additionally, we demonstrate that these conditions are closely connected to limit theorems for objective functions. Our results underscore the differences between model selection in nested model frameworks and in settings where models may be arbitrarily related.
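The weighted-sum idea described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration only and not taken from the paper: a synthetic cubic signal, nested polynomial models, mean squared error as the goodness-of-fit objective, the parameter count as the complexity objective, and a BIC-like weight of log(n)/n that shrinks as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: cubic signal plus Gaussian noise (illustrative only).
n = 200
x = np.linspace(-1.0, 1.0, n)
y = 1.0 + 2.0 * x - 3.0 * x**3 + rng.normal(scale=0.3, size=n)

def fit_mse(degree):
    """Least-squares polynomial fit; returns the minimized objective (MSE)."""
    X = np.vander(x, degree + 1)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ coef) ** 2)

def weighted_sum_selection(degrees, weight):
    """Scalarize the two objectives as fit(m) + weight * complexity(m)
    and return the degree attaining the minimum."""
    scores = {d: fit_mse(d) + weight * (d + 1) for d in degrees}
    return min(scores, key=scores.get)

# A weight decaying like log(n)/n keeps penalizing complexity but lets the
# fit term dominate asymptotically, in the spirit of the conditions on the
# weights discussed in the abstract.
best = weighted_sum_selection(range(0, 8), weight=np.log(n) / n)
print(best)
```

With these particular choices the criterion recovers the degree of the generating polynomial; the abstract's point is that whether such a procedure behaves well asymptotically depends on how the weight sequence is chosen relative to the limit behavior of the objective functions.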

Date
Jul 1, 2025 — Jul 3, 2025
Location
Varese, Italy
Raffaello Seri
Professor of Econometrics

My research interests include statistics, numerical analysis, operations research, psychology, economics and management.
