Regularization in Mixture-of-Experts Models with Ultra-High Dimensionality
Conference: 65th ISI World Statistics Congress 2025
Format: IPS Abstract - WSC 2025
Keywords: complex and high-dimensional modelling
Session: IPS 824 - Unveiling the Power of Mixture Models in a Data-Rich World
Thursday 9 October 10:50 a.m. - 12:30 p.m. (Europe/Amsterdam)
Abstract
Mixture-of-experts (MoE) models provide a flexible statistical framework for capturing unobserved heterogeneity in data. In modern applications of MoEs, the dimension of the feature space is often large relative to the sample size of the training dataset, and statistical inference therefore becomes intrinsically challenging. In this work, we propose and study, for the first time, penalized likelihood estimation and feature selection methods in ultrahigh-dimensional spaces for sparse MoEs in which the experts are generalized linear models (GLMs). We refer to this model as the sparse MoE-GLM. Under general conditions, we establish consistency in estimation and feature selection for the proposed methods. We assess their empirical performance through extensive simulations and illustrate their application in a real data analysis. Our work offers a comprehensive and detailed analysis of regularization methods for sparse MoE-GLM models in an ultrahigh-dimensional setting. This work is based on a chapter from the PhD thesis of my student, Pengqi Liu, at McGill University.
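For concreteness, the following is a minimal sketch of the kind of objective involved, assuming a softmax gating network, GLM expert densities f, and L1-type penalties; the notation (K, w_k, beta_k, lambda, gamma) is illustrative and not taken from the paper itself.

% Sparse MoE-GLM: softmax gating over K experts with GLM densities f
% (illustrative notation; the penalties and parameterization shown are assumptions)
\[
  p(y \mid \mathbf{x};\boldsymbol{\theta})
    = \sum_{k=1}^{K} \pi_k(\mathbf{x};\mathbf{w})\,
      f\!\bigl(y \mid \mathbf{x}^{\top}\boldsymbol{\beta}_k,\,\phi_k\bigr),
  \qquad
  \pi_k(\mathbf{x};\mathbf{w})
    = \frac{\exp(\mathbf{x}^{\top}\mathbf{w}_k)}
           {\sum_{l=1}^{K}\exp(\mathbf{x}^{\top}\mathbf{w}_l)}.
\]
% Penalized log-likelihood: sparsity-inducing penalties on both the
% expert coefficients and the gating coefficients
\[
  \widehat{\boldsymbol{\theta}}
    = \arg\max_{\boldsymbol{\theta}}\;
      \frac{1}{n}\sum_{i=1}^{n}\log p(y_i \mid \mathbf{x}_i;\boldsymbol{\theta})
      \;-\; \sum_{k=1}^{K}\bigl(\lambda\,\|\boldsymbol{\beta}_k\|_1
                            + \gamma\,\|\mathbf{w}_k\|_1\bigr).
\]

In an ultrahigh-dimensional setting the feature dimension may greatly exceed n, so sparsity in both the gating and expert coefficients is what makes estimation and feature selection feasible.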