Theory of Posterior Concentration for Generalized Bayesian Additive Regression Trees
Conference
65th ISI World Statistics Congress 2025
Format: CPS Abstract - WSC 2025
Session: CPS 14 - Ordinal Data and Tree-Based Methods
Wednesday 8 October 4 p.m. - 5 p.m. (Europe/Amsterdam)
Abstract
Bayesian Additive Regression Trees (BART) are a powerful semiparametric ensemble learning technique for modeling nonlinear regression functions. Although BART was initially proposed for predicting only continuous and binary response variables, multiple extensions have emerged over the years that are suitable for a wider class of response variables (e.g., categorical and count data) in a multitude of application areas. In this paper we describe a generalized framework for Bayesian trees and their additive ensembles in which the response variable comes from an exponential family distribution; the framework therefore encompasses a majority of these variants of BART. We derive sufficient conditions on the response distribution under which the posterior concentrates at the minimax rate, up to a logarithmic factor. In this regard, our results provide theoretical justification for the empirical success of BART and its variants.