A framework for statistical inference via randomized algorithms
Conference
64th ISI World Statistics Congress
Format: IPS Abstract
Keywords: inference, randomization
Session: IPS 502 - Bernoulli Society New Researcher Award Session 2023
Tuesday 18 July 2 p.m. - 3:40 p.m. (Canada/Eastern)
Abstract
Large datasets are now common in many areas, and their analysis poses major challenges.
Randomized algorithms, such as randomized sketching and subsampling, are a promising approach to ease the computational burden. However, randomized algorithms also produce non-
deterministic outputs, leading to the problem of quantifying their accuracy.
In this paper, we develop a statistical inference framework for uncertainty quantification via
randomized methods such as sketching. This framework allows inference using either multiple
runs of the same randomized algorithm (similar to the use of subsampling), or by estimating the
unknown parameters of the limiting distribution. Using our framework, we develop methods for
statistical inference in the fundamental problem of least squares regression. These methods rest
on characterizing the asymptotic distribution—as the sample size grows—of various estimators
obtained via sketching (such as sketch-and-solve, partial sketching, iterative sketching), for a
variety of distributions of sketching matrices. The results are supported via a broad range
of simulations. Our analysis rests on, and further develops techniques from random matrix
theory.