65th ISI World Statistics Congress 2025

Continuous Testing

Conference

65th ISI World Statistics Congress 2025

Format: CPS Abstract - WSC 2025

Keywords: e-values, evidence, hypothesis test

Session: CPS 1 - Statistical Theory

Tuesday 7 October 4 p.m. - 5 p.m. (Europe/Amsterdam)

Abstract

Testing has developed into the fundamental statistical framework for falsifying hypotheses. Unfortunately, tests are binary in nature: a test either rejects a hypothesis or not. Such binary decisions do not reflect the reality of many scientific studies, which often aim to present the evidence against a hypothesis and do not necessarily intend to establish a definitive conclusion. We propose the continuous generalization of a test, which we use to measure the evidence against a hypothesis. Such a continuous test can be viewed as a continuous and non-randomized interpretation of the classical ‘randomized test’. This offers the benefits of a randomized test, without the downsides of external randomization. Another interpretation is as a literal measure, which measures the amount of binary tests that reject the hypothesis. Our work unifies classical testing and the recently proposed e-values: e-values bounded to [0, 1/α] are continuously interpreted size-α randomized tests. Taking α to 0 yields the regular e-value: a ‘level 0’ continuous test. Moreover, we generalize the traditional notion of power by using generalized means. This produces a framework that contains both classical Neyman-Pearson optimal testing and log-optimal e-values, as well as a continuum of other options. The traditional p-value appears as the reciprocal of a generally invalid continuous test. In an illustration in a Gaussian location model, we find that optimal continuous tests are of a beautifully simple form.
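The correspondence between bounded e-values and size-α randomized tests can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not the construction from the talk: the function names (`lr_e_value`, `continuous_test`, `randomized_decision`, `mean_under_null`) are hypothetical, and the capped likelihood ratio in a Gaussian location model is chosen only because it is a simple e-value bounded to [0, 1/α]; it is not claimed to be the optimal continuous test mentioned above. Multiplying the bounded e-value by α gives a number in [0, 1] that can be reported directly as evidence, or fed to an external uniform draw to recover a classical randomized test.

```python
import math
import random


def lr_e_value(x, mu):
    """Likelihood ratio of N(mu, 1) against the null N(0, 1) at observation x.

    Its expectation under the null equals 1, so it is an e-value.
    """
    return math.exp(mu * x - 0.5 * mu * mu)


def continuous_test(e, alpha):
    """Cap an e-value at 1/alpha and rescale it to [0, 1].

    Capping can only lower the null expectation, so the result has
    null expectation at most alpha (assumed illustrative construction).
    """
    return alpha * min(e, 1.0 / alpha)


def randomized_decision(phi, u):
    """Classical randomized test: reject when the external uniform u is below phi."""
    return u < phi


def mean_under_null(alpha, mu, n, seed=0):
    """Monte Carlo estimate of the null expectation of the continuous test."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # draw from the null N(0, 1)
        total += continuous_test(lr_e_value(x, mu), alpha)
    return total / n


if __name__ == "__main__":
    alpha, mu = 0.05, 1.0
    phi = continuous_test(lr_e_value(2.0, mu), alpha)
    print("continuous test value at x = 2:", phi)
    print("estimated null mean:", mean_under_null(alpha, mu, 20000))
```

Reporting `phi` itself avoids the external randomization in `randomized_decision`, while the Monte Carlo estimate gives a rough check that the null expectation stays near or below α, up to simulation noise.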