Optimal Data Analysis after 80 Years Struggle to Calculate Bayes Error Rate
Conference
64th ISI World Statistics Congress
Format: CPS Poster
Keywords: classification, clustered-data, discrimination
Session: CPS Posters-03
Monday 17 July 4 p.m. - 5:20 p.m. (Canada/Eastern)
Abstract
The minimum classification error, known as the Bayes error rate, is the basis of optimal data analysis. Fisher introduced this error into discriminant analysis in 1936. In the absence of analytical, numerical, or resampling methods to calculate the Bayes error rate, hundreds of researchers proposed sharp bounds to estimate it. After 86 years of struggle, the Bayes error rate can now be calculated analytically and computed at bayeserror.com. This paper reviews selected works on the Bayes error rate and highlights their roles in optimal data analysis. Optimal data analysis is a Bayesian error rate paradigm focused on minimizing the error rate of the model. It aims to solve some of the problems associated with interpreting data, and it can be compared with the classical, Bayesian, sequential, and statistical evidence paradigms.
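
For readers new to the term, the Bayes error rate can be illustrated in the simplest textbook setting: two univariate normal classes with a shared standard deviation and equal priors, where the minimum error has the well-known closed form Phi(-|mu1 - mu0| / (2 sigma)). The sketch below is an illustration under those assumptions only. It is not the method presented in this paper and not what bayeserror.com computes, and all function and parameter names are hypothetical. It checks the closed form against a direct numerical integration of min(p0 f0(x), p1 f1(x)).

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(z):
    """Standard normal CDF, written with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def bayes_error_closed_form(mean0, mean1, sd):
    """Bayes error for two normal classes, assuming equal priors and shared sd."""
    return normal_cdf(-abs(mean1 - mean0) / (2.0 * sd))


def bayes_error_numeric(mean0, mean1, sd, prior0=0.5, steps=20_000):
    """Midpoint-rule integral of min(prior0 * f0(x), prior1 * f1(x)).

    The integration range extends 10 sd beyond both means, which is an
    illustrative choice that leaves negligible mass outside the range.
    """
    lo = min(mean0, mean1) - 10.0 * sd
    hi = max(mean0, mean1) + 10.0 * sd
    h = (hi - lo) / steps
    prior1 = 1.0 - prior0
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += min(prior0 * normal_pdf(x, mean0, sd),
                     prior1 * normal_pdf(x, mean1, sd))
    return total * h


if __name__ == "__main__":
    closed = bayes_error_closed_form(0.0, 2.0, 1.0)
    numeric = bayes_error_numeric(0.0, 2.0, 1.0)
    print(f"closed form: {closed:.6f}, numeric: {numeric:.6f}")
```

With means 0 and 2 and unit standard deviation, both routes give Phi(-1), roughly 0.1587: no classifier can misclassify fewer than about 15.9% of cases in this setting. In less restrictive settings no such simple closed form is available, which is the difficulty the abstract refers to.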