Probabilistic Vector Machines
Conference
64th ISI World Statistics Congress
Format: CPS Paper
Session: CPS 69 - Machine learning
Tuesday 18 July 5:30 p.m. - 6:30 p.m. (Canada/Eastern)
Abstract
Over the last few decades, kernel-based methods have added an important set of tools to the professional statistician's repertoire, providing competitive, model-free alternatives to traditional statistical methodologies. In particular, in two-class supervised classification problems, kernel-based Support Vector Machines (SVMs) are known to be among the most accurate predictors of unknown class memberships, and sequences of weighted SVMs can be used to accurately estimate the corresponding class probabilities. However, existing global all-in-one extensions of this approach to multiclass problems can be computationally too demanding and do not scale well as the number of classes grows. In this work, we present an improved method for building reliable k-class probability estimates from global weighted SVMs, with good scaling properties as k increases. Numerical experiments show that class probability estimation based on weighted SVMs is often more accurate than competing distribution-free machine learning approaches, and more reliable than model-based statistical methodologies when their assumptions cannot be guaranteed. A public-domain R package implementing the methods proposed in this paper is under preparation.
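To make the two-class idea mentioned above concrete, the following is a minimal sketch of probability estimation from a sequence of weighted SVMs. It is not the k-class method proposed in this paper, and it is not the R package under preparation. It assumes a grid of weights pi in (0, 1), with class +1 weighted by 1 - pi and class -1 by pi, so that each weighted classifier ideally targets sign(p(x) - pi); the probability estimate is then the midpoint between the largest pi voting +1 and the smallest pi voting -1. The linear kernel on 1-D inputs, the plain subgradient trainer, the toy data, and all function names are illustrative assumptions chosen so that the example runs with the standard library alone.

```python
# Sketch only: two-class probability estimation from a sequence of weighted SVMs.
# Linear kernel, 1-D inputs, trainer, data, and all names are illustrative assumptions.

def train_weighted_linear_svm(xs, ys, w_pos, w_neg, lam=0.01, lr=0.05, epochs=500):
    """Fit f(x) = w*x + b by subgradient descent on an L2-regularized weighted hinge loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = lam * w, 0.0
        for x, y in zip(xs, ys):
            cost = w_pos if y > 0 else w_neg  # class-dependent misclassification cost
            if y * (w * x + b) < 1.0:  # point violates the margin
                grad_w -= cost * y * x / n
                grad_b -= cost * y / n
        w -= lr * grad_w
        b -= lr * grad_b
    return lambda x: w * x + b


def estimate_probability(pis, decisions):
    """Midpoint of the largest pi voting +1 and the smallest pi voting -1."""
    lower = max((p for p, d in zip(pis, decisions) if d > 0), default=0.0)
    upper = min((p for p, d in zip(pis, decisions) if d <= 0), default=1.0)
    return 0.5 * (lower + upper)


def fit_probability_estimator(xs, ys, pis):
    """Train one weighted SVM per grid value and combine their signs."""
    # Class +1 gets weight 1 - pi and class -1 gets weight pi.
    models = [train_weighted_linear_svm(xs, ys, 1.0 - pi, pi) for pi in pis]
    return lambda x: estimate_probability(pis, [f(x) for f in models])


# Toy overlapping 1-D sample, made up for illustration.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [-1] * 5 + [1] * 5
pis = [k / 10 for k in range(1, 10)]

prob = fit_probability_estimator(xs, ys, pis)
for x in (-3.0, 0.0, 3.0):
    print(x, prob(x))
```

The cost of this scheme grows with the size of the weight grid, since one classifier is trained per grid value. In a multiclass all-in-one extension the set of weight vectors is much larger, which is the scaling problem the abstract refers to.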