What is big in big data: a sampling statistician perspective
Conference
64th ISI World Statistics Congress
Format: SIPS Abstract
Keywords: data collection, data quality management, error, selection bias, survey sampling
Tuesday 18 July 8:30 a.m. - 9:40 a.m. (Canada/Eastern)
Abstract
The communities of BigData users and BigData researchers are growing fast, while statisticians at large seem to be divided between those who are enthusiastic and those who are concerned or downright hostile. Is BigData also a big step ahead, truly advancing our ability to extract meaningful information and actual knowledge from data? Is BigData underplaying traditional statistical inference as we know it? Is it supplanting survey methodology as a low-cost, futuristic option? As a (mainly) sampling statistician, I will attempt to unravel the multifaceted relationship connecting BigData to sampling methodology. Starting by reflecting on why it is interesting to look at BigData from a sampling statistician's perspective, I will delve into the somewhat ambiguous definition of BigData and share some very personal considerations and views on the matter. In the process, a number of open questions will arise as I discuss a personal selection of insights that can be traced through the vast body of statistical literature on BigData and sampling methodology.