65th ISI World Statistics Congress 2025

Penalized spline model-based composite estimator of proportion in a finite population using probability and non-probability sample

Conference: 65th ISI World Statistics Congress 2025

Format: CPS Abstract - WSC 2025

Keywords: Bayesian approach, complex sampling design, probit regression, propensity scores

Abstract

Sample surveys are used in official statistics, public opinion and market research, sociology, and many other fields of science. Traditionally used probability samples suffer from nonresponse, and nonresponse rates are rising with the growing number of surveys, unwillingness to take part, and the mobility of the population. Survey nonresponse reduces the accuracy of the estimates, so a need arises for additional data sources that may improve the accuracy of estimates of finite population parameters. Many nonprobability data sets are available nowadays. Examples include information from social networks, such as blogs, comments, personal documents, and videos; administrative data arising in traditional business systems; and machine-generated data, such as data from fixed and mobile sensors and computer systems. Estimates obtained from such data sets are biased because the data collection is uncontrolled. This naturally motivates using nonprobability samples to improve the accuracy of estimators obtained from probability samples. Much research has been done in this area during the last decade. We contribute to the field by proposing a composite estimator for a proportion in a finite population. It is a weighted combination of estimators obtained from the nonprobability and probability samples, each of which employs a probit regression model estimated using a Bayesian approach. The sampling design is taken into account in the case of the probability sample. The accuracy of the composite estimator is compared by simulation with that of estimates obtained using other data aggregation methods.
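
The weighted combination described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the weights are chosen, so the code below rests on one common assumption that is not taken from the source: the two estimators are independent, the probability-sample estimator is approximately unbiased, and the weight minimizes the mean squared error of the combination. The function name `composite_proportion` and all of its parameters are hypothetical.

```python
def composite_proportion(p_prob, var_prob, p_nonprob, var_nonprob, bias_nonprob):
    """Combine two estimates of a population proportion.

    Assumptions (illustrative, not from the abstract):
      - p_prob is approximately unbiased with variance var_prob;
      - p_nonprob has variance var_nonprob and bias bias_nonprob;
      - the two estimators are independent.
    Returns the composite estimate and the weight given to p_prob.
    """
    # Mean squared error of the nonprobability estimator: variance + bias^2.
    mse_nonprob = var_nonprob + bias_nonprob ** 2
    total = var_prob + mse_nonprob
    # Weight that minimizes w^2 * var_prob + (1 - w)^2 * mse_nonprob.
    # If both error terms are zero, fall back to a plain average.
    weight = 0.5 if total == 0 else mse_nonprob / total
    estimate = weight * p_prob + (1.0 - weight) * p_nonprob
    return estimate, weight


# Example with made-up inputs: a noisier but unbiased probability estimate
# and a more precise but biased nonprobability estimate.
estimate, weight = composite_proportion(
    p_prob=0.40, var_prob=0.0004,
    p_nonprob=0.46, var_nonprob=0.0001, bias_nonprob=0.05,
)
print(round(estimate, 4), round(weight, 4))
```

Under these assumptions, a larger bias in the nonprobability estimate moves the weight toward the probability-sample estimate. In practice the bias and variances are unknown and must themselves be estimated, for example from the posterior draws of the fitted models.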