64th ISI World Statistics Congress

MODEL-BASED SINGLE-MONTH UNEMPLOYMENT ESTIMATES FOR THE BRAZILIAN LABOUR FORCE SURVEY INCORPORATING GOOGLE TRENDS DATA

Author

Denise Silva

Co-author

Conference

64th ISI World Statistics Congress

Format: IPS Paper

Keywords: big data, nowcasting, official_statistics, repeated_surveys, time_series

Session: IPS 314 - The use of alternative data through modelling in Official Statistics

Thursday 20 July 10 a.m. - noon (Canada/Eastern)

Abstract

The Brazilian Labour Force Survey publishes monthly unemployment estimates at the national level based on three-month rolling data. The need for single-month estimates to monitor the labour market, at both federal and state levels, became markedly more apparent after the COVID-19 outbreak. Multivariate state space models that integrate survey data and Google Trends series for nowcasting were developed. Their potential to improve estimates for small samples at state level, or for specific population groups such as young people, was evaluated. A different set of search keyword series was considered for each state, because Brazil is a large country with diverse workforce behaviour. In addition, three approaches for targeting the predictors in the dimensionality reduction process were compared: elastic net, clustering, and bivariate state space models. High-dimensionality problems were addressed using a dynamic factor state space model. The analysis period runs from January 2012 to December 2021. The models also account for the autocorrelation of sampling errors due to sample overlap and for the increased volatility of the labour force series in 2020. The results indicate that using bivariate structural models as an intermediate stage for keyword reduction is a better strategy for selecting the correlated Google Trends series than elastic net or clustering. The gain from multivariate models incorporating big data is noticeable at the national level, and nowcast estimates using Google Trends series outperformed those of the univariate model. On the other hand, no evidence was found that the trajectories of the selected searches bring useful information to improve state-level or youth unemployment estimates.
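
As a rough illustration of the state space nowcasting idea mentioned in the abstract, the sketch below runs a Kalman filter for a univariate local level model, the simplest structural time series model. It is not the authors' specification: the multivariate structure, the Google Trends predictors, the dynamic factors, and the autocorrelated sampling errors are all omitted, and the function name `local_level_filter`, the variance values, and the toy series are hypothetical choices made only for this example. A month with no survey estimate is passed as `None`, and its nowcast is the level carried forward with a larger variance.

```python
def local_level_filter(y, sigma2_eps, sigma2_eta, a0=0.0, p0=1e7):
    """Kalman filter for the local level model (illustrative sketch only).

    Observation: y_t      = mu_t + eps_t,  Var(eps_t) = sigma2_eps
    State:       mu_{t+1} = mu_t + eta_t,  Var(eta_t) = sigma2_eta

    `y` is a list of floats; None marks a month with no observation.
    Returns a list of (level, variance) pairs, one per time point.
    The large default p0 approximates a diffuse prior on the first level.
    """
    a, p = a0, p0  # predicted level and its variance for the current month
    out = []
    for obs in y:
        if obs is not None:
            f = p + sigma2_eps      # variance of the one-step prediction error
            k = p / f               # Kalman gain
            a = a + k * (obs - a)   # update the level with the new observation
            p = p * (1.0 - k)       # updated level variance
        # With a missing observation the prediction is kept unchanged.
        out.append((a, p))
        # Prediction step: random-walk level, so only the variance grows.
        p = p + sigma2_eta
    return out


# Hypothetical monthly unemployment rates (%), last month not yet observed.
series = [11.8, 12.1, 12.0, 12.4, None]
results = local_level_filter(series, sigma2_eps=0.04, sigma2_eta=0.01)
nowcast_level, nowcast_var = results[-1]
print(f"nowcast: {nowcast_level:.2f} (variance {nowcast_var:.3f})")
```

In a multivariate extension of the kind the abstract describes, an auxiliary series observed earlier than the survey estimate would inform the missing month through correlated state disturbances; the univariate sketch above only shows the filtering recursion on which such models build.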