65th ISI World Statistics Congress 2025

Text Data Insights and Machine Learning Innovations in Monetary Policy Shock Identification and Macroeconomic Forecasting in the Philippines

Format: CPS Abstract - WSC 2025

Keywords: forecasting, machine learning, monetary policy, text analysis

Session: CPS 50 - Machine Learning in Banking and Finance

Tuesday 7 October 4 p.m. - 5 p.m. (Europe/Amsterdam)

Session: CPS 50 - Machine Learning in Banking and Finance

Tuesday 7 October 5:10 p.m. - 6:10 p.m. (Europe/Amsterdam)

Abstract

This study introduces a new methodology for identifying monetary policy shocks in the Philippines and for improving macroeconomic forecasts by integrating textual data. It evaluates the added value of NLP-derived features from official monetary policy communications published by the Philippine Central Bank. Sentiment analysis and Term Frequency-Inverse Document Frequency (TF-IDF) methods are used to extract economic sentiment and policy direction, which enrich the forecasting model's dataset. Integrating these high-dimensional textual data, especially TF-IDF scores used as supplementary factors alongside principal components (PCA) extracted from macroeconomic variables, significantly improves forecasting accuracy. Notably, Gradient Boosting Machines (GBMs) that combine macroeconomic and textual data outperform the baseline PCA-augmented VAR model, akin to Stock and Watson (2002), by 42% in forecast accuracy.
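As an illustration of the TF-IDF step, the following is a minimal sketch in plain Python. The abstract does not state which weighting variant the study uses, so the smoothed inverse document frequency below is an assumption (one common variant among several), and the function name `tfidf` and the toy statements are hypothetical, not actual central bank text.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: relative term frequency times a smoothed inverse
    document frequency, idf = ln((1 + N) / (1 + df)) + 1.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy statements, invented for illustration only.
statements = [
    "inflation risks tilted upside policy rate raised",
    "inflation outlook benign policy rate unchanged",
    "growth outlook weak policy rate lowered",
]
scores = tfidf(statements)
```

Terms that appear in every statement, such as "policy" and "rate", receive the lowest weights, while terms specific to one statement, such as "raised", are weighted more heavily. In a setup like the one described, each document's weights would form one row of the text feature matrix placed alongside the macroeconomic factors.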

Additionally, the research employs machine learning techniques, including SVM Regression, Random Forests, LSTM, and GBMs, building on the Romer and Romer (2004) methodology, to detect exogenous variations in monetary policy rates. This diverse toolkit provides a thorough assessment of monetary policy shocks during the inflation-targeting regime from 2002 to 2020. The identified shocks account for a considerable portion of the variance in policy rates: about 60% using the original framework, and nearly 80% when incorporating a comprehensive set of indicators, including textual data.
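The core idea of the Romer and Romer (2004) approach is to regress policy rate changes on the information available to policymakers and to treat the residual as the shock. The sketch below shows this with a single-regressor least-squares fit in plain Python. It is only an illustration under stated assumptions: the study replaces the linear regression with machine learning models and uses a much richer information set, and the function name `ols_residuals` and all data values here are hypothetical.

```python
def ols_residuals(y, x):
    """Fit y = a + b * x by least squares; return (a, b, residuals)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # The part of y not explained by x is the candidate shock series.
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, residuals


# Hypothetical data: inflation forecasts (percent) available at each
# policy meeting and the policy rate change (percentage points) decided.
inflation_forecast = [3.0, 3.5, 4.0, 4.5, 5.0, 4.0]
rate_change = [0.0, 0.25, 0.25, 0.50, 0.50, -0.25]

a, b, shocks = ols_residuals(rate_change, inflation_forecast)
```

By construction the residuals are uncorrelated with the regressor, which is what makes them usable as a measure of policy moves not explained by the forecast. Swapping the linear fit for a nonlinear learner changes the functional form of the systematic component but keeps the same residual-as-shock logic.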

The integration of textual data not only enhances economic forecasts but also refines the identification of monetary policy impacts. This approach effectively addresses the 'price puzzle' commonly seen in traditional impulse response analyses, in which prices appear to rise after a contractionary monetary policy shock, underscoring its robustness in analyzing the dynamics of emerging market economies. Furthermore, these findings support further exploration of word vector models such as GloVe and Word2Vec, which represent words as dense vectors and can capture semantic relationships and context that term-frequency measures miss.