Investigating the impact on predictive performance in joint models when misspecifying the relation between outcomes
Conference
65th ISI World Statistics Congress 2025
Format: IPS Abstract - WSC 2025
Keywords: association, joint models, longitudinal
Session: IPS 757 - New Developments and Insights in Joint Modeling of Longitudinal and Survival Outcomes
Thursday 9 October 10:50 a.m. - 12:30 p.m. (Europe/Amsterdam)
Abstract
The increasing availability of patient information means that many different types of data are now collected. Big data is a key element of new developments and of personalised medicine. Our motivation comes from the US Cystic Fibrosis (CF) registry data. CF is a genetic disorder that requires frequent patient monitoring to maintain lung function over time and to minimise the onset of acute respiratory events known as pulmonary exacerbations. An important association has already been characterised between key biomarkers, such as FEV1 and BMI, and time-to-exacerbation. Environmental exposures and community characteristics (geomarkers) are powerful predictors of health. It is therefore of interest to investigate whether socioeconomic deprivation and traffic-related air pollution exposure could further improve the prediction of disease progression.
Analysing all data simultaneously poses many challenges when the structure of the data is complex. In particular, FEV1 measurements are collected over time, and it has previously been established that the decline of this biomarker is highly associated with time-to-exacerbation. It is, however, unknown how the other outcomes (biomarkers, geomarkers and survival data) are associated with each other. Using an extensive simulation study, we investigate the impact of misspecifying (or ignoring) the association between the outcomes on predictive performance measures, in the settings of multivariate mixed models and joint models of longitudinal and survival data. We expect to improve the individualised predictions for CF patients by incorporating all data and appropriately modelling their associations. In this new era of rich medical data sets, it is often challenging to combine all the available data effectively.
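To make the notion of a misspecified association concrete, the following is a minimal sketch of a standard shared-random-effects joint model with two common association structures. The notation and the choice of structures are assumptions made for illustration; the abstract does not state which models or functional forms are compared in the simulation study.

```latex
% Illustrative notation only; the symbols below are assumptions, not taken from the abstract.
% y_i(t): observed longitudinal outcome (e.g. FEV1) for patient i at time t
% m_i(t): underlying subject-specific trajectory; b_i: random effects
% h_i(t): hazard of the event (e.g. pulmonary exacerbation); w_i: baseline covariates
\begin{align*}
y_i(t) &= m_i(t) + \varepsilon_i(t), \qquad \varepsilon_i(t) \sim N(0, \sigma^2), \\
m_i(t) &= x_i(t)^\top \beta + z_i(t)^\top b_i, \qquad b_i \sim N(0, D), \\
h_i(t) &= h_0(t) \exp\{ w_i^\top \gamma + \alpha\, m_i(t) \} && \text{(current-value association)}, \\
h_i(t) &= h_0(t) \exp\{ w_i^\top \gamma + \alpha\, m_i'(t) \} && \text{(slope association)}.
\end{align*}
```

Under this sketch, misspecification would correspond to generating data under one form of the hazard (for example, the slope association) and fitting the other, while ignoring the association would correspond to fixing \alpha = 0, so that the longitudinal and survival submodels are fitted separately.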