Survival causal rule ensemble method considering the prognostic factors for estimating heterogeneous treatment effect
Conference
64th ISI World Statistics Congress
Format: CPS Abstract
Keywords: causal_rule_ensemble, heterogeneous_treatment_effect, survival_analysis
Session: CPS 22 - Survival statistics
Monday 17 July 4 p.m. - 5:25 p.m. (Canada/Eastern)
Abstract
The use of real-world data in medical studies has grown considerably in recent years. Unlike clinical trial data, real-world data are collected from routine medical practice, and the subjects’ backgrounds are characteristically heterogeneous. Correct evaluation of treatment effects in heterogeneous subjects is therefore an important issue in medical studies. To evaluate such treatment effects statistically, the heterogeneous treatment effect (HTE), also known as the conditional average treatment effect (CATE), is used. In the potential outcome framework, HTE generally refers to the causal effect of treatment on the outcome of interest for individuals with different background covariates. HTE is therefore defined as the difference between the potential outcomes under the treatment and control conditions, conditional on the background covariates.
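As a compact restatement of the definition above, the HTE can be written in standard potential outcome notation. The symbols are assumptions introduced here for illustration and do not appear in the abstract: Y(1) and Y(0) denote the potential outcomes under treatment and control, and X denotes the background covariates.

```latex
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]
```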
Many studies have addressed HTE estimation. In recent years, ensemble learning methods such as BART (Hill, 2011) and causal forest (Athey et al., 2019) have been preferred because of their flexibility and high accuracy. These methods dramatically improve the prediction accuracy of the HTE, but their black-box nature hampers interpretation of the results. Further, most previous methods model only the HTE and ignore the prognostic effect, which could affect the prediction accuracy of the HTE. In addition, many previous methods assume a continuous or binary outcome. Survival outcomes also play an essential role in medical research, and studies of HTE estimation for survival outcomes have been increasing. We therefore focus on addressing these weaknesses of previous methods and propose a novel HTE estimation method for survival data.
In this presentation, we propose a novel method, based on RuleFit (Friedman and Popescu, 2008), for estimating the HTE for survival outcomes. The proposed method constructs the model within the Cox proportional hazards framework and defines the HTE as the difference in log hazard between the treatment and control groups. RuleFit provides a model that combines rules with modified linear terms. The model can be easily interpreted through the rules and their corresponding coefficients; our method retains this interpretability for the HTE while also considering the prognostic effect in model construction.
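The log hazard definition of the HTE can be sketched as follows. The notation is an assumption made for illustration and is not taken from the abstract: z is the treatment indicator, x the covariates, \lambda_0(t) the baseline hazard, and f(x, z) the log relative hazard modelled by the rules and linear terms. Under proportional hazards the baseline hazard cancels, so the HTE does not depend on time.

```latex
\lambda(t \mid x, z) = \lambda_0(t)\,\exp\{ f(x, z) \}

\mathrm{HTE}(x) = \log \lambda(t \mid x, z = 1) - \log \lambda(t \mid x, z = 0)
                = f(x, 1) - f(x, 0)
```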
We modified the RuleFit procedure as follows. First, the rules are generated using survival gradient boosting trees, taking the prognostic effect into account. The rules are then divided into treatment effect rules, which include the interaction between the treatment indicator and covariates, and prognostic effect rules, which do not. Next, to keep the HTE model interpretable, each treatment effect rule must express the difference between treatment and control, so a pair of coefficients is estimated for each rule. We therefore apply a group lasso to fit these rules in a sparse Cox proportional hazards model. Finally, the HTE for each rule is calculated simply as the difference between its paired coefficients.
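The final step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each treatment effect rule has already been fitted with one coefficient for the treated arm and one for the control arm, and that the HTE for a subject is the sum of the rule-level differences over the rules that subject satisfies. The rule definitions, covariate names, and coefficient values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Covariates = Dict[str, float]


@dataclass
class TreatmentEffectRule:
    """A fitted treatment effect rule with its paired coefficients (hypothetical)."""
    description: str
    applies: Callable[[Covariates], bool]  # True if the subject satisfies the rule
    coef_treated: float                    # assumed log hazard coefficient, treated arm
    coef_control: float                    # assumed log hazard coefficient, control arm

    @property
    def effect(self) -> float:
        # Rule-level HTE: difference between the paired coefficients.
        return self.coef_treated - self.coef_control


def estimate_hte(rules: List[TreatmentEffectRule], x: Covariates) -> float:
    """Sum the rule-level effects over the rules that apply to subject x.

    Additivity across rules is an assumption of this sketch.
    """
    return sum(rule.effect for rule in rules if rule.applies(x))


# Hypothetical fitted rules; a pair shrunk to (0, 0) mimics a rule that the
# group lasso removed from the model.
rules = [
    TreatmentEffectRule("age > 65", lambda x: x["age"] > 65, -0.40, 0.10),
    TreatmentEffectRule("biomarker <= 1.5", lambda x: x["biomarker"] <= 1.5, 0.20, 0.05),
    TreatmentEffectRule("age > 65 and biomarker > 1.5",
                        lambda x: x["age"] > 65 and x["biomarker"] > 1.5, 0.0, 0.0),
]

older_subject = {"age": 70.0, "biomarker": 1.0}
younger_subject = {"age": 50.0, "biomarker": 2.0}

hte_older = estimate_hte(rules, older_subject)
hte_younger = estimate_hte(rules, younger_subject)
```

Under this parameterization a negative value indicates a lower log hazard under treatment than under control for that subject. Because each rule's two coefficients form one group, a group lasso penalty would keep or drop them together, which is why a removed rule appears here with both coefficients at zero.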