The Imaginary Healthy Patient
Introduction
This ebook provides the online supplementary materials for the article titled “The Imaginary Healthy Patient.” The analysis focuses on what we call imaginary healthy patients, that is, individuals who have unrecognized anxiety and depressive troubles.
There are five parts. The first presents the data and the data cleaning. The second gives the code and results of the classification. The third shows the estimation of SHAP values for the ensemble methods. The fourth and fifth provide robustness checks: the fourth substitutes the MHI-3 score for the MHI-5 score, and the fifth uses self-assessed health.
Part 1
Chapter 1 Load Data presents the formatting procedures and provides a detailed guide to computing the MHI-5 score. Chapter 2 Data Cleaning provides the code used to clean our data, along with information on missing values. Chapter 3 Sunlight Data discusses how the sunlight data were acquired. Chapter 4 Descriptive Statistics gives an overview of the MHI-5 score and of all variables involved in our study. It compares the entire sample with various subsamples, distinguishing individuals with low MHI-5 scores from those with high scores.
Part 2
Chapter 5 Estimations provides the code used to train the different classifiers. The results and performance of the models are shown in Chapter 6 Results.
Part 3
Chapter 7 Estimations shows how to compute SHAP values for our ensemble methods. The results, together with a clustering of the resulting SHAP values, are shown in Chapter 8 Results.
Part 4
This part replicates the estimations performed in Part 2 and Part 3 using another definition of the target variable as a robustness check. Anxiety or depressive troubles are identified using the MHI-3 instead of the MHI-5. The strategy is similar to that employed in the first part. Chapter 9 Data Cleaning shows the operations performed to clean the data. Chapter 10 Descriptive Statistics presents descriptive statistics. Chapter 11 Classification Estimations and Chapter 12 Classification Results present the estimated classifiers and their results, respectively. Finally, Chapter 13 SHAP Estimations and Chapter 14 SHAP Results use the SHapley Additive exPlanations (SHAP) method to examine how each variable contributes to the deviation of the model's predicted score from the baseline.
Part 5
This part replicates the estimations performed in Part 2 and Part 3 using another definition of the target variable as a robustness check. The imaginary healthy patient is defined using both the MHI-5 score and the self-assessed health (SAH) status. The strategy is similar to that employed in the first part. Chapter 15 Data Cleaning shows the operations performed to clean the data. Chapter 16 Descriptive Statistics presents descriptive statistics. Chapter 17 Estimations and Chapter 18 Results present the estimated classifiers and their results, respectively. Finally, Chapter 19 Estimations and Chapter 20 SHAP Results use the SHapley Additive exPlanations (SHAP) method to examine how each variable contributes to the deviation of the model's predicted score from the baseline.
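To make concrete the idea of each variable's role in the deviation of a predicted score from the baseline, the following minimal sketch computes exact Shapley values for a toy score function. It is an illustration only, not the code used in this ebook: the function `toy_score`, the three input features, and the all-zero baseline are hypothetical, and "absent" features are simply replaced by their baseline value, which is a simplification of what SHAP libraries do for ensemble models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x.

    A feature outside the coalition is set to its baseline value
    (a single-reference simplification, assumed for illustration).
    """
    n = len(x)

    def value(subset):
        # Score when only the features in `subset` take their observed values.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # size of the coalition that excludes feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = set(s)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def toy_score(z):
    # Hypothetical predicted score with one interaction term.
    return 0.2 + 0.5 * z[0] + 0.3 * z[1] * z[2]


x = [1.0, 1.0, 1.0]          # hypothetical individual
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_score, x, baseline)
base_value = toy_score(baseline)

# Additivity: the baseline score plus all contributions equals the prediction.
print(phi, base_value, toy_score(x))
```

In this toy case the first feature receives its full additive effect, while the two interacting features share their joint effect equally. The contributions sum to the gap between the predicted score and the baseline score, which is the property that makes the values readable as a decomposition of each prediction.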