An international multi-cohort investigation of self-reported sleep and future depressive symptoms in older adults
Cohorts and analytic samples
We harmonized and aggregated four cohorts of older adults from the US: the Study of Osteoporotic Fractures12 (SOF), the Osteoporotic Fractures in Men Study13 (MrOS), the Memory and Aging Study14 (MAP), and the Minority Aging Research Study15 (MARS). We also developed a Rotterdam Study16 (RS) sample, harmonized to have the same variable names and coding as the US sample. RS and US samples were kept separate because current European data sharing laws generally prohibit RS data from being directly analyzed within US-based institutions. Our harmonization procedures across these cohorts were reported previously17. Briefly, content experts were given a list of all items and then used an iterative process to group them into conceptual domains and subdomains. Within (sub)domains, we selected up to one item from each cohort and then recoded the items to be comparable across cohorts. Finally, experts not involved in the harmonization process rated the perceived harmonizability of the sleep items.
To select the initial visit and accompanying follow-up visit for MrOS, SOF, and RS, we balanced several factors including the follow-up interval length, cohort age, data availability, and harmonizability. These considerations resulted in initial visits and follow-up visits that were 3–6 years apart. Because MAP and MARS have annual assessments, for each participant we selected the initial visit as the first observation year with complete sleep data, and the follow-up visit as the observation year that was closest to, but not beyond, 6 years from the selected initial visit.
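As an illustration only, the visit-selection rule for the annually assessed cohorts (MAP and MARS) can be sketched in Python; the study's own analyses were run in R. The record layout and the field names `year` and `sleep_complete` are hypothetical, and the sketch does not apply the additional data-completeness requirements of the inclusion criteria.

```python
from typing import Dict, List, Optional, Tuple


def select_visits(
    visits: List[Dict], max_gap_years: float = 6.0
) -> Optional[Tuple[Dict, Dict]]:
    """Pick an (initial, follow-up) visit pair for one participant.

    Each visit is a dict with hypothetical keys:
      'year'           - time of the visit in years (any consistent origin)
      'sleep_complete' - True if all sleep items were answered at that visit
    """
    ordered = sorted(visits, key=lambda v: v["year"])

    # Initial visit: first observation year with complete sleep data.
    initial = next((v for v in ordered if v["sleep_complete"]), None)
    if initial is None:
        return None

    # Follow-up: latest later visit that is no more than max_gap_years away,
    # i.e. closest to, but not beyond, the 6-year mark.
    candidates = [
        v for v in ordered if 0 < v["year"] - initial["year"] <= max_gap_years
    ]
    if not candidates:
        return None
    follow_up = max(candidates, key=lambda v: v["year"])
    return initial, follow_up
```

For example, a participant with complete sleep data first available at year 1 and later visits at years 5, 7, and 8 would be assigned year 1 as the initial visit and year 7 as the follow-up visit.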
Inclusion criteria for the US and RS analytic samples were: (1) age ≥ 60; (2) no known or suspected dementia at the initial visit; (3) complete data on self-reported sleep characteristics, depressive symptoms, and antidepressant use at the initial visit (i.e., no missing data related to sleep or the outcomes of interest); (4) complete data for depressive symptoms at the follow-up visit; and (5) no clinically relevant depressive symptoms at the initial visit. All participants provided informed consent, and Institutional Review Boards at the respective institutions approved each study.
Further cohort information and details of how inclusion/exclusion criteria were defined in each cohort are provided in Supplementary Tables 1–2. A flow chart depicting sample derivations is provided in Supplementary Fig. 1.
Measures
Outcome
‘Clinically relevant depressive symptoms’ was selected as our primary outcome because it is interpretable, clinically relevant and harmonizable. Three depression scales were used across the cohorts: the 10-item Center for Epidemiologic Studies Depression Scale18 (CESD-10; MAP and MARS), the 20-item CESD19 (RS), and the 15-item Geriatric Depression Scale (GDS)20 (MrOS and SOF). Based on prior studies’ determinations of optimal cutoffs for the diagnosis of major depressive disorder, the presence of clinically relevant depressive symptoms at follow-up was indicated by a GDS score ≥ 6 (ref. 20), a CESD-10 score ≥ 4 (ref. 21) or a 20-item CESD score ≥ 16 (ref. 19).
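The outcome definition amounts to a cohort-specific threshold rule. A minimal Python sketch follows, reading the cutoffs as 6 (GDS), 4 (CESD-10), and 16 (20-item CESD); the dictionary and function names are illustrative and are not taken from the study code.

```python
# Depression scale administered in each cohort, as described in the text.
SCALE_BY_COHORT = {
    "MAP": "CESD-10",
    "MARS": "CESD-10",
    "RS": "CESD-20",
    "MrOS": "GDS-15",
    "SOF": "GDS-15",
}

# Score at or above which symptoms are treated as clinically relevant.
CUTOFF_BY_SCALE = {"GDS-15": 6, "CESD-10": 4, "CESD-20": 16}


def has_clinically_relevant_symptoms(cohort: str, score: int) -> bool:
    """Return True if the score meets the cutoff for the cohort's scale."""
    scale = SCALE_BY_COHORT[cohort]
    return score >= CUTOFF_BY_SCALE[scale]
```

Under this rule a MrOS participant with a GDS score of 6 is classified as having clinically relevant depressive symptoms, whereas an RS participant with a 20-item CESD score of 15 is not.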
Self-reported sleep
We selected nine harmonized self-reported sleep health and sleep disorder symptom characteristics that were potentially relevant for future depressive symptoms: ‘sleep quality’, ‘daytime symptoms’, ‘midpoint timing’, ‘sleep efficiency’, ‘sleep duration’, ‘difficulty falling asleep’, ‘difficulty staying asleep’, ‘frequency of snoring’ and ‘frequency of stopping breathing during sleep’. All self-reported sleep characteristics were rigorously harmonized across cohorts and previously judged to have high harmonizability, except ‘daytime symptoms’, which had moderate harmonizability17. The ‘daytime symptoms’ sleep characteristic was more heterogeneously measured, with cohort items inquiring about tiredness and fatigue, difficulty staying awake, and sleep problems hindering daytime activities. Because different cohorts administered these different items, we did not have information on the degree to which participants viewed these various daytime symptoms as being similar; thus, we took a conservative approach and refer to them only as daytime symptoms. For interpretability and comparability, we categorized each sleep characteristic based on potentially adverse levels for older adults. Whenever possible, definitions were based on pre-existing and published cutoffs22. Otherwise, we used the frequency distributions and clinical content to indicate potentially adverse levels of sleep characteristics. See Table 1 for definitions of harmonized sleep measures and their cutoffs and Supplementary Table 3 for the original cohort wording of each item.
We considered five self-reported sleep composite scores derived from combinations of the nine categorical sleep characteristics: ‘All-Unweighted’, ‘All-Weighted’, ‘SATED’, ‘Selected’, and ‘Insomnia with Short Sleep’. The All-Unweighted and All-Weighted scores incorporate the full set of sleep features, with the former a simple sum of sleep indicators and the latter a sum incorporating weights derived in the other (external) sample (i.e., weights for the US score were derived in RS, and vice versa). The Selected score was derived to offer a composite score requiring only the sleep items with the strongest associations with depression, selected based on model results from the other sample. SATED and Insomnia with Short Sleep are composite sleep indices frequently considered in the sleep literature7,9. Composite score definitions and derivations are provided in Table 1, with technical details in Supplementary Text 1 and Supplementary Table 5.
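The two full-set composites can be sketched as follows in Python. This is a simplified illustration that assumes each of the nine characteristics has been reduced to a 0/1 indicator of a potentially adverse level; the item keys and any weights passed in are placeholders, and the actual coding and weight derivation are those given in Table 1 and the Supplementary material.

```python
from typing import Dict

# Hypothetical keys for the nine harmonized sleep characteristics.
SLEEP_ITEMS = [
    "sleep_quality",
    "daytime_symptoms",
    "midpoint_timing",
    "sleep_efficiency",
    "sleep_duration",
    "difficulty_falling_asleep",
    "difficulty_staying_asleep",
    "snoring",
    "stopping_breathing",
]


def all_unweighted(indicators: Dict[str, int]) -> int:
    """Simple sum of the adverse-level indicators (0 = optimal on all items)."""
    return sum(indicators[item] for item in SLEEP_ITEMS)


def all_weighted(
    indicators: Dict[str, int], external_weights: Dict[str, float]
) -> float:
    """Weighted sum, with weights estimated in the other (external) sample."""
    return sum(external_weights[item] * indicators[item] for item in SLEEP_ITEMS)
```

The point of taking weights from the external sample is that the score evaluated in one sample never uses weights fitted to that same sample.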
Covariates
Education, marital status, cohort (US only), sex, race (US only), age, and follow-up time were considered potential confounders and included in all models. These are denoted as base covariates. Smoking status, alcohol use, body mass index, number of physical health comorbidities (considering stroke, thyroid disease, heart attack/congestive heart failure, hypertension, diabetes), use of sedating medications, and use of non-sedating antidepressant medications were hypothesized to be either confounders or theoretical mediators10. As the design of our study does not allow for formal differentiation of confounders versus mediators, we refrain from distinguishing their role and instead refer to them as secondary covariates.
While a variety of medications can have subtle effects on sleep, we focused on those with well-described and consistent effects on sleep and/or mood when defining specific medications to be included in the ‘sedating’ and ‘non-sedating antidepressant’ categories. Sedating medications included any tricyclic antidepressant, mirtazapine, nefazodone, non-benzodiazepine non-barbiturate sedative hypnotic medications, or trazodone, coded from individual lists of drugs. Non-sedating antidepressant medications included any antidepressant except tricyclic antidepressants, mirtazapine, or nefazodone. In the US cohort, medications were collected via visual assessment of medication containers brought to the visit. In the RS cohort, medication use was determined based on pharmacy dispensary records.
Four covariates in the US sample each had < 0.5% missing data. Covariate missingness in RS was < 0.5%, except for education (1.37%) and BMI (2.5%). Missing covariate values were imputed using the missForest package in R. Further details of categorization and measurement of all covariates are provided in Table 2 and Supplementary Table 4.
Statistical analyses
Variables in the US and RS cohorts were harmonized so that identical code could be directly applied to each dataset, except that the RS code excluded indicators of race and cohort. For all analyses, code was originally developed at the University of Pittsburgh for the US cohort and then sent to Erasmus MC University Medical Center to be run independently on the RS cohort. In preliminary analyses, we used descriptive statistics to assess sleep and covariate distributions in full and stratified samples and explored Spearman correlations among sleep indicators. Across all analyses, our focus was on interpreting effect sizes and 95% confidence intervals for inference. When examining sleep and depression associations, we considered a Risk Ratio (RR) of 1.86 as a guiderail to indicate a potentially moderate effect size23. To control the false discovery rate, we used Benjamini-Hochberg multiple comparison corrections across tests within each of our three aims to underscore the most robust findings. RStudio 2023.09.1 was used for all analyses.
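For readers unfamiliar with the correction, the Benjamini-Hochberg step-up adjustment (commonly described as controlling the false discovery rate) can be sketched in a few lines of Python. This is a generic textbook version, not the study's R code; the function name is illustrative, and it would be applied separately to the set of p-values within each aim.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing monotonicity with a running minimum (capped at 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A test is then considered robust to correction when its adjusted p-value falls below the chosen level (for example 0.05).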
Aims 1 and 2: individual sleep characteristics and composite scores
We used generalized linear models with a log link (i.e., Poisson regression) and robust standard errors to regress clinically relevant depressive symptoms at follow-up on each individual sleep feature (Aim 1) and each composite sleep score (Aim 2), adjusting for base covariates. This approach produces RR estimates that are interpretable and robust for low-incidence outcomes. A separate model was fit for each sleep characteristic or score.
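Written out for a single binary sleep indicator, the model described above takes the following form. The notation is ours and is introduced only for illustration: $Y_i$ is the indicator of clinically relevant depressive symptoms at follow-up, $S_i$ the sleep indicator, and $\mathbf{Z}_i$ the vector of base covariates.

```latex
\begin{align}
\log \Pr(Y_i = 1 \mid S_i, \mathbf{Z}_i)
  &= \beta_0 + \beta_1 S_i + \boldsymbol{\gamma}^{\top}\mathbf{Z}_i, \\
\mathrm{RR}
  &= \frac{\Pr(Y_i = 1 \mid S_i = 1, \mathbf{Z}_i)}
          {\Pr(Y_i = 1 \mid S_i = 0, \mathbf{Z}_i)}
   = \exp(\beta_1).
\end{align}
```

Because the outcome is binary rather than a true Poisson count, the model-based variance is misspecified, which is why robust (sandwich) standard errors are used for the confidence intervals.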
For each individual sleep item and composite score, we used contrasts to estimate RRs across the observed range, with the reference group set at zero (optimal sleep health based on that characteristic or score). Given our interest in health screening and reducing false negatives, we examined the sensitivity associated with each model in Aims 1 and 2 plus a model including only base covariates. However, for completeness, we also secondarily report specificity and accuracy. When computing these performance metrics, we indicated a person was ‘positive’ for future clinically relevant depressive symptoms if their predicted probability was > 0.07, the average rate of future clinically relevant depressive symptoms across US and RS samples.
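The performance metrics can be computed from predicted probabilities and observed outcomes as in the Python sketch below. The function and argument names are hypothetical; only the 0.07 threshold and the strict "greater than" rule come from the text.

```python
from typing import Dict, Sequence


def screening_metrics(
    predicted: Sequence[float], observed: Sequence[int], threshold: float = 0.07
) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy when flagging predicted > threshold."""
    tp = fp = tn = fn = 0
    for prob, outcome in zip(predicted, observed):
        flagged = prob > threshold  # 'positive' only if strictly above threshold
        if flagged and outcome == 1:
            tp += 1
        elif flagged and outcome == 0:
            fp += 1
        elif not flagged and outcome == 1:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        # Share of true cases that were flagged (few false negatives = high).
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of non-cases that were correctly left unflagged.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / total if total else nan,
    }
```

Setting the threshold at the average outcome rate rather than at 0.5 reflects the low incidence of the outcome: with a 0.5 cutoff almost no one would be flagged and sensitivity would be close to zero.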
We performed three sets of sensitivity analyses for Aims 1 and 2: (1) allowing for new use of any antidepressant medication to count as incident depressive symptoms, and thus excluding participants using sedating or non-sedating antidepressants at the initial visit; (2) removing the sleep item from the CESD-10 and CESD-20 (the GDS does not include a sleep item); and (3) adding secondary covariates to the models to assess whether effect sizes remained consistent.
Exploratory aim 3: moderation
Using Poisson regression, we explored whether sex, age, race (US only), and cohort (US only) moderated the association of individual and composite sleep measures with future clinically relevant depressive symptoms. These analyses are exploratory because we generally did not have specific hypotheses about which demographic profiles would have better or worse sleep health for each sleep score or sleep characteristic. We also ran models within samples stratified by age, sex, race (US only), and cohort (US only), adjusting for base covariates.
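One common way to formalize such a moderation analysis is to add a product term to the log-link model. The text does not spell out the exact specification, so the following is a sketch under that assumption, shown for a binary moderator $M_i$ (for example, sex), a binary sleep indicator $S_i$, and the remaining base covariates $\mathbf{Z}_i$; the notation is ours.

```latex
\begin{align}
\log \Pr(Y_i = 1 \mid S_i, M_i, \mathbf{Z}_i)
  &= \beta_0 + \beta_1 S_i + \beta_2 M_i + \beta_3 S_i M_i
     + \boldsymbol{\gamma}^{\top}\mathbf{Z}_i, \\
\mathrm{RR}_{M=0} &= \exp(\beta_1), \qquad
\mathrm{RR}_{M=1} = \exp(\beta_1 + \beta_3).
\end{align}
```

Under this specification, $\exp(\beta_3)$ is the ratio of the two stratum-specific risk ratios, so a value near 1 indicates little evidence of moderation; the stratified models offer a complementary view of the same question.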
