Neuropsychological test validation of speech markers of cognitive impairment in the Framingham Cognitive Aging Cohort

Pearson correlations between the top 15 most contributing features from the original linguistic feature set and the language-based NPTs.

Abstract

Aim: Although clinicians primarily diagnose dementia based on a combination of metrics such as medical history and formal neuropsychological tests, recent work using linguistic analysis of narrative speech to identify dementia has shown promising results. We aim to build upon research by Thomas JA & Burkardt HA et al. (J Alzheimers Dis. 2020;76:905–2) and Alhanai et al. (arXiv:1710.07551v1. 2020) on the Framingham Heart Study (FHS) Cognitive Aging Cohort by 1) demonstrating the predictive capability of linguistic analysis in differentiating cognitively normal from cognitively impaired participants and 2) comparing the performance of the original linguistic features with the performance of expanded features.

Methods:
Data were derived from a subset of the FHS Cognitive Aging Cohort. We analyzed a sub-selection of 98 participants, who provided 127 unique audio files and clinical observations (n = 127, female = 47%, cognitively impaired = 43%). Building on previous work that extracted the original linguistic features from transcribed audio files, we extracted an expanded feature set. We trained logistic regression classifiers on each feature set to distinguish cognitively normal from cognitively impaired participants, and compared the predictive power of the original features, the expanded features, and participants’ Mini-Mental State Examination (MMSE) scores.
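As a rough illustration of the classification step described above, the following minimal sketch fits a logistic regression by batch gradient descent using only the Python standard library. It is not the study's pipeline: the two features, the synthetic data, and the hyperparameters (`lr`, `epochs`) are assumptions made for illustration, and a real analysis would more likely use an established statistics or machine learning library.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_regression(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Predicted probability of the positive (impaired) class for each row."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Synthetic stand-in data (hypothetical): two made-up features per transcript.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    impaired = rng.random() < 0.43  # roughly the impaired fraction reported above
    shift = 1.0 if impaired else -1.0
    X.append([rng.gauss(shift, 1.0), rng.gauss(0.5 * shift, 1.0)])
    y.append(int(impaired))

w, b = fit_logistic_regression(X, y)
probs = predict_proba(X, w, b)
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y)) / len(y)
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

The sketch scores the same data it was trained on, which overstates performance; any real comparison of feature sets would need held-out data or cross-validation.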

Results:
Based on the area under the receiver operating characteristic curve (AUC) of the models, both the original (AUC = 0.882) and expanded (AUC = 0.883) feature sets outperformed MMSE (AUC = 0.870) in distinguishing cognitively impaired from cognitively normal participants. Although the original and expanded feature sets had similar AUC, the expanded feature set showed better positive and negative predictive value [expanded: positive predictive value (PPV) = 0.738, negative predictive value (NPV) = 0.889; original: PPV = 0.701, NPV = 0.869].
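For readers less familiar with these metrics, the sketch below shows one way AUC, PPV, and NPV can be computed from true labels and predicted scores. The labels, scores, and the 0.5 decision threshold are made-up toy values, not data or settings from the study, and the function names are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def ppv_npv(labels, scores, threshold=0.5):
    """Positive and negative predictive value at a given score threshold."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(1 for p, t in zip(preds, labels) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(preds, labels) if p == 1 and t == 0)
    tn = sum(1 for p, t in zip(preds, labels) if p == 0 and t == 0)
    fn = sum(1 for p, t in zip(preds, labels) if p == 0 and t == 1)
    return tp / (tp + fp), tn / (tn + fn)


# Toy example: 1 = cognitively impaired, 0 = cognitively normal.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

auc_value = auc(labels, scores)
ppv, npv = ppv_npv(labels, scores)
print(auc_value, ppv, npv)  # 0.9375 0.75 0.75
```

AUC summarizes ranking quality across all thresholds, whereas PPV and NPV depend on the chosen threshold and on the proportion of impaired participants, which is one reason two models with nearly identical AUC can differ in predictive value.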

Conclusion:
Linguistic analysis is a potentially powerful clinical tool for classifying cognitive impairment. This study extends earlier work, but further studies of speech analysis in clinical settings are needed to establish its validity for classifying cognitive impairment.

Publication
Exploration of Medicine
Jason A. Thomas
PhD
Medical Data & AI Scientist | Strategist | Informatician | Tech lead - Senior Data & AI Scientist - Philips

My research interests include 1) building foundational layers (data, infrastructure, knowledge representation, talent, culture) to support biomedical data science and 2) applying data science & AI methods on data to drive business value and improve patient outcomes.
