Metabolomics-Based Machine Learning Diagnostics of Post-Acute Sequelae of SARS-CoV-2 Infection

Abstract:

Background: COVID-19 has taken millions of lives and continues to affect people worldwide. Post-Acute Sequelae of SARS-CoV-2 Infection (also known as Post-Acute Sequelae of COVID-19 (PASC) or more commonly, Long COVID) occurs in the aftermath of COVID-19 and is poorly understood despite its widespread effects.

Methods: We created a machine-learning model that distinguishes PASC from PASC-similar diseases. The model was trained to recognize PASC-dysregulated metabolites (p ≤ 0.05) using molecular descriptors.

Results: Our multi-layer perceptron model accurately recognizes PASC-dysregulated metabolites in the independent testing set, with an AUC-ROC of 0.8991, and differentiates PASC from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), Lyme disease, postural orthostatic tachycardia syndrome (POTS), and irritable bowel syndrome (IBS). However, it was unable to differentiate fibromyalgia (FM) from PASC.
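
For readers unfamiliar with the AUC-ROC figure quoted in these results, the metric can be sketched in a few lines. This is a minimal illustration and not the authors' code; the function name `auc_roc` and the toy labels and scores are assumptions made for the example.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    labels: sequence of 0/1 values (1 = positive class)
    scores: sequence of classifier scores, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = auc_roc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.8991 should be read.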

Conclusions: By creating and testing models pairwise on each of these diseases, we elucidated the unique strength of the similarity between FM and PASC relative to other PASC-similar diseases. Our approach is unique to PASC diagnosis, and our use of molecular descriptors enables our model to work with any metabolite where molecular descriptors can be identified, as these descriptors can be generated and compared for any metabolite. Our study presents a novel approach to PASC diagnosis that partially circumvents the lengthy process of exclusion, potentially facilitating faster interventions and improved patient outcomes.

Source: Cai E, Kouznetsova VL, Tsigelny IF. Metabolomics-Based Machine Learning Diagnostics of Post-Acute Sequelae of SARS-CoV-2 Infection. Metabolites. 2025 Dec 17;15(12):801. doi: 10.3390/metabo15120801. PMID: 41441042; PMCID: PMC12734907. https://pmc.ncbi.nlm.nih.gov/articles/PMC12734907/ (Full text)

Precision phenotyping for curating research cohorts of patients with unexplained post-acute sequelae of COVID-19

Abstract:

Background: Scalable identification of patients with post-acute sequelae of COVID-19 (PASC) is challenging due to a lack of reproducible precision phenotyping algorithms, which has led to suboptimal accuracy, demographic biases, and underestimation of PASC.

Methods: In a retrospective case-control study, we developed a precision phenotyping algorithm for identifying cohorts of patients with PASC. We used longitudinal electronic health records data from over 295,000 patients from 14 hospitals and 20 community health centers in Massachusetts. The algorithm employs an attention mechanism to simultaneously exclude sequelae that prior conditions can explain and include infection-associated chronic conditions. We performed independent chart reviews to tune and validate the algorithm.

Findings: The PASC phenotyping algorithm improves precision and prevalence estimation and reduces bias in identifying PASC cohorts compared to the ICD-10-CM code U09.9. The algorithm identified a cohort of over 24,000 patients with 79.9% precision. Our estimated prevalence of PASC was 22.8%, which is close to the national estimates for the region. We also provide in-depth analyses, encompassing identified lingering effects by organ, comorbidity profiles, and temporal differences in the risk of PASC.
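
Precision here means the share of algorithm-flagged patients who are confirmed as true cases on chart review. The sketch below shows that arithmetic only; the chart-review counts are hypothetical values chosen to reproduce the reported 79.9%, and the cohort size of 24,000 is a lower bound taken from the abstract's "over 24,000".

```python
def precision(true_positives, false_positives):
    """Fraction of flagged patients that are confirmed cases."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no flagged patients to evaluate")
    return true_positives / flagged


# Hypothetical chart-review sample: 1,000 flagged records, 799 confirmed.
sample_precision = precision(true_positives=799, false_positives=201)

# Rough number of confirmed cases expected in a flagged cohort of 24,000
# (assumed lower bound; the abstract reports "over 24,000").
expected_confirmed = round(24_000 * sample_precision)
```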

Conclusions: PASC precision phenotyping boasts superior precision and prevalence estimation while exhibiting less bias in identifying patients with PASC. The cohort derived from this algorithm will serve as a springboard for delving into the genetic, metabolomic, and clinical intricacies of PASC, surmounting the constraints of prior PASC cohort studies.

Source: Azhir A, Hügel J, Tian J, Cheng J, Bassett IV, Bell DS, Bernstam EV, Farhat MR, Henderson DW, Lau ES, Morris M, Semenov YR, Triant VA, Visweswaran S, Strasser ZH, Klann JG, Murphy SN, Estiri H. Precision phenotyping for curating research cohorts of patients with unexplained post-acute sequelae of COVID-19. Med. 2025 Mar 14;6(3):100532. doi: 10.1016/j.medj.2024.10.009. Epub 2024 Nov 8. PMID: 39520983; PMCID: PMC11911085. https://pmc.ncbi.nlm.nih.gov/articles/PMC11911085/ (Full text)

Understanding symptom clusters, diagnosis and healthcare experiences in myalgic encephalomyelitis/chronic fatigue syndrome and long COVID: a cross-sectional survey in the UK

Abstract:

Objectives: This study aims to provide an in-depth analysis of the symptoms, coexisting conditions and service utilisation among people with myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and long COVID. The major research questions include the clustering of symptoms, the relationship between key factors and diagnosis time, and the perceived impact of National Institute for Health and Care Excellence (NICE) guidelines on patient care.

Design: Cross-sectional survey using secondary data analysis.

Setting: Community-based primary care level across the UK, incorporating online survey participation.

Participants: A total of 10 458 individuals responded to the survey, of which 8804 confirmed that they or a close friend/family member had ME/CFS or long COVID. The majority of respondents were female (83.4%), with participants from diverse regions of the UK.

Primary and secondary outcome measures: Primary outcomes included prevalence and clustering of symptoms, time to diagnosis, and participant satisfaction with National Health Service (NHS) care, while secondary outcomes focused on symptom management strategies and the perceived effect of NICE guidelines.

Results: Fatigue (88.2%), postexertional malaise (78.2%), cognitive dysfunction (88.4%), pain (87.6%) and sleep disturbances (88.2%) were the most commonly reported symptoms among participants with ME/CFS, with similar patterns observed in long COVID. Time to diagnosis for ME/CFS ranged widely, with 22.1% diagnosed within 1-2 years of symptom onset and 12.9% taking more than 10 years. Despite updated NICE guidelines, only 10.1% of participants reported a positive impact on care, and satisfaction with NHS services remained low (6.9% for ME/CFS and 14.4% for long COVID).

Conclusions: ME/CFS and long COVID share overlapping but distinct symptom clusters, indicating common challenges in management. The findings highlight significant delays in diagnosis and low satisfaction with specialist services, suggesting a need for improved self-management resources and better-coordinated care across the NHS.

Source: Mansoubi M, Richards T, Ainsworth-Wells M, Fleming R, Leveridge P, Shepherd C, Dawes H. Understanding symptom clusters, diagnosis and healthcare experiences in myalgic encephalomyelitis/chronic fatigue syndrome and long COVID: a cross-sectional survey in the UK. BMJ Open. 2025 Apr 2;15(4):e094658. doi: 10.1136/bmjopen-2024-094658. PMID: 40180399. https://bmjopen.bmj.com/content/15/4/e094658 (Full text)

The long COVID evidence gap in England

Introduction:

The term long COVID, also known as post-COVID-19 condition, was coined in spring, 2020, by individuals with ongoing symptoms following COVID-19 in response to unsatisfactory recognition of this emerging syndrome by health-care practitioners.

In September to November, 2020, clinical codes for persistent post-COVID-19 condition and related referrals were introduced and became available for use by health-care practitioners to record details of clinical encounters in electronic health records (EHRs) in England. EHRs, which cover a large proportion of individuals living in England, are increasingly used to help understand the epidemiology of disease alongside the effectiveness and safety of interventions.

Many factors influence the completeness of information in EHRs, including help-seeking behaviour of patients and the discretion and data-recording behaviour of practitioners. Longitudinal population-based studies often include participant self-reports of illness; hence, these studies might be subject to reporting and participation biases. Comparing reported illness in studies to recorded illness in the EHRs of the same individuals might be helpful in understanding the epidemiology and clinical recognition of emerging conditions such as long COVID.

Source: Knuppel A, Boyd A, Macleod J, Chaturvedi N, Williams DM. The long COVID evidence gap in England. Lancet. 2024 May 6:S0140-6736(24)00744-X. doi: 10.1016/S0140-6736(24)00744-X. Epub ahead of print. PMID: 38729195. https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(24)00744-X/fulltext (Full text)

Long COVID Diagnostic with Differentiation from Chronic Lyme Disease using Machine Learning and Cytokine Hubs

Abstract:

The absence of a diagnostic for long COVID (LC) or post-acute sequelae of COVID-19 (PASC) has profound implications for research and potential therapeutics. Further, symptom-based identification of patients with long COVID lacks the specificity to serve as a diagnostic because of the overlap of symptoms with other chronic inflammatory conditions like chronic Lyme disease (CLD), myalgic encephalomyelitis-chronic fatigue syndrome (ME-CFS), and others. Here, we report a machine-learning approach to long COVID diagnosis using cytokine hubs that is also capable of differentiating long COVID from chronic Lyme.

We constructed three tree-based classifiers (decision tree, random forest, and gradient-boosting machine (GBM)) and compared their diagnostic capabilities. A 223-patient dataset was partitioned into training (178 patients) and evaluation (45 patients) sets. The GBM model was selected based on performance (89% sensitivity and 96% specificity for LC) with no evidence of overfitting.
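
The sensitivity and specificity figures above are derived from a confusion matrix on the held-out evaluation set. A minimal sketch of that calculation follows; it is not the authors' pipeline, and the function name and toy predictions are assumptions for illustration. The only numbers taken from the abstract are the 178/45 partition sizes.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = long COVID)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Partition reported in the abstract: 178 training + 45 evaluation patients.
train_fraction = 178 / (178 + 45)  # roughly an 80/20 split

# Toy evaluation labels and predictions (hypothetical).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
toy_sens, toy_spec = sensitivity_specificity(toy_true, toy_pred)
```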

We tested the GBM on a random dataset of 124 individuals (106 PASC and 18 Lyme), resulting in high sensitivity (97%) and specificity (90%) for LC. A Lyme Index composed of two features, (TNF-alpha + IL-4)/(IFN-gamma + IL-2) and (TNF-alpha * IL-4)/(IFN-gamma + IL-2 + CCL3), was constructed as a confirmatory algorithm to discriminate between LC and CLD.
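
The two Lyme Index features are simple ratios of cytokine measurements and can be sketched as follows. The abstract gives no units, cut-offs, or decision rule, so none are assumed here; the class name `CytokinePanel`, the function name, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CytokinePanel:
    """Cytokine measurements for one patient (units as reported by the assay)."""
    tnf_alpha: float
    il_4: float
    ifn_gamma: float
    il_2: float
    ccl3: float


def lyme_index_features(panel):
    """Return the two ratio features named in the abstract.

    feature 1: (TNF-alpha + IL-4) / (IFN-gamma + IL-2)
    feature 2: (TNF-alpha * IL-4) / (IFN-gamma + IL-2 + CCL3)
    """
    denom_1 = panel.ifn_gamma + panel.il_2
    denom_2 = denom_1 + panel.ccl3
    if denom_1 <= 0 or denom_2 <= 0:
        raise ValueError("denominators must be positive")
    sum_ratio = (panel.tnf_alpha + panel.il_4) / denom_1
    product_ratio = (panel.tnf_alpha * panel.il_4) / denom_2
    return sum_ratio, product_ratio


# Example values are made up purely to exercise the arithmetic.
example = CytokinePanel(tnf_alpha=4.0, il_4=2.0, ifn_gamma=1.0, il_2=2.0, ccl3=3.0)
feature_sum, feature_product = lyme_index_features(example)
```

How the two features are combined into a final LC-versus-CLD call is not stated in the abstract, so the sketch stops at computing them.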

Source: Bruce Patterson, Jose Guevara-Coto, Javier Mora et al. Long COVID Diagnostic with Differentiation from Chronic Lyme Disease using Machine Learning and Cytokine Hubs, 18 January 2024, PREPRINT (Version 1) available at Research Square [https://doi.org/10.21203/rs.3.rs-3873244/v1] https://www.researchsquare.com/article/rs-3873244/v1 (Full text)

Long Covid

Abstract:

Long COVID, also known as post-acute sequelae of SARS-CoV-2 infection (PASC), refers to a constellation of persistent symptoms and health issues that continue beyond the acute phase of COVID-19. This chapter provides an overview of the pathogenesis, risk factors, manifestations, major findings, and diagnosis and treatment strategies associated with Long COVID.

Hypotheses regarding the pathogenesis of Long COVID are discussed, encompassing various factors such as persistent viral reservoirs, immune dysregulation with or without reactivation of herpesviruses (e.g., Epstein-Barr Virus and human herpesvirus), dysbiosis, autoimmunity triggered by infection, endothelial dysfunction, microvessel blood clotting, and dysfunctional brainstem and/or vagal signaling. The chapter also highlights the risk factors associated with Long COVID and its occurrence in children.

The major findings of Long COVID, including immune dysregulation, vessel and tissue damage, neurological and cognitive pathology, eye symptoms, endocrinal issues, myalgic encephalomyelitis and chronic fatigue syndrome, reproductive system involvement, respiratory and gastrointestinal symptoms, and the chronology of symptoms, are thoroughly explored.

Lastly, the chapter discusses the challenges and current approaches in the diagnosis and treatment of Long COVID, emphasizing the need for multidisciplinary care and individualized management strategies.

Source: Asiya Kamber Zaidi and Puya Dehgani-Mobaraki. Long Covid. Progress in Molecular Biology and Translational Science, Volume 202, 2024, Pages 113-125 https://www.sciencedirect.com/science/article/abs/pii/S1877117323001771

Psychometric evaluation of the DePaul Symptom Questionnaire-Short Form (DSQ-SF) among adults with Long COVID, ME/CFS, and healthy controls: A machine learning approach

Abstract:

Long COVID shares a number of clinical features with myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), including post-exertional malaise, severe fatigue, and neurocognitive deficits. Utilizing validated assessment tools that accurately and efficiently screen for these conditions can facilitate diagnostic and treatment efforts, thereby improving patient outcomes.

In this study, we generated a series of random forest machine learning algorithms to evaluate the psychometric properties of the DePaul Symptom Questionnaire-Short Form (DSQ-SF) in classifying large groups of adults with Long COVID, ME/CFS (without Long COVID), and healthy controls.

We demonstrated that the DSQ-SF can accurately classify these populations with high degrees of sensitivity and specificity. In turn, we identified the particular DSQ-SF symptom items that best distinguish Long COVID from ME/CFS, as well as those that differentiate these illness groups from healthy controls.

Source: McGarrigle WJ, Furst J, Jason LA. Psychometric evaluation of the DePaul Symptom Questionnaire-Short Form (DSQ-SF) among adults with Long COVID, ME/CFS, and healthy controls: A machine learning approach. J Health Psychol. 2024 Jan 28:13591053231223882. doi: 10.1177/13591053231223882. Epub ahead of print. PMID: 38282368. https://pubmed.ncbi.nlm.nih.gov/38282368/

Low handgrip strength is associated with worse functional outcomes in long-Covid

Abstract:

The diagnosis of long-Covid is troublesome, even when functional limitations are present. Dynapenia is a decrease in muscle strength and power production and may explain in part these limitations. This study aimed to identify the distribution and possible association of dynapenia with functional assessment in patients with long-Covid.

A total of 113 inpatients with COVID-19 were evaluated by functional assessment 120 days post-acute severe disease. Body composition, respiratory muscle strength, spirometry, six-minute walk test (6MWT) and hand-grip strength (HGS) were assessed.

Dynapenia was defined as HGS < 30 kgf (men) and < 20 kgf (women). Twenty-five (22%) participants were dynapenic, presenting lower muscle mass (p < 0.001), worse forced expiratory volume in the first second (FEV1) (p = 0.0001), lower forced vital capacity (p < 0.001), and lower inspiratory (p = 0.007) and expiratory (p = 0.002) peak pressures, as well as worse 6MWT performance (p < 0.001). Dynapenia was associated with worse FEV1, maximal expiratory pressure (MEP), and 6MWT, independent of age (p < 0.001).
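
The sex-specific cut-offs above translate directly into a screening rule. The sketch below is illustrative only: it reads the abstract's unit as kilogram-force (an assumption), and the names `DYNAPENIA_CUTOFF_KGF` and `is_dynapenic` are hypothetical.

```python
# Cut-offs from the abstract; unit assumed to be kilogram-force (kgf).
DYNAPENIA_CUTOFF_KGF = {"male": 30.0, "female": 20.0}


def is_dynapenic(hgs_kgf, sex):
    """Return True if handgrip strength falls below the sex-specific cut-off."""
    try:
        cutoff = DYNAPENIA_CUTOFF_KGF[sex]
    except KeyError:
        raise ValueError(f"unknown sex category: {sex!r}") from None
    return hgs_kgf < cutoff


# Share of dynapenic participants reported in the abstract: 25 of 113.
dynapenic_share = 25 / 113
dynapenic_percent = round(dynapenic_share * 100)
```

The comparison is strict, so a man measuring exactly 30 kgf is not classified as dynapenic under this reading of the definition.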

Patients with dynapenia had higher ICU admission rates (p = 0.01) and need for invasive mechanical ventilation (p = 0.007) during hospitalization. The HGS is a simple, reliable, and low-cost measurement that can be performed in outpatient clinics in low- and middle-income countries. Thus, HGS may be used as a proxy indicator of functional impairment in this population.

Source: Camila Miriam Suemi Sato Barros do Amaral, Cássia da Luz Goulart, Bernardo Maia da Silva et al. Low handgrip strength is associated with worse functional outcomes in long-Covid, 11 December 2023, PREPRINT (Version 1) available at Research Square [https://doi.org/10.21203/rs.3.rs-3695556/v1] https://www.researchsquare.com/article/rs-3695556/v1 (Full text)

Complement dysregulation is a predictive and therapeutically amenable feature of long COVID

Abstract:

Background: Long COVID encompasses a heterogeneous set of ongoing symptoms that affect many individuals after recovery from infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The underlying biological mechanisms nonetheless remain obscure, precluding accurate diagnosis and effective intervention. Complement dysregulation is a hallmark of acute COVID-19 but has not been investigated as a potential determinant of long COVID.

Methods: We quantified a series of complement proteins, including markers of activation and regulation, in plasma samples from healthy convalescent individuals with a confirmed history of infection with SARS-CoV-2 and age/ethnicity/gender/infection/vaccine-matched patients with long COVID.

Findings: Markers of classical (C1s-C1INH complex), alternative (Ba, iC3b), and terminal pathway (C5a, TCC) activation were significantly elevated in patients with long COVID. These markers in combination had a receiver operating characteristic predictive power of 0.794. Other complement proteins and regulators were also quantitatively different between healthy convalescent individuals and patients with long COVID. Generalized linear modeling further revealed that a clinically tractable combination of just four of these markers, namely the activation fragments iC3b, TCC, Ba, and C5a, had a predictive power of 0.785.

Conclusions: These findings suggest that complement biomarkers could facilitate the diagnosis of long COVID and further suggest that currently available inhibitors of complement activation could be used to treat long COVID.

Source: Kirsten Baillie, Helen E Davies, Samuel B K Keat, Kristin Ladell, Kelly L Miners, Samantha A Jones, Ermioni Mellou, Erik J M Toonen, David A Price, B Paul Morgan, Wioleta M Zelek. Complement dysregulation is a predictive and therapeutically amenable feature of long COVID. medRxiv 2023.10.26.23297597; doi: https://doi.org/10.1101/2023.10.26.23297597 https://www.medrxiv.org/content/10.1101/2023.10.26.23297597v1.full-text (Full text)

Monocytes subpopulations pattern in the acute respiratory syndrome coronavirus 2 virus infection and after long COVID-19

Abstract:

Introduction and objective: The present study sought to characterize the pattern of monocyte subpopulations in patients with acute SARS-CoV-2 infection or with long COVID-19 syndrome, compared with monocytes from patients with Zika virus (Zika) or chikungunya virus (CHIKV) infection.

Casuistry: Study with 89 peripheral blood samples from patients, who underwent hemogram and serology (IgG and IgM) for detection of Zika (Control Group 1, n = 18) or CHIKV (Control Group 2, n = 9), and from patients who underwent hemogram and reverse transcription polymerase chain reaction for detection of SARS-CoV-2 at the acute phase of the disease (Group 3, n = 19); and of patients who presented long COVID-19 syndrome (Group 4, n = 43). The monocyte and subpopulations counts were performed by flow cytometry.

Results: No significant difference was observed in the total number of monocytes between the groups. The classical (CD14++CD16-) and intermediate (CD14+CD16+) monocyte counts were increased in patients with acute infection or with long COVID-19 syndrome. The monocyte subpopulation counts were lower in patients with Zika or CHIKV infection.

Conclusion: The increase in monocyte subpopulations in patients with acute infection or with long COVID-19 syndrome may be an important finding that differentiates these conditions from Zika or CHIKV infection.

Source: Pereira VIC, de Brito Junior LC, Falcão LFM, da Costa Vasconcelos PF, Quaresma JAS, Berg AVVD, Paixão APS, Ferreira RIS, Diks IBC. Monocytes subpopulations pattern in the acute respiratory syndrome coronavirus 2 virus infection and after long COVID-19. Int Immunopharmacol. 2023 Oct 5;124(Pt B):110994. doi: 10.1016/j.intimp.2023.110994. Epub ahead of print. PMID: 37804653. https://www.sciencedirect.com/science/article/abs/pii/S156757692301319X