Omics-based computational approaches for biomarker identification, prediction, and treatment of Long COVID

Abstract:

Long COVID, or post-acute sequelae of COVID-19 (PASC), is a major global health problem, with cumulative estimates suggesting that around 400 million people worldwide have been affected. It is characterized by persistent or new symptoms such as fatigue, cognitive impairment, and breathlessness lasting beyond four weeks after acute infection. Diverse clinical manifestations, chronic course, and incompletely understood pathophysiology (including hypotheses involving viral persistence, immune dysregulation, autoimmunity, endothelial dysfunction, and metabolic reprogramming) impede the development of diagnostic criteria, biomarkers, and targeted therapies. We conducted a critical review of 101 Long COVID omics studies, focusing on the computational methods used and their methodological quality.

Using standardized criteria, we evaluated study design, statistical rigor, reproducibility, and clinical relevance across genomics, epigenomics, transcriptomics, proteomics, metabolomics, and multiomics integration, and mapped these findings onto regulatory and translational frameworks. Despite substantial methodological heterogeneity, convergent biological signals emerged.

Genomic studies implicate risk loci in immune and cardiopulmonary pathways. Epigenomic analyses identify differentially methylated regions in immune and circadian genes. Transcriptomic studies reveal persistent dysregulation of innate immune and coagulation pathways, as well as reproducible molecular endotypes. Proteomic studies consistently show abnormalities in the complement cascade and coagulation, with a small panel of complement proteins showing highly reproducible changes across independent cohorts. Metabolomic studies demonstrate sustained mitochondrial dysfunction and altered cellular bioenergetics for up to two years after infection.

Multiomics integration supports at least two major endotypes, characterized by predominant inflammatory versus metabolic dysregulation, and provides a basis for patient stratification and computational treatment discovery. Machine learning models frequently achieve high classification performance, but are rarely externally validated. Critical limitations restrict clinical translation. Most studies are underpowered relative to analytical complexity, use heterogeneous case definitions and controls, and report platform-specific signatures with limited overlap. External validation, preregistered analysis plans, and regulatory-aligned assay development are uncommon. To date, no regulatory-approved diagnostic assay or evidence-based therapeutic intervention has directly emerged from these computational findings.

Future progress requires harmonized phenotyping protocols, adequately powered longitudinal cohorts with external validation, integration of spatial omics and explainable artificial intelligence, and early engagement with regulatory and health-technology assessment pathways. This review provides a critical assessment and a translational roadmap, outlining how methodologically robust computational omics can be advanced toward clinically actionable tools for Long COVID.

Source: Pinero S, Li X, Zhang J, Winter M, Lee SH, Nguyen T, Liu L, Li J, Le TD. Omics-based computational approaches for biomarker identification, prediction, and treatment of Long COVID. Crit Rev Clin Lab Sci. 2026 Jun;63(4):332-358. doi: 10.1080/10408363.2025.2583083. Epub 2025 Dec 9. PMID: 41368891. https://pubmed.ncbi.nlm.nih.gov/41368891/

Toward a Molecular Reclassification of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: Integrating Multi-Omics, Machine Learning, and Precision Medicine

Abstract:

Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is a complex, multi-system disease characterized by a multitude of symptoms across various organ systems. Diagnosis has relied heavily on heterogeneous clinical symptom presentation and evolving case definitions, with treatment focused on addressing presenting symptoms due to the paucity of validated biomarkers. Meanwhile, advances have been made in understanding the underlying pathophysiology through strong epidemiologic, clinical, and basic science studies. This narrative review synthesizes recent advances that are likely to drive a shift in understanding from symptom-based classification toward a molecularly defined understanding of the disease.

This shift in understanding will likely provide the foundation for future research efforts focused on targeting diagnosis and treatment more effectively. Specifically, we reference the identification of rare genetic risk variants through the HEAL2 deep learning framework, the large-scale DecodeME genome-wide association study, and dynamic epigenetic markers of disease state.

In addition, the findings revealed the downstream consequences of this genetic and epigenetic priming: chronic innate immune activation, CD8+ T cell exhaustion characterized by upregulation of the exhaustion-driving transcription factors Thymocyte Selection-Associated HMG Box (TOX) and Eomesodermin (EOMES), and a cellular energy crisis centered on mitochondrial dysfunction. Furthermore, results of recent studies have revealed sex-specific transcriptomic and proteomic signatures of maladaptive recovery.

We also highlight the role of machine learning and artificial intelligence integrations in translating high-dimensional multi-omics data into actionable biological insights, including the identification of monocyte subsets via Positive Unlabeled Learning, circulating cell-free RNA diagnostic signatures, and integrated multi-modal disease models such as BioMapAI.

Together, these findings, which highlight multiple identifiable mechanisms of molecular activity, support the feasibility of molecular subtyping, precision diagnostics, and targeted therapeutic strategies for ME/CFS.

Source: Frank J, Nesterovitch N, Movva C, Klimas NG, Nathanson L. Toward a Molecular Reclassification of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: Integrating Multi-Omics, Machine Learning, and Precision Medicine. Int J Mol Sci. 2026 May 15;27(10):4436. doi: 10.3390/ijms27104436. PMID: 42196410; PMCID: PMC13207433. https://pmc.ncbi.nlm.nih.gov/articles/PMC13207433/ (Full text)

A multiomics recovery factor predicts long COVID in the IMPACC study

Abstract:

BACKGROUND: Following SARS-CoV-2 infection, approximately 10%–35% of patients with COVID-19 experience long COVID (LC), in which debilitating symptoms persist for at least 3 months. Elucidating the biologic underpinnings of LC could identify therapeutic opportunities.

METHODS: We utilized machine learning methods on biologic analytes provided over 12 months after hospital discharge from more than 500 patients with COVID-19 in the IMPACC cohort to identify a multiomics “recovery factor,” trained on patient-reported physical function survey scores. Immune profiling data included PBMC transcriptomics, serum Olink and plasma proteomics, plasma metabolomics, and blood mass cytometry by time of flight (CyTOF) protein levels. Recovery factor scores were tested for association with LC, disease severity, clinical parameters, and immune subset frequencies. Enrichment analyses identified biologic pathways associated with recovery factor scores.

RESULTS: Participants with LC had lower recovery factor scores compared with recovered participants. Recovery factor scores predicted LC as early as hospital admission, irrespective of acute COVID-19 severity. Biologic characterization revealed increased inflammatory mediators, elevated signatures of heme metabolism, and decreased androgenic steroids as predictive and ongoing biomarkers of LC. Lower recovery factor scores were associated with reduced lymphocyte and increased myeloid cell frequencies. The observed signatures are consistent with persistent inflammation driving anemia and stress erythropoiesis as major biologic underpinnings of LC.

CONCLUSION: The multiomics recovery factor identifies patients at risk of LC early after SARS-CoV-2 infection and reveals LC biomarkers and potential treatment targets.

TRIAL REGISTRATION: ClinicalTrials.gov NCT04378777.

FUNDING: National Institute of Allergy and Infectious Diseases (NIAID), NIH (3U01AI167892-03S2, 3U01AI167892-01S2, 5R01AI135803-03, 5U19AI118608-04, 5U19AI128910-04, 4U19AI090023-11, 4U19AI118610-06, R01AI145835-01A1S1, 5U19AI062629-17, 5U19AI057229-17, 5U19AI057229-18, 5U19AI125357-05, 5U19AI128913-03, 3U19AI077439-13, 5U54AI142766-03, 5R01AI104870-07S1, 3U19AI089992-09, 3U19AI128913-03, and 5T32DA018926-1, 3U19AI1289130, U19AI128913-04S1, R01AI122220); NIH (UM1TR004528); and National Science Foundation (NSF) (DMS2310836).

Source: Gabernet G, Maciuch J, Gygi JP, Moore JF, Hoch A, Syphurs C, Chu T, Doni Jayavelu N, Corry DB, Kheradmand F, Baden LR, Sekaly RP, McComsey GA, Haddad EK, Cairns CB, Rouphael N, Fernandez-Sesma A, Simon V, Metcalf JP, Agudelo Higuita NI, Hough CL, Messer WB, Davis MM, Nadeau KC, Pulendran B, Kraft M, Bime C, Reed EF, Schaenman J, Erle DJ, Calfee CS, Atkinson MA, Brakenridge SC, Melamed E, Shaw AC, Hafler DA, Augustine AD, Becker PM, Ozonoff A, Bosinger SE, Eckalbar W, Maecker HT, Kim-Schulze S, Steen H, Krammer F, Westendorf K; IMPACC Network; Peters B, Fourati S, Altman MC, Levy O, Smolen KK, Montgomery RR, Diray-Arce J, Kleinstein SH, Guan L, Ehrlich LI. A multiomics recovery factor predicts long COVID in the IMPACC study. J Clin Invest. 2025 Sep 9;135(21):e193698. doi: 10.1172/JCI193698. PMID: 40924481; PMCID: PMC12582403. https://pmc.ncbi.nlm.nih.gov/articles/PMC12582403/ (Full text)

Use of artificial intelligence and machine learning for the management of fibromyalgia: a scoping review

Abstract:

Background: Fibromyalgia (FM) is a complex and multifactorial syndrome characterized by widespread pain, fatigue, cognitive impairment, and other systemic symptoms. The absence of specific biomarkers and the heterogeneous clinical presentation pose significant diagnostic challenges.

Objective: This scoping review aims to explore the current applications of artificial intelligence (AI) and machine learning (ML) in the diagnosis and clinical management of FM.

Methods: A systematic search was conducted in PubMed, EMBASE, and the Cochrane Library using defined keywords related to FM and AI/ML. Studies were included if they addressed ML applications in FM patients. Following PRISMA-ScR guidelines, 43 studies published between 2011 and 2024 were included and analyzed for ML techniques used, diagnostic targets, data types, and clinical relevance.

Results: As expected, most studies to date focused on improving diagnostic accuracy through supervised algorithms such as support vector machines, neural networks, and ensemble models, as well as unsupervised clustering and dimensionality reduction techniques. Notable findings include the identification of neurophysiological signatures via fMRI, gene expression patterns, retinal imaging changes, and metabolomic biomarkers that distinguish FM patients from controls. For instance, one study investigating circulating microRNAs used a Random Forest model to identify 11 microRNAs (e.g. hsa-miR-28-5p, hsa-miR-29a-3p, hsa-miR-150-5p) capable of differentiating patients with FM, ME/CFS, and healthy controls, suggesting their potential as biomarkers for more accurate diagnoses. Reported model accuracies ranged from 82% to 100%, although most studies were pilot-based with small and imbalanced samples, limiting generalizability.
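The caveat about small and imbalanced samples can be illustrated with a short sketch. The cohort below is hypothetical (45 cases, 5 controls) and is not taken from any of the reviewed studies; the function names are illustrative. It shows how a trivial classifier that labels every participant as a case can reach 90% accuracy while its balanced accuracy stays at chance level.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 1 (case) and 0 (control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all participants labelled correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; insensitive to class proportions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced pilot cohort: 45 cases, 5 controls.
y_true = [1] * 45 + [0] * 5
# Trivial "model" that calls everyone a case.
y_pred = [1] * 50

print("accuracy:", accuracy(y_true, y_pred))                    # 0.9
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))  # 0.5
```

Reporting balanced accuracy (or sensitivity and specificity separately) alongside raw accuracy is one common way to make results from imbalanced pilot samples easier to compare.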

Conclusion: AI and ML offer promising tools to overcome longstanding limitations in FM diagnosis and treatment. While current findings demonstrate significant potential, larger, multicenter studies with rigorous validation protocols are essential to finally establish these approaches as clinically reliable solutions.

Source: Clempi Almeida E Silva AL, Reis VHPF, Lamoglia ASA, Souza Desidério C, Freire Oliveira CJ. Use of artificial intelligence and machine learning for the management of fibromyalgia: a scoping review. J Man Manip Ther. 2026 Feb 17:1-17. doi: 10.1080/10669817.2026.2630999. Epub ahead of print. PMID: 41700030. https://pubmed.ncbi.nlm.nih.gov/41700030/

Metabolomics-Based Machine Learning Diagnostics of Post-Acute Sequelae of SARS-CoV-2 Infection

Abstract:

Background: COVID-19 has taken millions of lives and continues to affect people worldwide. Post-Acute Sequelae of SARS-CoV-2 Infection (also known as Post-Acute Sequelae of COVID-19 (PASC) or more commonly, Long COVID) occurs in the aftermath of COVID-19 and is poorly understood despite its widespread effects.

Methods: We created a machine-learning model that distinguishes PASC from PASC-similar diseases. The model was trained to recognize PASC-dysregulated metabolites (p ≤ 0.05) using molecular descriptors.

Results: Our multi-layer perceptron model accurately recognizes PASC-dysregulated metabolites in the independent testing set, with an AUC-ROC of 0.8991, and differentiates PASC from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), Lyme disease, postural orthostatic tachycardia syndrome (POTS), and irritable bowel syndrome (IBS). However, it was unable to differentiate fibromyalgia (FM) from PASC.

Conclusions: By creating and testing models pairwise on each of these diseases, we elucidated the unique strength of the similarity between FM and PASC relative to other PASC-similar diseases. Our approach is unique to PASC diagnosis, and our use of molecular descriptors enables our model to work with any metabolite where molecular descriptors can be identified, as these descriptors can be generated and compared for any metabolite. Our study presents a novel approach to PASC diagnosis that partially circumvents the lengthy process of exclusion, potentially facilitating faster interventions and improved patient outcomes.

Source: Cai E, Kouznetsova VL, Tsigelny IF. Metabolomics-Based Machine Learning Diagnostics of Post-Acute Sequelae of SARS-CoV-2 Infection. Metabolites. 2025 Dec 17;15(12):801. doi: 10.3390/metabo15120801. PMID: 41441042; PMCID: PMC12734907. https://pmc.ncbi.nlm.nih.gov/articles/PMC12734907/ (Full text)

Leveraging Explainable Automated Machine Learning (AutoML) and Metabolomics for Robust Diagnosis and Pathophysiological Insights in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS)

Abstract:

Background/Objectives: Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is a debilitating complex disease with an elusive etiology, lacking objective diagnostic biomarkers. This study leverages advanced Automated Machine Learning (AutoML) to analyze plasma metabolomic and lipidomic profiles for the purpose of ME/CFS detection.

Methods: We utilized a publicly available dataset comprising 888 metabolic features from 106 ME/CFS patients and 91 matched controls. Three AutoML frameworks (TPOT, Auto-Sklearn, and H2O AutoML) were benchmarked under identical time constraints. Univariate ROC and PLS-DA analyses with cross-validation, permutation testing, and VIP-based feature selection were applied to standardized, log-transformed omics data to identify significant discriminatory metabolites/lipids and assess their intercorrelations.
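As a rough illustration of two of the steps named above (log transformation with standardization, and permutation testing), the following sketch works on a single hypothetical metabolite. The intensity values, function names, and the choice of a mean-difference statistic are assumptions made for the example; the abstract does not specify the authors' exact procedure, which also involved PLS-DA and cross-validation.

```python
import math
import random
import statistics


def log_standardize(values):
    """Natural-log transform, then z-score to mean 0 and sample SD 1."""
    logged = [math.log(v) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)
    return [(x - mean) / sd for x in logged]


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The add-one correction avoids reporting an impossible p-value of zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical raw intensities for one metabolite (not real study data).
patients = [12.1, 15.3, 14.8, 13.9, 16.2, 15.0, 14.1, 13.5]
controls = [9.8, 10.5, 11.2, 10.1, 9.5, 10.9, 11.0, 10.3]

z = log_standardize(patients + controls)
p = permutation_p_value(z[:len(patients)], z[len(patients):])
print("permutation p-value:", p)
```

With 888 features, a per-feature test like this would normally be followed by a multiple-testing correction, which is omitted here for brevity.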

Results: TPOT significantly outperformed its counterparts, achieving an area under the curve (AUC) of 92.1%, accuracy of 87.3%, sensitivity of 85.8%, and specificity of 89.0%. The PLS-DA model revealed a moderate but statistically significant discrimination between ME/CFS and controls. Explainable artificial intelligence (XAI) via SHAP analysis of the optimal TPOT model identified key metabolites implicating dysregulated pathways in mitochondrial energy metabolism (succinic acid, pyruvic acid, leucine), chronic inflammation (prostaglandin D2, 11,12-EET), gut-brain axis communication (glycocholic acid), and cell membrane integrity (pc(35:2)a).
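For readers less familiar with the headline metric, the AUC can be read as the probability that a randomly chosen patient receives a higher model score than a randomly chosen control. The sketch below computes it directly from that definition; the scores are hypothetical and the function name is illustrative, not part of TPOT or any other framework mentioned above.

```python
def roc_auc(scores_cases, scores_controls):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct ranking.
    """
    wins = 0.0
    for s_case in scores_cases:
        for s_control in scores_controls:
            if s_case > s_control:
                wins += 1.0
            elif s_case == s_control:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))


# Hypothetical predicted probabilities from some classifier.
case_scores = [0.9, 0.8, 0.7, 0.4]
control_scores = [0.6, 0.3, 0.2, 0.4]

auc = roc_auc(case_scores, control_scores)
print("AUC:", auc)  # 0.90625 (14.5 of 16 pairs ranked correctly)
```

Because the AUC depends only on how scores are ranked, it is unaffected by the decision threshold, whereas accuracy, sensitivity, and specificity all change when the threshold moves.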

Conclusions: Our results demonstrate that TPOT-derived models not only provide a highly accurate and robust diagnostic tool but also yield biologically interpretable insights into the pathophysiology of ME/CFS, highlighting its potential for clinical decision support and elucidating novel therapeutic targets.

Source: Yagin FH, Colak C, Al-Hashem F, Alzakari SA, Alhussan AA, Aghaei M. Leveraging Explainable Automated Machine Learning (AutoML) and Metabolomics for Robust Diagnosis and Pathophysiological Insights in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS). Diagnostics (Basel). 2025 Oct 30;15(21):2755. doi: 10.3390/diagnostics15212755. PMID: 41226047. https://www.mdpi.com/2075-4418/15/21/2755 (Full text)

Advancing Digital Precision Medicine for Chronic Fatigue Syndrome through Longitudinal Large-Scale Multi-Modal Biological Omics Modeling with Machine Learning and Artificial Intelligence

Abstract:

We studied a generalized question: chronic diseases like ME/CFS and long COVID exhibit high heterogeneity with multifactorial etiology and progression, complicating diagnosis and treatment. To address this, we developed BioMapAI, an explainable Deep Learning framework using the richest longitudinal multi-omics dataset for ME/CFS to date.

This dataset includes gut metagenomics, plasma metabolome, immune profiling, blood labs, and clinical symptoms. By connecting multi-omics to a symptom matrix, BioMapAI identified both disease- and symptom-specific biomarkers, reconstructed symptoms, and achieved state-of-the-art precision in disease classification.

We also created the first connectivity map of these omics in both healthy and disease states and revealed how microbiome-immune-metabolome crosstalk shifted from healthy to ME/CFS.

Source: Xiong R. Advancing Digital Precision Medicine for Chronic Fatigue Syndrome through Longitudinal Large-Scale Multi-Modal Biological Omics Modeling with Machine Learning and Artificial Intelligence. ArXiv [Preprint]. 2025 Jun 18:arXiv:2506.15761v1. PMID: 40980765; PMCID: PMC12447721. https://pmc.ncbi.nlm.nih.gov/articles/PMC12447721/ (Full text available as PDF file)

Circulating cell-free RNA signatures for the characterization and diagnosis of myalgic encephalomyelitis/chronic fatigue syndrome

Abstract:

People living with myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) experience heterogeneous and debilitating symptoms that lack sufficient biological explanation, compounded by the absence of accurate, noninvasive diagnostic tools. To address these challenges, we explored circulating cell-free RNA (cfRNA) as a blood-borne bioanalyte to monitor ME/CFS. cfRNA is released into the bloodstream during cellular turnover and reflects dynamic changes in gene expression, cellular signaling, and tissue-specific processes.

We profiled cfRNA in plasma by RNA sequencing for 93 ME/CFS cases and 75 healthy sedentary controls, then applied machine learning to develop diagnostic models and advance our understanding of ME/CFS pathobiology. A generalized linear model with least absolute shrinkage and selection operator (LASSO) regression trained on condition-specific signatures achieved a test-set AUC of 0.81 and an accuracy of 77%.
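The appeal of an L1-penalized generalized linear model for this kind of data is that it shrinks uninformative coefficients toward, and often exactly to, zero, leaving a sparse signature. The sketch below is a minimal pure-Python illustration using proximal gradient descent on synthetic data in which only the first of five features carries signal. The data, hyperparameters, and function names are assumptions for illustration; this is not the authors' pipeline, and a real analysis would use an established library.

```python
import math
import random


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept b is left unpenalized; lam controls the sparsity.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic data: 200 samples, 5 features, only feature 0 is informative.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [1 if row[0] + 0.5 * rng.gauss(0, 1) > 0 else 0 for row in X]

w, b = fit_l1_logistic(X, y)
print("coefficients:", [round(wj, 3) for wj in w])
```

In this setup the coefficient on the informative feature should dominate, with the noise features shrunk to or near zero. How many features survive depends on the penalty strength `lam`, which in practice is usually chosen by cross-validation.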

Immune cfRNA deconvolution revealed differences in platelet-derived cfRNA between cases and controls, as well as elevated levels of plasmacytoid dendritic, monocyte, and T cell-derived cfRNA in ME/CFS. Biological network analysis further implicated immune dysfunction in ME/CFS, with signatures of cytokine signaling and T cell exhaustion. These findings demonstrate the utility of RNA liquid biopsy as a minimally invasive tool for unraveling the complex biology behind chronic illnesses.

Source: Gardella AE, Eweis-LaBolle D, Loy CJ, Belcher ED, Lenz JS, Franconi CJ, Scofield SY, Grimson A, Hanson MR, De Vlaminck I. Circulating cell-free RNA signatures for the characterization and diagnosis of myalgic encephalomyelitis/chronic fatigue syndrome. Proc Natl Acad Sci U S A. 2025 Aug 19;122(33):e2507345122. doi: 10.1073/pnas.2507345122. Epub 2025 Aug 11. PMID: 40789036. https://pubmed.ncbi.nlm.nih.gov/40789036/

The Implications and Predictability of Sleep Reversal for People with Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: A Machine Learning Approach

Abstract:

Background/objectives: Impaired sleep is one of the core symptoms of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), yet the mechanisms and impact of sleep-related issues are poorly understood. Sleep dysfunctions for patients with ME/CFS include frequent napping, difficulties falling asleep, waking up early, and sleep reversal patterns (e.g., sleeping throughout the day and staying awake throughout the night). The current study focuses on sleep reversal for patients with ME/CFS.

Methods: We explored the symptoms and functional impairment of those with and without sleep reversal by analyzing the responses of a large international sample (N = 2313) using the DePaul Symptom Questionnaire (DSQ) and Medical Outcomes Study 36-item Short-Form Health Survey (SF-36).

Results: We found that those in our Sleep Reversal group (N = 327) compared to those without sleep reversal (N = 1986) reported higher symptom burden for 53 out of 54 DSQ symptoms and greater impairments for all six SF-36 subscales. The most accurate predictors of sleep reversal included age (p < 0.05), body mass index (p < 0.05), eleven DSQ symptoms (p < 0.01), and two SF-36 subscales (p < 0.01).

Conclusions: These features provide clues regarding some of the possible pathophysiological underpinnings of sleep reversal among those with ME/CFS.

Source: Dietrich MP, Pravin R, Furst J, Jason LA. The Implications and Predictability of Sleep Reversal for People with Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: A Machine Learning Approach. Healthcare (Basel). 2025 May 26;13(11):1255. doi: 10.3390/healthcare13111255. PMID: 40508869. https://www.mdpi.com/2227-9032/13/11/1255 (Full text)

Using Single-Cell Raman Microspectroscopy to Profile Human Peripheral Blood Mononuclear Cells

Abstract:

A reliable, validated test would enhance our ability to treat and research chronic conditions. Early and accurate diagnosis would provide an entry point into clinical care, give access to benefits, remove the stigma associated with these conditions, and importantly, provide researchers with a fundamental tool they require to study these heterogeneous disorders.

In this chapter, we describe how Raman microspectroscopy can be utilised to study the biology of peripheral blood mononuclear cells (PBMCs) isolated from human blood samples. Using machine learning approaches, the data generated can be used to attempt to separate different patient and control groups, subgroups within a patient cohort, and identify differences in intracellular metabolites which may provide clues about disease mechanisms.

Source: Gan E, Stoker M, Guo E, Morten KJ, Xu J. Using Single-Cell Raman Microspectroscopy to Profile Human Peripheral Blood Mononuclear Cells. Methods Mol Biol. 2025;2920:29-37. doi: 10.1007/978-1-0716-4498-0_3. PMID: 40372676. https://link.springer.com/protocol/10.1007/978-1-0716-4498-0_3