BioMapAI: Artificial Intelligence Multi-Omics Modeling of Myalgic Encephalomyelitis / Chronic Fatigue Syndrome

Abstract:

Chronic diseases like ME/CFS and long COVID exhibit high heterogeneity with multifactorial etiology and progression, complicating diagnosis and treatment. To address this, we developed BioMapAI, an explainable Deep Learning framework using the richest longitudinal multi-‘omics dataset for ME/CFS to date.

This dataset includes gut metagenomics, plasma metabolome, immune profiling, blood labs, and clinical symptoms. By connecting multi-‘omics to a symptom matrix, BioMapAI identified both disease- and symptom-specific biomarkers, reconstructed symptoms, and achieved state-of-the-art precision in disease classification. We also created the first connectivity map of these ‘omics in both healthy and disease states and revealed how microbiome-immune-metabolome crosstalk shifted from healthy to ME/CFS.

Thus, we proposed several innovative mechanistic hypotheses for ME/CFS: Disrupted microbial functions – SCFA (butyrate), BCAA (amino acid), tryptophan, benzoate – lost connection with plasma lipids and bile acids, and activated inflammatory and mucosal immune cells (MAIT, γδT cells) with IFNγ and GzA secretion. These abnormal dynamics are linked to key disease symptoms, including gastrointestinal issues, fatigue, and sleep problems.

Source: Xiong R, Fleming E, Caldwell R, Vernon SD, Kozhaya L, Gunter C, Bateman L, Unutmaz D, Oh J. BioMapAI: Artificial Intelligence Multi-Omics Modeling of Myalgic Encephalomyelitis / Chronic Fatigue Syndrome. bioRxiv [Preprint]. 2024 Jun 28:2024.06.24.600378. doi: 10.1101/2024.06.24.600378. PMID: 38979186; PMCID: PMC11230215. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11230215/ (Full text available as PDF file)

A synthetic data generation system for myalgic encephalomyelitis/chronic fatigue syndrome questionnaires

Abstract:

Artificial intelligence or machine-learning-based models have proven useful for better understanding various diseases in all areas of health science. Myalgic Encephalomyelitis or chronic fatigue syndrome (ME/CFS) lacks objective diagnostic tests. Some validated questionnaires are used for diagnosis and assessment of disease progression.

The availability of a sufficiently large database of these questionnaires facilitates research into new models that can predict profiles that help to understand the etiology of the disease. A synthetic data generator provides the scientific community with databases that preserve the statistical properties of the original, free of legal restrictions, for use in research and education.

The initial databases came from the Vall Hebron Hospital Specialized Unit in Barcelona, Spain. 2522 patients diagnosed with ME/CFS were analyzed. Their answers to questionnaires related to the symptoms of this complex disease were used as training datasets and fed to deep learning algorithms, which produced models with high accuracy [0.69-0.81]. The final model requires SF-36 responses and returns responses from HAD, SCL-90R, FIS8, FIS40, and PSQI questionnaires. A highly reliable and easy-to-use synthetic data generator is offered for research and educational use in this disease, for which there is currently no approved treatment.

Source: Lacasa M, Prados F, Alegre J, Casas-Roma J. A synthetic data generation system for myalgic encephalomyelitis/chronic fatigue syndrome questionnaires. Sci Rep. 2023 Aug 31;13(1):14256. doi: 10.1038/s41598-023-40364-6. PMID: 37652910; PMCID: PMC10471690. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10471690/ (Full text)

A Proposed Explainable Artificial Intelligence-Based Machine Learning Model for Discriminative Metabolites for Myalgic Encephalomyelitis/Chronic Fatigue Syndrome

Abstract:

Background: Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a complex and debilitating disease with a significant global prevalence of over 65 million individuals. It affects various systems, including the immune, neurological, gastrointestinal, and circulatory systems. Studies have shown abnormalities in immune cell types, increased inflammatory cytokines, and brain abnormalities. Further research is needed to identify consistent biomarkers and develop targeted therapies. A multidisciplinary approach is essential for diagnosing, treating, and managing this complex disease.

The current study aims to employ explainable artificial intelligence (XAI) and machine learning (ML) techniques to identify discriminative metabolites for ME/CFS.

Material and Methods: The present study used a metabolomics dataset of CFS patients and healthy controls, including 26 healthy controls and 26 ME/CFS patients aged 22-72. The dataset encapsulated 768 metabolites, classified into nine metabolic super-pathways: amino acids, carbohydrates, cofactors, vitamins, energy, lipids, nucleotides, peptides, and xenobiotics.

Random forest-based feature selection and Bayesian approach-based hyperparameter optimization were implemented on the target data. Four different ML algorithms [Gaussian Naive Bayes (GNB), Gradient Boosting Classifier (GBC), Logistic Regression (LR) and Random Forest Classifier (RFC)] were used to classify individuals as ME/CFS patients or healthy individuals. XAI approaches were applied to clinically explain the prediction decisions of the optimum model. Performance evaluation was performed using the indices of accuracy, precision, recall, F1 score, Brier score, and AUC.
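As an illustration of the six performance indices named above, the following is a minimal sketch in plain Python, not the authors' code. The function name `classification_metrics`, the 0.5 decision threshold, and the label encoding (1 = ME/CFS patient, 0 = healthy control) are assumptions made for this example.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int],
                           y_prob: Sequence[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1, Brier score and AUC.

    y_true: 1 for an ME/CFS patient, 0 for a healthy control (assumed encoding).
    y_prob: predicted probability of the positive (ME/CFS) class.
    """
    # Turn probabilities into hard labels at the chosen threshold.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = len(y_true)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Brier score: mean squared error of the predicted probabilities
    # (lower is better, 0 is perfect).
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / n

    # AUC: fraction of (positive, negative) pairs in which the positive
    # sample receives the higher score; ties count as half.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if pos and neg:
        wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
                   for a in pos for b in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "brier": brier, "auc": auc}


if __name__ == "__main__":
    # Made-up labels and probabilities, for demonstration only.
    print(classification_metrics([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

The pairwise AUC used here is quadratic in the number of samples, which is acceptable for a cohort of 52 subjects. A rank-based formula would be preferable for larger datasets.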

Results: The metabolites C-glycosyltryptophan, oleoylcholine, cortisone, and 3-hydroxydecanoate were determined to be crucial for ME/CFS diagnosis.

The RFC learning model outperformed GNB, GBC, and LR in ME/CFS prediction using the 1000-iteration bootstrapping method, achieving 98% accuracy, precision, recall, and F1 score, a Brier score of 0.01, and 99% AUC.
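The abstract does not say how the 1000 bootstrap iterations were applied. One common reading is to resample the evaluated predictions with replacement and summarise the metric across the resamples. The sketch below follows that assumption. The function name `bootstrap_accuracy`, the fixed seed, and the 95% percentile interval are illustrative choices, and the data in the usage example are made up.

```python
import random
from statistics import mean
from typing import Sequence, Tuple


def bootstrap_accuracy(y_true: Sequence[int],
                       y_pred: Sequence[int],
                       n_iterations: int = 1000,
                       seed: int = 0) -> Tuple[float, float, float]:
    """Return (mean accuracy, lower bound, upper bound) over bootstrap resamples.

    Each iteration draws len(y_true) indices with replacement and scores
    accuracy on that resample; the bounds form a 95% percentile interval.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(y_true)
    scores = []
    for _ in range(n_iterations):
        idx = [rng.randrange(n) for _ in range(n)]
        correct = sum(1 for i in idx if y_true[i] == y_pred[i])
        scores.append(correct / n)

    scores.sort()
    lower = scores[int(0.025 * n_iterations)]
    upper = scores[min(n_iterations - 1, int(0.975 * n_iterations))]
    return mean(scores), lower, upper


if __name__ == "__main__":
    # Hypothetical cohort shaped like the study: 26 patients, 26 controls,
    # with a single misclassified subject.
    y_true = [1] * 26 + [0] * 26
    y_pred = [1] * 25 + [0] * 27
    avg, lo, hi = bootstrap_accuracy(y_true, y_pred)
    print(f"accuracy {avg:.3f} (95% interval {lo:.3f} to {hi:.3f})")
```

With only 52 subjects, a single misclassification shifts accuracy by about two percentage points, so the width of the interval is worth reporting alongside the point estimate.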

Conclusion: The RFC model proposed in this study correctly classified and evaluated ME/CFS patients through the selected biomarker candidate metabolites. The methodology combining ML and XAI can provide a clear interpretation of risk estimation for ME/CFS, helping physicians intuitively understand the impact of key metabolomics features in the model.

Source: Yagin, F.H., Alkhateeb, A., Raza, A., Samee, N.A., Mahmoud, N.F., Colak, C., & Yagin, B. (2023). A Proposed Explainable Artificial Intelligence-Based Machine Learning Model for Discriminative Metabolites for Myalgic Encephalomyelitis/Chronic Fatigue Syndrome. Preprints. https://doi.org/10.20944/preprints202307.1585.v1 https://www.preprints.org/manuscript/202307.1585/v1 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10706650/ (Full text of completed study)

Long-COVID diagnosis: From diagnostic to advanced AI-driven models

Abstract:

SARS-CoV-2 is recognized to be responsible for a multi-organ syndrome. In most patients, symptoms are mild. However, in certain subjects, COVID-19 tends to progress more severely. Most patients infected with SARS-CoV-2 fully recover within a few weeks. In a considerable number of patients, as with many other viral infections, various long-lasting symptoms have been described, now defined as “long COVID-19 syndrome”. Given the high number of infections worldwide, it is necessary to understand this emerging pathology to enable early diagnosis and improve patient outcomes.

In this scenario, AI-based models can be applied in long-COVID-19 patients to assist clinicians and, at the same time, to reduce the considerable impact on care and rehabilitation units. The purpose of this manuscript is to review different aspects of long-COVID-19 syndrome from clinical presentation to diagnosis, highlighting the considerable impact that AI can have.

Source: Cau R, Faa G, Nardi V, Balestrieri A, Puig J, Suri JS, SanFilippo R, Saba L. Long-COVID diagnosis: From diagnostic to advanced AI-driven models. Eur J Radiol. 2022 Jan 19;148:110164. doi: 10.1016/j.ejrad.2022.110164. Epub ahead of print. PMID: 35114535; PMCID: PMC8791239. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8791239/ (Full text)