Abstract:
COVID-19 has had a tremendous impact on quality of life, work, and society, and this impact has been exacerbated by the progression of COVID-19 into Long COVID. Long COVID is not a specific disease or symptom but a set of wide-ranging conditions that linger in COVID-19 patients for four weeks or more after the initial COVID-19 detection. This relatively new condition is challenging to study because of a lack of prior research and data specific to the pediatric population, which comprises 25.24% of all Long COVID cases under study.
In addition, it is not well understood who may develop Long COVID. Various comorbidities could provide insights into the path leading toward a patient’s Long COVID detection, as referenced in Berg et al. (2022). We therefore address two research questions in our study. First, what chronic comorbidities are prevalent in pediatric patients exhibiting Long COVID symptoms? Second, what non-chronic conditions are associated with pediatric patients diagnosed with Long COVID?
To address these research questions, we use data on 80,000 pediatric Long COVID patients from the N3C (National COVID Cohort Collaboration), collected across 72 healthcare units in the US. The model we developed has three stages. First, we apply network analytics techniques to identify pre-existing chronic and non-chronic conditions among those diagnosed with Long COVID. Second, using the CDC’s definition of Long COVID, we develop a bipartite network representing a large pediatric population diagnosed with COVID-19 who subsequently developed Long COVID. This bipartite network has patients on one side and diseases on the other, with no connections among patients or among diseases. We project the network onto the disease side to create a disease-disease projection graph. Third, we process the projected disease-disease graph to create a bipartite network with pre-COVID diseases on one side and Long COVID diseases on the other. We then take the projection of each side to analyze the chronic and non-chronic pre-COVID conditions leading to Long COVID.
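The projection steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the patient IDs, condition names, and the pre-COVID / Long COVID labels are hypothetical toy data, and the helper names `project_onto_diseases` and `split_pre_and_long` are introduced here for illustration only. The sketch also assumes that each condition falls on exactly one side (pre-COVID or Long COVID) and that an edge weight is the number of patients sharing both conditions.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: each (patient, condition) pair is one edge of the
# patient-disease bipartite network. There are no patient-patient or
# disease-disease edges at this stage.
patient_conditions = [
    ("p1", "asthma"), ("p1", "anxiety"), ("p1", "fatigue"),
    ("p2", "asthma"), ("p2", "fatigue"),
    ("p3", "anxiety"), ("p3", "obesity"), ("p3", "headache"),
]


def project_onto_diseases(edges):
    """Project the bipartite network onto the disease side.

    Two diseases are linked when at least one patient has both; the weight
    is the number of such patients (an assumption of this sketch).
    """
    by_patient = defaultdict(set)
    for patient, disease in edges:
        by_patient[patient].add(disease)

    weights = Counter()
    for diseases in by_patient.values():
        # Sorting gives every undirected pair a single canonical key.
        for a, b in combinations(sorted(diseases), 2):
            weights[(a, b)] += 1
    return weights


def split_pre_and_long(disease_graph, pre_covid, long_covid):
    """Keep only edges that join a pre-COVID condition to a Long COVID one.

    The result is again bipartite; keys are (pre_covid, long_covid) pairs.
    """
    bipartite = {}
    for (a, b), weight in disease_graph.items():
        if a in pre_covid and b in long_covid:
            bipartite[(a, b)] = weight
        elif b in pre_covid and a in long_covid:
            bipartite[(b, a)] = weight
    return bipartite


# Hypothetical labels for which side each condition belongs to.
pre_covid = {"asthma", "anxiety", "obesity"}
long_covid = {"fatigue", "headache"}

disease_graph = project_onto_diseases(patient_conditions)
pre_to_long = split_pre_and_long(disease_graph, pre_covid, long_covid)

for (pre, post), weight in sorted(pre_to_long.items()):
    print(f"{pre} -> {post}: shared by {weight} patient(s)")
```

Counting shared patients is only one possible weighting. A real analysis would likely normalize for how common each condition is, which the abstract does not specify.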
The model was implemented on a dataset of 0.5 million pediatric COVID-19 patients from the N3C (2020). In addition to using Spark SQL and PySpark to analyze the data, we used graph visualization tools such as Gephi to run community detection algorithms and create visualizations. Because the overall patient record set is large, we implemented various code optimization techniques for faster processing. This study provides critical building blocks for developing Long COVID prediction and recommendation system models.
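As a rough illustration of the hand-off to a tool such as Gephi, the sketch below writes a weighted disease-disease edge list as CSV with `Source`, `Target`, and `Weight` columns, which Gephi's spreadsheet importer recognizes. It also scores a candidate grouping of diseases with weighted modularity. The abstract does not name the community detection algorithm used, so modularity appears here only as a common quality measure and is an assumption, as are the toy edges, the community labels, and the helper names `to_gephi_csv` and `modularity`.

```python
import csv
import io
from collections import defaultdict

# Hypothetical disease-disease edges (source, target, weight): two tightly
# linked groups of three conditions joined by a single bridging edge.
edges = [
    ("asthma", "eczema", 1), ("eczema", "rhinitis", 1), ("asthma", "rhinitis", 1),
    ("anxiety", "depression", 1), ("depression", "insomnia", 1),
    ("anxiety", "insomnia", 1),
    ("rhinitis", "anxiety", 1),
]


def to_gephi_csv(edges):
    """Return the edge list as CSV text with Source,Target,Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return buffer.getvalue()


def modularity(edges, community_of):
    """Weighted modularity of a partition of an undirected graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2, where m is the
    total edge weight, L_c the weight of edges inside c, and d_c the summed
    weighted degree of the nodes in c. Assumes no self-loops.
    """
    total = sum(weight for _, _, weight in edges)
    inside = defaultdict(float)
    degree = defaultdict(float)
    for a, b, weight in edges:
        degree[community_of[a]] += weight
        degree[community_of[b]] += weight
        if community_of[a] == community_of[b]:
            inside[community_of[a]] += weight
    return sum(
        inside[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree
    )


# Hypothetical candidate partition: one label per disease.
two_groups = {
    "asthma": 0, "eczema": 0, "rhinitis": 0,
    "anxiety": 1, "depression": 1, "insomnia": 1,
}
one_group = {disease: 0 for disease in two_groups}

csv_text = to_gephi_csv(edges)
q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(csv_text)
print(f"modularity, two groups: {q_two:.4f}")
print(f"modularity, one group:  {q_one:.4f}")
```

The two-group partition scores higher than placing every disease in one group, which is the kind of signal a modularity-based community detection method optimizes.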
Source: Kushagra, Kushagra; Joghataee, Mohammad; Gupta, Ashish; Kalgotra, Pankush; and Qin, Xiao, “Modeling Long Covid Disease Network in Pediatric Population” (2023). AMCIS 2023 TREOs. 107. https://aisel.aisnet.org/treos_amcis2023/107