Abstract:
It is tempting to mine the abundance of DNA data that is now available from direct-to-consumer genetic tests, but this approach also has its pitfalls. A recent study put forth a list of 50 single nucleotide polymorphisms (SNPs) that predispose to Chronic Fatigue Syndrome (CFS), a potentially major advance in understanding this still mysterious condition. However, only the patient cohort data came from a commercial company (23andMe), while the control data came from a genetic database. The extent to which 23andMe data agree with genetic reference databases is unknown.
We reanalyzed the 50 purported CFS SNPs by comparing them to control data specifically from 23andMe, which are available through the public platform OpenSNP. In addition, the large, high-quality database ALFA was used as an additional control. The analysis led to a dramatic change, with the top of the leaderboard for CFS risk reduced and reversed from an astronomical 129,000 times to 0.8. Errors were found both within the 23andMe data and in the Kaviar database control reported by the original study. Only 3 of 50 SNPs survived the initial study criterion of being at least twice as prevalent in patients: those in the EFCAB4B (involved in calcium ion channel activation), LINC01171, and MORN2 genes.
We conclude that the reported top-50 deleterious polymorphisms for Chronic Fatigue Syndrome were more likely the top-50 errors in the 23andMe and Kaviar databases. In general, however, the correlation of the 23andMe control data with ALFA was a respectable 0.93, suggesting an overall usefulness of 23andMe results for research purposes, but only if caution is taken with chips and SNPs.
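As a rough illustration of the two quantities the abstract relies on, the sketch below computes a patient-to-control prevalence ratio, applies the "at least twice as prevalent in patients" criterion, and computes a Pearson correlation between two sets of frequencies. It is not the authors' code: the function names, the SNP labels, and every frequency value are invented for the example, and the real analysis may have differed in how frequencies were derived.

```python
from math import sqrt


def prevalence_ratio(patient_freq, control_freq):
    """Ratio of a variant's frequency in patients to its frequency in controls."""
    if control_freq <= 0:
        raise ValueError("control frequency must be positive")
    return patient_freq / control_freq


def passes_criterion(patient_freq, control_freq, threshold=2.0):
    """True if the variant is at least `threshold` times as prevalent in patients."""
    return prevalence_ratio(patient_freq, control_freq) >= threshold


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (patient, control) frequencies; NOT values from the study.
example_snps = {
    "snp_a": (0.10, 0.04),  # more than twice as prevalent in patients
    "snp_b": (0.06, 0.10),  # ratio below 1: less prevalent in patients
}

survivors = [
    name
    for name, (patient, control) in example_snps.items()
    if passes_criterion(patient, control)
]
print(survivors)
```

A ratio below 1, as for the hypothetical `snp_b`, means the variant is less common in patients than in controls, which is the sense in which a risk estimate can be "reversed" when the control data change.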
Source: Felice L. Bedford, Bastian Greshake Tzovaras. Re-analysis of genetic risks for Chronic Fatigue Syndrome from 23andMe data finds few remain. medRxiv 2020.10.27.20220939; doi: https://doi.org/10.1101/2020.10.27.20220939
Now published in Frontiers in Pediatrics doi: 10.3389/fped.2021.590040 https://www.medrxiv.org/content/10.1101/2020.10.27.20220939v2.full-text (Full text)