MagicalRsq-X - Loos Group

Novo Nordisk Foundation
Center for Basic Metabolic Research

MagicalRsq-X: A cross-cohort transferable genotype imputation quality metric

Research output: Contribution to journal › Journal article › Research › peer-review

Quan Sun
Yingxi Yang
Jonathan D Rosen
Jiawen Chen
Xihao Li
Wyliena Guan
Min-Zhi Jiang
Jia Wen
Rhonda G Pace
Scott M Blackman
Michael J Bamshad
Ronald L Gibson
Garry R Cutting
Wanda K O'Neal
Michael R Knowles
Charles Kooperberg
Alexander P Reiner
Laura M Raffield
April P. Carson
Stephen S. Rich
Jerome I. Rotter
Eimear Kenny
Byron C Jaeger
Yuan-I Min
Christian Fuchsberger
Yun Li

Since genotype imputation was introduced, researchers have relied on the imputation quality estimated by imputation software to perform post-imputation quality control (QC). However, this quality estimate (denoted Rsq) performs less well for lower-frequency variants. We recently published MagicalRsq, a machine-learning-based imputation quality calibration that leverages additional typed markers from the same cohort and outperforms Rsq as a QC metric. In this work, we extended the original MagicalRsq to allow cross-cohort model training and named the new model MagicalRsq-X. We removed the cohort-specific estimated minor allele frequency and included linkage disequilibrium scores and recombination rates as additional features. Leveraging whole-genome sequencing data from TOPMed, specifically participants in the BioMe, JHS, WHI, and MESA studies, we performed comprehensive cross-cohort evaluations for individuals of predominantly European and African ancestry, based on their inferred global ancestry with the 1000 Genomes and Human Genome Diversity Project data as reference. Our results suggest that MagicalRsq-X outperforms Rsq in almost every setting, with a 7.3%-14.4% improvement in squared Pearson correlation with true R², corresponding to gains of 85K-218K variants. We further developed a metric to quantify the genetic distance of a target cohort relative to a reference cohort and showed that this metric largely explained the performance of MagicalRsq-X models. Finally, we found that MagicalRsq-X saved up to 53 known genome-wide significant variants in one of the largest blood cell trait GWASs that would have been missed using the original Rsq for QC. In conclusion, MagicalRsq-X shows superiority for post-imputation QC and benefits genetic studies by distinguishing well-imputed from poorly imputed lower-frequency variants.
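The evaluation criterion named in the abstract, squared Pearson correlation between a quality metric and true R², can be illustrated with a minimal sketch. This is not the authors' code: the function names, the synthetic data, and the 0.8 QC cutoff are assumptions made for illustration only. True R² is taken here in its usual sense, the squared correlation between sequenced genotypes and imputed dosages for a variant.

```python
import numpy as np


def squared_pearson(x, y):
    """Squared Pearson correlation between two 1-D arrays."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r * r)


def true_r2(true_genotypes, imputed_dosages):
    """True imputation R2 for one variant, computed across individuals."""
    return squared_pearson(true_genotypes, imputed_dosages)


def well_imputed_kept(metric, truth, cutoff=0.8):
    """Count variants that pass QC on `metric` and are truly well imputed.

    The cutoff of 0.8 is a hypothetical choice, not a value from the paper.
    """
    metric = np.asarray(metric)
    truth = np.asarray(truth)
    return int(np.sum((metric >= cutoff) & (truth >= cutoff)))


# Synthetic per-variant values (made up for illustration, not real data):
# `truth` stands in for true R2, and the two metrics are noisy copies of it.
rng = np.random.default_rng(0)
truth = rng.uniform(0.0, 1.0, size=1000)
noisy_metric = np.clip(truth + rng.normal(0.0, 0.3, size=1000), 0.0, 1.0)
calibrated_metric = np.clip(truth + rng.normal(0.0, 0.1, size=1000), 0.0, 1.0)

print("noisy metric vs truth:     ", squared_pearson(noisy_metric, truth))
print("calibrated metric vs truth:", squared_pearson(calibrated_metric, truth))
print("variants kept (noisy):     ", well_imputed_kept(noisy_metric, truth))
print("variants kept (calibrated):", well_imputed_kept(calibrated_metric, truth))
```

In this toy setup the less noisy metric tracks true R² more closely, so a fixed cutoff retains more truly well-imputed variants. The "variant gains" reported in the abstract are presumably a count of a similar kind, though the exact definition is given in the paper rather than here.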

Original language	English
Journal	American Journal of Human Genetics
Volume	111
Issue number	5
Pages (from-to)	990-995
Number of pages	6
ISSN	0002-9297
DOIs	https://doi.org/10.1016/j.ajhg.2024.04.001
Publication status	Published - 2024

Research areas

Humans; Polymorphism, Single Nucleotide; Software; Genotype; Gene Frequency; Cohort Studies; Linkage Disequilibrium; Genome-Wide Association Study/methods; Genome, Human; Quality Control; Machine Learning; Whole Genome Sequencing/standards

ID: 392988387
