Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts

Research output: Contribution to journal › Journal article › Research › peer-review

  • Ying Wang
  • Shinichi Namba
  • Esteban A. Lopera-Maya
  • Sini Kerminen
  • Kristin Tsuo
  • Kristi Läll
  • Masahiro Kanai
  • Wei Zhou
  • Kuan Han H. Wu
  • Marie Julie Favé
  • Laxmi Bhatta
  • Philip Awadalla
  • Ben M. Brumpton
  • Patrick Deelen
  • Kristian Hveem
  • Valeria Lo Faro
  • Reedik Mägi
  • Yoshinori Murakami
  • Serena Sanna
  • Jordan W. Smoller
  • Jasmina Uzunovic
  • Brooke N. Wolford
  • Humaira Rasheed
  • Jibril B. Hirbo
  • Arjun Bhattacharya
  • Huiling Zhao
  • Ida Surakka
  • Sinéad B. Chapman
  • Juha Karjalainen
  • Mitja Kurki
  • Maasha Mutaamba
  • Juulia J. Partanen
  • Sameer Chavan
  • Tzu Ting Chen
  • Michelle Daya
  • Yi Ding
  • Yen Chen A. Feng
  • Christopher R. Gignoux
  • Sarah E. Graham
  • Whitney E. Hornsby
  • Nathan Ingold
  • Ruth Johnson
  • Triin Laisk
  • Kuang Lin
  • Jun Lv
  • Iona Y. Millwood
  • Ruth Loos
  • BBJ
  • BioMe
  • BioVU
  • Canadian Partnership for Tomorrow's Health/OHS
  • China Kadoorie Biobank Collaborative Group
  • Colorado Center for Personalized Medicine
  • deCODE Genetics
  • ESTBB
  • FinnGen
  • Generation Scotland
  • Genes & Health
  • LifeLines
  • Mass General Brigham Biobank
  • Michigan Genomics Initiative
  • QIMR Berghofer Biobank
  • Taiwan Biobank
  • The HUNT Study
  • UCLA ATLAS Community Health Initiative
  • UKBB

Polygenic risk scores (PRSs) have been widely explored in precision medicine. However, few studies have thoroughly investigated best practices for PRSs in global populations across different diseases. Here, we used data from the Global Biobank Meta-analysis Initiative (GBMI) to explore methodological considerations and PRS performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRSs using pruning and thresholding (P + T) and PRS-continuous shrinkage (PRS-CS). For both methods, using a European-based linkage disequilibrium (LD) reference panel resulted in prediction accuracy comparable to or higher than that obtained with several non-European-based panels. PRS-CS overall outperformed the classic P + T method, especially for endpoints with higher SNP-based heritability. Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using GBMI resources and highlight the importance of best practices for PRS in the biobank-scale genomics era.

Original language: English
Article number: 100241
Journal: Cell Genomics
Volume: 3
Issue number: 1
Number of pages: 18
ISSN: 2666-979X
Publication status: Published - 2023

Bibliographical note

Publisher Copyright:
© 2022 The Author(s)

Research areas

  • accuracy heterogeneity, Global-Biobank Meta-analysis Initiative, multi-ancestry genetic prediction, polygenic risk scores

ID: 351001390