Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores

Research output: Working paper, Preprint, Research

Documents

  • Fulltext

    Submitted manuscript, 1.35 MB, PDF document

  • Yana Hrytsenko
  • Benjamin Shea
  • Michael Elgart
  • Nuzulul Kurniansyah
  • Genevieve Lyons
  • Alanna C Morrison
  • April P Carson
  • Bernhard Haring
  • Braxton D Mitchell
  • Bruce M Psaty
  • Byron C Jaeger
  • C Charles Gu
  • Charles Kooperberg
  • Daniel Levy
  • Donald Lloyd-Jones
  • Eunhee Choi
  • Jennifer A Brody
  • Jennifer A Smith
  • Jerome I Rotter
  • Matthew Moll
  • Myriam Fornage
  • Noah Simon
  • Peter Castaldi
  • Ramon Casanova
  • Ren-Hua Chung
  • Robert Kaplan
  • Ruth Loos
  • Sharon L R Kardia
  • Stephen S Rich
  • Susan Redline
  • Tanika Kelly
  • Timothy O'Connor
  • Wei Zhao
  • Wonji Kim
  • Xiuqing Guo
  • Yii Der Ida Chen
  • Tamar Sofer
  • Trans-Omics in Precision Medicine Consortium

We construct non-linear machine learning (ML) prediction models for systolic and diastolic blood pressure (SBP, DBP) using demographic and clinical variables and polygenic risk scores (PRSs). We develop a two-model ensemble consisting of a baseline model, in which prediction is based on demographic and clinical variables only, and a genetic model, in which PRSs are also included. We compare linear and non-linear models at both the baseline and the genetic model levels and assess the improvement in performance from incorporating multiple PRSs. We report the ensemble model's performance as percentage variance explained (PVE) on a held-out test dataset. A non-linear baseline model improved the PVE from 28.1% to 30.1% (SBP) and from 14.3% to 17.4% (DBP) compared with a linear baseline model. Including seven PRSs, computed from the largest available GWAS of SBP/DBP, in the genetic model improved the genetic model PVE from 4.8% to 5.1% (SBP) and from 4.7% to 5% (DBP) compared with using a single PRS. Adding 14 further PRSs, computed from two independent GWASs, increased the genetic model PVE to 6.3% (SBP) and 5.7% (DBP). PVE differed across self-reported race/ethnicity groups, and the inclusion of additional PRSs primarily benefitted the non-White groups.
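As a rough illustration of the two-model ensemble and the PVE metric described above, the following minimal sketch fits a baseline model on one clinical covariate and then a genetic model on one PRS. Several things here are assumptions rather than details from the abstract: the data are simulated, both stages use simple linear regression as a stand-in for the non-linear models, the genetic model is fitted to the residuals of the baseline model, and PVE is taken as 100 × (1 − residual sum of squares / total sum of squares). The names `pve`, `fit_line`, and `simulate` are hypothetical helpers.

```python
import random


def pve(y_true, y_pred):
    """Percentage variance explained, assumed here to be 100 * (1 - SS_res / SS_tot)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 100.0 * (1.0 - ss_res / ss_tot)


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(n, rng):
    """Simulated (not real) data: age, one standardized PRS, and SBP."""
    age = [rng.uniform(30, 80) for _ in range(n)]
    prs = [rng.gauss(0, 1) for _ in range(n)]
    sbp = [120 + 0.5 * a + 3 * g + rng.gauss(0, 10) for a, g in zip(age, prs)]
    return age, prs, sbp


rng = random.Random(0)
age_tr, prs_tr, sbp_tr = simulate(2000, rng)   # training set
age_te, prs_te, sbp_te = simulate(1000, rng)   # held-out test set

# Baseline model: demographic/clinical covariate only.
b0, b1 = fit_line(age_tr, sbp_tr)

# Genetic model: fitted to the baseline residuals (an assumption of this sketch).
resid_tr = [y - (b0 + b1 * a) for a, y in zip(age_tr, sbp_tr)]
g0, g1 = fit_line(prs_tr, resid_tr)

# Ensemble prediction = baseline prediction + genetic-model prediction.
base_pred = [b0 + b1 * a for a in age_te]
ens_pred = [p + g0 + g1 * g for p, g in zip(base_pred, prs_te)]

pve_baseline = pve(sbp_te, base_pred)
pve_ensemble = pve(sbp_te, ens_pred)
print(f"baseline PVE: {pve_baseline:.1f}%  ensemble PVE: {pve_ensemble:.1f}%")
```

Evaluating both stages on the same held-out set makes the contribution of the PRS visible as the difference between the two PVE values. Swapping either `fit_line` call for a non-linear learner would mirror the linear versus non-linear comparison in the abstract.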

Original language: English
Publisher: medRxiv
Number of pages: 41
Publication status: Published - 2023

ID: 378953572