Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores
Research output: Working paper › Preprint › Research
Standard
Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores. / Hrytsenko, Yana; Shea, Benjamin; Elgart, Michael; Kurniansyah, Nuzulul; Lyons, Genevieve; Morrison, Alanna C; Carson, April P; Haring, Bernhard; Mitchel, Braxton D; Psaty, Bruce M; Jaeger, Byron C; Gu, C Charles; Kooperberg, Charles; Levy, Daniel; Lloyd-Jones, Donald; Choi, Eunhee; Brody, Jennifer A; Smith, Jennifer A; Rotter, Jerome I; Moll, Matthew; Fornage, Myriam; Simon, Noah; Castaldi, Peter; Casanova, Ramon; Chung, Ren-Hua; Kaplan, Robert; Loos, Ruth J F; Kardia, Sharon L R; Rich, Stephen S; Redline, Susan; Kelly, Tanika; O'Connor, Timothy; Zhao, Wei; Kim, Wonji; Guo, Xiuqing; Chen, Yii Der Ida; Sofer, Tamar; Trans-Omics in Precision Medicine Consortium.
medRxiv, 2023.
RIS
TY - UNPB
T1 - Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores
AU - Hrytsenko, Yana
AU - Shea, Benjamin
AU - Elgart, Michael
AU - Kurniansyah, Nuzulul
AU - Lyons, Genevieve
AU - Morrison, Alanna C
AU - Carson, April P
AU - Haring, Bernhard
AU - Mitchel, Braxton D
AU - Psaty, Bruce M
AU - Jaeger, Byron C
AU - Gu, C Charles
AU - Kooperberg, Charles
AU - Levy, Daniel
AU - Lloyd-Jones, Donald
AU - Choi, Eunhee
AU - Brody, Jennifer A
AU - Smith, Jennifer A
AU - Rotter, Jerome I
AU - Moll, Matthew
AU - Fornage, Myriam
AU - Simon, Noah
AU - Castaldi, Peter
AU - Casanova, Ramon
AU - Chung, Ren-Hua
AU - Kaplan, Robert
AU - Loos, Ruth J F
AU - Kardia, Sharon L R
AU - Rich, Stephen S
AU - Redline, Susan
AU - Kelly, Tanika
AU - O'Connor, Timothy
AU - Zhao, Wei
AU - Kim, Wonji
AU - Guo, Xiuqing
AU - Chen, Yii Der Ida
AU - Sofer, Tamar
AU - Trans-Omics in Precision Medicine Consortium
PY - 2023
Y1 - 2023
AB - We construct non-linear machine learning (ML) prediction models for systolic and diastolic blood pressure (SBP, DBP) using demographic and clinical variables and polygenic risk scores (PRSs). We developed a two-model ensemble, consisting of a baseline model, where prediction is based on demographic and clinical variables only, and a genetic model, where we also include PRSs. We evaluate the use of a linear versus a non-linear model at both the baseline and the genetic model levels and assess the improvement in performance when incorporating multiple PRSs. We report the ensemble model's performance as percentage variance explained (PVE) on a held-out test dataset. A non-linear baseline model improved the PVEs from 28.1% to 30.1% (SBP) and 14.3% to 17.4% (DBP) compared with a linear baseline model. Including seven PRSs in the genetic model computed based on the largest available GWAS of SBP/DBP improved the genetic model PVE from 4.8% to 5.1% (SBP) and 4.7% to 5% (DBP) compared to using a single PRS. Adding additional 14 PRSs computed based on two independent GWASs further increased the genetic model PVE to 6.3% (SBP) and 5.7% (DBP). PVE differed across self-reported race/ethnicity groups, with primarily all non-White groups benefitting from the inclusion of additional PRSs.
U2 - 10.1101/2023.12.13.23299909
DO - 10.1101/2023.12.13.23299909
M3 - Preprint
C2 - 38168328
BT - Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores
PB - medRxiv
ER -
ID: 378953572