Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores

Research output: Working paper, Preprint, Research

Standard

Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores. / Hrytsenko, Yana; Shea, Benjamin; Elgart, Michael; Kurniansyah, Nuzulul; Lyons, Genevieve; Morrison, Alanna C; Carson, April P; Haring, Bernhard; Mitchel, Braxton D; Psaty, Bruce M; Jaeger, Byron C; Gu, C Charles; Kooperberg, Charles; Levy, Daniel; Lloyd-Jones, Donald; Choi, Eunhee; Brody, Jennifer A; Smith, Jennifer A; Rotter, Jerome I; Moll, Matthew; Fornage, Myriam; Simon, Noah; Castaldi, Peter; Casanova, Ramon; Chung, Ren-Hua; Kaplan, Robert; Loos, Ruth J F; Kardia, Sharon L R; Rich, Stephen S; Redline, Susan; Kelly, Tanika; O'Connor, Timothy; Zhao, Wei; Kim, Wonji; Guo, Xiuqing; Chen, Yii Der Ida; Sofer, Tamar; Trans-Omics in Precision Medicine Consortium.

medRxiv, 2023.

Harvard

Hrytsenko, Y, Shea, B, Elgart, M, Kurniansyah, N, Lyons, G, Morrison, AC, Carson, AP, Haring, B, Mitchel, BD, Psaty, BM, Jaeger, BC, Gu, CC, Kooperberg, C, Levy, D, Lloyd-Jones, D, Choi, E, Brody, JA, Smith, JA, Rotter, JI, Moll, M, Fornage, M, Simon, N, Castaldi, P, Casanova, R, Chung, R-H, Kaplan, R, Loos, RJF, Kardia, SLR, Rich, SS, Redline, S, Kelly, T, O'Connor, T, Zhao, W, Kim, W, Guo, X, Chen, YDI, Sofer, T & Trans-Omics in Precision Medicine Consortium 2023 'Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores' medRxiv. https://doi.org/10.1101/2023.12.13.23299909

APA

Hrytsenko, Y., Shea, B., Elgart, M., Kurniansyah, N., Lyons, G., Morrison, A. C., Carson, A. P., Haring, B., Mitchel, B. D., Psaty, B. M., Jaeger, B. C., Gu, C. C., Kooperberg, C., Levy, D., Lloyd-Jones, D., Choi, E., Brody, J. A., Smith, J. A., Rotter, J. I., ... Trans-Omics in Precision Medicine Consortium (2023). Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores. medRxiv. https://doi.org/10.1101/2023.12.13.23299909

Vancouver

Hrytsenko Y, Shea B, Elgart M, Kurniansyah N, Lyons G, Morrison AC et al. Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores. medRxiv. 2023. https://doi.org/10.1101/2023.12.13.23299909

Author

Hrytsenko, Yana ; Shea, Benjamin ; Elgart, Michael ; Kurniansyah, Nuzulul ; Lyons, Genevieve ; Morrison, Alanna C ; Carson, April P ; Haring, Bernhard ; Mitchel, Braxton D ; Psaty, Bruce M ; Jaeger, Byron C ; Gu, C Charles ; Kooperberg, Charles ; Levy, Daniel ; Lloyd-Jones, Donald ; Choi, Eunhee ; Brody, Jennifer A ; Smith, Jennifer A ; Rotter, Jerome I ; Moll, Matthew ; Fornage, Myriam ; Simon, Noah ; Castaldi, Peter ; Casanova, Ramon ; Chung, Ren-Hua ; Kaplan, Robert ; Loos, Ruth J F ; Kardia, Sharon L R ; Rich, Stephen S ; Redline, Susan ; Kelly, Tanika ; O'Connor, Timothy ; Zhao, Wei ; Kim, Wonji ; Guo, Xiuqing ; Chen, Yii Der Ida ; Sofer, Tamar ; Trans-Omics in Precision Medicine Consortium. / Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores. medRxiv, 2023.

Bibtex

@techreport{789f575d6ef94f36aa42f88d9c8209e6,
title = "Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores",
abstract = "We construct non-linear machine learning (ML) prediction models for systolic and diastolic blood pressure (SBP, DBP) using demographic and clinical variables and polygenic risk scores (PRSs). We develop a two-model ensemble, consisting of a baseline model, where prediction is based on demographic and clinical variables only, and a genetic model, where we also include PRSs. We evaluate the use of a linear versus a non-linear model at both the baseline and the genetic model levels and assess the improvement in performance when incorporating multiple PRSs. We report the ensemble model's performance as percentage variance explained (PVE) on a held-out test dataset. A non-linear baseline model improved the PVEs from 28.1% to 30.1% (SBP) and 14.3% to 17.4% (DBP) compared with a linear baseline model. Including seven PRSs, computed from the largest available GWAS of SBP/DBP, in the genetic model improved the genetic model PVE from 4.8% to 5.1% (SBP) and 4.7% to 5% (DBP) compared with using a single PRS. Adding 14 additional PRSs, computed from two independent GWASs, further increased the genetic model PVE to 6.3% (SBP) and 5.7% (DBP). PVE differed across self-reported race/ethnicity groups, with primarily all non-White groups benefitting from the inclusion of additional PRSs.",
author = "Yana Hrytsenko and Benjamin Shea and Michael Elgart and Nuzulul Kurniansyah and Genevieve Lyons and Morrison, {Alanna C} and Carson, {April P} and Bernhard Haring and Mitchel, {Braxton D} and Psaty, {Bruce M} and Jaeger, {Byron C} and Gu, {C Charles} and Charles Kooperberg and Daniel Levy and Donald Lloyd-Jones and Eunhee Choi and Brody, {Jennifer A} and Smith, {Jennifer A} and Rotter, {Jerome I} and Matthew Moll and Myriam Fornage and Noah Simon and Peter Castaldi and Ramon Casanova and Ren-Hua Chung and Robert Kaplan and Loos, {Ruth J F} and Kardia, {Sharon L R} and Rich, {Stephen S} and Susan Redline and Tanika Kelly and Timothy O'Connor and Wei Zhao and Wonji Kim and Xiuqing Guo and Chen, {Yii Der Ida} and Tamar Sofer and {Trans-Omics in Precision Medicine Consortium}",
year = "2023",
doi = "10.1101/2023.12.13.23299909",
language = "English",
publisher = "medRxiv",
type = "WorkingPaper",
institution = "medRxiv",

}

RIS

TY - UNPB

T1 - Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores

AU - Hrytsenko, Yana

AU - Shea, Benjamin

AU - Elgart, Michael

AU - Kurniansyah, Nuzulul

AU - Lyons, Genevieve

AU - Morrison, Alanna C

AU - Carson, April P

AU - Haring, Bernhard

AU - Mitchel, Braxton D

AU - Psaty, Bruce M

AU - Jaeger, Byron C

AU - Gu, C Charles

AU - Kooperberg, Charles

AU - Levy, Daniel

AU - Lloyd-Jones, Donald

AU - Choi, Eunhee

AU - Brody, Jennifer A

AU - Smith, Jennifer A

AU - Rotter, Jerome I

AU - Moll, Matthew

AU - Fornage, Myriam

AU - Simon, Noah

AU - Castaldi, Peter

AU - Casanova, Ramon

AU - Chung, Ren-Hua

AU - Kaplan, Robert

AU - Loos, Ruth J F

AU - Kardia, Sharon L R

AU - Rich, Stephen S

AU - Redline, Susan

AU - Kelly, Tanika

AU - O'Connor, Timothy

AU - Zhao, Wei

AU - Kim, Wonji

AU - Guo, Xiuqing

AU - Chen, Yii Der Ida

AU - Sofer, Tamar

AU - Trans-Omics in Precision Medicine Consortium

PY - 2023

Y1 - 2023

N2 - We construct non-linear machine learning (ML) prediction models for systolic and diastolic blood pressure (SBP, DBP) using demographic and clinical variables and polygenic risk scores (PRSs). We develop a two-model ensemble, consisting of a baseline model, where prediction is based on demographic and clinical variables only, and a genetic model, where we also include PRSs. We evaluate the use of a linear versus a non-linear model at both the baseline and the genetic model levels and assess the improvement in performance when incorporating multiple PRSs. We report the ensemble model's performance as percentage variance explained (PVE) on a held-out test dataset. A non-linear baseline model improved the PVEs from 28.1% to 30.1% (SBP) and 14.3% to 17.4% (DBP) compared with a linear baseline model. Including seven PRSs, computed from the largest available GWAS of SBP/DBP, in the genetic model improved the genetic model PVE from 4.8% to 5.1% (SBP) and 4.7% to 5% (DBP) compared with using a single PRS. Adding 14 additional PRSs, computed from two independent GWASs, further increased the genetic model PVE to 6.3% (SBP) and 5.7% (DBP). PVE differed across self-reported race/ethnicity groups, with primarily all non-White groups benefitting from the inclusion of additional PRSs.

AB - We construct non-linear machine learning (ML) prediction models for systolic and diastolic blood pressure (SBP, DBP) using demographic and clinical variables and polygenic risk scores (PRSs). We develop a two-model ensemble, consisting of a baseline model, where prediction is based on demographic and clinical variables only, and a genetic model, where we also include PRSs. We evaluate the use of a linear versus a non-linear model at both the baseline and the genetic model levels and assess the improvement in performance when incorporating multiple PRSs. We report the ensemble model's performance as percentage variance explained (PVE) on a held-out test dataset. A non-linear baseline model improved the PVEs from 28.1% to 30.1% (SBP) and 14.3% to 17.4% (DBP) compared with a linear baseline model. Including seven PRSs, computed from the largest available GWAS of SBP/DBP, in the genetic model improved the genetic model PVE from 4.8% to 5.1% (SBP) and 4.7% to 5% (DBP) compared with using a single PRS. Adding 14 additional PRSs, computed from two independent GWASs, further increased the genetic model PVE to 6.3% (SBP) and 5.7% (DBP). PVE differed across self-reported race/ethnicity groups, with primarily all non-White groups benefitting from the inclusion of additional PRSs.

U2 - 10.1101/2023.12.13.23299909

DO - 10.1101/2023.12.13.23299909

M3 - Preprint

C2 - 38168328

BT - Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores

PB - medRxiv

ER -
