A genomic mutational constraint map using variation in 76,156 human genomes

Research output: Contribution to journal, Journal article, Research, peer-review

Standard

A genomic mutational constraint map using variation in 76,156 human genomes. / Chen, Siwei; Francioli, Laurent C.; Goodrich, Julia K.; Collins, Ryan L.; Kanai, Masahiro; Wang, Qingbo; Alföldi, Jessica; Watts, Nicholas A.; Vittal, Christopher; Gauthier, Laura D.; Poterba, Timothy; Wilson, Michael W.; Tarasova, Yekaterina; Phu, William; Grant, Riley; Yohannes, Mary T.; Koenig, Zan; Farjoun, Yossi; Banks, Eric; Donnelly, Stacey; Gabriel, Stacey; Gupta, Namrata; Ferriera, Steven; Tolonen, Charlotte; Novod, Sam; Bergelson, Louis; Roazen, David; Ruano-Rubio, Valentin; Covarrubias, Miguel; Llanwarne, Christopher; Petrillo, Nikelle; Wade, Gordon; Jeandet, Thibault; Munshi, Ruchi; Tibbetts, Kathleen; Loos, Ruth J.F.; Karczewski, Konrad J.; Genome Aggregation Database Consortium.

In: Nature, Vol. 625, No. 7993, 2024, p. 92-100.

Harvard

Chen, S, Francioli, LC, Goodrich, JK, Collins, RL, Kanai, M, Wang, Q, Alföldi, J, Watts, NA, Vittal, C, Gauthier, LD, Poterba, T, Wilson, MW, Tarasova, Y, Phu, W, Grant, R, Yohannes, MT, Koenig, Z, Farjoun, Y, Banks, E, Donnelly, S, Gabriel, S, Gupta, N, Ferriera, S, Tolonen, C, Novod, S, Bergelson, L, Roazen, D, Ruano-Rubio, V, Covarrubias, M, Llanwarne, C, Petrillo, N, Wade, G, Jeandet, T, Munshi, R, Tibbetts, K, Loos, RJF, Karczewski, KJ & Genome Aggregation Database Consortium 2024, 'A genomic mutational constraint map using variation in 76,156 human genomes', Nature, vol. 625, no. 7993, pp. 92-100. https://doi.org/10.1038/s41586-023-06045-0

APA

Chen, S., Francioli, L. C., Goodrich, J. K., Collins, R. L., Kanai, M., Wang, Q., Alföldi, J., Watts, N. A., Vittal, C., Gauthier, L. D., Poterba, T., Wilson, M. W., Tarasova, Y., Phu, W., Grant, R., Yohannes, M. T., Koenig, Z., Farjoun, Y., Banks, E., ... Genome Aggregation Database Consortium (2024). A genomic mutational constraint map using variation in 76,156 human genomes. Nature, 625(7993), 92-100. https://doi.org/10.1038/s41586-023-06045-0

Vancouver

Chen S, Francioli LC, Goodrich JK, Collins RL, Kanai M, Wang Q et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature. 2024;625(7993):92-100. https://doi.org/10.1038/s41586-023-06045-0

Author

Chen, Siwei ; Francioli, Laurent C. ; Goodrich, Julia K. ; Collins, Ryan L. ; Kanai, Masahiro ; Wang, Qingbo ; Alföldi, Jessica ; Watts, Nicholas A. ; Vittal, Christopher ; Gauthier, Laura D. ; Poterba, Timothy ; Wilson, Michael W. ; Tarasova, Yekaterina ; Phu, William ; Grant, Riley ; Yohannes, Mary T. ; Koenig, Zan ; Farjoun, Yossi ; Banks, Eric ; Donnelly, Stacey ; Gabriel, Stacey ; Gupta, Namrata ; Ferriera, Steven ; Tolonen, Charlotte ; Novod, Sam ; Bergelson, Louis ; Roazen, David ; Ruano-Rubio, Valentin ; Covarrubias, Miguel ; Llanwarne, Christopher ; Petrillo, Nikelle ; Wade, Gordon ; Jeandet, Thibault ; Munshi, Ruchi ; Tibbetts, Kathleen ; Loos, Ruth J.F. ; Karczewski, Konrad J. ; Genome Aggregation Database Consortium. / A genomic mutational constraint map using variation in 76,156 human genomes. In: Nature. 2024 ; Vol. 625, No. 7993. pp. 92-100.

Bibtex

@article{1c222f4fdef34e56b9bcce77d2f40153,
title = "A genomic mutational constraint map using variation in 76,156 human genomes",
abstract = "The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders 1–4, but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD)—the largest public open-access human genome allele frequency reference dataset—and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.",
author = "Siwei Chen and Francioli, {Laurent C.} and Goodrich, {Julia K.} and Collins, {Ryan L.} and Masahiro Kanai and Qingbo Wang and Jessica Alf{\"o}ldi and Watts, {Nicholas A.} and Christopher Vittal and Gauthier, {Laura D.} and Timothy Poterba and Wilson, {Michael W.} and Yekaterina Tarasova and William Phu and Riley Grant and Yohannes, {Mary T.} and Zan Koenig and Yossi Farjoun and Eric Banks and Stacey Donnelly and Stacey Gabriel and Namrata Gupta and Steven Ferriera and Charlotte Tolonen and Sam Novod and Louis Bergelson and David Roazen and Valentin Ruano-Rubio and Miguel Covarrubias and Christopher Llanwarne and Nikelle Petrillo and Gordon Wade and Thibault Jeandet and Ruchi Munshi and Kathleen Tibbetts and Loos, {Ruth J.F.} and Karczewski, {Konrad J.} and {Genome Aggregation Database Consortium}",
note = "Publisher Copyright: {\textcopyright} 2023, The Author(s), under exclusive licence to Springer Nature Limited.",
year = "2024",
doi = "10.1038/s41586-023-06045-0",
language = "English",
volume = "625",
pages = "92--100",
journal = "Nature",
issn = "0028-0836",
publisher = "Nature Publishing Group",
number = "7993",
}

RIS

TY - JOUR
T1 - A genomic mutational constraint map using variation in 76,156 human genomes
AU - Chen, Siwei
AU - Francioli, Laurent C.
AU - Goodrich, Julia K.
AU - Collins, Ryan L.
AU - Kanai, Masahiro
AU - Wang, Qingbo
AU - Alföldi, Jessica
AU - Watts, Nicholas A.
AU - Vittal, Christopher
AU - Gauthier, Laura D.
AU - Poterba, Timothy
AU - Wilson, Michael W.
AU - Tarasova, Yekaterina
AU - Phu, William
AU - Grant, Riley
AU - Yohannes, Mary T.
AU - Koenig, Zan
AU - Farjoun, Yossi
AU - Banks, Eric
AU - Donnelly, Stacey
AU - Gabriel, Stacey
AU - Gupta, Namrata
AU - Ferriera, Steven
AU - Tolonen, Charlotte
AU - Novod, Sam
AU - Bergelson, Louis
AU - Roazen, David
AU - Ruano-Rubio, Valentin
AU - Covarrubias, Miguel
AU - Llanwarne, Christopher
AU - Petrillo, Nikelle
AU - Wade, Gordon
AU - Jeandet, Thibault
AU - Munshi, Ruchi
AU - Tibbetts, Kathleen
AU - Loos, Ruth J.F.
AU - Karczewski, Konrad J.
AU - Genome Aggregation Database Consortium
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer Nature Limited.
PY - 2024
Y1 - 2024

N2 - The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders 1–4, but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD)—the largest public open-access human genome allele frequency reference dataset—and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.

AB - The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders 1–4, but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD)—the largest public open-access human genome allele frequency reference dataset—and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.

U2 - 10.1038/s41586-023-06045-0
DO - 10.1038/s41586-023-06045-0
M3 - Journal article
C2 - 38057664
AN - SCOPUS:85180828283
VL - 625
SP - 92
EP - 100
JO - Nature
JF - Nature
SN - 0028-0836
IS - 7993
ER -
