The population genetics of human disease: The case of recessive, lethal mutations
Carlos Eduardo G. Amorim, Ziyue Gao, Zachary Baker, José Francisco Diesel, Yuval B. Simons, Imran S. Haque, Joseph Pickrell, Molly Przeworski
Published: September 28, 2017 https://doi.org/10.1371/journal.pgen.1006915
Source: PLoS Genetics
Abstract
Do the frequencies of disease mutations in human populations reflect a simple balance between mutation and purifying selection? What other factors shape the prevalence of disease mutations? To begin to answer these questions, we focused on one of the simplest cases: recessive mutations that alone cause lethal diseases or complete sterility. To this end, we generated a hand-curated set of 417 Mendelian mutations in 32 genes reported to cause a recessive, lethal Mendelian disease. We then considered analytic models of mutation-selection balance in infinite and finite populations of constant sizes and simulations of purifying selection in a more realistic demographic setting, and tested how well these models fit allele frequencies estimated from 33,370 individuals of European ancestry. In doing so, we distinguished between CpG transitions, which occur at a substantially elevated rate, and three other mutation types. Intriguingly, the observed frequency for CpG transitions is slightly higher than expectation but close, whereas the frequencies observed for the three other mutation types are an order of magnitude higher than expected, with a bigger deviation from expectation seen for less mutable types. This discrepancy is even larger when subtle fitness effects in heterozygotes or lethal compound heterozygotes are taken into account. In principle, higher than expected frequencies of disease mutations could be due to widespread errors in reporting causal variants, compensation by other mutations, or balancing selection. It is unclear why these factors would have a greater impact on disease mutations that occur at lower rates, however. 
We argue instead that the unexpectedly high frequency of disease mutations and the relationship to the mutation rate likely reflect an ascertainment bias: of all the mutations that cause recessive lethal diseases, those that by chance have reached higher frequencies are more likely to have been identified and thus to have been included in this study. Beyond the specific application, this study highlights the parameters likely to be important in shaping the frequencies of Mendelian disease alleles.
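As a rough illustration of the infinite-population case mentioned in the abstract, the sketch below iterates the textbook deterministic recursion for a fully recessive lethal allele and compares it with the classical equilibrium q = sqrt(mu). This is not the authors' code. The function names are hypothetical, and the two mutation rates in the usage example are illustrative placeholders rather than the estimates used in the paper.

```python
import math


def next_generation_frequency(q: float, mu: float) -> float:
    """One generation of the deterministic model for a fully recessive lethal allele."""
    # Selection: homozygotes for the disease allele (frequency q^2) leave no
    # offspring, so the allele frequency among surviving parents is
    # q(1 - q) / (1 - q^2) = q / (1 + q).
    q_after_selection = q / (1.0 + q)
    # Mutation: a fraction mu of the remaining normal alleles become disease alleles.
    return q_after_selection + mu * (1.0 - q_after_selection)


def iterate_frequency(mu: float, generations: int, q0: float = 0.0) -> float:
    """Apply the recursion for a fixed number of generations, starting from q0."""
    q = q0
    for _ in range(generations):
        q = next_generation_frequency(q, mu)
    return q


def equilibrium_frequency(mu: float) -> float:
    """Closed-form fixed point of the recursion: q^2 = mu, so q = sqrt(mu)."""
    return math.sqrt(mu)


if __name__ == "__main__":
    # Illustrative (assumed) per-site mutation rates, NOT values from the paper.
    for label, mu in [("more mutable type", 1e-7), ("less mutable type", 1e-9)]:
        print(label, equilibrium_frequency(mu))
```

Under this simple model the expected frequency scales with the square root of the mutation rate, so a 100-fold difference in mutation rate translates into only a 10-fold difference in equilibrium frequency. Finite population size, realistic demography, and fitness effects in heterozygotes all change the expectation, which is why the study also relies on simulations.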
Author summary
What determines the frequencies of disease mutations in human populations? To begin to answer this question, we focus on one of the simplest cases: mutations that cause completely recessive, lethal Mendelian diseases. We first review theory about what to expect from mutation and selection in a population of finite size and generate predictions based on simulations using a plausible demographic scenario of recent human evolution. For a highly mutable type of mutation, transitions at CpG sites, we find that the predictions are close to the observed frequencies of recessive lethal disease mutations. For less mutable types, however, predictions substantially underestimate the observed frequency. We discuss possible explanations for the discrepancy and point to a complication that, to our knowledge, is not widely appreciated: that ascertainment bias exists in disease mutation discovery. Specifically, we suggest that alleles that have been identified to date are likely the ones that by chance have reached higher frequencies and are thus more likely to have been mapped. More generally, our study highlights the factors that influence the frequencies of Mendelian disease alleles.
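The ascertainment argument can be made concrete with a toy simulation. The sketch below is a minimal constant-size Wright-Fisher model for a recessive lethal allele, followed by a comparison of the mean frequency across all simulated loci with the mean among loci that reached an arbitrary detection threshold. It is not the authors' forward simulator and ignores the demographic scenario they used. The function names, population size, mutation rate, and threshold are all hypothetical choices made so the example runs quickly.

```python
import random
from statistics import mean


def simulate_locus(n_diploid: int, mu: float, generations: int,
                   rng: random.Random) -> float:
    """Final disease-allele frequency at one locus in a constant-size population."""
    two_n = 2 * n_diploid
    q = 0.0
    for _ in range(generations):
        # Selection against lethal homozygotes, then mutation (deterministic part).
        q_sel = q / (1.0 + q)
        q_mut = q_sel + mu * (1.0 - q_sel)
        # Genetic drift: draw 2N allele copies for the next generation.
        copies = sum(1 for _ in range(two_n) if rng.random() < q_mut)
        q = copies / two_n
    return q


def summarize_ascertainment(freqs, min_frequency):
    """Return (mean over all loci, mean over loci at or above min_frequency).

    The second value is None when no locus reaches the threshold.
    """
    ascertained = [f for f in freqs if f >= min_frequency]
    ascertained_mean = mean(ascertained) if ascertained else None
    return mean(freqs), ascertained_mean


if __name__ == "__main__":
    rng = random.Random(2017)  # fixed seed so the run is reproducible
    # Assumed toy parameters: small population, unrealistically high mutation rate.
    freqs = [simulate_locus(100, 1e-4, 300, rng) for _ in range(100)]
    overall, ascertained = summarize_ascertainment(freqs, min_frequency=0.01)
    print("mean over all loci:", overall)
    print("mean over ascertained loci:", ascertained)
```

Because the "ascertained" loci are by construction those that drifted to higher frequencies, their mean can only match or exceed the mean over all loci. The sketch shows the direction of the bias only, not its size in real data.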
Citation: Amorim CEG, Gao Z, Baker Z, Diesel JF, Simons YB, Haque IS, et al. (2017) The population genetics of human disease: The case of recessive, lethal mutations. PLoS Genet 13(9): e1006915. https://doi.org/10.1371/journal.pgen.1006915
Editor: Philipp W. Messer, Cornell University, UNITED STATES
Received: December 4, 2016; Accepted: July 9, 2017; Published: September 28, 2017
Copyright: © 2017 Amorim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files, and the code is available on GitHub (https://github.com/cegamorim/PopGenHumDisease; https://github.com/sellalab/ForwardSimulator).
Funding: CEGA was partially funded by a Science Without Borders fellowship from CAPES foundation (BEX 8279/11-0) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (PDE 201145/2015-4), Brazil. ZG was partially supported by a postdoctoral fellowship funded by Stanford Center for Computational, Evolutionary and Human Genomics. JFD was funded by a Science Without Borders fellowship from CAPES foundation (88888.038761/2013-00). YBS was supported by NIH grant GM115889. The work was partially supported by a Research Initiative in Science and Engineering grant from Columbia University and NIGMS grants (GM121372) to JKP and MP. The computing in this project was supported by two National Institutes of Health instrumentation grants (S10OD012351 and S10OD021764) received by the Department of Systems Biology at Columbia University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Free PDF: PLoS Genetics