The Evolution of Mammalian Gene Families
Jeffery P. Demuth1, Tijl De Bie2, Jason E. Stajich3, Nello Cristianini4, Matthew W. Hahn1*
1 Department of Biology and School of Informatics, Indiana University, Bloomington, Indiana, United States of America, 2 School of Electronics and Computer Science, ISIS Group, University of Southampton, Southampton, United Kingdom, 3 Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina, United States of America, 4 Department of Statistics, University of California Davis, Davis, California, United States of America
Abstract
Gene families are groups of homologous genes that are likely to have highly similar functions. Differences in family size due to lineage-specific gene duplication and gene loss may provide clues to the evolutionary forces that have shaped mammalian genomes. Here we analyze the gene families contained within the whole genomes of human, chimpanzee, mouse, rat, and dog. In total, we find that more than half of the 9,990 families present in the mammalian common ancestor have either expanded or contracted along at least one lineage. Additionally, we find that a large number of families are completely lost from one or more mammalian genomes, and a similar number of gene families have arisen subsequent to the mammalian common ancestor. Along the lineage leading to modern humans, we infer the gain of 689 genes and the loss of 86 genes since the split from chimpanzees, including changes likely driven by adaptive natural selection. Our results imply that humans and chimpanzees differ by at least 6% (1,418 of 22,000 genes) in their complement of genes, which stands in stark contrast to the oft-cited 1.5% difference between orthologous nucleotide sequences. This genomic “revolving door” of gene gain and loss represents a large number of genetic differences separating humans from our closest relatives.
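The "at least 6%" figure can be checked directly from the two counts quoted in the abstract. The short Python sketch below is illustrative only and is not part of the authors' analysis; the function name `percent_difference` and the variable names are assumptions introduced here, and the only inputs are the numbers stated above (1,418 differing genes, 22,000 total genes, 1.5% nucleotide difference).

```python
def percent_difference(differing_genes: int, total_genes: int) -> float:
    """Return the share of genes that differ, expressed as a percentage."""
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return 100.0 * differing_genes / total_genes


# Counts quoted in the abstract (hypothetical variable names).
DIFFERING_GENES = 1418
TOTAL_GENES = 22000
NUCLEOTIDE_DIFF_PERCENT = 1.5  # the oft-cited orthologous sequence difference

gene_diff_percent = percent_difference(DIFFERING_GENES, TOTAL_GENES)
fold_over_nucleotide = gene_diff_percent / NUCLEOTIDE_DIFF_PERCENT

print(f"Gene complement difference: {gene_diff_percent:.2f}%")
print(f"Ratio to nucleotide difference: {fold_over_nucleotide:.1f}x")
```

Under these inputs the gene-complement difference comes out slightly above 6%, consistent with the abstract's "at least 6%" wording, and roughly four times the nucleotide-level figure.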
Citation: Demuth JP, Bie TD, Stajich JE, Cristianini N, Hahn MW (2006) The Evolution of Mammalian Gene Families. PLoS ONE 1(1): e85. doi:10.1371/journal.pone.0000085
Academic Editor: Justin Borevitz, University of Chicago, United States of America
Received: October 26, 2006; Accepted: November 14, 2006; Published: December 20, 2006
Copyright: © 2006 Demuth et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by grants from the National Science Foundation (MCB-0528465 and DBI-0543586) to MWH and the National Institutes of Health (R33HG003070-01) to NC. TDB acknowledges support from the CoE EF/05/007 SymBioSys, and from GOA/2005/04, both from the Research Council K.U. Leuven. MWH and JPD are also supported by the METACyt Initiative of Indiana University, funded in part through a major grant from the Lilly Endowment, Inc. None of the sponsors played any role in any part of the study.
* To whom correspondence should be addressed. E-mail: mwh@indiana.edu
a Katholieke Universiteit Leuven, OKP Research Group, Leuven, Belgium
b Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California, United States of America
c Department of Engineering Mathematics, University of Bristol, Bristol, United Kingdom