Whole-genome phylogeny of mammals: Evolutionary information in genic and nongenic regions

Gregory E. Sims (a), Se-Ran Jun (a), Guohong Albert Wu (a), and Sung-Hou Kim (a, b, 1)

(a) Department of Chemistry, University of California, Berkeley, CA 94720; and

(b) Lawrence Berkeley National Lab, Berkeley, CA 94720

Contributed by Sung-Hou Kim, August 19, 2009 (sent for review July 10, 2009)


Ten complete mammalian genome sequences were compared by using the “feature frequency profile” (FFP) method of alignment-free comparison. This comparison reveals that the entire nongenic portion of mammalian genomes contains evolutionary information similar to that of their genic counterparts, the intron and exon regions. We partitioned the complete genomes of mammals (such as human, chimp, horse, and mouse) into their constituent nongenic, intronic, and exonic components. Phylogenetic species trees were constructed for each component class of genome sequence data, as well as for the whole genomes, by using standard tree-building algorithms with FFP distances. The phylogenies of the whole genomes and of each component class (exonic, intronic, and nongenic regions) have similar topologies within the optimal feature-length range, and all agree well with the evolutionary phylogeny based on a recent large, multispecies, multigene alignment dataset. In the strictest sense, the FFP-based trees are genome phylogenies, not species phylogenies. However, the species phylogeny is highly related to the whole-genome phylogeny. Furthermore, our results reveal that the footprints of evolutionary history are spread throughout the entire length of an organism's genome and are not limited to genes, introns, or short, highly conserved nongenic sequences. Analyses restricted to such regions can be adversely affected by factors such as the choice of sequences, homoplasy, and differing mutation rates, resulting in inconsistent species phylogenies.

alignment-free genome comparison | feature frequency profile (FFP) | mammalian phylogeny | noncoding DNA | nongenic regions of the genome
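To make the idea of an alignment-free FFP distance concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a feature is a fixed-length substring (k-mer), that a profile is the normalized count of each k-mer, and that two profiles are compared with the Jensen-Shannon divergence; the abstract does not spell out any of these details. The function names (`ffp`, `js_divergence`, `distance_matrix`), the feature length `k = 3`, and the toy sequences are all hypothetical.

```python
from collections import Counter
from math import log2


def ffp(seq, k):
    """Return the normalized k-mer frequency profile of a sequence (assumed FFP form)."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than the feature length k")
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse profiles; lies in [0, 1]."""
    div = 0.0
    for key in set(p) | set(q):
        pi = p.get(key, 0.0)
        qi = q.get(key, 0.0)
        mi = 0.5 * (pi + qi)  # mixture of the two profiles
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


def distance_matrix(seqs, k):
    """Symmetric matrix of pairwise profile distances, as a list of lists."""
    profiles = [ffp(s, k) for s in seqs]
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = js_divergence(profiles[i], profiles[j])
            dist[i][j] = dist[j][i] = d
    return dist


# Toy sequences, invented for illustration only (not real genomic data).
toy = ["ACGTACGTACGTAA", "ACGTACGAACGTAA", "TTTTGGGGCCCCTT"]
matrix = distance_matrix(toy, k=3)
```

A matrix of this kind could then be passed to a standard distance-based tree builder such as neighbor joining. Applying the same procedure separately to the exonic, intronic, and nongenic partitions would yield one tree per component class, which is the comparison the abstract describes.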


