ScienceDaily (Apr. 20, 2010) — Researchers have discovered 2,363 new DNA sequences corresponding to 730 regions on the human genome by using new approaches. These sequences represent segments of the genome that were not charted in the reference map of the human genome.
Jeffrey M. Kidd headed the study while earning his Ph.D. at the University of Washington in the Eichler lab. Kidd is now a postdoctoral fellow at Stanford University.
"Over the past several years, the extent to which the structure of the genome varies among humans has become clearer. This variation suggested that there must be portions of the human genome where DNA sequences had yet to be discovered, annotated and characterized," he said. "We hope that these sequences ultimately will be included as part of future releases of the reference human genome sequence."
The reference genome is a yardstick -- or standard for comparison -- for studies of human genetics.
The human reference genome was first created in 2001 and is updated every couple of years, Kidd explained. It's a mosaic of DNA sequences derived from several individuals. He went on to say that about 80 percent of the reference genome came from eight people. One of them actually accounts for more than 66 percent of the total.
...
Read more here: ScienceDaily
+++++
Nature Methods
Published online: 18 April 2010 | doi:10.1038/nmeth.1451
Characterization of missing human genome sequences and copy-number polymorphic insertions
Jeffrey M Kidd [1], Nick Sampas [2], Francesca Antonacci [1], Tina Graves [3], Robert Fulton [3], Hillary S Hayden [1], Can Alkan [1], Maika Malig [1], Mario Ventura [4], Giuliana Giannuzzi [4], Joelle Kallicki [3], Paige Anderson [2], Anya Tsalenko [2], N Alice Yamada [2], Peter Tsang [2], Rajinder Kaul [1], Richard K Wilson [3], Laurakay Bruhn [2] & Evan E Eichler [1,5]
Abstract
The extent of human genomic structural variation suggests that there must be portions of the genome yet to be discovered, annotated and characterized at the sequence level. We present a resource and analysis of 2,363 new insertion sequences corresponding to 720 genomic loci. We found that a substantial fraction of these sequences are either missing, fragmented or misassigned when compared to recent de novo sequence assemblies from short-read next-generation sequence data. We determined that 18–37% of these new insertions are copy-number polymorphic, including loci that show extensive population stratification among Europeans, Asians and Africans. Complete sequencing of 156 of these insertions identified new exons and conserved noncoding sequences not yet represented in the reference genome. We developed a method to accurately genotype these new insertions by mapping next-generation sequencing datasets to the breakpoint, thereby providing a means to characterize copy-number status for regions previously inaccessible to single-nucleotide polymorphism microarrays.
[1] Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA.
[2] Agilent Laboratories, Santa Clara, California, USA.
[3] Washington University Genome Sequencing Center, School of Medicine, St. Louis, Missouri, USA.
[4] Department of Genetics and Microbiology, University of Bari, Bari, Italy.
[5] Howard Hughes Medical Institute, Seattle, Washington, USA.
Correspondence to: Evan E Eichler [1,5], e-mail: eee@gs.washington.edu
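
The abstract mentions genotyping the new insertions by mapping next-generation sequencing reads to the breakpoint. As a rough illustration of that general idea only, and not the authors' actual method, the Python sketch below assumes we already have two read counts per locus: reads that span an insertion junction and reads that span the empty reference site. The names `JunctionCounts` and `call_genotype`, and all thresholds, are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class JunctionCounts:
    """Read counts at one insertion locus (hypothetical structure)."""
    insertion_reads: int  # reads spanning a junction between flank and inserted sequence
    reference_reads: int  # reads spanning the empty site as it appears in the reference


def call_genotype(counts, min_reads=4, het_low=0.2, het_high=0.8):
    """Call a genotype from the fraction of junction reads supporting the insertion.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    total = counts.insertion_reads + counts.reference_reads
    if total < min_reads:
        # Too little evidence at the breakpoint to decide either way.
        return "no-call"
    fraction = counts.insertion_reads / total
    if fraction < het_low:
        return "ref/ref"  # insertion absent on both chromosomes
    if fraction > het_high:
        return "ins/ins"  # insertion present on both chromosomes
    return "ins/ref"      # mixed support: one copy of the insertion


if __name__ == "__main__":
    for counts in (JunctionCounts(0, 12), JunctionCounts(6, 7), JunctionCounts(15, 0)):
        print(counts, call_genotype(counts))
```

A real pipeline would also need to handle mapping quality, sequencing error and uneven coverage; this sketch only shows why reads crossing the breakpoint are informative where microarray probes are not.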
+++++