
 Data Sources and Other Integrated Resources
This project is in collaboration with the Human Genome Project, directed by Francis Collins. All human genome sequence data used has been generated by the laboratories involved in the human genome sequencing consortium, listed below. Additional sources of data and genome annotation are given at NHGRI's Genome Hub, Human Genome Central (EBI), and Human Genome Central (NCBI). The working draft sequence and its annotation are described in the paper Initial Sequencing and Analysis of the Human Genome by the International Human Genome Sequencing Consortium, Nature, 409: 860-921, Feb. 2001.


The following specific resources were used by this project:
  • A fingerprint-based map of BAC clones from the human genome, primarily from the RP-11 library, and a corresponding layout of sequenced clones, prepared under the direction of Robert Waterston at Washington University in St. Louis.
  • A set of processed, managed DNA sequences from this layout, split at the appropriate point for working draft clones and with obvious contaminants removed, prepared by Greg Schuler at NCBI, taken from public sequence database submissions. Also, specially formatted versions of the finished chromosomes 21 and 22 from Greg Schuler, prepared from data generated at the centers that sequenced these chromosomes. See the sites at the Max Planck Institute and RIKEN for more information on Chromosome 21, and the Sanger Centre for more information on Chromosome 22.
  • EST, mRNA and BAC end sequence data and information taken from GenBank.
  • In the September 5 freeze, we used plasmid end reads generated by the Whitehead Institute for Biomedical Research for the SNP consortium. In October we also used plasmid end reads from the Washington University Genome Sequencing Center and the Sanger Centre generated for the SNP consortium.
  • In the browser for the July 17 freeze, gene predictions were supplied by David Kulp and Alan Williams from Affymetrix and Ewan Birney from Ensembl. David Kulp and Alan Williams also supplied predictions for known genes processed through Genie. The CpG island track was supplied by Kim Worley and John Bouck, and the simple repeats track by James Durbin at Baylor College of Medicine. The duplications track data was provided by Evan Eichler and Jeff Bailey at Case Western Reserve University. Lukas Wagner and Greg Schuler at NCBI supplied the data for the 3' EST track, which we filtered further before displaying.
  • Along with updated tracks from the July 17 freeze browser, in the browser for the Sept. 5 freeze the SNP tracks were provided by Lincoln Stein and the SNP consortium. Greg Schuler provided collections of map data used in the STS track from the following original sources: Genethon genetic map, Marshfield genetic map, GeneMap99 radiation hybrid map, G3 radiation hybrid map, and the Whitehead YAC map. David Cox provided STS markers for the TNG radiation hybrid map. We placed the above STS markers on the genome sequence using a very slightly modified version of Schuler's e-PCR program. Cytogenetic markers (FISH-mapped clones) were provided by Barbara Trask and were generated by the BAC Resource Consortium, as described more fully in "Integration of cytogenetic landmarks into the draft sequence of the human genome," by the BAC Resource Consortium, Nature, Vol. 409. These markers were placed on the draft sequence by Wonhee Jang of NCBI. Sean Eddy at Wash. U. provided the RNA genes track. Olivier Jaillon at Genoscope provided the Exofish track. Further Exofish tools and results are described in "Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence," Nature Genetics, volume 25, page 235, June 2000. Data for the mouse synteny track was provided by Deanna Church at NCBI. More details can be found at the NCBI human-mouse homology map. LaDeana Hillier at Wash. U. provided an additional CpG islands track.
  • Updated versions of most of the tracks from the Sept. 5 freeze browser were provided by the same groups for the Oct. 7 freeze browser. The Fgenesh++ gene predictions were produced by Softberry Inc. See the paper "Ab initio gene finding in Drosophila genomic DNA", Genome Research 10(5) 516-522 for more information on the method. The data for the exonerate mouse track was kindly provided by Guy Slater, Michele Clamp, and Ewan Birney from Ensembl.
  • In the Dec. 12 freeze browser, the STS markers and FISH-mapped clones were placed by Terry Furey at UCSC. He also prepared the cytogenetic bands in this and earlier browsers.

The following resources use the information provided by this project:
  • The gene annotations and genome browser, providing multiple views of a variety of key genome annotations, built by the Ensembl group at EBI and the Sanger Centre.

A closely related set of resources:
  • A separate assembly and annotations from NCBI.


The institutions that form the Human Genome Sequencing Consortium include:

  1. Baylor College of Medicine, Houston, Texas, USA
  2. Beijing Human Genome Center, Institute of Genetics, Chinese Academy of Sciences, Beijing, China
  3. Cold Spring Harbor Laboratory, Lita Annenberg Hazen Genome Center, Cold Spring Harbor, NY, USA
  4. Gesellschaft für Biotechnologische Forschung mbH, Braunschweig, Germany
  5. Genoscope, Evry, France
  6. Genome Therapeutics Corporation, Waltham, MA, USA
  7. Institute for Molecular Biotechnology, Jena, Germany
  8. Joint Genome Institute, U.S. Department of Energy, Walnut Creek, CA, USA
  9. Keio University, Tokyo, Japan
  10. Max Planck Institute for Molecular Genetics, Berlin, Germany
  11. RIKEN Genomic Sciences Center, Saitama, Japan
  12. The Sanger Centre, Hinxton, U.K.
  13. Stanford Genome Technology Center, Palo Alto, CA, USA
  14. Stanford Human Genome Center, Palo Alto, CA, USA
  15. University of Oklahoma's Advanced Center for Genome Technology, OK, USA
  16. University of Texas Southwestern Medical Center at Dallas, TX, USA
  17. University of Washington Genome Center, Seattle, WA, USA
  18. Multimegabase Sequencing Center, Institute for Systems Biology, Seattle, WA, USA
  19. Whitehead Institute for Biomedical Research, MIT, Cambridge, MA, USA
  20. Washington University Genome Sequencing Center, St. Louis, MO, USA