This directory contains human/chimpanzee "reciprocal best" alignments made
using the Jul. 2003 human assembly (NCBI Build 34, UCSC hg16) and the
13 Nov. 2003 Arachne 4x draft chimp assembly from the Broad Institute,
MIT/Harvard, with sequence provided by the Broad Institute and Washington
University, St. Louis. The Chimpanzee Genome Sequencing project was
sponsored by NHGRI.

The alignments are in 'axt' format. Each alignment contains three lines
and is separated from the next alignment by a blank line:

  Line 1 - summarizes the alignment.
  Line 2 - contains the human sequence with inserts.
  Line 3 - contains the chimp sequence with inserts.

The summary line contains 9 blank-separated fields with the following
meanings:

  1 - Alignment number. The first alignment in a file is numbered 0,
      the next 1, and so forth.
  2 - Human chromosome.
  3 - Start in human chromosome. The first base is numbered 1.
  4 - End in human chromosome. The end base is included.
  5 - Chimp scaffold.
  6 - Start in chimp scaffold.
  7 - End in chimp scaffold.
  8 - Chimp strand. If this is '-', the chimp start/end fields are
      relative to the reverse-complemented chimp scaffold.
  9 - Blastz score.

The scoring matrix blastz uses is:

          A     C     G     T
    A    91  -114   -31  -123
    C  -114   100  -125   -31
    G   -31  -125   100  -114
    T  -123   -31  -114    91

with a gap open penalty of 400 and a gap extension penalty of 30. The
minimum score for an alignment to be kept was 3000 for the first pass and
2200 for the second pass, which restricts the search space to the regions
between two alignments found in the first pass.

The alignments were done with blastz, which is available from Webb
Miller's group at Pennsylvania State University (PSU). Each chromosome was
divided into 10,010,000-base chunks with 10,000 bases of overlap. The .lav
format blastz output, which does not include the sequence, was converted
to .axt with PSU's lavToAxt. The axtBest subset covers 40% of the human
genome.
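As an illustration of the format described above, the following is a minimal Python sketch of a reader for these files, plus a rough rescoring helper built from the matrix and gap penalties quoted in this README. The names AxtBlock, parse_axt and rescore are hypothetical and are not part of blastz or any UCSC tool. The sketch assumes that gaps ("inserts") appear as '-' in the sequence lines, that a gap of length k costs 400 + 30*k, and that columns containing anything other than A, C, G or T contribute nothing to the score. Because of those assumptions, the recomputed value may not match field 9 exactly.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Substitution scores copied from the matrix in this README
# (row = human base, column = chimp base, order A C G T).
BASES = "ACGT"
MATRIX = [
    [91, -114, -31, -123],
    [-114, 100, -125, -31],
    [-31, -125, 100, -114],
    [-123, -31, -114, 91],
]
GAP_OPEN = 400
GAP_EXTEND = 30


@dataclass
class AxtBlock:
    number: int
    human_chrom: str
    human_start: int      # first base is numbered 1
    human_end: int        # end base is included
    chimp_scaffold: str
    chimp_start: int
    chimp_end: int
    chimp_strand: str     # '-' means coordinates are on the reverse complement
    score: int
    human_seq: str
    chimp_seq: str


def parse_axt(lines: Iterable[str]) -> Iterator[AxtBlock]:
    """Yield one AxtBlock per three-line alignment, skipping blank lines."""
    content = (ln.strip() for ln in lines)
    content = (ln for ln in content if ln)
    for summary in content:
        fields = summary.split()
        if len(fields) != 9:
            raise ValueError(f"expected 9 summary fields: {summary!r}")
        human_seq = next(content, None)
        chimp_seq = next(content, None)
        if human_seq is None or chimp_seq is None:
            raise ValueError("file ends in the middle of an alignment")
        if len(human_seq) != len(chimp_seq):
            raise ValueError("sequence lines differ in length")
        yield AxtBlock(
            int(fields[0]), fields[1], int(fields[2]), int(fields[3]),
            fields[4], int(fields[5]), int(fields[6]), fields[7],
            int(fields[8]), human_seq, chimp_seq,
        )


def rescore(human_seq: str, chimp_seq: str) -> int:
    """Approximate score: matrix sum minus (400 + 30*k) per gap of length k."""
    score = 0
    gap_side = None  # which sequence currently carries the gap, if any
    for h, c in zip(human_seq.upper(), chimp_seq.upper()):
        if h == "-" or c == "-":
            side = "human" if h == "-" else "chimp"
            if side != gap_side:
                score -= GAP_OPEN      # a new gap starts here
            score -= GAP_EXTEND        # every gap column pays the extension
            gap_side = side
        else:
            gap_side = None
            if h in BASES and c in BASES:  # assumption: other symbols score 0
                score += MATRIX[BASES.index(h)][BASES.index(c)]
    return score
```

To read one of the compressed files in this directory, the parser can be given a text-mode handle, for example `gzip.open("chr22.axt.gz", "rt")` from the Python standard library. Converting '-' strand chimp coordinates back to the forward strand needs the scaffold length, which is not stored in the axt file.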
Name                  Last modified      Size  Description

Parent Directory                            -
chr5_random.axt.gz    2003-12-19 13:41    356
chr18_random.axt.gz   2003-12-19 13:39    685
md5sum.txt            2003-12-19 19:20   2.0K
chrM.axt.gz           2003-12-19 13:42   3.8K
chr13_random.axt.gz   2003-12-19 13:38   4.1K
chr3_random.axt.gz    2003-12-19 13:40   4.2K
chr19_random.axt.gz   2003-12-19 13:39   5.6K
chr15_random.axt.gz   2003-12-19 13:38    40K
chr4_random.axt.gz    2003-12-19 13:40    41K
chr2_random.axt.gz    2003-12-19 13:40    48K
chr7_random.axt.gz    2003-12-19 13:41    70K
chr8_random.axt.gz    2003-12-19 13:41    81K
chrX_random.axt.gz    2003-12-19 13:42   117K
chr6_random.axt.gz    2003-12-19 13:41   131K
chr17_random.axt.gz   2003-12-19 13:39   146K
chr10_random.axt.gz   2003-12-19 13:37   180K
chr1_random.axt.gz    2003-12-19 13:39   259K
chr9_random.axt.gz    2003-12-19 13:42   261K
chrUn_random.axt.gz   2003-12-19 13:42   519K
chrY.axt.gz           2003-12-19 13:42   3.9M
chr22.axt.gz          2003-12-19 13:40    12M
chr21.axt.gz          2003-12-19 13:40    13M
chr19.axt.gz          2003-12-19 13:39    19M
chr20.axt.gz          2003-12-19 13:39    24M
chr17.axt.gz          2003-12-19 13:39    29M
chr16.axt.gz          2003-12-19 13:38    30M
chr18.axt.gz          2003-12-19 13:39    31M
chr15.axt.gz          2003-12-19 13:38    32M
chr14.axt.gz          2003-12-19 13:38    35M
chrX.axt.gz           2003-12-19 13:42    38M
chr13.axt.gz          2003-12-19 13:38    39M
chr9.axt.gz           2003-12-19 13:42    45M
chr12.axt.gz          2003-12-19 13:38    51M
chr11.axt.gz          2003-12-19 13:38    52M
chr10.axt.gz          2003-12-19 13:37    52M
chr8.axt.gz           2003-12-19 13:41    58M
chr7.axt.gz           2003-12-19 13:41    60M
chr6.axt.gz           2003-12-19 13:41    68M
chr5.axt.gz           2003-12-19 13:41    72M
chr4.axt.gz           2003-12-19 13:40    76M
chr3.axt.gz           2003-12-19 13:40    80M
chr1.axt.gz           2003-12-19 13:37    87M
chr2.axt.gz           2003-12-19 13:39    96M