This directory contains the two sequences that should be used to align ENCODE sequence data to the GRCh37/hg19 reference genome. There is one sequence for female DNA and one for male DNA. The only difference between them is that the male sequence includes chrY, with the PAR regions hard-masked (with N's).

Both sequences are composed of all the autosomes plus the X chromosome from GRCh37. The male sequence also includes the chrY sequence from GRCh37. None of the random chromosomes, chrUn chromosomes, or haplotype chromosomes are included in either reference sequence.

The mitochondrial genome included in both references is the chrM sequence currently in use on the UCSC hg19 browser (NC_001807). This sequence is NOT considered to be the most representative human mitochondrial sequence. See this site for a discussion of what has changed between the old reference mitochondrial sequence and the new reference mitochondrial sequence:
http://mitomap.org/bin/view/MITOMAP/HumanMitoSeq

This directory also contains the ENCODE pilot regions lifted to hg19.
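The composition described above can be checked against a chrom.sizes file from this directory. The following is a minimal sketch, not an official tool. It assumes that each line of a chrom.sizes file starts with a sequence name followed by whitespace and a length, and that the sequences use UCSC-style names (chr1 to chr22, chrX, chrY, chrM). The function names are hypothetical.

```python
# Sketch only: assumes UCSC-style names and a "name<whitespace>size" layout.
AUTOSOMES = [f"chr{i}" for i in range(1, 23)]
EXPECTED_FEMALE = set(AUTOSOMES) | {"chrX", "chrM"}
EXPECTED_MALE = EXPECTED_FEMALE | {"chrY"}


def read_chrom_names(handle):
    """Return the set of sequence names found in a chrom.sizes-style stream."""
    names = set()
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        names.add(line.split()[0])  # first field is the sequence name
    return names


def check_composition(names, sex):
    """Return (missing, unexpected) sequence names for the 'male' or 'female' set."""
    expected = EXPECTED_MALE if sex == "male" else EXPECTED_FEMALE
    return sorted(expected - names), sorted(names - expected)
```

For example, calling check_composition(read_chrom_names(open("male.hg19.chrom.sizes")), "male") should return two empty lists if the file matches the description above. Any random, chrUn, or haplotype sequence would appear in the second list.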
Name                          Last modified       Size
encodePilotRegions.hg19.bed   07-Mar-2011 08:39   1.3K
female.hg19.2bit              27-Jan-2010 13:04   754M
female.hg19.chrom.sizes       10-Mar-2010 11:45   362
female.hg19.fa.gz             27-Jan-2010 12:59   886M
femaleByChrom/                09-Apr-2010 16:18   -
male.hg19.2bit                27-Jan-2010 13:03   768M
male.hg19.chrom.sizes         10-Mar-2010 11:45   376
male.hg19.fa.gz               27-Jan-2010 12:56   893M
maleByChrom/                  04-Feb-2010 18:48   -
md5sum.txt                    26-Aug-2010 11:48   361
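
After downloading, the files can be compared against md5sum.txt. The following is a minimal sketch. It assumes that md5sum.txt uses the usual md5sum output layout (a hex checksum, whitespace, then a file name, optionally prefixed with "*"); the contents of that file are not shown here, so this is an assumption. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum(text):
    """Map file name -> checksum, assuming '<checksum> <name>' per line."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        checksum, name = parts
        expected[name.strip().lstrip("*")] = checksum.lower()
    return expected


def verify(directory, listing_name="md5sum.txt"):
    """Return {name: True/False/None}; None means the file is not present."""
    directory = Path(directory)
    expected = parse_md5sum((directory / listing_name).read_text())
    results = {}
    for name, checksum in expected.items():
        target = directory / name
        if target.is_file():
            results[name] = md5_of_file(target) == checksum
        else:
            results[name] = None
    return results
```

Calling verify(".") in the download directory returns one entry per file named in md5sum.txt. Reading in chunks keeps memory use low for the larger files such as male.hg19.fa.gz.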