This directory contains FASTA files holding a modified version
of the Build 36.1 finished human genome assembly (hg18,
Mar. 2006). The chromosomal sequences were assembled by the
International Human Genome Project sequencing centers.  The hg18/36.1
assembly was changed to use IUPAC ambiguous nucleotide characters at
each base covered by a stringently filtered subset of single-base
substitutions annotated by dbSNP build 128.  For example, if the
assembly has an 'A' at a position where dbSNP has annotated an A/C/T
substitution SNP, the 'A' is replaced by 'H' in the FASTA file here.  
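
As an illustration of the replacement rule above, here is a minimal shell
sketch that maps a set of alleles to its IUPAC character. The function name
iupac_code is hypothetical and is not part of any UCSC tool. It assumes the
alleles are given in slash-separated form (e.g. A/C/T) and that the
reference base is included in the set.

```shell
# Map a slash-separated allele set (e.g. "A/C/T") to its IUPAC code.
# Hypothetical helper for illustration only.
iupac_code() {
    # Normalise: uppercase, one allele per line, sort, de-duplicate, rejoin.
    key=$(printf '%s' "$1" | tr 'a-z' 'A-Z' | tr '/' '\n' | LC_ALL=C sort -u | tr -d '\n')
    case "$key" in
        A|C|G|T) printf '%s\n' "$key" ;;   # a single allele is left unchanged
        AG)   echo R ;;
        CT)   echo Y ;;
        CG)   echo S ;;
        AT)   echo W ;;
        GT)   echo K ;;
        AC)   echo M ;;
        CGT)  echo B ;;
        AGT)  echo D ;;
        ACT)  echo H ;;
        ACG)  echo V ;;
        ACGT) echo N ;;
        *)    echo "unrecognised allele set: $1" >&2; return 1 ;;
    esac
}
```

For the example in the text, "iupac_code A/C/T" prints H.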

dbSNP single-base substitutions were excluded from masking in the
following cases:
- UCSC tagged the dbSNP item with any of these exceptions (see also
  hg18.snp128Exceptions and hg18.snp128ExceptionDesc database tables):
  - MultipleAlignments: dbSNP mapped item to multiple locations
  - ObservedMismatch: the reference allele does not appear in the item's
    observed alleles.
  - ObservedWrongFormat: the observed sequence has an unexpected format
    (no instances of this exception were found in snp128)
- dbSNP item class is not "single".
- dbSNP item length is not exactly one base.
- dbSNP item weight is greater than 1.  (lower weight = higher confidence)
The remaining single-base substitutions were used to mask the genomic 
sequence.
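
The exclusion rules above can be sketched as a small awk filter. This is
not the pipeline UCSC used. It assumes a hypothetical, pre-joined
tab-separated input with five columns (name, class, length, weight,
exceptions). The real hg18.snp128 and hg18.snp128Exceptions tables have
different layouts and would have to be joined into this form first. The
function name filter_maskable_snps is likewise hypothetical.

```shell
# Print the names of items that survive every exclusion rule.
# Input columns (tab-separated, hypothetical layout):
#   1 name  2 class  3 length  4 weight  5 exceptions (comma-separated, may be empty)
filter_maskable_snps() {
    awk -F '\t' '
        # Skip items tagged with any of the listed exceptions.
        $5 ~ /MultipleAlignments|ObservedMismatch|ObservedWrongFormat/ { next }
        $2 != "single" { next }   # class is not "single"
        $3 != 1        { next }   # length is not exactly one base
        $4 > 1         { next }   # weight is greater than 1
        { print $1 }              # everything else is used for masking
    ' "$@"
}
```

The function reads the named files, or standard input when no file is
given, and writes one surviving item name per line.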

Files included in this directory:

chr*.subst.fa.gz - FASTA files with IUPAC characters for substitution SNPs

md5sum.txt - checksums of files in this directory
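
After downloading, the checksums can be used to confirm that the files
arrived intact. The sketch below assumes that md5sum.txt is in the usual
output format of the md5sum command (checksum, whitespace, file name) and
that GNU md5sum is installed. The function name verify_downloads is
hypothetical.

```shell
# Check every file listed in md5sum.txt inside the given directory
# (default: current directory). Returns non-zero on any mismatch.
verify_downloads() {
    ( cd "${1:-.}" && md5sum -c md5sum.txt )
}
```

Files listed in md5sum.txt but not present are reported as failures. If
only some of the files were downloaded, recent GNU coreutils versions
accept "md5sum --ignore-missing -c md5sum.txt" to skip the absent ones.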

------------------------------------------------------------------
If you plan to download a large file or multiple files from this
directory, we recommend that you use ftp rather than downloading the
files via our website. To do so, ftp to hgdownload.cse.ucsc.edu
[username: anonymous, password: your email address], then cd to the
directory goldenPath/hg18/snp128Mask. To download multiple files, use
the "mget" command:

    mget <filename1> <filename2> ...
    - or -
    mget -a (to download all the files in the directory)

Alternative methods to ftp access:

Using an rsync command to download the entire directory:
    rsync -avzP rsync://hgdownload.cse.ucsc.edu/goldenPath/hg18/snp128Mask/ .
For a single file, e.g. chr1.subst.fa.gz:
    rsync -avzP \
        rsync://hgdownload.cse.ucsc.edu/goldenPath/hg18/snp128Mask/chr1.subst.fa.gz .

Or with wget, all files:
    wget --timestamping \
        'ftp://hgdownload.cse.ucsc.edu/goldenPath/hg18/snp128Mask/*'
With wget, a single file:
    wget --timestamping \
        'ftp://hgdownload.cse.ucsc.edu/goldenPath/hg18/snp128Mask/chr1.subst.fa.gz' \
        -O chr1.subst.fa.gz

To uncompress the fa.gz files:
    gunzip <file>.fa.gz
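
The files can also be inspected without writing an uncompressed copy to
disk. As a rough sketch, the function below streams a file through
"gunzip -c" and counts the bases that carry an IUPAC ambiguity code other
than N, which gives a quick check of how many positions were masked. The
function name count_masked_bases is hypothetical, and counting both upper
and lower case characters is an assumption about how the files are written.

```shell
# Count bases in a gzipped FASTA file that are IUPAC ambiguity codes
# other than N (R, Y, S, W, K, M, B, D, H, V), in either case.
count_masked_bases() {
    gunzip -c "$1" |
        grep -v '^>' |                       # drop FASTA header lines
        tr -cd 'RYSWKMBDHVryswkmbdhv' |      # keep only the ambiguity codes
        wc -c |
        tr -d ' '                            # strip padding some wc versions add
}
```

Example: count_masked_bases chrM.subst.fa.gz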

Name                        Last modified      Size
md5sum.txt                  2008-02-22 16:04   1.4K
chrM.subst.fa.gz            2008-01-30 15:43   6.1K
chr5_h2_hap1.subst.fa.gz    2008-01-30 15:42   550K
chr6_qbl_hap2.subst.fa.gz   2008-01-30 15:42   1.3M
chr6_cox_hap1.subst.fa.gz   2008-01-30 15:42   1.5M
chrY.subst.fa.gz            2008-01-30 15:43   7.9M
chr21.subst.fa.gz           2008-01-30 15:41   11M
chr22.subst.fa.gz           2008-01-30 15:41   11M
chr19.subst.fa.gz           2008-01-30 15:40   17M
chr20.subst.fa.gz           2008-01-30 15:40   19M
chr18.subst.fa.gz           2008-01-30 15:40   24M
chr17.subst.fa.gz           2008-01-30 15:40   24M
chr16.subst.fa.gz           2008-01-30 15:39   25M
chr15.subst.fa.gz           2008-01-30 15:39   26M
chr14.subst.fa.gz           2008-01-30 15:39   28M
chr13.subst.fa.gz           2008-01-30 15:39   31M
chr9.subst.fa.gz            2008-01-30 15:43   38M
chr12.subst.fa.gz           2008-01-30 15:39   41M
chr11.subst.fa.gz           2008-01-30 15:39   42M
chr10.subst.fa.gz           2008-01-30 15:38   42M
chr8.subst.fa.gz            2008-01-30 15:43   45M
chrX.subst.fa.gz            2008-01-30 15:43   48M
chr7.subst.fa.gz            2008-01-30 15:43   49M
chr6.subst.fa.gz            2008-01-30 15:42   53M
chr5.subst.fa.gz            2008-01-30 15:42   57M
chr4.subst.fa.gz            2008-01-30 15:41   60M
chr3.subst.fa.gz            2008-01-30 15:41   62M
chr1.subst.fa.gz            2008-01-30 15:38   71M
chr2.subst.fa.gz            2008-01-30 15:40   76M