This directory contains download files of the Saccharomyces cerevisiae
genome sequence and associated annotations. The data is based on
sequence dated June 2008 in the Saccharomyces Genome Database
(http://www.yeastgenome.org/) and was obtained from the site
    http://downloads.yeastgenome.org/sequence/genomic_sequence/chromosomes/fasta/
The S288C strain was used in this sequencing project.

Files included in this directory:

sacCer2.2bit - contains the complete genome sequence in the 2bit file
    format. The utility program, twoBitToFa (available from the kent
    src tree), can be used to extract .fa file(s) from this file.
    A pre-compiled version of the command line tool can be found at:
        http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/
    See also:
        http://genome.ucsc.edu/admin/git.html
        http://genome.ucsc.edu/admin/jk-install.html

chromAgp.tar.gz - contains the list of accession identifiers for each
    chromosome, unpacking to one file per chromosome.

chromFa.tar.gz - The assembly sequence in one file per chromosome.
    No masking has been applied to these sequences. There are NO
    RepeatMasker .out files for this assembly.

chromTrf.tar.gz - Tandem Repeats Finder locations, filtered to keep
    repeats with period less than or equal to 12, and translated into
    UCSC's BED format (one file per chromosome).

est.fa.gz - S. cerevisiae ESTs in GenBank. This sequence data is
    updated once a week via automatic GenBank updates.

md5sum.txt - checksums of files in this directory

mrna.fa.gz - S. cerevisiae mRNA from GenBank. This sequence data is
    updated once a week via automatic GenBank updates.

sgdGene.upstream*.fa.gz - Saccharomyces Genome Database genes upstream
    sequences, 1000, 2000 and 5000 bases

sacCer2.chrom.sizes - Two-column tab-separated text file containing
    assembly sequence names and sizes.
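As an illustration of how the 2bit and chrom.sizes files described
above might be used, here is a minimal shell sketch. The function names
(total_assembly_size, largest_first) and the output name sacCer2.fa are
hypothetical and chosen only for this example. The sketch assumes the
files are in the current directory and that twoBitToFa is on the PATH;
it does nothing if they are missing.

```shell
#!/bin/sh
# Hypothetical helpers for a two-column, tab-separated chrom.sizes file
# (column 1: sequence name, column 2: size in bases).

# Print the sum of all sequence sizes in the given file.
total_assembly_size() {
    awk -F'\t' '{ total += $2 } END { printf "%d\n", total }' "$1"
}

# Print the file's lines ordered by size, largest sequence first.
largest_first() {
    sort -k2,2nr "$1"
}

# Report on the assembly only if the sizes file is actually present.
if [ -f sacCer2.chrom.sizes ]; then
    total_assembly_size sacCer2.chrom.sizes
    largest_first sacCer2.chrom.sizes
fi

# Extract the whole genome to FASTA, if the tool and the 2bit file exist.
# sacCer2.fa is an output name picked for this example.
if command -v twoBitToFa >/dev/null 2>&1 && [ -f sacCer2.2bit ]; then
    twoBitToFa sacCer2.2bit sacCer2.fa
fi
```

Defining the helpers as functions keeps them usable on any file in the
same two-column format, not just sacCer2.chrom.sizes.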
------------------------------------------------------------------
If you plan to download a large file or multiple files from this
directory, we recommend that you use ftp rather than downloading the
files via our website. To do so, ftp to hgdownload.cse.ucsc.edu
[username: anonymous, password: your email address], then cd to the
directory goldenPath/sacCer2/bigZips. To download multiple files, use
the "mget" command:

    mget <filename1> <filename2> ...
    - or -
    mget -a (to download all the files in the directory)

Alternate methods to ftp access.

Using an rsync command to download the entire directory:
    rsync -avzP rsync://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips/ .
For a single file, e.g. chromFa.tar.gz
    rsync -avzP \
        rsync://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips/chromFa.tar.gz .

Or with wget, all files:
    wget --timestamping \
        'ftp://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips/*'
With wget, a single file:
    wget --timestamping \
        'ftp://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips/chromFa.tar.gz' \
        -O chromFa.tar.gz

To unpack the *.tar.gz files:
    tar xvzf <file>.tar.gz
To uncompress the fa.gz files:
    gunzip <file>.fa.gz

All the tables in this directory are freely available for public use.
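After downloading, a file can be checked against md5sum.txt before it
is unpacked. The following is a minimal sketch, not an official UCSC
procedure. The function name verify_md5 is hypothetical. The sketch
assumes that md5sum.txt uses the usual md5sum output layout (hash, then
file name), that the file of interest is listed in it, and that the
md5sum program is installed.

```shell
#!/bin/sh
# verify_md5 FILE SUMS
# Hypothetical helper: succeeds (status 0) if FILE's MD5 digest matches
# the entry for FILE's base name in SUMS. Returns 2 if SUMS has no entry
# for the file, and 1 if the digests differ.
verify_md5() {
    file=$1
    sums=$2
    name=$(basename "$file")
    # Each line of SUMS is assumed to look like "<hash>  <name>";
    # md5sum's binary mode prefixes the name with '*', so strip that.
    expected=$(awk -v n="$name" '
        { f = $2; sub(/^\*/, "", f); if (f == n) { print $1; exit } }
    ' "$sums")
    [ -n "$expected" ] || return 2
    actual=$(md5sum "$file" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
}
```

For example, once chromFa.tar.gz and md5sum.txt have been downloaded:

    verify_md5 chromFa.tar.gz md5sum.txt && tar xvzf chromFa.tar.gz

The archive is unpacked only if the checksum matches.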
Name                          Last modified       Size
xenoRefMrna.fa.gz.md5         2019-10-17 21:04      52
xenoRefMrna.fa.gz             2019-10-17 21:04    331M
upstream5000.fa.gz.md5        2019-10-17 21:04      53
upstream5000.fa.gz            2019-10-17 21:04     73K
upstream2000.fa.gz.md5        2019-10-17 21:04      53
upstream2000.fa.gz            2019-10-17 21:04     30K
upstream1000.fa.gz.md5        2019-10-17 21:04      53
upstream1000.fa.gz            2019-10-17 21:04     16K
sgdGene.upstream5000.fa.gz    2009-07-28 12:37    8.1M
sgdGene.upstream2000.fa.gz    2009-07-28 12:37    3.8M
sgdGene.upstream1000.fa.gz    2009-07-28 12:37    2.1M
sacCer2.fa.gz                 2020-01-23 02:26    3.6M
sacCer2.chrom.sizes           2009-02-03 14:05     242
sacCer2.2bit                  2009-02-03 14:05    2.9M
mrna.fa.gz.md5                2019-10-17 21:00      45
mrna.fa.gz                    2019-10-17 21:00    111K
md5sum.txt                    2012-01-09 13:19     434
genes/                        2020-02-05 13:47       -
est.fa.gz.md5                 2019-10-17 21:04      44
est.fa.gz                     2019-10-17 21:04    6.2M
chromTrf.tar.gz               2009-02-24 15:40     20K
chromFa.tar.gz                2009-02-24 15:40    3.6M
chromAgp.tar.gz               2009-02-24 15:40     711