This file is from:
    http://hgdownload.cse.ucsc.edu/goldenPath/rn6/multiz20way/README.txt

This directory contains compressed multiple alignments of the
following assemblies to the Rat genome (rn6, Jul. 2014):

Assemblies used in these alignments:

Rat             Rattus norvegicus         Jul. 2014  (RGSC 6.0/rn6), reference
Mouse           Mus musculus              Dec. 2011  (GRCm38/mm10)
Prairie vole    Microtus ochrogaster      Oct. 2012  (MicOch1.0/micOch1)
Guinea pig      Cavia porcellus           Feb. 2008  (Broad/cavPor3)
Rabbit          Oryctolagus cuniculus     Apr. 2009  (Broad/oryCun2)
Human           Homo sapiens              Dec. 2013  (GRCh38/hg38)
Chimp           Pan troglodytes           May 2016   (Pan_tro 3.0/panTro5)
Rhesus          Macaca mulatta            Nov. 2015  (BCM Mmul_8.0.1/rheMac8)
Tarsier         Tarsius syrichta          Sep. 2013  (Tarsius_syrichta-2.0.1/tarSyr2)
Dog             Canis lupus familiaris    Sep. 2011  (Broad CanFam3.1/canFam3)
Panda           Ailuropoda melanoleuca    Dec. 2009  (BGI-Shenzhen 1.0/ailMel1)
Cat             Felis catus               Nov. 2014  (ICGSC Felis_catus_8.0/felCat8)
Cow             Bos taurus                Jun. 2014  (Bos_taurus_UMD_3.1.1/bosTau8)
Opossum         Monodelphis domestica     Oct. 2006  (Broad/monDom5)
Platypus        Ornithorhynchus anatinus  Feb. 2007  (ASM227v2/ornAna2)
Chicken         Gallus gallus             Dec. 2015  (Gallus_gallus-5.0/galGal5)
Turkey          Meleagris gallopavo       Nov. 2014  (Turkey_5.0/melGal5)
X. tropicalis   Xenopus tropicalis        Sep. 2012  (JGI 7.0/xenTro7)
Zebrafish       Danio rerio               Sep. 2014  (GRCz10/danRer10)
Elephant shark  Callorhinchus milii       Dec. 2013  (Callorhinchus_milii-6.1.3/calMil1)

These alignments were prepared using the methods described in the
track description file:
    http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=rn6&g=cons20way
based on the phylogenetic tree: rn6.20way.nh.

Files in this directory:

  - rn6.20way.nh - phylogenetic tree used during the multiz multiple
        alignment
  - rn6.20way.commonNames.nh - same as rn6.20way.nh with the UCSC
        database names replaced by the common name for the species
  - rn6.20way.scientificNames.nh - same as rn6.20way.nh with the UCSC
        database names replaced by the scientific name for the species
  - upstream*.ensGene.maf.gz - alignments of regions upstream of
        Ensembl genes
  - rn6.20way.maf.gz - the multiple alignments on the Rat genome
  - md5sum.txt - md5 checksums of these files, for verifying that
        they downloaded correctly

The "alignments" directory contains compressed FASTA alignments of the
CDS regions of the ensGene gene track (v86/Oct. 2016 version) of the
rat genome (rn6, Jul. 2014) aligned to the assemblies.

The rn6.20way.maf.gz file contains all the alignments for the
chromosomes in the rat genome, including additional annotations to
indicate gap context and genomic breaks for the sequence in the
underlying genome assemblies. Note that the compressed size of the maf
file is 7.2 Gb; uncompressed, it is more than 46 Gb.

The upstream*.ensGene.maf.gz files contain alignments in regions
upstream of annotated transcription starts for version v86/Oct. 2016
Ensembl genes with annotated 5' UTRs. These files differ from the
standard MAF format: they display alignments that extend from start to
end of the upstream region in the rat, whether or not alignments
actually exist. Where no alignments exist, or the alignments of one or
more species are missing, a dot (".") is used as a placeholder.
Multiple regions of an assembly's sequence may align to a single region
in rat; therefore, only the species name is displayed in the alignment
data and no position information is recorded. The alignment score is
always zero in these files.

For a description of multiple alignment format (MAF), see
http://genome.ucsc.edu/goldenPath/help/maf.html.
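
Because rn6.20way.maf.gz expands to more than 46 Gb, it is usually
easier to stream it than to decompress it to disk. The following is a
minimal Python sketch, not a UCSC tool. It assumes the standard MAF
layout described at the help page above: "a" lines that open a block,
"s" lines with seven whitespace-separated fields, and blank lines
between blocks. The names iter_maf_blocks and count_blocks are
hypothetical. The upstream*.ensGene.maf.gz files deviate from standard
MAF, as noted above, so their "s" lines may need different handling.

```python
import gzip
from typing import Dict, Iterator, TextIO


def iter_maf_blocks(handle: TextIO) -> Iterator[Dict]:
    """Yield one dict per alignment block from a MAF text stream."""
    block = None
    for raw in handle:
        fields = raw.split()
        if fields and fields[0] == "a":
            # An "a" line opens a new block; flush any block still open.
            if block is not None:
                yield block
            attrs = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
            block = {"score": float(attrs.get("score", "nan")), "rows": []}
        elif fields and fields[0] == "s" and block is not None:
            # Assumed layout: s src start size strand srcSize text
            if len(fields) != 7:
                continue  # skip rows that do not match the assumed layout
            _, src, start, size, strand, src_size, text = fields
            block["rows"].append({
                "src": src,
                "start": int(start),
                "size": int(size),
                "strand": strand,
                "src_size": int(src_size),
                "text": text,
            })
        elif not fields and block is not None:
            # A blank line closes the current block.
            yield block
            block = None
        # Other line types ("i", "e", "q", "#" comments) are ignored here.
    if block is not None:
        yield block


def count_blocks(path: str) -> int:
    """Count alignment blocks in a gzip-compressed MAF file, streaming."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in iter_maf_blocks(handle))
```

For example, count_blocks("rn6.20way.maf.gz") would walk the whole
file one block at a time without holding it in memory.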

PhastCons conservation scores for these alignments are available at:
    http://hgdownload.cse.ucsc.edu/goldenPath/rn6/phastCons20way

PhyloP conservation scores for these alignments are available at:
    http://hgdownload.cse.ucsc.edu/goldenPath/rn6/phyloP20way

---------------------------------------------------------------
To download a large file or multiple files from this directory, we
recommend that you use rsync or ftp rather than downloading the files
via our website. There is approximately 7.9 Gb of compressed data in
this directory.

Via rsync:
    rsync -av --progress \
        rsync://hgdownload.cse.ucsc.edu/goldenPath/rn6/multiz20way/ ./

Via FTP:
    ftp hgdownload.cse.ucsc.edu
    user name: anonymous
    password: <your email address>
    go to the directory goldenPath/rn6/multiz20way

To download multiple files from the UNIX command line, use the "mget"
command.
    mget <filename1> <filename2> ...
    - or -
    mget -a (to download all the files in the directory)

Use the "prompt" command to toggle the interactive mode if you do not
want to be prompted for each file that you download.

---------------------------------------------------------------
All the files in this directory are freely usable for any purpose. For
data use restrictions regarding the individual genome assemblies, see
http://genome.ucsc.edu/goldenPath/credits.html.

Name                            Last modified      Size
alignments/                     2017-01-30 11:01   -
md5sum.txt                      2017-01-24 15:16   406
rn6.20way.commonNames.nh        2017-01-24 13:25   674
rn6.20way.nh                    2017-01-24 13:25   683
rn6.20way.scientificNames.nh    2017-01-24 13:25   862
upstream1000.ensGene.maf.gz     2017-01-24 14:53   52M
upstream2000.ensGene.maf.gz     2017-01-24 14:57   103M
upstream5000.ensGene.maf.gz     2017-01-24 15:02   220M
rn6.20way.maf.gz                2017-01-22 10:39   7.2G
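
After downloading, the files can be checked against md5sum.txt. The
Python sketch below is one way to do that. It assumes md5sum.txt uses
the common "<digest>  <filename>" layout written by the md5sum utility;
that layout is not shown in this README. The function names are
hypothetical. Files are read in chunks so that the 7.2G
rn6.20way.maf.gz never has to fit in memory.

```python
import hashlib
from pathlib import Path
from typing import Dict


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum(text: str) -> Dict[str, str]:
    """Map file name to expected digest (assumed md5sum-style lines)."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            digest, name = parts
            # md5sum marks binary-mode entries with a leading "*".
            expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def verify_directory(directory: Path,
                     manifest: str = "md5sum.txt") -> Dict[str, bool]:
    """Return {file name: True if present and its digest matches}."""
    expected = parse_md5sum((directory / manifest).read_text())
    return {
        name: (directory / name).is_file()
        and md5_of_file(directory / name) == digest
        for name, digest in expected.items()
    }
```

Calling verify_directory(Path(".")) in the download directory returns
one entry per file listed in md5sum.txt; any False value marks a file
that is missing or whose checksum does not match.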