This directory contains the mappings from the sequence names used in genomes
distributed by other websites to the names used by UCSC for the genome
hg19.p13.plusMT.

Each line of a mapping file has the format:

    <otherGenomeSeqName> <tab> <ucscName>

Text files such as .bed, .sam or .bedGraph that contain sequence identifiers
from other genome versions can be converted with these mapping files and our
small tool chromToUcsc.

Example:

    wget https://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/chromToUcsc
    wget https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/analysisSet/chromAlias/ncbiToUcsc.txt
    chmod a+x chromToUcsc
    ./chromToUcsc -i test2.bed -o test2.ucsc.bed -a ncbiToUcsc.txt

The genomes used were:

g1k:
    ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/reference/human_g1k_v37.fasta.gz
NCBI:
    https://ftp.ncbi.nlm.nih.gov/genomes/archive/old_genbank/Eukaryotes/vertebrates_mammals/Homo_sapiens/GRCh37.p13/seqs_for_alignment_pipelines/GCA_000001405.14_GRCh37.p13_full_analysis_set.fna.gz
Ensembl:
    http://ftp.ensembl.org/pub/grch37/release-99/fasta/homo_sapiens/dna/Homo_sapiens.GRCh37.dna_sm.primary_assembly.fa.gz

A full log of the Unix commands that were run to create these files is, as
always, available in our makeDoc directory:

    https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg19.analysisSet.txt
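To make the two-column mapping format described above concrete, here is a minimal sketch of the kind of renaming such a file allows. It is not the chromToUcsc tool and makes no claim about how that tool works internally. The function name rename_chroms and the toy data are hypothetical, and the sketch assumes a tab-separated file whose sequence name is in the first column (as in .bed or .bedGraph), so it does not cover .sam.

```shell
# rename_chroms: hypothetical helper, NOT the chromToUcsc tool.
# Reads tab-separated lines on stdin, rewrites column 1 using the alias
# table given as $1, and writes the result to stdout.
rename_chroms() {
    awk -F'\t' -v OFS='\t' '
        # First input (the alias table): remember otherName -> ucscName.
        NR == FNR { ucsc[$1] = $2; next }
        # Pass comment, track and browser lines through unchanged.
        /^(#|track|browser)/ { print; next }
        # Fail loudly on a sequence name that has no mapping.
        !($1 in ucsc) { print "no mapping for: " $1 > "/dev/stderr"; exit 1 }
        # Replace the sequence name and print the rebuilt line.
        { $1 = ucsc[$1]; print }
    ' "$1" -
}

# Toy data, made up for illustration only.
alias_file=$(mktemp)
printf '1\tchr1\n2\tchr2\n' > "$alias_file"
printf '1\t100\t200\n2\t300\t400\n' | rename_chroms "$alias_file"
rm -f "$alias_file"
```

The sketch stops with an error on any sequence name missing from the mapping file, so an unconverted name cannot pass through unnoticed. For real data, chromToUcsc remains the tool to use.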
Name               Last modified     Size
g1kToUcsc.txt      2020-03-09 08:25  1.8K
ensemblToUcsc.txt  2020-03-09 09:18  1.8K
ncbiToUcsc.txt     2020-03-09 09:36  10K