The file sacCer3.gc5Base.varStep.txt.gz contains the raw data used to encode
the gc5Base track on sacCer3. It was produced from the sacCer3.2bit
sequence with the kent source tree utility hgGcPercent, as follows:

    hgGcPercent -wigOut -doGaps -file=stdout -win=5 sacCer3 sacCer3.2bit \
        | gzip > sacCer3.gc5Base.varStep.txt.gz

The data are in variableStep wiggle format, described at:
    http://genome.ucsc.edu/goldenPath/help/wiggle.html
with a "span" size of 5 bases. Each value is the GC percent for a
window of 5 bases. Thus, the possible values are:
    0, 20, 40, 60, 80, 100
for GC counts of 0, 1, 2, 3, 4, 5, respectively.
Other values can occur when fewer than 5 bases are used in
the calculation, depending upon the location of gaps in
the sequence.
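
The arithmetic above can be sketched in Python. This is only an
illustration, not the hgGcPercent source: the function name gc_percent
is hypothetical, and it assumes that gap positions appear as N and are
left out of the denominator.

```python
from typing import Optional


def gc_percent(window: str) -> Optional[float]:
    """Percent G+C among the non-gap bases of one window.

    Assumption (not taken from hgGcPercent itself): gap positions are
    written as 'N' and are excluded from the calculation.
    """
    bases = [b for b in window.upper() if b in "ACGT"]
    if not bases:
        return None  # the whole window is gap, so there is no value
    gc_count = sum(1 for b in bases if b in "GC")
    return 100.0 * gc_count / len(bases)


# Complete 5-base windows can only give multiples of 20.
full_windows = ["ATATA", "GATAT", "GCATA", "GCGAT", "GCGCA", "GCGCG"]
print([gc_percent(w) for w in full_windows])
# [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]

# A window shortened by a gap can give a different value.
print(gc_percent("GANNN"))  # 50.0 (1 of 2 usable bases is G or C)
```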
Note: the chromosome coordinates given are 1-relative (1-based)
coordinates. The first nucleotide on a chromosome is numbered 1.
Example data from chrI:

    variableStep chrom=chrI span=5
    1 60
    6 60
    11 80
    16 60
    ... etc ...

This means that base positions 1 through 5 have value 60,
base positions 6 through 10 have value 60, base positions
11 through 15 have value 80, etc ...
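
As a rough sketch of how such data could be read, the Python below
expands each data line into a 1-relative, inclusive interval. The
function name parse_variable_step is hypothetical, and the sketch
assumes only the variableStep layout shown in the example (a
declaration line followed by "position value" lines); it is not a full
wiggle parser.

```python
from typing import Iterable, Iterator, Tuple


def parse_variable_step(lines: Iterable[str]) -> Iterator[Tuple[str, int, int, float]]:
    """Yield (chrom, first_base, last_base, value) for each data line.

    Coordinates are 1-relative and inclusive, as in the file itself.
    """
    chrom = None
    span = 1  # the wiggle format defaults to span=1 when none is given
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith("variableStep"):
            # Declaration line, e.g. "variableStep chrom=chrI span=5"
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            span = int(fields.get("span", 1))
            continue
        if chrom is None:
            raise ValueError("data line found before a variableStep declaration")
        position, value = line.split()
        first = int(position)
        yield chrom, first, first + span - 1, float(value)


example = [
    "variableStep chrom=chrI span=5",
    "1 60",
    "6 60",
    "11 80",
    "16 60",
]
for record in parse_variable_step(example):
    print(record)
# ('chrI', 1, 5, 60.0)
# ('chrI', 6, 10, 60.0)
# ('chrI', 11, 15, 80.0)
# ('chrI', 16, 20, 60.0)
```

To read the compressed file directly, the same function could be given
the text handle returned by gzip.open("sacCer3.gc5Base.varStep.txt.gz", "rt")
from the Python standard library.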

Name                             Last modified      Size
md5sum.txt                       2011-09-02 15:33   65
sacCer3.gc5Base.varStep.txt.gz   2011-08-24 12:01   6.0M