Some Terminology (some more Terminology)
- (FPC) Contig
- In this context, the largest-scale Golden Path structure below the chromosome. This is a 'clone contig' or FPC: a group of clones associated
by the WashU map. Clones in a 'contig' are organized into barges, which are then
laid out along the chromosome as a whole.
- Barge
- A series of clones within an FPC contig that
show sequence overlap and can be ordered linearly.
- Raft
- A set of fragments from overlapping clones that
is assembled by ooGreedy into a continuous stretch of
sequence.
- Fragment
- A sequence contig inside an accessioned clone (BAC, PAC
or cosmid). A fragment is assembled from reads, usually
by Phrap.
- Read
- A sequence (typically 300-1000 bases) that was
produced by a single lane or capillary on a sequencing
machine. (No assembly required.)
- Input
- The statistic is calculated on all available data, before any assembly is done.
- Pre-GP
- The statistic is calculated on genome data before the final distillation
of the assembly into the Golden Path.
Thus, all overlapping sequence is included.
- Golden Path / GP
- The Golden Path is the finished distillate of the assembly process. Unless otherwise
stated, all statistics on this page can be assumed derived from Golden Path
data.
- Weighted Median
- For an object associated with a length in basepairs along the Golden Path (only),
"the length such that half the basepairs in the overall sequence are in objects of the
same type with equal or lesser length".
Described more algorithmically: for datapoints x[1..N] (in ascending order), it is
x[k] for the first k such that
the sum of x[i] for i less than or equal to k is greater than or equal to half the sum of all
datapoints.
- Weighted Mean
- An average over lengths, weighted proportionally to those lengths.
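The two weighted statistics above can be sketched in Python as follows. This is only an illustration, not the code used to produce these statistics: the function names are hypothetical, and the running total is assumed to include the object at position k itself, which matches the "equal or lesser length" wording.

```python
from typing import Sequence


def weighted_median(lengths: Sequence[int]) -> int:
    """Length at which the running total first reaches half of all basepairs."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths)      # ascending order, as in the definition
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length          # assumption: total includes the current object
        if running >= half:
            return length
    return ordered[-1]             # safeguard; not reached for positive lengths


def weighted_mean(lengths: Sequence[int]) -> float:
    """Average of the lengths, each weighted by its own length."""
    total = sum(lengths)
    if total == 0:
        raise ValueError("total length must be positive")
    return sum(x * x for x in lengths) / total
```

For three objects of lengths 1, 1 and 10, the ordinary median is 1, but the weighted median is 10 and the weighted mean is 8.5, because most of the basepairs sit in the long object.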
- Bridged Clone Gap
- A gap between two barges that are known to be near one another, generally
from mRNA or BAC-end hits. These are represented in the Golden Path as 50 Kb inserts of
'N's.
- Unbridged Clone Gap
- A gap between two barges for which there is no evidence of adjacency.
These are represented in the Golden Path as 100 Kb inserts of 'N's.
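As a rough illustration of the two gap types, the sketch below joins barge sequences with runs of 'N'. It assumes that 50 Kb and 100 Kb mean exactly 50,000 and 100,000 bases. The names `BRIDGED_GAP`, `UNBRIDGED_GAP` and `join_barges` are hypothetical and are not taken from the assembly software.

```python
from typing import List

BRIDGED_GAP = 50_000     # assumption: 50 Kb taken as 50,000 bases
UNBRIDGED_GAP = 100_000  # assumption: 100 Kb taken as 100,000 bases


def join_barges(barges: List[str], bridged: List[bool]) -> str:
    """Concatenate barge sequences, inserting an N-run between neighbours.

    bridged[i] says whether the gap between barges[i] and barges[i + 1]
    has evidence for adjacency (bridged) or not (unbridged).
    """
    if len(bridged) != max(len(barges) - 1, 0):
        raise ValueError("need exactly one gap flag per adjacent pair of barges")
    parts = []
    for i, seq in enumerate(barges):
        parts.append(seq)
        if i < len(bridged):
            gap = BRIDGED_GAP if bridged[i] else UNBRIDGED_GAP
            parts.append("N" * gap)
    return "".join(parts)
```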
- Size
- Here, size and extent have specific, and different, meanings. The 'Size' of a genomic
object is the number of actual basepairs it includes; that is, it doesn't include any
gaps. So, for example, the 'size' of a clone is the sum of the lengths of its
(nonoverlapping, for GP stats) fragments.
- Extent
- How much space the object takes up on
the genome - how far apart, in basepairs, its extreme points are.
- Coverage
- How thoroughly a clone is sequenced: specifically, the ratio of the Size to the
Extent.
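The relationship between Size, Extent and Coverage can be shown with a small Python sketch. It assumes, purely for illustration, that a clone's fragments are given as nonoverlapping (start, end) intervals with an exclusive end. The `Interval` type and the function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumption: (start, end) with end exclusive


def size(fragments: List[Interval]) -> int:
    """Number of actual basepairs: sum of fragment lengths, gaps excluded."""
    return sum(end - start for start, end in fragments)


def extent(fragments: List[Interval]) -> int:
    """Distance in basepairs between the object's extreme points."""
    return max(end for _, end in fragments) - min(start for start, _ in fragments)


def coverage(fragments: List[Interval]) -> float:
    """Ratio of Size to Extent."""
    return size(fragments) / extent(fragments)
```

For two fragments at (0, 100) and (150, 200), the size is 150, the extent is 200, and the coverage is 0.75.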
- Finished, Deep, Draft Rafts
- These are classifications of Rafts:
- Finished: raft contains at least one finished clone fragment. Often abbrev. 'Fin'.
- Deep: Unfinished raft that contains fragments from at least two different draft clones.
- Draft: All other rafts
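The three raft classes above can be expressed as a short decision rule. The sketch below is an assumption-laden illustration: it reads "finished clone fragment" as "a fragment from a finished clone", and the `Fragment` record and `classify_raft` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    clone_id: str          # accession of the clone the fragment came from
    clone_finished: bool   # True if that clone is finished, False if draft


def classify_raft(fragments: List[Fragment]) -> str:
    """Return 'Finished', 'Deep' or 'Draft' for the fragments of one raft."""
    if any(f.clone_finished for f in fragments):
        return "Finished"
    # No finished fragments here, so every clone in the raft is a draft clone.
    draft_clones = {f.clone_id for f in fragments}
    if len(draft_clones) >= 2:
        return "Deep"
    return "Draft"
```

The order of the checks matters: a raft with one finished fragment is 'Finished' even if it also contains fragments from several draft clones.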