cytoBand Chromosome Band bed 4 + 1 1 0 0 0 150 50 50 0 0 0

Description

\

The chromosome band track represents the approximate \ location of bands seen on Giemsa-stained chromosomes\ under conditions where 400 bands are visible across the entire\ genome.

\

Methods

\

Data are derived from the Mouse400.dat file downloaded from the NCBI ftp \ site ftp://ftp.ncbi.nih.gov/genomes/M_musculus/maps/mapview/. Band lengths\ are estimated based on relative sizes as defined by the International System for \ Cytogenetic Nomenclature (ISCN).\ \

Credits

\

We would like to thank NCBI for providing this information.\ map 1 stsMapMouseNew STS Markers bed 5 + STS Markers on Genetic and Radiation Hybrid Maps 1 5 0 0 0 128 128 255 0 0 0

Description

\

This track shows locations of Sequence Tagged Sites (STS) \ along the mouse draft assembly. These markers have been mapped using \ either genetic mapping (WICGR Mouse Genetic Map, MGD Genetic Map) or radiation hybrid mapping (Whitehead/MRC RH Map) techniques.

\ \ Additional data on the individual maps can be found at the following links:\ \ map 1 recombRateMouse Recomb Rate bed 4 + Recombination Rate from WI and MGD genetic maps (WI default) 0 8 0 0 0 127 127 127 0 0 0

Description

\

The recombination rate track represents calculated sex-averaged\ rates of recombination based on either the Whitehead Institute (WI) or\ MGD genetic maps. By default, the WI genetic map rates are displayed.\ Female and male specific recombination rates are not available for\ these maps.\

\ \

Methods

\

The WI Mouse Genetic Map is based on 6,336 genetic\ markers with a total of 92 meiotic events. For more information on\ this map, see W.F. Dietrich et al., "A comprehensive genetic map\ of the mouse genome.", Nature, 380(6570), pages 149-152\ (1996).\

\

The MGD Genetic Map was created at the Jackson\ Laboratory.\

\

Data for these maps were downloaded from the NCBI ftp site.\

Each base is assigned the recombination rate calculated by\ assuming a linear genetic distance across the immediately flanking\ genetic markers. The recombination rate assigned to each 1Mb window\ is the average recombination rate of the bases contained within the\ window.\
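The windowed averaging described above can be sketched as follows. This is a minimal illustration, not the pipeline used to build the track: the function name, the input layout (marker positions in bases paired with genetic positions in cM), and the choice to average only over bases that lie between two flanking markers are all assumptions made for the example.

```python
def window_rates(markers, chrom_size, window=1_000_000):
    """Average recombination rate (cM/Mb) per fixed-size window.

    markers: list of (bp_position, cM_position) tuples sorted by bp_position.
    Bases between two adjacent markers get the rate implied by a linear
    genetic distance between those markers. Bases not flanked by markers
    on both sides are ignored (an assumption of this sketch).
    """
    rates = []
    for start in range(0, chrom_size, window):
        end = min(start + window, chrom_size)
        weighted_sum = 0.0
        covered = 0
        for (p0, g0), (p1, g1) in zip(markers, markers[1:]):
            if p1 <= p0:
                continue  # skip markers placed at the same base
            seg_rate = (g1 - g0) / ((p1 - p0) / 1e6)  # cM per Mb
            overlap = min(end, p1) - max(start, p0)
            if overlap > 0:
                weighted_sum += seg_rate * overlap
                covered += overlap
        rates.append(weighted_sum / covered if covered else None)
    return rates
```

Weighting each marker interval by the number of bases it contributes to the window is equivalent to averaging the per-base rates, which is how the text defines the window value.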

\ \

Using the Filter

\

To view a particular map, select the corresponding option from the\ "Map Distances" pulldown list. By default, the browser\ displays the WI genetic map sex-averaged distances.

\ \

Credits

\

This track is produced at UCSC and uses data that are freely\ available for the WI and MGD genetic maps (see above links). Thanks\ to all who have played a part in the creation of these maps.

\ \ map 1 ctgPos Map Contigs Physical Map Contigs 0 9 150 0 0 202 127 127 0 0 0

Description

\ This track shows the locations of contigs of clones\ on the physical map. \ \

Method

\ In assembly versions prior to the August 2001\ freeze, this track was based on the Washington University accession\ map, which in turn was based on a fingerprint contig (FPC) map\ described in "A physical map of the human genome", Nature 409: 934-941. \

\

\ From the August 2001 to the Nov 2002 freeze, this track was based on\ tiling path (TPF) maps curated by the sequencing centers responsible for\ each chromosome, which were integrated into an assembly done by NCBI.\ Beginning with the April 2003 freeze, the chromosome coordinators\ at the individual sequencing centers took over complete responsibility\ for preparing the assembly of their chromosomes in AGP format. The\ files provided by these centers are checked and validated at NCBI, and\ form the basis for the definition of the physical map contigs.\

\ map 0 gold Assembly bed 3 + Assembly from Fragments 0 10 150 100 30 230 170 40 0 0 0

Description

\

\ This track shows the draft assembly of the $Organism genome. This\ assembly merges contigs from overlapping clones into longer sequence\ contigs. \

\

In dense mode, this track depicts the path through the clones \ (aka the golden path) used to create the assembled sequence. \ Clone boundaries are distinguished by the use of alternating gold and brown \ coloration. Where gaps\ exist in the path, spaces are shown between the gold and brown\ blocks. If the relative order and orientation of the contigs\ between the two blocks is known, a line is drawn to bridge the\ blocks.

\

\ All components within this track are of fragment type "W": \ Whole Genome Shotgun contig. \ \ map 1 gap Gap bed 3 + Gap Locations 1 11 0 0 0 127 127 127 0 0 0

Description

\ This track depicts gaps in the assembly. These gaps - with the\ exception of intractable heterochromatic gaps - will be closed during the\ finishing process. \

\ Gaps are represented as black boxes in this track.\ If the relative order and orientation of the contigs on either side\ of the gap is known, it is a bridged gap and a white line is drawn \ through the black box representing the gap. \

\

There are four principal types of gaps:\

\ map 1 bacEndPairs BAC End Pairs bed 6 + BAC End Pairs 0 15 0 0 0 127 127 127 0 0 0

Description

\

Bacterial artificial chromosomes (BACs) are a key part of many large\ scale sequencing projects. A BAC typically consists of 50-300kb of\ DNA. During the early phase of a sequencing project, it is common\ to sequence a single read (approximately 500 bases) off each end of\ a large number of BACs. Later on in the project, these BAC end reads\ can be mapped to the genome sequence. \

\

This track shows these mappings\ in cases where both ends could be mapped. These BAC end pairs can\ be useful for validating the assembly over relatively long ranges. In some\ cases, the BACs are useful biological reagents. This track can also be\ used for determining which BAC contains a given gene, useful information\ for certain wet lab experiments.\ \

A valid pair of BAC end sequences must be\ at least 50Kb but no more than 600Kb away from each other. \ The orientation of the first BAC end sequence must be "+" and\ the orientation of the second BAC end sequence must be "-".
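As a rough illustration of the validity rule above, the check below tests the distance and orientation of a pair of mapped BAC ends. The tuple layout and function name are hypothetical, and measuring the distance from the start of the first end to the end of the second is an assumption; the text does not say exactly how the distance is measured.

```python
def is_valid_bac_pair(end1, end2, min_span=50_000, max_span=600_000):
    """Return True if two mapped BAC ends form a valid pair.

    Each end is a (chrom, start, stop, strand) tuple, with end1 being the
    first (lower-coordinate) end. Span is measured from the start of end1
    to the stop of end2 (an assumption of this sketch).
    """
    chrom1, start1, _stop1, strand1 = end1
    chrom2, _start2, stop2, strand2 = end2
    if chrom1 != chrom2:
        return False
    if strand1 != "+" or strand2 != "-":
        return False
    span = stop2 - start1
    return min_span <= span <= max_span
```

For example, ends at chr1:1000-1500 (+) and chr1:150500-151000 (-) span 150,000 bases and pass, while two ends on the same strand fail regardless of distance.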

\ \

Methods

\

BAC end sequences are placed on the assembled sequence using\ Jim Kent's \ blat \ program.

\ \

Credits

\

Additional information about the clone, including how it\ can be obtained, may be found at the \ NCBI Clone Registry.\ To view the registry entry for a specific clone, open the details page for the clone and click on its name at the top of the page.\

\ map 1 exonArrows off\ gcPercent GC Percent bed 4 + Percentage GC in 20,000 Base Windows 0 23 0 0 0 127 127 127 1 0 0

Description

\

\ The GC percent track shows the percentage of G (guanine) and C (cytosine) bases\ in a 20,000 base window. Windows with high GC content are drawn more darkly \ than windows with low GC content. High GC content is typically associated with \ gene-rich areas.\
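A minimal sketch of a windowed GC calculation is shown below. It is not the code used to generate the track; the function name is hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption of this example.

```python
def gc_percent_windows(seq, window=20_000):
    """Return (window_start, gc_percent) for consecutive fixed-size windows.

    Only A, C, G and T count toward the denominator (an assumption of this
    sketch); a window with no called bases yields None.
    """
    result = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window].upper()
        called = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100.0 * gc / called if called else None))
    return result
```

With a small window for demonstration, `gc_percent_windows("GGCCATAT", window=4)` gives one all-GC window followed by one window with no G or C bases.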

\

Credits

\

\ This track was generated at UCSC.\ map 1 knownGene Known Genes genePred refPep refMrna Known Genes Based on SWISS-PROT, TrEMBL, mRNA, and RefSeq 3 34 12 12 120 133 133 187 0 0 0

Description

\

\ The Known Genes track shows known protein coding genes based on \ proteins from SWISS-PROT, TrEMBL, and TrEMBL-NEW and their\ corresponding mRNAs from \ GenBank.\ Coding exons are displayed as thicker blocks than 5' and 3' \ untranslated regions (UTR). Connecting introns \ are one-pixel lines with hatch marks indicating direction of transcription.\ Entries which have corresponding entries in PDB are colored black.\ Entries which either have corresponding proteins in SWISS-PROT or mRNAs that are \ NCBI Reference Sequences with a "Reviewed" status are colored dark blue.\ Entries which have mRNAs that are \ NCBI Reference Sequences with a "Provisional" status are colored lighter blue.\ All other entries are colored the lightest blue.

\ \

Methods

\

\ All mRNAs of a species are aligned against the genome using the blat\ program. When a single mRNA aligns in multiple places, only\ the best alignments are kept. The alignments must also have \ at least 98% sequence identity to be kept. \ This set of mRNA alignments is further reduced by keeping only those mRNAs that \ are referenced by a protein in SWISS-PROT, TrEMBL, or TrEMBL-NEW.

\

\ Among multiple mRNAs referenced by a single protein, the best mRNA is chosen based on \ a quality score, which depends on its length, how well its translation matches \ the protein sequence, and its release date.\ The list of mRNA and protein pairs is further cleaned up by removing \ short invalid entries and consolidating entries with identical CDS regions.

\

\ Finally, RefSeq entries which are derived from DNA sequences instead of \ mRNA sequences are added. Disease annotations are from SWISS-PROT.

\ \

Credits

\

\ The Known Genes track is produced at UCSC based primarily on cross-references \ between proteins from \ SWISS-PROT \ (also including TrEMBL and TrEMBL-NEW) and mRNAs from GenBank\ generated by scientists worldwide. Some \ NCBI RefSeq \ data are also included in this track.

\ \

Data Use Restrictions

\

\ The SWISS-PROT entries in this annotation track are copyrighted. They are \ produced through a collaboration \ between the Swiss Institute of Bioinformatics and the EMBL Outstation - the \ European Bioinformatics Institute. There are no restrictions on their use by \ non-profit institutions as long as their content is in no way modified and this \ statement is not removed. Usage by and for commercial entities requires a \ license agreement (see \ http://www.isb-sib.ch/announce/ or send an email to \ license@isb-sib.ch).

\ \

References

\

\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. (2004)\ GenBank: update. \ Nucleic Acids Res. 32 Database issue:D23-6.\ genes 1 cdsDrawDefault genomic codons\ hgGene on\ refGene RefSeq Genes genePred refPep refMrna RefSeq Genes 1 35 12 12 120 133 133 187 0 0 0

Description

\

\ The RefSeq Genes track shows known protein-coding genes taken from mRNA \ reference sequences compiled at LocusLink. Coding exons are represented by \ blocks connected by horizontal lines representing introns. The 5' and 3' \ untranslated regions (UTRs) are displayed as thinner blocks on the leading \ and trailing ends of the aligning regions. In full display mode, arrowheads \ on the connecting intron lines indicate the direction of transcription.\ The color shading indicates the level of review the RefSeq record has \ undergone: predicted (light), provisional (medium), reviewed (dark). \

\

\ Non-coding RNA genes have their own track in some assemblies.\

\

Method

\

\ RefSeq mRNAs are aligned against the genome using the \ blat\ program. When a single mRNA aligns in multiple places, only\ the best alignments that also have at least 98% sequence identity are kept.\

\

Using the Filter

\

The track filter can be used to configure the labeling of the features within\ the track. By default, items are labeled by gene name. Click the \ appropriate Label option to display the accession name instead of the gene\ name, show both the gene and accession names, or turn off the label completely.\ After you have made your selection, click Submit to return to the tracks display\ page.\

Credits

\

\ The RefSeq Genes track is produced at UCSC from mRNA sequence data\ generated by scientists worldwide and curated by the \ NCBI RefSeq project. \

\ genes 1 mgcGenes MGC Genes genePred Mammalian Gene Collection Full ORF mRNAs 0 36 34 139 34 144 197 144 0 0 0

Description

\

\ This track shows alignments of $organism mRNAs from the\ Mammalian Gene Collection (MGC)\ having full-length open reading frames (ORFs) to the genome. \ Coding exons are represented by \ blocks connected by horizontal lines representing introns. The 5' and 3' \ untranslated regions (UTRs) are displayed as thinner blocks on the leading \ and trailing ends of the aligning regions. In full display mode, arrowheads \ on the connecting intron lines indicate the direction of transcription.\

\

Method

\

\ GenBank $organism MGC mRNAs identified as having full-length ORFs are\ aligned against the genome using the \ blat \ program. When a single mRNA aligns\ in multiple places, the alignment having the highest base identity is found. \ Only alignments that have a base identity level within 1% of\ the best and also have at least 95% base identity are kept.

\ \

Credits

\

\ The $Organism MGC full-length mRNA track is produced at UCSC from mRNA sequence data\ submitted to \ GenBank\ by the \ Mammalian Gene Collection project.

\ genes 1 ensGene Ensembl Genes genePred ensPep Ensembl Gene Predictions 1 40 150 0 0 202 127 127 0 0 0 http://www.ensembl.org/Mus_musculus/transview?transcript=$$

Description

\ These gene predictions are from Project Ensembl.\ \

Methods

\

For a description of the methods used in Ensembl gene prediction, refer to \ \ The Ensembl genome database project, Nucleic Acids Research, \ 2002, 30(1) 38-41.

\ \

Credits

\ Thanks to Project Ensembl for providing this annotation.\ \ genes 1 ECgene ECgene Genes genePred ECgenePep ECgene Gene Predictions with Alt-Splicing 0 41.5 155 0 125 205 127 190 0 0 0

Description

\

\ ECgene (gene prediction by \ EST clustering) predicts genes by combining genome-based EST clustering and transcript \ assembly methods. The EST clustering is based on genomic alignment of mRNA and ESTs \ similar to that of NCBI's UniGene for the human genome. The transcript \ assembly procedure yields gene models for each cluster that include alternative splicing\ variants. This algorithm was developed by Prof. Sanghyuk Lee's Lab of Bioinformatics at \ Ewha Womans University in Seoul, Korea.\

\ For more detailed information, see the ECgene website.\ \

Methods

\ The following is a brief summary of the ECgene algorithm: \
    \
  1. \ Genomic alignment of mRNA and ESTs: Input sequences are aligned against the \ genome using the Blat program developed by Jim Kent. Blat alignments are corrected for \ valid splice sites, and the SIM4 program is used for suspicious alignments if necessary.\
  2. \ Sequences that share more than one splice site are clustered together. This produces the \ primary clusters without unspliced sequences (singletons).\
  3. \ The genomic alignment of exons in each spliced sequence is represented as a directed \ acyclic graph (DAG), and all possible gene models are derived by the depth-first-search \ (DFS) method.\
  4. \ Sequences compatible with each gene model are grouped together as sub-clusters. Gene \ models without sufficient evidence are discarded at this stage. Sensitive detection of \ polyA tails is achieved by analyzing genomic alignment of mRNA and EST sequences,\ and specifically used to determine the gene boundary.\
  5. \ Finally, unspliced sequences are added so as not to change the splice sites of the \ existing gene model.\
\ \
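Step 3 of the summary above, deriving gene models from a directed acyclic graph by depth-first search, can be illustrated with the sketch below. It is not the ECgene implementation: the graph encoding (a dict from each exon to the exons it is spliced to), the exon labels, and the function name are assumptions made for the example, and the evidence-based pruning of step 4 is not modeled.

```python
def enumerate_gene_models(edges):
    """List every source-to-sink path in an exon splice graph.

    edges: dict mapping an exon label to the list of downstream exons it is
    joined to by an observed splice. The graph is assumed to be acyclic.
    """
    nodes = set(edges) | {v for targets in edges.values() for v in targets}
    has_incoming = {v for targets in edges.values() for v in targets}
    sources = sorted(nodes - has_incoming)
    models = []

    def dfs(node, path):
        path.append(node)
        successors = edges.get(node, [])
        if not successors:
            models.append(list(path))  # reached a terminal exon
        for nxt in successors:
            dfs(nxt, path)
        path.pop()  # backtrack

    for source in sources:
        dfs(source, [])
    return models
```

A graph in which exon e1 splices to either e2 or e3, both of which splice to e4, yields two candidate models that differ only in the middle exon, which is how alternative splicing variants arise from a single cluster.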

Credits

\ The predictions for this track were produced by Namshin Kim and Sanghyuk Lee at Ewha \ Womans University, Seoul, Korea.\ genes 1 ensEst Ensembl ESTs genePred ensEstPep $Organism ESTs From Ensembl 0 42 175 20 125 215 137 190 0 0 0

Description

\

\ Gene predictions from Ensembl based on ESTs.

\ \

Methods

\

For a description of the methods used, refer to \ "\ The Ensembl genome database project", Nucleic Acids Research, \ 2002, 30(1) 38-41.

\ \

Credits

\

Thanks to Ensembl for providing this annotation.

\ \ genes 1 twinscan Twinscan genePred twinscanPep Twinscan Gene Predictions Using Mouse/Human Homology 0 45 0 100 100 0 50 50 0 0 0

Description & Credits

\

\ Twinscan predicts genes in a manner similar to Genscan, except that\ Twinscan takes advantage of genome comparison to improve gene prediction\ accuracy. The Nov. 2002 (hg13) human assembly was used in creating this \ annotation. \

\

\ The Twinscan algorithm is described in Korf, I., P. Flicek, D. Duan, and M.R. Brent. \ 2001. Integrating genomic homology into gene structure prediction. \ Bioinformatics 17:S140-148.\ More information and a web server can be found at http://genes.cs.wustl.edu/.\

\ \ genes 1 sgpGene SGP Genes genePred sgpPep SGP Gene Predictions Using Mouse/Human Homology 0 47 0 90 100 127 172 177 0 0 0

Description

\

\ This track shows gene predictions from the SGP program, which is being developed at \ the Grup de Recerca en\ Informàtica Biomèdica (GRIB) at Institut Municipal d'Investigació Mèdica (IMIM) in \ Barcelona. To predict genes in a genomic\ query, SGP combines geneid predictions with tblastx comparisons of the \ genomic query against other genomic sequences. \

\

Credits

\

\ Thanks to GRIB for providing these gene predictions.\

\ \ \ \ genes 1 softberryGene Fgenesh++ Genes genePred softberryPep Fgenesh++ Gene Predictions 0 48 0 100 0 127 177 127 0 0 0

Description

\

Fgenesh++ predictions are based on Softberry's gene finding software.

\ \

Methods

\ Fgenesh++ uses both hidden Markov models (HMMs) and protein similarity to find genes in a completely \ automated manner. For more information, see the paper Solovyev VV (2001), \ "Statistical approaches in Eukaryotic gene prediction" in the Handbook of \ Statistical Genetics (ed. Balding D. et al.), John Wiley & Sons, Ltd., p. 83-127.\ \

Credits

\

The Fgenesh++ gene predictions were produced by \ Softberry Inc. \ Commercial use of these predictions is restricted to viewing in \ this browser. Please contact Softberry Inc. to make arrangements for further commercial access.\ \ genes 1 geneid Geneid Genes genePred geneidPep Geneid Gene Predictions 0 49 0 90 100 127 172 177 0 0 0

Description

\

\ This track shows gene predictions from the geneid program developed at the \ Grup de Recerca en\ Informàtica Biomèdica (GRIB) at Institut Municipal d'Investigació Mèdica (IMIM) in \ Barcelona. \

\

Methods

\

\ Geneid is a program to predict genes in anonymous genomic sequences designed \ with a hierarchical structure. In the first step, splice sites, start and stop \ codons are predicted and scored along the sequence using Position Weight Arrays \ (PWAs). Next, exons are built from the sites. Exons are scored as the sum of the \ scores of the defining sites, plus the log-likelihood ratio of a \ Markov Model for coding DNA. Finally, from the set of predicted exons, the gene \ structure is assembled, maximizing the sum of the scores of the assembled exons. \
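The final assembly step, choosing a set of exons that maximizes the summed score, can be illustrated with a simplified dynamic program. This sketch is not geneid's algorithm: it only requires that chosen exons do not overlap, and it ignores reading-frame compatibility and the other gene-structure constraints a real gene finder enforces. The function name and the half-open (start, end, score) tuples are assumptions of the example.

```python
from bisect import bisect_right

def assemble_exons(exons):
    """Pick non-overlapping exons with the maximum total score.

    exons: list of (start, end, score) with half-open coordinates.
    Returns (best_total_score, chosen_exons_in_genomic_order).
    """
    exons = sorted(exons, key=lambda e: e[1])
    ends = [e[1] for e in exons]
    # best[i] holds the optimum over the first i exons (sorted by end).
    best = [(0.0, [])]
    for i, (start, end, score) in enumerate(exons):
        # j = number of earlier exons that end at or before this start
        j = bisect_right(ends, start, 0, i)
        take_score = best[j][0] + score
        if take_score > best[i][0]:
            best.append((take_score, best[j][1] + [(start, end, score)]))
        else:
            best.append(best[i])
    return best[-1]
```

Sorting by end coordinate lets each exon look up the best solution compatible with it by binary search, so the whole assembly takes O(n log n) time for n candidate exons.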

\

Credits

\

\ Thanks to GRIB for providing these data.\

\ genes 1 genscan Genscan Genes genePred genscanPep Genscan Gene Predictions 1 50 170 100 0 212 177 127 0 0 0

Description

\

This track shows predictions from the \ Genscan program written by Chris Burge.\

\

Methods

\ For a description of the Genscan program and the model that underlies it, refer\ to Burge C and Karlin S (1997), \ "Prediction of Complete Gene Structures in Human Genomic DNA", \ J. Mol. Biol. 268(1):78-94. The splice site models used are described in \ more detail in Burge C (1998), "Modeling Dependencies in Pre-mRNA Splicing \ Signals" in Salzberg S, Searls D, and Kasif S, eds. \ \ Computational Methods in Molecular Biology, Elsevier Science, Amsterdam, \ 127-163. \ \

Credits

\ Thanks to Chris Burge for providing these data.\ genes 1 superfamily Superfamily bed 4 + Superfamily/SCOP: Proteins Having Homologs with Known Structure/Function 0 53 150 0 0 202 127 127 0 0 0 http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/cgi-bin/gene.cgi?genome=

Description

\

\ The \ Superfamily \ track shows proteins having homologs with known structures or functions.

\

\ Each entry on the track shows the coding region of a gene (based on Ensembl gene predictions).\ In full display mode, the label for an entry consists of the names of \ all known protein domains coded by this gene. This \ usually contains structural and/or function descriptions that provide valuable information to help users get a quick grasp of the biological significance of the gene.

\

Method

\

\ Data are downloaded from the Superfamily server.\ Using the cross-reference between Superfamily entries and Ensembl gene prediction entries\ and their alignment to the appropriate genome, the associated data are processed to generate \ a simple BED format track.

\

Credits

\

\ Superfamily is developed by\ Julian\ Gough at the MRC Laboratory\ of Molecular Biology, Cambridge.

\

\ Gough, J., Karplus, K., Hughey, R. and\ Chothia, C. (2001). "Assignment of Homology to Genome Sequences using a\ Library of Hidden Markov Models that Represent all Proteins of Known Structure". \ J. Mol. Biol., 313(4), 903-919.

\ \ genes 1 mrna $Organism mRNAs psl . $Organism mRNAs from GenBank 3 54 0 0 0 127 127 127 1 0 0

Description

\

\ The $Organism mRNA track shows alignments between $organism mRNAs\ in GenBank and the genome. Aligning regions (usually exons)\ are shown as black boxes connected by lines for gaps (spliced-out introns, \ usually). In full display, arrows on the introns\ indicate the direction of transcription.

\ \

Method

\

\ GenBank $organism mRNAs are aligned against the genome using the \ blat\ program. When a single mRNA aligns in multiple places, \ the alignment having the highest base identity is found. \ Only alignments that have a base identity level within 1% of\ the best are kept. Alignments must also have at least 95%\ base identity to be kept.
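The two-part filter described above (within 1% of the best identity for that mRNA, and at least 95% identity overall) might be sketched as follows. The tuple layout and function name are hypothetical, and identity is assumed to be precomputed as a percentage; this is not the UCSC pipeline code.

```python
def near_best_filter(alignments, min_identity=95.0, near_best=1.0):
    """Keep alignments close to the best hit for each query.

    alignments: list of (query_name, identity_percent, location) tuples.
    An alignment is kept if its identity is at least min_identity and is
    within near_best percentage points of the best identity for its query.
    """
    best = {}
    for query, identity, _location in alignments:
        best[query] = max(best.get(query, 0.0), identity)
    return [
        aln for aln in alignments
        if aln[1] >= min_identity and aln[1] >= best[aln[0]] - near_best
    ]
```

For an mRNA with hits at 99.5%, 98.7% and 97.0% identity, the first two are kept and the third is dropped because it falls more than 1% below the best hit.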

\ \

Using the Filter

\

The track filter can be used to change the color or include/exclude a subset of individual \ items within a track. This is helpful when many items are shown in the track\ display, especially when only some are relevant to the current task. To use the\ filter:\

    \
  1. Enter a value in one or more of the text boxes to filter the mRNA display. For\ example, to apply the filter to all liver mRNAs, type "liver" in the \ tissue box. For a list of permissible filter values, consult the non-positional table in\ the Table Browser that corresponds to the factor on which you wish to filter. For\ example, the non-positional table "tissue" contains all of the types of tissues\ that can be entered into the tissue text box. Wildcards can also be used in the\ filter.\
  2. If filtering on more than one value, choose the desired combination\ logic. If "and" is selected, only mRNAs that match all of the filter criteria will\ be highlighted. If "or" is selected, mRNAs that match any one of the filter criteria\ will be highlighted.\
  3. Choose the color or display characteristic that will be used to highlight or\ include/exclude the filtered items. If "exclude" is chosen, the browser will not \ display mRNAs that match the filter criteria. If "include" is selected, the browser \ will display only those mRNAs that match the filter criteria.\
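The filter steps above combine pattern matching, "and"/"or" logic, and an include or exclude action. The sketch below models that behavior on a list of records. It is only an illustration: the record layout and function names are hypothetical, and the use of shell-style wildcards via Python's `fnmatch` module is an assumption, since the text does not specify the browser's wildcard syntax.

```python
from fnmatch import fnmatchcase

def matches(item, criteria, logic="and"):
    """Test one record against wildcard patterns, case-insensitively.

    item: dict of field name -> value, e.g. {"tissue": "liver"}.
    criteria: dict of field name -> pattern, e.g. {"tissue": "liv*"}.
    """
    tests = [
        fnmatchcase(str(item.get(field, "")).lower(), pattern.lower())
        for field, pattern in criteria.items()
    ]
    return all(tests) if logic == "and" else any(tests)

def apply_filter(items, criteria, logic="and", mode="include"):
    """Return the records that would remain displayed.

    mode="include" keeps only matching records; mode="exclude" hides them.
    """
    if mode == "include":
        return [it for it in items if matches(it, criteria, logic)]
    return [it for it in items if not matches(it, criteria, logic)]
```

With "and" logic a record must satisfy every criterion, whereas "or" logic accepts a record that satisfies any one of them.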

\

\ When you have finished configuring the filter, click the Submit button.

\ \

Credits

\

\ The $Organism mRNA track is produced at UCSC from mRNA sequence data\ submitted to the international public sequence databases by \ scientists worldwide.

\ \

References

\

\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. (2004)\ GenBank: update. \ Nucleic Acids Res. 32 Database issue:D23-6.\ rna 1 cdsDrawOptions enabled\ intronEst Spliced ESTs psl est $Organism ESTs That Have Been Spliced 1 56 0 0 0 127 127 127 1 0 0

Description

\

The Spliced EST track displays Expressed Sequence Tags \ (ESTs) from GenBank that show signs of splicing when\ aligned against the genome. To be considered spliced, an EST must show \ evidence of at least one canonical intron, i.e. one that is at least\ 32 bases in length and has GT/AG ends. By requiring splicing, the level \ of contamination in the EST databases is drastically reduced\ at the expense of eliminating many genuine 3' ESTs.\ For a display of all ESTs (including unspliced), see the \ $Organism EST track.
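The splicing criterion above (at least one intron of 32 or more bases with GT/AG ends) can be expressed as a short check over the aligned blocks of an EST. This is a sketch under stated assumptions, not the code used to build the track: the function name and block representation are hypothetical, and only plus-strand introns are tested. An intron transcribed from the minus strand would appear as CT/AC in the forward genomic sequence.

```python
def has_canonical_intron(block_starts, block_sizes, genome, min_intron=32):
    """Return True if any gap between aligned blocks looks like a GT/AG intron.

    block_starts, block_sizes: genomic start and length of each aligned
    block, in ascending order. genome: forward-strand sequence as a string,
    indexed with the same coordinates. Plus-strand introns only.
    """
    for i in range(len(block_starts) - 1):
        intron_start = block_starts[i] + block_sizes[i]
        intron_end = block_starts[i + 1]
        if intron_end - intron_start < min_intron:
            continue  # too short to count as an intron
        donor = genome[intron_start:intron_start + 2].upper()
        acceptor = genome[intron_end - 2:intron_end].upper()
        if donor == "GT" and acceptor == "AG":
            return True
    return False
```

An EST whose blocks are separated only by short gaps, or by long gaps without GT/AG boundaries, would be treated as unspliced and left out of this track.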

\ \

Expressed sequence tags are single-read (typically\ approximately 500 base) sequences which usually\ represent fragments of transcribed genes. Aligning \ regions (usually exons) are shown as black boxes \ connected by lines for gaps (usually spliced-out introns). \ In full display mode, arrows on the introns\ indicate the direction of transcription, which is determined\ by looking at the splice sites.

\ \

Strand information provided for ESTs (+/-) indicates the\ direction of the match between the EST and the matching\ genomic sequence. It bears no relationship to the direction\ of transcription of the RNA with which it might be associated.\ \

Method

\

To make an EST, RNA is isolated from cells and reverse\ transcribed into cDNA. Typically, the cDNA is cloned\ into a plasmid vector, and a read taken from the 5'\ and/or 3' primer. For most - but not all - ESTs, the\ reverse transcription is primed by an oligo-dT, which\ hybridizes with the poly-A tail of mature mRNA. The\ reverse transcriptase may or may not make it to the 5'\ end of the mRNA, which may or may not be degraded.

\ \

In general, the 3' ESTs mark the end of transcription\ reasonably well, but the 5' ESTs may end at any point\ within the transcript. Some of the newer cap-selected\ libraries are starting to hit transcription start\ reasonably well. Before the cap-selection techniques\ emerged, some projects used random rather than poly-A\ priming in an attempt to get sequence distant from the\ 3' end. These projects were successful at this, but as\ a side effect also deposited sequences from unprocessed\ mRNA and perhaps even genomic sequences into the EST databases.\ (Even outside of the random-primed projects, there is a\ degree of non-mRNA contamination.) Because of this, a\ single unspliced EST should be viewed with considerable\ skepticism. However, because the $organism 3' UTRs are quite\ long, the splicing requirement does eliminate many genuine 3'\ ESTs.

\ \

To generate this track, $organism ESTs from GenBank are aligned \ against the genome using the \ blat \ program. Note that the maximum intron length\ allowed by blat is 500,000 bases, which may eliminate some ESTs with very \ long introns that might otherwise align. When a single \ EST aligns in multiple places, the alignment having the \ highest base identity is found. Only alignments that have \ a base identity level within 1% of the best are kept. \ Alignments must also have at least 93% base identity to be kept.

\ \

Using the Filter

\

The track filter can be used to change the color or include/exclude a subset of \ individual items within a track. This is helpful when many items are shown in the \ track display, especially when only some are relevant to the current task. To use the\ filter:\

    \
  1. Enter a value in one or more of the text boxes to filter the EST display. For\ example, to apply the filter to all ESTs expressed in the liver, type "liver" in the \ tissue box. For a list of permissible filter values, consult the non-positional table in\ the Table Browser that corresponds to the factor on which you wish to filter. For\ example, the non-positional table "tissue" contains all of the types of tissues\ that can be entered into the tissue text box. Wildcards can also be used in the\ filter.\
  2. If filtering on more than one value, choose the desired combination\ logic. If "and" is selected, only ESTs that match all of the filter criteria will\ be highlighted. If "or" is selected, ESTs that match any one of the filter criteria\ will be highlighted.\
  3. Choose the color or display characteristic that should be used to highlight or\ include/exclude the filtered items. If "exclude" is chosen, the browser will not \ display ESTs that match the filter criteria. If "include" is selected, the browser \ will display only those ESTs that match the filter criteria.\

\

\ When you have finished configuring the filter, click the Submit button.

\ \

Credits

\

\ The Spliced EST track is produced at UCSC from EST sequence data\ submitted to the international public sequence databases by \ scientists worldwide.

\ \

References

\

\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. (2004)\ GenBank: update. Nucleic Acids Res. 32 Database issue:D23-6.\ rna 1 est $Organism ESTs psl est $Organism ESTs Including Unspliced 0 57 0 0 0 127 127 127 1 0 0

Description

\

\ This track shows alignments between $organism Expressed\ Sequence Tags (ESTs) in \ GenBank and the genome.

\ \

Expressed sequence tags are single-read (typically\ approximately 500 base) sequences which usually\ represent fragments of transcribed genes. Aligning \ regions (usually exons) are shown as black boxes \ connected by lines for gaps (usually spliced out introns). \ In full display mode, arrows on the introns\ indicate the direction of transcription, which is\ determined by looking at the splice sites. \

\ \

Strand information provided for ESTs (+/-) indicates the\ direction of the match between the EST and the matching\ genomic sequence. It bears no relationship to the direction\ of transcription of the RNA with which it might be associated.\ \

Method

\

To make an EST, RNA is isolated from cells and reverse\ transcribed into cDNA. Typically, the cDNA is cloned\ into a plasmid vector, and a read taken from the 5'\ and/or 3' primer. For most - but not all - ESTs, the\ reverse transcription is primed by an oligo-dT, which\ hybridizes with the poly-A tail of mature mRNA. The\ reverse transcriptase may or may not make it to the 5'\ end of the mRNA, which may or may not be degraded.

\ \

In general, the 3' ESTs mark the end of transcription\ reasonably well, but the 5' ESTs may end at any point\ within the transcript. Some of the newer cap-selected\ libraries are starting to hit transcription start\ reasonably well. Before the cap-selection techniques\ emerged, some projects used random rather than poly-A\ priming in an attempt to get sequence distant from the\ 3' end. These projects were successful at this, but as\ a side effect also deposited sequences from unprocessed\ mRNA and perhaps even genomic sequences into the EST databases.\ (Even outside of the random-primed projects, there is a\ degree of non-mRNA contamination.) Because of this, a\ single unspliced EST should be viewed with considerable\ skepticism. However, because the $organism 3' UTRs are quite\ long, the splicing requirement does eliminate many genuine 3'\ ESTs.

\ \

To generate this track, $organism ESTs from GenBank are aligned \ against the genome using the \ blat \ program. Note that the maximum intron length\ allowed by blat is 500,000 bases, which may eliminate some ESTs with very \ long introns that might otherwise align. When a single \ EST aligns in multiple places, the alignment having the \ highest base identity is found. Only alignments that have \ a base identity level within 1% of the best are kept. \ Alignments must also have at least 93% base identity to be kept.

\ \

Using the Filter

\

The track filter can be used to change the color or include/exclude a subset of \ individual items within a track. This is helpful when many items are shown in the \ track display, especially when only some are relevant to the current task. To use the\ filter:\

    \
  1. Enter a value in one or more of the text boxes to filter the EST display. For\ example, to apply the filter to all ESTs expressed in the liver, type "liver" in the \ tissue box. For a list of permissible filter values, consult the non-positional table in\ the Table Browser that corresponds to the factor on which you wish to filter. For\ example, the non-positional table "tissue" contains all of the types of tissues\ that can be entered into the tissue text box. Wildcards can also be used in the\ filter.\
  2. If filtering on more than one value, choose the desired combination\ logic. If "and" is selected, only ESTs that match all of the filter criteria will\ be highlighted. If "or" is selected, ESTs that match any one of the filter criteria\ will be highlighted.\
  3. Choose the color or display characteristic that should be used to highlight or\ include/exclude the filtered items. If "exclude" is chosen, the browser will not \ display ESTs that match the filter criteria. If "include" is selected, the browser \ will display only those ESTs that match the filter criteria.\

\

\ When you have finished configuring the filter, click the Submit button.
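
The "and"/"or" combination logic described in the steps above can be sketched as below. This is only an illustration of the matching rule: the field names are hypothetical, and the wildcard syntax is assumed to be shell-style globbing, which the text does not specify.

```python
from fnmatch import fnmatchcase

def matches_filter(item, criteria, logic="and"):
    """item: dict of field -> value; criteria: dict of field -> wildcard pattern."""
    results = [
        fnmatchcase(item.get(field, "").lower(), pattern.lower())
        for field, pattern in criteria.items()
    ]
    if not results:
        return True  # no criteria entered: nothing is filtered
    # "and" requires every criterion to match; "or" requires at least one.
    return all(results) if logic == "and" else any(results)
```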

\ \

Credits

\

\ The $Organism EST track is produced at UCSC from EST sequence data\ submitted to the international public sequence databases by \ scientists worldwide.\ \

References

\

\ \ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. (2004)\ GenBank: update. Nucleic Acids Res. 32 Database issue:D23-6.\ \ \ rna 1 xenoMrna Non-$Organism mRNAs psl xeno Non-$Organism mRNAs from GenBank 0 63 0 0 0 127 127 127 1 0 0

Description

\

\ This track displays translated \ blat\ alignments of\ non-$organism vertebrate and invertebrate mRNA from \ GenBank.

\ \

The strand information (+/-) for this track is in two parts. The\ first + indicates the orientation of the query sequence whose\ translated protein produced the match (here always 5' to 3', hence +).\ The second + or - indicates the orientation of the matching \ translated genomic sequence. Because the two orientations of a DNA \ sequence give different predicted protein sequences, there are four \ combinations. ++ is not the same as --; nor is +- the same as -+.\ \ \

Method

\

\ The alignments were passed through a near-best-in-genome filter.

\ \

Using the Filter

\

The track filter can be used to color, include, or exclude a subset of individual \ items within a track. This is helpful when many items are shown in the track\ display, especially when only some are relevant to the current task. To use the\ filter:\

    \
  1. Enter a value in one or more of the text boxes to filter the mRNA display. For\ example, to apply the filter to all brain mRNAs, type "brain" in the \ tissue box. For a list of permissible filter values, consult the non-positional table in\ the Table Browser that corresponds to the factor on which you wish to filter. For\ example, the non-positional table "tissue" contains all of the types of tissues\ that can be entered into the tissue text box. Wildcards can also be used in the\ filter.\
  2. If filtering on more than one value, choose the desired combination\ logic. If "and" is selected, only mRNAs that match all of the filter criteria will\ be highlighted. If "or" is selected, mRNAs that match any one of the filter criteria\ will be highlighted.\
  3. Choose the color or display characteristic that will be used to highlight or\ include/exclude the filtered items. If "exclude" is chosen, the browser will not \ display mRNAs that match the filter criteria. If "include" is selected, the browser \ will display only those mRNAs that match the filter criteria.\

\

When you have finished configuring the filter, click the Submit button.

\ \

References

\

\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. (2004)\ GenBank: update. \ Nucleic Acids Res. 32 Database issue:D23-6.\ rna 1 cdsDrawOptions enabled\ xenoEst Non-$Organism ESTs psl xeno Non-$Organism ESTs from GenBank 0 65 0 0 0 127 127 127 1 0 0 http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?form=4&db=n&term=$$

Description

\

\ This track displays translated \ blat\ alignments of non-$organism vertebrate ESTs from \ GenBank.

\ \

The strand information (+/-) for this track is in two parts. The\ first + or - indicates the orientation of the query sequence whose\ translated protein produced the match. The second + or - indicates the\ orientation of the matching translated genomic sequence. Because the two\ orientations of a DNA sequence give different predicted protein sequences,\ there are four combinations. ++ is not the same as --; nor is +- the same\ as -+.\ \

Method

\

\ To generate this track, the ESTs are aligned against the genome using the blat\ program. The alignments are passed through a piecewise near-best-in-genome\ filter.

\ \

Using the Filter

\

The track filter can be used to change the color or include/exclude a subset of \ individual items within a track. This is helpful when many items are shown in the \ track display, especially when only some are relevant to the current task. To use the\ filter:\

    \
  1. Enter a value in one or more of the text boxes to filter the EST display. For\ example, to apply the filter to all ESTs expressed in the liver, type "liver" in the \ tissue box. For a list of permissible filter values, consult the non-positional table in\ the Table Browser that corresponds to the factor on which you wish to filter. For\ example, the non-positional table "tissue" contains all of the types of tissues\ that can be entered into the tissue text box. Wildcards can also be used in the\ filter.\
  2. If filtering on more than one value, choose the desired combination\ logic. If "and" is selected, only ESTs that match all of the filter criteria will\ be highlighted. If "or" is selected, ESTs that match any one of the filter criteria\ will be highlighted.\
  3. Choose the color or display characteristic that should be used to highlight or\ include/exclude the filtered items. If "exclude" is chosen, the browser will not \ display ESTs that match the filter criteria. If "include" is selected, the browser \ will display only those ESTs that match the filter criteria.\

\

When you have finished configuring the filter, click the Submit button.

\ \

Credits

\

\ This track is produced at UCSC from EST sequence data submitted to the\ international public sequence databases by scientists worldwide.

\ \

References

\

\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. (2004)\ GenBank: update. Nucleic Acids Res. 32 Database issue:D23-6.\ rna 1 tigrGeneIndex TIGR Gene Index genePred Alignment of TIGR Gene Index TCs Against the $Organism Genome 0 68 100 0 0 177 127 127 0 0 0 http://www.tigr.org/tigr-scripts/tgi/tc_report.pl?$$

Description

\

This track displays alignments of the TIGR Gene Index (TGI)\ against the $organism genome. The TIGR Gene Index is based\ largely on assemblies of EST sequences in the public databases.\ See \ www.tigr.org for more information about TIGR and the Gene Index.

\

Credits

\

Thanks to Foo Cheung and Razvan Sultana of The Institute for Genomic Research for converting these data into a track for the browser.

\ rna 1 rnaCluster Gene Bounds bed 12 Gene Boundaries as Defined by RNA and Spliced EST Clusters 0 71 200 0 50 227 127 152 0 0 0

Description

\

\ This track shows the boundaries of genes and the direction of\ transcription as deduced from clustering spliced ESTs and mRNAs\ against the genome. When there are many spliced variants\ of the same gene, this track shows the variant that\ spans the greatest distance in the genome.

\ \

Method

\

\ ESTs and mRNAs from \ GenBank are aligned against the genome with the \ blat\ program, and filtered to keep only those alignments\ that have at least 97.5% base identity within the \ aligning blocks. When multiple alignments occur, only the\ alignments with a percentage identity within 0.2% of the\ best alignment are kept. ESTs that align without any\ introns are discarded. Blocks that are less than 130 bases\ and are not next to an intron are discarded. Blocks smaller\ than 10 bases are discarded. The orientations of the \ ESTs and mRNAs are deduced from the GT/AG splice sites\ at the introns, and ESTs and mRNAs with overlapping blocks\ on the same strand are merged into clusters. Only the\ extent and orientation of the clusters are shown here.\
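
The final merging step, in which transcripts with overlapping blocks on the same strand are grouped into clusters, might look roughly like the sketch below. It assumes the identity and block-size filters have already been applied and that each transcript's strand is already known (the track deduces it from splice sites). The function name and the input layout are hypothetical.

```python
def cluster_transcripts(transcripts):
    """transcripts: dict name -> (strand, [(start, end), ...]) with half-open blocks.
    Returns a sorted list of (strand, start, end, sorted member names)."""
    parent = {name: name for name in transcripts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for strand in "+-":
        blocks = sorted(
            (start, end, name)
            for name, (s, blks) in transcripts.items() if s == strand
            for start, end in blks
        )
        run_end, run_name = None, None
        for start, end, name in blocks:
            if run_end is not None and start < run_end:
                # This block overlaps the current run: merge the two transcripts.
                parent[find(name)] = find(run_name)
                run_end = max(run_end, end)
            else:
                run_end, run_name = end, name

    clusters = {}
    for name, (strand, blks) in transcripts.items():
        root = find(name)
        lo = min(b[0] for b in blks)
        hi = max(b[1] for b in blks)
        s, cur_lo, cur_hi, members = clusters.get(root, (strand, lo, hi, []))
        clusters[root] = (s, min(cur_lo, lo), max(cur_hi, hi), members + [name])
    return sorted((s, lo, hi, sorted(m)) for s, lo, hi, m in clusters.values())
```

Note that a transcript lying entirely inside another transcript's intron is not merged, because clustering is driven by block overlap rather than by overall extent.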

\

Credits

\

\ This track -- which was originally developed by Jim Kent --\ was generated at UCSC and uses data submitted to GenBank by \ scientists worldwide.

\ \

References

\

\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. (2004)\ GenBank: update. \ Nucleic Acids Res. 32 Database issue:D23-6.\ rna 1 affyGnfU74A GNF U74A expRatio GNF Expression Atlas on Mouse Affymetrix U74A Chip 0 82 0 0 0 127 127 127 0 0 0

Description

\

This track shows expression data from GNF (The Genomics Institute of the Novartis Research Foundation)\ using the Affymetrix U74A chip.\

\ \

Methods

\

For detailed information about the experiments, see Su et al., \ "Large-scale analysis of the human and mouse transcriptomes.", \ PNAS, Mar 19, 2002. Alignments displayed on the track\ correspond to the consensus sequences from which Affymetrix\ chose probes.

\

In dense mode, the track color denotes the average signal over all\ experiments on a log base 2 scale. Lighter colors correspond to lower signals \ and darker colors correspond to higher signals. In full\ mode, the color of each item represents the log base 2 ratio of the signal of\ that particular experiment to the median signal of all experiments for that probe.\
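
As a small worked illustration of the full-mode coloring value, the log base 2 ratio of each experiment's signal to the probe's median signal can be computed as below. The function name is hypothetical, and signals are assumed to be positive.

```python
import math
from statistics import median

def log2_ratios(signals):
    """Return log2(signal / median signal) for one probe across all experiments."""
    med = median(signals)
    return [math.log2(s / med) for s in signals]
```

For signals of 100, 200 and 400 the median is 200, giving ratios of -1, 0 and 1: the first experiment is two-fold below the median and the last is two-fold above it.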

More information about individual probes and probe sets is available at\ Affymetrix's netaffx.com website. \ \

Credits

\

Thanks to GNF for providing these data.

\ regulation 0 chip U74\ expScale 4.0\ expStep 0.5\ expTable gnfMouseU74aAllExps\ affyGnfU74B GNF U74B expRatio GNF Expression Atlas on Mouse Affymetrix U74B Chip 0 82.1 0 0 0 127 127 127 0 0 0

Description

\

This track shows expression data from GNF (The Genomics Institute of the Novartis Research Foundation)\ using the Affymetrix U74B chip.\

\ \

Methods

\

For detailed information about the experiments, see Su et al., \ "Large-scale analysis of the human and mouse transcriptomes.", \ PNAS, Mar 19, 2002. Alignments displayed on the track\ correspond to the consensus sequences from which Affymetrix\ chose probes.

\

In dense mode, the track color denotes the average signal over all\ experiments on a log base 2 scale. Lighter colors correspond to lower signals \ and darker colors correspond to higher signals. In full\ mode, the color of each item represents the log base 2 ratio of the signal of\ that particular experiment to the median signal of all experiments for that probe.\

More information about individual probes and probe sets is available at\ Affymetrix's netaffx.com website. \ \

Credits

\

Thanks to GNF for providing these data.

\ regulation 0 chip U74\ expScale 4.0\ expStep 0.5\ expTable gnfMouseU74bAllExps\ affyGnfU74C GNF U74C expRatio GNF Expression Atlas on Mouse Affymetrix U74C Chip 0 82.2 0 0 0 127 127 127 0 0 0

Description

\

This track shows expression data from GNF (The Genomics Institute of the Novartis Research Foundation)\ using the Affymetrix U74C chip.\

\ \

Methods

\

For detailed information about the experiments, see Su et al., \ "Large-scale analysis of the human and mouse transcriptomes.", \ PNAS, Mar 19, 2002. Alignments displayed on the track\ correspond to the consensus sequences from which Affymetrix\ chose probes.

\

In dense mode, the track color denotes the average signal over all\ experiments on a log base 2 scale. Lighter colors correspond to lower signals \ and darker colors correspond to higher signals. In full\ mode, the color of each item represents the log base 2 ratio of the signal of\ that particular experiment to the median signal of all experiments for that probe.\

More information about individual probes and probe sets is available at\ Affymetrix's netaffx.com website. \ \

Credits

\

Thanks to GNF for providing these data.

\ regulation 0 chip U74\ expScale 4.0\ expStep 0.5\ expTable gnfMouseU74cAllExps\ affyU74 Affy U74 psl . Alignments of Affymetrix Consensus Sequences from MG-U74 v2 (A,B, and C) 0 86 0 0 0 127 127 127 0 0 0

Description

\

This track shows the location of the consensus sequences used for the selection of probes on the Affymetrix MG-U74v2 set (A,B and C) of chips.

\ \

Methods

\

Consensus sequences were downloaded from the\ Affymetrix Product Support\ and mapped to the genome using blat followed by pslReps with the parameters\ -minCover=0.3, -minAli=0.95 and -nearTop=0.005. \ \

Credits

\ Thanks to Affymetrix for the data underlying this\ track.\ regulation 1 cpgIsland CpG Islands bed 4 + CpG Islands (Islands < 300 Bases are Light Green) 0 90 0 100 0 128 228 128 0 0 0

Description

\

\ CpG islands are associated with genes, particularly housekeeping\ genes, in vertebrates. CpG islands are typically common near\ transcription start sites, and may be associated with promoter\ regions. Normally a C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in\ vertebrate DNA because the C's in such an arrangement tend to be\ methylated. This methylation helps distinguish the newly synthesized\ DNA strand from the parent strand, which aids in the final stages of\ DNA proofreading after duplication. However, over evolutionary time\ methylated C's tend to turn into T's because of spontaneous\ deamination. The result is that CpG's are relatively rare unless\ there is selective pressure to keep them or a region is not methylated\ for some reason, perhaps having to do with the regulation of gene\ expression. CpG islands are regions where CpG's are present at\ significantly higher levels than is typical for the genome as a whole.\

\ \

Method

\

\ CpG islands are predicted by searching the sequence one base at a\ time, scoring each dinucleotide (+17 for CG and -1 for others) and\ identifying maximally scoring segments. Each segment is then\ evaluated to determine GC content (roughly >= 50%), length (> 200), and ratio of\ observed number of CG dinucleotides to the expected number on\ the basis of the GC content of the segment (> 0.6). \

\

\ The CpG count is the number of CG dinucleotides in the island. \ The Percentage CpG is the ratio of CpG nucleotide bases\ (twice the CpG count) to the length.\
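
The per-island statistics and the three acceptance thresholds can be sketched as follows. The text does not spell out how the expected CpG count is derived from GC content. The sketch assumes the commonly used form (number of C times number of G, divided by length), so treat that line as an assumption rather than the track's exact method. The function names are hypothetical.

```python
def cpg_stats(seq):
    """Compute CpG count, Percentage CpG, GC fraction and obs/exp ratio for a segment."""
    seq = seq.upper()
    length = len(seq)
    cpg_count = sum(1 for i in range(length - 1) if seq[i:i + 2] == "CG")
    c, g = seq.count("C"), seq.count("G")
    # Assumed expectation: C and G placed independently along the segment.
    expected = c * g / length
    return {
        "length": length,
        "cpg_count": cpg_count,
        "percent_cpg": 100.0 * 2 * cpg_count / length,  # twice the CpG count over length
        "gc_fraction": (c + g) / length,
        "obs_exp": cpg_count / expected if expected else 0.0,
    }

def passes_island_criteria(stats):
    """Thresholds quoted in the text: GC >= 50%, length > 200, obs/exp > 0.6."""
    return (stats["gc_fraction"] >= 0.5
            and stats["length"] > 200
            and stats["obs_exp"] > 0.6)
```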

\ \

Credits

\

\ This track was generated \ using a\ modification of a program developed by G. Miklem and L. Hillier. \

\ \ regulation 1 blatFugu Fugu Blat psl xeno Takifugu rubripes Translated Blat Alignments 1 113 0 60 120 200 220 255 1 0 0

Description

\

\ The Fugu v.3.0 (Aug. 2002) whole genome shotgun assembly was provided by the\ US \ DOE Joint Genome Institute (JGI). The assembly was constructed with the JGI\ assembler, JAZZ, from paired end sequencing reads produced at JGI, Myriad \ Genetics, and Celera Genomics, resulting in a sequence coverage of 5.7X. All reads are\ plasmid, cosmid, or BAC end sequences, with the predominant coverage\ derived from 2 Kb insert plasmids. This assembly contains 20,379\ scaffolds totaling 319 million base pairs. The largest 679 scaffolds\ total 160 million base pairs.\ \

Methods

\

The alignments were done \ with \ blat \ in translated protein mode requiring two nearby 4-mer matches\ to trigger a detailed alignment. The human\ genome was masked with \ RepeatMasker and Tandem Repeat Finder before \ running blat.

\ \

Credits

\

The 3.0 draft from the\ \ JGI Fugu rubripes website was used in the\ UCSC Genome Browser Fugu blat alignments. These data have been provided freely by the JGI\ for use in this publication only.

\ compGeno 1 ecoresFr1 Fugu Ecores bed 12 . Mouse/Fugu Evolutionary Conserved Regions 0 113.1 0 60 120 200 220 255 0 0 0 http://www.genoscope.cns.fr/comparative?premier=Mouse&deuxieme=Takifugu&position=$$

Description

\ This track shows Evolutionary Conserved Regions computed by the \ \ Exofish program at Genoscope.\ Each singleton block corresponds to an "ecore"; two blocks connected by a thin line \ correspond to an "ecotig", a set of colinear ecores in a syntenic region. \ \

Methods

\ Genome-wide sequence comparisons were done at the protein-coding level between the genome sequences\ of mouse, Mus musculus, and Fugu (Japanese pufferfish), Takifugu \ rubripes, to detect evolutionarily conserved regions (ECORES), which may \ also be viewed using the Genoscope comparative genomics browser. \ The sequence versions used in the comparison were \ Mouse (February 2003) and Fugu (August 2002). \ \ \

Credits

\ \ Thanks to Olivier Jaillon at Genoscope for contributing the data. \ \ compGeno 1 urlLabel Exofish Ecotig Display Link:\ ecoresTetraodon Tetraodon Ecores bed 12 . Mouse/Tetraodon Evolutionary Conserved Regions 0 113.2 0 60 120 200 220 255 0 0 0 http://www.genoscope.cns.fr/comparative?premier=Mouse&deuxieme=Tetraodon&position=$$

Description

\ This track shows Evolutionary Conserved Regions computed by the \ \ Exofish program at Genoscope.\ Each singleton block corresponds to an "ecore"; two blocks connected by a thin line \ correspond to an "ecotig", a set of colinear ecores in a syntenic region. \ \

Methods

\ Genome-wide sequence comparisons were done at the protein-coding level between the genome sequences\ of mouse, Mus musculus, and Tetraodon, Tetraodon nigroviridis, to detect evolutionarily conserved regions (ECORES), which may also be viewed \ using the Genoscope comparative genomics browser. \ The sequence versions used in the comparison were \ Mouse (February 2003) and Tetraodon (March 2004). \ \ \

Credits

\ \ Thanks to Olivier Jaillon at Genoscope for contributing the data. \ \ compGeno 1 urlLabel Exofish Ecotig Display Link:\ mm3Hg15L Human Cons sample 0 8 Mouse/Human (Apr. 2003/hg15) Evolutionary Conservation Score 2 120 100 50 0 175 150 128 0 0 0

Description

\

\ This track displays the conservation between the mouse (Feb. 2003) and human (April 2003) genomes for \ 50 bp windows in the mouse genome that have at least 15 bp aligned to\ human. Unlike previous versions of this track, it is based on the netting alignment, which\ throws out some of the less well supported blastz alignment pieces.\ The score for a window reflects the probability that the\ level of observed conservation in that 50 bp region would occur by\ chance under neutral evolution. It is given on a logarithmic scale,\ and thus it is called the "L-score". An L-score of 1 means there is a\ 1/10 probability that the observed conservation level would occur by\ chance, an L-score of 2 means a 1/100 probability, an L-score of 3\ means a 1/1000 probability, etc. The L-scores display as\ "mountain ranges". Clicking on a mountain range displays a\ detail page from which you can access the base level alignments, both\ for the whole region and for the individual 50 bp windows.\

\ \

Methods

\

\ Genome-wide alignments between mouse and human were produced by\ blastz and filtered for pseudogenes and artifacts with Jim Kent's "netting". \ A set of 50 bp windows in the mouse genome was determined\ by scanning the sequence, sliding 5 bases at a time, and only those\ windows with at least 15 aligned bases were kept. For each window,\ a conservation score defined by\

\

\ S = sqrt(n / (m(1-m))) * (p - m)\
\
\ was calculated, where n is the number of aligning bases in the\ window, p is the percent identity between mouse and human for these\ aligning bases, and m is the average percent identity for aligned\ neutrally evolving bases in a larger region surrounding the 50 bp\ window being scored. Neutral bases were taken from ancestral repeat\ sequences, which are relics of transposons that were inserted before\ the mouse-human split. To transform S into an L-score, the empirical\ cumulative distribution function CDF(S) = P(x < S)\ is computed from the scores of all windows genome-wide, and\ the L-score is defined as\

\
\ L = -log_10(1 - CDF(S)).\
\
\
\ The L-score\ provides a frequentist confidence assessment. A Bayesian\ calculation of the probability that a window is under\ selection can also be made using a mixture decomposition of\ the empirical density of the scores for all windows\ genome-wide into a neutral and a selected component. Details\ are given in a manuscript in preparation. The results are\ summarized in the table below.\

\
\
\
L-score       Frequentist probability       Bayesian probability\
              of this L-score or greater    that window with this\
              given neutral evolution       L-score is under\
                                            selection\
\
------------------------------------------------------------------\
\
  1                0.1                          0.32\
  2                0.01                         0.75\
  3                0.001                        0.94\
  4                0.0001                       0.97\
  5                0.00001                      0.98\
  6                0.000001                     0.99\
  7                0.0000001                    >0.99\
  8                0.00000001                   >0.99\
\
\
\
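
The score calculation can be sketched in a few lines. This is an illustration only. It reads the S formula as sqrt(n / (m(1-m))) multiplied by (p - m), takes p and m as fractions between 0 and 1 rather than percentages, and builds the empirical CDF from a pre-sorted list of all window scores. The function names are hypothetical.

```python
import math
from bisect import bisect_left

def s_score(n, p, m):
    """n: aligned bases in the window; p: window identity; m: neutral identity (fractions)."""
    return math.sqrt(n / (m * (1.0 - m))) * (p - m)

def l_score(s, all_scores_sorted):
    """L = -log10(1 - CDF(S)), with CDF(S) = P(x < S) taken from the sorted scores."""
    below = bisect_left(all_scores_sorted, s)  # number of scores strictly less than s
    cdf = below / len(all_scores_sorted)
    if cdf >= 1.0:
        return math.inf  # s exceeds every observed score
    return -math.log10(1.0 - cdf)
```

A window whose S exceeds 90% of all genome-wide scores therefore gets an L-score of about 1, matching the first row of the table.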

\ \

Using the Filter

\

The track filter can be used to configure some of the display characteristics\ of the track. \

\

\ When you have finished configuring the filter, click the Submit button.

\ \

Credits

\

\ Thanks to Jim Kent for creating the blastz alignments and post-processing them to create the netted alignment.\ Ryan Weber computed the window S-scores, computed the CDF of these scores, and created the remaining track display\ functions. Mark Diekhans and Krishna Roskin created software used in this process.\ Mouse sequence data are provided by the Mouse Genome Sequencing Consortium.\

\ \ compGeno 0 blastzTightHuman Human Tight psl xeno hg15 $o_Organism ($o_date/$o_db) Blastz Tight Subset of Alignments 0 124 100 50 0 255 240 200 1 0 0

Description

\ \

\ This track displays blastz alignments of the April 2003 human\ assembly to the mouse genome, filtered by axtBest and \ subsetAxt with very stringent constraints as described below. \ \

Each item in the display is identified by the chromosome, strand, and \ location of the match (in thousands).\ \

Methods

\

\ For blastz, we use 12 of 19 seeds and then score using: \

\
      A     C     G     T\
A    91  -114   -31  -123\
C  -114   100  -125   -31\
G   -31  -125   100  -114\
T  -123   -31  -114    91\
\
O = 400, E = 30, K = 3000, L = 3000, M = 50\
\

\ A second pass is done at reduced stringency (7mer seeds and\ MSP threshold of K=2200) to attempt to fill in gaps of up to about 10K bp.\ Lineage specific repeats are abridged during this alignment. \ axtBest \ is used to select only the best alignment for any given region\ of the genome. subsetAxt is then run on axtBest-filtered alignments \ with this matrix:\

\
\
      A     C     G     T\
A   100  -200  -100  -200\
C  -200   100  -200  -100\
G  -100  -200   100  -200\
T  -200  -100  -200   100\
\

\ with a gap open penalty of 2000 and a gap extension penalty of 50. \ The minimum score threshold was 3400.\
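
To make the stringency concrete, the sketch below scores an already-aligned pair of sequences with the matrix and gap penalties quoted above and compares the result with the 3400 threshold. It is not the subsetAxt implementation. Charging open plus extension for the first gapped column is an assumption, and the function only scores a whole alignment rather than searching for high-scoring sub-segments.

```python
BASES = "ACGT"
TIGHT_MATRIX = [
    [100, -200, -100, -200],
    [-200, 100, -200, -100],
    [-100, -200, 100, -200],
    [-200, -100, -200, 100],
]
GAP_OPEN, GAP_EXTEND, MIN_SCORE = 2000, 50, 3400

def score_alignment(query, target, matrix=TIGHT_MATRIX,
                    gap_open=GAP_OPEN, gap_extend=GAP_EXTEND):
    """Score two equal-length aligned strings over ACGT; '-' marks a gap column."""
    score = 0
    in_gap = False
    for q, t in zip(query.upper(), target.upper()):
        if q == "-" or t == "-":
            # Assumed convention: first gap column costs open + extend.
            score -= gap_extend if in_gap else gap_open + gap_extend
            in_gap = True
        else:
            score += matrix[BASES.index(q)][BASES.index(t)]
            in_gap = False
    return score

def passes_tight_filter(query, target):
    return score_alignment(query, target) >= MIN_SCORE
```

Under these numbers a single gap costs more than twenty matching bases, and at least 34 perfectly matching bases are needed to reach the threshold.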

\

Using the Filter

\

The track filter can be used to turn on the chromosome color track or to \ filter the display output by chromosome.\

\ When you have finished configuring the filter, click the Submit button.\

\

Credits

\ The alignments are contributed by Scott Schwartz of the \ \ Penn State Bioinformatics Group. \ The best in genome filtering is done by UCSC's \ axtBest \ and subsetAxt programs. Mouse sequence data are provided by the \ Mouse Genome Sequencing Consortium. \ \ compGeno 1 otherDb hg15\ humanChain Human Chain chain hg15 $o_Organism ($o_date/$o_db) Chained Alignments 0 125 100 50 0 255 240 200 1 0 0

Description

\ This track shows human/$organism genomic alignments using\ a gap scoring system that allows longer gaps than traditional\ affine gap scoring systems. It can also tolerate gaps\ in both human and $organism simultaneously. These "double-sided"\ gaps can be caused by local inversions and overlapping deletions\ in both species. The human sequence is from the April 2003 (hg15) assembly.\

\ The chain track displays boxes joined together by either single or \ double lines. The boxes represent aligning regions. \ Single lines indicate gaps that are largely due to a deletion in the genome of \ the non-$organism species or an insertion in the $organism assembly.\ Double lines represent more complex gaps that involve substantial\ sequence in both species. This may result from inversions, overlapping\ deletions, an abundance of local mutation, or an unsequenced gap in one species.\ In cases where there are multiple \ chains over a particular portion of the $organism genome, chains with \ single-lined gaps are often due to processed pseudogenes, while chains \ with double-lined gaps are more often due to paralogs and unprocessed \ pseudogenes.\ \

The display indicates the chromosome, strand, and location of the \ match for each matching alignment (in thousands). \ \

Methods

\ Transposons that have been inserted since the human/$organism\ split are removed, and the resulting abbreviated genomes are\ aligned with blastz. The transposons are then put back into the\ alignments. The resulting alignments are converted into axt format\ and the resulting axts are fed into axtChain. AxtChain organizes all \ the alignments between a single human and a single $organism chromosome\ into a group, and makes a kd-tree out of all the gapless subsections\ (blocks) of the alignments. Next, maximally scoring chains of these\ blocks are found by running a dynamic program over the kd-tree. Chains\ scoring below a threshold are discarded, and the remaining chains are\ displayed here.\ \
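
The chaining idea can be illustrated with a small dynamic program. This sketch deliberately swaps the kd-tree for a plain quadratic search over predecessor blocks, handles one strand only, and uses a made-up gap penalty, since the text gives no gap costs. All names are hypothetical.

```python
def gap_cost(dt, dq):
    """Hypothetical penalty for skipping dt target bases and dq query bases."""
    return (dt + dq) // 10

def best_chain(blocks):
    """blocks: list of gapless (t_start, t_end, q_start, q_end, score) tuples.
    Returns (chain score, blocks of the best chain in order)."""
    if not blocks:
        return 0, []
    order = sorted(range(len(blocks)), key=lambda i: blocks[i][0])
    best, prev = {}, {}
    for pos, j in enumerate(order):
        tj, _, qj, _, sj = blocks[j]
        best[j], prev[j] = sj, None
        for i in order[:pos]:
            _, te, _, qe, _ = blocks[i]
            # Block i may precede block j only if it ends before j in both genomes.
            if te <= tj and qe <= qj:
                cand = best[i] + sj - gap_cost(tj - te, qj - qe)
                if cand > best[j]:
                    best[j], prev[j] = cand, i
    end = max(best, key=best.get)
    chain, node = [], end
    while node is not None:
        chain.append(blocks[node])
        node = prev[node]
    return best[end], chain[::-1]
```

The kd-tree in the real program serves the same purpose as the inner loop here: finding candidate predecessor blocks, but without examining every earlier block.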

Credits

\

\ Blastz \ was developed at Pennsylvania State University by \ Scott Schwartz, Zheng Zhang, and Webb Miller with advice from\ Ross Hardison.

\ \

Lineage-specific repeats were identified by Arian Smit and his\ program RepeatMasker.

\ \

The axtChain program was developed at the University of California\ at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.

\ \

The browser display and database storage of the chains were made\ by Robert Baertsch and Jim Kent.

\ \

References

\

\ Chiaromonte F, Yap VB, Miller W (2002). \ Scoring pairwise genomic sequence alignments. \ Pac Symp Biocomput 2002;:115-26.

\

\ Kent WJ, Baertsch R, Hinrichs A, Miller W, and Haussler D (2003).\ Evolution's cauldron: \ Duplication, deletion, and rearrangement in the mouse and human genomes. \ Proc Natl Acad Sci USA 100(20):11484-11489 Sep 30 2003.\

\ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison R, Haussler D, and \ Miller W (2003). \ Human-Mouse \ Alignments with BLASTZ. Genome Res. 13(1):103-7.

\ compGeno 1 otherDb hg15\ humanNet Human Net netAlign hg15 humanChain $o_Organism ($o_date/$o_db) Alignment Net 0 126 0 0 0 127 127 127 1 0 0

Description

\ This track shows the best human/$organism chain for \ every part of the $organism genome. It is useful for\ finding orthologous regions and for studying genome\ rearrangement. In full mode the top level (level 1)\ chains are the largest, highest scoring chains that\ span this region. In many cases there are gaps in the\ top level chain. When possible, these are filled in by\ other chains that are displayed at level 2. The gaps in \ level 2 chains may be filled by level 3 chains and so\ forth. The human sequence used in this annotation is from the April 2003 (hg15) assembly.\

\ In the graphical display, the boxes represent ungapped \ alignments, while the lines represent gaps. Clicking\ on a box brings up detailed information about the chain\ as a whole, while clicking on a line brings up information\ on the gap. The detailed information is useful in determining\ the cause of the gap or, for lower level chains, the genomic\ rearrangement.
\ \

Methods

\ First, chains are derived from \ blastz alignments as\ described in the details pages of the chain tracks\ and sorted so that the highest scoring chain in the\ genome is first. The program chainNet then places\ chains one at a time, trimming them as necessary to\ fit into a section that is not already covered by \ a higher scoring chain. During this process, a\ natural hierarchy emerges where chains that fill gaps\ in a previous chain are considered underneath the\ previous chain. The program netSyntenic fills in\ information about the relationship between upper\ and lower level chains, including whether a lower level\ chain is syntenic with the higher level chain, whether\ it is inverted with respect to the higher level chain,\ and so forth. The program netClass then fills in \ how much of the gaps and chains are filled with N's\ (sequencing gaps) in one or both species, how much\ is filled with transposons inserted before and after\ human and $organism diverged, and so forth.\ \
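
The greedy placement step can be sketched as below. This covers only the "place the highest scoring chain first, then trim the rest" part, on a single chromosome. It does not assign net levels or reproduce what netSyntenic and netClass add, and the names and input layout are hypothetical.

```python
def subtract(interval, covered):
    """Return the parts of interval not overlapped by covered (sorted, disjoint)."""
    start, end = interval
    pieces = []
    for c_start, c_end in covered:
        if c_end <= start or c_start >= end:
            continue  # no overlap with this covered stretch
        if c_start > start:
            pieces.append((start, c_start))
        start = max(start, c_end)
        if start >= end:
            break
    if start < end:
        pieces.append((start, end))
    return pieces

def place_chains(chains):
    """chains: list of (name, score, [(start, end), ...]) with disjoint blocks.
    Returns name -> blocks that survive trimming against higher scoring chains."""
    covered, placed = [], {}
    for name, score, blocks in sorted(chains, key=lambda c: -c[1]):
        pieces = [p for block in blocks for p in subtract(block, covered)]
        if pieces:
            placed[name] = pieces
            covered = sorted(covered + pieces)
    return placed
```

A lower scoring chain survives only where it lands in territory the better chains left uncovered, which is how gap-filling chains end up underneath the chain whose gap they occupy.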

Credits

\

The chainNet, netSyntenic, and netClass programs were\ developed at the University of California\ at Santa Cruz by Jim Kent.\ For more information, see \ Kent WJ, Baertsch R, Hinrichs A, Miller W, and Haussler D (2003). \ Evolution's cauldron: \ Duplication, deletion, and rearrangement in the mouse and human genomes. \ Proc Natl Acad Sci USA 100(20):11484-11489 Sep 30 2003.\ \

Lineage-specific repeats were identified by Arian Smit and his\ program RepeatMasker.

\ \

The browser display and database storage of the nets were made\ by Robert Baertsch and Jim Kent.

\ \ compGeno 0 otherDb hg15\ syntenyHuman Human Synteny bed 4 + Mouse/$o_Organism ($o_date/$o_db) Synteny Using Blastz Single Coverage (100k window) 0 127 0 100 0 255 240 200 0 0 0

Description

\

\ This track shows syntenous (corresponding) regions between human and mouse chromosomes. The Apr. 2003 (hg15) human assembly was used for this annotation.\

Methods

\

\ We passed a 100k non-overlapping window over the genome and - using the \ blastz\ best in mouse \ genome alignments - looked for high-scoring regions with at least 40% of the bases aligning \ with the same region in mouse. 100k segments were joined together if they agreed in direction and\ were within 500kb of each other in the human genome and within 4mb of each other in the mouse. \ Gaps were joined between syntenic anchors if the bases between two flanking regions agreed with \ synteny (direction and mouse location). Finally, we extended the syntenic block to include those \ areas.
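
The windowing step can be sketched as follows. The sketch assumes single-coverage (non-overlapping) alignment blocks on one chromosome and simplifies "the same region" to "the same chromosome in the other genome". The later joining of windows by direction and distance is not shown, and the names are hypothetical.

```python
from collections import defaultdict

WINDOW = 100_000
MIN_FRACTION = 0.40

def syntenic_windows(blocks, chrom_size, window=WINDOW, min_fraction=MIN_FRACTION):
    """blocks: (start, end, other_chrom) half-open alignments on one chromosome.
    Returns (window_start, window_end, other_chrom) for windows passing the cutoff."""
    aligned = defaultdict(lambda: defaultdict(int))
    for start, end, other in blocks:
        for w in range(start // window, (end - 1) // window + 1):
            w_start = w * window
            w_end = min(w_start + window, chrom_size)
            overlap = min(end, w_end) - max(start, w_start)
            if overlap > 0:
                aligned[w][other] += overlap

    anchors = []
    for w in sorted(aligned):
        w_start = w * window
        w_len = min(w_start + window, chrom_size) - w_start
        # Pick the partner chromosome with the most aligned bases in this window.
        other, bases = max(aligned[w].items(), key=lambda kv: kv[1])
        if bases / w_len >= min_fraction:
            anchors.append((w_start, w_start + w_len, other))
    return anchors
```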

\

Credits

\

\ Contact Robert \ Baertsch at UCSC for more information about this track.\ Thanks to the Mouse Genome Sequencing Consortium for providing the mouse sequence data. \ compGeno 1 otherDb hg15\ blastzBestHuman Human Best psl xeno hg15 $o_Organism ($o_date/$o_db) Blastz Best-in-Genome Alignments 0 127.5 100 50 0 255 240 200 1 0 0

Description

\

This track displays blastz alignments of the $o_date $o_organism \ assembly to the $organism genome filtered to display only the best alignment for any\ given region of the $organism genome. The track has an optional\ feature that color codes alignments to indicate the chromosomes from which \ they are derived in the aligning assembly. To activate the color feature,\ click the "on" radio button next to "Color track based on chromosome".\

\

Methods

\

\ For blastz, we use 12 of 19 seeds and then score using: \

\
      A     C     G     T\
A    91  -114   -31  -123\
C  -114   100  -125   -31\
G   -31  -125   100  -114\
T  -123   -31  -114    91\
\
O = 400, E = 30, K = 3000, L = 3000, M = 50\
\

\ We then do a second pass at reduced stringency (7mer seeds and\ MSP threshold of K=2200) to attempt to fill in gaps of up to about 10K bp.\ Lineage specific repeats are abridged during this alignment.\

\

Using the Filter

\

The track filter can be used to turn on the chromosome color track or to \ filter the display output by chromosome.\

\ When you have finished configuring the filter, click the Submit button.\

\

Credits

\

\ These alignments are contributed by Scott Schwartz of the Penn State Bioinformatics Group. The best in genome filtering\ is done by UCSC's \ axtBest \ program. Mouse sequence data are provided by the \ Mouse Genome Sequencing Consortium. \

\ compGeno 1 otherDb hg15\ blastzTightRat Rat Tight psl xeno rn2 $o_Organism ($o_date/$o_db) Blastz Tight Subset of Best Alignments 0 140 100 50 0 255 240 200 1 0 0

Description

\ \

\ This track displays blastz alignments of the Jan. 2003 rat \ assembly to the mouse genome, filtered by axtBest and \ subsetAxt with very stringent constraints as described below. \ \

Each item in the display is identified by the chromosome, strand, and \ location of the match (in thousands).\ \

Methods

\

\ For blastz, we use 12 of 19 seeds and then score using: \

\
      A     C     G     T\
A    91  -114   -31  -123\
C  -114   100  -125   -31\
G   -31  -125   100  -114\
T  -123   -31  -114    91\
\
O = 400, E = 30, K = 3000, L = 3000, M = 50\
\

\ A second pass is done at reduced stringency (7mer seeds and\ MSP threshold of K=2200) to attempt to fill in gaps of up to about 10K bp.\ Lineage-specific repeats are abridged during this alignment. \ axtBest \ is used to select only the best alignment for any given region of the genome. \ subsetAxt is then run on the axtBest-filtered alignments \ with this matrix:\
\

\
      A     C     G     T\
A   100  -200  -100  -200\
C  -200   100  -200  -100\
G  -100  -200   100  -200\
T  -200  -100  -200   100\
\

\ with a gap open penalty of 2000 and a gap extension penalty of 50. \ The minimum score threshold was 3400.\
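
To give a feel for how strict these constraints are, here is a hedged Python sketch that rescores an aligned pair of rows with the tight matrix and compares the result with the 3400 threshold. It is not the subsetAxt implementation: the names tight_score and passes_tight_filter are hypothetical, the gap cost convention (open penalty plus one extension penalty per gapped column) is an assumption, and a score equal to the threshold is assumed to pass. The real program may also report high-scoring subsections of an alignment rather than accepting or rejecting it whole.

```python
# Tight substitution matrix from the description above, keyed by base pair.
BASES = "ACGT"
TIGHT_ROWS = [
    [100, -200, -100, -200],
    [-200, 100, -200, -100],
    [-100, -200, 100, -200],
    [-200, -100, -200, 100],
]
TIGHT = {(a, b): TIGHT_ROWS[i][j]
         for i, a in enumerate(BASES) for j, b in enumerate(BASES)}
GAP_OPEN = 2000
GAP_EXTEND = 50
MIN_SCORE = 3400


def tight_score(target, query):
    """Score two equal-length alignment rows ('-' marks a gap).

    Assumption: a gap of length n costs GAP_OPEN + n * GAP_EXTEND.
    """
    if len(target) != len(query):
        raise ValueError("alignment rows must have the same length")
    score = 0
    previous_gap_row = None
    for t, q in zip(target.upper(), query.upper()):
        if t == "-" or q == "-":
            gap_row = "t" if t == "-" else "q"
            if gap_row != previous_gap_row:
                score -= GAP_OPEN
            score -= GAP_EXTEND
            previous_gap_row = gap_row
        else:
            score += TIGHT[(t, q)]  # KeyError for bases other than A, C, G, T
            previous_gap_row = None
    return score


def passes_tight_filter(target, query):
    """Return True when the rescored alignment reaches MIN_SCORE (assumed inclusive)."""
    return tight_score(target, query) >= MIN_SCORE


perfect_34 = "ACGT" * 8 + "AC"  # 34 identical bases
print(tight_score(perfect_34, perfect_34), passes_tight_filter(perfect_34, perfect_34))
```

With matches worth 100 each, an ungapped alignment needs at least 34 identical bases to reach 3400, and under the assumed convention every single-base gap must be offset by more than 20 additional matches.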

\

Using the Filter

\

The track filter can be used to turn on the chromosome color track or to \ filter the display output by chromosome.\

\ When you have finished configuring the filter, click the Submit button.\

\

Credits

\ The alignments are contributed by Scott Schwartz of the\ \ Penn State Bioinformatics Group. \ The best in genome filtering is done by UCSC's \ axtBest \ and subsetAxt programs. Mouse sequence data are provided by the \ Mouse Genome Sequencing Consortium. \ \ compGeno 1 otherDb rn2\ syntenyRat Rat Synteny bed 4 + $Organism/Rat (Nov. 2002/rn1) Synteny Using Blastz Single Coverage (100k window) 0 140 0 100 0 255 240 200 0 0 0

Description

\

\ This track shows syntenic (corresponding) regions between $Organism and rat chromosomes. The Nov. 2002 (rn1) assembly of the rat genome was used in this annotation.\

Methods

\

\ We passed a non-overlapping 100 kb window over the genome and, using the \ blastz \ best in rat \ genome alignments, looked for high-scoring regions in which at least 40% of the bases aligned \ with the same region in rat. 100 kb segments were joined together if they agreed in direction and\ were within 500 kb of each other in the $Organism genome and within 4 Mb of each other in the rat genome. \ Gaps between syntenic anchors were joined if the bases between the two flanking regions agreed with \ the synteny (direction and rat location). Finally, we extended the syntenic block to include those \ areas.
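
The first step of this procedure, the windowed coverage test, can be sketched in Python as follows. This is a simplified illustration rather than the code used to build the track: the function name syntenic_windows and the input layout are hypothetical, "the same region in rat" is approximated by "the same rat chromosome", and the joining and extension steps are not shown.

```python
from collections import defaultdict

WINDOW = 100_000      # window size in bases
MIN_FRACTION = 0.40   # minimum fraction of window bases aligned to one rat chromosome


def syntenic_windows(blocks, chrom_size, window=WINDOW, min_fraction=MIN_FRACTION):
    """Find windows whose aligned bases mostly map to one rat chromosome.

    blocks: iterable of (start, end, rat_chrom) intervals on one chromosome of
    the reference genome, half-open and assumed non-overlapping (single coverage).
    Returns a list of (window_start, window_end, rat_chrom, fraction).
    """
    # window index -> rat chromosome -> number of aligned bases in that window
    covered = defaultdict(lambda: defaultdict(int))
    for start, end, rat_chrom in blocks:
        pos = start
        while pos < end:
            index = pos // window
            piece_end = min((index + 1) * window, end)  # split at window borders
            covered[index][rat_chrom] += piece_end - pos
            pos = piece_end

    hits = []
    for index in sorted(covered):
        win_start = index * window
        win_end = min(win_start + window, chrom_size)  # last window may be short
        rat_chrom, bases = max(covered[index].items(), key=lambda item: item[1])
        fraction = bases / (win_end - win_start)
        if fraction >= min_fraction:
            hits.append((win_start, win_end, rat_chrom, fraction))
    return hits


example_blocks = [
    (0, 50_000, "chr5"),
    (90_000, 130_000, "chr5"),
    (150_000, 160_000, "chr2"),
]
print(syntenic_windows(example_blocks, chrom_size=300_000))
```

In the example, the first window has 60,000 of its 100,000 bases aligned to chr5 and is kept, while the second window has at most 30,000 bases aligned to any one rat chromosome and is dropped.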

\

Credits

\

\ Contact Robert \ Baertsch at UCSC for more information about this track.\ compGeno 1 otherDb rn1\ blastzBestRat Rat Best psl xeno rn2 $o_Organism ($o_date/$o_db) Blastz Best-in-Genome Alignments 0 141 100 50 0 255 240 200 1 0 0

Description

\

This track displays blastz alignments of the Nov. 2002 rat \ assembly to the mouse genome, filtered to display only the best alignment for any\ given region of the mouse genome. The track has an optional\ feature that color-codes alignments to indicate the chromosomes from which \ they are derived in the aligning assembly. To activate the color feature,\ click the "on" radio button next to "Color track based on chromosome".\

\

Methods

\

\ For blastz, we use 12 of 19 seeds and then score using: \

\
      A     C     G     T\
A    91  -114   -31  -123\
C  -114   100  -125   -31\
G   -31  -125   100  -114\
T  -123   -31  -114    91\
\
O = 400, E = 30, K = 3000, L = 3000, M = 50\
\

\ We then do a second pass at reduced stringency (7mer seeds and\ MSP threshold of K=2200) to attempt to fill in gaps of up to about 10K bp.\ Lineage-specific repeats are abridged during this alignment.\

\

Using the Filter

\

The track filter can be used to turn on the chromosome color track or to \ filter the display output by chromosome.\