The UCSC Genes track shows gene predictions based on data from RefSeq, GenBank, and the tRNA Genes track. This is a moderately conservative set of predictions, requiring the support of one GenBank RNA sequence plus at least one additional line of evidence. The RefSeq RNAs are an exception to this, requiring no additional evidence. The track includes both protein-coding and putative non-coding transcripts. Some of these non-coding transcripts may actually code for protein, but the evidence for the associated protein is weak at best. Compared to RefSeq, this gene set has generally about 10% more protein-coding genes, approximately five times as many putative non-coding genes, and about twice as many splice variants.
This track in general follows the display conventions for gene prediction tracks. The exons for putative noncoding genes and untranslated regions are represented by relatively thin blocks, while those for coding open reading frames are thicker. The following color key is used:
This track contains an optional codon coloring feature that allows users to quickly validate and compare gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. Click here for more information about this feature.
The UCSC Genes are built using a multi-step pipeline:
The UCSC Genes track was produced at UCSC using a computational pipeline developed by Jim Kent, Chuck Sugnet and Mark Diekhans. It is based on data from NCBI RefSeq, UniProt (including TrEMBL and TrEMBL-NEW) and GenBank. Our thanks to the people running these databases and to the scientists worldwide who have made contributions to them.
The UniProt data have the following terms of use, UniProt copyright(c) 2002 - 2004 UniProt consortium:
For non-commercial use, all databases and documents in the UniProt FTP directory may be copied and redistributed freely, without advance permission, provided that this copyright statement is reproduced with each copy.
For commercial use, all databases and documents in the UniProt FTP directory except the files
From January 1, 2005, all databases and documents in the UniProt FTP directory may be copied and redistributed freely by all entities, without advance permission, provided that this copyright statement is reproduced with each copy.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779
Hsu F, Kent WJ, Clawson H, Kuhn RM, Diekhans M, Haussler D. The UCSC Known Genes. Bioinformatics. 2006 May 1;22(9):1036-46. PMID: 16500937
Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518