Description

Retrotransposition is a process in which a group of enzymes reverse transcribes spliced mRNAs and the resulting DNA copies are inserted into the genome, producing single-exon copies of genes and sometimes chimeric genes. Retrogenes can be functional genes that have acquired a promoter from a neighboring gene, non-functional pseudogenes, or transcribed pseudogenes.

Methods

All mRNAs of a species from GenBank were aligned to the genome using lastz (Miller lab, Pennsylvania State University). mRNAs that aligned twice in the genome (once with introns and once without introns) were selected in an initial screen. Next, a series of features was scored to identify candidate retrotransposition events. These features include the position and length of the polyA tail, the degree of synteny with mouse, the coverage of repetitive elements, the number of exons that can still be aligned to the retrogene, and the degree of divergence from the parent gene. Retrogenes are classified using a score function that is a linear combination of this set of features. Retrogenes in the final set are selected using a score threshold based on a ROC plot against the Vega annotated pseudogenes.
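The scoring step described above can be sketched roughly as follows. This is a minimal illustration, not the RetroFinder implementation: the feature names, the weights, their signs, and the assumption that features are already scaled to comparable ranges are all hypothetical. The helper roc_points only shows how true-positive and false-positive rates could be tabulated at candidate thresholds against a labeled reference set such as annotated pseudogenes; how the final threshold is read off the ROC plot is not specified here.

```python
# Hypothetical weights for illustration only; the real values are not given
# in this description. Signs are assumptions: repeat coverage and divergence
# are treated as penalties, and the synteny break gets a small weight.
WEIGHTS = {
    "polya_length": 1.0,
    "exons_covered": 2.0,
    "synteny_break": 0.5,
    "repeat_coverage": -3.0,
    "divergence": -1.0,
}


def retro_score(features, weights=WEIGHTS):
    """Linear combination of feature values: sum of weight * feature."""
    return sum(weights[name] * features[name] for name in weights)


def roc_points(scores, labels, thresholds):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples.

    labels[i] is truthy when candidate i matches the reference annotation.
    A candidate is called positive when its score is >= the threshold.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        points.append((t, fp / negatives, tp / positives))
    return points


# Example candidate with made-up feature values.
example = {
    "polya_length": 20,
    "exons_covered": 3,
    "synteny_break": 1.0,
    "repeat_coverage": 0.1,
    "divergence": 0.05,
}
example_score = retro_score(example)
```

A candidate would then be kept when its score meets or exceeds the threshold chosen from the ROC points.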

Retrogene Statistics table:

Break in Orthology table:

Retrogenes inserted into the genome since the human/mouse divergence show a break in the syntenic net alignments of the mouse genome to the human genome. The percentage break represents the portion of the genome that is missing in each species relative to the reference genome (human hg19) at the retrogene locus, as defined by syntenic alignment nets. Breaks in orthology with mouse and dog tend to be due to genomic insertions in the primate lineage. The relative orthology of the dog/human and rhesus macaque/human nets is used to avoid false positives due to deletions in the mouse genome. Older retrogenes will not show a break in orthology, so this feature is weighted lower than other features when scoring putative retrogenes.
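One way to read the percentage break is as the share of the retrogene locus that is not covered by any aligned block of the other species' syntenic net. The sketch below computes that quantity under stated assumptions: coordinates are half-open (start, end) intervals on the reference genome, and the function name percent_break and its inputs are hypothetical, not taken from the track's tables or from RetroFinder.

```python
def percent_break(locus_start, locus_end, net_blocks):
    """Percentage of [locus_start, locus_end) not covered by net_blocks.

    net_blocks is an iterable of (start, end) half-open intervals on the
    reference genome; blocks may overlap or extend beyond the locus.
    """
    length = locus_end - locus_start
    if length <= 0:
        raise ValueError("locus must have positive length")

    # Clip each block to the locus and drop blocks that miss it entirely.
    clipped = sorted(
        (max(start, locus_start), min(end, locus_end))
        for start, end in net_blocks
        if end > locus_start and start < locus_end
    )

    # Sweep left to right, counting each covered base once.
    covered = 0
    covered_up_to = locus_start
    for start, end in clipped:
        start = max(start, covered_up_to)
        if end > start:
            covered += end - start
            covered_up_to = end

    return 100.0 * (length - covered) / length
```

Under this reading, a locus fully spanned by the net gives 0, and a locus with no aligned blocks gives 100, which is what a recent insertion in the reference lineage would be expected to look like.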

These features can be downloaded from the table retroMrnaInfo in many formats using the Table Browser option from the Tools menu on the top blue bar.

Credits

The RetroFinder program and browser track were developed by Robert Baertsch at UCSC.

References

Pei B, Sisu C, Frankish A, Howald C, Habegger L, Mu XJ, Harte R, Balasubramanian S, Tanzer A, Diekhans M et al. The GENCODE pseudogene resource. Genome Biology 2012 Sep 26;13(9):R51.

Baertsch R, Diekhans M, Kent J, Haussler D, Brosius J. Retrocopy contributions to the evolution of the human genome. BMC Genomics 2008 Oct 8;9:466.

Zheng D, Frankish A, Baertsch R, Kapranov P, Reymond A, Choo SW, Lu Y, Denoeud F, Antonarakis SE, Snyder M, Ruan Y, Wei CL, Gingeras TR, Guigo R, Harrow J, Gerstein MB. Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution. Genome Res. 2007 Jun;17(6):839-51.

Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci USA 2003;100(20):11484-11489.

Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison R, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003;13(1):103-7.