Platypus
Platypus (Ornithorhynchus anatinus)
Photo courtesy of (Wikimedia Commons)

The v5.0.1 Ornithorhynchus anatinus draft assembly was produced by the Genome Sequencing Center at Washington University, St. Louis. For more information about this assembly, see Ornithorhynchus_anatinus-5.0.1 in the NCBI Assembly database.

Sample position queries

A genome position can be specified by the accession number of a sequenced genomic region, an mRNA or EST, a chromosomal coordinate range, or keywords from the GenBank description of an mRNA. The following list shows examples of valid position queries for the platypus genome. See the User's Guide for more information.

Request:              Genome Browser Response:

chrX5                 Displays all of chromosome X5
chr3:1-1000000        Displays first million bases of chr 3
chr3:1000000+2000     Displays a region of chr 3 that spans 2000 bases, starting with position 1000000

AF525116              Displays region of mRNA with GenBank accession number AF525116
EH000958              Displays region of EST with GenBank accession EH000958
homeobox caudal       Lists mRNAs for caudal homeobox genes
zinc finger           Lists many zinc finger mRNAs
kruppel zinc finger   Lists only kruppel-like zinc fingers
johansson             Lists mRNAs deposited by scientist named Johansson
Miller,R.D.           Lists mRNAs deposited by co-author R.D. Miller

Use this last format for author queries. Although GenBank requires the search format "Miller R.D.", internally it uses the format "Miller,R.D.".
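
The coordinate forms in the list above can be illustrated with a small parser. This is only a sketch, not the Genome Browser's own code: the function name parse_position, the Region type, and the chromosome-name pattern are hypothetical, and it assumes 1-based inclusive coordinates, so that "start+length" ends at start + length - 1. Anything that is not a coordinate (an accession, keyword, or author query) is returned as None so a caller could pass it on to a text search.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: Optional[int]  # 1-based, inclusive; None means "whole chromosome"
    end: Optional[int]    # 1-based, inclusive; None means "whole chromosome"


# Assumed naming pattern: "chr" followed by a digit or X/Y/M/U, e.g. chr3, chrX5.
_CHROM = r"(chr[0-9XYMU]\w*)"
_NUM = r"(\d[\d,]*)"
_RANGE = re.compile(rf"^{_CHROM}:{_NUM}-{_NUM}$")   # chr3:1-1000000
_SPAN = re.compile(rf"^{_CHROM}:{_NUM}\+{_NUM}$")   # chr3:1000000+2000
_WHOLE = re.compile(rf"^{_CHROM}$")                 # chrX5


def _to_int(text: str) -> int:
    """Convert a number that may contain thousands separators."""
    return int(text.replace(",", ""))


def parse_position(query: str) -> Optional[Region]:
    """Return a Region for coordinate queries, or None for anything else."""
    query = query.strip()

    match = _RANGE.match(query)
    if match:
        start, end = _to_int(match.group(2)), _to_int(match.group(3))
        if start < 1 or end < start:
            raise ValueError(f"invalid range: {query}")
        return Region(match.group(1), start, end)

    match = _SPAN.match(query)
    if match:
        start, length = _to_int(match.group(2)), _to_int(match.group(3))
        if start < 1 or length < 1:
            raise ValueError(f"invalid span: {query}")
        # A span of `length` bases starting at `start` (inclusive end assumed).
        return Region(match.group(1), start, start + length - 1)

    match = _WHOLE.match(query)
    if match:
        return Region(match.group(1), None, None)

    return None  # accession, keyword, or author query


if __name__ == "__main__":
    for q in ["chrX5", "chr3:1-1000000", "chr3:1000000+2000", "zinc finger"]:
        print(q, "->", parse_position(q))
```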


Assembly details

The following assembly information is taken from the v5.0.1 release notes:

The platypus (Ornithorhynchus anatinus) genome of a female nicknamed "Glennie" (collected at the Upper Barnard River on Glen Rock Station, New South Wales) was sequenced to a total of 6x whole genome coverage. The sequencing strategy combined whole genome shotgun plasmid, fosmid and BAC end sequences. The combined sequence reads were assembled with the PCAP software (Huang, et al., 2003) using stringent parameters. After the initial assembly, supercontigs (ordered/oriented contigs; contigs are contiguous sequences not interrupted by gaps) were linked to the physical map (Washington University Genome Sequencing Center) using BAC end sequences and in silico digests of the sequence itself. The physical map was then used to organize the supercontigs into "ultracontigs" (ordered/oriented supercontigs). With the exception of those supercontigs with alignments at >95% identity to a platypus EST (Washington University Genome Sequencing Center), supercontigs smaller than 2 kb were removed from the data set prior to submission if they were >97% identical over >97% of their length to other ultracontigs larger than 2 kb or if they were deemed to be >95% repetitive (based on analysis using RECON (Bao and Eddy, 2002) for repeat identification). Further, singleton contigs (those not part of a supercontig or ultracontig) smaller than 500 bp that did not have an alignment of >95% identity to a platypus EST were not submitted.
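
The submission filter described at the end of the paragraph above can be read as a single keep/discard decision per sequence. The following is a rough sketch of that reading, not the Genome Sequencing Center's actual pipeline: the Piece record, its field names, and the function keep_for_submission are all hypothetical, and it assumes that the 2 kb rule applies only to supercontigs while the 500 bp rule applies only to singleton contigs.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    """Hypothetical summary of one assembled sequence (all fields assumed)."""
    length_bp: int
    is_singleton_contig: bool   # not part of any supercontig or ultracontig
    best_est_identity: float    # best identity to a platypus EST, 0.0 if none
    dup_identity: float         # identity to another ultracontig larger than 2 kb
    dup_fraction: float         # fraction of this piece's length in that alignment
    repeat_fraction: float      # fraction judged repetitive (e.g. via RECON)


def keep_for_submission(piece: Piece) -> bool:
    """Apply the thresholds quoted in the release notes to one sequence."""
    # An EST alignment above 95% identity exempts the piece from both rules.
    if piece.best_est_identity > 0.95:
        return True

    if piece.is_singleton_contig:
        # Singleton contigs under 500 bp without EST support are dropped.
        return piece.length_bp >= 500

    if piece.length_bp < 2000:
        near_duplicate = piece.dup_identity > 0.97 and piece.dup_fraction > 0.97
        mostly_repeat = piece.repeat_fraction > 0.95
        if near_duplicate or mostly_repeat:
            return False

    return True
```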

The assembly is composed of 205,536 supercontigs covering 1.84 Gb of actual sequence (not including estimated gap sizes), or almost 2.0 Gb including gap sizes. Based on the physical map, 4,197 supercontigs were organized into 689 ultracontigs; the remaining 218 ultracontigs (composed of 595 contigs) were formed based solely on alignments with platypus EST data. Of the 1.84 Gb, 437 Mb (1,507 supercontigs organized into 145 ultracontigs) have been anchored and ordered along platypus chromosomes using the physical map in combination with FISH data. The N50 statistic is defined as the length L such that 50% of all nucleotides are contained in contigs of size at least L. The N50 number is 298 and the N50 size is 967 kb.
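
The N50 definition above can be made concrete with a few lines of code. This is a minimal sketch rather than the tool used for the release notes: the function name n50 is hypothetical, and it assumes that the "N50 number" is the count of longest sequences needed to reach half of the total bases, while the "N50 size" is the length of the last sequence counted.

```python
from typing import Iterable, Tuple


def n50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50 size, N50 number) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")

    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Stop at the first sequence that brings the running sum to >= 50%.
        if 2 * running >= total:
            return length, count

    raise AssertionError("unreachable: running sum always reaches the total")


if __name__ == "__main__":
    # Toy lengths, not platypus data: total is 20, half is reached at 6 + 5.
    size, number = n50([2, 3, 4, 5, 6])
    print(size, number)
```

With the toy input, the two longest sequences (6 and 5) already hold 11 of 20 bases, so the sketch reports a size of 5 and a number of 2.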

Future improvements to the platypus sequence assembly will depend on the availability of funding and on improvements to existing assembler software. Funding for the sequencing of the platypus genome was provided by the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH).

Bulk downloads of the sequence and annotation data are available via the Genome Browser FTP server or the Downloads page. These data have specific conditions for use. The Platypus browser annotation tracks were generated by UCSC and collaborators worldwide. See the Credits page for a detailed list of the organizations and individuals who contributed to the success of this release.


GenBank Pipeline Details

For the purposes of the GenBank alignment pipeline, this assembly is considered to be: low-coverage.