Platypus (Ornithorhynchus anatinus). Photo courtesy of Wikimedia Commons.
The v5.0.1 Ornithorhynchus anatinus draft assembly was produced by the Genome Sequencing Center at Washington University, St. Louis. For more information about this assembly, see Ornithorhynchus_anatinus-5.0.1 in the NCBI Assembly database.
A genome position can be specified by the accession number of a sequenced genomic region, an mRNA or EST, a chromosomal coordinate range, or keywords from the GenBank description of an mRNA. The following list shows examples of valid position queries for the platypus genome. See the User's Guide for more information.
Request | Genome Browser Response
---|---
chrX5 | Displays all of chromosome X5
chr3:1-1000000 | Displays first million bases of chr 3
chr3:1000000+2000 | Displays a region of chr 3 that spans 2000 bases, starting with position 1000000
AF525116 | Displays region of mRNA with GenBank accession number AF525116
EH000958 | Displays region of EST with GenBank accession EH000958
homeobox caudal | Lists mRNAs for caudal homeobox genes
zinc finger | Lists many zinc finger mRNAs
kruppel zinc finger | Lists only kruppel-like zinc fingers
johansson | Lists mRNAs deposited by scientist named Johansson
Miller,R.D. | Lists mRNAs deposited by co-author R.D. Miller
Use this last format for author queries. Although GenBank requires the search format Miller R.D., internally it uses the format Miller,R.D.
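The coordinate-style requests in the table above can be illustrated with a small parser. This is only a sketch of how the three coordinate forms might be interpreted, not the Genome Browser's own code. The names `Position` and `parse_position` are hypothetical, and the sketch assumes 1-based inclusive coordinates, so that `chr3:1000000+2000` covers positions 1000000 through 1001999. Accessions and keyword queries are deliberately left unhandled.

```python
import re
from typing import NamedTuple, Optional


class Position(NamedTuple):
    """A parsed coordinate query; start/end are None for a whole chromosome."""
    chrom: str
    start: Optional[int]
    end: Optional[int]


# Whole-chromosome form, e.g. "chrX5".
_CHROM = re.compile(r"^chr\w+$")
# Range forms, e.g. "chr3:1-1000000" (start-end) or "chr3:1000000+2000" (start+length).
_RANGE = re.compile(r"^(chr\w+):(\d+)([-+])(\d+)$")


def parse_position(query: str) -> Optional[Position]:
    """Return a Position for coordinate queries, or None for anything else."""
    q = query.strip()
    if _CHROM.match(q):
        return Position(q, None, None)
    m = _RANGE.match(q)
    if m is None:
        return None  # accession or keyword query; not handled in this sketch
    chrom, first, sep, second = m.groups()
    start = int(first)
    # Assumption: "+N" means a span of N bases beginning at start (inclusive).
    end = int(second) if sep == "-" else start + int(second) - 1
    if start < 1 or end < start:
        return None
    return Position(chrom, start, end)


if __name__ == "__main__":
    for example in ("chrX5", "chr3:1-1000000", "chr3:1000000+2000", "zinc finger"):
        print(example, "->", parse_position(example))
```

Returning `None` for non-coordinate input keeps the sketch honest about its scope: resolving accessions such as AF525116 or keywords such as "zinc finger" requires a database lookup that is outside what a string parser can do.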
The following assembly information is taken from the v5.0.1 release notes:
The platypus (Ornithorhynchus anatinus) genome of a female nicknamed "Glennie" (collected at the Upper Barnard River on Glen Rock Station, New South Wales) was sequenced to a total of 6x whole genome coverage. The sequencing strategy combined whole genome shotgun plasmid, fosmid and BAC end sequences. The combined sequence reads were assembled with the PCAP software (Huang et al., 2003) using stringent parameters. After initial assembly with the PCAP software, supercontigs (ordered/oriented contigs; contigs are contiguous sequences not interrupted by gaps) were linked to the physical map (Washington University Genome Sequencing Center) using BAC end sequences and in silico digests of the sequence itself. The physical map was then used to organize the supercontigs into "ultracontigs" (ordered/oriented supercontigs). With the exception of those supercontigs with alignments at >95% identity to a platypus EST (Washington University Genome Sequencing Center), supercontigs smaller than 2 kb were removed from the data set prior to submission if they were >97% identical over >97% of their length to other ultracontigs larger than 2 kb or if they were deemed to be >95% repetitive (based on analysis using RECON (Bao and Eddy, 2002) for repeat identification). Further, singleton contigs (those not part of a supercontig or ultracontig) smaller than 500 bp that did not have an alignment of >95% identity to a platypus EST were not submitted.
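The submission filter described above has several nested conditions, so a short predicate may make it easier to follow. This is a sketch of one reading of the release notes, not the pipeline the sequencing center used. The `Sequence` record and all of its field names are hypothetical, "2 kb" is assumed to mean 2,000 bp, and singleton contigs are assumed to be subject only to the 500 bp rule.

```python
from dataclasses import dataclass


@dataclass
class Sequence:
    """Hypothetical summary of one assembled sequence (fractions in 0.0-1.0)."""
    length_bp: int
    is_singleton_contig: bool          # not part of any supercontig or ultracontig
    best_est_identity: float           # best alignment to a platypus EST, 0.0 if none
    best_ultracontig_identity: float   # best match to another ultracontig larger than 2 kb
    best_ultracontig_coverage: float   # fraction of this sequence's length in that match
    repeat_fraction: float             # fraction judged repetitive (RECON in the source)


def is_submitted(seq: Sequence) -> bool:
    """Return True if the sequence would be kept under the rules as read here."""
    has_est_support = seq.best_est_identity > 0.95

    if seq.is_singleton_contig:
        # Singletons under 500 bp are dropped unless an EST supports them.
        return seq.length_bp >= 500 or has_est_support

    if seq.length_bp < 2000 and not has_est_support:
        redundant = (seq.best_ultracontig_identity > 0.97
                     and seq.best_ultracontig_coverage > 0.97)
        repetitive = seq.repeat_fraction > 0.95
        return not (redundant or repetitive)

    # Supercontigs of 2 kb or more, or with EST support, are kept.
    return True
```

Note that EST support acts as an override in both branches: a short sequence that is redundant or repetitive is still kept if it aligns to a platypus EST at more than 95% identity.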
The assembly is composed of 205,536 supercontigs (based on the physical map, 4,197 supercontigs were organized into 689 ultracontigs; the remaining 218 ultracontigs (composed of 595 contigs) were formed based solely on alignments with platypus EST data) covering 1.84 Gb of actual sequence (without including estimated gap sizes) or almost 2.0 Gb including gap sizes. Of the 1.84 Gb, 437 Mb (1,507 supercontigs organized into 145 ultracontigs) have been anchored and ordered along platypus chromosomes using the physical map in combination with FISH data. The N50 statistic is defined as the largest length L such that at least 50% of all nucleotides are contained in contigs of size at least L. The N50 number is 298 and the N50 size is 967 kb.
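The N50 definition above can be made concrete with a few lines of code. The following is a minimal sketch, using a hypothetical function named `n50` and toy lengths rather than the real assembly data. It assumes that the "N50 number" quoted above means the count of sequences needed to reach the halfway point (often called L50 elsewhere); the release notes do not spell this out.

```python
from typing import Iterable, Tuple


def n50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50 size, N50 number) for a collection of sequence lengths.

    N50 size: the largest length L such that sequences of length >= L
    together contain at least half of all bases.
    N50 number: how many of the longest sequences are needed to reach that
    halfway point (an assumed reading of the term used in the text).
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    if ordered[-1] <= 0:
        raise ValueError("sequence lengths must be positive")

    half_total = sum(ordered) / 2
    running = 0
    # Walk from longest to shortest until half of all bases are covered.
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable for positive lengths")


if __name__ == "__main__":
    # Toy example: total is 54 bases, and 10 + 9 + 8 = 27 reaches half.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints (8, 3)
```

Under this reading, the figures quoted above would mean that the 298 longest sequences, each at least 967 kb long, together hold half of the assembled bases.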
Future improvements to the platypus sequence assembly will be dependent on the availability of funding and improvements to existing assembler software. Funding for the sequencing of the platypus genome was provided by the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH).
Bulk downloads of the sequence and annotation data are available via the Genome Browser FTP server or the Downloads page. These data have specific conditions for use. The Platypus browser annotation tracks were generated by UCSC and collaborators worldwide. See the Credits page for a detailed list of the organizations and individuals who contributed to the success of this release.