Description

Nextstrain.org displays data about mutations that have occurred in the current 2019/2020 outbreak of SARS-CoV-2. Nextstrain has a powerful user interface for viewing the time-stamped phylogenetic tree that it infers from the patterns of mutations in sequences worldwide. Nextstrain maintains an ongoing pipeline that continuously obtains SARS-CoV-2 genome sequences and metadata from GISAID, aligns them against the reference genome (NC_045512.2), and infers a phylogenetic tree.

This track shows the alternate allele frequency of each mutation reported by Nextstrain as a bar graph with the height indicating the frequency. (The Nextstrain Mutations track offers a more detailed display of the mutations, breaking up the vertical bars according to the order of virus genome samples in the phylogenetic tree.)

Methods

Nextstrain downloads SARS-CoV-2 genomes from GISAID as they are submitted by labs worldwide. The sequences are processed by an automated pipeline, and the resulting annotations are written to a data file. UCSC downloads this file and extracts the annotations for display.

Data Access

You can download the bigWig files underlying this track (nextstrainSamples*.bigWig) from our Download Server. The data can be explored interactively with the Table Browser or the Data Integrator. The data can be accessed from scripts through our API.

http://api.genome.ucsc.edu/getData/track?genome=wuhCor1;track=nextstrainFreqB4;chrom=NC_045512v2;start=100;end=7875
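
As a minimal sketch of scripted access, the Python below assembles the same request URL and fetches the JSON reply. The function names build_track_url and fetch_track are illustrative, not part of the API. The layout of the returned JSON is not described on this page, so the sketch returns the decoded object without assuming any particular keys.

```python
import json
from urllib.request import urlopen

# Endpoint taken from the example URL above.
API_BASE = "http://api.genome.ucsc.edu/getData/track"


def build_track_url(genome, track, chrom, start, end):
    """Assemble a getData/track URL with ';'-separated parameters,
    matching the form of the example URL above."""
    params = [
        ("genome", genome),
        ("track", track),
        ("chrom", chrom),
        ("start", start),
        ("end", end),
    ]
    return API_BASE + "?" + ";".join(f"{key}={value}" for key, value in params)


def fetch_track(url, timeout=30):
    """Fetch the URL and decode the JSON body (requires network access).
    The structure of the result is not assumed here; inspect it first."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example (not run automatically, since it needs network access):
#   url = build_track_url("wuhCor1", "nextstrainFreqB4", "NC_045512v2", 100, 7875)
#   data = fetch_track(url)
#   print(sorted(data.keys()))
```

Printing the top-level keys first is a cautious way to learn how the reply is organized before writing code that depends on it.
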
For command-line extraction, use a command like the following:
bigWigToWig -udcDir=. -chrom=NC_045512v2 -start=100 -end=7875 http://hgdownload.soe.ucsc.edu/gbdb/wuhCor1/nextstrain/nextstrainSamples.bigWig myOutput.wig
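
The extracted file can then be read in a script. The sketch below is one hedged way to do that in Python. It assumes the output follows the standard wiggle text conventions (bedGraph-style rows, variableStep sections, or fixedStep sections); which of these the file actually contains is not stated here, so the parser accepts all three. The name parse_wig is illustrative.

```python
def parse_wig(lines):
    """Yield (chrom, start, end, value) tuples with 0-based, half-open
    coordinates from wiggle text. Handles bedGraph-style rows as well as
    variableStep and fixedStep sections; which of these a given file
    contains is an assumption to verify against the actual output."""
    chrom = None
    span = 1
    step = 1
    pos = 0
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if fields[0] in ("variableStep", "fixedStep"):
            # Declaration line: key=value options follow the keyword.
            opts = dict(item.split("=", 1) for item in fields[1:])
            chrom = opts["chrom"]
            span = int(opts.get("span", 1))
            if fields[0] == "fixedStep":
                pos = int(opts["start"]) - 1  # wiggle positions are 1-based
                step = int(opts.get("step", 1))
        elif len(fields) == 4:
            # bedGraph row: chrom, start, end, value (already 0-based).
            yield fields[0], int(fields[1]), int(fields[2]), float(fields[3])
        elif len(fields) == 2:
            # variableStep row: 1-based position, value.
            start = int(fields[0]) - 1
            yield chrom, start, start + span, float(fields[1])
        else:
            # fixedStep row: value only; the position advances by step.
            yield chrom, pos, pos + span, float(fields[0])
            pos += step


# Example (assumes myOutput.wig was produced by the command above):
#   with open("myOutput.wig") as handle:
#       for chrom, start, end, freq in parse_wig(handle):
#           print(chrom, start, end, freq)
```

Converting everything to 0-based, half-open coordinates keeps the three section types comparable, at the cost of differing from the 1-based positions shown in variableStep and fixedStep text.
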

Please refer to our mailing list archives for questions, or our Data Access FAQ for more information.

Data usage policy

The data presented here is intended to rapidly disseminate analysis of important pathogens. Unpublished data is included with permission of the data generators, and does not impact their right to publish. Please contact the respective authors (available via the Nextstrain metadata.tsv file) if you intend to carry out further research using their data. Derived data, such as phylogenies, can be downloaded from nextstrain.org (see "DOWNLOAD DATA" link at bottom of page) - please contact the relevant authors where appropriate.

Credits

Thanks to nextstrain.org for sharing its analysis of genomes collected by GISAID EpiCoV™, and to researchers worldwide for sharing their SARS-CoV-2 genome sequences.

References

Hadfield J, Megill C, Bell SM, Huddleston J, Potter B, Callender C, Sagulenko P, Bedford T, Neher RA. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics. 2018 Dec 1;34(23):4121-4123. PMID: 29790939; PMC: PMC6247931

Sagulenko P, Puller V, Neher RA. TreeTime: Maximum-likelihood phylodynamic analysis. Virus Evol. 2018 Jan;4(1):vex042. PMID: 29340210; PMC: PMC5758920