The WebSTR track displays 1,710,833 short tandem repeat (STR) loci across the human genome from the WebSTR database.
This track is based on the EnsembleTR panel for the GRCh38/hg38 assembly, which represents a combined set of tandem repeats genotyped by four separate methods (HipSTR, GangSTR, ExpansionHunter, and adVNTR) on data from the 1000 Genomes Project. EnsembleTR was applied to jointly genotype all 3,550 samples, producing consensus calls at over 1.7 million autosomal tandem repeat loci.
The track includes allele frequency distributions for five 1000 Genomes continental populations: African (AFR), Admixed American (AMR), East Asian (EAS), European (EUR), and South Asian (SAS).
For each population, allele frequencies are defined as the number of copies of each allele divided by the total number of alleles in that population. Alleles are represented as the number of repeat unit copies.
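The definition above can be illustrated with a minimal Python sketch. The function name `allele_frequencies` and the example genotypes are hypothetical and are not taken from the WebSTR or EnsembleTR code; the sketch only assumes that each observed allele is recorded as its number of repeat unit copies.

```python
from collections import Counter


def allele_frequencies(copy_numbers):
    """Return {repeat copy number: frequency} for one population.

    copy_numbers: iterable with one entry per observed allele
    (two per diploid sample), each the number of repeat unit copies.
    """
    counts = Counter(copy_numbers)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Frequency = copies of this allele / total alleles in the population.
    return {allele: n / total for allele, n in sorted(counts.items())}


# Hypothetical example: 4 diploid samples, so 8 alleles in total.
example_alleles = [10, 10, 11, 10, 12, 11, 10, 10]
example_freqs = allele_frequencies(example_alleles)
print(example_freqs)  # {10: 0.625, 11: 0.25, 12: 0.125}
```

By construction, the frequencies within one population sum to 1.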
Items are colored by expected heterozygosity, computed as $\mathrm{het} = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of allele $i$ after pooling allele frequencies across all five 1000 Genomes populations, weighted by sample count.
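One way to read the pooled heterozygosity calculation is sketched below in Python. This is an illustration under stated assumptions rather than the code used to build the track: the function name, the input layout, and the example numbers are hypothetical, and the pooling assumes that each population's weight is its share of the total sample count.

```python
def pooled_heterozygosity(pop_freqs, pop_sizes):
    """Expected heterozygosity from sample-count-weighted pooled frequencies.

    pop_freqs: {population: {allele: frequency}}  (assumed input layout)
    pop_sizes: {population: number of samples}
    """
    total = sum(pop_sizes[pop] for pop in pop_freqs)
    if total == 0:
        raise ValueError("no samples to pool")
    pooled = {}
    for pop, freqs in pop_freqs.items():
        weight = pop_sizes[pop] / total  # this population's share of samples
        for allele, p in freqs.items():
            pooled[allele] = pooled.get(allele, 0.0) + weight * p
    # het = 1 - sum of squared pooled allele frequencies
    return 1.0 - sum(p * p for p in pooled.values())


# Hypothetical two-population example (not real track data).
example_het = pooled_heterozygosity(
    {"POP_A": {10: 0.5, 11: 0.5}, "POP_B": {10: 1.0}},
    {"POP_A": 100, "POP_B": 300},
)
print(example_het)  # 0.21875
```

In this example the pooled frequencies are 0.875 and 0.125, so het = 1 − (0.765625 + 0.015625) = 0.21875. A locus with a single allele has het = 0.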
Each item is labeled by its repeat motif and copy count. Hovering over an item shows the repeat motif, number of reference copies, and heterozygosity. Clicking an item links to the corresponding WebSTR locus page, which provides interactive allele frequency histograms and additional annotations.
The EnsembleTR reference panel was constructed as follows:
For the UCSC Genome Browser track, the source data were converted from CSV to bigBed format. Per-population allele frequency distributions are stored as extra bigBed fields.
The raw data can be explored interactively with the Table Browser or the Data Integrator. For automated analysis, the data may be queried from our REST API. The underlying bigBed file can be downloaded from our download server.
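As a rough sketch of an automated query, the Python snippet below builds a request URL for the `getData/track` endpoint of the UCSC REST API. The track identifier `webSTR` is an assumption made for illustration; check the actual track name in the track settings or Table Browser before using it. The helper names `build_track_query` and `fetch_track_items` are likewise hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_query(track, chrom, start, end, genome="hg38"):
    """Build a getData/track URL for one genomic region."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}?{urlencode(params)}"


def fetch_track_items(url, timeout=30):
    """Download the JSON response and parse it (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# "webSTR" is an assumed track name, used here only as a placeholder.
query_url = build_track_query("webSTR", "chr1", 1_000_000, 1_100_000)
print(query_url)
```

Calling `fetch_track_items(query_url)` would return the parsed JSON for that region, provided the track name is correct. Keeping URL construction separate from the download makes the query easy to inspect before any request is sent.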
The complete WebSTR dataset, including additional cohorts and data types not included in this track, is available from the WebSTR web portal. Programmatic access to the full WebSTR database is available through the WebSTR REST API.
Thanks to Melissa Gymrek (UC San Diego) and the WebSTR team for providing the data for this track.
Lundström OS, Adriaan Verbiest M, Xia F, Jam HZ, Zlobec I, Anisimova M, Gymrek M. WebSTR: A Population-wide Database of Short Tandem Repeat Variation in Humans. J Mol Biol. 2023 Oct 15;435(20):168260. PMID: 37678708
Ziaei Jam H, Li Y, DeVito R, Mousavi N, Ma N, Lujumba I, Adam Y, Maksimov M, Huang B, Dolzhenko E et al. A deep population reference panel of tandem repeat variation. Nat Commun. 2023 Oct 23;14(1):6711. PMID: 37872149; PMC: PMC10593948