This track shows allele frequencies for single-nucleotide variants (SNVs) and small indels joint-called across 1,027 population-consented human samples sequenced with PacBio HiFi long reads by the Consortium of Long Read Sequencing (CoLoRS) and released as CoLoRSdb. Variants were called per sample with DeepVariant and joint-genotyped across samples with GLnexus.
Two versions of the callset are available, one for GRCh38 and one for CHM13; the version displayed is chosen automatically according to the browser's reference assembly.
Only population-level allele frequencies (AF), allele counts (AC), and total allele numbers (AN) are displayed; per-sample genotypes are not included in the released VCF. Multi-allelic sites have been decomposed so that each alternate allele has its own VCF row.
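As a rough illustration of how the three values relate and what decomposition means, the sketch below splits one multi-allelic site into one row per alternate allele and derives AF as AC divided by AN. The function name split_multiallelic and the example site are hypothetical, the counts are made up, and real decomposition tools may also normalize or left-align alleles, which this sketch does not attempt.

```python
def split_multiallelic(chrom, pos, ref, alts, acs, an):
    """Return one (chrom, pos, ref, alt, AC, AN, AF) row per alternate allele."""
    rows = []
    for alt, ac in zip(alts, acs):
        # AF is the alternate allele count divided by the total allele number.
        af = ac / an if an else 0.0
        rows.append((chrom, pos, ref, alt, ac, an, af))
    return rows


# Hypothetical site with two alternate alleles; the numbers are invented.
rows = split_multiallelic("chr1", 12345, "A", ["G", "T"], [30, 10], 2000)
for row in rows:
    print(row)
```

Each alternate allele keeps the same AN, since AN counts all called alleles at the site, while AC and AF are specific to that allele.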
The track is drawn in the standard UCSC VCF representation. Mouseover shows the variant, its reference and alternate alleles, AC, AN, and AF. When zoomed in, alleles are drawn with base-specific coloring. Because the VCF carries no per-sample genotypes, no homozygous or heterozygous calls are shown.
Long-read sequencing of 1,027 individuals was performed at multiple CoLoRSdb member sites. Per-sample variant calls were produced with DeepVariant and joint-genotyped with GLnexus to create a cohort-wide population VCF. Sites passing CoLoRSdb's standard quality filters were included.
The data can be explored interactively with the Table Browser or Data Integrator, and accessed from scripts via our API (track=colorsDbSnv).
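For scripted access, a request to the UCSC REST API might look like the following minimal sketch. It assumes the public getData/track endpoint at api.genome.ucsc.edu and uses hg38, UCSC's name for GRCh38, as the genome; the coordinates are arbitrary, and the helper names build_track_url and fetch_track are hypothetical. The structure of the returned JSON is not assumed here, so inspect it before relying on specific keys.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the UCSC REST API for track data.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(genome, chrom, start, end, track="colorsDbSnv"):
    """Build a getData/track URL for one region of the given track."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return API_BASE + "?" + urlencode(params)


def fetch_track(genome, chrom, start, end):
    """Fetch the region and return the decoded JSON (needs network access)."""
    with urlopen(build_track_url(genome, chrom, start, end)) as resp:
        return json.load(resp)


# Only the URL is built here; call fetch_track() to perform the request.
url = build_track_url("hg38", "chr1", 1000000, 1001000)
print(url)
```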
The VCF files are available on our download server: GRCh38 VCF and CHM13 VCF. They are symlinks to the upstream CoLoRSdb releases, which are distributed from colorsdb.org and the CoLoRSdb GitHub repositories.
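Once a VCF has been downloaded, the frequency values can be read from the INFO column of each data line. The sketch below parses a single tab-separated line, assuming the file uses the standard VCF INFO keys AC, AN, and AF; the example line is invented for illustration and parse_info is a hypothetical helper. For whole files, an established VCF library is likely a better choice than hand-written parsing.

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict; flag keys without a value map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


# Invented example line: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO.
line = "chr1\t12345\t.\tA\tG\t.\tPASS\tAC=30;AN=2000;AF=0.015"
cols = line.split("\t")
info = parse_info(cols[7])

# Assumes one alternate allele per row, as in a decomposed VCF.
ac, an, af = int(info["AC"]), int(info["AN"]), float(info["AF"])
print(cols[0], cols[1], cols[3], cols[4], ac, an, af)
```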
Thanks to the members of the Consortium of Long Read Sequencing (CoLoRS) and PacBio for producing and releasing these joint-called long-read variant frequencies.
See the main Variant Frequencies container track for the general context and a comparison table across all included frequency databases.