This track shows allele frequencies for single-nucleotide variants (SNVs) and small indels joint-called across 1,027 population-consented human samples sequenced with PacBio HiFi long reads by the Consortium of Long Read Sequencing (CoLoRSdb). Variants were called per sample with DeepVariant and then joint-genotyped across samples with GLnexus.
Two versions of the callset are displayed, one for GRCh38 and one for CHM13, chosen automatically according to the browser's reference assembly.
Only population-level allele frequencies (AF), allele counts (AC), and total allele numbers (AN) are displayed; per-sample genotypes are not included in the released VCF. Multi-allelic sites have been decomposed so that each alternate allele has its own VCF row.
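The relationship between these fields, and the effect of decomposing a multi-allelic site, can be illustrated with a minimal sketch. The function name `decompose` and the example site are made up for illustration; the numbers are not taken from the callset. The sketch assumes AF is simply AC divided by AN, and uses AN = 2054 (2 × 1,027) as the value an autosomal site would have if every sample were genotyped as diploid. Real decomposition tools may also trim and left-align alleles, which is not shown here.

```python
import math


def decompose(chrom, pos, ref, alts, acs, an):
    """Split one multi-allelic site into one row per alternate allele.

    alts and acs are parallel lists: acs[i] is the allele count of alts[i].
    AN (total called alleles) is shared by all rows from the same site.
    """
    if len(alts) != len(acs):
        raise ValueError("need exactly one allele count per alternate allele")
    rows = []
    for alt, ac in zip(alts, acs):
        # Assumption: AF = AC / AN; undefined when no alleles were called.
        af = ac / an if an > 0 else math.nan
        rows.append({"chrom": chrom, "pos": pos, "ref": ref, "alt": alt,
                     "AC": ac, "AN": an, "AF": af})
    return rows


# Hypothetical site with two alternate alleles (values are illustrative only).
rows = decompose("chr1", 12345, "A", ["G", "T"], [3, 5], 2054)
for row in rows:
    print(row["chrom"], row["pos"], row["ref"], row["alt"],
          f"AC={row['AC']}", f"AN={row['AN']}", f"AF={row['AF']:.5f}")
```

Each alternate allele ends up on its own row with its own AC and AF, while AN stays the same for both rows.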
The track uses the standard UCSC VCF representation. Mouseover shows the variant, reference/alternate alleles, AC, AN and AF. At higher zoom levels, alleles use base-specific colors. Homozygous ALT positions are marked with one letter, heterozygotes with two letters.
CoLoRSdb member sites sequenced 1,027 individuals with PacBio HiFi long reads. Per-sample variant calls came from DeepVariant. GLnexus then joint-genotyped them into a cohort-wide population VCF. Only sites that passed CoLoRSdb's standard quality filters are included.
The data can be explored interactively with the Table Browser or Data Integrator, and accessed from scripts via our API (track=colorsDbSnv).
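As a rough sketch of scripted access, the following builds a request URL for the UCSC REST API's `getData/track` endpoint. The track name `colorsDbSnv` comes from the text above; the genome name `hg38`, the region, and the helper name `build_track_url` are assumptions chosen for illustration. The coordinates are assumed to follow the usual UCSC convention of a 0-based start and an exclusive end.

```python
from urllib.parse import urlencode

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(genome, chrom, start, end, track="colorsDbSnv"):
    """Return a getData/track URL for one region of one track."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    params = {
        "genome": genome,  # assumed UCSC assembly name, e.g. "hg38"
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}?{urlencode(params)}"


# Illustrative region; not a site known to contain a variant.
url = build_track_url("hg38", "chr1", 100000, 101000)
print(url)
```

The resulting URL can be fetched with any HTTP client, for example `json.load(urllib.request.urlopen(url))`; the exact layout of the returned JSON should be checked against the API documentation rather than assumed.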
The VCF files are available on our download server: GRCh38 VCF and CHM13 VCF. They are symlinks to the upstream CoLoRSdb releases, which are distributed from colorsdb.org and the CoLoRSdb GitHub repositories.
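Once a VCF has been downloaded, it can be scanned with the Python standard library alone. The sketch below is one possible approach, not a description of how the files were produced: it assumes the INFO column carries an `AF` key on every decomposed row, and the helper names `parse_info` and `iter_variants` are hypothetical. The demo records are made up so that the example runs without a download.

```python
import io


def parse_info(info):
    """Turn 'AC=3;AN=2054;AF=0.00146' into a dict; bare flags map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def iter_variants(handle, min_af=0.0):
    """Yield (chrom, pos, ref, alt, af) for rows whose AF is at least min_af."""
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = cols[:5]
        af_text = parse_info(cols[7]).get("AF")  # assumed INFO key
        if af_text is None or af_text is True:
            continue  # no usable AF value on this row
        af = float(af_text)
        if af >= min_af:
            yield chrom, int(pos), ref, alt, af


# Made-up records standing in for a downloaded file.
demo = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t12345\t.\tA\tG\t.\tPASS\tAC=3;AN=2054;AF=0.00146\n"
    "chr1\t12400\t.\tC\tT\t.\tPASS\tAC=900;AN=2054;AF=0.438\n"
)
common = list(iter_variants(demo, min_af=0.05))
print(common)
```

For a real file, the same function can be given `gzip.open(path, "rt")` as the handle, since bgzip-compressed VCFs are readable by Python's `gzip` module; for random access by region, an indexed reader such as tabix is the more usual choice.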
Thanks to the Consortium of Long Read Sequencing (CoLoRSdb) members and PacBio, who produced and released these joint-called long-read variant frequencies.
See the main Variant Frequencies container track for the general context and a comparison table across all included frequency databases.