Description

gnomAD v4 reuses the whole-genome data from gnomAD v3 and adds a much larger set of exomes. The current v4.1 release includes a fix for the allele number (AN) issue in v4.0. The v4.1 track shows variants from 807,162 individuals: 730,947 exomes and 76,215 genomes. This includes the 76,156 genomes from the gnomAD v3.1.2 release as well as new exome data from 416,555 UK Biobank individuals. For more detailed information on gnomAD v4.1, see the related blog post.

Display Conventions and Configuration

Following the conventions of the gnomAD browser, items are shaded according to their annotation type:
pLoF
Missense
Synonymous
Other

Mouse hover on an item will display the following details about each variant:

Clicking on an item will display additional details on the variant, including a population frequency table showing allele count in each sub-population.

Label Options

To maintain consistency with the gnomAD website, variants are labeled by default with their chromosomal start position followed by the reference and alternate alleles, for example "chr1-1234-T-CAG". The dbSNP rsID is also available as an additional label if the variant is present in dbSNP.
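
As an illustration of the default label format, the following Python sketch builds and splits labels of the form chrom-position-ref-alt. The function names make_label and parse_label are hypothetical and are not part of any UCSC or gnomAD tool; the sketch also assumes that alleles never contain a hyphen.

```python
def make_label(chrom, pos, ref, alt):
    """Join the four parts with hyphens, e.g. chr1-1234-T-CAG."""
    return f"{chrom}-{pos}-{ref}-{alt}"


def parse_label(label):
    """Split a label back into (chrom, pos, ref, alt).

    Splitting from the right keeps a chromosome name intact even if it
    contains a hyphen; alleles are assumed to be hyphen-free.
    """
    chrom, pos, ref, alt = label.rsplit("-", 3)
    return chrom, int(pos), ref, alt


label = make_label("chr1", 1234, "T", "CAG")
print(label)
print(parse_label(label))
```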

Filtering Options

Three filters are available for this track:

There is one additional configurable filter on the minimum minor allele frequency.

UCSC Methods

The gnomAD v4.1 data is unfiltered.

For the full steps used to create the gnomAD tracks at UCSC, please see the hg38 gnomad makedoc.

Data Access

The raw data can be explored interactively with the Table Browser or the Data Integrator. For automated analysis, the data may be queried from our REST API. The genome annotations are stored in files that can be downloaded from our download server, subject to the conditions set forth by the gnomAD consortium (see below).
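
For REST API access, a query URL can be assembled as in the Python sketch below. It assumes the /getData/track endpoint of api.genome.ucsc.edu with the parameters genome, track, chrom, start and end. The track name EXAMPLE_TRACK is a placeholder, not the real name of this track; the actual name can be looked up with the API's /list/tracks endpoint. The sketch only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

API_BASE = "https://api.genome.ucsc.edu"


def build_track_query(genome, track, chrom, start, end):
    """Return a getData/track URL for one genomic region."""
    params = {
        "genome": genome,
        "track": track,   # placeholder name in the example below
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}/getData/track?{urlencode(params)}"


url = build_track_query("hg38", "EXAMPLE_TRACK", "chr1", 12416, 12417)
print(url)
```

The resulting URL can be passed to any HTTP client, for example urllib.request.urlopen, and the response is returned as JSON.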

The underlying bigBed file contains only the information needed to display the track in the browser. Extra data such as VEP annotations and CADD scores are available in the same directory as the bigBed, in the files details.tab.gz and details.tab.gz.gzi. The details.tab.gz file contains the extra data in JSON format, compressed with bgzip (gzip-compatible), and the .gzi index file speeds up searching this data. Each variant has an associated md5sum in the name field of the bigBed, which can be used along with the _dataOffset and _dataLen fields to retrieve the associated external data. For example:

# find an item of interest, the last two fields are _dataOffset and _dataLen:
bigBedToBed genomes.bb stdout | head -4 | tail -1
chr1    12416    12417    854246d79dc5d02dcdbd5f5438542b6e    [..omitted..]    67293    902

# use _dataOffset and _dataLen (add one to _dataLen for the newline character):
bgzip -b 67293 -s 903 gnomad.v4.1.genomes.details.tab.gz
854246d79dc5d02dcdbd5f5438542b6e    {"DDX11L1": {"cons": ["non_coding_transcript_variant"...
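
The same lookup can be scripted. The Python sketch below derives the bgzip arguments from a bigBedToBed output line and splits a details record into its md5sum key and JSON payload. It assumes that both files are tab-separated and that _dataOffset and _dataLen are the last two bigBed fields, as in the example above. The function names are hypothetical, the bigBed line omits the middle fields, and the details record is invented for illustration rather than taken from the real file.

```python
import json


def bgzip_args(bed_line, details_path):
    """Build the bgzip command for the record referenced by a bigBed line."""
    fields = bed_line.rstrip("\n").split("\t")
    offset = int(fields[-2])   # _dataOffset
    length = int(fields[-1])   # _dataLen
    # One extra byte covers the trailing newline of the record.
    return ["bgzip", "-b", str(offset), "-s", str(length + 1), details_path]


def parse_details_record(record):
    """Split 'md5sum<TAB>json' into (md5sum, decoded JSON object)."""
    key, payload = record.rstrip("\n").split("\t", 1)
    return key, json.loads(payload)


# bigBed line with the middle fields left out, as in the example above.
bed_line = "chr1\t12416\t12417\t854246d79dc5d02dcdbd5f5438542b6e\t67293\t902"
args = bgzip_args(bed_line, "gnomad.v4.1.genomes.details.tab.gz")
print(" ".join(args))

# Invented record, used only to show the parsing step.
record = 'abc123\t{"GENE_X": {"cons": ["example_consequence"]}}\n'
key, details = parse_details_record(record)
print(key, details["GENE_X"]["cons"])
```

The argument list can be handed to subprocess.run with capture_output=True when bgzip is installed, and the captured output can then be passed to parse_details_record.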

The data can also be obtained directly from the gnomAD downloads page. Please refer to our mailing list archives for questions, or our Data Access FAQ for more information.

Credits

Thanks to the Genome Aggregation Database Consortium for making these data available. The data are released under the Creative Commons Zero Public Domain Dedication as described here.

Please note that some annotations within the provided files may have restrictions on usage. See here for more information.

References

Chen S, Francioli LC, Goodrich JK, Collins RL, Kanai M, Wang Q, Alföldi J, Watts NA, Vittal C, Gauthier LD et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature. 2024 Jan;625(7993):92-100. PMID: 38057664

Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, Collins RL, Laricchia KM, Ganna A, Birnbaum DP et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020 May;581(7809):434-443. PMID: 32461654; PMC: PMC7334197

Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, O'Donnell-Luria AH, Ware JS, Hill AJ, Cummings BB et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016 Aug 17;536(7616):285-91. PMID: 27535533; PMC: PMC5018207