The gnomAD v2 tracks show variants from 125,748 exomes and 15,708 whole genomes, all mapped to the GRCh37/hg19 reference sequence and lifted to the GRCh38/hg38 assembly. The data originate from 141,456 unrelated individuals sequenced as part of various population-genetic and disease-specific studies collected by the Genome Aggregation Database (gnomAD), release 2.1.1. Raw data from all studies have been reprocessed through a unified pipeline and jointly variant-called to increase consistency across projects. For more information on the processing pipeline and population annotations, see the following blog post and the 2.1.1 README.
gnomAD v2 data are based on the GRCh37/hg19 assembly. These tracks display the GRCh38/hg38 lift-over provided by gnomAD on their downloads site.
The gnomAD MPC score (Missense Deleteriousness Prediction by Constraint) is currently available only on hg19.
For questions on the gnomAD data, also see the gnomAD FAQ.
The gnomAD v2.1.1 track follows the standard display and configuration options available for VCF tracks, briefly explained below.
Four filters are available for these tracks, the same as the underlying VCF:
There are two additional filters available, one for the minimum minor allele frequency, and a configurable filter on the QUAL score.
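To make the two additional filters concrete, here is a minimal sketch of the same idea applied to VCF data lines outside the Browser. It is not the Browser's implementation. The function names `parse_info` and `passes`, the example records, and the use of the INFO `AF` value as the frequency to threshold are all assumptions made for illustration.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'AC=3;AF=0.01;DB' into a dict."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Flag-style entries without '=' are stored as True.
        fields[key] = value if sep else True
    return fields


def passes(line, min_af=0.0, min_qual=0.0):
    """Return True if a tab-separated VCF data line meets both thresholds.

    Assumption: records with a missing QUAL ('.') or no AF value are dropped.
    """
    cols = line.rstrip("\n").split("\t")
    qual = cols[5]
    info = parse_info(cols[7])
    if qual == "." or float(qual) < min_qual:
        return False
    af = info.get("AF")
    if af is None or af is True:
        return False
    # Multi-allelic sites list one AF per ALT allele; keep the site if any passes.
    return any(v != "." and float(v) >= min_af for v in af.split(","))


# Made-up records, for illustration only (not real gnomAD data).
example = [
    "1\t1000\trs1\tA\tG\t500.0\tPASS\tAC=10;AF=0.05",
    "1\t2000\t.\tC\tT\t20.0\tPASS\tAC=1;AF=0.0001",
    "1\t3000\t.\tG\tA\t900.0\tPASS\tAC=2;AF=0.002",
]
kept = [rec.split("\t")[1] for rec in example if passes(rec, min_af=0.001, min_qual=100)]
print(kept)
```

With these thresholds, the second record is dropped because its QUAL is below 100. The Browser's own definition of minor allele frequency may differ from the raw `AF` value used here.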
The raw data can be explored interactively with the Table Browser or the Data Integrator. For automated analysis, the data may be queried through our REST API, and the genome annotations are stored in files that can be downloaded from our download server, subject to the conditions set forth by the gnomAD consortium (see below). Variant VCFs can be found in the vcf/ subdirectory.
The data can also be obtained directly from the gnomAD downloads page. Please refer to our mailing list archives for questions, or our Data Access FAQ for more information.
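As a rough sketch of an automated query, the snippet below builds a request URL for the REST API's `getData/track` endpoint. The helper `track_query_url` and the placeholder `TRACK_NAME` are hypothetical. The actual track identifier is not given on this page and would need to be looked up, for example through the API's `list/tracks` endpoint. The region coordinates are arbitrary.

```python
from urllib.parse import urlencode

# Base endpoint of the UCSC REST API for retrieving track data.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def track_query_url(genome, track, chrom, start, end):
    """Build a getData/track URL for one genomic region."""
    params = {
        "genome": genome,
        "track": track,   # placeholder here; look up the real identifier
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return API_BASE + "?" + urlencode(params)


# TRACK_NAME is a stand-in, not a real track identifier.
url = track_query_url("hg38", "TRACK_NAME", "chr1", 1000000, 1001000)
print(url)
```

The resulting URL can be fetched with any HTTP client, and the response is JSON. Keeping URL construction separate from the network call makes the query easy to inspect before it is sent.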
Thanks to the Genome Aggregation Database Consortium for making these data available. The data are released under the Creative Commons Zero Public Domain Dedication as described here.
Please note that some annotations within the provided files may have restrictions on usage. See here for more information.
Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, Collins RL, Laricchia KM, Ganna A, Birnbaum DP et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020 May;581(7809):434-443. PMID: 32461654; PMC: PMC7334197
Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, O'Donnell-Luria AH, Ware JS, Hill AJ, Cummings BB et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016 Aug 17;536(7616):285-91. PMID: 27535533; PMC: PMC5018207