FinnGen is a public-private partnership that combines genotype data from Finnish biobanks with digital health record data from Finnish health registries. The R12 release contains imputed variants for 500,348 biobank samples genotyped on arrays, which represents approximately 10% of the Finnish population. Imputation used phased variants from 8,554 high-quality whole-genome sequences, also from Finland. Phenotype links can be viewed at the FinnGen PheWeb.
Due to license restrictions, the data for this track cannot be downloaded from the UCSC Genome Browser. The Table Browser, Data Integrator, and download server are not available for this track.
TSV data can be requested via the form at FinnGen, which triggers an automated email containing the download link. A script in our GitHub repo converts this file to VCF (see Methods below).
FinnGen participants were genotyped using a custom Axiom FinnGen1 array, supplemented by legacy collections genotyped with other arrays. Imputation used a population-specific reference panel of high-coverage (25–30x) whole-genome sequences from Finnish individuals. Ancestry outliers were removed via PCA against 1000 Genomes reference samples, and 5,780 duplicates and monozygotic twins were excluded. Variant quality was assessed using VQSR.
R12 annotated variants were downloaded from the Google Cloud bucket link sent by email and converted to VCF with a custom Python script. The makeDoc file of the track documents how all source files of the varFreqs track were converted. For some tracks, Python scripts were necessary; these are also available from GitHub.
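As an illustration of the kind of conversion described above, the following is a minimal sketch only, not the script from our GitHub repo. It assumes a tab-separated input with a header row and hypothetical column names (chr, pos, ref, alt, AF); the actual FinnGen annotation file may use different names and carries many more fields. The function name tsv_to_vcf and the step that adds a "chr" prefix are likewise assumptions made for this example.

```python
import csv
import io

# Hypothetical column names; adjust to match the header of the real file.
COLS = {"chrom": "chr", "pos": "pos", "ref": "ref", "alt": "alt", "af": "AF"}


def tsv_to_vcf(tsv_handle, vcf_handle):
    """Write a minimal sites-only VCF from a tab-separated table.

    Returns the number of variant records written.
    """
    vcf_handle.write("##fileformat=VCFv4.2\n")
    vcf_handle.write(
        '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">\n'
    )
    vcf_handle.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")

    # QUOTE_NONE: treat quote characters in annotation fields as plain text.
    reader = csv.DictReader(tsv_handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    count = 0
    for row in reader:
        chrom = row[COLS["chrom"]]
        # Assumption: input chromosome names lack the "chr" prefix.
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom
        fields = [
            chrom,
            row[COLS["pos"]],
            ".",  # ID: not set in this sketch
            row[COLS["ref"]],
            row[COLS["alt"]],
            ".",  # QUAL
            ".",  # FILTER
            "AF=" + row[COLS["af"]],
        ]
        vcf_handle.write("\t".join(fields) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny made-up input, for demonstration only (not real FinnGen data).
    demo = io.StringIO("chr\tpos\tref\talt\tAF\n1\t100\tA\tG\t0.25\n")
    out = io.StringIO()
    tsv_to_vcf(demo, out)
    print(out.getvalue(), end="")
```

The sketch streams the input row by row, so memory use stays flat even for a file with many millions of variants. A real conversion would also need to handle any non-standard chromosome naming, sort order, and compression with indexing, none of which is covered here.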
We thank the participants and investigators of the FinnGen study.
Kurki MI, Karjalainen J, Palta P, Sipilä TP, Kristiansson K, Donner KM, Reeve MP, Laivuori H, Aavikko M, Kaunisto MA et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature. 2023 Jan;613(7944):508-518. PMID: 36653562; PMC: PMC9849126