FinnGen is a public-private partnership that combines genotype data from Finnish biobanks with digital health record data from Finnish health registries. The R12 release contains imputed variants from 500,348 biobank samples typed on genotyping arrays, which corresponds to roughly 10% of the Finnish population. The imputation used phased variants from 8,554 high-quality whole genome sequences, also from Finland. Phenotype links can be viewed at the FinnGen PheWeb.
Due to license restrictions, the data for this track cannot be downloaded from the UCSC Genome Browser. The Table Browser, Data Integrator, and download server are not available for this track.
TSV data can be requested via the form at FinnGen, which triggers an automated email containing the download link. A script in our GitHub repo converts this file to VCF (see Methods below).
FinnGen participants were genotyped using a custom Axiom FinnGen1 array, supplemented by legacy collections genotyped with other arrays. Imputation used a population-specific reference panel of high-coverage (25–30x) whole-genome sequences from Finnish individuals. Ancestry outliers were removed via PCA against 1000 Genomes reference samples, and 5,780 duplicates and monozygotic twins were excluded. Variant quality was assessed using VQSR.
The R12 annotated variants were downloaded from the Google Cloud bucket link provided by email and converted to VCF with a custom Python script. The conversion steps for all source files of the varFreqs track are recorded in the track's makeDoc file. Some tracks also require Python scripts, which are kept on GitHub.
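The TSV-to-VCF conversion can be illustrated with a minimal sketch. This is not the script from the GitHub repo. The column names (chr, pos, ref, alt, af) and the function name tsv_to_vcf are assumptions made for illustration, and the real FinnGen file may use a different layout. The sketch writes a sites-only VCF with the allele frequency in the INFO field.

```python
import csv
import io


def tsv_to_vcf(tsv_handle, vcf_handle):
    """Write a minimal sites-only VCF from a tab-separated variant table.

    Assumes hypothetical input columns: chr, pos, ref, alt, af.
    Returns the number of variant records written.
    """
    reader = csv.DictReader(tsv_handle, delimiter="\t")

    # Minimal VCF header: format version, one INFO definition, column line.
    vcf_handle.write("##fileformat=VCFv4.2\n")
    vcf_handle.write(
        '##INFO=<ID=AF,Number=A,Type=Float,'
        'Description="Alternate allele frequency">\n'
    )
    vcf_handle.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")

    count = 0
    for row in reader:
        chrom = row["chr"]
        # Use UCSC-style chromosome names ("chr1" rather than "1").
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom
        fields = [
            chrom,
            row["pos"],
            ".",              # no variant ID in this sketch
            row["ref"],
            row["alt"],
            ".",              # QUAL is not available for imputed sites
            ".",              # FILTER left unset
            "AF=" + row["af"],
        ]
        vcf_handle.write("\t".join(fields) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Toy input with made-up values, for demonstration only.
    toy = io.StringIO("chr\tpos\tref\talt\taf\n1\t1000\tG\tA\t0.01\n")
    out = io.StringIO()
    tsv_to_vcf(toy, out)
    print(out.getvalue(), end="")
```

A real conversion would also need to sort the records by position and compress and index the output (for example with bgzip and tabix) before the file can be loaded as a track.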
Thanks to the participants and investigators of the FinnGen study.
Kurki MI, Karjalainen J, Palta P, Sipilä TP, Kristiansson K, Donner KM, Reeve MP, Laivuori H, Aavikko M, Kaunisto MA et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature. 2023 Jan;613(7944):508-518. PMID: 36653562; PMC: PMC9849126