This track displays structural variants (SVs), defined here as deletions, insertions, and complex substitutions of at least 50 bp, from the Arab Pangenome Reference (APR), a pangenome graph built from 53 UAE-resident Arab individuals drawn from eight countries (UAE, Saudi Arabia, Oman, Jordan, Egypt, Morocco, Syria, Yemen). Each bubble in the graph that contains an SV-sized alternative allele is shown as a single variant site, with allele counts aggregated across the 53 samples. The GRCh38 reference haplotype, which is present as an extra sample column in the source VCF, is excluded from the aggregation.
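The per-site aggregation described above can be sketched as follows. This is a minimal illustration, not the code used to build the track: the function name, the sample names, and the assumption that the reference column is literally named "GRCh38" are all hypothetical.

```python
from collections import Counter

def aggregate_allele_counts(sample_names, genotypes, exclude=("GRCh38",)):
    """Tally allele indices for one VCF row, skipping excluded sample columns.

    sample_names: sample column names from the VCF header.
    genotypes:    per-sample fields for one row, in the same order (e.g. "0|1").
    exclude:      column names to leave out; "GRCh38" is an assumed name.
    """
    counts = Counter()
    for name, field in zip(sample_names, genotypes):
        if name in exclude:
            continue
        # GT, when present, is the first colon-separated subfield.
        gt = field.split(":")[0]
        for allele in gt.replace("/", "|").split("|"):
            if allele != ".":  # "." marks a missing haplotype call
                counts[int(allele)] += 1
    return counts

# Hypothetical row with two diploid samples plus the reference column.
samples = ["sampleA", "sampleB", "GRCh38"]
gts = ["0|1", "2|.", "0"]
counts = aggregate_allele_counts(samples, gts)
print(dict(counts))
```

Index 0 is the REF allele and indices 1, 2, ... follow the order of the comma-separated ALT list, so the counts can be matched back to individual alleles at the site.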
The APR pangenome was built on the T2T-CHM13v2 reference. Variants are shown natively on the hs1 browser and lifted to hg38 using the UCSC hs1ToHg38.over.chain.gz chain; variants that do not lift cleanly (often in T2T-added euchromatic sequence) are omitted from the hg38 version of the track.
Items are colored by SV type:
Each item spans from the start of REF to its end on the reference. The name field is the graph snarl ID (e.g. <951452<1012008), which identifies the variant site in the APR pangenome graph.
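For scripts that need to relate an item back to the graph, the snarl ID can be split into its two boundary nodes. The sketch below assumes the usual GFA walk convention, in which ">" means a node is traversed forward and "<" means it is traversed in reverse; the function name is hypothetical.

```python
import re

def parse_snarl_id(snarl_id):
    """Split a snarl ID such as '<951452<1012008' into (node, orientation) pairs."""
    steps = re.findall(r"([<>])(\d+)", snarl_id)
    # Require exactly two oriented nodes and no leftover characters.
    if len(steps) != 2 or "".join(o + n for o, n in steps) != snarl_id:
        raise ValueError(f"not a two-node snarl ID: {snarl_id!r}")
    return [(int(node), "forward" if o == ">" else "reverse") for o, node in steps]

boundaries = parse_snarl_id("<951452<1012008")
print(boundaries)
```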
The source VCF is multi-allelic: a single graph snarl appears as one row with a comma-separated ALT list. For this track, each ALT is classified individually using the 50 bp threshold, and the row is emitted as a single BED item with:
Rows in which every ALT allele is smaller than 50 bp are not shown.
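The per-ALT classification and row filter can be illustrated with a short sketch. The 50 bp threshold on the REF/ALT length difference comes from this description. The rule used to tell insertions, deletions, and complex substitutions apart (a single remaining anchor base on one side) is an assumption made for illustration, as are the function names and type labels; the actual rules are in the build documentation.

```python
SV_MIN_BP = 50  # threshold stated in the track description

def classify_alt(ref, alt, min_bp=SV_MIN_BP):
    """Return 'INS', 'DEL', 'COMPLEX', or None for a sub-threshold allele."""
    diff = len(alt) - len(ref)
    if abs(diff) < min_bp:
        return None
    # Assumed rule: a lone anchor base on one side means a pure event.
    if len(ref) == 1:
        return "INS"
    if len(alt) == 1:
        return "DEL"
    return "COMPLEX"

def classify_row(ref, alt_field):
    """Classify each ALT in a comma-separated list; None if no ALT is SV-sized."""
    types = [classify_alt(ref, alt) for alt in alt_field.split(",")]
    sv_types = [t for t in types if t is not None]
    return sv_types or None

# A row with one small indel and one 60 bp insertion is kept as a single site.
row_types = classify_row("A", "AT," + "A" + "C" * 60)
print(row_types)
```

Returning None for a row corresponds to the case in which the site is dropped from the track.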
The APR pangenome was assembled from 53 individuals sequenced at average coverages of 35× PacBio HiFi, 54× ultralong ONT, and 65× Hi-C, producing haplotype-phased de novo assemblies with N50 > 124 Mb. The pangenome graph was built with Minigraph-Cactus v2.7.2, seeded on CHM13v2 (backbone) and GRCh38; variants were then extracted and deconstructed from the graph. For this UCSC track, the decomposed VCF was parsed, filtered to ALT alleles with a REF/ALT length difference of ≥50 bp, and merged per snarl site. See the build documentation in the kent source tree at src/hg/makeDb/doc/hg38/lrSv.txt for details.
The data can be explored interactively with the Table Browser or Data Integrator, and accessed from scripts via our API (track=aprSv).
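As a rough sketch of scripted access, the snippet below builds a request for the UCSC REST API getData/track endpoint and fetches the items in a region. The region coordinates and helper names are hypothetical, and the response is assumed to list the items under a key named after the track.

```python
import json
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu/getData/track"

def build_track_url(genome, chrom, start, end, track="aprSv"):
    """Build a getData/track URL; the API accepts ';'-separated parameters."""
    params = [("genome", genome), ("track", track),
              ("chrom", chrom), ("start", start), ("end", end)]
    return API_BASE + "?" + ";".join(f"{key}={value}" for key, value in params)

def fetch_items(genome, chrom, start, end, track="aprSv"):
    """Fetch track items for a region (requires network access)."""
    with urlopen(build_track_url(genome, chrom, start, end, track)) as resp:
        data = json.load(resp)
    # Assumption: items are returned under a key named after the track.
    return data.get(track, [])

# Hypothetical region on the native hs1 assembly.
url = build_track_url("hs1", "chr1", 1000000, 2000000)
print(url)
# items = fetch_items("hs1", "chr1", 1000000, 2000000)  # uncomment to query the API
```

Passing "hg38" as the genome would query the lifted version of the track instead.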
For automated download, the bigBed files are at http://hgdownload.soe.ucsc.edu/gbdb/hs1/lrSv/apr.bb (native) and http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/apr.bb (lifted).
The original APR pangenome VCF and assemblies can be downloaded from https://www.mbru.ac.ae/the-arab-pangenome-reference/, and the project source code is at https://github.com/muddinmbru/arab_pangenome_reference.
Thanks to the Arab Pangenome Reference team at Mohammed Bin Rashid University of Medicine and Health Sciences (Dubai), led by Mohammed Uddin, for producing and releasing the pangenome and its variant calls.
Nassir N, Almarri MA, Kumail M, Mohamed N, Balan B, Hanif S, AlObathani M, Jamalalail B, Elsokary H, Kondaramage D et al. A draft UAE-based Arab pangenome reference. Nat Commun. 2025 Jul 24;16(1):6747. PMID: 40707445; PMC: PMC12290100