low-complexity regions
Tracks in this set:
- Cent-Sat - Centromeric satellite repeats - gives the approximate locations
of centromeric satellite repeats and acrocentric short arms in GRCh38.
It was constructed manually from the official satellite annotation,
the DNA-BRNN satellite annotation, and the regions where minigraph alignment faded.
This file is intended for filtering spurious alignments or variant calls
caused by centromeric repeats or acrocentric arms.
- PAR regions on chrX, chrY - pseudo-autosomal regions (PARs) on chrX and chrY
- low-complexity regions excluding alpha and HSAT2/3 satellites -
Column 4: "ldust" for longdust regions 50 bp or longer; "mg" for regions
overlapping with minigraph LCR SVs.
Column 5: longest allele in each LCR.
- in LCR AND TRF - intersection of the LCR track and the trf/simpleRepeat track
- in TRF not LCR - regions in trf/simpleRepeat that are not found in LCR
- in LCR not TRF - regions in LCR that are not found in trf/simpleRepeat
- MHC, T-cell receptor (TCR) and immunoglobulin (Ig) regions
- Segmental duplications - built by merging overlapping regions in genomicSuperDups.txt.gz
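The merging step behind the segmental duplication track can be sketched as below. This is a minimal illustration, not the procedure actually used to build the track: the function name merge_intervals is hypothetical, the input is assumed to be already parsed into (chrom, start, end) tuples with BED-style half-open coordinates, and intervals that merely abut are merged as well, which is an assumption.

```python
from collections import defaultdict


def merge_intervals(records):
    """Merge overlapping or abutting (chrom, start, end) intervals.

    Hypothetical helper; coordinates are assumed to be 0-based, half-open.
    Returns a list sorted by chromosome name, then by start.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in records:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        for start, end in sorted(by_chrom[chrom]):
            if cur_end is not None and start <= cur_end:
                # Overlaps (or touches) the open interval: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap found: flush the open interval and start a new one.
                if cur_end is not None:
                    merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        if cur_end is not None:
            merged.append((chrom, cur_start, cur_end))
    return merged


# Toy input (made-up coordinates, not taken from genomicSuperDups.txt.gz).
example = [("chr1", 10, 20), ("chr1", 15, 30), ("chr1", 40, 50), ("chr2", 5, 8)]
merged_example = merge_intervals(example)
```

Sorting each chromosome by start position first means a single pass is enough: every interval either extends the currently open region or closes it.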
Intersections
- TRF/simpleRepeat coverage: 146,785,521 bases
- In TRF not in LCR: 116,912,031 bases
- hg38.lcr-v4 coverage: 35,426,253 bases
- In both TRF and LCR: 29,873,490 bases
- In LCR not in TRF: 5,552,763 bases
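The five totals above fit together: each track's coverage is the shared bases plus the bases unique to that track. The sketch below shows one way such numbers could be computed for a single chromosome and then checks the listed totals. The helper names coverage and intersect and the toy intervals are hypothetical, and the inputs are assumed to be sorted, non-overlapping, half-open (start, end) pairs; it is not the code used to produce the figures.

```python
def coverage(intervals):
    """Total bases covered by non-overlapping (start, end) intervals."""
    return sum(end - start for start, end in intervals)


def intersect(a, b):
    """Intersect two sorted, non-overlapping interval lists (one chromosome)."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


# Toy intervals (made up for illustration).
trf = [(0, 100), (200, 300)]
lcr = [(50, 120), (250, 260)]
both = coverage(intersect(trf, lcr))
trf_not_lcr = coverage(trf) - both
lcr_not_trf = coverage(lcr) - both

# Totals as listed in this section.
TRF_TOTAL = 146_785_521
TRF_NOT_LCR = 116_912_031
LCR_TOTAL = 35_426_253
BOTH = 29_873_490
LCR_NOT_TRF = 5_552_763

# Each track's coverage splits into shared and track-specific bases.
assert TRF_NOT_LCR + BOTH == TRF_TOTAL
assert LCR_NOT_TRF + BOTH == LCR_TOTAL
```

On these totals, roughly 84% of the LCR bases (29,873,490 of 35,426,253) also fall inside TRF/simpleRepeat, while about 20% of the TRF/simpleRepeat bases fall inside LCR.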
PAR regions
References
Qian Qin, Heng Li
Challenges in structural variant calling in low-complexity regions
arXiv. 2025 Sep;25:2509.23057
DOI: 10.48550/arXiv.2509.23057