Description

Long Interspersed Nuclear Element-1 (LINE-1, L1) is the only retrotransposon family in modern humans that still autonomously generates new copies by an RNA-mediated copy-and-paste mechanism. The L1-HS subfamily (HS for "human-specific") is responsible for ongoing retrotransposition activity and contributes to inter-individual genetic diversity: on average, two human genomes differ at hundreds of sites with respect to L1 insertion presence or absence. New L1-HS insertions are a recognised source of germline mutation and somatic mosaicism, and have been observed in many cancers and brain tissues.

This track shows the curated set of L1-HS insertion polymorphisms catalogued in euL1db, the European database of L1HS retrotransposon insertions in humans (Mir et al. 2015). Each feature is a Meta Retrotransposon Insertion Polymorphism (MRIP) — a non-redundant genomic site obtained by merging close Sample Retrotransposon Insertion Polymorphisms (SRIPs) reported across 32 published studies covering more than 900 samples.

Display Conventions and Configuration

Each item is a single MRIP. The score is the euL1db pseudo-allele frequency (field pseudoAlleleFreq) scaled to 0–1000. Items are coloured by the lineage of contributing SRIPs:

germline — all contributing SRIPs are germline insertions
somatic — all contributing SRIPs are somatic insertions
mixed — both germline and somatic SRIPs at this site
unknown — lineage not reported
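
The score and lineage conventions above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the track. The function names are hypothetical, the pseudo-allele frequency is assumed to lie between 0 and 1, and a site that combines one reported lineage with unreported ones is assumed to fall under unknown.

```python
def scaled_score(pseudo_allele_freq):
    """Scale a frequency (assumed to be in [0, 1]) to the BED score range 0-1000."""
    if not 0.0 <= pseudo_allele_freq <= 1.0:
        raise ValueError("frequency is assumed to lie in [0, 1]")
    return round(pseudo_allele_freq * 1000)


def mrip_lineage(srip_lineages):
    """Summarise the lineages of the SRIPs contributing to one MRIP."""
    seen = set(srip_lineages)
    if {"germline", "somatic"} <= seen:
        return "mixed"       # both germline and somatic SRIPs at this site
    if seen == {"germline"}:
        return "germline"    # every contributing SRIP is germline
    if seen == {"somatic"}:
        return "somatic"     # every contributing SRIP is somatic
    return "unknown"         # nothing reported, or an assumed fallback case


print(scaled_score(0.25))
print(mrip_lineage(["germline", "somatic", "germline"]))
```

The colours assigned to each lineage are not reproduced here; the sketch returns only the category label.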

Clicking an item opens a detail page that lists the studies and PubMed IDs reporting the insertion, the detection methods used, the tissues, clinical conditions and populations represented, and a table of the contributing samples (truncated to 200 rows for very large aggregations — see euL1db for the full sample breakdown). Filters are available for pseudo-allele frequency, SRIP and study counts, lineage, PCR validation, and whether the MRIP is annotated as already present in the reference genome.

Methods

euL1db integrates published L1-HS insertion calls from a wide range of detection assays. Most studies used enrichment-based protocols (RC-seq, L1-seq, Ewing PCR, TIP-seq), high-throughput whole-genome sequencing analysed with TranspoSeq, MELT or similar pipelines, or fosmid-based long-read approaches. For each accepted study, the original authors' sample-level calls (SRIPs) were curated and, where needed, re-mapped to hg19. Germline SRIPs that lie within 200 bp of each other on the same strand are merged into a single non-redundant MRIP. Somatic SRIPs are not merged, because each somatic insertion is treated as an independent retrotransposition event. See Mir et al. 2015 for full curation details.
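
The merging rule can be illustrated with a small sketch. It is not the euL1db implementation, which is described in Mir et al. 2015. The sketch assumes each SRIP is reduced to a single coordinate, that "within 200 bp" means a gap of at most 200 bp to the previous SRIP in sorted order (single-linkage chaining), and that every non-germline SRIP is left unmerged. The Srip type and merge_srips function are hypothetical names.

```python
from typing import NamedTuple


class Srip(NamedTuple):
    chrom: str
    pos: int
    strand: str
    lineage: str


def merge_srips(srips, max_gap=200):
    """Group germline SRIPs into clusters; keep all other SRIPs as singletons."""
    clusters = []
    germline = sorted(
        (s for s in srips if s.lineage == "germline"),
        key=lambda s: (s.chrom, s.strand, s.pos),
    )
    for s in germline:
        prev = clusters[-1][-1] if clusters else None
        if (
            prev is not None
            and prev.chrom == s.chrom
            and prev.strand == s.strand
            and s.pos - prev.pos <= max_gap  # assumed reading of "within 200 bp"
        ):
            clusters[-1].append(s)
        else:
            clusters.append([s])
    # Assumption: anything that is not germline is never merged.
    clusters.extend([s] for s in srips if s.lineage != "germline")
    return clusters


example = [
    Srip("chr1", 1000, "+", "germline"),
    Srip("chr1", 1150, "+", "germline"),
    Srip("chr1", 1100, "-", "germline"),
    Srip("chr1", 1010, "+", "somatic"),
    Srip("chr1", 5000, "+", "germline"),
]
for cluster in merge_srips(example):
    print([(s.pos, s.strand, s.lineage) for s in cluster])
```

In this example the two plus-strand germline calls at 1000 and 1150 form one cluster, while the minus-strand call, the distant call and the somatic call each remain on their own.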

Track files were generated from the euL1db v1.00 release (last updated 14 October 2014; data dump downloaded March 2018) using the script meiEul1dbToBed.py, which joins the MRIP, SRIP, Sample, Individual, Study and Methods tables and emits a BED9+ file. For details of the build process see the makeDoc text file hg19/mei.txt and the scripts directory src/hg/makeDb/scripts/mei. The Helman2014 study used numeric chromosome names (23 = X, 24 = Y); these were renamed in the build script. The hg19 BED was lifted to hg38 with liftOver using the standard hg19ToHg38.over.chain.gz chain. Of 8,991 hg19 MRIPs, 8,988 lifted successfully; the 3 unlifted MRIPs are listed in the build directory.
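
The chromosome renaming step could look roughly like the following. This is a hypothetical sketch rather than an excerpt from meiEul1dbToBed.py; the function name and the assumption that input names may or may not carry a "chr" prefix are illustrative only.

```python
# Numeric names used by the Helman2014 study for the sex chromosomes.
NUMERIC_SEX_CHROMS = {"23": "X", "24": "Y"}


def normalize_chrom(name):
    """Return a UCSC-style chromosome name, mapping 23 to X and 24 to Y."""
    bare = name[3:] if name.startswith("chr") else name
    bare = NUMERIC_SEX_CHROMS.get(bare, bare)
    return "chr" + bare


for raw in ["23", "24", "7", "chrX"]:
    print(raw, "->", normalize_chrom(raw))
```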

Data Access

The data can be explored interactively in table format with the Table Browser or the Data Integrator and exported from there to spreadsheet or tab-separated tables. From scripts, the data can be accessed through our REST API, track=meiEul1db.
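
As a sketch of scripted access, the snippet below builds a request URL for the track and, optionally, fetches it. It assumes the public getData/track endpoint at api.genome.ucsc.edu with genome, track, chrom, start and end parameters. The helper names are hypothetical, and the layout of the returned JSON is not assumed here, so the parsed object is returned as-is.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu/getData/track"  # assumed endpoint


def track_url(genome, chrom, start, end, track="meiEul1db"):
    """Build the request URL for one genomic region of the track."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}?{urlencode(params)}"


def fetch_track(genome, chrom, start, end):
    """Fetch the region and return the parsed JSON (requires network access)."""
    with urlopen(track_url(genome, chrom, start, end), timeout=30) as resp:
        return json.load(resp)


print(track_url("hg38", "chr21", 0, 100000000))
```

Only track_url is called when the snippet runs; fetch_track is left for interactive use because it needs a network connection.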

For automated download and analysis, the annotation is stored in a bigBed file that can be downloaded from our download server. The file for this track is called eul1db.bb. Individual regions or the whole genome annotation can be obtained using our tool bigBedToBed, which can be compiled from the source code or downloaded as a precompiled binary for your system. Instructions for downloading source code and binaries can be found here. The tool can also be used to obtain features within a given range, e.g.

bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/mei/eul1db.bb -chrom=chr21 -start=0 -end=100000000 stdout
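
The bigBedToBed call shown above can be wrapped in a short Python helper when the output needs further processing. This is a minimal sketch that assumes bigBedToBed is installed and on the PATH. The function names are hypothetical, and only the first six standard BED columns are parsed, with the remaining fields kept as raw strings.

```python
import shutil
import subprocess

BIGBED_URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/mei/eul1db.bb"


def bigbed_command(chrom, start, end, url=BIGBED_URL):
    """Build the bigBedToBed argument list for one region, writing to stdout."""
    return [
        "bigBedToBed",
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",
    ]


def read_region(chrom, start, end):
    """Run bigBedToBed and parse its tab-separated output into dictionaries."""
    if shutil.which("bigBedToBed") is None:
        raise FileNotFoundError("bigBedToBed is not on PATH")
    result = subprocess.run(
        bigbed_command(chrom, start, end),
        check=True,
        capture_output=True,
        text=True,
    )
    rows = []
    for line in result.stdout.splitlines():
        fields = line.split("\t")
        rows.append({
            "chrom": fields[0],
            "start": int(fields[1]),
            "end": int(fields[2]),
            "name": fields[3],
            "score": int(fields[4]),
            "strand": fields[5],
            "extra": fields[6:],  # remaining BED9+ columns, left unparsed
        })
    return rows


print(" ".join(bigbed_command("chr21", 0, 100000000)))
```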

The original annotation source data can be downloaded from eul1db.unice.fr via the Download tab.

Credits

Thanks to Gaël Cristofari and colleagues at IRCAN (Nice, France) for making the euL1db data freely available, and to the original study authors whose data are aggregated. Track built at UCSC by the Genome Browser group.

References

Mir AA, Philippe C, Cristofari G. euL1db: the European database of L1HS retrotransposon insertions in humans. Nucleic Acids Res. 2015 Jan;43(Database issue):D43-7. PMID: 25352549; PMC: PMC4383891