# Readme
#
# Description:
These files contain a mixture of annotated and novel elements ("novel" = located within the intergenic or antisense space of the annotation). Annotated elements are long GENCODE v10 elements only (i.e. all elements except short RNAs).

### Novel elements ###
Novel elements were assembled using Cufflinks. For each experiment, the assembly was done on the pooled reads of all bioreplicates. The resulting models were then merged per RNA fraction (polyAplus, polyAminus, total) using Cuffmerge. For the polyAplus fraction, additional models from Caltech RNAseq data were included to increase detection levels.

### Quantification ###
The quantification of the above elements is based on the FluxCapacitor software (M. Sammeth) for transcript quantification. The FluxCapacitor assigns expression values to individual transcripts by read deconvolution, a non-redundant assignment of reads that depends on the underlying transcript structure of a locus. Only reads from read pairs with the correct orientation were taken into account (that is, both mates are located on opposing strands, facing each other).

*TranscriptEnsV65IAcuff.gtf : transcript quantifications obtained from the FluxCapacitor
*ExonsEnsV65IAcuff.gff : exon quantifications, derived from the quantifications of individual transcripts by summing the expression of all transcripts that share the exon
*GeneEnsV65IAcuff.gff : gene quantifications, derived from the quantifications of individual transcripts by summing the expression of all transcripts that belong to the respective gene

### npIDR ###
Non-parametric IDR (npIDR, by A. Dobin) was performed on all experiments with 2 bioreplicates (using RPKM) to assess reproducibility. No IDR was performed on experiments with only a single replicate. IDR was performed on annotated and novel (intergenic, antisense) elements separately, to acknowledge that Cufflinks models are of a different nature than GENCODE elements.
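The way exon and gene quantifications are derived from transcript quantifications (see the Quantification section) can be illustrated with a small sketch. This is not the pipeline's actual code: the in-memory record layout (`gene_id`, `rpkm`, `exons`), the function names and the example values are all hypothetical, chosen only to show the summing step.

```python
from collections import defaultdict

# Hypothetical transcript records; the values are made up for illustration.
# An exon is identified here by (chromosome, start, end, strand).
transcripts = [
    {"transcript_id": "T1", "gene_id": "G1", "rpkm": 1.5,
     "exons": [("chr1", 100, 200, "+"), ("chr1", 300, 400, "+")]},
    {"transcript_id": "T2", "gene_id": "G1", "rpkm": 2.5,
     "exons": [("chr1", 100, 200, "+"), ("chr1", 500, 600, "+")]},
    {"transcript_id": "T3", "gene_id": "G2", "rpkm": 4.0,
     "exons": [("chr2", 50, 150, "-")]},
]


def gene_quantification(records):
    """Sum the expression of all transcripts that belong to each gene."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["gene_id"]] += rec["rpkm"]
    return dict(totals)


def exon_quantification(records):
    """Sum the expression of all transcripts that share each exon."""
    totals = defaultdict(float)
    for rec in records:
        # set() ensures an exon is counted once per transcript.
        for exon in set(rec["exons"]):
            totals[exon] += rec["rpkm"]
    return dict(totals)


gene_rpkm = gene_quantification(transcripts)
exon_rpkm = exon_quantification(transcripts)
```

In this example the exon chr1:100-200 (+) is shared by T1 and T2, so its value is the sum of both transcript values, while exons unique to one transcript simply inherit that transcript's value.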
### File description (.gtf/.gff) ###
Field1: Chromosome
Field2: Source
Field3: Element
Field4: Start
Field5: End
Field6: Relative averaged score = ((RPKM1+RPKM2)/2)*1000/max_RPKM
Field7: Strand
Field8: .
Field9: key-value pairs => gene_id(s) "gene_id(s)"; transcript_id(s) "transcript_id(s)"; RPKM1 "RPKM1"; RPKM2 "RPKM2"; npIDR "npIDR"

In the case of only one bioreplicate per experiment, only one RPKM value is given, with npIDR "NA".

############################################################################################################################
Contact: angelika.merkel@crg.es, sarah.djebali@crg.es
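As an illustration of the nine-field layout given under File description, the sketch below parses one record and recomputes the Field6 score. Several things are assumptions and not stated in this README: that fields are tab-separated (as in standard GTF/GFF), that the score is the mean of the two RPKM values scaled by 1000/max_RPKM, and that the attribute keys in the example line are spelled `gene_id` and `transcript_id`. The example line, its values and the function names are hypothetical.

```python
import re

# Matches pairs of the form: key "value"
ATTRIBUTE_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def to_float(value):
    """Convert a value to float; a missing value or "NA" becomes None."""
    return None if value in (None, "NA") else float(value)


def parse_record(line):
    """Split one line into its nine fields (assumed tab-separated)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    chrom, source, element, start, end, score, strand, _dot, attrs = fields
    attributes = dict(ATTRIBUTE_RE.findall(attrs))
    return {
        "chrom": chrom,
        "source": source,
        "element": element,
        "start": int(start),
        "end": int(end),
        "score": float(score),
        "strand": strand,
        "attributes": attributes,
        "rpkm1": to_float(attributes.get("RPKM1")),
        "rpkm2": to_float(attributes.get("RPKM2")),   # None if single replicate
        "npidr": to_float(attributes.get("npIDR")),   # None if "NA"
    }


def relative_score(rpkm1, rpkm2, max_rpkm):
    """Mean of the two replicate RPKMs, scaled to 0..1000 (assumed reading)."""
    return ((rpkm1 + rpkm2) / 2) * 1000 / max_rpkm


# Hypothetical example record, not taken from the actual files.
example = ('chr1\tCufflinks\ttranscript\t1000\t2000\t500\t+\t.\t'
           'gene_id "G1"; transcript_id "T1"; RPKM1 "2.0"; RPKM2 "4.0"; npIDR "0.01"')
record = parse_record(example)
score = relative_score(record["rpkm1"], record["rpkm2"], max_rpkm=6.0)
```

For single-replicate experiments `rpkm2` and `npidr` come back as `None`; how Field6 is computed in that case is not described here, so `relative_score` is only meant for records with two RPKM values.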