Meta-exon annotation
To create the meta-exon annotation, the following steps were taken:
- The exon annotation from Ensembl Biomart v.71 was downloaded. The file contained the following columns: 
chromosome, exon start, exon end, Ensembl exon id, Ensembl gene id, gene name, strand.
 - All additional contigs (GL*, LRG*, etc.) were removed so that only the ordinary chromosomes (1-22, X, Y, MT) remained. This was done with the custom script cutStrangeChr.py (see attachment).
 - The Biomart file was converted to BED format (start coordinate shifted to 0-based, strand recoded from 1/-1 to +/-) and sorted by chromosome and start coordinate.
 - Overlapping exons were merged with mergeBed from the BEDTools suite.
 - The resulting file was converted to GTF format with the custom script mergedBed_to_gtf.py (see attachment), which retains the strand information.
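The filtering, BED conversion and merging steps can be illustrated with a minimal Python sketch. This is not the attached cutStrangeChr.py and not mergeBed itself: the function names `biomart_to_bed` and `merge_overlapping` are hypothetical, and the merge rule (intervals are joined only when they share at least 1 bp, so book-ended exons stay separate) is an assumption about the intended effect of `-d -1`. Strand is ignored during merging, as in the pipeline, where no strand option is passed to mergeBed.

```python
# Hypothetical illustration of the first three steps; names are not from the attachments.
ORDINARY = {str(i) for i in range(1, 23)} | {"X", "Y", "MT"}

def biomart_to_bed(rows):
    """Keep ordinary chromosomes and turn 1-based BioMart rows into 0-based BED tuples.

    Each row is (chromosome, exon start, exon end, exon id, gene id, gene name, strand).
    """
    bed = []
    for chrom, start, end, exon_id, gene_id, gene_name, strand in rows:
        if chrom not in ORDINARY:
            continue  # drops GL*, LRG* and other additional contigs
        name = ":".join((exon_id, gene_id, gene_name))
        # BED starts are 0-based, ends stay as they are (half-open interval)
        bed.append((chrom, int(start) - 1, int(end), name, ".",
                    "-" if strand == "-1" else "+"))
    return bed

def merge_overlapping(bed):
    """Merge intervals on one chromosome that overlap by at least 1 bp (assumed rule)."""
    merged = []
    for chrom, start, end, name, _score, _strand in sorted(bed, key=lambda r: (r[0], r[1], r[2])):
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            prev = merged[-1]
            # extend the previous meta-exon and collect names, separated by ';'
            merged[-1] = (chrom, prev[1], max(prev[2], end), prev[3] + ";" + name)
        else:
            merged.append((chrom, start, end, name))
    return merged
```

With this rule, two exons covering 100-200 and 150-300 become one meta-exon, while an exon starting at 301 directly after one ending at 300 remains a separate meta-exon.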
 
The final commands to generate the meta-exon annotation were the following:
./cutStrangeChr.py biomart_export.txt | awk 'BEGIN {FS="\t"}; {OFS="\t"}; {if ($7 == "-1") $7 = "-"; else $7 = "+"}; {print $1, $2 - 1, $3, $4 ":" $5 ":" $6, ".", $7}' | sort -k1,1n -k2,2n | mergeBed -nms -d -1 -i stdin > biomart_export.merged.tmp
./mergedBed_to_gtf.py biomart_export.merged.tmp biomart_export.txt | sort -k1,1n -k4,4n > meta-exons_v71_cut_sorted_18-04-14.gtf
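For readers without access to the attachment, the last conversion step might look roughly like the following Python sketch. It is not the attached mergedBed_to_gtf.py: the function name `merged_bed_to_gtf_line`, the `source` value, the attribute layout and the handling of mixed strands are all assumptions made for illustration. It assumes each merged line carries the `exon id:gene id:gene name` labels joined by `;`, and that strand is looked up per exon id from the original Biomart export.

```python
# Hypothetical sketch of the BED-to-GTF step; not the attached script.
def merged_bed_to_gtf_line(bed_line, strand_by_exon, source="ensembl_v71", feature="exon"):
    """Turn one merged BED line into one GTF line.

    bed_line: "chrom<TAB>start<TAB>end<TAB>names", names joined by ';'
    strand_by_exon: dict mapping exon id -> '+' or '-' (built from the Biomart export)
    """
    chrom, start, end, names = bed_line.rstrip("\n").split("\t")[:4]
    exon_ids, gene_ids = [], []
    for label in names.split(";"):
        exon_id, gene_id = label.split(":")[:2]
        exon_ids.append(exon_id)
        if gene_id not in gene_ids:
            gene_ids.append(gene_id)
    strands = {strand_by_exon[e] for e in exon_ids}
    # assumption: a meta-exon built from exons on both strands gets '.'
    strand = strands.pop() if len(strands) == 1 else "."
    attributes = 'gene_id "%s"; exon_ids "%s";' % ("+".join(gene_ids), ",".join(exon_ids))
    # GTF coordinates are 1-based and inclusive, so the BED start is shifted back by +1
    return "\t".join([chrom, source, feature, str(int(start) + 1), end,
                      ".", strand, ".", attributes])
```

The point of the sketch is the coordinate shift back to 1-based starts and the strand lookup, since the merged file itself no longer carries a strand column.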
            Last modified on Sep 19, 2016 4:48:45 PM
          
        
        
      