Changes between Version 9 and Version 10 of GoNL_Immunochip_Data_Preparation


Timestamp:
Jul 1, 2011 4:14:23 PM (13 years ago)
Author:
laurent
Comment:

--

  • GoNL_Immunochip_Data_Preparation

    v9 v10  
    23 23
    24 24 ''fastaFromBed'' needs a [http://genome.ucsc.edu/FAQ/FAQformat.html#format1 UCSC BED] file as input. This file is tab-delimited and contains 3 columns: Chrom, Start_seq and End_seq. As we are only interested in specific loci, Start_seq and End_seq will be 1 base apart so that only the locus of interest is reported in the output file. This file can easily be generated from either the initial VCF file or the PLINK BIM file:
    25 * From VCF:  grep -v '^#' in.vcf | awk '{OFS="\t";print $1,$2,$2+1}' > out.bed
     25
     26 * From VCF:  grep -v '!^#' in.vcf | awk '{OFS="\t";print $1,$2,$2+1}' > out.bed
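
The same conversion can be sketched in Python, which makes the coordinate convention explicit. This is a minimal illustration, not part of the original workflow: the function name `vcf_to_bed_lines` and its `start_offset` parameter are hypothetical. BED intervals are 0-based and half-open, so under that convention a 1-based VCF position P corresponds to start P-1 and end P; the sketch uses that as its default, and `start_offset=0` reproduces the output of the awk one-liner above (`$2`, `$2+1`). It is worth checking one known locus against the reference to confirm which offset returns the intended base.

```python
def vcf_to_bed_lines(vcf_lines, start_offset=-1):
    """Yield tab-delimited BED lines (chrom, start, end), one per VCF record.

    start_offset is added to the 1-based VCF POS to obtain the BED start.
    The end is always start + 1, so each interval spans exactly one base.
    -1 follows the 0-based, half-open BED convention (an assumption about
    what is wanted here); 0 reproduces the awk one-liner ($2, $2+1).
    """
    for line in vcf_lines:
        # Skip blank lines and VCF header/meta lines, like grep -v '^#'.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.rstrip("\n").split("\t")[:2]
        start = int(pos) + start_offset
        yield f"{chrom}\t{start}\t{start + 1}"


# Made-up records, for illustration only.
example_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\trs1\tA\tG\t.\t.\t.",
    "2\t2500\trs2\tC\tT\t.\t.\t.",
]
bed_lines = list(vcf_to_bed_lines(example_vcf))
awk_style_lines = list(vcf_to_bed_lines(example_vcf, start_offset=0))
```

To convert a real file, pass an open file handle as `vcf_lines` and write each yielded line, followed by a newline, to the output BED file.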
    26 27
    27 28 Once you have the input file, run ''fastaFromBed'' on it, giving the Human Reference corresponding to the chip data as the other input. For more information on ''fastaFromBed'', see the [http://code.google.com/p/bedtools/ BEDTools] Manual.
    28 
    29 29 === Re-arrange Ref/Alt alleles based on the Human Genome Reference ===
    30 30 Now that we have the Human Reference Genome loci, it is straightforward to re-arrange the alleles so that the Ref and Alt alleles correspond to the Human Genome Reference. I wrote a small script, ''align-vcf-to-ref.pl'', that does the work provided the correct input. Note that when flipping the order of alleles in the VCF Ref/Alt columns, one must also flip the genotypes accordingly.
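
To illustrate what flipping involves, here is a minimal Python sketch of the idea. It is not the ''align-vcf-to-ref.pl'' script, whose internals are not shown here; the function names `flip_genotype` and `align_record_to_ref` are hypothetical. It assumes biallelic SNP records, that GT is the first entry of each sample column (as the VCF specification requires when GT is present), and that the reference base for the locus has already been looked up. Strand flips and INFO fields that depend on allele order (for example allele frequencies) are deliberately not handled.

```python
# Maps allele index 0 -> 1 and 1 -> 0; separators and '.' are untouched.
_SWAP_01 = str.maketrans("01", "10")


def flip_genotype(sample_field):
    """Swap allele indices 0 and 1 in the GT part of one sample column."""
    gt, sep, rest = sample_field.partition(":")
    return gt.translate(_SWAP_01) + sep + rest


def align_record_to_ref(fields, ref_base):
    """Return a copy of a split VCF record whose REF matches ref_base.

    fields   -- the tab-split columns of one biallelic VCF record
    ref_base -- the reference genome base at this locus
    """
    fields = list(fields)
    ref, alt = fields[3], fields[4]
    ref_base = ref_base.upper()
    if ref.upper() == ref_base:
        return fields  # already aligned to the reference
    if alt.upper() != ref_base:
        # Neither allele matches: possibly a strand issue, not handled here.
        raise ValueError(f"{fields[0]}:{fields[1]} has no allele {ref_base}")
    fields[3], fields[4] = alt, ref
    # Sample columns start at index 9 (after the FORMAT column).
    fields[9:] = [flip_genotype(s) for s in fields[9:]]
    return fields


# Made-up record, for illustration only.
record = ["1", "1000", "rs1", "G", "A", ".", ".", ".", "GT:DP",
          "0/0:12", "0/1:35", "1|0:20", "./.:0"]
aligned = align_record_to_ref(record, "A")
```

In this example the reference base is `A`, which is listed as Alt, so the two alleles are swapped and a homozygous `0/0` becomes `1/1`, while missing genotypes (`./.`) are left untouched.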