18 | | Gunzip fasta file. Build BWA index. Tar-gzip the results. |
19 | | |
20 | | == Split fastq file == |
21 | | |
22 | | [[Image(splitFastq.png, 50%)]] |
23 | | |
24 | | Splits a large gzipped fastq file into several smaller files with the Unix command 'split'. The results are uploaded to the directory specified in 'gridOutputDir'. |
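
A minimal Python sketch of the same idea, for illustration only (the workflow itself uses the Unix command 'split'). It assumes the common 4-line FASTQ record layout, so each chunk must hold a multiple of 4 lines or a read would be cut in half. The function name `split_fastq`, the `chunk_NNNN.fastq` naming and the default chunk size are hypothetical.

```python
import gzip
from itertools import islice
from pathlib import Path


def split_fastq(fastq_gz, out_dir, reads_per_chunk=1_000_000):
    """Split a gzipped FASTQ file into chunks that hold whole 4-line records."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    with gzip.open(fastq_gz, "rt") as reads:
        while True:
            # One FASTQ record is 4 lines: header, sequence, '+', qualities.
            lines = list(islice(reads, 4 * reads_per_chunk))
            if not lines:
                break
            # Hypothetical naming scheme: chunk_0000.fastq, chunk_0001.fastq, ...
            chunk = out_dir / f"chunk_{len(chunk_paths):04d}.fastq"
            chunk.write_text("".join(lines))
            chunk_paths.append(chunk)
    return chunk_paths
```

With 'split' itself, the equivalent constraint is that the line count passed to `split -l` is a multiple of 4.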
25 | | |
26 | | == Alignment with BWA on each split file == |
27 | | |
28 | | [[Image(BWAparam.png, 50%)]] |
29 | | |
30 | | Runs BWA with adjustable parameter settings. |
31 | | * Matches sequence reads to a reference database |
32 | | * Converts sai to sam |
33 | | * Converts sam to bam |
34 | | * Sorts the bam file |
35 | | * Indexes the sorted bam file |
36 | | * Tar-gzips all results, including the intermediate files |
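
The steps above could be chained roughly as follows. This is a sketch under stated assumptions, not the grid job itself: it assumes single-end reads (`bwa aln` followed by `bwa samse`), a recent samtools (`samtools sort -o`), and that `reference` is the prefix of an already built BWA index. The function name `bwa_chunk_commands` and the output file naming are hypothetical.

```python
def bwa_chunk_commands(reference, fastq, prefix, aln_options=()):
    """Build the command lines for aligning one split fastq file."""
    sai, sam = f"{prefix}.sai", f"{prefix}.sam"
    bam, sorted_bam = f"{prefix}.bam", f"{prefix}_sorted.bam"
    return [
        # Align reads to the reference; aln_options carries the adjustable settings.
        ["bwa", "aln", *aln_options, "-f", sai, reference, fastq],
        # Convert sai to sam (single-end reads assumed).
        ["bwa", "samse", "-f", sam, reference, sai, fastq],
        # Convert sam to bam.
        ["samtools", "view", "-bS", "-o", bam, sam],
        # Sort the bam file (syntax of recent samtools versions assumed).
        ["samtools", "sort", "-o", sorted_bam, bam],
        # Index the sorted bam file; this writes <sorted_bam>.bai.
        ["samtools", "index", sorted_bam],
        # Pack all results, including the intermediate files.
        ["tar", "-czf", f"{prefix}.tar.gz",
         sai, sam, bam, sorted_bam, f"{sorted_bam}.bai"],
    ]
```

Each entry can be run in order with `subprocess.run(cmd, check=True)`, so that a failing step stops the chain before later steps consume incomplete files.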
37 | | |
38 | | == Merge bam files == |
39 | | |
40 | | [[Image(MergeIndexSNPcall.png, 50%)]] |
41 | | |
42 | | * Downloads all bai, bam, sam and tar.gz files from the gridInputDirectory |
43 | | * Unpacks (gunzip and untar) the tar.gz files if they are present |
44 | | * Gunzips the reference file (fasta format) |
45 | | * Merges all _sorted.bam files |
46 | | * Builds an index on the merged file |
47 | | * Calls SNPs and makes a selection. The output is in pileup format. |
48 | | * Converts pileup format to bed format |
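
Two of these steps can be sketched in Python. This is an illustration under assumptions, not the workflow's actual code. `merge_commands` picks the `_sorted.bam` files and builds `samtools merge` and `samtools index` command lines. `pileup_line_to_bed` converts one pileup line to a three-column BED line, relying on pileup positions being 1-based and BED intervals being 0-based and half-open. Both function names and the default output name `merged.bam` are hypothetical.

```python
def merge_commands(filenames, merged_bam="merged.bam"):
    """Build commands that merge all *_sorted.bam files and index the result."""
    sorted_bams = sorted(f for f in filenames if f.endswith("_sorted.bam"))
    if not sorted_bams:
        raise ValueError("no _sorted.bam files to merge")
    return [
        ["samtools", "merge", merged_bam, *sorted_bams],
        ["samtools", "index", merged_bam],
    ]


def pileup_line_to_bed(line):
    """Convert one pileup line to a 3-column BED line (chrom, start, end)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    # Pileup positions are 1-based; BED start is 0-based and the end is exclusive.
    return f"{chrom}\t{pos - 1}\t{pos}"
```

The commands can be run with `subprocess.run(cmd, check=True)`. Which additional pileup columns belong in the bed file depends on the SNP selection, which is not specified here.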
49 | | |
50 | | == SNP calling with varscan, determine coverage == |
51 | | |
52 | | [[Image(Coverage_Varscan_BaseCoverage.png)]] |
53 | | |
54 | | * Creates a pileup file (with samtools pileup -f) and sends the output to Varscan, which calls SNPs, indels and copy number variations. |
55 | | * Calculates coverage per 50kbp |
56 | | * Calculates coverage per base |
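
As an illustration of the windowed coverage step, the sketch below turns per-base depths into a mean depth per 50 kbp window. It assumes the input is a sequence of (chromosome, 1-based position, depth) records and that positions absent from the input have depth 0. The function name `window_coverage` is hypothetical, and the last window of a chromosome is treated as full length, which underestimates its mean.

```python
from collections import defaultdict


def window_coverage(depths, window=50_000):
    """Return the mean depth per fixed-size window.

    depths: iterable of (chrom, pos, depth) with 1-based positions.
    Result keys are (chrom, window_start) with a 0-based window start.
    """
    totals = defaultdict(int)
    for chrom, pos, depth in depths:
        # Map the 1-based position to the 0-based start of its window.
        start = ((pos - 1) // window) * window
        totals[(chrom, start)] += depth
    # Positions without a record contribute zero depth to the mean.
    return {key: total / window for key, total in totals.items()}
```

Windows that contain no record at all do not appear in the result.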
| 19 | '''Status:''' Implemented on grid. Source code is made available. |