wiki:BigCompute

Version 5 (modified by Barbera van Schaik, 14 years ago)


Groningen cluster

People

  • UMCG: Morris, Freerk, more?

Description: To be written. It should cover the code templates, the automatic PBS script generation, and job submission and monitoring.
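Until the description is filled in, the idea of generating PBS scripts from a code template can be illustrated with a minimal sketch. This is not the Groningen implementation: the template text, the function name `render_pbs_script`, and the default resource values are assumptions made for illustration only. The `#PBS` directives shown are standard PBS/Torque options.

```python
from string import Template

# Hypothetical job template; "$$" is an escaped "$" so that the shell
# variable PBS_O_WORKDIR survives the substitution step.
PBS_TEMPLATE = Template("""#!/bin/bash
#PBS -N $job_name
#PBS -l nodes=1:ppn=$cores
#PBS -l walltime=$walltime
cd $$PBS_O_WORKDIR
$command
""")


def render_pbs_script(job_name, command, cores=1, walltime="01:00:00"):
    """Fill the template with the settings of one job and return the script text."""
    return PBS_TEMPLATE.substitute(
        job_name=job_name,
        command=command,
        cores=cores,
        walltime=walltime,
    )


if __name__ == "__main__":
    # Example command line; file names are placeholders.
    print(render_pbs_script("bwa_aln", "bwa aln ref.fa reads.fq > reads.sai", cores=4))
```

The rendered text could be written to a file and submitted with `qsub`. Monitoring is left out of the sketch.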

Port applications to Dutch Life Science Grid

People

  • AMC: Antoine van Kampen, Barbera van Schaik, Silvia D Olabarriaga, Mark Santcroos
  • Sara/BiGGrid: Tom Visser, more?

Description: The software will be implemented as workflow components. The workflows will run on the Dutch Life Science Grid.

Implemented workflow components at AMC

The following workflow components are already available. The list can be expanded with Pindel and (parts of) the GATK pipeline.

  • Splitting of fastq files
  • Building a BWA index on the genome sequence (base space and color space)
  • BWA for shotgun reads (base space and color space). Parameter sweeps are possible. Output is in BAM format
  • Merge BAM results
  • Samtools pileup
  • VarScan (pileup to SNP, indel and cns)
  • Bam2coverage creates a UCSC wiggle file to display the genome coverage (per 50 kbp)
  • Coverage-per-base determines the coverage for every base in the genome and summarizes the results (coverage versus frequency)
  • ANNOVAR (implementation in progress). This is a pipeline to annotate variants (gene, dbSNP, HapMap, 1000 Genomes, conservation, etc.)
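The two coverage components in the list above can be sketched as follows. This is an illustration of the idea, not the AMC code: it assumes the per-base depths of one chromosome are already available as a list of integers (for example, parsed from pileup output), and the function names are hypothetical. The wiggle output uses the UCSC `fixedStep` declaration line.

```python
from collections import Counter


def coverage_histogram(depths):
    """Coverage versus frequency: map each depth to the number of bases with that depth."""
    return dict(sorted(Counter(depths).items()))


def windowed_mean_coverage(depths, window=50_000):
    """Mean coverage per fixed-size window; the last window may be shorter."""
    means = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


def to_wiggle(chrom, means, window=50_000):
    """Format window means as a fixedStep wiggle block (one value per window).

    Note: the final window is written with the full span even if the
    chromosome ends earlier; a real implementation may need to trim it.
    """
    lines = [f"fixedStep chrom={chrom} start=1 step={window} span={window}"]
    lines.extend(f"{mean:.2f}" for mean in means)
    return "\n".join(lines)


if __name__ == "__main__":
    depths = [0, 0, 1, 2, 2, 2]          # toy per-base depths
    print(coverage_histogram(depths))
    print(to_wiggle("chr1", windowed_mean_coverage(depths, window=4), window=4))
```

The default window of 50,000 bases matches the 50 kbp resolution mentioned for Bam2coverage; the toy example uses a window of 4 so that the output stays readable.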

To be implemented

  • BWA for paired-end reads
  • The components of the Groningen pipeline that are not yet implemented as workflow components
  • Pindel

Things to address

  • Data access rights
  • Available disk space on the grid storage elements / worker nodes

Alternatives

Clusters

  • Groningen
  • Leiden
  • Huygens
  • Lisa
  • Philips
  • DAS

Grid
