Version 8 (modified by freerkvandijk, 12 years ago)

/target/gpfs2/gcc/groups/gonl

This page describes the directory structure used for GoNL data management on the Groningen cluster, rooted at /target/gpfs2/gcc/groups/gonl. Members of the gonl group have read permission throughout; some folders are also writable by the group.

  • /tools
    • All software, scripts and tools used to process the data.
    • Note that all users can install tools of common interest in this shared directory; each tool should include its version in its name, because multiple versions of the same tool typically coexist.
  • /resources
    • All resources needed for data processing, including genome references, dbSNP releases, etc.
    • Note that all users can put resources of common interest here.
  • /home
    • one private folder per member of this group
  • /general
    • presentations, publications, other stuff
  • /projects
    • /batchX
      • /rawdata
        • here is a list of fq.gz files
      • /results
        • /alignment
          • here is a list of bam files
        • /stats
          • here is one file per QC tool
        • /snp
          • here is one vcf file per analysis run
      • /logs
      • /intermediate_results
        • whatever is needed, will be empty at end of project
    • /denovo_asm
      • denovo assembly
    • /downsampling
    • /extraAnnotation
    • /FastQs
      • GoNL FASTQ files, copied from the Grid
    • /gonl_sampleBAM_md5sums.zip
      • MD5 checksums of all sample BAM files found in /batchX/results/alignment/
    • /gvnl_2flowcells
      • DEPRECATED: 2 flowcells to be re-analysed
    • /gwas-chip
      • GWAS data used for QC
    • /imputationBenchmarking
      • data used for imputation, the imputation benchmarking project, eQTL analysis, gold standard, etc.
    • /imputation_BMImeta
      • BMI meta-analysis data
    • /LoF
      • Loss-of-Function variants
    • /RdamExome
      • ???
    • /re-analysis2
      • all results of the re-analysis of the 2 flowcells that have low-quality (LQ) reads in the second read of the read pairs
    • /splitbams
      • ???
    • /SV
      • all structural variants detected, organised per SV tool; validation results are also stored in this directory
    • /targeted_denovo
      • ???
    • /trio-analysis
      • /intermediate
        • intermediate results
      • /rawdata
        • all trio-realigned BAM files
      • /resources
        • genotype data
      • /results
        • /snps
          • /releaseX
            • all releaseX GoNL SNP calls per chromosome
    • /unified_genotyper_indel_calls
      • indel calls made by Unified Genotyper (GATK)
    • /variantBurdenNonCodingRna
      • test data for the variant burden project
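
The per-batch layout under /projects/batchX listed above can be checked or created with a short script. The following is only a minimal sketch in Python: the function names, the batch name "batch1" used in the usage note, and the choice of Python are assumptions for illustration and are not part of the GoNL tooling. The sub-directory names are taken from the list above.

```python
from pathlib import Path

# Sub-directories of one batch, relative to projects/batchX,
# copied from the directory listing on this page.
BATCH_SUBDIRS = (
    "rawdata",
    "results/alignment",
    "results/stats",
    "results/snp",
    "logs",
    "intermediate_results",
)


def batch_dirs(projects_root, batch_name):
    """Return the expected directories for one batch as Path objects."""
    base = Path(projects_root) / batch_name
    return [base / sub for sub in BATCH_SUBDIRS]


def missing_batch_dirs(projects_root, batch_name):
    """Return the expected batch directories that do not exist on disk."""
    return [d for d in batch_dirs(projects_root, batch_name) if not d.is_dir()]


def create_batch_dirs(projects_root, batch_name):
    """Create any missing batch directories (hypothetical helper)."""
    for d in batch_dirs(projects_root, batch_name):
        d.mkdir(parents=True, exist_ok=True)
```

For example, missing_batch_dirs("/target/gpfs2/gcc/groups/gonl/projects", "batch1") would list which of the six expected folders are absent for a hypothetical batch named batch1, without modifying anything on disk; create_batch_dirs requires write permission on the projects folder.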