Version 2 (modified by laurent, 13 years ago)

UMCG SFTP (application20.target.rug.nl)

The SFTP server can be used to access most of the data on the UMCG cluster. Because bandwidth is limited, please download only the files you need, and download the compressed versions of files when they are available (usually the case for all plain text files).
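As an illustration of the "compressed version when available" advice, here is a minimal Python sketch that filters a directory listing so a plain file is skipped whenever a compressed copy of it is also listed. The function name, the suffix list, and the assumption that a compressed copy is named by appending a suffix to the plain file name are all hypothetical; the server does not promise this layout.

```python
# Suffixes treated as "compressed" are an assumption of this sketch,
# not a rule enforced by the SFTP server.
COMPRESSED_SUFFIXES = (".tar.bz2", ".bz2", ".gz", ".zip")


def pick_downloads(names):
    """Return the listed names worth downloading.

    A plain file is dropped when the same name plus a compressed
    suffix is also present in the listing.
    """
    listed = set(names)
    keep = []
    for name in sorted(listed):
        has_compressed_copy = any(
            name + suffix in listed for suffix in COMPRESSED_SUFFIXES
        )
        if not has_compressed_copy:
            keep.append(name)
    return keep
```

The listing itself would come from whatever SFTP client you use; the sketch only decides which names to request.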

/target/gpfs2/gcc/groups/gonl/sftp/

Root of the SFTP.

/target/gpfs2/gcc/groups/gonl/sftp/A4

Contains all the information about the A4 test trio, including all the raw and aligned data.

/target/gpfs2/gcc/groups/gonl/sftp/BGI

Contains all the data coming from BGI, including their variant calls. The data is organized by batch in the batchX subfolders. Each of the subfolders typically contains the following:

  • batchX/

A set of compressed files containing the plain text data, together with md5 checksum files for all files so that downloads can be verified. The compressed files are named as follows: timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2. All plain text data should be available as a compressed file, including but not limited to: CNV, InDel, InDel annotations, SNP, SNP annotations. Some of these are available in multiple formats; see the BGI data page for more explanation of the BGI data and its formats.

  • batchX/bam OR batchX/alignment

The BAM files aligned by BGI.

  • batchX/CNV

CNVs in CNV Detector format. If you want the data for all samples, please download the compressed archive from batchX/ instead.

  • batchX/indel

InDels in samtools pileup format. If you want the data for all samples, please download the compressed archive from batchX/ instead.

  • batchX/indel_annotation

InDel annotations in GFF format. If you want the data for all samples, please download the compressed archive from batchX/ instead.

  • batchX/SNP

SNPs in SOAPsnp format. If you want the data for all samples, please download the compressed archive from batchX/ instead.

  • batchX/SNP_annotation

SNP annotations in GFF format. If you want the data for all samples, please download the compressed archive from batchX/ instead.

  • batchX/vcf_format/CNV

CNVs in VCF format. If you want the data for all samples, please download the compressed archive from batchX/ instead.

  • batchX/vcf_format/indel

InDels in VCF format. If you want the data for all samples, please download the compressed archive from batchX/ instead.

  • batchX/vcf_format/SNP

SNPs in VCF format. If you want the data for all samples, please download the compressed archive from batchX/ instead.
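The archive naming scheme given for batchX/ (timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2) and the accompanying md5 files lend themselves to a small helper. The following Python sketch splits an archive name into its fields and checks a downloaded file against an md5 digest. It assumes that no field contains a dot and that you have already read the expected digest out of the md5 file (the layout of those files is not described here); the names BgiArchive, parse_archive_name and md5_matches are invented for this example.

```python
import hashlib
from typing import NamedTuple


class BgiArchive(NamedTuple):
    timestamp: str
    batch: str         # e.g. "batch1" (hypothetical value)
    data_type: str     # e.g. "SNP"
    genome_build: str  # the "hg1X" field
    data_format: str


def parse_archive_name(filename):
    """Split timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2 into fields."""
    suffix = ".tar.bz2"
    if not filename.endswith(suffix):
        raise ValueError("not a .tar.bz2 archive: " + filename)
    parts = filename[: -len(suffix)].split(".")
    # Assumption: exactly six dot-separated fields, the second being "BGI".
    if len(parts) != 6 or parts[1] != "BGI" or not parts[2].startswith("batch"):
        raise ValueError("unexpected archive name: " + filename)
    timestamp, _, batch, data_type, genome_build, data_format = parts
    return BgiArchive(timestamp, batch, data_type, genome_build, data_format)


def md5_matches(path, expected_hex, chunk_size=1 << 20):
    """Return True if the file's md5 digest equals expected_hex."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

Since some names differ from one batch to the other, parse_archive_name raises ValueError instead of guessing when a name does not fit the pattern.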

NOTES:

  • Unless specified otherwise, all data is aligned to hg19.
  • Some of the folder and file names are inconsistent from one batch to the other, because the original names found on the BGI HD have been kept.
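Because folder names can differ between batches, a script that walks the BGI tree may want to try more than one candidate name per data type. The Python sketch below builds the candidate remote directories from the layout listed above. The dictionary keys and the function name are labels made up for this example, the batch name is passed in as-is (for instance "batch1", a hypothetical value), and only the bam/alignment alternative is documented on this page; other batches may deviate in ways not captured here.

```python
import posixpath

SFTP_ROOT = "/target/gpfs2/gcc/groups/gonl/sftp"

# Candidate subfolder names per data type, taken from the listing above.
# The keys are labels invented for this sketch.
BGI_SUBFOLDERS = {
    "bam": ["bam", "alignment"],
    "cnv": ["CNV"],
    "indel": ["indel"],
    "indel_annotation": ["indel_annotation"],
    "snp": ["SNP"],
    "snp_annotation": ["SNP_annotation"],
    "vcf_cnv": ["vcf_format/CNV"],
    "vcf_indel": ["vcf_format/indel"],
    "vcf_snp": ["vcf_format/SNP"],
}


def candidate_paths(batch, data_type):
    """Return the remote directories to try, in order, for one data type."""
    batch_dir = posixpath.join(SFTP_ROOT, "BGI", batch)
    return [posixpath.join(batch_dir, sub) for sub in BGI_SUBFOLDERS[data_type]]
```

A caller would try each returned path in turn and use the first one that exists on the server.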

/target/gpfs2/gcc/groups/gonl/sftp/pilot

Data from the pilot, including aligned BAMs and SNPs.

/target/gpfs2/gcc/groups/gonl/sftp/resources

GoNL resources tarball (Thanks Freerk!)

/target/gpfs2/gcc/groups/gonl/sftp/upload

Everyone has write permission in this directory. Use it for data exchange.