24 | | |
25 | | ==== /target/gpfs2/gcc/groups/gonl/ ==== |
26 | | Root of the gonl data. |
27 | | |
28 | | ==== /target/gpfs2/gcc/groups/gonl/projects ==== |
29 | | Root of the gonl projects. This is where all the raw data and results live, organized into projects. The full data structure is described here: DataManagement/ProjectData
30 | | ==== /target/gpfs2/gcc/groups/gonl/projects/bgi ==== |
31 | | Contains all the data coming from BGI, including their variant calls. The data is organized by batch in the batchX subfolders. Each of the subfolders typically contains the following: |
32 | | |
33 | | * batchX/ |
34 | | * A set of compressed files containing the plain-text data, and md5 files, for download purposes. The compressed files are named as follows: timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2. All plain-text data should be available as a compressed file, including but not limited to: CNV, !InDel, !InDel annotations, SNP, SNP annotations. Some of these are available in multiple formats; see the BGI data page for more explanation of the BGI data and its formats. ** md5 checksum files for all files.
35 | | |
36 | | * batchX/bam OR batchX/alignment |
37 | | * The BAM files aligned by BGI |
38 | | |
39 | | * batchX/CNV |
40 | | * CNVs in CNV Detector format. To download the data for all samples, use the compressed archive in batchX/
41 | | |
42 | | * batchX/indel |
43 | | * !InDels in samtools pileup format. To download the data for all samples, use the compressed archive in batchX/
44 | | |
45 | | * batchX/indel_annotation |
46 | | * !InDel annotations in GFF format. To download the data for all samples, use the compressed archive in batchX/
47 | | |
48 | | * batchX/SNP |
49 | | * SNPs in SOAPsnp format. To download the data for all samples, use the compressed archive in batchX/
50 | | |
51 | | * batchX/SNP_annotation |
52 | | * SNP annotations in GFF format. To download the data for all samples, use the compressed archive in batchX/
53 | | |
54 | | * batchX/vcf_format/CNV |
55 | | * CNVs in VCF format. To download the data for all samples, use the compressed archive in batchX/
56 | | |
57 | | * batchX/vcf_format/indel |
58 | | * !InDels in VCF format. To download the data for all samples, use the compressed archive in batchX/
59 | | |
60 | | * batchX/vcf_format/SNP |
61 | | * SNPs in VCF format. To download the data for all samples, use the compressed archive in batchX/
62 | | |
63 | | NOTES: |
64 | | |
65 | | * Unless specified otherwise, all data is aligned to hg19
66 | | * Some of the folder and file names are inconsistent between batches. This is because the original names as found on the BGI HD have been kept.
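The archive naming convention described above (timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2) can be split into its fields with a short script. The following is only a sketch: the regular expression, the `ArchiveName` type and the `parse_archive_name` function are hypothetical helpers, not part of any existing tool, and the sample filename is made up for illustration. Because names are not consistent across batches, the function returns `None` for anything that does not match rather than guessing.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern for: timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2
# Each field is taken to be dot-free; real files may deviate from this.
ARCHIVE_RE = re.compile(
    r"^(?P<timestamp>[^.]+)\.BGI\.(?P<batch>batch[^.]+)\.(?P<data_type>[^.]+)"
    r"\.(?P<genome>hg1\d)\.(?P<data_format>[^.]+)\.tar\.bz2$"
)


class ArchiveName(NamedTuple):
    """Fields of an archive name (hypothetical helper type)."""
    timestamp: str
    batch: str
    data_type: str
    genome: str
    data_format: str


def parse_archive_name(filename: str) -> Optional[ArchiveName]:
    """Return the fields of an archive name, or None if it does not match."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        return None
    return ArchiveName(**match.groupdict())


if __name__ == "__main__":
    # Made-up filename, used only to illustrate the convention.
    print(parse_archive_name("20110101.BGI.batch1.SNP.hg19.vcf.tar.bz2"))
```

Files that return `None` are best inspected by hand, since the original BGI names were kept as delivered.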
| 24 | Personal users should refer to this page for the description of the folder structure holding the data: !DataManagement/ProjectData |