This page is a work in progress describing the data management of the GoNL.
Important note: The block size on the storage is 6MB, which means that each file, regardless of its real size, occupies at least 6MB on the file system. Data should therefore be kept in a few big files rather than a multitude of small files whenever possible. Typically things like logs, old submit scripts, etc. should be compressed into one file for archiving.
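To make the cost of small files concrete, here is a minimal Python sketch. It is not part of the GoNL tooling: the function names `allocated_size` and `archive_small_files` are hypothetical, and it assumes that "6MB" means 6 × 1024 × 1024 bytes and that every file, even an empty one, occupies a whole number of blocks (at least one).

```python
import tarfile
from pathlib import Path

# Assumption: the 6MB block size is 6 * 1024 * 1024 bytes.
BLOCK_SIZE = 6 * 1024 * 1024


def allocated_size(file_size, block_size=BLOCK_SIZE):
    """Return the bytes a file occupies if it always takes whole blocks.

    Assumes even an empty file occupies one block.
    """
    blocks = max(1, -(-file_size // block_size))  # ceiling division
    return blocks * block_size


def archive_small_files(paths, archive_path):
    """Bundle the given files into one gzip-compressed tar archive.

    Files are stored under their base name only, so the inputs should
    have distinct names. The originals are left untouched.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            tar.add(str(path), arcname=Path(path).name)
    return archive_path
```

Under these assumptions, 1,000 log files of 1 KB each would occupy 1,000 blocks (about 6,000 MB), while the same files bundled with `archive_small_files(log_paths, "logs.tar.gz")` would most likely fit in a single block.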
See
- DataManagement/ProjectData - how raw data, intermediate data and result data are organized on the compute clusters
- DataManagement/SftpServer - how to access the SFTP for data sharing
- DataManagement/ProjectResources - where the resources and tools used by the pipelines are located
- DataManagement/FileNameConventions - how files are named so we understand each other