BIOS data location and access overview
Processed files are stored on the BIOS VM, a virtual machine running in SURFsara's HPC Cloud. It behaves like a normal Linux server and also supports graphical connections.
BIOS Metadatabase can be accessed via a web browser or any client program supporting HTTPS.
Raw files are stored on SURFsara's Grid, an online data storage system. Accessing the Grid requires some technical skill and completing a registration procedure.
BIOS virtual machine provided by SURFsara
For safety and privacy reasons, BIOS data (genome, transcriptome, methylome and phenome) is only accessible for downstream analysis on a SURFsara virtual machine (VM). The BIOS VM is managed by Martijn Vermaat and Leon Mei. Since the resources and capacity of this VM are limited, it should only be used for downstream analysis. If you want to work on many BAM files or run a similarly expensive analysis that you would normally use a cluster for, it is probably better to get acquainted with working on the Grid directly. If you are not sure, contact Martijn or Leon.
The current test VM runs these specs:
- 16 processors
- 128 GB RAM
- 4.5 TB disk space mounted at /virdir. This is the place to keep your analysis data. Files in the /virdir/Backup folder are backed up about once per month. Files in /virdir/Scratch are not backed up.
- 2 GB soft limit and 3 GB hard limit per user at /home
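To keep an eye on the /home limits, a small shell helper can compare the size of a directory against the quota. The sketch below is an illustration only: the function names `quota_level` and `home_quota_report` are hypothetical, and it assumes the 2 GB and 3 GB limits are counted in 1024-based units. The actual quota accounting on the VM may differ.

```shell
#!/bin/sh
# Assumed limits in kilobytes (2 GB soft, 3 GB hard, taken as 1024-based units).
SOFT_LIMIT_KB=$((2 * 1024 * 1024))
HARD_LIMIT_KB=$((3 * 1024 * 1024))

# Classify a usage figure (in KB) against the assumed limits.
quota_level() {
    used_kb="$1"
    if [ "$used_kb" -ge "$HARD_LIMIT_KB" ]; then
        echo "over-hard-limit"
    elif [ "$used_kb" -ge "$SOFT_LIMIT_KB" ]; then
        echo "over-soft-limit"
    else
        echo "ok"
    fi
}

# Measure a directory with du and report its level (defaults to $HOME).
home_quota_report() {
    dir="${1:-$HOME}"
    used_kb=$(du -sk "$dir" 2>/dev/null | cut -f1)
    used_kb="${used_kb:-0}"   # fall back to 0 if du produced no output
    echo "$dir: ${used_kb} KB used, status: $(quota_level "$used_kb")"
}
```

Calling `home_quota_report` without arguments checks your home directory, and `df -h /virdir` shows the remaining space on the shared analysis disk.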
BIOS VM Access
To get access, please send a request to Leon Mei (h.mei[at]lumc.nl) or Martijn Vermaat (m.vermaat.hg[at]lumc.nl) with your public SSH key (instructions).
For remote access from a Linux or Mac OSX terminal, type
ssh username@bios-vm.bbmrirp3-lumc.vm.surfsara.nl
where your private SSH key is in the standard location ~/.ssh/id_rsa (alternatively, specify it with -i).
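When the private key is not in the default location, the connection can be wrapped as in the sketch below. The username `alice` and the key path `~/.ssh/bios_key` used in the usage note are made-up placeholders, and `bios_target` and `bios_connect` are hypothetical helper names, not tools installed on the VM.

```shell
#!/bin/sh
# Host name as given above.
BIOS_HOST="bios-vm.bbmrirp3-lumc.vm.surfsara.nl"

# Build the user@host target for ssh or scp from a username.
bios_target() {
    printf '%s@%s\n' "$1" "$BIOS_HOST"
}

# Open a session with an explicitly chosen private key.
# Usage: bios_connect USERNAME [KEYFILE]
bios_connect() {
    user="$1"
    key="${2:-$HOME/.ssh/id_rsa}"   # default key location mentioned above
    ssh -i "$key" "$(bios_target "$user")"
}
```

For example, `bios_connect alice ~/.ssh/bios_key` would run `ssh -i ~/.ssh/bios_key alice@bios-vm.bbmrirp3-lumc.vm.surfsara.nl`.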
For terminal access from Windows, use the PuTTY tool and configure the VM IP address, your username, and your private SSH key.
For graphical access from Windows, use MobaXterm, as advised by SURF in the HPC Cloud documentation (https://doc.hpccloud.surfsara.nl/access-your-VM).
Alternatively, for graphical access from Windows or Mac OSX, use X2Go and configure the VM IP address, your username, your private SSH key and the session type/desktop manager (Gnome). You can also use a remote desktop connection client (for Mac: http://www.microsoft.com/nl-nl/download/details.aspx?id=18140).
Step by step instructions for using a public/private SSH pair for access to the VM
Connection troubleshooting page (under construction)
RStudio Server
An RStudio Server instance runs on the BIOS VM: http://bios-vm.bbmrirp3-lumc.vm.surfsara.nl:8787
Log in with the same username and password that you use for your SSH session.
UCSC Genome Browser tracks
To view RP3 data in the UCSC Genome Browser, use the WWW export directory on /virdir and select the exported URLs as custom tracks.
Please note that no privacy-sensitive data should be stored here, as it will be world-readable.
Example sessions:
Grid SRM access from the BIOS VM
If you need access to raw BIOS data (e.g., RNAseq, methylation), you will have to download it from the Grid SRM storage to the BIOS VM. The instructions below describe how.
Note: Requesting access to the SRM takes quite some time and using the SRM itself is not the easiest thing to learn. As such, if there already is someone at your institute with access and experience using the SRM, it might be faster and easier to ask that person for help.
Grid SRM Access
Before proceeding, follow the steps in Obtaining access to grid infrastructure.
Prepare a proxy
To download data from the Grid SRM to the BIOS VM, you'll need a proxy and your keys (the grid certificate). You need access to a UI (a grid user interface machine) to start a proxy, for example gb-ui-lumc.lumc.nl, ui.lsg.psy.vu.nl or another site.
- On the UI there should be a .globus folder in your home dir that contains the grid certificate. Copy the .globus directory from your local home folder to the UI home folder. Make sure the permissions are set correctly (log in to a UI and issue the commands chmod 644 usercert.pem and chmod 400 userkey.pem). These files don't need to be renewed.
  .globus/:
  total 8
  -rw-r--r-- 1 mgalen mgalen 1769 Aug 14 16:55 usercert.pem
  -r-------- 1 mgalen mgalen 1751 Aug 14 16:55 userkey.pem
- The proxy can be started by logging in to a UI and using startGridSession:
  startGridSession bbmri.nl:/bbmri.nl/RP3
- This creates your own x509 file in the /tmp dir on the UI, which looks something like this:
  -rw------- 1 mgalen 6.1K Aug 27 09:55 x509up_u40208
- You may have to change the permissions of this file using chmod 644 x509up_u40208. Copy this file to a location on the BIOS VM for later use (for example, also to /tmp). Make sure you copy the x509 file associated with your username. The proxy is valid for 7 days, so you need to renew it weekly.
- Now log in to the BIOS VM and use this command to fetch the file you just created on the UI:
  scp mgalen@uimd.grid.sara.nl:/tmp/x509up_u1234 /tmp
  (replace 'uimd.grid.sara.nl' with the address of your UI, and 'mgalen' and 'x509up_u1234' with your own username and proxy file name)
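Because the proxy expires after about a week, it can help to check the copied file before starting a transfer. The sketch below is one possible check, not part of the official procedure. It assumes the proxy file is a PEM file whose first block is the proxy certificate (the usual layout for grid proxies) and that `openssl` is available on the VM; `proxy_status` is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether a proxy file exists and stays valid for at least MIN_SECONDS.
# Usage: proxy_status PROXY_FILE [MIN_SECONDS]
proxy_status() {
    proxy="$1"
    min_seconds="${2:-3600}"   # default: require one more hour of validity
    if [ ! -r "$proxy" ]; then
        echo "missing"
    # -checkend exits 0 only if the certificate will not expire within the given time
    elif openssl x509 -in "$proxy" -noout -checkend "$min_seconds" >/dev/null 2>&1; then
        echo "valid"
    else
        echo "expired-or-unreadable"
    fi
}
```

For example, `proxy_status /tmp/x509up_u40208 86400` reports whether the proxy is still good for another day. This only inspects the certificate lifetime; if the result is not `valid`, start a new grid session on the UI and copy the fresh file again.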
Downloading files
- Once these files are in place, you can copy data from the Grid SRM to the BIOS VM using curl. For example, log in to the VM and issue the following command, where -E points to the path where you put the proxy file. Don't forget to redirect the output from curl to a local filename.
  mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/README >README
- You can also upload data to the Grid SRM from the BIOS VM using curl. To upload a local file test.txt, use the --upload-file (or -T) argument:
  mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/test.txt --upload-file test.txt
  (In practice, of course, use a more appropriate directory on the Grid SRM instead of the project root.) Instead of specifying the full target name including the filename, you can also specify just the target directory, ending in a /. Curl will then use your local filename on the Grid SRM as well:
  mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/ --upload-file test.txt
- If you want to delete a file from the Grid SRM, use -X DELETE (use this with caution):
  mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/test.txt -X DELETE
- To check whether a file exists without downloading it, use the -I option (a response with 200 OK means the file exists, 404 Not Found means it doesn't):
  mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L -I https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/README
  HTTP/1.1 200 OK
  Date: Tue, 28 Jan 2014 15:26:17 GMT
  ETag: 0000EADAD57CE41D47F5A8F069A7C24F8003_-1773128220
  Last-Modified: Wed, 15 Jan 2014 14:31:15 GMT
  Content-Length: 154
  Server: Jetty(7.3.1.v20110307)
  mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L -I https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/nonexisting
  HTTP/1.1 404 Not Found
  Content-Type: text/html
  Transfer-Encoding: chunked
  Server: Jetty(7.3.1.v20110307)
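The curl commands above share the same certificate path, proxy file and base URL, so they can be collected in a few shell functions. The following is a minimal sketch under stated assumptions: the function names (`srm_url`, `srm_status`, `srm_get`, `srm_put`) are hypothetical, and the defaults are simply the values from the examples above, which you will probably need to change (in particular the proxy file name).

```shell
#!/bin/sh
# Values copied from the examples above; override them for your own setup.
SRM_BASE="${SRM_BASE:-https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3}"
CA_PATH="${CA_PATH:-/etc/grid-security/certificates/}"
PROXY="${PROXY:-/tmp/x509up_u40208}"

# Turn a path relative to the project root into a full URL.
srm_url() {
    printf '%s/%s\n' "$SRM_BASE" "${1#/}"
}

# Print the final HTTP status code for a remote path (200 = exists, 404 = missing).
srm_status() {
    curl -s -o /dev/null -w '%{http_code}\n' -I -L \
        --CApath "$CA_PATH" -E "$PROXY" "$(srm_url "$1")"
}

# Download REMOTE_PATH to LOCAL_FILE.
# --fail makes curl exit non-zero on HTTP errors instead of saving the error page.
srm_get() {
    curl --fail -L --CApath "$CA_PATH" -E "$PROXY" \
        -o "$2" "$(srm_url "$1")"
}

# Upload LOCAL_FILE to REMOTE_PATH (end REMOTE_PATH in / to keep the local name).
srm_put() {
    curl --fail -L --CApath "$CA_PATH" -E "$PROXY" \
        --upload-file "$1" "$(srm_url "$2")"
}
```

With these defined, `srm_status README` prints the status code, `srm_get README README` downloads the file, and `srm_put test.txt some/dir/` uploads it, where `some/dir/` stands for a directory of your choice on the Grid SRM.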
Data Storage
Read Storage in the cloud