Gene, exon and transcript counts
Counts and Spearman correlations for run 1
Date: 06-November-2013
Analysis by: Peter-Bram 't Hoen
The combined gene counts for the 2330 samples from run 1 are available on the VM at /virdir/Backup/run_1_gene_counts/combined_gene_count_run_1.txt. They were generated using this script: R script for merging gene count tables
Subsequently, pairwise Spearman correlations between all samples were calculated: /virdir/Backup/run_1_gene_counts/Spearman_correlations_complete_gene_data_run_1.txt
From these, the median Spearman correlation of each sample with all other samples was calculated. This median is also called the D-statistic. The D-statistics (ranked from low to high) can be found in this file: Median Spearman correlations
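The D-statistic calculation described above can be sketched as follows. This is a minimal illustration in plain Python, not the R code used for the actual analysis. The sample names, the toy counts and the helper names (ranks, pearson, spearman, d_statistics) are all assumptions made for the example.

```python
from statistics import median


def ranks(values):
    """Return 1-based ranks; tied values all receive their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def d_statistics(counts):
    """Median Spearman correlation of each sample with all other samples.

    counts maps a sample name to its gene counts, with genes in the
    same order for every sample.
    """
    names = list(counts)
    return {
        a: median([spearman(counts[a], counts[b]) for b in names if b != a])
        for a in names
    }


# Hypothetical toy data: four samples, five genes each.
toy_counts = {
    "s1": [10, 200, 35, 0, 80],
    "s2": [12, 180, 40, 1, 75],
    "s3": [9, 220, 30, 0, 90],
    "bad": [150, 3, 0, 90, 20],
}

d_stats = d_statistics(toy_counts)

# Rank from low to high, as in the Median Spearman correlations file.
ranked = sorted(d_stats.items(), key=lambda item: item[1])
for name, value in ranked:
    print(name, round(value, 3))
```

Samples at the top of the ranked list correlate poorly with the rest of the run and are candidates for closer inspection.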
Boxplot of median Spearman correlations grouped by flowcell (Martijn Vermaat)
Boxplot of median Spearman correlations grouped by biobank (Dstat_biobank_boxplot.pdf)
After removing the two samples with very low Spearman correlations to all other samples, the distance matrix was calculated as 1 minus the correlation matrix, and a two-dimensional MDS plot was created using the R function cmdscale. This is the resulting MDS plot. The points were colored according to the following color scheme:
"LL" - gold
"RS" - blue
"CODAM" - orange
"LLS" - pink
"Amsterdam" - darkred
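The MDS step can be sketched as follows. The actual analysis used the R function cmdscale, which performs classical (metric) MDS. The code below is an illustrative plain-Python version of that technique: it double-centers the squared distances and takes the top eigenvectors, here found by simple power iteration. The function names, the toy correlation matrix and the biobank labels assigned to the toy samples are assumptions made for the example. Only the color scheme is taken from the list above. The power iteration assumes the largest eigenvalues are positive and is far too slow for thousands of samples.

```python
import math

# Color scheme from the list above.
BIOBANK_COLORS = {
    "LL": "gold",
    "RS": "blue",
    "CODAM": "orange",
    "LLS": "pink",
    "Amsterdam": "darkred",
}


def double_center(dist):
    """Return B = -0.5 * J * D^2 * J for a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]


def top_eigenpair(mat, iterations=500):
    """Dominant eigenvalue and unit eigenvector by power iteration."""
    n = len(mat)
    vec = [float(i + 1) for i in range(n)]  # fixed, non-constant start
    for _ in range(iterations):
        nxt = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, [0.0] * n
        vec = [x / norm for x in nxt]
    mv = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
    value = sum(a * b for a, b in zip(vec, mv))  # Rayleigh quotient
    return value, vec


def classical_mds(dist, k=2):
    """Embed n items in k dimensions from an n x n distance matrix."""
    b = double_center(dist)
    n = len(b)
    coords = [[0.0] * k for _ in range(n)]
    for axis in range(k):
        value, vec = top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i][axis] = vec[i] * scale
        # Deflate: remove the component just found before the next axis.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


# Hypothetical toy correlation matrix for four samples in two groups.
corr = [
    [1.00, 0.95, 0.80, 0.78],
    [0.95, 1.00, 0.82, 0.79],
    [0.80, 0.82, 1.00, 0.96],
    [0.78, 0.79, 0.96, 1.00],
]
biobanks = ["LL", "LL", "RS", "RS"]  # hypothetical labels

# Distance matrix: 1 - correlation matrix.
dist = [[1.0 - c for c in row] for row in corr]
coords = classical_mds(dist, k=2)

for (x, y), biobank in zip(coords, biobanks):
    print(round(x, 4), round(y, 4), BIOBANK_COLORS[biobank])
```

The sign and orientation of the MDS axes are arbitrary, so only the relative positions of the points are meaningful.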
Attachments (7)
- merge_count_script.r (756 bytes)
- Median_pairwise_spearman_correlations_complete_gene_data_run_1.txt (54.0 KB)
- Median_pairwise_spearman_correlations_by_flowcell_complete_gene_data_run_1.pdf (22.9 KB)
- mdsplot_filt_colored_biobank.pdf (21.1 KB)
- Dstat_biobank_boxplot.pdf (5.6 KB)
- VM_QC_correlations_only_expressed_genes.R (1.7 KB)
- mdsplot_filt_colored_gc.pdf (24.1 KB)