21 | | TODO. Including citations. |
| 25 | ==== Summary ==== |
| 26 | The study data were lifted over from human genome build 36 to build 37 using PLINK^([#hn 1])^ and UCSC liftOver, then aligned to the reference data and filtered on a minor allele frequency (MAF) above 1%, a Hardy-Weinberg equilibrium p-value threshold of 1e-4 and a call rate above 0.95. The study data were then pre-phased per chromosome using SHAPEIT2 v.2.644^([#hn 2])^. Finally, imputation was performed in genome chunks of 5 Mb using IMPUTE2 2.3.0^([#hn 3])^, with Genome of the Netherlands release 4 (499 unrelated individuals)^([#hn 4])^ and 1000 Genomes phase 1 integrated version 3 (1092 individuals)^([#hn 5])^ as reference panels. |
| 27 | |
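The quality-control thresholds and the 5 Mb chunking described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual code: the function names `passes_qc` and `chunk_intervals` are hypothetical, positions are assumed to be 1-based and inclusive, and the Hardy-Weinberg filter is assumed to keep variants whose p-value is above 1e-4 (the summary states only the threshold).

```python
from typing import Iterator, Tuple

CHUNK_SIZE = 5_000_000  # 5 Mb, the chunk size stated in the summary


def passes_qc(maf: float, hwe_p: float, call_rate: float) -> bool:
    """Return True if a variant meets the three thresholds from the summary.

    Assumption: variants are kept when the HWE p-value is ABOVE 1e-4;
    the summary gives the threshold but not the direction.
    """
    return maf > 0.01 and hwe_p > 1e-4 and call_rate > 0.95


def chunk_intervals(
    chrom_length: int, chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[int, int]]:
    """Yield consecutive (start, end) intervals covering 1..chrom_length.

    Intervals are 1-based and inclusive; the last one may be shorter.
    """
    start = 1
    while start <= chrom_length:
        end = min(start + chunk_size - 1, chrom_length)
        yield start, end
        start = end + 1


if __name__ == "__main__":
    # Hypothetical 12 Mb chromosome, split into three chunks.
    for start, end in chunk_intervals(12_000_000):
        print(start, end)
```

Each interval is independent of the others, which is what allows the chunks to be imputed as separate jobs; IMPUTE2 accepts such a region through its `-int` option.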
| 28 | We used MOLGENIS compute^([#hn 6])^ to implement the imputation pipeline, track all analysis jobs and distribute the imputation chunks in parallel across our PBS compute cluster and the national life science grid. All pipelines are available as open source via http://www.molgenis.org/wiki/ComputeStart. |
| 29 | |
| 30 | ==== Acknowledgements ==== |
| 31 | We would like to thank the Target project (http://www.rug.nl/target) for providing the compute infrastructure used for imputation and the BigGrid/eBioGrid project (http://www.ebiogrid.nl) for sponsoring the implementation of the imputation pipeline. |
| 32 | |
| 33 | ==== References ==== |
| 34 | 1. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC (2007) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics 81: 559–575. Available: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1950838/ |
| 35 | 2. Delaneau O, Zagury J-F, Marchini J (2013) Improved whole-chromosome phasing for disease and population genetic studies. Nature Methods 10: 5–6. Available: http://dx.doi.org/10.1038/nmeth.2307 |
| 36 | 3. Howie BN, Donnelly P, Marchini J (2009) A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genetics 5: e1000529. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2689936&tool=pmcentrez&rendertype=abstract |
| 37 | 4. Placeholder for GoNL paper |
| 38 | 5. The 1000 Genomes Project Consortium (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491. Available: http://www.nature.com/nature/journal/v491/n7422/full/nature11632.html |
| 39 | 6. Byelas H, Dijkstra M, Neerincx P, Van Dijk F, Kanterakis A, Deelen P, Swertz M (2013) Scaling bio-analyses from computational clusters to grids. IWSG 2013. |
| 40 | |
| 41 | [[br]] |
| 42 | [[br]] |
| 43 | |