{"id":1613,"date":"2012-11-12T11:00:00","date_gmt":"2012-11-12T19:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/msr_er\/2012\/11\/12\/supercomputing-on-demand-with-windows-azure\/"},"modified":"2017-04-13T11:11:50","modified_gmt":"2017-04-13T18:11:50","slug":"supercomputing-on-demand-with-microsoft-azure","status":"publish","type":"post","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/blog\/supercomputing-on-demand-with-microsoft-azure\/","title":{"rendered":"Supercomputing on Demand with Microsoft Azure"},"content":{"rendered":"<p><img decoding=\"async\" style=\"margin: 5px 6px; border: 0px currentColor; float: left;\" title=\"Searching for genetic causes of disease\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/MSDNBlogsFS\/prod.evol.blogs.msdn.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/01\/32\/81\/8546.Moondog_Blog_270wjpg.jpg\" alt=\"Searching for genetic causes of disease\" \/>Think about supercomputers of the recent past. Just 15 years ago, supercomputers were rare and exotic machines. Government laboratories in the United States and Japan spent hundreds of millions of dollars on custom computing rigs and specialized facilities to house them, in a bid to tackle the world\u2019s toughest problems.<\/p>\n<p>But now there is an alternative that is more attractive for scientists and businesses. Today, you can rent supercomputing horsepower by the hour online from public cloud providers. Amazing.<\/p>\n<p>Windows Azure can help ensure that you\u2019re not paying more than you can afford for your supercomputing time, and it makes overall management of large-scale computations very simple. Unlike other cloud providers, Windows Azure has no virtual machine (VM) image you need to manage or store in your account; with tens of thousands of instances, this could add up\u2014from both a management and a cost standpoint. 
And Windows Azure provides the operating system for you (and keeps it up to date with patches)\u2014you just copy your application to Windows Azure and run it in the cloud.<\/p>\n<p>The Microsoft HPC Pack 2012 (a free download that will be available from the Microsoft <a href=\"http:\/\/cm-edgetun.pages.dev\/en-us\/download\/default.aspx\" target=\"_blank\">Download Center<\/a> later this year) makes it very easy to manage compute resources and schedule your jobs in Windows Azure. You take the proven cluster management tool from Windows Server, connect it to Windows Azure, and then let it do the work. All you need to get started is a Windows Azure account. A set-up wizard takes care of the preparation, and the job scheduler runs your computations.<\/p>\n<p>What\u2019s more, there\u2019s no commitment: you can pay as you go, or you can negotiate a discount if you are going to use a lot of core hours.<\/p>\n<p>As Bill Hilf, general manager of product management for Windows Azure, observes, it\u2019s easy to manage a wide range of sizes and types of workloads on Windows Azure. Like Bill, we are extremely enthusiastic about the possibilities offered by the supercomputing prowess of Windows Azure. Such massive computational power is critical for \u201cbig data\u201d studies that increase our understanding of complex systems.<\/p>\n<p>The genome-wide association study (GWAS) is a case in point. Microsoft Research conducted a 27,000-core run on Windows Azure to crunch data from one such study. The nodes were busy for 72 hours and completed 1 million tasks\u2014the equivalent of approximately 1.9 million compute hours. If the same computation had been run on an 8-core system, it would have taken 25 years to complete!<\/p>\n<p>The GWAS offers a powerful approach to identifying genetic markers that are associated with human diseases. 
The Microsoft Research analysis used data from a Wellcome Trust study of the British population, which examined some 2,000 individuals and a shared set of about 13,000 controls for each of seven major diseases. But as in all genome-wide association studies, the analysis had to overcome a significant problem: to study the genetics of a particular condition, say heart disease, researchers need a large sample of people who have the disorder, which means that some of these people are likely to be related to one another\u2014even if it\u2019s a distant relationship. This means that certain positive associations between specific genes and heart disease are false positives, the result of two people sharing a common ancestor rather than sharing a common propensity for clogged coronaries. In other words, your sample is not truly random, and you must statistically correct for \u201cconfounding,\u201d which is caused by the relatedness of your subjects.<\/p>\n<p><iframe loading=\"lazy\" title=\"FaST-LMM and Windows Azure Put Genetics Research on Faster Track\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/yMWRGo6DOl8?feature=oembed&rel=0\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<p>This is not an insurmountable statistical problem: there are so-called linear mixed models (LMMs) that can eliminate the confounding. Using them, however, is a computational problem, because it takes an inordinately large amount of computer runtime and memory to run LMMs to account for the relatedness among thousands of people in your sample. In fact, the runtime and memory footprint that are required by these models scale as the cube and square of the number of individuals in the dataset, respectively. 
So, when you\u2019re dealing with a 10,000-person sample, the cost of the computer time and memory can quickly become prohibitive. And it is precisely these large datasets that offer the most promise for finding the connections between genetics and disease.<\/p>\n<p>To avoid this computational roadblock, Microsoft Research developed the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/um\/redmond\/projects\/mscompbio\/fastlmm\/\" target=\"_blank\">Factored Spectrally Transformed Linear Mixed Model<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (better known as FaST-LMM), an algorithm that extends the ability to detect new biological relations by using data that is several orders of magnitude larger. It allows much larger datasets to be processed and can, therefore, detect more subtle signals in the data.<\/p>\n<p>By using Windows Azure, Microsoft Research ran FaST-LMM on data from the Wellcome Trust, analyzing 63,524,915,020 pairs of genetic markers, looking for interactions among these markers for bipolar disease, coronary artery disease, hypertension, inflammatory bowel disease (Crohn\u2019s disease), rheumatoid arthritis, and type I and type II diabetes. 
The result: the discovery of new associations between the genome and these diseases\u2014discoveries that could presage potential breakthroughs in prevention and treatment.<\/p>\n<p>Results from individual pairs and the FaST-LMM algorithm are available via online query in <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/datamarket.azure.com\/dataset\/microsoftresearch\/EpistasisGWAS\" target=\"_blank\">Epistasis GWAS for 7 common diseases<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u00a0in the Windows Azure Marketplace (free access), so researchers can independently validate results that they find in their lab.<\/p>\n<p>Today\u2019s smartphones have put a computer in your pocket. Now, with cloud computing through Windows Azure, you have a supercomputer in your\u2014well, not in your pocket, but probably within your budget. Whatever your big-data concerns, Windows Azure can provide supercomputing power at an affordable price.<\/p>\n<p><em>\u2014<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/um\/people\/heckerman\/\" target=\"_blank\">David Heckerman<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Distinguished Scientist, Microsoft Research; Robert Davidson, Principal Software Architect, Microsoft Research, eScience; Carl Kadie, Principal Research Software Design Engineer, Microsoft Research, eScience; Jeff Baxter, Development Lead, Windows HPC, Microsoft; Jennifer Listgarten, Researcher, Microsoft Research Connections; and Christoph Lippert, Researcher, Microsoft Research Connections<\/em><\/p>\n<p><strong>Learn More<\/strong><\/p>\n<ul>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" 
href=\"http:\/\/research.microsoft.com\/en-us\/collaboration\/stories\/HW_FaST-LMM-Genetics-Research_CS.pdf\" target=\"_blank\">Case Study: FaST-LMM and Windows Azure Put Genetics Research on Faster Track<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u00a0(PDF file, 1.68 MB)<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.windowsazure.com\/en-us\/\" target=\"_blank\">Windows Azure<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/datamarket.azure.com\/dataset\/microsoftresearch\/EpistasisGWAS\" target=\"_blank\">Epistasis GWAS for 7 common diseases\u2014Windows Azure Marketplace<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/projects\/azure\/\" target=\"_blank\">Cloud Research Engagement<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/um\/redmond\/projects\/mscompbio\/fastlmm\/\" target=\"_blank\">FaST-LMM<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/apps\/video\/default.aspx?id=175586\" target=\"_blank\">Video: FaST-LMM and Windows Azure Put Genetics Research on Faster Track<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener 
noreferrer\" href=\"http:\/\/blogs.msdn.com\/b\/msr_er\/archive\/2011\/09\/19\/identifying-genetic-factors-in-disease-with-big-data.aspx\" target=\"_blank\">Blog: Identifying Genetic Factors in Disease with Big Data<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/collaboration\/focus\/health\/default.aspx\" target=\"_blank\">Health and Wellbeing on Microsoft Research Connections<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/collaboration\/focus\/escience\/default.aspx\" target=\"_blank\">eScience on Microsoft Research Connections<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Think about supercomputers of the recent past. Just 15 years ago, supercomputers were rare and exotic machines. Government laboratories in the United States and Japan spent hundreds of millions of dollars on custom computing rigs and specialized facilities to house them, in a bid to tackle the world\u2019s toughest problems. 
But now there is an [&hellip;]<\/p>\n","protected":false},"author":32627,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[205399,197938],"tags":[194615,186831,194814,186889,195055,187230,195361,195549,187077,195659,195664,195670,195751,186715,196230,196243,196415,196439,197343,197751,187239],"research-area":[13563],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-1613","post","type-post","status-publish","format-standard","hentry","category-azure","category-case-studies","tag-algorithm","tag-big-data","tag-bill-hilf","tag-cloud-computing","tag-cloud-services","tag-computer-science","tag-disease","tag-factored-spectrally-transformed-linear-mixed-model","tag-fast-lmm","tag-genetic-markers","tag-genome","tag-genome-wide-association-study","tag-gwas","tag-health","tag-linear-mixed-models","tag-lmm","tag-microsoft-hpc-pack-2012","tag-microsoft-research-connections","tag-supercomputing","tag-wellcome-trust-study","tag-windows-azure","msr-research-area-data-platform-analytics","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"November 12, 2012","formattedExcerpt":"Think about supercomputers of the recent past. Just 15 years ago, supercomputers were rare and exotic machines. 
Government laboratories in the United States and Japan spent hundreds of millions of dollars on custom computing rigs and specialized facilities to house them, in a bid to&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/1613","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/32627"}],"replies":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1613"}],"version-history":[{"count":4,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/1613\/revisions"}],"predecessor-version":[{"id":377558,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/1613\/revisions\/377558"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1613"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1613"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1613"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1613"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1613"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=1613"},{"taxonomy":"msr-locale","embeddable":true,"href":"https
:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1613"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1613"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1613"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1613"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1613"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}