{"id":5251,"date":"2016-02-23T09:00:45","date_gmt":"2016-02-23T17:00:45","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/msr_er\/?p=5251"},"modified":"2016-07-20T07:28:40","modified_gmt":"2016-07-20T14:28:40","slug":"microsoft-opens-up-online-infrastructure-to-the-research-community","status":"publish","type":"post","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/blog\/microsoft-opens-up-online-infrastructure-to-the-research-community\/","title":{"rendered":"Microsoft opens up online infrastructure to the research community"},"content":{"rendered":"<p>As our lives become increasingly conducted online, the growth rate of data recording our daily activities has exploded. These records, which range from the details of our online social engagements to the graphs gathered by major search engine companies representing a snapshot of our collective curiosity and knowledge, have largely been kept in the hands of the corporations that collect them, out of reach of the research community.<\/p>\n<p>A recent research challenge changed all that. 
The <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"WSDM Cup\" href=\"https:\/\/wsdmcupchallenge.azurewebsites.net\/\" target=\"_blank\">WSDM Cup,<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> a competition run by Microsoft Research in partnership with the 9th ACM Conference on Web Search and Data Mining (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"WSDM\" href=\"http:\/\/www.wsdm-conference.org\/2016\/\" target=\"_blank\">WSDM<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>), gave researchers open access to the data in Microsoft\u2019s academic graph to assess the query-independent importance of scholarly articles in the graph.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/02\/WSDM.jpg\"><img decoding=\"async\" class=\"aligncenter wp-image-5382 size-full\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/02\/WSDM.jpg\" alt=\"WSDM Cup participants and workshop attendees\" height=\"100%\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/02\/WSDM.jpg 900w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/02\/WSDM-300x200.jpg 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/02\/WSDM-768x512.jpg 768w\" sizes=\"(max-width: 900px) 100vw, 900px\" \/><\/a><em>WSDM Cup participants and workshop attendees<\/em><\/p>\n<p>\u201cThe WSDM Cup is the first time a major commercial search engine, Microsoft Academic powered by Bing, has opened its data to the academic community for research,\u201d said Kuansan Wang, Director of Microsoft Research\u2019s Internet Services Research Center. 
The graph, a continuously growing collection of millions of pieces of information about scientific publications, authors, institutions, journals, conferences, and fields of study, is the largest such graph in existence.<\/p>\n<p>\u201cWe are also opening the graph\u2019s back-end <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Academic Knowledge API\" href=\"http:\/\/aka.ms\/academicapi\" target=\"_blank\">Academic Knowledge API<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u201d said Wang. The service-based API enables researchers to access fresh data from the web crawled by an industrial-grade search engine. \u201cThe community can build on top of our baseline system and test innovative ideas.\u201d<\/p>\n<p>The result? Another first: A graph of web data about researchers, built on and contributed to by both internal and external researchers.<\/p>\n<p><strong>Challenge accepted<\/strong><\/p>\n<p>The goal of the WSDM Cup was to provide the best static rank values for each publication in the Microsoft Academic Graph. The challenge drew 80 teams from 34 institutions across 13 countries, who competed fiercely over a two-month period.<\/p>\n<p>In a sign of technological advancement and a milestone for computer science research, more than half of the final submissions outperformed the seminal PageRank algorithm, which was first popularized by the web search giant Google and is still used by many publishers and independent consulting firms to assess the impact of scholarly research.<\/p>\n<p>\u201cThe most commonly used measures of importance and impact in scholarship, such as citation counts, Journal Impact Factor, and <em>h-index<\/em>, are one-dimensional, looking solely at the citations between publications,\u201d said workshop co-chair Alex Wade, Director of Scholarly Communications at Microsoft Research. 
\u201cBut as we look into the richer and more varied relationships between the people, places, and things that make up the scholarly record, new opportunities for ranking and evaluation emerge. A key goal of this challenge was to test whether that heterogeneity can actually lead to improved ranking solutions.\u201d<\/p>\n<p>The top eight teams from the cup\u2019s first phase were invited to participate in Phase 2 of the challenge and to present their approaches at a WSDM conference workshop. During Phase 2, the top eight datasets were used to power the ranker used by Bing for academic queries.<\/p>\n<p>The WSDM Cup Entity Ranking Challenge <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Workshop\" href=\"https:\/\/wsdmcupchallenge.azurewebsites.net\/Home\/Workshop\" target=\"_blank\">Workshop<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> brought together researchers in the areas of data mining and large heterogeneous networks. Organized by Northeastern University, Elsevier, and Microsoft Research, the workshop was held this week in San Francisco, showcasing the efforts of the leading teams in the 2016 WSDM Cup.<\/p>\n<blockquote><p><em>\u201cPeople are really excited that Microsoft was willing to share this data with the research community,\u201d said Jevin West, Assistant Professor at the University of Washington, workshop attendee, and a member of the second-place Eigenfactor team. \u201cWe\u2019re really grateful Microsoft engaged with the research community on this important problem of finding better ways of searching scientific content. 
I hope we can continue seeing these collaborations between industry and the research community.\u201d<\/em><\/p><\/blockquote>\n<p><strong>New challenges and opportunities<\/strong><\/p>\n<p>The WSDM Cup\u2019s online experiments allowed participants to see how their algorithms performed in front of real users, in addition to the traditional static evaluation found in similar competitions. The live competition generated considerable excitement among the internet research crowd, and conference organizers agreed.<\/p>\n<p>Based on the widespread popularity of the WSDM Cup challenge, Microsoft has been invited to partner in two similar competitions this year. Microsoft will run the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"2016 KDD Cup\" href=\"http:\/\/kddcup2016.azurewebsites.net\/\" target=\"_blank\">2016 KDD Cup<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> in partnership with the newly established Big Scholarly Data Institute in Tsinghua, Beijing. 
And to strengthen the research community\u2019s understanding and use of online evaluations as a key part of modern information retrieval, Microsoft has partnered with TREC, overseen by the National Institute of Standards and Technology, to support a new <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"OpenSearch\" href=\"http:\/\/trec-open-search.org\/\" target=\"_blank\">OpenSearch<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> track that will be run for the first time at TREC 2016.<\/p>\n<p>\u201cThis is the beginning of a whole new era in data access and analysis that will benefit the research community for many years to come,\u201d said Wade.<\/p>\n<p><em>\u2014Christine Clifton-Thornton, Senior Writer, Microsoft Research<\/em><\/p>\n<p><strong>Learn more<\/strong><\/p>\n<ul>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"The winning WSDM Cup solution\" href=\"https:\/\/doc.co\/LLcDRM\/7HPGvd\" target=\"_blank\">The winning WSDM Cup solution, <em>An Efficient Solution to Reinforce Paper Ranking using Author\/Venue\/Citation Information<\/em><span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u00a0(paper)<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Top eight WSDM Cup teams\" href=\"https:\/\/docs.com\/alex-wade\/4623\/wsdm-cup-2016%20\" target=\"_blank\">Top eight WSDM Cup teams <span class=\"sr-only\"> (opens in new tab)<\/span><\/a>(papers)<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"https:\/\/wsdmcupchallenge.azurewebsites.net\/Home\/Leaderboard\" target=\"_blank\">2016 WSDM Cup Leaderboard<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link 
glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Academic Knowledge API\" href=\"https:\/\/www.projectoxford.ai\/academic\" target=\"_blank\">Academic Knowledge API<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>As our lives become increasingly conducted online, the growth rate of data recording our daily activities has exploded. These records, which range from the details of our online social engagements to the graphs gathered by major search engine companies representing a snapshot of our collective curiosity and knowledge, have largely been kept in the hands [&hellip;]<\/p>\n","protected":false},"author":32627,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[194460],"tags":[194547,194549,194561,194611,186854,196110,196909,186439,197825],"research-area":[13563,13555],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-5251","post","type-post","status-publish","format-standard","hentry","category-search-and-information-retrieval","tag-academic-graph","tag-academic-knowledge-api","tag-acm","tag-alex-wade","tag-data-mining","tag-kdd-cup","tag-project-oxford","tag-web-search","tag-wsdm-cup","msr-research-area-data-platform-analytics","msr-research-area-search-information-retrieval","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],
"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"February 23, 2016","formattedExcerpt":"As our lives become increasingly conducted online, the growth rate of data recording our daily activities has exploded. These records, which range from the details of our online social engagements to the graphs gathered by major search engine companies representing a snapshot of our collective&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/5251","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/32627"}],"replies":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/comments?post=5251"}],"version-history":[{"count":1,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/5251\/revisions"}],"predecessor-version":[{"id":236913,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/5251\/revisions\/236913"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=5251"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/categories?post=5251"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/tags?post=5251"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=5251"},{"taxonomy":"msr-region","embeddable":true,"href
":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=5251"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=5251"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=5251"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=5251"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=5251"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=5251"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=5251"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}