{"id":722389,"date":"2021-02-03T09:04:55","date_gmt":"2021-02-03T17:04:55","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/?p=722389"},"modified":"2021-02-11T12:37:06","modified_gmt":"2021-02-11T20:37:06","slug":"microsoft-vision-model-resnet-50-combines-web-scale-data-and-multi-task-learning-to-achieve-state-of-the-art","status":"publish","type":"post","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/blog\/microsoft-vision-model-resnet-50-combines-web-scale-data-and-multi-task-learning-to-achieve-state-of-the-art\/","title":{"rendered":"Microsoft Vision Model ResNet-50 combines web-scale data and multi-task learning to achieve state of the art"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_graph_animation_no_logo_.gif\" alt=\"Two part animation: \n\nIn part one, an animation graphs shows Microsoft Vision Model ResNet-50 is a state-of-the-art pretrained ResNet-50 model,   \nmeasured above by the mean average score across seven popular computer vision benchmarks. \n\nIn part two, three models are represented through visuals and text. \"\/><figcaption>Microsoft Vision Model ResNet-50 is a state-of-the-art pretrained ResNet-50 model, measured above by the mean average score across seven popular computer vision benchmarks.<\/figcaption><\/figure>\n\n\n\n<p>Pretrained vision models accelerate deep learning research and bring down the cost of performing computer vision tasks in production. By pretraining one large vision model to learn general visual representation of images, then transferring the learning across multiple downstream tasks, a team achieves competitive performance at a fraction of the cost when compared to collecting new data and training a new model for each task. Further fine-tuning of the pretrained model with task-specific training data often yields even higher performance than training specialized models.<\/p>\n\n\n\n<p>Microsoft Vision Model ResNet-50 is a large pretrained vision model created by the Multimedia Group at Microsoft Bing. The model is built using the search engine\u2019s web-scale image data in order to power its <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/www.bing.com\/images\/trending\">Image Search<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/www.bing.com\/visualsearch\/\">Visual Search<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. We are excited to announce that we are making Microsoft Vision Model ResNet-50 publicly available today.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><\/td><td><strong>Microsoft <br>Vision Model<\/strong><\/td><td><strong>Google <br>Big Transfer<\/strong><\/td><td><strong>OpenAI <br>CLIP<\/strong><\/td><td><strong>PyTorch <br>ResNet-50<\/strong><\/td><\/tr><tr><td><strong>CIFAR-10<\/strong><\/td><td>92.64<\/td><td>92.51<\/td><td>87.85<\/td><td>82.23<\/td><\/tr><tr><td><strong>CIFAR-100<\/strong><\/td><td>76.05<\/td><td>79.84<\/td><td>67.02<\/td><td>61.36<\/td><\/tr><tr><td><strong>STL-10<\/strong><\/td><td>98.10<\/td><td>98.71<\/td><td>97.20<\/td><td>96.32<\/td><\/tr><tr><td><strong>SVHN<\/strong><\/td><td>72.64<\/td><td>64.22<\/td><td>64.33<\/td><td>52.05<\/td><\/tr><tr><td><strong>CUB<\/strong><\/td><td>82.20<\/td><td>82.75<\/td><td>68.38<\/td><td>38.79<\/td><\/tr><tr><td><strong>Flowers-102<\/strong><\/td><td>99.28<\/td><td>99.38<\/td><td>95.23<\/td><td>77.62<\/td><\/tr><tr><td><strong>ImageNet <\/strong><\/td><td>73.85<\/td><td>72.83<\/td><td>57.00<\/td><td>75.63<\/td><\/tr><tr><td><strong>Average<\/strong><\/td><td>84.97<\/td><td>84.32<\/td><td>76.72<\/td><td>69.14<\/td><\/tr><\/tbody><\/table><figcaption>Evaluation of Microsoft Vision Model ResNet-50 and comparable models on seven popular computer vision benchmarks<\/figcaption><\/figure>\n\n\n\n<p>We evaluate Microsoft Vision Model ResNet-50 against the state-of-the-art pretrained ResNet-50 models and the baseline PyTorch implementation of ResNet-50, following the experiment setup of <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/cdn.openai.com\/papers\/Learning_Transferable_Visual_Models_From_Natural_Language_Supervision.pdf\">OpenAI CLIP<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. Linear probe is a standard evaluation protocol for representation learning in which a linear classifier is trained on frozen embeddings of the pretrained vision model for each benchmark.<\/p>\n\n\n\n<p>To achieve the state-of-the-art performance in a cost-sensitive production setting, Microsoft Vision Model ResNet-50 leverages multi-task learning and optimizes separately for four datasets, including <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1409.0575\">ImageNet-22k<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/microsoft-coco-common-objects-in-context\/\">Microsoft COCO<\/a>, and two <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/arxiv.org\/pdf\/1906.03219v1.pdf\">web-supervised<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> datasets containing 40 million image-label pairs collected from image search engines.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/Argus-Vision-Model-Blog-Fig-1.jpg\" alt=\"\" class=\"wp-image-722407\" width=\"-145\" height=\"-122\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/Argus-Vision-Model-Blog-Fig-1.jpg 742w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/Argus-Vision-Model-Blog-Fig-1-300x253.jpg 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/Argus-Vision-Model-Blog-Fig-1-14x12.jpg 14w\" sizes=\"(max-width: 742px) 100vw, 742px\" \/><figcaption>Multi-task learning via hard parameter sharing<\/figcaption><\/figure><\/div>\n\n\n\n<p>We chose to use multi-task learning with hard parameter sharing&nbsp;(see&nbsp;figure above). Single neural networks&nbsp;will optimize each classification problem at the same time. By using tasks of varied sizes\u2014up&nbsp;to 40&nbsp;million&nbsp;images with 100,000&nbsp;different labels from web-supervised sources\u2014Microsoft Vision&nbsp;Model&nbsp;ResNet-50&nbsp;can achieve high robustness and good transferability to different domains.&nbsp;&nbsp;<\/p>\n\n\n\n\n\t<div class=\"border-bottom border-top border-gray-300 mt-5 mb-5 msr-promo text-center text-md-left alignwide\" data-bi-aN=\"promo\" data-bi-id=\"1002645\">\n\t\t\n\n\t\t<p class=\"msr-promo__label text-gray-800 text-center text-uppercase\">\n\t\t<span class=\"px-4 bg-white display-inline-block font-weight-semibold small\">Spotlight: AI-POWERED EXPERIENCE<\/span>\n\t<\/p>\n\t\n\t<div class=\"row pt-3 pb-4 align-items-center\">\n\t\t\t\t\t\t<div class=\"msr-promo__media col-12 col-md-5\">\n\t\t\t\t<a class=\"bg-gray-300 display-block\" href=\"https:\/\/aka.ms\/research-copilot\/?OCID=msr_researchforum_Copilot_MCR_Blog_Promo\" aria-label=\"Microsoft research copilot experience\" data-bi-cN=\"Microsoft research copilot experience\" target=\"_blank\">\n\t\t\t\t\t<img decoding=\"async\" class=\"w-100 display-block\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/01\/MSR-Chat-Promo.png\" alt=\"\" \/>\n\t\t\t\t<\/a>\n\t\t\t<\/div>\n\t\t\t\n\t\t\t<div class=\"msr-promo__content p-3 px-5 col-12 col-md\">\n\n\t\t\t\t\t\t\t\t\t<h2 class=\"h4\">Microsoft research copilot experience<\/h2>\n\t\t\t\t\n\t\t\t\t\t\t\t\t<p id=\"microsoft-research-copilot-experience\" class=\"large\">Discover more about research at Microsoft through our AI-powered experience<\/p>\n\t\t\t\t\n\t\t\t\t\t\t\t\t<div class=\"wp-block-buttons justify-content-center justify-content-md-start\">\n\t\t\t\t\t<div class=\"wp-block-button\">\n\t\t\t\t\t\t<a href=\"https:\/\/aka.ms\/research-copilot\/?OCID=msr_researchforum_Copilot_MCR_Blog_Promo\" aria-describedby=\"microsoft-research-copilot-experience\" class=\"btn btn-brand glyph-append glyph-append-chevron-right\" data-bi-cN=\"Microsoft research copilot experience\" target=\"_blank\">\n\t\t\t\t\t\t\tStart now\t\t\t\t\t\t<\/a>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t<\/div><!--\/.msr-promo__content-->\n\t<\/div><!--\/.msr-promo__inner-wrap-->\n\t<\/div><!--\/.msr-promo-->\n\t\n\n\n\n<p>During training, images from each dataset are sampled proportionally to the size of the datasets. This approach promotes larger datasets at first, but once optimization flattens, the optimizer needs to look for improvements to smaller datasets without downgrading the performance of the larger ones. As a result, final accuracy scores for each of the datasets are competitive with specialized models trained on each specific dataset.<\/p>\n\n\n\n<h2 id=\"dive-deeper-into-microsoft-vision-model-resnet-50\">Dive deeper into Microsoft Vision Model ResNet-50<\/h2>\n\n\n\n<div class=\"annotations \" data-bi-aN=\"margin-callout\">\n\t<article class=\"annotations__list card depth-16 bg-body p-4 annotations__list--right\">\n\t\t<div class=\"annotations__list-item\">\n\t\t\t\t\t\t\t<a href=\"https:\/\/note.microsoft.com\/MSR-Webinar-Vision-Model-Registration-Live.html?wt.mc_id=blog_MSR-WBNR_bingmulti_margin\" target=\"_self\" aria-label=\"Microsoft Vision Model ResNet-50: Pretrained vision model built with web-scale data\" data-bi-type=\"annotated-link\" data-bi-cN=\"Microsoft Vision Model ResNet-50: Pretrained vision model built with web-scale data\" class=\"annotations__list-thumbnail\" >\n\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"170\" height=\"96\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-300x169.png\" class=\"mb-2\" alt=\"Microsoft Research Webinars\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-300x169.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-1024x576.png 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-768x432.png 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-1536x865.png 1536w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-2048x1153.png 2048w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-16x9.png 16w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-1066x600.png 1066w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-655x368.png 655w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-343x193.png 343w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-640x360.png 640w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-960x540.png 960w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-1280x720.png 1280w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2020\/12\/1400x788_Webinar_Social_Asset_noicons-1920x1080.png 1920w\" sizes=\"auto, (max-width: 170px) 100vw, 170px\" \/>\t\t\t\t<\/a>\n\t\t\t\t\t\t\t<span class=\"annotations__type d-block text-uppercase font-weight-semibold text-neutral-300 small\">Webinar <\/span>\n\t\t\t<a href=\"https:\/\/note.microsoft.com\/MSR-Webinar-Vision-Model-Registration-Live.html?wt.mc_id=blog_MSR-WBNR_bingmulti_margin\" data-bi-cN=\"Microsoft Vision Model ResNet-50: Pretrained vision model built with web-scale data\" data-external-link=\"false\" data-bi-aN=\"margin-callout\" data-bi-type=\"annotated-link\" class=\"annotations__link font-weight-semibold text-decoration-none\"><span>Microsoft Vision Model ResNet-50: Pretrained vision model built with web-scale data<\/span>&nbsp;<span class=\"glyph-in-link glyph-append glyph-append-chevron-right\" aria-hidden=\"true\"><\/span><\/a>\t\t\t\t\t<\/div>\n\t<\/article>\n<\/div>\n\n\n\n<div class=\"annotations \" data-bi-aN=\"margin-callout\">\n\t<article class=\"annotations__list card depth-16 bg-body p-4 annotations__list--left\">\n\t\t<div class=\"annotations__list-item\">\n\t\t\t\t\t\t<span class=\"annotations__type d-block text-uppercase font-weight-semibold text-neutral-300 small\">Code<\/span>\n\t\t\t<a href=\"\" data-bi-cN=\"Microsoft Vision Model ResNet-50\" data-external-link=\"false\" data-bi-aN=\"margin-callout\" data-bi-type=\"annotated-link\" class=\"annotations__link font-weight-semibold text-decoration-none\"><span>Microsoft Vision Model ResNet-50<\/span>&nbsp;<span class=\"glyph-in-link glyph-append glyph-append-chevron-right\" aria-hidden=\"true\"><\/span><\/a>\t\t\t\t\t<\/div>\n\t<\/article>\n<\/div>\n\n\n\n<p>You can get your hands on Microsoft Vision Model ResNet-50 by visiting <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/aka.ms\/microsoftvision\">https:\/\/aka.ms\/microsoftvision<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. On this webpage, you will find a description of how to install and use the model to encode images into embedding vectors. We are also hosting a public webinar about our model on February 25 at 10 AM PT. Part of the webinar will be a demo of applying the model to example computer vision tasks, and there will be a live Q&A session at the end. You can <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/note.microsoft.com\/MSR-Webinar-Vision-Model-Registration-Live.html?wt.mc_id=blog_MSR-WBNR_bingmulti_link\">learn more and register<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> for the webinar at its registration page.<\/p>\n\n\n\n<h3 id=\"acknowledgments\">Acknowledgments<\/h3>\n\n\n\n<p>Microsoft Vision Model ResNet-50 is one in a family of world-class computer vision models we\u2019ve built at Microsoft Bing Multimedia Group. We thank Mark Bolin, Ravi Yada, Kun Wu, Meenaz Merchant, Arun Sacheti, and Jordi Ribas for enabling this work. If the opportunity to build pioneering computer vision models excites you, visit our <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/careers.microsoft.com\/us\/en\/search-results?keywords=bing%20multimedia\">career page<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> to learn about our openings.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pretrained vision models accelerate deep learning research and bring down the cost of performing computer vision tasks in production. By pretraining one large vision model to learn general visual representation of images, then transferring the learning across multiple downstream tasks, a team achieves competitive performance at a fraction of the cost when compared to collecting [&hellip;]<\/p>\n","protected":false},"author":38838,"featured_media":722935,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[],"research-area":[13562],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-722389","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-computer-vision","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[{"type":"guest","value":"zygmunt-lenyk","user_id":"722944","display_name":"Zygmunt Lenyk","author_link":"<a href=\"https:\/\/www.linkedin.com\/in\/zlenyk\/\" aria-label=\"Visit the profile page for Zygmunt Lenyk\">Zygmunt Lenyk<\/a>","is_active":true,"last_first":"Lenyk, Zygmunt","people_section":0,"alias":"zygmunt-lenyk"},{"type":"guest","value":"junwon-park","user_id":"722941","display_name":"Junwon  Park","author_link":"<a href=\"https:\/\/www.linkedin.com\/in\/productceo\/\" aria-label=\"Visit the profile page for Junwon  Park\">Junwon  Park<\/a>","is_active":true,"last_first":"Park, Junwon ","people_section":0,"alias":"junwon-park"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-960x540.jpg\" class=\"img-object-cover\" alt=\"Graphs shows Microsoft Vision Model ResNet-50 is a state-of-the-art pretrained ResNet-50 model, measured above by the mean average score across seven popular computer vision benchmarks.\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-960x540.jpg 960w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-300x169.jpg 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-1024x576.jpg 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-768x432.jpg 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-1536x864.jpg 1536w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-2048x1152.jpg 2048w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-16x9.jpg 16w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-1066x600.jpg 1066w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-655x368.jpg 655w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-343x193.jpg 343w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-640x360.jpg 640w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-1280x720.jpg 1280w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/1400x788_Visual_language_no_logo_still-2-1920x1080.jpg 1920w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"<a href=\"https:\/\/www.linkedin.com\/in\/zlenyk\/\" title=\"Go to researcher profile for Zygmunt Lenyk\" aria-label=\"Go to researcher profile for Zygmunt Lenyk\" data-bi-type=\"byline author\" data-bi-cN=\"Zygmunt Lenyk\">Zygmunt Lenyk<\/a> and <a href=\"https:\/\/www.linkedin.com\/in\/productceo\/\" title=\"Go to researcher profile for Junwon  Park\" aria-label=\"Go to researcher profile for Junwon  Park\" data-bi-type=\"byline author\" data-bi-cN=\"Junwon  Park\">Junwon  Park<\/a>","formattedDate":"February 3, 2021","formattedExcerpt":"Pretrained vision models accelerate deep learning research and bring down the cost of performing computer vision tasks in production. By pretraining one large vision model to learn general visual representation of images, then transferring the learning across multiple downstream tasks, a team achieves competitive performance&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/722389","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/38838"}],"replies":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/comments?post=722389"}],"version-history":[{"count":11,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/722389\/revisions"}],"predecessor-version":[{"id":725695,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/722389\/revisions\/725695"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media\/722935"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=722389"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/categories?post=722389"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/tags?post=722389"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=722389"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=722389"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=722389"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=722389"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=722389"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=722389"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=722389"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=722389"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}