{"id":493301,"date":"2018-07-02T07:29:36","date_gmt":"2018-07-02T14:29:36","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/?post_type=msr-research-item&#038;p=493301"},"modified":"2018-10-16T22:34:53","modified_gmt":"2018-10-17T05:34:53","slug":"plug-in-regularized-estimation-of-high-dimensional-parameters-in-nonlinear-semiparametric-models","status":"publish","type":"msr-research-item","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/plug-in-regularized-estimation-of-high-dimensional-parameters-in-nonlinear-semiparametric-models\/","title":{"rendered":"Plug-in Regularized Estimation of High-Dimensional Parameters in Nonlinear Semiparametric Models"},"content":{"rendered":"<p><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\">We develop a theory for estimation of a high-dimensional sparse parameter <\/span><span id=\"MathJax-Element-1-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-1\" class=\"math\"><span id=\"MathJax-Span-2\" class=\"mrow\"><span id=\"MathJax-Span-3\" class=\"mi\">\u03b8<\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\"> defined as a minimizer of a population loss function <\/span><span id=\"MathJax-Element-2-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-4\" class=\"math\"><span id=\"MathJax-Span-5\" class=\"mrow\"><span id=\"MathJax-Span-6\" class=\"msubsup\"><span id=\"MathJax-Span-7\" class=\"mi\">L<\/span><span id=\"MathJax-Span-8\" 
class=\"mi\">D<\/span><\/span><span id=\"MathJax-Span-9\" class=\"mo\">(<\/span><span id=\"MathJax-Span-10\" class=\"mi\">\u03b8<\/span><span id=\"MathJax-Span-11\" class=\"mo\">,<\/span><span id=\"MathJax-Span-12\" class=\"msubsup\"><span id=\"MathJax-Span-13\" class=\"mi\">g<\/span><span id=\"MathJax-Span-14\" class=\"mn\">0<\/span><\/span><span id=\"MathJax-Span-15\" class=\"mo\">)<\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\"> which, in addition to <\/span><span id=\"MathJax-Element-3-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-16\" class=\"math\"><span id=\"MathJax-Span-17\" class=\"mrow\"><span id=\"MathJax-Span-18\" class=\"mi\">\u03b8<\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\">, depends on a, potentially infinite dimensional, nuisance parameter <\/span><span id=\"MathJax-Element-4-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-19\" class=\"math\"><span id=\"MathJax-Span-20\" class=\"mrow\"><span id=\"MathJax-Span-21\" class=\"msubsup\"><span id=\"MathJax-Span-22\" class=\"mi\">g<\/span><span id=\"MathJax-Span-23\" class=\"mn\">0<\/span><\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 
0px\">. Our approach is based on estimating <\/span><span id=\"MathJax-Element-5-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-24\" class=\"math\"><span id=\"MathJax-Span-25\" class=\"mrow\"><span id=\"MathJax-Span-26\" class=\"mi\">\u03b8<\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\"> via an <\/span><span id=\"MathJax-Element-6-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-27\" class=\"math\"><span id=\"MathJax-Span-28\" class=\"mrow\"><span id=\"MathJax-Span-29\" class=\"msubsup\"><span id=\"MathJax-Span-30\" class=\"mi\">\u2113<\/span><span id=\"MathJax-Span-31\" class=\"mn\">1<\/span><\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\">-regularized minimization of a sample analog of <\/span><span id=\"MathJax-Element-7-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-32\" class=\"math\"><span id=\"MathJax-Span-33\" class=\"mrow\"><span id=\"MathJax-Span-34\" class=\"msubsup\"><span id=\"MathJax-Span-35\" class=\"mi\">L<\/span><span id=\"MathJax-Span-36\" class=\"mi\">S<\/span><\/span><span id=\"MathJax-Span-37\" class=\"mo\">(<\/span><span id=\"MathJax-Span-38\" class=\"mi\">\u03b8<\/span><span id=\"MathJax-Span-39\" class=\"mo\">,<\/span><span id=\"MathJax-Span-40\" class=\"texatom\"><span id=\"MathJax-Span-41\" class=\"mrow\"><span id=\"MathJax-Span-42\" class=\"munderover\"><span id=\"MathJax-Span-43\" class=\"mi\">g<\/span><span id=\"MathJax-Span-44\" 
class=\"mo\">^<\/span><\/span><\/span><\/span><span id=\"MathJax-Span-45\" class=\"mo\">)<\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\">, plugging in a first-stage estimate <\/span><span id=\"MathJax-Element-8-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-46\" class=\"math\"><span id=\"MathJax-Span-47\" class=\"mrow\"><span id=\"MathJax-Span-48\" class=\"texatom\"><span id=\"MathJax-Span-49\" class=\"mrow\"><span id=\"MathJax-Span-50\" class=\"munderover\"><span id=\"MathJax-Span-51\" class=\"mi\">g<\/span><span id=\"MathJax-Span-52\" class=\"mo\">^<\/span><\/span><\/span><\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\">, computed on a hold-out sample. 
We define a population loss to be (Neyman) orthogonal if the gradient of the loss with respect to <\/span><span id=\"MathJax-Element-9-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-53\" class=\"math\"><span id=\"MathJax-Span-54\" class=\"mrow\"><span id=\"MathJax-Span-55\" class=\"mi\">\u03b8<\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\"> has a pathwise derivative with respect to <\/span><span id=\"MathJax-Element-10-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-56\" class=\"math\"><span id=\"MathJax-Span-57\" class=\"mrow\"><span id=\"MathJax-Span-58\" class=\"mi\">g<\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\"> equal to zero when evaluated at the true parameter and nuisance component. We show that orthogonality implies that the first-stage nuisance error has only a second-order impact on the second-stage estimate of the target parameter. Our approach applies to both convex and non-convex losses, although the latter case requires a small adaptation of our method: a preliminary estimation step for the target parameter. 
Our result enables oracle convergence rates for <\/span><span id=\"MathJax-Element-11-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-59\" class=\"math\"><span id=\"MathJax-Span-60\" class=\"mrow\"><span id=\"MathJax-Span-61\" class=\"mi\">\u03b8<\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\"> under assumptions on the first stage rates, typically of the order of <\/span><span id=\"MathJax-Element-12-Frame\" class=\"MathJax\"><span id=\"MathJax-Span-62\" class=\"math\"><span id=\"MathJax-Span-63\" class=\"mrow\"><span id=\"MathJax-Span-64\" class=\"msubsup\"><span id=\"MathJax-Span-65\" class=\"mi\">n<\/span><span id=\"MathJax-Span-66\" class=\"texatom\"><span id=\"MathJax-Span-67\" class=\"mrow\"><span id=\"MathJax-Span-68\" class=\"mo\">\u2212<\/span><span id=\"MathJax-Span-69\" class=\"mn\">1<\/span><span id=\"MathJax-Span-70\" class=\"texatom\"><span id=\"MathJax-Span-71\" class=\"mrow\"><span id=\"MathJax-Span-72\" class=\"mo\">\/<\/span><\/span><\/span><span id=\"MathJax-Span-73\" class=\"mn\">4<\/span><\/span><\/span><\/span><\/span><\/span><\/span><span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\">. 
<\/span><br \/>\n<span style=\"float: none;background-color: transparent;color: #000000;font-family: 'Lucida Grande',helvetica,arial,verdana,sans-serif;font-size: 14.4px;font-style: normal;font-variant: normal;font-weight: 400;letter-spacing: normal;line-height: 20px;text-align: left;text-decoration: none;text-indent: 0px\">We show how such an orthogonal loss can be constructed via a novel orthogonalization process for a general model defined by conditional moment restrictions. We apply our theory to high-dimensional versions of standard estimation problems in statistics and econometrics, such as estimation of conditional moment models with missing data, estimation of structural utilities in games of incomplete information, and estimation of treatment effects in regression models with non-linear link functions. <\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>We develop a theory for estimation of a high-dimensional sparse parameter \u03b8 defined as a minimizer of a population loss function LD(\u03b8,g0) which, in addition to \u03b8, depends on a, potentially infinite dimensional, nuisance parameter g0. 
Our approach is based on estimating \u03b8 via an \u21131-regularized minimization of a sample analog of LS(\u03b8,g^), plugging in [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2018-06-13","msr_highlight_text":"","msr_notes":"arXiv preprint 
arXiv:1806.04823","msr_longbiography":"","msr_publicationurl":"https:\/\/arxiv.org\/abs\/1806.04823","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13556,13548],"msr-publication-type":[193726],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-493301","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-economics","msr-locale-en_us"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2018-06-13","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"arXiv preprint 
arXiv:1806.04823","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"https:\/\/arxiv.org\/abs\/1806.04823","msr_doi":"","msr_publication_uploader":[{"type":"url","title":"https:\/\/arxiv.org\/abs\/1806.04823","viewUrl":false,"id":false,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":0,"url":"https:\/\/arxiv.org\/abs\/1806.04823"}],"msr-author-ordering":[{"type":"text","value":"Victor Chernozhukov","user_id":0,"rest_url":false},{"type":"text","value":"Denis Nekipelov","user_id":0,"rest_url":false},{"type":"text","value":"Vira Semenova","user_id":0,"rest_url":false},{"type":"edited_text","value":"Vasilis Syrgkanis","user_id":34499,"rest_url":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Vasilis Syrgkanis"}],"msr_impact_theme":[],"msr_research_lab":[199563],"msr_event":[],"msr_group":[656316],"msr_project":[656325,332666],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"unpublished","related_content":{"projects":[{"ID":656325,"post_title":"EconML","post_name":"econml","post_type":"msr-project","post_date":"2020-06-02 09:40:48","post_modified":"2025-03-31 18:33:10","post_status":"publish","permalink":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/project\/econml\/","post_excerpt":"EconML\u00a0is an open source Python package developed by the ALICE team at Microsoft Research that applies the power of machine learning techniques to estimate individualized causal responses from observational or experimental data. 
The suite of estimation methods provided in EconML represents the latest advances in causal machine learning. By incorporating individual machine learning steps into interpretable causal models, these methods improve the reliability of what-if predictions and make causal analysis quicker and easier for a&hellip;","_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/656325"}]}},{"ID":332666,"post_title":"ALICE","post_name":"alice","post_type":"msr-project","post_date":"2016-12-08 05:45:31","post_modified":"2020-04-14 07:52:37","post_status":"publish","permalink":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/project\/alice\/","post_excerpt":"ALICE Automated Learning and Intelligence for Causation and Economics Alice is a project to direct Artificial Intelligence towards economic decision making. We are building tools that combine state-of-the-art machine learning with econometrics \u2013 the measurement of economic systems -- in order to bring automation to economic decision making. 
The heart of this project is a striving to measure causation: if you want to understand or make policy decisions in a complex economy, you need to&hellip;","_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/332666"}]}}]},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/493301","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":3,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/493301\/revisions"}],"predecessor-version":[{"id":493313,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/493301\/revisions\/493313"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=493301"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=493301"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=493301"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=493301"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=493301"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=493301"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=493301"},
{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=493301"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=493301"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=493301"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=493301"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=493301"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=493301"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}