{"id":714352,"date":"2020-12-29T07:05:05","date_gmt":"2020-12-29T15:05:05","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/?post_type=msr-research-item&#038;p=714352"},"modified":"2023-02-21T03:20:38","modified_gmt":"2023-02-21T11:20:38","slug":"wiki2row-the-ins-and-outs-or-row-suggestion-with-a-large-scale-knowledge-base-2","status":"publish","type":"msr-research-item","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/wiki2row-the-ins-and-outs-or-row-suggestion-with-a-large-scale-knowledge-base-2\/","title":{"rendered":"Wiki2row \u2013 the In\u2019s and Out\u2019s or Row Suggestion with a Large Scale Knowledge Base"},"content":{"rendered":"<p>Row suggestion, a generalization of set expansion, is the task of augmenting a given table of text and numbers with additional, relevant rows. A viable approach is to generate trustworthy suggestions by grounding candidates in a verifiable source; in our work we focus on knowledge bases, in particular Wikidata. Our pipeline begins by linking existing rows and columns to entities and properties. The primary focus of this work is to improve candidate generation and ranking without requiring in-domain training or fine-tuning. Our novel contributions are to account for semantic information by using BigGraph embeddings and GPT-3 free-text generation, and for tabular information by differentiating between in-table properties (explicit in the given columns) and out-of-table properties (implicit from the knowledge base). Measured on the WikiTables benchmark, our solution exceeds or achieves comparable performance to previous state-of-the-art systems that use small, carefully curated knowledge bases (such as DBpedia). We extend our algorithm to present the first approach to bias-aware row suggestion when table completion is not achievable, that is, when we cannot define a complete set of entities. 
We suggest quantitative measures to evaluate performance on this task.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Row suggestion, a generalization of set expansion, is the task of augmenting a given table of text and numbers with additional, relevant rows. A viable approach is to generate trustworthy suggestions by grounding candidates in a verifiable source and in our work we focus on knowledge bases, in particular on Wikidata. Our pipeline begins by [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"3100854292","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2020-10-1","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research
-highlight":[],"research-area":[13556,13545,13555],"msr-publication-type":[193724],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[246691,248503,248683,248644,248698,248701,248707,248710,248713,248704],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-714352","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-human-language-technologies","msr-research-area-search-information-retrieval","msr-locale-en_us","msr-field-of-study-computer-science","msr-field-of-study-information-retrieval","msr-field-of-study-knowledge-base","msr-field-of-study-ranking","msr-field-of-study-row","msr-field-of-study-row-and-column-spaces","msr-field-of-study-set-expansion","msr-field-of-study-text-messaging","msr-field-of-study-trustworthiness","msr-field-of-study-verifiable-secret-sharing"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2020-10-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/wiki2row-the-ins-and-outs-or-row-suggestion-with-a-large-scale-knowledge-base\/","label_id":"243109","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","m
sr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Alperen Karaoglu","user_id":0,"rest_url":false},{"type":"text","value":"Carina Negreanu","user_id":0,"rest_url":false},{"type":"text","value":"Shuang Chen","user_id":0,"rest_url":false},{"type":"text","value":"Jack Williams","user_id":0,"rest_url":false},{"type":"text","value":"Dany Fabian","user_id":0,"rest_url":false},{"type":"text","value":"Andy Gordon","user_id":0,"rest_url":false},{"type":"text","value":"Chin-Yew Lin","user_id":0,"rest_url":false}],"msr_impact_theme":[],"msr_research_lab":[199560,199561],"msr_event":[],"msr_group":[144919],"msr_project":[792599],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"miscellaneous","related_content":{"projects":[{"ID":792599,"post_title":"Table Interpretation","post_name":"table-interpretation","post_type":"msr-project","post_date":"2021-11-05 02:02:36","post_modified":"2024-09-25 11:42:48","post_status":"publish","permalink":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/project\/table-interpretation\/","post_excerpt":"Bringing out the power of semantics in tabular data Tables are commonly used to organize information, playing a key role in data analytics, scientific research, and business communication. The ability to automatically extract semantics in tables can empower many downstream applications such as data analytics, robotic process automation (RPA), knowledge base population, etc. In this project, we explore multiple aspects of semantic table understanding and real-world applications of such technologies. 
One of the outcomes of&hellip;","_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/792599"}]}}]},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/714352","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/714352\/revisions"}],"predecessor-version":[{"id":714355,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/714352\/revisions\/714355"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=714352"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=714352"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=714352"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=714352"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=714352"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=714352"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=714352"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/ms
r-post-option?post=714352"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=714352"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=714352"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=714352"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=714352"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=714352"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}