{"id":1171670,"date":"2026-05-12T15:59:42","date_gmt":"2026-05-12T22:59:42","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/clin-summ-incremental-longitudinal-summarization-of-clinical-notes-enables-scalable-representation-and-early-disease-prediction\/"},"modified":"2026-05-21T15:02:28","modified_gmt":"2026-05-21T22:02:28","slug":"clin-summ-incremental-longitudinal-summarization-of-clinical-notes-enables-scalable-representation-and-early-disease-prediction","status":"publish","type":"msr-research-item","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/clin-summ-incremental-longitudinal-summarization-of-clinical-notes-enables-scalable-representation-and-early-disease-prediction\/","title":{"rendered":"CLIN-SUMM: Incremental Longitudinal Summarization of Clinical Notes Enables Scalable Representation and Early Disease Prediction"},"content":{"rendered":"<p>Electronic health records contain years of longitudinal clinical notes rich in evolving patient information, yet their volume, redundancy, and fragmentation limit clinical usability and scalable modeling. We present CLIN-SUMM (Clinical Longitudinal Insight from Notes using Summarization), a framework that restructures summarization as a longitudinal representation problem. Rather than collapsing histories into static summaries, CLIN-SUMM incrementally constructs structured, categorized, date-partitioned patient representations, summarizing only newly documented information at each encounter while preserving temporal fidelity without access to future data. This standardized representation can be computed once and reused across downstream tasks, thereby decoupling narrative processing from prediction. Across 12,356 Massachusetts General Hospital patients, CLIN-SUMM achieved 70% token reduction while maintaining high clinician-rated correctness and completeness. Using dementia as a case study, fine-tuning Clinical ModernBERT on CLIN-SUMM summaries yielded AUROC 0.86 for diagnosis and 0.81 for 3-year risk prediction, with longitudinal analyses demonstrating progressive risk separation years before a formal diagnosis. CLIN-SUMM summaries also enabled efficient extraction of longitudinal medication trajectories, improving medication capture compared to structured EHR data while maintaining high dosage agreement. CLIN-SUMM provides a scalable representation layer for clinical review and longitudinal machine learning, enhancing disease modeling, risk prediction, and other longitudinal reasoning tasks.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Electronic health records contain years of longitudinal clinical notes rich in evolving patient information, yet their volume, redundancy, and fragmentation limit clinical usability and scalable modeling. We present CLIN-SUMM (Clinical Longitudinal Insight from Notes using Summarization), a framework that restructures summarization as a longitudinal representation problem. Rather than collapsing histories into static summaries, CLIN-SUMM incrementally [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"medRxiv","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2026-04-28","msr_highlight_text":"","msr_notes":"Preprint","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":false,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":null,"footnotes":""},"msr-research-highlight":[],"research-area":[13553],"msr-publication-type":[193724],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[246985],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-1171670","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-medical-health-genomics","msr-locale-en_us","msr-field-of-study-medicine"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2026-04-28","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"medRxiv","msr_notes":"Preprint","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":0,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"doi","viewUrl":"false","id":"false","title":"10.64898\/2025.11.28.25341233","label_id":"252679","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/pmc.ncbi.nlm.nih.gov\/articles\/PMC13131725\/","label_id":"252679","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Valentina D&#039;Souza","user_id":0,"rest_url":false},{"type":"text","value":"Danielle F Pace","user_id":0,"rest_url":false},{"type":"text","value":"Alaleh Azhir","user_id":0,"rest_url":false},{"type":"text","value":"Arash Nargesi","user_id":0,"rest_url":false},{"type":"text","value":"Erik B Holbrook","user_id":0,"rest_url":false},{"type":"text","value":"Wei He","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Tristan Naumann","user_id":37929,"rest_url":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Tristan Naumann"},{"type":"text","value":"Samuel Friedman","user_id":0,"rest_url":false},{"type":"text","value":"Steven J Atlas","user_id":0,"rest_url":false},{"type":"text","value":"Christopher D Anderson","user_id":0,"rest_url":false},{"type":"text","value":"Judy Hung","user_id":0,"rest_url":false},{"type":"text","value":"Mahnaz Maddah","user_id":0,"rest_url":false}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"miscellaneous","related_content":[],"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1171670","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":3,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1171670\/revisions"}],"predecessor-version":[{"id":1172992,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1171670\/revisions\/1172992"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1171670"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=1171670"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1171670"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=1171670"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=1171670"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=1171670"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1171670"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1171670"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=1171670"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=1171670"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=1171670"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1171670"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1171670"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}