{"id":933432,"date":"2023-04-07T08:53:17","date_gmt":"2023-04-07T15:53:17","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/?post_type=msr-event&#038;p=933432"},"modified":"2023-08-15T12:30:11","modified_gmt":"2023-08-15T19:30:11","slug":"norcaldb-2023","status":"publish","type":"msr-event","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/event\/norcaldb-2023\/","title":{"rendered":"NorCalDB 2023"},"content":{"rendered":"\n\n\n\n\n<p>NorCalDB Day is a single-day, workshop-style event where participants from academia and industry in Northern California meet to present ideas and discuss their research and experiences. In 2023, NorCalDB Day will be held at the Microsoft Silicon Valley Campus in Mountain View, on Thursday May 11, 2023.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Registration<\/strong>: Registration for in-person attendance is now closed. Although there is no registration fee,&nbsp;<strong>you must register to attend<\/strong>. Breakfast, lunch and coffee breaks will be provided by Microsoft.<\/li>\n\n\n\n<li><strong>Location<\/strong>: Microsoft Silicon Valley Campus, 1045 La Avenida Street, Mountain View, CA 94043.<\/li>\n\n\n\n<li><strong>Arrival:<\/strong> Plan to arrive early for parking, check-in and to enjoy breakfast with us! 
Sessions will begin at 9:00 AM.<\/li>\n\n\n\n<li><strong>Previous meetings<\/strong>: Earlier NorCalDB Day events have taken place at UC Davis (2011), UC Berkeley (2012), <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/forum.stanford.edu\/events\/2022-annual-affiliates-meeting\/annual-meeting-archives\/2013-annual-affiliates-meeting-1\" target=\"_blank\" rel=\"noopener noreferrer\">Stanford University (2013)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/researcher.ibm.com\/researcher\/view_group.php?id=5292\" target=\"_blank\" rel=\"noopener noreferrer\">IBM Almaden Research Center (2014)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/sites.google.com\/a\/soe.ucsc.edu\/dbday2015\/home\" target=\"_blank\" rel=\"noopener noreferrer\">UC Santa Cruz (2015)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Google (2016), <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/sites.google.com\/view\/norcaldb17\/home\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon (2017)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/sites.google.com\/view\/norcaldb18\/home\" target=\"_blank\" rel=\"noopener noreferrer\">Oracle (2018)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2019.splashthat.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">LinkedIn (2019)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/li>\n\n\n\n<li><strong>Hashtag<\/strong>: Please use 
the event hashtag for social media posts: #NorCalDBDay<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"agenda\">Agenda<\/h3>\n\n\n\n<figure class=\"wp-block-table is-style-regular\"><table><tbody><tr><td>8:00 &#8211; 9:00 AM<\/td><td><strong>Registration and Light Breakfast<\/strong><\/td><td><\/td><\/tr><tr><td>9:00 &#8211; 9:15 AM<\/td><td><strong>Introduction and Logistics<\/strong><\/td><td><\/td><\/tr><tr><td>9:15 &#8211; 10:00 AM<\/td><td><strong>Keynote: Benchmarking and Tuning Log-Structured Table Formats (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ramakrishnan_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>)<\/strong><\/td><td>Raghu Ramakrishnan, Microsoft<\/td><\/tr><tr><td>10:00 &#8211; 10:30 AM<\/td><td><strong>Presto: A Decade of SQL Analytics at Meta<\/strong><\/td><td>James Sun, Meta<\/td><\/tr><tr><td>10:30 &#8211; 11:00 AM<\/td><td><strong>Coffee Break and Posters<\/strong><\/td><td><\/td><\/tr><tr><td>11:00 &#8211; 12:00 PM<\/td><td><strong>Gong Show<\/strong><br><strong>1. Cal Poly Database and Data Science Work After COVID (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/calpoly_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>)<br>2. Stanford @ NorCalDB Day (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/stanford_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>)<br>3. 
UC Berkeley @ NorCalDB Day (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ucberkeley_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>)<br>4. Resilient Journey in Building Fault-tolerant Systems (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ucdavis_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>)<br>5. Insights from Sketch-based Relational Query Optimization (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ucmerced_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>)<br>6. UC Santa Cruz @ NorCalDB Day (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ucsc_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>)<\/strong><\/td><td><br>1. Alexander Dekhtyar, Cal Poly<br>2. Peter Kraft, Stanford University<br>3. Aditya Parameswaran, UC Berkeley<br>4. Dakai Kang, UC Davis<br>5. Florin Rusu, UC Merced<br>6. 
Peter Alvaro, UC Santa Cruz<\/td><\/tr><tr><td>12:00 &#8211; 1:00 PM<\/td><td><strong>Lunch and Posters<\/strong><\/td><td><\/td><\/tr><tr><td>1:00 &#8211; 1:30 PM<\/td><td><strong>Unexpected Lessons from Production Systems Impacting the Foundations of Distributed Computing (<strong><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/malkhi_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/strong>)<\/strong><\/td><td>Dahlia Malkhi, Chainlink Labs<\/td><\/tr><tr><td>1:30 &#8211; 2:30 PM<\/td><td><strong>Panel Discussion: DB and AI<\/strong><\/td><td>Moderator: Fatma \u00d6zcan, Google<br>Panelists:<br>Dipti Borkar, Microsoft<br>Idan Gazit, GitHub<br>Jure Leskovec, Stanford University<br>Edo Liberty, Pinecone<br>Aditya&nbsp;Parameswaran, UC Berkeley<\/td><\/tr><tr><td>2:30 &#8211; 3:00 PM<\/td><td><strong><strong>Coffee Break and Posters<\/strong><\/strong><\/td><td><\/td><\/tr><tr><td>3:00 &#8211; 3:30 PM<\/td><td><strong>Bringing Structure to Unstructured Data with an AI-First System Design (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/gaviria_rojas_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>)<\/strong> <\/td><td>Will Gaviria Rojas, CoactiveAI<\/td><\/tr><tr><td>3:30 &#8211; 4:15 PM<\/td><td><strong>Keynote: Hydro: A Data-Centric Compiler Stack for the Cloud (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/hellerstein_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>)<\/strong><\/td><td>Joe Hellerstein, UC 
Berkeley<\/td><\/tr><tr><td>4:15 &#8211; 4:30 PM<\/td><td><strong>Closing Remarks<\/strong><\/td><td><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"keynote-talks\">Keynote talks<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"keynote-speaker-1-raghu-ramakrishnan-microsoft\">Keynote Speaker 1: Raghu Ramakrishnan, Microsoft<\/h4>\n\n\n\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/ramakrishnan_raghu_400x400.jpg\" alt=\"Raghu Ramakrishnan wearing glasses and smiling at the camera\" class=\"wp-image-723775\" width=\"300\" height=\"300\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/ramakrishnan_raghu_400x400.jpg 400w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/ramakrishnan_raghu_400x400-300x300.jpg 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/ramakrishnan_raghu_400x400-150x150.jpg 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/ramakrishnan_raghu_400x400-12x12.jpg 12w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/ramakrishnan_raghu_400x400-180x180.jpg 180w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/ramakrishnan_raghu_400x400-360x360.jpg 360w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/figure>\n\n\n\n<p><strong>Title: Benchmarking and Tuning Log-Structured Table Formats<\/strong><\/p>\n\n\n\n<p><strong>Abstract:<\/strong><br>In recent years, analytic SQL databases have adopted updatable column-oriented table formats based on Parquet. These represent a profound shift from traditional row-oriented page-based data representation that continues to dominate OLTP SQL systems. 
In this talk, we will present a quick overview of updatable Parquet table implementations such as Delta Lake, Hudi and Iceberg and then consider the new challenges in rigorously comparing their performance. We describe LST-Bench, a new benchmarking framework that adapts a base workload such as TPC-DS, and present the results of a comparison that we carried out. We have open sourced LST-Bench. There are a number of exciting problems in this space that are exposed by our results, such as the opportunity (and need!) for auto-tuning various parameters that heavily influence performance of updatable table implementations.<br><strong>Bio:<\/strong><br><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/raghu\/\" target=\"_blank\" rel=\"noreferrer noopener\">Raghu Ramakrishnan<\/a> is CTO for Data, and a Technical Fellow at Microsoft. Previously, he was a professor at University of Wisconsin-Madison, where he wrote the widely used text \u201cDatabase Management Systems\u201d with Johannes Gehrke, and Chief Scientist at Yahoo! 
He has received the Innovation Award from both ACM SIGMOD and SIGKDD, multiple 10-year paper awards, and the ACM SIGMOD Contributions Award.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"keynote-speaker-2-joe-hellerstein-uc-berkeley\">Keynote Speaker 2: Joe Hellerstein, UC Berkeley<\/h4>\n\n\n\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt.png\" alt=\"Joe Hellerstein\" class=\"wp-image-936174\" width=\"300\" height=\"300\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt.png 2070w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt-300x300.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt-1024x1024.png 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt-150x150.png 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt-768x768.png 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt-1536x1536.png 1536w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt-2048x2048.png 2048w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt-180x180.png 180w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt-360x360.png 360w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/figure>\n\n\n\n<p><strong>Title: Hydro: A Data-Centric Compiler Stack for the Cloud<\/strong><\/p>\n\n\n\n<p><strong>Abstract:<\/strong><br>Relational Databases were invented to hide the concerns of how data is laid out, and how queries are executed.<br>Forty years later, Cloud Computing was invented to hide the 
concerns of how computing resources are laid out, and how general-purpose computations are executed. Surely lessons from the database community can translate to this new domain!<br>This is not a facile analogy or empty vision.&nbsp;I am convinced that the opportunities for outbound, translational research from databases to general-purpose modern computing are profound. This has been&nbsp;a longstanding agenda in my group, which is maturing into high-performance software with significant benefits for developers. The ideas that powered the success of databases \u2013 declarative languages, dataflow parallelism, data replication and consistency, query optimization, etc. \u2013 can be fruitfully applied to a wide variety of systems challenges, particularly related to distributed systems.&nbsp;<br>Our current hypothesis is that we can build low-latency, high-performance, elastic cloud infrastructure out of declarative queries. Can we? Should we? I believe we can, and that there are significant engineering benefits to doing so. Our prior work has included declarative networking (e.g. Overlog, P2), declarative IoT (TinyDB), declarative implementations of Big Data distributed infrastructure (BOOM Analytics), general-purpose distributed programming models (Dedalus, Bloom), declarative ML (Apache MADlib), stateful serverless technologies (Anna KVS, Cloudburst), and coordination-free foundations including the CALM Theorem.&nbsp;<br>In 2021, in a collaboration between Berkeley and Sutter Hill Ventures, we kicked off an ambitious research effort to cull lessons from this work, and push forward into a new generation of cloud technology. 
The emerging agenda is embodied in&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/hydro.run\/\" target=\"_blank\" rel=\"noopener noreferrer\">Hydro: a language stack for distributed programming<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;&#8212;&nbsp;or as we sometimes call it, &#8220;LLVM for the cloud&#8221;.&nbsp;<br>In this talk I&#8217;ll overview the goals of the Hydro project, and give some status reports on the language stack, with early use cases including an autoscaling Key Value Store and an optimizable Multipaxos implementation. Hydro is a young but growing&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/hydro-project\" target=\"_blank\" rel=\"noopener noreferrer\">open source project<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;and we welcome collaborators!<br><strong>Bio:<\/strong><br>Since 1995,&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/dsf.berkeley.edu\/jmh\/\" target=\"_blank\" rel=\"noopener noreferrer\">Joe&nbsp;Hellerstein<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;has had the good fortune to serve on the faculty at UC Berkeley, where he is the Jim Gray Professor of Computer Science. During&nbsp;that time he has done research on a range of topics across computing and data, advised dozens of remarkable graduate students, taught thousands of undergraduates, and helped co-direct a number of research labs. 
Outside Berkeley,&nbsp;Joe&nbsp;was a co-founder of Trifacta, the AI-assisted visual data wrangling company, where he served for a decade as founding CEO and Chief Strategy Officer.&nbsp;Joe&nbsp;is currently a co-founder at&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/aqueducthq.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Aqueduct<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, which provides open source to make it easy to run AI workloads on standard cloud infrastructure.&nbsp;Joe&nbsp;continues to advise&nbsp;a number of startups in data and AI systems. For the last two years&nbsp;Joe&nbsp;has been on leave as a Faculty Fellow at Sutter Hill Ventures, which has been funding him to focus on his research.&nbsp; Outside of work,&nbsp;Joe&nbsp;plays music &#8212; mostly jazz, but he recently has been heard on recordings by&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/open.spotify.com\/album\/1KC81XomZspBG1uVh5rSCb\" target=\"_blank\" rel=\"noopener noreferrer\">James Combs<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;and his Americana band,&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.youtube.com\/watch?v=D7kAHDcqdJM\" target=\"_blank\" rel=\"noopener noreferrer\">Great Willow<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"organizing-committee\">Organizing committee<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/subru\/\" target=\"_blank\" rel=\"noreferrer noopener\">Subru Krishnan<\/a>, Microsoft<\/li>\n\n\n\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.linkedin.com\/in\/chris-douglas-73333a1\/\" 
target=\"_blank\" rel=\"noopener noreferrer\">Chris Douglas<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, UC Berkeley<\/li>\n\n\n\n<li><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/avflor\/\" target=\"_blank\" rel=\"noreferrer noopener\">Avrilia Floratou<\/a>, Microsoft<\/li>\n\n\n\n<li><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jesusca\/\" target=\"_blank\" rel=\"noreferrer noopener\">Jesus Camacho-Rodriguez<\/a>, Microsoft<\/li>\n\n\n\n<li><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/yuanyuantian\/\" target=\"_blank\" rel=\"noreferrer noopener\">Yuanyuan Tian<\/a>, Microsoft<\/li>\n\n\n\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.linkedin.com\/in\/fatma-ozcan-3299858\/\" target=\"_blank\" rel=\"noopener noreferrer\">Fatma \u00d6zcan<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Google<\/li>\n<\/ul>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--1\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"mailto:norcaldb2023@microsoft.com\" target=\"_blank\" rel=\"noreferrer noopener\">Contact us with questions<\/a><\/div>\n<\/div>\n\n\n\n\n\n<h3 class=\"wp-block-heading\" id=\"invited-talks\">Invited Talks<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"presto-a-decade-of-sql-analytics-at-meta-by-james-sun-meta\">Presto: A Decade of SQL Analytics at Meta by James Sun, Meta<\/h4>\n\n\n\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/image001.jpg\" alt=\"James Sun\" class=\"wp-image-936396\" width=\"300\" height=\"300\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/image001.jpg 
500w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/image001-300x300.jpg 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/image001-150x150.jpg 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/image001-180x180.jpg 180w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/image001-360x360.jpg 360w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/figure>\n\n\n\n<p><strong>Abstract:<\/strong><br>Presto is an open-source distributed SQL query engine that supports analytics workloads involving multiple exabyte-scale data sources. Presto is used for low-latency interactive use cases as well as long-running ETL jobs at Meta. It was originally launched at Meta in 2013 and donated to the Linux Foundation in 2019. Over the last ten years, upholding query latency and scalability amid the hypergrowth of data volume at Meta, together with new SQL analytics requirements, has raised impressive challenges for Presto. A top priority has been ensuring query reliability does not regress with the shift towards smaller, more elastic container allocations, which require queries to run with substantially smaller memory headroom and can be preempted at any time. In this talk, we discuss several successful evolutions in recent years that have improved Presto latency as well as scalability by several orders of magnitude in production at Meta. Some of the notable ones are hierarchical caching, native vectorized execution engines, materialized views, and Presto on Spark. With these new capabilities, we have deprecated or are in the process of deprecating various legacy query engines so that Presto becomes the single piece to serve interactive, ad-hoc, ETL, and graph processing workloads for the entire data warehouse.<br><strong>Bio:<\/strong><br>James Sun is a software engineer at Meta working on large-scale data systems. 
His interests are query optimization, low-latency query execution, and system scalability. He led the Presto team developing the open-source distributed SQL query engine at EB scale. He received a Ph.D. in Computer Science from University of California, Santa Barbara focusing on data integration and data-centric processes.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"unexpected-lessons-from-production-systems-impacting-the-foundations-of-distributed-computing-by-dahlia-malkhi-chainlink-labs\">Unexpected Lessons from Production Systems Impacting the Foundations of Distributed Computing by Dahlia Malkhi, Chainlink Labs<\/h4>\n\n\n\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/dahlia_malkhi_2.jpg\" alt=\"Dahlia Malkhi\" class=\"wp-image-937587\" width=\"300\" height=\"300\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/dahlia_malkhi_2.jpg 1365w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/dahlia_malkhi_2-300x300.jpg 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/dahlia_malkhi_2-1024x1024.jpg 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/dahlia_malkhi_2-150x150.jpg 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/dahlia_malkhi_2-768x768.jpg 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/dahlia_malkhi_2-180x180.jpg 180w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/dahlia_malkhi_2-360x360.jpg 360w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/figure>\n\n\n\n<p><strong>Abstract:<\/strong><br>In this talk, I will share insights from distributed systems I worked on that led to breaking certain myths in distributed computing, including positive 
answers to the following questions:<br>\u2022 Can you build a <strong>permissioned<\/strong> blockchain with <strong>linear communication<\/strong> complexity, namely, the same communication complexity as Bitcoin merely spreading updates, but without the energy consumption?<br>\u2022 Can you scale out distributed databases <strong>with<\/strong> a centralized coordinator?<br>\u2022 Can you geo-replicate data consistently <strong>without<\/strong> intersecting quorums?<br><strong>Bio:<\/strong><br>Dahlia Malkhi currently serves as a Distinguished Scientist at Chainlink Labs. Dr. Malkhi\u2019s research spans broad aspects of reliability and security of distributed systems, with a recent focus on blockchains and advances in financial technology. Her work over two decades resulted in over 150 publications as well as a strong impact on computing technology, notably HotStuff (driving the core engines of the Diem and Aptos blockchains), VMware blockchain, Flexible Paxos, CorfuDB, and the FairPlay project. Previously, Dr. 
Malkhi served as CTO, lead maintainer, and lead researcher of the Diem (Libra) project, founder and Principal Researcher at VMware Research, Partner Principal Researcher at Microsoft Research, tenured Associate Professor at the Hebrew University of Jerusalem, and senior researcher at AT&T Labs.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"bringing-structure-to-unstructured-data-with-an-ai-first-system-design-by-will-gaviria-rojas-coactiveai\">Bringing Structure to Unstructured Data with an AI-First System Design by Will Gaviria Rojas, CoactiveAI<\/h4>\n\n\n\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1639499845712.jpeg\" alt=\"Will Gaviria Rojas\" class=\"wp-image-937590\" width=\"300\" height=\"300\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1639499845712.jpeg 800w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1639499845712-300x300.jpeg 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1639499845712-150x150.jpeg 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1639499845712-768x768.jpeg 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1639499845712-180x180.jpeg 180w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1639499845712-360x360.jpeg 360w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/figure>\n\n\n\n<p><strong>Abstract:<\/strong><br>Today, over 80% of enterprise data is unstructured and this fraction is expected to rapidly increase with the proliferation of generative AI tools. 
However, doing anything meaningful with this unstructured content remains extremely challenging as traditional data systems have not adapted, and ad hoc machine learning approaches remain expensive to implement and difficult to scale. In this talk, I will present the pressing need to create AI-powered data systems for understanding unstructured data, share our experiences building these systems, and present key design considerations when building these systems for end-to-end applications.<br><strong>Bio:<\/strong><br>A former Data Scientist at eBay, Will has previously held various roles as a visiting researcher. His most recent work focuses on the intersection of AI and data systems, including performance benchmarks for data-centric AI and computer vision (e.g., DataPerf @ ICML 2022, the Dollar Street dataset @ NeurIPS 2022). His previous academic work spans from IoT electronics to design and performance benchmarking of deep learning in neuromorphic systems. Will holds a PhD in Materials Science from Northwestern University and a BS from MIT.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"gong-show\">Gong Show<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Cal Poly Database and Data Science Work After COVID<\/strong>. <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/users.csc.calpoly.edu\/~dekhtyar\/\" target=\"_blank\" rel=\"noopener noreferrer\">Alexander Dekhtyar<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Cal Poly<\/li>\n\n\n\n<li><strong>Stanford @ NorCalDB Day<\/strong>. <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/petereliaskraft.net\/\" target=\"_blank\" rel=\"noopener noreferrer\">Peter Kraft<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Stanford University<\/li>\n\n\n\n<li><strong>UC Berkeley @ NorCalDB Day<\/strong>. 
<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/dsf.berkeley.edu\/jmh\/\" target=\"_blank\" rel=\"noopener noreferrer\">Joe Hellerstein<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/people.eecs.berkeley.edu\/~adityagp\/\" target=\"_blank\" rel=\"noopener noreferrer\">Aditya Parameswaran<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, UC Berkeley<\/li>\n\n\n\n<li><strong>Resilient Journey in Building Fault-tolerant Systems<\/strong>. <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/dakaikang.github.io\/\" target=\"_blank\" rel=\"noopener noreferrer\">Dakai Kang<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, UC Davis<\/li>\n\n\n\n<li><strong>Insights from Sketch-based Relational Query Optimization<\/strong>. <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/faculty.ucmerced.edu\/frusu\/\" target=\"_blank\" rel=\"noopener noreferrer\">Florin Rusu<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, UC Merced<\/li>\n\n\n\n<li><strong>UC Santa Cruz @ NorCalDB Day<\/strong>. 
<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/people.ucsc.edu\/~palvaro\/\" target=\"_blank\" rel=\"noopener noreferrer\">Peter Alvaro<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, UC Santa Cruz<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"panel-discussion-db-and-ai\">Panel Discussion: DB and AI<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"moderator\">Moderator:<\/h4>\n\n\n\n<figure class=\"wp-block-image alignright is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" width=\"200\" height=\"200\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Fatma-Ozcan.png\" alt=\"Fatma Ozcan\" class=\"wp-image-938778\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Fatma-Ozcan.png 200w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Fatma-Ozcan-150x150.png 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Fatma-Ozcan-180x180.png 180w\" sizes=\"auto, (max-width: 200px) 100vw, 200px\" \/><\/figure>\n\n\n\n<p><strong>Fatma \u00d6zcan<\/strong>, <strong>Google<\/strong><br>Fatma \u00d6zcan is a Principal Engineer at Systems Research@Google. Before that, she was a Distinguished Research Staff Member and a senior manager at IBM Almaden Research Center. Her current research focuses on platforms and infrastructure for large-scale data analysis, machine learning for databases, and democratizing analytics via NLQ and conversational interfaces to data. Dr. \u00d6zcan received her PhD in computer science from the University of Maryland, College Park, and her BSc in computer engineering from METU, Ankara. She has over 21 years of experience in industrial research, and has delivered core technologies into various IBM products. She has been a contributor to various SQL standards, including SQL\/XML, SQL\/JSON and SQL\/PTF. 
She is the co-author of the book &#8220;Heterogeneous Agent Systems&#8221;, and co-author of several conference papers and patents. She received the VLDB Women in Database Research Award in 2022. She is an ACM Distinguished Member, and the vice chair of ACM SIGMOD. She has served on the board of trustees for the VLDB Endowment (2016-2022), and on the board of directors of CRA (2020-2023).<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"panelists\">Panelists:<\/h4>\n\n\n\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Image-zoomed.jpg\" alt=\"Dipti Borkar\" class=\"wp-image-938580\" width=\"200\" height=\"200\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Image-zoomed-150x150.jpg 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Image-zoomed-180x180.jpg 180w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Image-zoomed-360x360.jpg 360w\" sizes=\"auto, (max-width: 200px) 100vw, 200px\" \/><\/figure>\n\n\n\n<p><strong>Dipti Borkar, Microsoft<\/strong><br>Dipti is a senior technology executive and entrepreneur with over 18 years of experience in cloud, open source and distributed data\/database technologies. She is Vice President & General Manager at Microsoft, where she is responsible for SaaS App Development, Strategic ISVs and Azure Databricks. She founded Ahana (acquired by IBM in 2023), where she created a cloud managed service for SQL on data lakes and was Chief Product Officer and Vice President of Cloud & open-source engineering. She also served as the Chairperson of Presto Foundation, Community team.<br>Prior to Ahana, Dipti held VP roles at Alluxio, Kinetica, and Couchbase. 
At Alluxio, she was Vice President of Products, and at Couchbase she held several leadership positions, including VP, Product Management & Head of Global Solution Engineering. Earlier in her career, Dipti managed development teams at IBM DB2 Distributed, where she started as a database software engineer. Dipti holds an M.S. in Computer Science from UC San Diego, and an MBA from the Haas School of Business at UC Berkeley.<\/p>\n\n\n\n<figure class=\"wp-block-image alignright is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" width=\"200\" height=\"200\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Idan-Gazit-Headshot.jpg\" alt=\"Idan Gazit\" class=\"wp-image-940002\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Idan-Gazit-Headshot.jpg 200w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Idan-Gazit-Headshot-150x150.jpg 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Idan-Gazit-Headshot-180x180.jpg 180w\" sizes=\"auto, (max-width: 200px) 100vw, 200px\" \/><\/figure>\n\n\n\n<p><strong>Idan Gazit, GitHub<br><\/strong>Idan is a Senior Director of Research at GitHub Next. He is a hybrid designer-developer, and can usually be found geeking out about the Web, data visualization, typography, and color. Prior to GitHub, he led the Data UX team at Heroku, which built the human interfaces to Heroku&#8217;s Postgres, Redis, and Kafka datastores. 
He lives in the East Bay with his family and surrounds himself with a rotating cast of half-finished projects.<br><br><br><br><br><\/p>\n\n\n\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-scaled.jpg\" alt=\"Jure Leskovec\" class=\"wp-image-938616\" width=\"200\" height=\"200\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-scaled.jpg 2560w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-300x300.jpg 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-1024x1024.jpg 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-scaled-150x150.jpg 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-768x768.jpg 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-1536x1536.jpg 1536w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-2048x2048.jpg 2048w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-180x180.jpg 180w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-360x360.jpg 360w\" sizes=\"auto, (max-width: 200px) 100vw, 200px\" \/><\/figure>\n\n\n\n<p><strong>Jure Leskovec, Stanford University<br><\/strong><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/cs.stanford.edu\/~jure\" target=\"_blank\" rel=\"noopener noreferrer\">Jure&nbsp;Leskovec<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> is Professor of Computer Science at Stanford University, and a co-Founder of Stanford Data Science Initiative. 
He co-founded several machine learning start-ups and spent 6 years as Chief Scientist at Pinterest building AI systems. Leskovec pioneered the field of Graph Neural Networks and has successfully deployed them across many industrial use cases. Leskovec also co-authored PyG, the most widely-used graph neural network library. Leskovec&#8217;s research area is machine learning and data science for complex, richly-labeled relational structures, graphs, and networks for systems at all scales, from interactions of proteins in a cell to interactions between humans in a society. Applications include commonsense reasoning, recommender systems, social network analysis, computational social science, and computational biology with an emphasis on drug discovery. This research has won several awards including a Lagrange Prize, Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, and numerous best paper and test of time awards. It has also been featured in popular press outlets such as the New York Times and the Wall Street Journal. 
Leskovec received his bachelor&#8217;s degree in computer science from University of Ljubljana, Slovenia, PhD in machine learning from Carnegie Mellon University and postdoctoral training at Cornell University.<\/p>\n\n\n\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1612010808496.jpg\" alt=\"Edo Liberty\" class=\"wp-image-939885\" width=\"200\" height=\"200\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1612010808496.jpg 231w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1612010808496-150x150.jpg 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1612010808496-180x180.jpg 180w\" sizes=\"auto, (max-width: 200px) 100vw, 200px\" \/><\/figure>\n\n\n\n<p><strong>Edo Liberty, Pinecone<br><\/strong>Edo Liberty is the Founder and CEO of Pinecone, the managed database for large-scale vector search.<br>Until April 2019, Edo was a Director of Research at AWS and Head of Amazon AI Labs. The Lab built cutting-edge machine learning algorithms, systems, and services for AWS customers. The team built parts of SageMaker, Kinesis, QuickSight, Amazon ElasticSearch, Glue, Rekognition, DeepRacer, Personalize, Forecast, and other yet-to-be-released services.<br>Before AWS, Edo was a Senior Research Director at Yahoo and Head of Yahoo\u2019s Research Lab in New York. He worked on building horizontal machine learning platforms and improving applications such as online advertising, search, security, media recommendation, email abuse prevention, and many more.<br>Edo received his B.Sc in Physics and Computer Science from Tel Aviv University and his Ph.D. in Computer Science from Yale University. After that, he was a Postdoctoral fellow at Yale in the Program in Applied Mathematics. 
He is the author of more than 75 academic papers and patents about machine learning, systems, and optimization.<\/p>\n\n\n\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2.jpg\" alt=\"Aditya Parameswaran\" class=\"wp-image-938613\" width=\"200\" height=\"200\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2.jpg 2400w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2-300x300.jpg 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2-1024x1024.jpg 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2-150x150.jpg 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2-768x768.jpg 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2-1536x1536.jpg 1536w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2-2048x2048.jpg 2048w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2-180x180.jpg 180w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2-360x360.jpg 360w\" sizes=\"auto, (max-width: 200px) 100vw, 200px\" \/><\/figure>\n\n\n\n<p><strong>Aditya&nbsp;Parameswaran, UC Berkeley<br><\/strong>Aditya&nbsp;Parameswaran is an Associate Professor at&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" 
href=\"https:\/\/www.berkeley.edu\/\" target=\"_blank\" rel=\"noopener noreferrer\">UC Berkeley<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.&nbsp;Aditya&nbsp;co-directs the&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/epic.berkeley.edu\/\" target=\"_blank\" rel=\"noopener noreferrer\">EPIC Data Lab<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, a lab targeted at low\/no-code data tooling with a special emphasis on social justice applications.&nbsp;Aditya&nbsp;also serves as the President of&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/ponder.io\/\" target=\"_blank\" rel=\"noopener noreferrer\">Ponder<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, a company he co-founded with his students based on popular data science tools developed at Berkeley.&nbsp;Aditya&nbsp;develops human-centered tools for scalable data science \u2014 making it easy for end-users and teams to leverage and make sense of their large and complex datasets. His visualization and data exploration tools have been downloaded&nbsp;millions of times.<\/p>\n\n\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>AI4Reporters: Helping Reporters Cover the State Legislature.<\/strong> Thomas Gerrity, Foaad Khosmood, Kenny Lau, Alex Dekhtyar, Patrick Howe, Christine Robertson, and Lindsay Grace. 
Cal Poly<\/li>\n\n\n\n<li><strong>Conditioned Sketches from Masked Models.<\/strong> Brian Tsan, Asoke Datta, Yesdaulet Izenov, and Florin Rusu. UC Merced<\/li>\n\n\n\n<li><strong>Data Science Capstone Project: Comparative Analysis of Face Blurring Techniques for PathML.<\/strong> Sarah Keadle, Duncan Appelgarth, Jacob Cavanaugh, and Sarah Ellwein. Cal Poly<\/li>\n\n\n\n<li><strong>Dissecting BFT Consensus: In Trusted Components we Trust!<\/strong> Suyash Gupta, Sajjad Rahnama, Shubham Pandey, Natacha Crooks, and Mohammad Sadoghi. UC Davis<\/li>\n\n\n\n<li><strong>Good Plans Despite No Cardinalities?<\/strong> Asoke Datta, Brian Tsan, Yesdaulet Izenov, and Florin Rusu. UC Merced<\/li>\n\n\n\n<li><strong>Environment-Aware Optimization for Geo-spatial Video Queries Preprocessing.<\/strong> Chanwut (Mick) Kittivorawong, Yongming Ge, Yousef Helal, and Alvin Cheung. UC Berkeley<\/li>\n\n\n\n<li><strong>Optimizing Distributed Protocols with Query Rewrites.<\/strong> David Chu, Rithvik Panchapakesan, Shadaj Laddad, Chris Liu, Kaushik Shivakumar, Natacha Crooks, Joe Hellerstein, and Heidi Howard. UC Berkeley<\/li>\n\n\n\n<li><strong>Optimizing Stateful Dataflow with E-Graphs.<\/strong> Shadaj Laddad, Tyler Hou, Conor Power, Mae Milano, Alvin Cheung, and Joe Hellerstein. UC Berkeley<\/li>\n\n\n\n<li><strong>Optimizing Transactional Hit Rate for Web Caches.<\/strong> Audrey Cheng, David Chu, Terrance Li, Jason Chan, Natacha Crooks, Joe Hellerstein, Ion Stoica, and Xiangyao Yu. UC Berkeley<\/li>\n\n\n\n<li><strong>Practical View-Change-Less Protocol through Rapid View Synchronization.<\/strong> Dakai Kang, Sajjad Rahnama, Jelle Helling, and Mohammad Sadoghi. UC Davis<\/li>\n\n\n\n<li><strong>Resilient Consensus Sustained Collaboratively.<\/strong> Junchao Chen, Suyash Gupta, Alberto Sonnino, Lefteris Kokoris-Kogias, and Mohammad Sadoghi. 
UC Davis<\/li>\n\n\n\n<li><strong>ResilientDB: Global-Scale Sustainable Blockchain Fabric.<\/strong> Junchao Chen, Dakai Kang, Sajjad Rahnama, Suyash Gupta, Shesha Vishnu Prasad, Jinxiao Yu, Arindaam Roy, Divjeet Singh Jas, Wayne Wang, Julieta Duarte, Glenn Chen, Apratim Shukla, Priyal Soni, Kaustubh Shete, Gopal Nambiar, Tim Huang, Haskell Lark Macaraig, Steve Chen, Jared Givens, Saipranav Kotamreddy, Aditya Bej, and Mohammad Sadoghi. UC Davis<\/li>\n\n\n\n<li><strong>SkyPIE: A Fast & Accurate Object Placement Oracle.<\/strong> Tiemo Bang, Chris Douglas, Shadaj Laddad, Alvin Cheung, Natacha Crooks, and Joe Hellerstein. UC Berkeley<\/li>\n\n\n\n<li><strong>Sub-optimal Join Order Classification by L1-error.<\/strong> Yesdaulet Izenov, Asoke Datta, Brian Tsan, and Florin Rusu. UC Merced<\/li>\n<\/ol>\n\n\n","protected":false},"excerpt":{"rendered":"<p>NorCalDB Day is a single-day, workshop-style event where participants from academia and industry in Northern California meet to present ideas and discuss their research and experiences. In 2023, NorCalDB Day will be held at the Microsoft Silicon Valley Campus in Mountain View, on Thursday May 11, 2023. 
8:00 &#8211; 9:00 AM Registration and Light Breakfast [&hellip;]<\/p>\n","protected":false},"featured_media":933447,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_startdate":"2023-05-11","msr_enddate":"2023-05-11","msr_location":"Mountain View, CA","msr_expirationdate":"","msr_event_recording_link":"","msr_event_link":"","msr_event_link_redirect":false,"msr_event_time":"","msr_hide_region":false,"msr_private_event":false,"msr_hide_image_in_river":0,"footnotes":""},"research-area":[13563],"msr-region":[197900],"msr-event-type":[197944],"msr-video-type":[],"msr-locale":[268875],"msr-program-audience":[],"msr-post-option":[],"msr-impact-theme":[],"class_list":["post-933432","msr-event","type-msr-event","status-publish","has-post-thumbnail","hentry","msr-research-area-data-platform-analytics","msr-region-north-america","msr-event-type-hosted-by-microsoft","msr-locale-en_us"],"msr_about":"<!-- wp:msr\/event-details {\"title\":\"NorCalDB Day 2023\",\"image\":{\"id\":933447,\"url\":\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Silicon-Valley-campus_header_1920x720.jpg\",\"alt\":\"overhead view of Microsoft Silicon Valley campus\"}} \/-->\n\n<!-- wp:msr\/content-tabs -->\n<!-- wp:msr\/content-tab -->\n<!-- wp:paragraph {\"placeholder\":\"Add Group Overview content\u2026\"} -->\n<p>NorCalDB Day is a single-day, workshop-style event where participants from academia and industry in Northern California meet to present ideas and discuss their research and experiences. In 2023, NorCalDB Day will be held at the Microsoft Silicon Valley Campus in Mountain View, on Thursday May 11, 2023.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:list -->\n<ul><!-- wp:list-item -->\n<li><strong>Registration<\/strong>: Registration for in-person attendance is now closed. 
Although there is no registration fee,&nbsp;<strong>you must register to attend<\/strong>. Breakfast, lunch and coffee breaks will be provided by Microsoft.<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Location<\/strong>: Microsoft Silicon Valley Campus, 1045 La Avenida Street, Mountain View, CA 94043.<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Arrival:<\/strong> Plan to arrive early for parking, check-in and to enjoy breakfast with us! Sessions will begin at 9:00 AM.<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Previous meetings<\/strong>: Earlier NorCalDB Day events have taken place at UC Davis (2011), UC Berkeley (2012), <a href=\"https:\/\/forum.stanford.edu\/events\/2022-annual-affiliates-meeting\/annual-meeting-archives\/2013-annual-affiliates-meeting-1\" target=\"_blank\" rel=\"noreferrer noopener\">Stanford University (2013)<\/a>, <a href=\"https:\/\/researcher.ibm.com\/researcher\/view_group.php?id=5292\" target=\"_blank\" rel=\"noreferrer noopener\">IBM Almaden Research Center (2014)<\/a>, <a href=\"https:\/\/sites.google.com\/a\/soe.ucsc.edu\/dbday2015\/home\" target=\"_blank\" rel=\"noreferrer noopener\">UC Santa Cruz (2015)<\/a>, Google (2016), <a href=\"https:\/\/sites.google.com\/view\/norcaldb17\/home\" target=\"_blank\" rel=\"noreferrer noopener\">Amazon (2017)<\/a>, <a href=\"https:\/\/sites.google.com\/view\/norcaldb18\/home\" target=\"_blank\" rel=\"noreferrer noopener\">Oracle (2018)<\/a>, and <a href=\"https:\/\/norcaldb2019.splashthat.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">LinkedIn (2019)<\/a>.<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Hashtag<\/strong>: Please use the event hashtag for social media posts: #NorCalDBDay<\/li>\n<!-- \/wp:list-item --><\/ul>\n<!-- \/wp:list -->\n\n<!-- wp:heading {\"level\":3} -->\n<h3 class=\"wp-block-heading\" id=\"agenda\">Agenda<\/h3>\n<!-- \/wp:heading -->\n\n<!-- wp:table 
{\"className\":\"is-style-regular\"} -->\n<figure class=\"wp-block-table is-style-regular\"><table><tbody><tr><td>8:00 - 9:00 AM<\/td><td><strong>Registration and Light Breakfast<\/strong><\/td><td><\/td><\/tr><tr><td>9:00 - 9:15 AM<\/td><td><strong>Introduction and Logistics<\/strong><\/td><td><\/td><\/tr><tr><td>9:15 - 10:00 AM<\/td><td><strong>Keynote: Benchmarking and Tuning Log-Structured Table Formats (<a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ramakrishnan_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a>)<\/strong><\/td><td>Raghu Ramakrishnan, Microsoft<\/td><\/tr><tr><td>10:00 - 10:30 AM<\/td><td><strong>Presto: A Decade of SQL Analytics at Meta<\/strong><\/td><td>James Sun, Meta<\/td><\/tr><tr><td>10:30 - 11:00 AM<\/td><td><strong>Coffee Break and Posters<\/strong><\/td><td><\/td><\/tr><tr><td>11:00 - 12:00 PM<\/td><td><strong>Gong Show<\/strong><br><strong>1. Cal Poly Database and Data Science Work After COVID (<a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/calpoly_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a>)<br>2. Stanford @ NorCalDB Day (<a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/stanford_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a>)<br>3. UC Berkeley @ NorCalDB Day (<a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ucberkeley_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a>)<br>4. Resilient Journey in Building Fault-tolerant Systems (<a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ucdavis_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a>)<br>5. Insights from Sketch-based Relational Query Optimization (<a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ucmerced_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a>)<br>6. 
UC Santa Cruz @ NorCalDB Day (<a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/ucsc_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a>)<\/strong><\/td><td><br>1. Alexander Dekhtyar, Cal Poly<br>2. Peter Kraft, Stanford University<br>3. Aditya Parameswaran, UC Berkeley<br>4. Dakai Kang, UC Davis<br>5. Florin Rusu, UC Merced<br>6. Peter Alvaro, UC Santa Cruz<\/td><\/tr><tr><td>12:00 - 1:00 PM<\/td><td><strong>Lunch and Posters<\/strong><\/td><td><\/td><\/tr><tr><td>1:00 - 1:30 PM<\/td><td><strong>Unexpected Lessons from Production Systems Impacting the Foundations of Distributed Computing (<strong><a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/malkhi_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a><\/strong>)<\/strong><\/td><td>Dahlia Malkhi, Chainlink Labs<\/td><\/tr><tr><td>1:30 - 2:30 PM<\/td><td><strong>Panel Discussion: DB and AI<\/strong><\/td><td>Moderator: Fatma \u00d6zcan, Google<br>Panelists:<br>Dipti Borkar, Microsoft<br>Idan Gazit, GitHub<br>Jure Leskovec, Stanford University<br>Edo Liberty, Pinecone<br>Aditya&nbsp;Parameswaran, UC Berkeley<\/td><\/tr><tr><td>2:30 - 3:00 PM<\/td><td><strong><strong>Coffee Break and Posters<\/strong><\/strong><\/td><td><\/td><\/tr><tr><td>3:00 - 3:30 PM<\/td><td><strong>Bringing Structure to Unstructured Data with an AI-First System Design (<a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/gaviria_rojas_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a>)<\/strong> <\/td><td>Will Gaviria Rojas, CoactiveAI<\/td><\/tr><tr><td>3:30 - 4:15 PM<\/td><td><strong>Keynote: Hydro: A Data-Centric Compiler Stack for the Cloud (<a href=\"https:\/\/norcaldb2023.blob.core.windows.net\/slides\/hellerstein_norcaldb_2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">slides<\/a>)<\/strong><\/td><td>Joe Hellerstein, UC Berkeley<\/td><\/tr><tr><td>4:15 - 4:30 
PM<\/td><td><strong>Closing Remarks<\/strong><\/td><td><\/td><\/tr><\/tbody><\/table><\/figure>\n<!-- \/wp:table -->\n\n<!-- wp:paragraph -->\n<p><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading {\"level\":3} -->\n<h3 class=\"wp-block-heading\" id=\"keynote-talks\">Keynote talks<\/h3>\n<!-- \/wp:heading -->\n\n<!-- wp:heading {\"level\":4} -->\n<h4 class=\"wp-block-heading\" id=\"keynote-speaker-1-raghu-ramakrishnan-microsoft\">Keynote Speaker 1: Raghu Ramakrishnan, Microsoft<\/h4>\n<!-- \/wp:heading -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":723775,\"width\":300,\"height\":300,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/02\/ramakrishnan_raghu_400x400.jpg\" alt=\"Raghu Ramakrishnan wearing glasses and smiling at the camera\" class=\"wp-image-723775\" width=\"300\" height=\"300\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Title: Benchmarking and Tuning Log-Structured Table Formats<\/strong><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>Abstract:<\/strong><br>In recent years, analytic SQL databases have adopted updatable column-oriented table formats based on Parquet. These represent a profound shift from traditional row-oriented page-based data representation that continues to dominate OLTP SQL systems. In this talk, we will present a quick overview of updatable Parquet table implementations such as Delta Lake, Hudi and Iceberg and then consider the new challenges in rigorously comparing their performance. We describe LST-Bench, a new benchmarking framework that adapts a base workload such as TPC-DS, and present the results of a comparison that we carried out. We have open sourced LST-Bench. There are a number of exciting problems in this space that are exposed by our results, such as the opportunity (and need!) 
for auto-tuning various parameters that heavily influence performance of updatable table implementations.<br><strong>Bio:<\/strong><br><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/raghu\/\" target=\"_blank\" rel=\"noreferrer noopener\">Raghu Ramakrishnan<\/a> is CTO for Data, and a Technical Fellow at Microsoft. Previously, he was a professor at University of Wisconsin-Madison, where he wrote the widely used text \u201cDatabase Management Systems\u201d with Johannes Gehrke, and Chief Scientist at Yahoo! He has received the Innovation Award from both ACM SIGMOD and SIGKDD, multiple 10-year paper awards, and the ACM SIGMOD Contributions Award.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading {\"level\":4} -->\n<h4 class=\"wp-block-heading\" id=\"keynote-speaker-2-joe-hellerstein-uc-berkeley\">Keynote Speaker 2: Joe Hellerstein, UC Berkeley<\/h4>\n<!-- \/wp:heading -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":936174,\"width\":300,\"height\":300,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jmh-blackshirt.png\" alt=\"Joe Hellerstein\" class=\"wp-image-936174\" width=\"300\" height=\"300\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Title: Hydro: A Data-Centric Compiler Stack for the Cloud<\/strong><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>Abstract:<\/strong><br>Relational Databases were invented to hide the concerns of how data is laid out, and how queries are executed.<br>Forty years later, Cloud Computing was invented to hide the concerns of how computing resources are laid out, and how general-purpose computations are executed. 
Surely lessons from the database community can translate to this new domain!<br>This is not a facile analogy or empty vision.&nbsp;I am convinced that the opportunities for outbound, translational research from databases to general-purpose modern computing are profound. This has been&nbsp;a longstanding agenda in my group, which is maturing into high-performance software with significant benefits for developers. The ideas that powered the success of databases \u2013 declarative languages, dataflow parallelism, data replication and consistency, query optimization, etc. \u2013 can be fruitfully applied to a wide variety of systems challenges, particularly related to distributed systems.&nbsp;<br>Our current hypothesis is that we can build low-latency, high-performance, elastic cloud infrastructure out of declarative queries. Can we? Should we? I believe we can, and that there are significant engineering benefits to doing so. Our prior work has included declarative networking (e.g. Overlog, P2), declarative IoT (TinyDB), declarative implementations of Big Data distributed infrastructure (BOOM Analytics), general-purpose distributed programming models (Dedalus, Bloom), declarative ML (Apache MADlib), stateful serverless technologies (Anna KVS, Cloudburst), and coordination-free foundations including the CALM Theorem.&nbsp;<br>In 2021, in a collaboration between Berkeley and Sutter Hill Ventures, we kicked off an ambitious research effort to cull lessons from this work, and push forward into a new generation of cloud technology. 
The emerging agenda is embodied in&nbsp;<a href=\"https:\/\/hydro.run\/\" target=\"_blank\" rel=\"noreferrer noopener\">Hydro: a language stack for distributed programming<\/a>&nbsp;--&nbsp;or as we sometimes call it, \"LLVM for the cloud\".&nbsp;<br>In this talk I'll overview the goals of the Hydro project, and give some status reports on the language stack, with early use cases including an autoscaling Key Value Store and an optimizable Multipaxos implementation. Hydro is a young but growing&nbsp;<a href=\"https:\/\/github.com\/hydro-project\" target=\"_blank\" rel=\"noreferrer noopener\">open source project<\/a>&nbsp;and we welcome collaborators!<br><strong>Bio:<\/strong><br>Since 1995,&nbsp;<a href=\"https:\/\/dsf.berkeley.edu\/jmh\/\" target=\"_blank\" rel=\"noreferrer noopener\">Joe&nbsp;Hellerstein<\/a>&nbsp;has had the good fortune to serve on the faculty at UC Berkeley, where he is the Jim Gray Professor of Computer Science. During&nbsp;that time he has done research on a range of topics across computing and data, advised dozens of remarkable graduate students, taught thousands of undergraduates, and helped co-direct a number of research labs. Outside Berkeley,&nbsp;Joe&nbsp;was a co-founder of Trifacta, the AI-assisted visual data wrangling company, where he served for a decade as founding CEO and Chief Strategy Officer.&nbsp;Joe&nbsp;is currently a co-founder at&nbsp;<a href=\"https:\/\/aqueducthq.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Aqueduct<\/a>, which provides open source to make it easy to run AI workloads on standard cloud infrastructure.&nbsp;Joe&nbsp;continues to advise&nbsp;a number of startups in data and AI systems. 
For the last two years&nbsp;Joe&nbsp;has been on leave as a Faculty Fellow at Sutter Hill Ventures, which has been funding him to focus on his research.&nbsp; Outside of work,&nbsp;Joe&nbsp;plays music -- mostly jazz, but he recently has been heard on recordings by&nbsp;<a href=\"https:\/\/open.spotify.com\/album\/1KC81XomZspBG1uVh5rSCb\" target=\"_blank\" rel=\"noreferrer noopener\">James Combs<\/a>&nbsp;and his Americana band,&nbsp;<a href=\"https:\/\/www.youtube.com\/watch?v=D7kAHDcqdJM\" target=\"_blank\" rel=\"noreferrer noopener\">Great Willow<\/a>.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading {\"level\":3} -->\n<h3 class=\"wp-block-heading\" id=\"organizing-committee\">Organizing committee<\/h3>\n<!-- \/wp:heading -->\n\n<!-- wp:list -->\n<ul><!-- wp:list-item -->\n<li><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/subru\/\" target=\"_blank\" rel=\"noreferrer noopener\">Subru Krishnan<\/a>, Microsoft<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><a href=\"https:\/\/www.linkedin.com\/in\/chris-douglas-73333a1\/\" target=\"_blank\" rel=\"noreferrer noopener\">Chris Douglas<\/a>, UC Berkeley<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/avflor\/\" target=\"_blank\" rel=\"noreferrer noopener\">Avrilia Floratou<\/a>, Microsoft<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jesusca\/\" target=\"_blank\" rel=\"noreferrer noopener\">Jesus Camacho-Rodriguez<\/a>, Microsoft<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/yuanyuantian\/\" target=\"_blank\" rel=\"noreferrer noopener\">Yuanyuan Tian<\/a>, Microsoft<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><a 
href=\"https:\/\/www.linkedin.com\/in\/fatma-ozcan-3299858\/\" target=\"_blank\" rel=\"noreferrer noopener\">Fatma \u00d6zcan<\/a>, Google<\/li>\n<!-- \/wp:list-item --><\/ul>\n<!-- \/wp:list -->\n\n<!-- wp:buttons -->\n<div class=\"wp-block-buttons\"><!-- wp:button {\"className\":\"is-style-outline\"} -->\n<div class=\"wp-block-button is-style-outline\"><a class=\"wp-block-button__link wp-element-button\" href=\"mailto:norcaldb2023@microsoft.com\" target=\"_blank\" rel=\"noreferrer noopener\">Contact us with questions<\/a><\/div>\n<!-- \/wp:button --><\/div>\n<!-- \/wp:buttons -->\n<!-- \/wp:msr\/content-tab -->\n\n<!-- wp:msr\/content-tab {\"title\":\"Detailed Agenda\"} -->\n<!-- wp:heading {\"level\":3} -->\n<h3 class=\"wp-block-heading\" id=\"invited-talks\">Invited Talks<\/h3>\n<!-- \/wp:heading -->\n\n<!-- wp:heading {\"level\":4} -->\n<h4 class=\"wp-block-heading\" id=\"presto-a-decade-of-sql-analytics-at-meta-by-james-sun-meta\">Presto: A Decade of SQL Analytics at Meta by James Sun, Meta<\/h4>\n<!-- \/wp:heading -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":936396,\"width\":300,\"height\":300,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/image001.jpg\" alt=\"James Sun\" class=\"wp-image-936396\" width=\"300\" height=\"300\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Abstract:<\/strong><br>Presto is an open-source distributed SQL query engine that supports analytics workloads involving multiple exabyte-scale data sources. Presto is used for low-latency interactive use cases as well as long-running ETL jobs at Meta. It was originally launched at Meta in 2013 and donated to the Linux Foundation in 2019. 
Over the last ten years, sustaining query latency and scalability amid the hypergrowth of data volume at Meta, along with new SQL analytics requirements, has raised significant challenges for Presto. A top priority has been ensuring that query reliability does not regress with the shift towards smaller, more elastic container allocations, under which queries must run with substantially smaller memory headroom and can be preempted at any time. In this talk, we discuss several successful evolutions in recent years that have improved Presto latency as well as scalability by several orders of magnitude in production at Meta. Some of the notable ones are hierarchical caching, native vectorized execution engines, materialized views, and Presto on Spark. With these new capabilities, we have deprecated or are in the process of deprecating various legacy query engines so that Presto becomes the single engine serving interactive, ad-hoc, ETL, and graph processing workloads for the entire data warehouse.<br><strong>Bio:<\/strong><br>James Sun is a software engineer at Meta working on large-scale data systems. His interests are query optimization, low-latency query execution, and system scalability. He led the Presto team developing the open-source distributed SQL query engine at EB scale. He received a Ph.D. 
in Computer Science from University of California, Santa Barbara focusing on data integration and data-centric processes.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading {\"level\":4} -->\n<h4 class=\"wp-block-heading\" id=\"unexpected-lessons-from-production-systems-impacting-the-foundations-of-distributed-computing-by-dahlia-malkhi-chainlink-labs\">Unexpected Lessons from Production Systems Impacting the Foundations of Distributed Computing by Dahlia Malkhi, Chainlink Labs<\/h4>\n<!-- \/wp:heading -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":937587,\"width\":300,\"height\":300,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/dahlia_malkhi_2.jpg\" alt=\"Dahlia Malkhi\" class=\"wp-image-937587\" width=\"300\" height=\"300\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Abstract:<\/strong><br>In this talk, I will share insights from distributed systems I worked on that led to breaking certain myths in distributed computing, including positive answers to the following questions:<br>\u2022 Can you build a <strong>permissioned<\/strong> blockchain with <strong>linear communication <\/strong>complexity<strong>, <\/strong>namely, the same communication complexity of Bitcoin merely spreading updates, but without the energy consumption?<br>\u2022 Can you scale-out distributed databases <strong>with<\/strong> a centralized coordinator?<br>\u2022 Can you geo-replicate data consistently <strong>without<\/strong> intersecting quorums?<br><strong>Bio:<\/strong><br>Dahlia Malkhi currently serves as a Distinguished Scientist at Chainlink Labs. Dr. Malkhi\u2019s research spans broad aspects of reliability and security of distributed systems, recently focused on blockchains and advances in financial technology. 
Her work over two decades resulted in over 150 publications as well as a strong impact on computing technology, notably HotStuff (driving the core engines of the Diem and Aptos blockchains), VMware blockchain, Flexible Paxos, CorfuDB, and the FairPlay project. Previously, Dr. Malkhi served as CTO, lead maintainer, and lead researcher of the Diem (Libra) project, founder and Principal Researcher at VMware Research, Partner Principal Researcher at Microsoft Research, tenured Associate Professor of the Hebrew University of Jerusalem, and senior researcher at AT&amp;T Labs.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading {\"level\":4} -->\n<h4 class=\"wp-block-heading\" id=\"bringing-structure-to-unstructured-data-with-an-ai-first-system-design-by-will-gaviria-rojas-coactiveai\">Bringing Structure to Unstructured Data with an AI-First System Design by Will Gaviria Rojas, CoactiveAI<\/h4>\n<!-- \/wp:heading -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":937590,\"width\":300,\"height\":300,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1639499845712.jpeg\" alt=\"Will Gaviria Rojas\" class=\"wp-image-937590\" width=\"300\" height=\"300\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Abstract:<\/strong><br>Today, over 80% of enterprise data is unstructured and this fraction is expected to rapidly increase with the proliferation of generative AI tools. However, doing anything meaningful with this unstructured content remains extremely challenging as traditional data systems have not adapted, and ad hoc machine learning approaches remain expensive to implement and difficult to scale. 
In this talk, I will present the pressing need to create AI-powered data systems for understanding unstructured data, share our experiences building these systems, and present key design considerations when building these systems for end-to-end applications.<br><strong>Bio:<\/strong><br>A former Data Scientist at eBay, Will has previously held various roles as a visiting researcher. His most recent work focuses on the intersection of AI and data systems, including performance benchmarks for data-centric AI and computer vision (e.g., DataPerf @ ICML 2022, the Dollar Street dataset @ NeurIPS 2022). His previous academic work spans from IoT electronics to design and performance benchmarking of deep learning in neuromorphic systems. Will holds a PhD in Materials Science from Northwestern University and a BS from MIT.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading {\"level\":3} -->\n<h3 class=\"wp-block-heading\" id=\"gong-show\">Gong Show<\/h3>\n<!-- \/wp:heading -->\n\n<!-- wp:list -->\n<ul><!-- wp:list-item -->\n<li><strong>Cal Poly Database and Data Science Work After COVID<\/strong>. <a href=\"http:\/\/users.csc.calpoly.edu\/~dekhtyar\/\" target=\"_blank\" rel=\"noreferrer noopener\">Alexander Dekhtyar<\/a>, Cal Poly<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Stanford @ NorCalDB Day<\/strong>. <a href=\"https:\/\/petereliaskraft.net\/\" target=\"_blank\" rel=\"noreferrer noopener\">Peter Kraft<\/a>, Stanford University<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>UC Berkeley @ NorCalDB Day<\/strong>. <a href=\"https:\/\/dsf.berkeley.edu\/jmh\/\" target=\"_blank\" rel=\"noreferrer noopener\">Joe Hellerstein<\/a> and <a href=\"https:\/\/people.eecs.berkeley.edu\/~adityagp\/\" target=\"_blank\" rel=\"noreferrer noopener\">Aditya Parameswaran<\/a>, UC Berkeley<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Resilient Journey in Building Fault-tolerant Systems<\/strong>. 
<a href=\"https:\/\/dakaikang.github.io\/\" target=\"_blank\" rel=\"noreferrer noopener\">Dakai Kang<\/a>, UC Davis<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Insights from Sketch-based Relational Query Optimization<\/strong>. <a href=\"https:\/\/faculty.ucmerced.edu\/frusu\/\" target=\"_blank\" rel=\"noreferrer noopener\">Florin Rusu<\/a>, UC Merced<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>UC Santa Cruz @ NorCalDB Day<\/strong>. <a href=\"https:\/\/people.ucsc.edu\/~palvaro\/\" target=\"_blank\" rel=\"noreferrer noopener\">Peter Alvaro<\/a>, UC Santa Cruz<\/li>\n<!-- \/wp:list-item --><\/ul>\n<!-- \/wp:list -->\n\n<!-- wp:heading {\"level\":3} -->\n<h3 class=\"wp-block-heading\" id=\"panel-discussion-db-and-ai\">Panel Discussion: DB and AI<\/h3>\n<!-- \/wp:heading -->\n\n<!-- wp:heading {\"level\":4} -->\n<h4 class=\"wp-block-heading\" id=\"moderator\">Moderator:<\/h4>\n<!-- \/wp:heading -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":938778,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Fatma-Ozcan.png\" alt=\"Fatma Ozcan\" class=\"wp-image-938778\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Fatma \u00d6zcan<\/strong>, <strong>Google<\/strong><br>Fatma \u00d6zcan is a Principal Engineer at Systems Research@Google. Before that, she was a Distinguished Research Staff Member and a senior manager at IBM Almaden Research Center. Her current research focuses on platforms and infrastructure for large-scale data analysis, machine learning for databases, and democratizing analytics via NLQ and conversational interfaces to data. Dr. \u00d6zcan received her PhD in computer science from the University of Maryland, College Park, and her BSc in computer engineering from METU, Ankara. 
She has over 21 years of experience in industrial research, and has delivered core technologies into various IBM products. She has been a contributor to various SQL standards, including SQL\/XML, SQL\/JSON and SQL\/PTF. She is the co-author of the book \"Heterogeneous Agent Systems\", and co-author of several conference papers and patents. She received the VLDB Women in Database Research Award in 2022. She is an ACM Distinguished Member, and the vice chair of ACM SIGMOD. She has served on the board of trustees for the VLDB Endowment (2016-2022), and on the board of directors of CRA (2020-2023).<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading {\"level\":4} -->\n<h4 class=\"wp-block-heading\" id=\"panelists\">Panelists:<\/h4>\n<!-- \/wp:heading -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":938580,\"width\":200,\"height\":200,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Image-zoomed.jpg\" alt=\"Dipti Borkar\" class=\"wp-image-938580\" width=\"200\" height=\"200\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Dipti Borkar, Microsoft<\/strong><br>Dipti is a senior technology executive and entrepreneur with over 18 years of experience in cloud, open source and distributed data\/database technologies. She is Vice President &amp; General Manager at Microsoft, where she is responsible for SaaS App Development, Strategic ISVs and Azure Databricks. She founded Ahana (acquired by IBM in 2023), where she created a cloud managed service for SQL on data lakes and served as Chief Product Officer and Vice President of Cloud &amp; open-source engineering. She also served as the Chairperson of Presto Foundation, Community team.<br>Prior to Ahana, Dipti held VP roles at Alluxio, Kinetica, and Couchbase. 
At Alluxio, she was Vice President of Products, and at Couchbase she held several leadership positions, including VP, Product Management &amp; Head of Global Solution Engineering. Earlier in her career, Dipti managed development teams at IBM DB2 Distributed, where she started as a database software engineer. Dipti holds an M.S. in Computer Science from UC San Diego, and an MBA from the Haas School of Business at UC Berkeley.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":940002,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Idan-Gazit-Headshot.jpg\" alt=\"Idan Gazit\" class=\"wp-image-940002\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Idan Gazit, GitHub<br><\/strong>Idan is a Senior Director of Research at GitHub Next. He is a hybrid designer-developer, and can usually be found geeking out about the Web, data visualization, typography, and color. Prior to GitHub, he led the Data UX team at Heroku, which built the human interfaces to Heroku's Postgres, Redis, and Kafka datastores. 
He lives in the East Bay with his family and surrounds himself with a rotating cast of half-finished projects.<br><br><br><br><br><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":938616,\"width\":200,\"height\":200,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/jure_leskovec-scaled.jpg\" alt=\"Jure Leskovec\" class=\"wp-image-938616\" width=\"200\" height=\"200\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Jure Leskovec, Stanford University<br><\/strong><a href=\"http:\/\/cs.stanford.edu\/~jure\" target=\"_blank\" rel=\"noreferrer noopener\">Jure&nbsp;Leskovec<\/a> is Professor of Computer Science at Stanford University, and a co-Founder of Stanford Data Science Initiative. He co-founded several machine learning start-ups and spent 6 years as Chief Scientist at Pinterest building AI systems. Leskovec pioneered the field of Graph Neural Networks and has successfully deployed them across many industrial use cases. Leskovec also co-authored PyG, the most widely-used graph neural network library. Leskovec's research area is machine learning and data science for complex, richly-labeled relational structures, graphs, and networks for systems at all scales, from interactions of proteins in a cell to interactions between humans in a society. Applications include commonsense reasoning, recommender systems, social network analysis, computational social science, and computational biology with an emphasis on drug discovery. This research has won several awards including a Lagrange Prize, Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, and numerous best paper and test of time awards. It has also been featured in popular press outlets such as the New York Times and the Wall Street Journal. 
Leskovec received his bachelor's degree in computer science from the University of Ljubljana, Slovenia, his PhD in machine learning from Carnegie Mellon University, and postdoctoral training at Cornell University.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":939885,\"width\":200,\"height\":200,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/1612010808496.jpg\" alt=\"Edo Liberty\" class=\"wp-image-939885\" width=\"200\" height=\"200\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Edo Liberty, Pinecone<br><\/strong>Edo Liberty is the Founder and CEO of Pinecone, the managed database for large-scale vector search.<br>Until April 2019, Edo was a Director of Research at AWS and Head of Amazon AI Labs. The Lab built cutting-edge machine learning algorithms, systems, and services for AWS customers. The team built parts of SageMaker, Kinesis, QuickSight, Amazon ElasticSearch, Glue, Rekognition, DeepRacer, Personalize, Forecast, and other yet-to-be-released services.<br>Before AWS, Edo was a Senior Research Director at Yahoo and Head of Yahoo\u2019s Research Lab in New York. He worked on building horizontal machine learning platforms and improving applications such as online advertising, search, security, media recommendation, email abuse prevention, and many more.<br>Edo received his B.Sc. in Physics and Computer Science from Tel Aviv University and his Ph.D. in Computer Science from Yale University. After that, he was a postdoctoral fellow at Yale in the Program in Applied Mathematics. 
He is the author of more than 75 academic papers and patents about machine learning, systems, and optimization.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:image {\"align\":\"right\",\"id\":938613,\"width\":200,\"height\":200,\"className\":\"is-style-rounded\"} -->\n<figure class=\"wp-block-image alignright is-resized is-style-rounded\"><img src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/aditya_parameswaran_sq-2.jpg\" alt=\"Aditya Parameswaran\" class=\"wp-image-938613\" width=\"200\" height=\"200\" \/><\/figure>\n<!-- \/wp:image -->\n\n<!-- wp:paragraph -->\n<p><strong>Aditya&nbsp;Parameswaran, UC Berkeley<br><\/strong>Aditya&nbsp;Parameswaran is an Associate Professor at&nbsp;<a href=\"https:\/\/www.berkeley.edu\/\" target=\"_blank\" rel=\"noreferrer noopener\">UC Berkeley<\/a>.&nbsp;Aditya&nbsp;co-directs the&nbsp;<a 
href=\"https:\/\/epic.berkeley.edu\/\" target=\"_blank\" rel=\"noreferrer noopener\">EPIC Data Lab<\/a>, a lab targeted at low\/no-code data tooling with a special emphasis on social justice applications.&nbsp;Aditya&nbsp;also serves as the President of&nbsp;<a href=\"https:\/\/ponder.io\/\" target=\"_blank\" rel=\"noreferrer noopener\">Ponder<\/a>, a company he co-founded with his students based on popular data science tools developed at Berkeley.&nbsp;Aditya&nbsp;develops human-centered tools for scalable data science, making it easy for end-users and teams to leverage and make sense of their large and complex datasets. 
His visualization and data exploration tools have been downloaded&nbsp;millions of times.<\/p>\n<!-- \/wp:paragraph -->\n<!-- \/wp:msr\/content-tab -->\n\n<!-- wp:msr\/content-tab {\"title\":\"Posters\"} -->\n<!-- wp:list {\"ordered\":true} -->\n<ol><!-- wp:list-item -->\n<li><strong>AI4Reporters: Helping Reporters Cover the State Legislature.<\/strong> Thomas Gerrity, Foaad Khosmood, Kenny Lau, Alex Dekhtyar, Patrick Howe, Christine Robertson, and Lindsay Grace. Cal Poly<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Conditioned Sketches from Masked Models.<\/strong> Brian Tsan, Asoke Datta, Yesdaulet Izenov, and Florin Rusu. UC Merced<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Data Science Capstone Project: Comparative Analysis of Face Blurring Techniques for PathML.<\/strong> Sarah Keadle, Duncan Appelgarth, Jacob Cavanaugh, and Sarah Ellwein. Cal Poly<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Dissecting BFT Consensus: In Trusted Components we Trust!<\/strong> Suyash Gupta, Sajjad Rahnama, Shubham Pandey, Natacha Crooks, and Mohammad Sadoghi. UC Davis<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Good Plans Despite No Cardinalities?<\/strong> Asoke Datta, Brian Tsan, Yesdaulet Izenov, and Florin Rusu. UC Merced<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Environment-Aware Optimization for Geo-spatial Video Queries Preprocessing.<\/strong> Chanwut (Mick) Kittivorawong, Yongming Ge, Yousef Helal, and Alvin Cheung. UC Berkeley<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Optimizing Distributed Protocols with Query Rewrites.<\/strong> David Chu, Rithvik Panchapakesan, Shadaj Laddad, Chris Liu, Kaushik Shivakumar, Natacha Crooks, Joe Hellerstein, and Heidi Howard. 
UC Berkeley<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Optimizing Stateful Dataflow with E-Graphs.<\/strong> Shadaj Laddad, Tyler Hou, Conor Power, Mae Milano, Alvin Cheung, and Joe Hellerstein. UC Berkeley<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Optimizing Transactional Hit Rate for Web Caches.<\/strong> Audrey Cheng, David Chu, Terrance Li, Jason Chan, Natacha Crooks, Joe Hellerstein, Ion Stoica, and Xiangyao Yu. UC Berkeley<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Practical View-Change-Less Protocol through Rapid View Synchronization.<\/strong> Dakai Kang, Sajjad Rahnama, Jelle Helling, and Mohammad Sadoghi. UC Davis<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Resilient Consensus Sustained Collaboratively.<\/strong> Junchao Chen, Suyash Gupta, Alberto Sonnino, Lefteris Kokoris-Kogias, and Mohammad Sadoghi. UC Davis<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>ResilientDB: Global-Scale Sustainable Blockchain Fabric.<\/strong> Junchao Chen, Dakai Kang, Sajjad Rahnama, Suyash Gupta, Shesha Vishnu Prasad, Jinxiao Yu, Arindaam Roy, Divjeet Singh Jas, Wayne Wang, Julieta Duarte, Glenn Chen, Apratim Shukla, Priyal Soni, Kaustubh Shete, Gopal Nambiar, Tim Huang, Haskell Lark Macaraig, Steve Chen, Jared Givens, Saipranav Kotamreddy, Aditya Bej, and Mohammad Sadoghi. UC Davis<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>SkyPIE: A Fast &amp; Accurate Object Placement Oracle.<\/strong> Tiemo Bang, Chris Douglas, Shadaj Laddad, Alvin Cheung, Natacha Crooks, and Joe Hellerstein. UC Berkeley<\/li>\n<!-- \/wp:list-item -->\n\n<!-- wp:list-item -->\n<li><strong>Sub-optimal Join Order Classification by L1-error.<\/strong> Yesdaulet Izenov, Asoke Datta, Brian Tsan, and Florin Rusu. 
UC Merced<\/li>\n<!-- \/wp:list-item --><\/ol>\n<!-- \/wp:list -->\n<!-- \/wp:msr\/content-tab -->\n<!-- \/wp:msr\/content-tabs -->","tab-content":[],"msr_startdate":"2023-05-11","msr_enddate":"2023-05-11","msr_event_time":"","msr_location":"Mountain View, CA","msr_event_link":"","msr_event_recording_link":"","msr_startdate_formatted":"May 11, 2023","msr_register_text":"Watch now","msr_cta_link":"","msr_cta_text":"","msr_cta_bi_name":"","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Silicon-Valley-campus_header_1920x720-960x540.jpg\" class=\"img-object-cover\" alt=\"overhead view of Microsoft Silicon Valley campus\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Silicon-Valley-campus_header_1920x720-960x540.jpg 960w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Silicon-Valley-campus_header_1920x720-1066x600.jpg 1066w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Silicon-Valley-campus_header_1920x720-655x368.jpg 655w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Silicon-Valley-campus_header_1920x720-343x193.jpg 343w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Silicon-Valley-campus_header_1920x720-640x360.jpg 640w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/04\/Silicon-Valley-campus_header_1920x720-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","event_excerpt":"NorCalDB Day is a single-day, workshop-style event where participants from academia and industry in Northern California meet to present ideas and discuss their research and experiences. In 2023, NorCalDB Day will be held at the Microsoft Silicon Valley Campus in Mountain View, on Thursday May 11, 2023. 
8:00 - 9:00 AMRegistration and Light Breakfast9:00 - 9:15 AMIntroduction and Logistics9:15 - 10:00 AMKeynote: Benchmarking and Tuning Log-Structured Table Formats (slides (opens in new tab))Raghu Ramakrishnan, Microsoft10:00&hellip;","msr_research_lab":[],"related-researchers":[{"type":"user_nicename","display_name":"Jes\u00fas Camacho Rodr\u00edguez","user_id":40693,"people_section":"Related people","alias":"jesusca"},{"type":"user_nicename","display_name":"Subru Krishnan","user_id":33746,"people_section":"Related people","alias":"subru"},{"type":"user_nicename","display_name":"Avrilia Floratou","user_id":36080,"people_section":"Related people","alias":"avflor"},{"type":"user_nicename","display_name":"Yuanyuan Tian","user_id":40708,"people_section":"Related people","alias":"yuanyuantian"}],"msr_impact_theme":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-opportunities":[],"related-publications":[],"related-videos":[],"related-posts":[],"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event\/933432","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-event"}],"version-history":[{"count":61,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event\/933432\/revisions"}],"predecessor-version":[{"id":962436,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event\/933432\/revisions\/962436"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media\/933447"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=933432"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/
wp\/v2\/research-area?post=933432"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=933432"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=933432"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=933432"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=933432"},{"taxonomy":"msr-program-audience","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-program-audience?post=933432"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=933432"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=933432"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}