OSDI '21 - HotCRP.com Furthermore, such performance can be achieved without any modification in applications, network hardware, kernel CPU schedulers and/or kernel network stack. KEVIN combines a fast, lightweight, and POSIX-compliant file system with a key-value storage device that performs in-storage indexing. Finding the inductive invariant of the distributed protocol is a critical step in verifying the correctness of distributed systems, but takes a long time to do even for simple protocols. We built an FPGA prototype of the nanoPU fast path by modifying an open-source RISC-V CPU, and evaluated its performance using cycle-accurate simulations on AWS FPGAs. The NAL eliminates remote PM accesses to hot items without inducing extra local PM accesses. USENIX discourages program co-chairs from submitting papers to the conferences they organize, although they are allowed to do so. One important reason for the high cost is, as we observe in this paper, that many sanitizer checks are redundant: the same safety property is repeatedly checked, leading to unnecessarily wasted computing resources. This is the first OSDI in an odd year as OSDI moves to a yearly cadence. A graph embedding is a fixed-length vector representation for each node (and/or edge-type) in a graph and has emerged as the de facto approach to apply modern machine learning on graphs. Camera-ready submission (all accepted papers): 15 March 2022. Lukas Burkhalter, Nicolas Küchler, Alexander Viand, Hossein Shafagh, and Anwar Hithnawi, ETH Zürich. Second, it innovates on the underlying cryptographic machinery and constructs a new private information retrieval scheme, FastPIR, that reduces the time to process oblivious access requests for mailboxes. In 2023 I started another two-year term on the . 
Timothy Roscoe is a Full Professor in the Systems Group of the Computer Science Department at ETH Zurich, where he works on operating systems, networks, and distributed systems, and is currently head of department. Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%. MAGE outperforms the OS virtual memory system by up to an order of magnitude, and in many cases, runs SC computations that do not fit in memory at nearly the same speed as if the underlying machines had unbounded physical memory to fit the entire computation. We present NrOS, a new OS kernel with a safer approach to synchronization that runs many POSIX programs. We develop rigorous theoretical foundations to simplify equivalence examination and correction for partially equivalent transformations, and design an efficient search algorithm to quickly discover highly optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels. We also welcome work that explores the interface to related areas such as computer architecture, networking, programming languages, analytics, and databases. Ankit Bhardwaj and Chinmay Kulkarni, University of Utah; Reto Achermann, University of British Columbia; Irina Calciu, VMware Research; Sanidhya Kashyap, EPFL; Ryan Stutsman, University of Utah; Amy Tai and Gerd Zellweger, VMware Research. Editor in charge: Daniel Petrolia . Kernel code requires manual memory management and type-unsafe code and must efficiently handle complex, asynchronous events. We implement and evaluate a suite of applications, including MICA, Raft and Set Algebra for document retrieval; and we demonstrate that the nanoPU can be used as a high performance, programmable alternative for one-sided RDMA operations. This yielded 6% fewer TLB miss stalls, and 26% reduction in memory wasted due to fragmentation. 
We observe that, due to their intended security guarantees, SC schemes are inherently oblivious: their memory access patterns are independent of the input data. The co-chairs may then share that paper with the workshop's organizers and discuss it with them. For instance, the following are not sufficient grounds to specify a conflict with a PC member: they have reviewed the work before, they are employed by your competitor, they are your personal friend, they were your post-doc advisor or advisee, or they had the same advisor as you. Petuum Awarded OSDI 2021 Best Paper for Goodput-Optimized Deep Learning. Mothy received a PhD in 1995 from the Computer Laboratory of the University of Cambridge, where he was a principal designer and builder of the Nemesis OS. All papers will be available online to registered attendees before the conference. In addition, increasing CPU core counts further complicate kernel development. SOSP 2021 - Symposium on Operating Systems Principles. For example, traditional compute resources are replenishable while privacy is not: a CPU can be regained after a model finishes execution while privacy budget cannot. Haojie Wang, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, and Liyan Zheng, Tsinghua University; Yuanzhi Li, Carnegie Mellon University; Kaiyuan Rong and Yuanyong Chen, Tsinghua University; Zhihao Jia, Carnegie Mellon University and Facebook. PC members are not required to read supplementary material when reviewing the paper, so each paper should stand alone without it. A glance at this year's OSDI program shows that Operating Systems are a small niche topic for this conference, not even meriting their own full session. DMon's targeted optimizations provide 16.83% speedup on average (up to 53.14%), compared to a baseline that uses the highest level of compiler optimization. 
To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. All submissions will be treated as confidential prior to publication on the USENIX OSDI '21 website; rejected submissions will be permanently treated as confidential. We present selective profiling, a technique that locates data locality problems with overhead low enough to be suitable for production use. Her robot soccer teams have been RoboCup world champions several times, and the CoBot mobile robots have autonomously navigated for more than 1,000 km in university buildings. To this end, we propose GNNAdvisor, an adaptive and efficient runtime system to accelerate various GNN workloads on GPU platforms. With an aim to improve time-to-accuracy performance in model training, Oort prioritizes the use of those clients who have both data that offers the greatest utility in improving model accuracy and the capability to run training quickly. This distinction forces a re-design of the scheduler. Based on the observation that invariants are often concise in practice, DistAI starts with small invariant formulas and enumerates all strongest possible invariants that hold for all samples. Please identify yourself as a presenter and include your mailing address in your email. One classical approach is to increase the efficiency of an allocator to minimize the cycles spent in the allocator code. When uploading your OSDI 2021 reviews for your submission to SOSP, you can optionally append a note about how you addressed the reviews and comments. Reviews will be available for response on Wednesday, March 3, 2021. See the USENIX Conference Submissions Policy for details. The abstractions we design for the privacy resource mirror those defined by Kubernetes for traditional resources, but there are also major differences. 
A hardware-accelerated thread scheduler makes sub-nanosecond decisions, leading to high CPU utilization and low tail response time for RPCs. Upon these two primitives, our system can scale to thousands of concurrent enclaves with high resource utilization and eliminate the high-cost initialization of secure memory using fork-style enclave creation without weakening the security guarantees. Copyright to the individual works is retained by the author[s]. Machine learning (ML) models trained on personal data have been shown to leak information about users. People often assume that blockchain has Byzantine robustness, so adding it to any system will make that system super robust against any calamity. For example, talks may be shorter than in prior years, or some parts of the conference may be multi-tracked. Mothy's current research centers on Enzian, a powerful hybrid CPU/FPGA machine designed for research into systems software. The chairs will review paper conflicts to ensure the integrity of the reviewing process, adding or removing conflicts if necessary. All deadline times are 23:59 hrs UTC. Writing a correct operating system kernel is notoriously hard. We present the nanoPU, a new NIC-CPU co-design to accelerate an increasingly pervasive class of datacenter applications: those that utilize many small Remote Procedure Calls (RPCs) with very short (µs-scale) processing times. His work has included the Barrelfish multikernel research OS, as well as work on distributed stream processors, and using formal specifications to describe the hardware/software interfaces of modern computer systems. 
Commonly used log archival and compression tools like Gzip provide high compression ratio, yet searching archived logs is a slow and painful process as it first requires decompressing the logs. She has a PhD in computer science from MIT. If your accepted paper should not be published prior to the event, please notify production@usenix.org. Researchers from the Software Systems Laboratory bagged a Best Paper Award at the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2021). Qing Wang, Youyou Lu, Junru Li, and Jiwu Shu, Tsinghua University. First, it enables a caller to push a message to a callee in two hops, using a new way of assigning mailboxes to users that resembles how a post office assigns PO boxes to its customers. He joined Intel Research at Berkeley in April 2002 as a principal architect of PlanetLab, an open, shared platform for developing and deploying planetary-scale services. Our evaluation shows that PET outperforms existing systems by up to 2.5×, by unlocking previously missed opportunities from partially equivalent transformations. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14-16, 2021. Marius is open-sourced at www.marius-project.org. Contact your program co-chairs, osdi21chairs@usenix.org, or the USENIX office, submissionspolicy@usenix.org. Attaching supplementary material is optional; if your paper says that you have source code or formal proofs, you need not attach them to convince the PC of their existence. We build Polyjuice based on our learning framework and evaluate it against several existing algorithms. We implement DeSearch for two existing decentralized services that handle over 80 million records and 240 GB of data, and show that DeSearch can scale horizontally with the number of workers and can process 128 million search queries per day. 
The hybrid segment recycling chooses a proper block reclaiming policy between segment compaction and threaded logging based on their costs. In some cases, the quality of these artifacts is as important as that of the document itself. In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, and Yufei Ding, University of California, Santa Barbara. If the conference registration fee will pose a hardship for the presenter of the accepted paper, please contact conference@usenix.org. Lifting predicates and crash framing make the specification easy to use for developers, and logically atomic crash specifications allow for modular reasoning in GoJournal, making the proof tractable despite complex concurrency and crash interleavings. Session Chairs: Sebastian Angel, University of Pennsylvania, and Malte Schwarzkopf, Brown University. Ishtiyaque Ahmad, Yuntian Yang, Divyakant Agrawal, Amr El Abbadi, and Trinabh Gupta, University of California, Santa Barbara. Memory allocation represents significant compute cost at the warehouse scale and its optimization can yield considerable cost savings. We have implemented a prototype of our design based on Penglai, an open-sourced enclave system for RISC-V. Sam Kumar, David E. Culler, and Raluca Ada Popa, University of California, Berkeley. Moreover, to handle dynamic workloads, Nap adopts a fast NAL switch mechanism. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. She is the recipient of several best paper awards, the Einstein Chair of the Chinese Academy of Science, the ACM/SIGART Autonomous Agents Research Award, an NSF Career Award, and the Allen Newell Medal for Excellence in Research. 
The key insight guiding our design is computation separation. Concretely, Dorylus is 1.22× faster and 4.83× cheaper than GPU servers for massive sparse graphs. A PC member is a conflict if any of the following three circumstances applies: Institution: You are currently employed at the same institution, have been previously employed at the same institution within the past two years (not counting concluded internships), or are going to begin employment at the same institution during the review period. This fast path contains programmable hardware support for low latency transport and congestion control as well as hardware support for efficient load balancing of RPCs to cores. Advisor: You have a past or present association as thesis advisor or advisee. Storm ensures security using a Security Typed ORM that refines the (type) abstractions of each layer of the MVC API with logical assertions that describe the data produced and consumed by the underlying operation and the users allowed access to that data. Performance experiments show that GoNFS provides similar performance (e.g., at least 90% throughput across several benchmarks on an NVMe disk) to Linux's NFS server exporting an ext4 file system, suggesting that GoJournal is a competitive journaling system. Sponsored by USENIX in cooperation with ACM SIGOPS. In addition, CLP outperforms Elasticsearch and Splunk Enterprise's log ingestion performance by over 13x, and we show CLP scales to petabytes of logs. We also verified a simple NFS server using GoJournal's specs, which confirms that they are helpful for application verification: a significant part of the proof doesn't have to consider concurrency and crashes. Although SSDs can be simplified under the current ZNS interface, its counterpart LFS must bear segment compaction overhead. The ZNS+ also allows each zone to be overwritten with sparse sequential write requests, which enables the LFS to use threaded logging-based block reclamation instead of segment compaction. 
(Jan 2019) Our REPT paper won a best paper award at OSDI'18. (Oct 2018) I will serve on the SOSP'19 PC. We present application studies for 8 applications, improving requests-per-second (RPS) by 7.7% and reducing RAM usage by 2.4%. Additionally, there is no assurance that data processing and handling comply with the claimed privacy policies. These results outperform state-of-the-art HTAP systems by several orders of magnitude on transactional performance, while incurring little performance slowdown (5% over pure OLTP workloads) and still enjoying data freshness for analytical queries (less than 20 ms of maximum delay) in the failure-free case. Calibrated interrupts increase throughput by up to 35%, reduce CPU consumption by as much as 30%, and achieve up to 37% lower latency when interrupts are coalesced. She has been recognized with many industry honors including induction into the National Academy of Engineering, the Inventor Hall of Fame, The Internet Hall of Fame, Washington State Academy of Science, and lifetime achievement awards from USENIX and SIGCOMM. For general conference information, see https://www.usenix.org/conference/osdi22. When registering your abstract, you must provide information about conflicts with PC members. Academic and industrial participants present research and experience papers that cover the full range of theory . Consensus bugs are extremely rare but can be exploited for network split and theft, which cause reliability and security-critical issues in the Ethereum ecosystem. 
Only two types of supplementary material are permitted: source code described in the paper and formal proofs sketched in the paper. Hence, CLP enables efficient search and analytics on archived logs, something that was impossible without it. Professor Veloso is the Past President of AAAI (the Association for the Advancement of Artificial Intelligence), and the co-founder, Trustee, and Past President of RoboCup. High-performance tensor programs are critical for efficiently deploying deep neural network (DNN) models in real-world tasks. We also propose two file system techniques for ZNS+-aware LFS. If you are uncertain about how to anonymize your submission, please contact the program co-chairs, osdi21chairs@usenix.org, well in advance of the submission deadline. Starting with small invariant formulas and strongest possible invariants avoids large SMT queries, improving SMT solver performance. However, memory allocation decisions also impact overall application performance via data placement, offering opportunities to improve fleetwide productivity by completing more units of application work using fewer hardware resources. This paper presents the design and implementation of CLP, a tool capable of losslessly compressing unstructured text logs while enabling fast searches directly on the compressed data. The chairs may reject abstracts or papers on the basis of egregious missing or extraneous conflicts. SanRazor adopts a novel hybrid approach: it captures both dynamic code coverage and static data dependencies of checks, and uses the extracted information to perform a redundant check analysis. OSDI will provide an opportunity for authors to respond to reviews prior to final consideration of the papers at the program committee meeting. 
Our evaluation shows that DistAI successfully verifies 13 common distributed protocols automatically and outperforms alternative methods both in the number of protocols it verifies and the speed at which it does so, in some cases by more than two orders of magnitude. As a result, the design of a file system with respect to space management and crash consistency is simplified, requiring only 10.8K LOC for full functionality. Jaehyun Hwang and Midhul Vuppalapati, Cornell University; Simon Peter, UT Austin; Rachit Agarwal, Cornell University. We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators. See the Preview Session page for an overview of the topics covered in the program. Taking place in Carlsbad, CA from 11-13 July, OSDI is a highly selective flagship conference in computer science, especially on the topic of computer systems. For any further information, please contact the PC chairs: pc-chairs-2022@eurosys.org. Responses should be limited to clarifying the submitted work. We observe that scalability challenges in training GNNs are fundamentally different from those in training classical deep neural networks and distributed graph processing, and that commonly used techniques, such as intelligent partitioning of the graph, do not yield desired results. We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. 
OSDI'20: 14th USENIX Conference on Operating Systems Design and Implementation, November 4-6, 2020. ISBN: 978-1-939133-19-9. Published: 04 November 2020. Sponsors: ORACLE, VMware, Google Inc., Amazon, Microsoft. While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. To resolve the problem, we propose a new LFS-aware ZNS interface, called ZNS+, and its implementation, where the host can offload data copy operations to the SSD to accelerate segment compaction. As has been standard practice in OSDI and SOSP in recent years, we will allow authors to submit quick responses to PC reviews: they will be made available to the PC before the final online discussion and PC meeting. The blockchain community considers this hard fork the greatest challenge since the infamous 2016 DAO hack. Fortunately, we observe that the backups for high availability in modern distributed OLTP systems can be retrofitted to bridge the analytical queries and transactions in HTAP workloads. Further, Vegito can recover from cascading machine failures by using the columnar backup in less than 60 ms. We will look at various problems and approaches, and for each, see if blockchain would help. However, your OSDI submission must use an anonymized name for your project or system that differs from any used in such contexts. Zeph enforces privacy policies cryptographically and ensures that data available to third-party applications complies with users' privacy policies. 
Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning
Oort: Efficient Federated Learning via Guided Participant Selection
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
Modernizing File System through In-Storage Indexing
Nap: A Black-Box Approach to NUMA-Aware Persistent Memory Indexes
Rearchitecting Linux Storage Stack for µs Latency and High Throughput
Optimizing Storage Performance with Calibrated Interrupts
ZNS+: Advanced Zoned Namespace Interface for Supporting In-Storage Zone Compaction
DMon: Efficient Detection and Correction of Data Locality Problems Using Selective Profiling
CLP: Efficient and Scalable Search on Compressed Text Logs
Polyjuice: High-Performance Transactions via Learned Concurrency Control
Retrofitting High Availability Mechanism to Tame Hybrid Transaction/Analytical Processing
The nanoPU: A Nanosecond Network Stack for Datacenters
Beyond malloc efficiency to fleet efficiency: a hugepage-aware memory allocator
Scalable Memory Protection in the PENGLAI Enclave
NrOS: Effective Replication and Sharing in an Operating System
Addra: Metadata-private voice communication over fully untrusted infrastructure
Bringing Decentralized Search to Decentralized Services
Finding Consensus Bugs in Ethereum via Multi-transaction Differential Fuzzing
MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation
Zeph: Cryptographic Enforcement of End-to-End Data Privacy
It's Time for Operating Systems to Rediscover Hardware
DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols
GoJournal: a verified, concurrent, crash-safe journaling system
STORM: Refinement Types for Secure Web Applications
Horcrux: Automatic JavaScript Parallelism for Resource-Efficient Web Computation
SANRAZOR: Reducing Redundant Sanitizer Checks in C/C++ Programs
Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads
GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs
Marius: Learning Massive Graph Embeddings on a Single Machine
P3: Distributed Deep Graph Learning at Scale