Datacenters have used the "monolithic" server model for decades, where each server hosts a set of hardware devices like CPU and DRAM on a motherboard and runs an OS on top to manage the hardware resources. This monolithic server model fundamentally restricts datacenters from achieving efficient resource packing, hardware rightsizing, and greater heterogeneity. Recent hardware and application trends such as serverless computing further call for a rethinking of the long-standing server-centric model. My answer is to "disaggregate" monolithic servers into network-attached hardware components that host different hardware resources and offer different functionalities (e.g., a processor component for computation, a memory component for fast data accesses). I believe that after evolving from physical (DC-1.0) to virtual (DC-2.0), datacenters should evolve further into a disaggregated form (DC-3.0), where hardware resources can be allocated and scaled to the exact amount that applications use and can be individually managed and customized for different application needs. By not having servers, DC-3.0 disrupts designs and technologies in almost every layer of today's datacenters, from hardware and networking to OS and applications. My lab undertook pioneering efforts in building an end-to-end solution for DC-3.0 with a new OS, a new hardware platform, and a new network system.
This talk will focus on two systems that are central to the design of DC-3.0: 1) LegoOS, a new distributed operating system designed for managing disaggregated resources. LegoOS splits OS functionalities into different units, each running at a hardware component and managing that component's hardware resources. LegoOS enables the disaggregation and customization of OS functionalities, a significant step towards building DC-3.0's software infrastructure. 2) LegoFPGA, a new approach to using FPGAs to efficiently manage and virtualize hardware resources. LegoFPGA offers a way to co-design application, OS, and hardware functionalities and customize them for different hardware resources and application domains, an important step towards building DC-3.0's hardware infrastructure. With LegoOS and LegoFPGA, we demonstrate that separating core OS and hardware functionalities is not only feasible but can also substantially improve performance per dollar over the current monolithic server model in datacenters.
Bio:
Yiying Zhang is an assistant professor in the School of Electrical and Computer Engineering at Purdue University. Her research interests span operating systems, distributed systems, computer architecture, and datacenter networking. She also works at the intersection of systems and programming languages, security, and AI/ML. She won an OSDI best paper award in 2018 and an NSF CAREER award in 2019. Yiying’s lab is among the few groups in the world that currently build new OSes and full-stack, cross-layer systems. Yiying received her Ph.D. from the Department of Computer Sciences at the University of Wisconsin-Madison under the supervision of Andrea and Remzi Arpaci-Dusseau and worked as a postdoctoral scholar at the University of California, San Diego before joining Purdue.
To request accommodations for a disability, please contact Emily Lawrence at emilyl@cs.princeton.edu at least one week prior to the event.