required papers must be read and summarized before each class.
repeat papers were read for a previous class and do not need to be resummarized.
supplemental papers will be read, summarized, and presented to the class by one student; you are encouraged to do a first pass (skim) of the paper.
reference papers are useful if you want more information on a topic. You do not need to read them.
Submit reviews via the HotCRP site. All papers are also available from this site. You will receive an email at your USC email address adding you as a "PC member" for this site.
The Maintenance of Duplicate Databases. reference
Paul R. Johnson and Robert H. Thomas.
IETF RFC #677, 1975.
How to Read a Paper. required (no summary)
Srinivasan Keshav.
CCR, 2007.
MapReduce: Simplified Data Processing on Large Clusters. required (no summary)
Jeffrey Dean and Sanjay Ghemawat.
OSDI, 2004.
Implementing Remote Procedure Call. reference
Andrew D. Birrell and Bruce Jay Nelson.
ACM TOCS, 1984.
Time, Clocks, and the Ordering of Events in a Distributed System. required
Leslie Lamport.
Communications of the ACM, 1978.
Defining Liveness. reference
Bowen Alpern and Fred B. Schneider.
Information Processing Letters, 1985.
Implementing Fault-Tolerant Services Using the State Machine Approach: a Tutorial. required
Fred B. Schneider.
ACM Computing Surveys, 1990.
Time, Clocks, and the Ordering of Events in a Distributed System. repeat
Leslie Lamport.
Communications of the ACM, 1978.
All about Eve: Execute-Verify Replication for Multi-Core Servers. supplemental
Manos Kapritsos, Yang Wang, Vivien Quema, Allen Clement, Lorenzo Alvisi, and Mike Dahlin.
OSDI, 2012.
Chain Replication for Supporting High Throughput and Availability. required
Robbert van Renesse and Fred B. Schneider.
OSDI, 2004.
Object Storage on CRAQ: High-Throughput Chain Replication for Read-Mostly Workloads. supplemental
Jeff Terrace and Michael J. Freedman.
USENIX Annual Technical Conference (ATC), 2009.
Detecting Failures in Distributed Systems with the Falcon Spy Network. supplemental
Joshua B. Leners, Hao Wu, Wei-Lun Hung, Marcos K. Aguilera, and Michael Walfish.
SOSP, 2011.
Impossibility of Distributed Consensus with One Faulty Process. (FLP) required
Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson.
Journal of the ACM, 1985.
On the minimal synchronism needed for distributed consensus. supplemental
Danny Dolev, Cynthia Dwork, and Larry Stockmeyer.
Journal of the ACM, 1987.
Consensus in the presence of partial synchrony. supplemental
Cynthia Dwork, Nancy A. Lynch, and Larry Stockmeyer.
Journal of the ACM, 1988.
Revisiting the relationship between non-blocking atomic commitment and consensus. supplemental
Rachid Guerraoui.
Distributed Algorithms, 1995.
Unreliable Failure Detectors for Reliable Distributed Systems. supplemental
Tushar Deepak Chandra and Sam Toueg.
Journal of the ACM, 1996.
Heartbeat: A timeout-free failure detector for quiescent reliable communication. supplemental
Marcos Aguilera, Wei Chen, and Sam Toueg.
Distributed Algorithms, 1997.
A Formal Model of Crash Recovery in a Distributed System. (3PC) required
Dale Skeen and Michael Stonebraker.
IEEE Trans. Software Engineering, 1983.
The Part-Time Parliament. required (joint summary)
Leslie Lamport.
ACM TOCS, 1998.
Paxos Made Simple. required (joint summary)
Leslie Lamport.
ACM SIGACT News, 2001.
Viewstamped Replication Revisited. supplemental
Barbara Liskov and James Cowling.
MIT Tech Report, 2012.
(Originally described in PODC 1988 as Viewstamped Replication... by Oki and Liskov)
In Search of an Understandable Consensus Algorithm. (RAFT) supplemental
Diego Ongaro and John Ousterhout.
USENIX ATC, 2014.
Life under the Lens. strongly encouraged
Christos H. Papadimitriou.
4pm in SAL 101.
There is More Consensus in Egalitarian Parliaments. (EPaxos) required
Iulian Moraru, David G. Andersen, and Michael Kaminsky.
SOSP, 2013.
Designing Distributed Systems Using Approximate Synchrony in Data Center Networks. (Speculative Paxos) supplemental
Dan R. K. Ports, Jialin Li, Vincent Liu, Naveen Kr. Sharma, and Arvind Krishnamurthy.
NSDI, 2015.
Mencius: building efficient replicated state machines for WANs. supplemental
Y. Mao, F. P. Junqueira, and K. Marzullo.
OSDI, 2008.
Practical Byzantine Fault Tolerance. (PBFT) required
Miguel Castro and Barbara Liskov.
OSDI, 1999.
Zyzzyva: Speculative Byzantine fault tolerance. supplemental
Rama Kotla, Lorenzo Alvisi, Mike Dahlin, Allen Clement, and Edmund Wong.
SOSP, 2007.
Prophecy: Using History for High-Throughput Fault Tolerance. supplemental
Siddhartha Sen, Wyatt Lloyd, and Michael J. Freedman.
NSDI, 2010.
The Next 700 BFT Protocols. supplemental
R. Guerraoui, N. Knezevic, V. Quema, and M. Vukolic.
EuroSys, 2010.
Making Byzantine fault tolerant systems tolerate Byzantine faults. (Aardvark) supplemental
Allen Clement, Marco Marchetti, Edmund Wong, Lorenzo Alvisi, and Mike Dahlin.
NSDI, 2009.
In class.
Exam review.
Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications. required
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan.
SIGCOMM, 2001.
A Scalable Content-Addressable Network. (CAN) supplemental
Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Shenker.
SIGCOMM, 2001.
Kademlia: A Peer-to-Peer Information System Based on the Xor Metric. supplemental
Petar Maymounkov and David Mazieres.
IPTPS, 2002.
Canon in G major: designing DHTs with hierarchical structure. supplemental
Prasanna Ganesan, Krishna Gummadi, and Hector Garcia-Molina.
ICDCS, 2004.
Democratizing content publication with Coral. supplemental
Michael J. Freedman, Eric Freudenthal, and David Mazieres.
NSDI, 2004.
Linearizability: A Correctness Condition for Concurrent Objects. required
M. P. Herlihy and J. M. Wing.
ACM TOPLAS, 1990.
How to make a multiprocessor computer that correctly executes multiprocess programs. (Sequential Consistency) required (no summary)
Leslie Lamport.
IEEE Trans. Computers, 1979.
Sequential consistency versus linearizability. supplemental
H. Attiya and J. L. Welch.
ACM TOCS, 1994.
Dynamo: Amazon's Highly Available Key-Value Store. required
Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman,
Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels.
SOSP, 2007.
Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. (CAP proof) required (no summary)
Seth Gilbert and Nancy Lynch.
ACM SIGACT News, 2002.
PNUTS: Yahoo!’s hosted data serving platform. supplemental
B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver, and R. Yerneni.
VLDB, 2008.
Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary. supplemental
Cheng Li, Daniel Porto, Allen Clement, Johannes Gehrke, Nuno M. Preguiça, and Rodrigo Rodrigues.
OSDI, 2012.
Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS. required
Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, and David G. Andersen.
SOSP, 2011.
Flexible Update Propagation for Weakly Consistent Replication. (Bayou) supplemental
K. Petersen, M. Spreitzer, D. Terry, M. Theimer, and A. Demers.
SOSP, 1997.
PRACTI Replication. supplemental
Mike Dahlin, Lei Gao, Amol Nayate, Arun Venkataramana, Praveen Yalagandula, and Jiandan Zheng.
NSDI, 2006.
Stronger Semantics for Low-Latency Geo-Replicated Storage. (Eiger) supplemental
Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, and David G. Andersen.
NSDI, 2013.
Spanner: Google's Globally Distributed Database. required
J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. Furman, S. Ghemawat, A. Gubarev, et al.
OSDI, 2012.
ACM TOCS, 2013.
Transaction Management in the R* Distributed Database Management System. supplemental
C. Mohan, B. Lindsay, and R. Obermarck.
ACM TODS, 1986.
Transactional Storage for Geo-Replicated Systems. (Walter) supplemental
Yair Sovran, Russell Power, Marcos K. Aguilera, and Jinyang Li.
SOSP, 2011.
Extracting More Concurrency from Distributed Transactions. (Rococo) required
Shuai Mu, Yang Cui, Yang Zhang, Wyatt Lloyd, and Jinyang Li.
OSDI, 2014.
Salt: Combining ACID and BASE in a Distributed Database. supplemental
Chao Xie, Chunzhi Su, Manos Kapritsos, Yang Wang, Navid Yaghmazadeh, Lorenzo Alvisi, and Prince Mahajan.
OSDI, 2014.
Calvin: Fast Distributed Transactions for Partitioned Database Systems. supplemental
A. Thomson, T. Diamond, S.-C. Weng, K. Ren, P. Shao, and D. J. Abadi.
SIGMOD, 2012.
Scalable Atomic Visibility with RAMP Transactions. supplemental
Peter Bailis, Alan Fekete, Joseph M. Hellerstein, Ali Ghodsi, and Ion Stoica.
SIGMOD, 2014.
In the likely event we fall behind, another class.
You choose your required paper for today. It must be a new paper for you, i.e., it is not a repeat of a paper you've summarized in this or a previous class.
The Google File System. required
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung.
SOSP, 2003.
MapReduce: Simplified Data Processing on Large Clusters. repeat
Jeffrey Dean and Sanjay Ghemawat.
OSDI, 2004.
The Chubby Lock Service for Loosely-Coupled Distributed Systems. required
Mike Burrows.
OSDI, 2006.
Bigtable: A Distributed Storage System for Structured Data. required
F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber.
OSDI, 2006.
ACM TOCS, 2008.
Large-scale Incremental Processing Using Distributed Transactions and Notifications. (Percolator) required
Daniel Peng and Frank Dabek.
OSDI, 2010.
Thialfi: A Client Notification Service for Internet-Scale Applications. required
Atul Adya, Gregory Cooper, Daniel Myers, and Michael Piatek.
SOSP, 2011.
F1 - The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business. required
Jeff Shute, Mircea Oancea, Stephan Ellner, Ben Handy, Eric Rollins, Bart Samwel, Radek Vingralek, Chad Whipkey, Xin Chen, Beat Jegerlehner, Kyle Littlefield, and Phoenix Tong.
VLDB, 2013.
Spanner: Google's Globally Distributed Database. repeat
J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. Furman, S. Ghemawat, A. Gubarev, et al.
OSDI, 2012.
ACM TOCS, 2013.
You choose your required paper for today. It must be a new paper for you, i.e., it is not a repeat of a paper you've summarized in this or a previous class.
Scaling Memcache at Facebook. required
R. Nishtala, H. Fugal, S. Grimm, M. Kwiatkowski, H. Lee, H. C. Li, R. McElroy,
M. Paleczny, D. Peek, P. Saab, D. Stafford, T. Tung, and V. Venkataramani.
NSDI, 2013.
TAO: Facebook’s Distributed Data Store for the Social Graph. required
N. Bronson, Z. Amsden, G. Cabrera, P. Chakka, P. Dimov, H. Ding, J. Ferris, A. Giardullo,
S. Kulkarni, H. Li, M. Marchukov, D. Petrov, L. Puzar, Y. J. Song, and V. Venkataramani.
USENIX ATC, 2013.
An Analysis of Facebook Photo Caching. required
Q. Huang, K. Birman, R. van Renesse, W. Lloyd, S. Kumar, and H. C. Li.
SOSP, 2013.
f4: Facebook’s Warm BLOB Storage System. required
Subramanian Muralidhar, Wyatt Lloyd, Sabyasachi Roy, Cory Hill, Ernest Lin, Weiwen Liu,
Satadru Pan, Shiva Shankar, Viswanath Sivakumar, Linpeng Tang, and Sanjeev Kumar.
OSDI, 2014.
RIPQ: Advanced Photo Caching on Flash for Facebook. required
Linpeng Tang, Qi Huang, Wyatt Lloyd, Sanjeev Kumar, and Kai Li.
FAST, 2015.
Wormhole: Reliable Pub-Sub to Support Geo-replicated Internet Services. required
Yogeshwer Sharma, Philippe Ajoux, Petchean Ang, David Callies, Abhishek Choudhary, Laurent Demailly, et al.
NSDI, 2015.
You choose your required paper for today. It must be a new paper for you, i.e., it is not a repeat of a paper you've summarized in this or a previous class.
SILT: a Memory-Efficient, High-Performance Key-Value Store. required
Hyeontaek Lim, Bin Fan, David G. Andersen, and Michael Kaminsky.
SOSP, 2011.
Cache Craftiness for Fast Multicore Key-Value Storage. (MassTree) required
Yandong Mao, Eddie Kohler, and Robert T. Morris.
EuroSys, 2012.
MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing. required
Bin Fan, David G. Andersen, and Michael Kaminsky.
NSDI, 2013.
MICA: a Holistic Approach to Fast In-Memory Key-Value Storage. required
Hyeontaek Lim, Dongsu Han, David G. Andersen, and Michael Kaminsky.
NSDI, 2014.
Phase Reconciliation for Contended In-Memory Transactions. (Doppel) required
Neha Narula, Cody Cutler, Eddie Kohler, and Robert Morris.
OSDI, 2014.
In class.