Attendance Record
At IST, professors are required to keep track of how many students attend each class. To help me with this task, I will register your participation using this link
Exam 1
The Guide
A small guide that aims to help you prepare for the exam.
- Updated 24 October: dida-guide-2425-v2dot0.pdf
Exercises
Some exercises that illustrate the type of questions you can expect in the exam.
- Updated November 4: sample-questions-v4dot2.pdf
A Few Notes Regarding Each Class
- Class-01-2024-v1.pdf
- Class-02-2024-v1.pdf
- Class-03-2024-v1.pdf
- Class-04-2024-v1.pdf
- Class-05-2024-v1.pdf
- Class-06-2024-v2.pdf
- Class-07-2024-v2.pdf
- Class-08-2024-v3.pdf
- Class-09-2024-v2.pdf
- Class-10-2024-v1.pdf
- Class-11-2024-v2.pdf
- Class-12-2024-v1.pdf
- Class-13-2024-v1.pdf
- Class-14-2024-v1.pdf
Bibliography
- Distributed Systems, Concepts and Design, 5th Edition, Coulouris, Dollimore, Kindberg, and Blair, 2012
Additional Reading
Part 1: Consensus: From Synchrony to Asynchrony
1.1: Using consensus / leader-based consensus
- Introduction to Reliable and Secure Distributed Programming. Cachin, Guerraoui, and Rodrigues, Springer 2011
1.2: Paxos
- Introduction to Reliable and Secure Distributed Programming. Cachin, Guerraoui, Rodrigues, Springer 2011
- The Part-Time Parliament, Lamport, ACM Trans. Comput. Syst. Vol 16, N. 2, 1998
1.3: Multi-Paxos
- Paxos Made Simple, Lamport, 2001
- Paxos Made Live: An Engineering Perspective, Chandra, Griesemer, and Redstone, Proceedings of the Twenty-Sixth Annual ACM Symposium on Principles of Distributed Computing, 2007
1.4: Chubby and ZooKeeper
- The Chubby Lock Service for Loosely-Coupled Distributed Systems, Burrows, Proceedings of the 7th Symposium on Operating Systems Design and Implementation, 2006
- ZooKeeper: Wait-free coordination for Internet-scale systems, Hunt, Konar, Junqueira, and Reed, Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference, 2010
- Curator and more on curator
Part 2: Dynamic Reconfiguration
2.1: View Synchrony
- Distributed Systems, Concepts and Design, 5th Edition, Coulouris, Dollimore, Kindberg, and Blair, 2012
- Reconfiguring Replicated Atomic Storage: A Tutorial, Aguilera, Keidar, Malkhi, Martin, and Shraer, Bulletin of the EATCS: The Distributed Computing Column, October 2010 (just for curiosity, we will not be able to study this in detail)
- A History of the Virtual Synchrony Replication Model, Birman, in Replication: Theory and Practice, Chapter 6, Lecture Notes in Computer Science book series (LNTCS, volume 5959)
- Chain Replication for Supporting High Throughput and Availability, van Renesse and Schneider, Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation, 2004.
2.2: Reconfigurable Paxos
- Reconfiguring a State Machine, Lamport, Malkhi, and Zhou, SIGACT News, March 2010
- Vertical Paxos and Primary-Backup Replication, Lamport, Malkhi, and Zhou, PODC 2009.
- High Throughput Replication with Integrated Membership Management, Fouto, Preguiça, and Leitão, 2022 USENIX Annual Technical Conference, 2022 (just for curiosity, we will not be able to study this in detail)
2.3: Raft
- In Search of an Understandable Consensus Algorithm (Raft), Ongaro and Ousterhout, Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference, 2014.
- https://raft.github.io
- RAFT-bug.pdf
Part 3: Distributed Transactions
3.1: State-machine Database Replication
- Comparison of Database Replication Techniques Based on Total Order Broadcast, Wiesmann and Schiper, IEEE Transactions on Knowledge and Data Engineering, Vol 17, N. 4, April 2005
- Understanding Replication in Databases and Distributed Systems, Wiesmann, Pedone, Schiper, Kemme, and Alonso, Proceedings of the 20th International Conference on Distributed Computing Systems, 2000
- AKARA: A Flexible Clustering Protocol for Demanding Transactional Workloads, Correia, Pereira, and Oliveira, Proceedings of the OTM 2008 Confederated International Conferences, 2008. (nice work but far from trivial)
- Blotter: Low Latency Transactions for Geo-Replicated Storage, Moniz, Leitão, Dias, Gehrke, Preguiça, and Rodrigues, Proceedings of the 26th International Conference on World Wide Web, 2017
3.2: Spanner and CockroachDB
- Spanner: Google's Globally Distributed Database, Corbett et al., ACM Trans. Comput. Syst., Vol 31, N. 3, 2013
- Spanner's Concurrency Control, Malkhi and Martin, SIGACT News 2013.
- Megastore: Providing Scalable, Highly Available Storage for Interactive Services, Baker, Bond, Corbett, Furman, Khorlin, Larson, Leon, Li, Lloyd, and Yushprakh, Proceedings of the Conference on Innovative Data Systems Research (CIDR), 2011.
- Paxos Replicated State Machines as the Basis of a High-Performance Data Store, Bolosky, Bradshaw, Haagens, Kusters, and Li, Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, 2011
- CockroachDB: The Resilient Geo-Distributed SQL Database. Rebecca Taft, Irfan Sharif, Andrei Matei, Nathan VanBenschoten, Jordan Lewis, Tobias Grieger, Kai Niemi, Andy Woods, Anne Birzin, Raphael Poss, Paul Bardea, Amruta Ranade, Ben Darnell, Bram Gruneir, Justin Jaffray, Lucy Zhang, and Peter Mattis. 2020. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (SIGMOD '20).
- Enabling the Next Generation of Multi-Region Applications with CockroachDB. Nathan VanBenschoten, Arul Ajmani, Marcus Gartner, Andrei Matei, Aayush Shah, Irfan Sharif, Alexander Shraer, Adam Storm, Rebecca Taft, Oliver Tan, Andy Woods, and Peyton Walters. 2022. In Proceedings of the 2022 International Conference on Management of Data (SIGMOD '22).
- Logical Physical Clocks and Consistent Snapshots in Globally Distributed Databases. Sandeep S. Kulkarni, Murat Demirbas, Deepak Madappa, Bharadwaj Avva, and Marcelo Leone. OPODIS 2014.
3.3: Transactional Causal Consistency
- Providing High Availability Using Lazy Replication, Ladin, Liskov, Shrira, and Ghemawat, ACM TOCS 1992 (lazyreplication.pdf)
- Stronger Semantics for Low-Latency Geo-Replicated Storage, Lloyd, Freedman, Kaminsky, and Andersen, NSDI 2013
- Cure: Strong semantics meets high availability and low latency, Akkoorath, Tomsic, Bravo, Li, Crain, Bieniusa, Preguiça, and Shapiro, The 36th International Conference on Distributed Computing Systems, 2016
- Distributed transactional reads: the strong, the quick, the fresh & the impossible, Tomsic, Bravo, and Shapiro, Proceedings of the 19th International Middleware Conference, 2018.
- Wren: Nonblocking Reads in a Partitioned Transactional Causally Consistent Data Store, Spirovska, Didona, and Zwaenepoel, The 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2018.
- Transactional Causal Consistency for Serverless Computing, Wu, Sreekanti, and Hellerstein, Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, 2020
Part 4: Peer-to-peer
4.1: Unstructured P2P and Gossip
- Distributed Systems, Concepts and Design, 5th Edition, Coulouris, Dollimore, Kindberg, and Blair, 2012
- Gossip-based peer sampling, Jelasity, Voulgaris, Guerraoui, Kermarrec, and van Steen, ACM Trans. Comput. Syst., Vol 25, N. 3, 2007
- HyParView: a membership protocol for reliable gossip-based broadcast, Leitão, Pereira, and Rodrigues, 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2007
- Epidemic Broadcast Trees, Leitão, Pereira, and Rodrigues, Proceedings of the 26th IEEE International Symposium on Reliable Distributed Systems, Beijing, China, 2007.
4.2: Structured P2P: Chord, Pastry, Kademlia
- Distributed Systems, Concepts and Design, 5th Edition, Coulouris, Dollimore, Kindberg, and Blair, 2012
- Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications, Stoica, Morris, Karger, Kaashoek, and Balakrishnan, SIGCOMM 2001
- Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems, Rowstron and Druschel, IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), 2001
- Kademlia: A Peer-to-Peer Information System Based on the XOR Metric, Maymounkov and Mazières, International Workshop on Peer-to-Peer Systems, 2002
4.3: Dynamo (Key-Value Store), OceanStore (File System), and Scribe (Pub-Sub)
- Dynamo: Amazon's Highly Available Key-value Store, DeCandia et al., Proceedings of Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, 2007.
- Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service, Elhemali et al., USENIX Annual Technical Conference, July 11–13, 2022, Carlsbad, CA, USA
- Maintenance-Free Global Data Storage. Sean Rhea, Chris Wells, Patrick Eaton, Dennis Geels, Ben Zhao, Hakim Weatherspoon, and John Kubiatowicz. 2001. IEEE Internet Computing 5, 5 (September 2001)
- OceanStore: an architecture for global-scale persistent storage. John Kubiatowicz, David Bindel, Yan Chen, Steven Czerwinski, Patrick Eaton, Dennis Geels, Ramakrishna Gummadi, Sean Rhea, Hakim Weatherspoon, Westley Weimer, Chris Wells, and Ben Zhao. SIGPLAN Not. 35, 11 (Nov. 2000), 190–201.
- Scribe: a large-scale and decentralized application-level multicast infrastructure, Castro, Druschel, Kermarrec, and Rowstron, IEEE Journal on Selected Areas in Communications, vol. 20, n. 8, Oct. 2002
- Don't Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS, Lloyd, Freedman, Kaminsky, and Andersen, Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, 2011
- Scalable Consistency in Scatter, Glendenning, Beschastnikh, Krishnamurthy, and Anderson, Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, 2011
- Chain Replication in Theory and in Practice, Fritchie, Proceedings of the 9th ACM SIGPLAN Workshop on Erlang, 2010
- CATS: a linearizable and self-organizing key-value store, Arad, Shafaat, and Haridi, SOCC '13: Proceedings of the 4th annual Symposium on Cloud Computing, 2013