This subject reviews the fundamental theory of distributed systems and analyzes the main modern solutions in the field. It begins by identifying the problems associated with sharing data across several physical machines, then describes how a distributed system can be modeled, discusses techniques for synchronizing nodes connected by a network, studies the techniques that can be used to replicate data along with their advantages and disadvantages, and ends by analyzing modern, highly scalable distributed systems.
- Knowledge of object-oriented programming.
- Knowledge of the inner workings of an operating system.
- Knowledge of mutual exclusion and algorithms.
The subject has the following objectives:
- Identify the main features of a distributed system.
- Understand a scientific article about distributed systems.
- Assimilate the main techniques used in distributed systems.
- Understand the algorithms that lead to the design of scalable distributed systems.
- Introduction to distributed systems and their fundamentals.
- Shared-nothing vs shared-memory architectures: Multithread programming and Distributed applications.
- Models & clocks: Logical clocks, Direct dependency clocks, and Vector clocks.
- The mutual exclusion problem in distributed architectures: Token-based algorithm, Lamport's bakery algorithm, and Ricart & Agrawala's algorithm.
- Communication primitives and strategies: Message passing & RPCs, and gRPC & binary encoding with Protocol Buffers.
- Data replication techniques: Eager replication, Lazy replication, Primary copy, Update everywhere, and CAP Theorem.
- Fault models: Byzantine and Crash/stop.
- Failure tolerance and recovery policies: Leader election (Raft).
- Consistent hashing.
- Modern challenges - Paper reading.
This is an eminently hands-on subject that combines theoretical content and lectures with exercises or micro-assignments aimed at consolidating the knowledge acquired by the students. The subject is therefore taught entirely in the laboratory.
All proposed exercises, without exception, must be delivered and passed in order to pass the subject. Active participation in class discussions and answering questions during lectures contribute 5% to the final grade. Regular attendance and engagement are essential for successful completion of this component.
If the exercises are delivered before the date of the ordinary evaluation exam and the corresponding interviews are passed, no exam is needed to pass the subject. In that case, the final grade is calculated using the following formula:
Subject final grade = 95% Exercises average + 5% Attendance
Otherwise, it will be necessary to pass a final exam. When all exercises are passed and the exam score is equal to or greater than 4, the final grade is calculated as:
Subject final grade = 55% Exam + 40% Exercises average + 5% Attendance
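As an illustration only, the two grading formulas can be sketched in a few lines of Python. The function name `final_grade` and its parameters are hypothetical, and the sketch assumes that all components are marked on the same 0-10 scale, which the syllabus does not state explicitly. It also assumes that every exercise has already been passed. Since no formula is given for an exam score below 4, the sketch returns `None` in that case.

```python
from typing import Optional


def final_grade(exercises_avg: float,
                attendance: float,
                exam: Optional[float] = None) -> Optional[float]:
    """Combine the grade components using the weights stated in the syllabus.

    Assumes all exercises are passed and all marks share one scale
    (0-10 is an assumption, not something the syllabus specifies).
    """
    if exam is None:
        # Route without exam: exercises delivered on time and interviews passed.
        return 0.95 * exercises_avg + 0.05 * attendance
    if exam < 4:
        # The syllabus defines no formula for an exam score below 4.
        return None
    # Route with final exam.
    return 0.55 * exam + 0.40 * exercises_avg + 0.05 * attendance
```

For instance, an exercises average of 8 with an attendance mark of 10 and no exam gives 0.95 × 8 + 0.05 × 10 = 8.1, while an exam score of 6 with an exercises average of 7 and an attendance mark of 10 gives 0.55 × 6 + 0.40 × 7 + 0.05 × 10 = 6.6.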
The following aspects will be assessed:
- Practical Assignments: The ability to design scalable distributed systems and correctly apply the main techniques used in distributed architectures within a hands-on laboratory setting.
- Paper Reading: The capacity to comprehend and analyze a scientific article about modern challenges in distributed systems.
- Final Exam: The assimilation and correct identification of the main features, fundamental theory, and models of a distributed system.
- Vijay K. Garg. 2007. Concurrent and Distributed Computing in Java. IEEE Press, Piscataway, NJ, USA.
- Patrick Hunt, Mahadev Konar, Flavio Paiva Junqueira, Benjamin Reed: ZooKeeper: Wait-free Coordination for Internet-scale Systems. USENIX Annual Technical Conference 2010.
- Fernando Pedone, Matthias Wiesmann, André Schiper, Bettina Kemme, Gustavo Alonso: Understanding Replication in Databases and Distributed Systems. ICDCS 2000: 464-474.
- Leslie Lamport: The Part-Time Parliament. ACM Trans. Comput. Syst. 16(2): 133-169 (1998).
- James C. Corbett et al.: Spanner: Google's Globally Distributed Database. ACM Trans. Comput. Syst. 31(3): 8 (2013).
- Giuseppe DeCandia et al.: Dynamo: Amazon's highly available key-value store. SOSP 2007: 205-220.
- Tom White: Hadoop The Definitive Guide: Storage and Analysis at Internet Scale (4. ed., revised & updated). O'Reilly 2015, ISBN 978-1-491-90163-2, pp. I-XXV, 1-727.