In this course we will study the design and implementation of reliable distributed systems.
Reliable distributed storage
Distributed data processing frameworks
A little about me, Greg Benson
A little about you
Read academic/industry research papers - seminal research
Write paper evaluations
Discuss papers in class
Implement concepts from papers in projects (2)
One final group project: Implement a version of Spark
One midterm and one final (they will cover material from papers and class discussions)
Class format: 45 min of class, 10 min break, 40 min of class
What is a distributed system?
Multiple networked computers, servers, or processes that coordinate with one another
Examples: Email, Dropbox, Replicated MongoDB, others?
Provide physical isolation for security
Tolerate failures using replication
Improve performance via parallel computation using multiple CPUs, disks, and increased network bandwidth
Solve problems that can't be solved on a single computer (not enough disk or memory). Examples?
Very difficult to build working distributed systems
Hard to debug
Hard to get right
Hard to make fast
Testing is hard
Don't distribute unless you have to
Time vs Space
Replicate or Recompute (or some combination of both)
Motivating Example: MongoDB
- MongoDB is a document-oriented "NoSQL" database.
- A MongoDB database consists of "collections".
- A collection consists of documents.
- A document is like a JSON object.
- Each document in a collection can have a different schema.
- Usually some fields are consistent across all documents in a collection.
- Consistent fields are needed for indexes.
- MongoDB can run as a standalone mongod server or as several mongod servers in a replica set.
- One replica is elected as the master (the "primary" in MongoDB's terminology).
- If the master fails, then a new master is elected.
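To make the document model above concrete, here is a minimal sketch in plain Python. It uses dictionaries as stand-ins for documents and does not call the MongoDB API. The collection name `users`, the field names, and the helper `build_index` are all hypothetical, chosen only for illustration.

```python
# Hypothetical documents in a "users" collection. Each one is like a JSON
# object, and each has a slightly different schema.
users = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Grace", "email": "grace@example.com", "nickname": "Amazing Grace"},
    {"_id": 3, "name": "Alan", "email": "alan@example.com", "address": {"city": "London"}},
]


def build_index(collection, field):
    """Map each value of `field` to the _id of the document that holds it.

    This is an illustrative stand-in for a database index, not how
    MongoDB implements indexes internally.
    """
    index = {}
    for doc in collection:
        if field in doc:  # documents lacking the field are simply not indexed
            index[doc[field]] = doc["_id"]
    return index


# "email" appears in every document, so the index covers the whole collection.
email_index = build_index(users, "email")

# "nickname" appears in only one document, so the index is sparse.
nickname_index = build_index(users, "nickname")

print(email_index["grace@example.com"])
print(len(nickname_index))
```

The index on `email` can locate any document, while the index on `nickname` reaches only one of the three. That is why a field that appears consistently across documents is the useful one to index.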
The leader election problem
- Multiple processes/servers need to coordinate.
- One of the servers needs to be elected as the leader.
- If a leader fails, a new leader is elected.
- Servers can fail during the election process.
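The problem statement above can be sketched as a single-process simulation. This is a deliberately simplified rule, "the live server with the highest id wins", loosely in the spirit of the bully algorithm. It is not MongoDB's election protocol, and it ignores message passing, network partitions, and failures that happen in the middle of an election, which are exactly what make the real problem hard. The `Server` class and `elect_leader` function are hypothetical names.

```python
class Server:
    """A simulated server that is either alive or failed."""

    def __init__(self, server_id):
        self.server_id = server_id
        self.alive = True


def elect_leader(servers):
    """Return the live server with the highest id, or None if all have failed.

    Assumption: every server has a perfect, instantaneous view of which
    servers are alive. Real systems cannot assume this.
    """
    live = [s for s in servers if s.alive]
    return max(live, key=lambda s: s.server_id, default=None)


servers = [Server(i) for i in range(1, 6)]

leader = elect_leader(servers)
print("initial leader:", leader.server_id)

# The leader fails, so the remaining servers elect a new one.
leader.alive = False
new_leader = elect_leader(servers)
print("new leader:", new_leader.server_id)
```

In a real deployment no server knows for certain which of its peers are alive. Replacing the `alive` flag with timeouts and messages is where the difficulty begins.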
Distributed Systems Concepts
- Official Python Documentation
- The Python Tutor:
- How to Think Like a Computer Scientist (used in CS 110 at USF)
- Fast Lane to Python
- Python Quick Guide
- Python Reference