Course overview
Course details
Some motivation

Course Overview

In this course we will study the design and implementation of reliable distributed systems.
  Reliable distributed storage
  Distributed data processing frameworks


A little about me, Greg Benson
A little about you

Course Mechanics

Read seminal academic and industry research papers
Write paper evaluations
Discuss papers in class
Implement concepts from papers in two projects
One final group project: Implement a version of Spark
One midterm and one final (they will cover material from papers and class discussions)
Class format: 45 min class, 10 min break, 40 min class

Course Syllabus

Distributed Systems

What is a distributed system?
  Multiple networked computers, servers, or processes that coordinate with each other
  Examples: Email, Dropbox, Replicated MongoDB, others?

Why build a distributed system?
  Provide physical isolation for security
  Tolerate failures using replication
  Improve performance via parallel computation using multiple CPUs, disks, and increased network bandwidth
  Solve problems that can't be solved on a single computer (not enough disk or memory). Examples?
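
To make "tolerate failures using replication" concrete, here is a minimal sketch in Python. It assumes an in-memory key-value store and simulates a crash with an `alive` flag; the `Replica` and `ReplicatedStore` names are hypothetical and exist only for illustration. A real system would also have to deal with network messages, partial writes, and replicas that rejoin with stale data.

```python
class Replica:
    """One in-memory copy of the data; `alive` simulates a crash."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    """Writes go to every live replica; reads use the first live replica."""

    def __init__(self, replicas):
        self.replicas = replicas

    def put(self, key, value):
        written = 0
        for replica in self.replicas:
            if replica.alive:
                replica.data[key] = value
                written += 1
        if written == 0:
            raise RuntimeError("no live replicas")
        return written

    def get(self, key):
        for replica in self.replicas:
            if replica.alive and key in replica.data:
                return replica.data[key]
        raise KeyError(key)


replicas = [Replica("a"), Replica("b"), Replica("c")]
store = ReplicatedStore(replicas)
copies_written = store.put("x", 1)

# Simulate the first replica crashing; the value is still readable.
replicas[0].alive = False
value_after_failure = store.get("x")
```

With three replicas, the read still succeeds after one of them fails, because the other two hold a copy of the value.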

The problem
  Very difficult to build working distributed systems
  Hard to debug
  Hard to get right
  Hard to make fast
  Testing is hard
  Don't distribute unless you have to

  Time vs Space
  Replicate or Recompute (or some combination of both)

Motivating Example: MongoDB

  • MongoDB is a document-oriented "NoSQL" database.
  • A MongoDB database consists of "collections".
  • A collection consists of documents.
  • A document is like a JSON object.
  • Each document in a collection can have a different schema.
    • Usually some fields are consistent across all documents in a collection.
    • Consistent fields are needed for indexes
  • MongoDB can run as a standalone mongod server or as several mongod servers in a replica set.
  • One replica is elected as the master (called the primary in MongoDB's terminology).
  • If the master fails, then a new master is elected.
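
The document model described above can be sketched with plain Python dicts. This is only an illustration: it does not use a MongoDB driver, the sample documents are made up, and `build_index` is a hypothetical helper, not how MongoDB implements indexes. It shows documents with different schemas that share a consistent `name` field, and why a consistent field is useful for indexing.

```python
from collections import defaultdict

# A "collection" as a list of documents; each document has its own schema,
# but every document carries the fields "_id" and "name".
collection = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Bob", "tags": ["staff"], "office": {"building": "A", "room": 12}},
    {"_id": 3, "name": "Ada", "age": 36},
]


def build_index(docs, field):
    """Map each value of `field` to the _ids of the documents holding it.

    Documents that lack the field are skipped in this sketch.
    """
    index = defaultdict(list)
    for doc in docs:
        if field in doc:
            index[doc[field]].append(doc["_id"])
    return dict(index)


name_index = build_index(collection, "name")    # covers every document
email_index = build_index(collection, "email")  # covers only one document
```

An index on `name` can locate any document in the collection, while an index on `email` only helps for the single document that has that field.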

The leader election problem
  • Multiple processes/servers need to coordinate.
  • One of the servers needs to be elected as the leader.
  • If the leader fails, a new leader is elected.
  • Servers can fail during the election process.
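
The points above can be sketched as a toy simulation. It assumes a simple rule (the live server with the highest id wins) and a single process that can see which servers are alive; both are simplifications chosen for illustration, and this is not the protocol MongoDB uses. The `Server` class, `elect_leader` function, and `fails_during_election` parameter are hypothetical names. In a real system, servers only learn about each other through messages that can be delayed or lost, which is what makes the problem hard.

```python
class Server:
    def __init__(self, server_id):
        self.server_id = server_id
        self.alive = True


def elect_leader(servers, fails_during_election=()):
    """Return the live server with the highest id, or None if none survive.

    `fails_during_election` lists ids of servers that crash while the
    election is running; such a candidate is skipped and the next one is tried.
    """
    candidates = sorted(
        (s for s in servers if s.alive),
        key=lambda s: s.server_id,
        reverse=True,
    )
    for candidate in candidates:
        if candidate.server_id in fails_during_election:
            candidate.alive = False  # crashed mid-election
            continue
        return candidate
    return None


servers = [Server(i) for i in (1, 2, 3, 4)]

# Initial election: server 4 has the highest id.
first_leader_id = elect_leader(servers).server_id

# The leader fails, and server 3 crashes during the new election.
servers[3].alive = False
second_leader_id = elect_leader(servers, fails_during_election={3}).server_id
```

After the second election, servers 4 and 3 are both down and server 2 is the leader.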

Distributed Systems Concepts

Python Overview

  • Types
  • Expressions
  • Statements
  • Syntax
  • Functions
  • Control
  • Lists
  • Dicts
  • Classes
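
As a quick tour of the topics listed above, the following short sketch touches each of them: a class, a function, a loop and a conditional, a list, and a dict. The `WordCounter` class and `count_words` function are made-up names for this example.

```python
class WordCounter:
    """A class holding a dict that maps each word to its count."""

    def __init__(self):
        self.counts = {}

    def add(self, word):
        # dict.get returns the default (0) when the key is absent.
        self.counts[word] = self.counts.get(word, 0) + 1


def count_words(words):
    """Count the non-empty words in a list, ignoring case."""
    counter = WordCounter()
    for word in words:            # control: for loop
        if word:                  # control: empty strings are falsy
            counter.add(word.lower())
    return counter.counts


words = ["Spark", "mongo", "spark", ""]   # a list of str values
result = count_words(words)               # a dict of str -> int
total = sum(result.values())              # an expression producing an int
```

Here `result` maps "spark" to 2 and "mongo" to 1, and `total` is 3.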