CPSC 426/526: Building Decentralized Systems, Fall 2012
[an error occurred while processing this directive]
- MW 2:30-3:45 PM, Room 200 AKW
- Bryan Ford,
Room 411 AKW, 432-1055
- Office hours: Mon & Wed 4:00-5:00 PM, or by appointment.
- Teaching Assistant:
- Ennan Zhai,
Room 404 AKW, 415-5957
- Office hours: Tue & Thu 2:00-4:00 PM, or by appointment.
- (Ennan will be in the office (not zoo) during the office hours.)
- URL: http://zoo.cs.yale.edu/classes/cs426/2012
This course explores practical principles and techniques
for building decentralized systems.
What is a decentralized system?
Basically, a distributed system that works although no one is in charge.
For purposes of this course,
a distributed system is a set of computers
that are physically distributed but can communicate via some form of network.
A decentralized system is a particular kind of distributed system:
namely, a set of computers that are under the control
of multiple separate owners or administrative authorities, not just one,
who wish to use their networked computing facilities to communicate,
organize, or share computing resources,
even if not all the machines' owners are guaranteed
to trust or “play nicely” with each other.
The course takes a hands-on approach,
emphasizing learning by doing.
The course's lectures will present and discuss challenges,
known techniques, and open questions
in decentralized system design and implementation.
Lectures will often be driven by examination
of real decentralized systems with various purposes
in widespread use the past or present,
such as UseNet, IRC, FreeNet, and Tor.
Throughout the course we will explore
fundamental security and usability challenges
such as decentralized identification and authentication,
denial-of-service and Sybil attacks,
and maintenance of decentralized structures undergoing rapid changes (churn).
During the semester,
students will develop a small but usable peer-to-peer communication application
that reflects a few of the important design principles and techniques
to be explored in the course,
such as gossip, social trust networks, distributed hash tables,
and byzantine consensus algorithms.
The labs will designed so that solutions can initially be tested individually
on private, virtual networks running on one machine (e.g., a zoo machine),
then tested collectively by attempting
to make different students' solutions interoperate on a real network.
There is currently no formal textbook for this course.
The Internet is your textbook.
We will offer informal lecture notes, various reading assignments,
and pointers to relevant information on the Web:
see the schedule and the
reference page in particular.
Beyond these informal materials,
you will be expected to exercise—and sharpen—your
“information scavenging” skills and creativity
in order to figure out how to accomplish the tasks you are assigned.
Although you are not required to buy any textbook for this course,
you may find it very useful to have on hand,
or reference in the library,
one or more standard distributed systems textbooks such as:
Introduction to Systems Programming and Computer Organization,
or with CS 426 instructor's permission,
comparable C/C++ systems programming experience.
This course is still under development
and all parts of the syllabus should be viewed as tentative
and subject to change.
See the schedule page for details,
but covered topics are likely to include:
- Communicating: Networking Foundations.
Addressing, forwarding, routing.
Connection-oriented versus connectionless modes.
Client/server versus peer-to-peer communication.
Firewalls, NATs, traversal.
- Gossip: a foundation for decentralized systems.
UseNet: technical, security, and social lessons.
Randomized rumor-mongering and anti-entropy algorithms.
- Communicating Securely: Basic Cryptographic Tools.
Symmetric-key encryption. Hash functions, message authentication.
Diffie-Hellman key exchange.
Decentralized identity via public-key encryption, digital signatures.
- Trust and Reputation.
Authorities, trust networks.
Sybil attacks and defenses.
- Finding Stuff in a Big World: Naming and Lookup.
Just ask around: request flooding.
Hierarchical directories and landmark structures.
Distributed hash tables.
- Following a Moving Target.
Location services, reference points, forwarding.
- Anonymous Communication.
Onion routing, mix networks.
Voting, verifiable shuffles, homomorphic encryption.
- Fireproofing Alexandria: Decentralized Storage.
Parity, erasure coding.
- Content Distribution.
Opportunistic caching (FreeNet).
Content integrity: hash trees, hash file systems.
Swarming downloads: BitTorrent.
- Gaining perspective.
Spam, malicious content.
Review/moderation and reputation systems.
Leveraging social networks (Peerspective).
Balancing local and global viewpoints.
- Decentralized Collaboration.
Network file systems, version management. Consistency.
Disconnected operation, weak consistency, conflict resolution.
- Distributed Consensus.
Paxos. Accountability (PeerReview). Byzantine fault tolerance.
- Mobile Code and Agents.
Privacy: trusted computing, fully homomorphic encryption.
Franchises, Hosting Incentives.
Decentralized virtual organizations.
You will be using the Intel Linux PCs
in the Zoo
You may access them either locally on the third floor of Watson Hall,
or remotely via the following command,
which will log you into a randomly-chosen Zoo machine
in order to balance load on the cluster:
To access these PCs, you can either directly login from their consoles
in the Zoo, or just remotely login from other
machines across the campus.
If you plan to take the course for credit, you should get an account
on these machines in the first week. Please also visit the following web
site to create a cs426 class directory (or just to sign up for a zoo account):
Do not allow anyone else to use your accounts for any purpose.
They are for your use alone, and you are responsible for any misuse.
Your passwords control access to your accounts and should be kept secret.
Your grade will be calculated as follows:
These weights are subject to minor adjustments during the semester.
- Lecture preparation homework and in-class participation: 15%
- Midterm exam: 15% (open books, open notes, no electronics)
- Final exam: 20%
- Programming labs: 50%
Some lectures will have associated preparatory homework assignments,
labeled PREP: in the schedule.
Homeworks will typically involve
and answering a few questions about assigned reading material.
Written homework assignments must be turned in
at the beginning of the associated lecture;
late homework will be accepted, with no penalty, only during
the first two-week “shopping period.”
The scheduled midterm and final exams are open books/notes,
but no electronics (eg. laptops).
Unless prior arrangements are made,
a grade of zero will be recorded for a missed exam.
Programming, like composition, is an individual creative process.
Individuals must reach their own understanding of the problem
and discover a path to its solution. During this time,
discussions with friends are encouraged.
However, when the time comes to write the code that solves the problem,
such discussions are no longer appropriate:
each student's code must be the work of the members of that student alone
(although you may ask teaching assistants or lab assistants
for help in debugging).
In your coding you are encouraged to adopt ideas suggested
by classmates or other reference sources,
but must carefully acknowledge the sources of those ideas
in your own code and/or documentation.
Do not, under any circumstances, copy
another student's code. Writing code for use by another or using
another's code in any form violates the University's academic regulations and
will be dealt with harshly.
You will be using the Git
version control system to manage source code in your programming labs
and to hand in assignments,
as will be laid out in Lab 1.
Attendance at lectures is expected but will not be recorded. Students are,
however, fully responsible for all material presented in lectures, even
if some of it does not appear in the "official" lecture notes.
Class attendance is recommended strongly.
Lecture notes will be made available,
though they are by no means guaranteed to be a complete record of the class
and cannot substitute for class attendance.
The best way to contact the instructor and
the TA is by electronic mail.
To get help quickly,
your best bet is to send email to
where it will be seen only by the instructor and TA,
firstname.lastname@example.org, where your
message will also be forwarded to every student in the class.
Use of the whole-class mailing list is encouraged
especially in the case of clarifications or debugging questions,
since it is likely that other teams will be encountering
the same or similar difficulties that you are
and may offer the quickest answer.
All the course-related information will be kept on the web
Copyright (c) 2000-2010
Department of Computer Science,