CPSC 426/526: Building Decentralized Systems, Fall 2013
- MW 1:00-2:15 PM, Room 200 AKW
- Bryan Ford,
Room 411 AKW, 432-1055
- Office hours: Mon & Wed 2:30-3:30 PM, or by appointment.
- Teaching Assistant:
- Ennan Zhai,
Room 404 AKW, 415-5957
- Office hours: Tue & Thu 2:30-4:30 PM, or by appointment.
- (Ennan will be in the office, not the Zoo, during office hours.)
- URL: http://zoo.cs.yale.edu/classes/cs426/2013
This course explores practical principles and techniques
for building decentralized systems.
What is a decentralized system?
Basically, a distributed system that works although no one is in charge.
For purposes of this course,
a distributed system is a set of computers
that are physically distributed but can communicate via some form of network.
A decentralized system is a particular kind of distributed system:
namely, a set of computers that are under the control
of multiple separate owners or administrative authorities, not just one,
who wish to use their networked computing facilities to communicate,
organize, or share computing resources,
even if not all the machines' owners are guaranteed
to trust or “play nicely” with each other.
The course takes a hands-on approach,
emphasizing learning by doing.
The course's lectures will present and discuss challenges,
known techniques, and open questions
in decentralized system design and implementation.
Lectures will often be driven by examination
of real decentralized systems with various purposes
in widespread use in the past or present,
such as UseNet, IRC, FreeNet, and Tor.
Throughout the course we will explore
fundamental security and usability challenges
such as decentralized identification and authentication,
denial-of-service and Sybil attacks,
and maintenance of decentralized structures undergoing rapid changes (churn).
During the semester,
students will develop a small but usable peer-to-peer communication application
that reflects a few of the important design principles and techniques
to be explored in the course,
such as gossip, social trust networks, distributed hash tables,
and Byzantine consensus algorithms.
The labs will be designed so that solutions can initially be tested individually
on private, virtual networks running on one machine (e.g., a zoo machine),
then tested collectively by attempting
to make different students' solutions interoperate on a real network.
There is currently no formal textbook for this course.
The Internet is your textbook.
We will offer informal lecture notes, various reading assignments,
and pointers to relevant information on the Web:
see the schedule and the
reference page in particular.
Beyond these informal materials,
you will be expected to exercise—and sharpen—your
“information scavenging” skills and creativity
in order to figure out how to accomplish the tasks you are assigned.
Although you are not required to buy any textbook for this course,
you may find it very useful to have on hand,
or reference in the library,
one or more standard distributed systems textbooks such as:
Introduction to Systems Programming and Computer Organization,
or with CS 426 instructor's permission,
comparable C/C++ systems programming experience.
This course is still under development
and all parts of the syllabus should be viewed as tentative
and subject to change.
See the schedule page for details,
but covered topics are likely to include:
- Communicating: Networking Foundations.
Addressing, forwarding, routing.
Connection-oriented versus connectionless modes.
Client/server versus peer-to-peer communication.
Firewalls, NATs, traversal.
- Gossip: a foundation for decentralized systems.
UseNet: technical, security, and social lessons.
Randomized rumor-mongering and anti-entropy algorithms.
- Communicating Securely: Basic Cryptographic Tools.
Symmetric-key encryption. Hash functions, message authentication.
Diffie-Hellman key exchange.
Decentralized identity via public-key encryption, digital signatures.
- Trust and Reputation.
Authorities, trust networks.
Sybil attacks and defenses.
- Finding Stuff in a Big World: Naming and Lookup.
Just ask around: request flooding.
Hierarchical directories and landmark structures.
Distributed hash tables.
- Following a Moving Target.
Location services, reference points, forwarding.
- Anonymous Communication.
Onion routing, mix networks.
Voting, verifiable shuffles, homomorphic encryption.
- Fireproofing Alexandria: Decentralized Storage.
Parity, erasure coding.
- Content Distribution.
Opportunistic caching (FreeNet).
Content integrity: hash trees, hash file systems.
Swarming downloads: BitTorrent.
- Gaining perspective.
Spam, malicious content.
Review/moderation and reputation systems.
Leveraging social networks (Peerspective).
Balancing local and global viewpoints.
- Decentralized Collaboration.
Network file systems, version management. Consistency.
Disconnected operation, weak consistency, conflict resolution.
- Distributed Consensus.
Paxos. Accountability (PeerReview). Byzantine fault tolerance.
- Mobile Code and Agents.
Privacy: trusted computing, fully homomorphic encryption.
Franchises, Hosting Incentives.
Decentralized virtual organizations.
You will be using the Intel Linux PCs
in the Zoo.
You may access them either locally on the third floor of Watson Hall,
or remotely via the following command,
which will log you into a randomly-chosen Zoo machine
in order to balance load on the cluster:
To access these PCs, you can either directly login from their consoles
in the Zoo, or just remotely login from other
machines across the campus.
If you plan to take the course for credit, you should get an account
on these machines in the first week. Please also visit the following web
site to create a cs426 class directory (or just to sign up for a zoo account):
Do not allow anyone else to use your accounts for any purpose.
They are for your use alone, and you are responsible for any misuse.
Your passwords control access to your accounts and should be kept secret.
The weights below are subject to minor adjustments during the semester.
- Lecture participation (10% of grade)
- No formal roll call, but participation important
Some lectures will have associated
preparatory homework assignments,
labeled PREP: in the schedule.
- Homeworks will typically involve
answering a few questions
about assigned reading material.
- Written homework assignments must be turned in
at the beginning of the associated lecture.
- Late homework will be accepted, with no penalty,
during the first two-week “shopping period.”
No late homework will be accepted after the shopping period.
- Grade based on homeworks and class participation
- Midterm exam (20% of grade)
- In-class: see schedule
- The scheduled midterm is open books/notes,
but no electronics (e.g., laptops).
- Unless prior arrangements are made,
a grade of zero will be recorded for a missed exam.
- Labs (50% of grade)
- You will build a small, working peer-to-peer application.
- The protocols will be specified by us;
you will have to supply the implementation.
- All programming labs to be completed individually
except where noted.
- In the spirit of building decentralized systems,
your code will need to interoperate with
the (distinct) solutions of other students.
- We will provide a framework
enabling you to test the interoperability
of your solution with the reference solutions
and those of other students.
(More details later.)
- We will code in the C++ language using the
which is already installed on the Zoo.
- Final project in the last 2–3 weeks:
teams of up to 3 allowed,
and each team gets its choice of project (with approval).
- All labs must be turned in to pass the course
- Labs typically due Fridays at 11:59PM.
- 8 free late days throughout course.
- But no credit for any single lab more than 3 days late
(typically meaning Monday at 11:59PM).
- Final project (20% of grade)
- You will need to define your own final project,
get it approved by the instructors,
implement it, and demo it to the class
during the final exam period.
There will be no regular, written final exam.
Programming, like composition, is an individual creative process.
Individuals must reach their own understanding of the problem
and discover a path to its solution. During this time,
discussions with friends are encouraged.
However, when the time comes to write the code that solves the problem,
such discussions are no longer appropriate:
each student's code must be the work of that student alone
(although you may ask teaching assistants or lab assistants
for help in debugging).
In your coding you are encouraged to adopt ideas suggested
by classmates or other reference sources,
but must carefully acknowledge the sources of those ideas
in your own code and/or documentation.
Do not, under any circumstances, copy
another student's code. Writing code for use by another or using
another's code in any form violates the University's academic regulations and
will be dealt with harshly.
Academic integrity is a core institutional value at Yale. It means, among other
things, truth in presentation, diligence and precision in citing works and
ideas we have used, and acknowledging our collaborations with others. In view
of our commitment to maintaining the highest standards of academic integrity,
the Graduate School Code of Conduct specifically prohibits the following forms
of behavior: cheating on examinations, problem sets and all other forms of
assessment; falsification and/or fabrication of data; plagiarism, that is, the
failure in a dissertation, essay or other written exercise to acknowledge
ideas, research, or language taken from others; and multiple submission of the
same work without obtaining explicit written permission from both instructors
before the material is submitted. Students found guilty of violations of
academic integrity are subject to one or more of the following penalties:
written reprimand, probation, suspension (noted on a student’s transcript) or
dismissal (noted on a student’s transcript).
You will be using the Git
version control system to manage source code in your programming labs
and to hand in assignments,
as will be laid out in Lab 1.
Attendance at lectures is expected but will not be recorded. Students are,
however, fully responsible for all material presented in lectures, even
if some of it does not appear in the "official" lecture notes.
Class attendance is strongly recommended.
Lecture notes will be made available,
though they are by no means guaranteed to be a complete record of the class
and cannot substitute for class attendance.
The best way to contact the instructor and
the TA is by electronic mail.
To get help quickly,
your best bet is to send email to
where it will be seen only by the instructor and TA,
firstname.lastname@example.org, where your
message will also be forwarded to every student in the class.
Use of the whole-class mailing list is encouraged
especially in the case of clarifications or debugging questions,
since it is likely that other teams will be encountering
the same or similar difficulties that you are
and may offer the quickest answer.
All the course-related information will be kept on the web
Copyright (c) 2000-2010
Department of Computer Science,