Paper review: Freenet: A Distributed Anonymous Information Storage and Retrieval System

Reviewer: Hanlin D. Qian

This paper addresses some problems that traditional file storage systems have. Some of these problems are outlined below:

  1. Files are stored on a few hosts, each of which is a central point of failure. Denial-of-service attacks or simple system failures can make it difficult to maintain a robust system.
  2. There is very little user privacy. A user who stores information cannot conceal his identity or the information itself. A node that provides storage space cannot conceal its identity without hindering its ability to deliver the stored information.
  3. Malicious third parties can alter stored information.

The design principles of Freenet address all of these problems. They focus most importantly on the issue of security. Below are some design goals:

  1. Anonymity for both producers and consumers of information
  2. Deniability for storers of information
  3. Resistance to attempts by third parties to deny access to information
  4. Efficient dynamic storage and routing of information
  5. Decentralization of all network functions

Below are some important contributions and specifics of the Freenet Protocol:

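  1. Every request and insert has a transaction ID, a hops-to-live counter, and a depth counter. A request is forwarded over a limited number of nodes: the hops-to-live counter is decremented at each hop and bounds how far the request travels, while the depth counter is incremented at each hop so that a reply can be given enough hops-to-live to reach the requester.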
  2. Data is replicated as a side effect of retrieval: when a requested file is found, it is cached on every node along the path of the request. Since there is no "original" copy of the data, every copy is equally valid.
  3. If an insert results in a key collision, the existing copy of the file (not the new file) is propagated back along the insert path. This prevents malicious insertions from disturbing existing data.
  4. The hops-to-live is incremented or decremented with some random probability so that the original sender of the request is obscured.
  5. The request key is a hash, and the file contents themselves are encrypted. Therefore nodes storing someone else's files cannot readily read the stored information.
  6. Each node maintains a routing table covering only a limited number of nodes.
  7. When the storage capacity of a node is exceeded, files are evicted using the Least Recently Used (LRU) algorithm. Therefore, stored files are not guaranteed to live forever. Frequent requests for a file are needed to keep it on the network.
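
To make points 1, 2, 3, and 7 above concrete, here is a minimal toy sketch of how I picture them fitting together. It is not the paper's implementation: the class name Node, its methods, and the capacity parameter are my own hypothetical names, and routing simply tries neighbors in list order instead of choosing the neighbor with the closest key as Freenet does. Anonymity, encryption, and the depth counter are left out entirely.

```python
from collections import OrderedDict


class Node:
    """Toy Freenet-like node (hypothetical names, simplified routing)."""

    def __init__(self, name, capacity=2):
        self.name = name
        self.capacity = capacity      # max number of files kept locally
        self.store = OrderedDict()    # key -> data, least recently used first
        self.neighbors = []           # nodes this node can forward to

    def _cache(self, key, data):
        # Store the file, mark it most recently used, evict LRU entries.
        self.store[key] = data
        self.store.move_to_end(key)
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)

    def request(self, key, hops_to_live, seen=None):
        seen = set() if seen is None else seen
        seen.add(self.name)
        if key in self.store:
            self.store.move_to_end(key)   # a hit refreshes the file
            return self.store[key]
        if hops_to_live <= 0:
            return None                   # hop budget exhausted
        for neighbor in self.neighbors:
            if neighbor.name in seen:
                continue                  # do not loop back
            data = neighbor.request(key, hops_to_live - 1, seen)
            if data is not None:
                self._cache(key, data)    # replicate along the return path
                return data
        return None

    def insert(self, key, data, hops_to_live, seen=None):
        seen = set() if seen is None else seen
        seen.add(self.name)
        if key in self.store:
            return self.store[key]        # collision: the existing file wins
        stored = data
        if hops_to_live > 0:
            for neighbor in self.neighbors:
                if neighbor.name not in seen:
                    stored = neighbor.insert(key, data, hops_to_live - 1, seen)
                    break
        self._cache(key, stored)          # keep whatever the path agreed on
        return stored
```

In a chain a - b - c where only c holds a file, a request from a with hops-to-live 1 fails, while the same request with hops-to-live 2 succeeds and leaves copies on a and b. Inserting different data under a key that already exists downstream returns the existing data and stores that instead, and a node at capacity silently drops its least recently used file.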

I give this paper a rating of 4 for significant contribution. I think this distributed peer-to-peer model is a significant improvement over current server-based, centralized file storage systems.

There are many limitations to Freenet, however. Freenet's emphasis on security makes the retrieval of stored information difficult. The paper mentions several ways of classifying data into tree-like directory structures, but these directories are hard to update and difficult to maintain. The paper also suggests many layers of encryption and indirection to get around some naming issues; however, this approach is complicated and not scalable. Freenet also lacks an efficient keyword search system. (Using file indirection for keyword searches is bulky and hard to manage.) Of course, I believe that anonymity and ease of file retrieval are a tradeoff, and a compromise needs to be reached. One can't have the best of both. Another limitation is that files need to be accessed to be kept on the storage network. This design choice can be seen as a feature, since it removes the responsibility of cleaning out old files. However, it is not very practical if a user just wants to store some files for backup purposes and access them only when needed.

The methodology of the paper is convincing. However, I think the simulations done are too simple. I also would like to know how well Freenet adapts to network congestion. The paper mentions that a node can time out if it does not receive a reply from a downstream node after a period of time, but how does one set the timer?

Future work includes devising a better naming structure for Freenet and better information retrieval methods, such as keyword search. A better way to locate nodes may also be useful. Finally, what are the incentives for people to use Freenet? Are people willing to give up free hard drive space in return for such a service? These are just some of the questions that need to be addressed.