Freenet 0.7

From
Revision as of 15:26, 13 October 2007 by Pollex (talk | contribs) (→‎References)
Jump to navigation Jump to search

Freenet is a Peer-to-Peer network with main focus on security and anonymity (for both authours and consumers of information). It is based on a paper(pdf) by Ian Clarke. In 2005 Freenet was completely rewritten because of a new design concept that was to be implemented - the Darknet. This new version of Freenet, namely version 0.7, is not compatible to any older versions. This article deals with Freenet 0.7 exclusively.



What is Freenet ?

The freenet is a network with the goal to provide as much anonymity as possible while still scaling well to large networks. In order to do that, it uses certain concepts of network/graph theory. These are:

  • The Darknet Concept

and

  • The Small World Concept

. It is vitally important to understand these concepts before trying to understand how the freenet works. So let us first discuss these Concepts...

The Darknet Concept

A Darknet is a Network of Nodes (People, Computer, Cells, ...) in which certain statements hold. These are:

  • each node has a list of other nodes it trusts (friends)
  • the friends relation is symmetric (at least as far as we know)
  • traffic only flows between friends
  • nodes have no specific knowledge of non-friend-nodes

These Statements are solely concerned with the privacy of users and pose strong restrictions on communication. They usually result in either isolated networks (meaning, that there is NO way for non-friends to exchange data) or very bad scalability (requests are breadth-first-searches which scale very badly).

Freenet tries to avoid these problems by using theories of another concept. Which is the small world

The Small World Concept

The small world concept is a property in graphs (it holds especially in social networks). It was first introduced with the Milgram-Experiment. The small world property states that the number of edges of a certain length (in some arbitrary distance-measure, in our case distance in the key-space) is inversely proportional to the length. This basically means, that the number of long edges is small while the number of short edges is high.

a small World
Not a small World

Given a small world it is possible to track a certain node in few steps by using the long edges to traverse long distances and the short nodes to exactly localize the node in the keyspace.

Philosophy

Freenet has the freedom of informationexchange as it's primary goal. Legal and moral concerns that are connected with free and untraceable exchange of data, like sharing of illegal music- or videofiles, or even contents contemptuous of the human rights, are secondary to that goal. Ian Clarke argues, that they are in fact mutually exclusive :

"You cannot guarantee freedom of speech and enforce copyright law. It is for this reason that Freenet, a system designed to protect Freedom of Speech, must prevent enforcement of copyright."

- Ian Clarke, Freenet Philosphy

Matthew Toseland, the main developer of Freenet, also states that :

"Legality is irrelevant, the whole point of a darknet is that it is hidden and has a reasonable chance of survival despite running a node being illegal."

- Matthew Toseland Discussion on the possibility of censorship in Freenet, Freenet mailing list archives

Freenet is designed to be used in countries where free exchange of information is illegal. If there was any way of censoring Freenet, it would mean that it has failed to fulfill it's goals.

Another important point is, that Freenet regards itself as a research project. It is not known if Freenet ever fulfills its initial goals and might yet take some time to find out.

Anonymized Peer-to-Peer

Freenet is a Peer-to-Peer (p2p) network, meaning that it has no central component at all. Participating computers (nodes) either communicate directly with each other or the messages are relayed by other nodes in between. All nodes equal. A p2p network can be characterized by how it implements the following five aspects :

  • Input: How is information submited
  • Transmission: How is the information handled by the network
  • Storage: How is information stored on an individual node
  • Database: How can information be found again
  • Output: How is information retrieved

Each of these aspects can give away information about the users of the network. Freenet tries to provide all this functionality while maintaing anonymity for the user in every aspect. This means :

  • Author anonymity - The author of any information cannot be traced after submiting his content to the network.
  • Data robustness - Once information has been submitted to the network, it cannot be changed or removed on porpose, not even by the true author himself. Also access to that information cannot be restricted, transmissions cannot be filtered.
  • Deniability - Information is stored on different nodes. A node has no control, wich information will be stored in it's storage. The owner of a node cannot be held responsible for the contents on his computer.
  • Reader anonymity - Requests for information cannot be traced back to their origin. A user may search for information without fear of surveilance.

Freenet does not (yet) anonymize the fact of participation itself. However it regards itself as an experimental Stegonet. A stegonet is a network, wich cannot be detected. It cannot be know who participates and where and when messages are sent. The word is a composition of the words Steganography, meaning the hiding of messages, rather than encrypting them, and Network. There is an ongoing discussion in the Freenet mailing lists on whether that is even possible. Another approach being discussed is the Sneakernet. This is an approach where information is not transmitted over the internet or some fixed network, but rather by mobile devices such as cellphones or even portable harddrives. This would abviously lead to very high latency, but might be virtually undetectable. This idea is, however, far from being implemented yet.

Goals and Use cases

The main goal of Freenet (0.7) is to provide a way to safely exchange information in a "hostile environment", such as an oppressive dictatorship. Most important in that respect is as much anonymity as possible. This includes:

  • Author-anonymity: It is impossible to trace the originator of any document
  • Sender/Distributer-anonymity: It is impossible to trace the one that is sharing/spreading/distributing a document
  • Viewer-anonymity: It is impossible to find out who read/downloaded a document

. Additionally Freenet wants to provide the following features, that don't relate direcly to anonymity:

  • Robustness against content removal, while still retaining
  • the ability to update data and
  • allow user input

.

To further illustrate these goals we decided to wrap them in use-cases which apply to certain circumstances that might appear in reality. These are:

Qi: The Newsprovider

Qi

  • lives in the US,
  • has a lot of Friends in China,
  • publishes Copies of News-sites to the freenet for his friends,
  • is interested in viewer-anonymity for his friends and
  • doesn't want Chinese government to delete or manipulate his data

.

Anonymous reading

Since Qi and his friends share a ring of trust, he'll publish the data through one of his friends. If one of his friends wants to read the data he'll most likely ask Qis node directly (if available) or go through one of his friends if that one is "closer" to the data. Thus retrieval of data is only known by direct friends. Since they are trustworthy by definition your anonymity is protected.

Once you can't trust your friends anymore your anonymity is spoiled.

Freenet China

The freenet china does exactly do that: copying content of western and chinese, censored newssites to the freenet.

Shawn: Author-anonymity

Shawn

  • lives in the US too,
  • likes to rip audio/video content,
  • publishes the his rips on the freenet,
  • is highly interested in author-anonymity

.

Insert content anonymously

  1. generate a key for the content
  2. do a (special insert-)search using the (generated) key and a HTL (pretty much a TTL)
  3. key found ? FAILURE : SUCCESS
    1. ONFAILURE: Original content is send to you, thus spreading it on the retrieval-path
    2. ONSUCCESS: send the object reversely along retrieval-path the thus, spreading content

Technical Implementation

Filekeys

To store a file in Freenet it needs to be associated with some kind of key. From this key one can get the Reference (URI) of that file within the network. There are two main requirements to Freenet file references. First they need to allow for file authentication. Specifically, any node needs to be able to verify, that a certain file actually belongs to the reference it claims. This has to work without knowing the contents of the file. During the retrieval of data files may be transmitted through an arbitrary number of intermediate nodes, each of which may choose to cache the file for faster satisfaction of future requests. Therefore, it is desirable, that each node may verify the authenticity of any file that runs through it.

The second requirement is encrypted requests. If anyone requests a certain reference (and thus the data associated with that reference), no one else should be able to tell, what the contents of that file are.

Any user that wants to retrieve a file from the network always needs the complete key. For searching the file however, you only need the reference. The difference is that the reference does not contain decryption information for the file. So anyone who just knows the reference of a file cannot see its contents.

There are five types of filekeys. Keyword-signed keys (KSK), Signed Subspace Keys (SSK) and Content Hash Keys (CHK) are the three main keytypes. Updatetable Subspace Keys (USK) and Revocable Subspace Keys (RSK) are just mechanisms wrapped around SSKs for convenience. It follows a detailed description of some of the keytypes. All files are encrypted symmetrically. Anyone requesting a file needs to know this symmetric encryption key. This holds for all keytypes except KSKs. This is so that any node that transmits or stores the file can plausibly deny all knowledge of its contents.

We only explain the concept of these keytypes here. Since Freenet is under development and very poorly documented, we are not able to provide reliable information on details like wich hashfunctions are used, or wich encryption algorithms they employ. The best way to get such information is to check the sourcecode.

Signed-Subspace Keys

A Singed-Subspace Key is made up of three parts :

  • a public/private keypair
  • a descriptive string
  • a symmetric encryption key

First the file is encrypted using the symmetric encryption key. Next the encrypted file is signed using the private part of the generated keypair. Then both the public part of the keypair and the descriptive string are hashed, the hashvalues are concatenated, and that string is then hashed again to yield the hash under which the file will be inserted into the network along with the public part of the keypair (so that nodes can verify the file). Thus, these two make up the file reference. If a person wants to retrieve the file, they need to know the descriptive string, the public part of the keypair and the encryption key. Obviously, SSKs cannot be communicated easily. This is an open problem. Either you publish all necessary data under a Keyword-Signed Key, which has apparent security risks, or you distribute the key data through other (possibly unsafe) channels.

SSKs are the most secure of the keytypes. The signing of the files prevents evil nodes from manipulating their contents. As will be explained in the section Storing, not even the true author of a file may change its contents once it has been inserted under a certain key. This is one of the points where the tradeoffs that Freenet makes in order to provide high security become apparent. If a file contains a Freesite (Freenets equivalent to Websites), the URL of that Site changes, everytime it is updated. A workaround to that weakness are the USKs. Since the key is hashed before it is published, no node can draw any conclusions on the content of the file just from knowing it's key.

Keyword Signed Keys

These keys are no more than a descriptive string. The file is encrypted using a key that is deterministically generated from that string, so anyone who knows the string can decrypt the file. The same mechanism is used for signing the file. These keys are very unsafe and should not be used to store vital or dangerous information, because since both encryption and signature key are known to anyone who knows the string, anyone can change the content of the file. Their main advantage is, that they are human-readable. In the case of KSKs the file key also is the reference.

Content Hash Keys

A Content Hash Key (CHK) is really only the hashvalue of the (encrypted) file it belongs to plus the encryption key. These are of course not human readable. The reference in that case is only the hashvalue. Assuming that the hashfunction is collisionfree, no signature mechanism is needed for these keys, since the reference itself is the signature and can easily be verified.

The idea is, that large files get split into many small parts, each of which is inserted under its CHK. Then another file, containing a list of all these CHKs is inserted under a Signed-Subspace Key.

Updateable Subspace Keys

These keys are a wrapper around SSKs. They are intended to resolve the problem that file inserted under SSKs cannot be changed/updated. They look like SSKs with a version-number at the end. Basically the client allways searches for the site with the highest version-number.

Storing and caching

Each Freenet node provides some harddisc space to the network. Half of that is used as cache, the other half as permanent storage. As will be explaind in the section on the retrieval of data, all traffic allways tracks back the path of the original request. Every node caches every bit of data it propagates in it's cache using a LRU-policy. Thus, popular content is spread throughout the network and performance increases for such files.

Node positioning

To understand how the storing of file works, it is important to understand the positioning of each node in the reference space. The reference space is the one-dimensional space of all possible reference values. Each node has a position in this reference space. Initially this position is given at random. The only way to change that is for two nodes to switch positions. As was explained above, it is an important property of a small world that each node has many short connections and few long ones. So the two node may decide to switch their positions if it minimizes the overall product of their respective edge-length. This idea was originally proposed by Oscar Sanderg in his paper "Distributed routing in Small World Networks"(pdf). It is important to realize that switching positions in the keyspace does not affect the links of a node. After switching eahc node will stil have the same neigbours as before, only the distances to them will have changed. Also, the data a node has stored becomes virtually useless, because requests for these files would be routed to the old position of the node. Therefore swaping must not happen too often, or the overall performance of the network will decrease.

This algorithm has some security implications when implemented. Every node only has knowledge of its local peers. In order to initialize a swap, it will send a swap request wich will be randomly routed for a certain number of hops. If the node at the other end is currently able to swap (it must not be processing any requests), it will say so, else the swap fails here. Then both nodes have to agree on weather the swap would be usefull. The could do that by exchanging their edge products and their locations. Thus each node could calculate its new edge product and decide weather it is beneficial to swap. The problem with that protocol is, that it is easy to lie and enforce swaps, wich could be exploited for several attacks. In order to make it harder to forge information, the protocol currently implemented exchanges the positions of the nodes and all their peers. This is more secure with respect to enforcing swaps, but it lays open parts of the network topology, wich weakens security.

Sotring of files

As was mentioned above, each node simply stores those files, that have a key wich is nearer to its position than to that of any of its neighbours. Inserting a file is done in two steps. First, the node searches for the file. If the file is found, then there is a collision in the reference space and the insert is rejected. Also, all nodes that routed the search-requests will cache the original file, thus resulting in spreading the information that an attacker may have tried to overwrite. So the insert only happens if the file was not found. The node that terminated the requests did so because none of its neighbours was closer to the reference that itself (this is explained in more detail in Retrieval). This means that this node is the best choice to store the file we want to insert, because there other nodes will most likely search for that reference. So, the file we want to commit is transfered to that node. Note that all communication is always done along the path of the original requests. Shortcuts would violate the prinicple of the Darknet.

Retrieval

Based on the fact that each node has a position in the reference space and stores only those files, to wich it is nearer that any of its neighbours, Freenet implements a simply greedy routing algorithm for search requests. A request for a key is sent with a certain hops-to-live-limit in order to ensure that it terminates at some point. Every node routes the request to the neighbour with the position nearest to the requested reference. If a node does not have a neighbour that is nearer to the requested reference that itself, it does not route the requests any further, but instead sends it back with an answer ("have file", "don't have file").

References

  • I. Clarke, O. Sandberg, B. Wiley and T.W. Hong, "Freenet: A Distributed Anonymous Information Storage and Retrieval System", Workshop on Design Issues in Anonymity and Unobservability, 2000 [1](pdf)
  • O. Sandberg, "Distributed Routing in Small World Networks", 2005 [2](pdf)
  • I. Clarke, O. Sandberg, Lecture on Freenet 0.7, 22nd Chaos Computer Congress The Talk, [3](video)
  • I. Clarke, O. Sandberg, Lecture on Freenet 0.7, 22nd Chaos Computer Congress - The Slides, [4](bz2 archive)
  • I. Clarke, O. Sandberg, Lecture on Freenet 0.7, 22nd Chaos Computer Congress - The Java Demo, [5](bz2 archive)