This project will develop, test and pilot technologies to deliver on
the promise of the URL, or "Uniform Resource Locator": that
information placed online can remain there, even amidst network or
endpoint disruptions. The project's approach is to enable operators
of Web servers to enter easily into mutual aid arrangements, such as
mirroring other participants' content and having the deed
reciprocated, so that the failure of any one participant to remain
online allows others to preserve what was there. This project is
designed to improve the resiliency and robustness of the Web in a
wide variety of Internet contexts around the world, offering an
alternative route to Web content in the event of intentional
blocking, hacking or denial of service attacks, or unintentional
hosting or server failures. The project's ultimate aim is to make
for a more robust and stable Web, from Silicon Valley to London to
Beijing to Tehran.
Problem Statement and Project Goals
Across the globe, content on the Web increasingly relies on just a handful of centralized entities for hosting. With greater online centralization comes greater vulnerability, whether the centralization is public or private. Corporations that host others’ content are subject to pressure by repressive governments and criminal or terrorist threats. If Amazon’s Web hosting services were to go offline tomorrow, millions would be affected. In more restrictive Internet environments such as China, the Internet can be shut off bluntly at just a few state-controlled choke points, or filtered in a more fine-grained way by private companies. Non-state actors, such as pro-government sympathizers, also increasingly play a role in disrupting content by shutting it down where it is based, in particular through DDOS and hacking attacks. [1] Existing circumvention technologies provide no defense against this frequent form of Internet control.
Members of the Berkman Center propose to pilot a unique, alternative method for accessing content that is suppressed by DDOS attacks or technical filtering. We propose to develop and test a mutual aid system that allows communities of existing users who host content (e.g. bloggers, human rights groups or independent media sites) to mirror each other’s content. [2] To implement this project we plan to create server-side and, as necessary, client-side software extensions for one or two popular server software packages and Web browsers. We will test the proposed technical fixes robustly and facilitate expert peer review of them.
The end goal of the project is to implement these tested and robust
protocols with a pilot community of users who agree to mirror each
other’s content (“mirror as you link”) and to then—based on
additional assessment, review, and learning—devise strategies for
further roll-out and adoption on a larger scale. The final pilot
community will include foreign-language content hosts and
participants from countries where the Internet is censored or
otherwise restricted. The research and development team is prepared to address the many
human and political issues inherent in building these pilot
communities and to form the organizational relationships required to
do so. However, we will include such participants only if expert
security review and project testing can ensure their safety.
Ultimately, this project is intended to create a set of new
protocols that are applicable in any country and by a range of
different possible users.
Technical Approach
This project will build on top of, or extend, current Web protocols to allow for an elective partnership among content providers to mirror remote content on their local hosts. The mirrors would be capable of serving the content should their peer host be unavailable for viewing by Internet users for any reason. The elective aspect of this service builds upon the established power of mutual aid and partnership on which the Internet was founded and on which it has grown. Because participation is voluntary, it is important to build efficiency into the infrastructure so that the tool can be used without placing undue burden on peer hosts or slowing down user access.
This model seeks to reduce or entirely remove the onus from the average user, and instead relies on the actions of the more highly motivated and technologically proficient groups and individuals who create and host sensitive content, such as human rights groups and independent media sites. Average readers surfing the Web would only need to download a Web browser plug-in, and the user experience would not be slowed down or otherwise limited. This is a unique approach to accessing filtered or otherwise blocked content and does not duplicate any existing circumvention technology. This focus on a supply-side response to Internet disruption and censorship responds to obstacles to greater adoption of alternative circumvention tools, as documented in our prior study of circumvention tools and adoption behavior. We estimate that currently no more than 3% of Internet users in countries that engage in substantial filtering use circumvention tools. The actual number is likely considerably less. [3]
By mirroring content, this project also allows these groups to fight pervasive DDOS attacks, against which current circumvention tools are no use, as content will be hosted on multiple servers. This approach also allows access to content blocked through standard technical filters. Finally, if these protocols are adopted on a larger scale, they will make the Internet itself more robust, ensuring fewer dead or missing links and greater access to content for all Internet users.
A Decentralized Solution to the Broken Link
The work envisioned in this proposal would focus first on designing and implementing a decentralized response to inaccessible Web pages, whether these Web pages are down because of intentional blocking, malicious attacks, unintentional connectivity problems, or just a broken link. The decentralized version of this approach will be based on the cooperative efforts of content hosts to mirror each other’s content in order to offer an alternative path to a given news story, blog post or video in the event that one of the hosts goes down. The accessibility of the content is determined from the vantage point of the consumer. It is not necessary to determine why the content provider is inaccessible; what matters is that the consumer cannot access the resources, and that the system can provide an alternative route to the same content.
The envisioned opt-in system will be constructed on a cooperative community of content providers, including hosts who wish to have their content mirrored and other hosts who are willing to mirror this content. The motivations for wishing to have one’s content mirrored could be several-fold: concern over being blocked by their government or being subject to a distributed denial of service attack; concern over the reliability of housing content on one hosting platform or server; or more generally the desire to have a more permanent and distributed archive of one’s content. Similarly, there are a number of possible motivations for those hosts that would mirror the content of others: participants may be interested in contributing to efforts to resist censorship or in diminishing the impact of malicious attacks on Web sites; participants may want an archive of materials to which they link to ensure the availability of sources for their readers; or participants may be interested in more broadly contributing to increasing the robustness of the Internet.
The link-and-mirror system begins with a content provider, creator or host (site A in Figure 1) that signals to others its desire or willingness to have its content mirrored – and in turn its willingness to do so for others. This digital flag would provide for accurate control of what content should be added to the mirrors and would allow the system to acquire metadata on the content, such as date of publication, date of mirror, size of data and URL.


Figure 1
Once the content is marked with the flag, other participants in the system will then have the option to mirror the content. When a mirroring host (site B in Figure 1) links to site A, the system's software notes the flag and sets up the mirror. The mirror is operational once the system's software on site B can provide a complete mirror of the content on site A. The operational status of the system could be confirmed by altering the visual appearance of the flag on site A to indicate that site A's content is now mirrored by the system. Suppose a content consumer browsing the resources on site B follows the link to site A. If site A is inaccessible on the network at the time of the request, the consumer would normally have no recourse, as the site would be offline completely. In the link-and-mirror system, the consumer's browser would fail over gracefully to the mirrored content, and the consumer would still have access to the resources and information contained on site A, though now through the mirrored version of the content hosted on site B.
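The failover step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's protocol: the function name fetch_with_failover, the idea of passing an ordered list of mirror URLs, and the injected fetch callable are all hypothetical, and the proposal leaves open whether this logic would live in a browser plug-in or a server module.

```python
from typing import Callable, Iterable, Optional, Tuple


def fetch_with_failover(
    url: str,
    mirrors: Iterable[str],
    fetch: Callable[[str], Optional[bytes]],
) -> Tuple[Optional[bytes], Optional[str]]:
    """Try the original URL first, then each known mirror in order.

    `fetch` is a hypothetical callable that returns the page body, or
    returns None / raises OSError when the host cannot be reached.
    Returns (body, source), where `source` is the URL that actually
    answered, or (None, None) if neither the origin nor any mirror did.
    """
    for candidate in [url, *mirrors]:
        try:
            body = fetch(candidate)
        except OSError:
            body = None  # treat any network error as "inaccessible"
        if body is not None:
            return body, candidate
    return None, None
```

Because the function reports which URL answered, the caller can tell when the content came from a mirror rather than from site A and show the consumer a notice to that effect.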
From the consumer's point of view, the failover to the system's mirrored content can happen in a multitude of ways. Ideally this would happen in a manner that is both seamless and transparent for the consumer. A notification would likely be necessary to inform the consumer that the site they requested is currently inaccessible and that the content they requested is being served to them from the system’s mirrors. This notification would also serve as a mechanism to increase awareness about the system, among both prospective mirroring hosts and those that wish to be mirrored, and would therefore help to increase participation in the system. The system is self-reinforcing: the greater the number of participants in the system, the greater the chance that one of the mirrors is accessible to the consumer.
We envision that many mirroring hosts would mirror content they link to. The design of the system would seek to make mirroring standard and easy to implement when generating links to other sites. A possible default option for participants would be to mirror all linked content that carries a mirror flag. The mirroring hosts could also mirror content to which they have no links, for example when acting as an archive.
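As a rough sketch of the "mirror everything I link to that is flagged" default, the following Python collects the outgoing links on a page and keeps those whose targets are known to carry a mirror flag. The proposal does not specify how the flag is expressed or discovered, so the set of flagged URLs is simply passed in here; the names LinkCollector and links_to_mirror are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Set


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_to_mirror(page_html: str, flagged: Set[str]) -> List[str]:
    """Return linked URLs that carry a mirror flag, in page order, no repeats.

    `flagged` stands in for however the real system would learn that a
    target has asked to be mirrored (an assumption of this sketch).
    """
    collector = LinkCollector()
    collector.feed(page_html)
    seen: Set[str] = set()
    selected: List[str] = []
    for link in collector.links:
        if link in flagged and link not in seen:
            seen.add(link)
            selected.append(link)
    return selected
```

A mirroring host could run a check like this whenever it publishes a page, so that mirroring happens as a side effect of linking rather than as a separate task.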
Scalability of the system is of primary importance. This requires matching relatively simple but robust technological tools with participant preferences and behavioral aspects. The simplest version of the system would be based on reciprocity: those that seek out mirroring hosts would also provide mirroring to others. This could be based on communities with personal relationships, communities that coalesce around certain themes and topics, or participants that contribute to the system without any particular relationship to other content providers or their content. A possible outcome is an emergent ecosystem of mirroring hosts and content providers in which participants engage at levels that vary with their personal preferences and access to computing resources (Figure 2). Evaluating and understanding the incentives for individual participation in this system is an important aspect of the first phase of the project, so that this knowledge can be incorporated into the design and architecture of both the technical and social aspects of the system.

Figure 2
The technical implementation approaches evaluated in the first phase of this project will include extensions that could be applied to popular server-side applications (such as Apache) and client-side browsers. Neither of these potential focal areas would preclude the exploration of other architectural approaches. The goal of the technical implementation is to fully incorporate a solid understanding of the behavioral aspects of the system and to limit if not eliminate the burden on Internet users to implement the system.
Complementary Centralized Components of the System
The scalability and expected strength and resilience of this system would be in large part contingent on it being a decentralized system. Both the resource needs for individual participants and the vulnerability to malicious attacks are mitigated by broad participation. However, there could also be complementary elements to the system that are more centralized which could contribute to the scalability, coverage and utility of the system. More centralized archives, akin to the Internet Archive or Google's caching feature of its search engine, could act as larger peers in the system and could elect to proactively retrieve the content by searching for mirror-me flags across a broad range of Web sites. This could help with preserving and propagating content that might be rendered inaccessible before smaller participants in the system respond to mirroring requests.
There are many questions and considerations to take into account with this more centralized use case, some technical and some policy oriented. For the system to be able to send consumers to valid mirrors, it would have to maintain a list of valid mirrors. This could be done by each mirror sharing a list of friendly peers or by having a distributed clearinghouse of friendly peers. For a pilot among academic parties, plaintext communication and a simple list of friendly peers would be sufficient. However, a full review of security implications will be essential to make the service robust and trustworthy when deployed under real-world conditions.
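To make the "each mirror shares a list of friendly peers" option concrete, here is a minimal Python sketch that walks those shared lists to find every mirror reachable from a starting host. It assumes the lists have already been fetched into a dictionary; the function name discover_peers and the data layout are hypothetical. In line with the pilot scenario above, it does no authentication or validation, which a real deployment would need.

```python
from collections import deque
from typing import Dict, List, Set


def discover_peers(start: str, shared_lists: Dict[str, List[str]]) -> Set[str]:
    """Follow shared peer lists breadth-first, starting from `start`.

    `shared_lists` maps a host to the friendly peers it advertises
    (plaintext and unauthenticated in this sketch). Returns every peer
    reachable through those lists, excluding `start` itself.
    """
    known = {start}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        for friend in shared_lists.get(peer, []):
            if friend not in known:
                known.add(friend)
                queue.append(friend)
    known.discard(start)
    return known
```

The same traversal would work against a clearinghouse, which would in effect be a single host advertising a long list of peers.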
Special focus should be directed towards mechanisms to retrieve mirrored content from other servers without divulging the locations of these mirror servers. Encryption, diversion, dummy requests and distributing fragments to multiple mirrors could all play a part in the method used to obscure the locations of the mirrors.
Distributing the fragments across the valid mirrors could also have the benefit of distributing the load across the peers. The design would have to account for the fact that the aggregate of all mirrored content would be too expensive for any one site to shoulder. The system could limit the burden on mirrors by allowing each of them to set limits on storage and bandwidth. In addition, expiring content that was mirrored before a certain date could help reduce peer load while pruning stale content.
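One way a mirror might combine a storage limit with expiration is sketched below in Python. The policy shown (drop anything older than a cutoff, then keep the newest items that fit the quota) is an assumption chosen for illustration, as are the names MirroredItem and prune; the proposal does not prescribe a particular policy, and bandwidth limits are not modeled here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class MirroredItem:
    url: str
    size_bytes: int
    mirrored_at: datetime


def prune(
    items: List[MirroredItem],
    now: datetime,
    max_age: timedelta,
    max_bytes: int,
) -> List[MirroredItem]:
    """Return the items a mirror would keep under this sketch's policy.

    Step 1: discard items mirrored longer ago than `max_age`.
    Step 2: walk the rest newest-first and keep each one that still
    fits within `max_bytes`; items that do not fit are skipped.
    """
    fresh = [item for item in items if now - item.mirrored_at <= max_age]
    fresh.sort(key=lambda item: item.mirrored_at, reverse=True)
    kept: List[MirroredItem] = []
    used = 0
    for item in fresh:
        if used + item.size_bytes <= max_bytes:
            kept.append(item)
            used += item.size_bytes
    return kept
```

Each participant could choose its own max_age and max_bytes, which fits the idea that hosts contribute at levels matching their resources.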
One of the key objectives for year one will be to determine the drawbacks and benefits of a server-side-only approach versus one that uses both server-side and client-side software extensions. Under an ideal scenario, only a server-side module would be required, because it does not require users to take the second step of installing a browser plug-in. However, it is not clear how a participant would mirror content if it did not host its own content. Further, an average Web surfer in a more restrictive Internet environment would not be able to take advantage of the link-and-mirror system unless he or she had a browser plug-in installed. There are a number of questions such as these that we will address through research and interviews with technologists and potential users in the first year of the project.
A Collaborative Approach to Development and Implementation
The mutual aid system will only be as successful as its network of participants. The research team will convene expert advisors and voluntary participants to assist, inform and evaluate the project in each of its phases, including organizations with knowledge of the issues faced by foreign communities. Potential implementations will be developed using open protocols and in a participatory manner. Our technology will be open to peer review and will build on the experience gained through our evaluations of existing circumvention technologies as well as our research on DDOS attacks, surveillance and technical filtering. We will have security experts involved at every stage of the project to ensure that participants are kept safe and to make the service robust and trustworthy.
[1] Ethan Zuckerman, Hal Roberts, Ryan McGrady, Jillian York, and John Palfrey, “Distributed Denial of Service Attacks Against Independent Media and Human Rights Sites,” Berkman Center for Internet & Society, December 2010, available at: http://cyber.law.harvard.edu/sites/cyber.law.harvard.edu/files/2010_DDoS_Attacks_Human_Rights_and_Media.pdf
[2] Jonathan Zittrain, “The Fourth Quadrant,” 78 Fordham L. Rev. 101 at 109 (2010); and Jonathan Zittrain, “A Mutual Aid Treaty for the Internet,” The Brookings Institution, Future of the Constitution Series No. 8, January 2011, available at: http://www.brookings.edu/papers/2011/0127_internet_treaty_zittrain.aspx
[3] Hal Roberts, Ethan Zuckerman, Jillian York, Robert Faris, and John Palfrey, “2010 Circumvention Tool Usage Report,” Berkman Center for Internet & Society, October 2010, available at: http://cyber.law.harvard.edu/sites/cyber.law.harvard.edu/files/2010_Circ...
Last updated August 27, 2012