Documentation of Internet Filtering in Saudi Arabia
Jonathan Zittrain* and Benjamin Edelman**
Berkman Center for Internet & Society
Harvard Law School
Abstract: The authors connected to the Internet through proxy servers in Saudi Arabia and attempted to access approximately 60,000 Web pages as a means of empirically determining the scope and pervasiveness of Internet filtering there. Saudi-installed filtering systems prevented access to certain requested Web pages; the authors tracked 2,038 blocked pages. Such pages contained information about religion, health, education, reference, humor, and entertainment. The authors conclude (1) that the Saudi government maintains an active interest in filtering non-sexually explicit Web content for users within the Kingdom; (2) that substantial amounts of non-sexually explicit Web content are in fact effectively inaccessible to most Saudi Arabians; and (3) that much of this content consists of sites that are popular elsewhere in the world.
A 2001 Council of Ministers Resolution prohibits users within the Kingdom of Saudi Arabia from publishing or accessing certain content on the Internet. The government's Internet Services Unit (ISU) operates the high-speed data links that connect the country to the international Internet; while Saudi Internet users may subscribe to any of a number of local Internet service providers, all Web traffic is apparently forwarded through a central array of proxy servers at the ISU, which implements Internet content filtering roughly in line with parts of the Resolution. If a user's requested URL is found on the Saudi blacklist, the user is directed to a page that explicitly informs him or her that access to the site has been denied. The ISU administrative web site explains the implementation of the government's content filtering regime, presents the reasoning behind it, and lets Saudi Internet users request that a particular site or URL be blocked or unblocked. Citing the Qur'an as a basis, the government describes its filtering task as "preserv[ing] our Islamic values, filtering the Internet content to prevent the materials that contradict with our beliefs or may influence our culture."
In addition to detailing Saudi blocking of sexually explicit content, the ISU web site lists as bannable "pages related to drugs, bombs, alcohol, gambling and pages insulting the Islamic religion or the Saudi laws and regulations." Such non-sexually explicit sites are said to be blocked only upon the direction of security bodies within the Saudi government. The ISU describes its policy as filtering only the "absolute minimum possible number of web pages possible to fulfill its duties."
As with most filtering regimes, whether implemented at the client, ISP, or government level, no list is made available of the sites blocked. We therefore sought to collect and distribute a list of blocked sites and pages -- a list that is large in absolute terms even if small relative to the size of the Internet and to the total amount of blocked content, and a list that is diverse even if not perfectly representative of all blocked content. Such a list allows us and others to begin to assess the nature and scope of filtering within Saudi Arabia, with particular attention to non-sexually explicit Web sites rendered inaccessible there. Having requested some 64,557 distinct web pages and found 2,038 to be blocked, we conclude that Saudi Arabia indeed blocks a range of web content beyond that which is sexually explicit. For example, we found blocking of at least 246 pages indexed by Yahoo as Religion (including 67 about Christianity, 45 about Islam, 22 about Paganism, 20 about Judaism, and 12 about Hinduism). We also found blocking of 76 pages within Yahoo's humor categories, 70 within music categories, and 43 within movies, and we found 13 blocked pages about homosexuality. Taken as a whole, the Saudi government's stated blocking criteria are quite broad, making it difficult to assess whether the blocking of a given site is consistent with the criteria. However, a look at the list beyond sexually explicit content yields some insight into the particular areas the Saudi government appears to find most sensitive.
In future work, the authors intend to expand analysis to Internet filtering systems in other countries and to generate URLs to test based on queries invoked in the local language. The authors are also developing a distributed application for use by Internet users worldwide in testing, analyzing, and documenting their respective Internet filtering regimes.
Specific Pages Found to be Blocked
With the permission and cooperation of ISU staff, we obtained access to the ISU's proxy servers from May 14 to May 27, 2002. During that time we requested 64,557 distinct URLs drawn from various web indices, and we were able to determine which specific Web pages among them were blocked from within Saudi Arabia. We found that entire sites could be filtered, or individual pages within them.
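The testing procedure described above can be sketched roughly as follows. This is a minimal illustration and not the authors' actual tooling: the proxy address, the block-page marker string, and the function names are all hypothetical, and the rule used to recognize a block page is an assumption.

```python
import urllib.error
import urllib.request

# Hypothetical values: the real proxy address and the wording of the
# block page are not given in this report.
PROXY = "http://proxy.example.invalid:8080"
BLOCK_MARKER = "access to the requested url is not allowed"


def classify(status, body):
    """Label one response as 'blocked', 'accessible', or 'error'."""
    if status is None:
        return "error"  # no response at all (timeout, DNS failure, ...)
    if BLOCK_MARKER in body.lower():
        return "blocked"  # the proxy returned its denial page
    return "accessible" if 200 <= status < 400 else "error"


def fetch_via_proxy(url, proxy=PROXY, timeout=30):
    """Request url through the proxy; return (status, body) or (None, '')."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy})
    )
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # An HTTP error may still carry the denial page in its body.
        return err.code, err.read().decode("utf-8", "replace")
    except (urllib.error.URLError, OSError):
        return None, ""


def survey_urls(urls, fetch=fetch_via_proxy):
    """Map each distinct URL to its classification."""
    return {url: classify(*fetch(url)) for url in dict.fromkeys(urls)}
```

Keeping classify separate from the network call means the labeling rule can be checked against saved responses without access to the proxy.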
Filtering of Sexually Explicit Content
A preliminary round of testing examined 795 distinct URLs containing sexually explicit images. These URLs had been used as the basis for a portion of one author's expert testimony in American Library Association v. United States, 201 F.Supp.2d 401 (E.D.Pa., 2002). An expert for the plaintiffs had generated this list by collecting all 797 results from Google in response to an October 2001 Web search using the search criteria "free adult sex," less two pages removed because they were found not to include sexually explicit images. Of these 795 pages, 685 (86.2%) were blocked while 110 (13.8%) were accessible.
Filtering of Other Content
Our main testing examined 63,762 web pages drawn from categories other than sexually explicit content. These pages were extracted from selected areas of the Yahoo Directory (detailed below); from Google's "Similar Pages" feature (requesting pages similar to pages in certain Yahoo categories); and from ordinary Google searches. Of the tested pages, a total of 1,353 were found to be blocked. Some of these blocked pages may fit the second half of Saudi Arabia's stated blocking profile ("related to drugs, bombs, alcohol, gambling, and pages insulting the Islamic religion or the Saudi laws and regulations"), a small number may actually be sexually explicit, while still others may be examples of overblocking, i.e. blocking of pages beyond Saudi Arabia's stated blocking criteria.
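The counts reported for the two rounds of testing can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the block rates for the non-explicit sample and for the combined sample are derived here and are not figures the authors report.

```python
# Counts as reported in the text.
explicit_tested, explicit_blocked = 795, 685
other_tested, other_blocked = 63_762, 1_353

total_tested = explicit_tested + other_tested
total_blocked = explicit_blocked + other_blocked


def pct(part, whole):
    """Percentage of whole represented by part, to one decimal place."""
    return round(100 * part / whole, 1)


explicit_rate = pct(explicit_blocked, explicit_tested)
other_rate = pct(other_blocked, other_tested)      # derived, not in the text
overall_rate = pct(total_blocked, total_tested)    # derived, not in the text

print(total_tested, total_blocked)
print(explicit_rate, other_rate, overall_rate)
```

The totals match the 64,557 pages requested and the 2,038 pages found blocked, and the difference between the two block rates shows how much more heavily the sexually explicit sample was filtered.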
Given the large number of pages blocked, we have organized our listing of specific blocked pages into highlights (a subset of blocked pages that are well known or otherwise of possible interest) followed by the full list. Where available, each page's listing includes its HTML title as well as META keywords and description, its Yahoo Directory and Google Directory classifications, and information about past snapshots of the page available in the Internet library archive.org. These details are as retrieved in June 2002.
Specific web pages blocked in Saudi Arabia
Highlights of blocked pages - pages that are well known or otherwise of particular interest
Complete list of 1,353 pages, sorted alphabetically by URL
Raw testing results (.ZIP file, >800KB)
For the duration of our limited access to the ISU proxy server system, we retested pages initially found to be blocked in order to determine whether blocking continued over time and whether ISU staff ever reversed decisions to block certain content. Our testing indicated that four blocked pages became unblocked during the course of testing: swim-n-sport.com (an online swimsuit catalog) was blocked on May 14, 19, and 22, but was accessible in testing of May 24 and 27. The front page and one additional page of theonion.com (an online humor magazine) were found to be blocked on May 19 and May 22, but they too were accessible in testing of May 24 and 27. Finally, warfarerecords.net (a pay-per-click search engine) was blocked in testing of May 14, 19, and 22, but it was also accessible on May 24 and 27. Our inference from these results is that ISU staff may periodically revisit blocked site logs to restore access to certain blocked sites; however, given the small number of sites unblocked during the sampled time period, we are uncertain of the prevalence of this procedure.
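The retesting logic can be illustrated with a short sketch that looks for a change from blocked to accessible between consecutive test dates. The function name and data layout are hypothetical; the sample observations are the dates given above for two of the sites.

```python
from datetime import date


def unblocked_during_testing(observations):
    """Return pages whose status went from blocked to accessible.

    observations maps a page to a list of (date, was_blocked) pairs.
    The result maps each such page to (last blocked date, first open date).
    """
    changed = {}
    for page, results in observations.items():
        ordered = sorted(results)  # chronological order
        for (d_prev, prev_blocked), (d_next, next_blocked) in zip(ordered, ordered[1:]):
            if prev_blocked and not next_blocked:
                changed[page] = (d_prev, d_next)
                break
    return changed


# Observations for two of the sites named above (2002 dates from the text).
observations = {
    "swim-n-sport.com": [
        (date(2002, 5, 14), True), (date(2002, 5, 19), True),
        (date(2002, 5, 22), True), (date(2002, 5, 24), False),
        (date(2002, 5, 27), False),
    ],
    "theonion.com": [
        (date(2002, 5, 19), True), (date(2002, 5, 22), True),
        (date(2002, 5, 24), False), (date(2002, 5, 27), False),
    ],
}

for page, (last_blocked, first_open) in unblocked_during_testing(observations).items():
    print(page, last_blocked.isoformat(), first_open.isoformat())
```

Both sites show the same transition, between the May 22 and May 24 tests.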
Analysis & Summary Statistics
The blocked web pages cover a wide variety of substantive areas. To get a better sense of the types of pages blocked, we have organized the blocked sites within the Yahoo hierarchy where possible. For each blocked URL, Yahoo categories were obtained by entering the blocked URL into Yahoo's ordinary search interface.
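One way to produce per-level counts from such category listings is sketched below. This is an assumption about how the tallies could be computed, not a description of the authors' method; the function name, the example URLs, and the sample category paths are illustrative only.

```python
from collections import Counter


def tally_by_level(page_categories, max_depth=4):
    """Count blocked pages under each category prefix, for each depth.

    page_categories maps a page URL to a list of category paths, each a
    tuple such as ("Entertainment", "Humor"). A page listed in several
    categories is counted once per distinct prefix at each depth.
    """
    tallies = {depth: Counter() for depth in range(1, max_depth + 1)}
    for paths in page_categories.values():
        for depth in tallies:
            prefixes = {tuple(p[:depth]) for p in paths if len(p) >= depth}
            tallies[depth].update(prefixes)
    return tallies


# Illustrative input only; these are not pages from the study.
sample = {
    "http://site-a.example/": [
        ("Society and Culture", "Religion and Spirituality", "Faiths and Practices"),
    ],
    "http://site-b.example/": [
        ("Entertainment", "Humor"),
        ("Entertainment", "Music"),
    ],
    "http://site-c.example/": [
        ("Society and Culture", "Religion and Spirituality"),
    ],
}

tallies = tally_by_level(sample)
for depth, counts in tallies.items():
    for path, n in sorted(counts.items()):
        print(depth, " > ".join(path), n)
```

Counting each page once per prefix keeps a page with two listings under the same top-level category from inflating that category's total.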
Of the 884 web pages with at least one listing in Yahoo's web directory, pages were included in the Yahoo categories as reported in the following tables:
Count of pages found to be blocked in Saudi Arabia, by Yahoo top-level category, second-level category, third-level category, and fourth-level category
Among the specific blocked pages are the following categories of content:
Among the pages tested were many thousands not affected by the Saudi filtering system. We attempted to access many sites based on our initial knowledge of what content is blocked in other countries worldwide and of what content might be of particular concern to the Saudi Arabian government. We found that news sites, US government sites, and Israeli government sites (excluding the Israel Defense Force) could all be viewed as usual. We also found that the overwhelming majority of education sites remained accessible.
Conclusions and Future Work
Since our listing of blocked pages is not and cannot be perfectly representative of content blocked in Saudi Arabia, it is difficult to draw sweeping conclusions about the Saudi blocking system. On the basis of the blocked sites we have found, we do conclude (1) that the Saudi government maintains an active interest in filtering non-sexually explicit Web content from users within the Kingdom; (2) that substantial amounts of non-sexually explicit Web content are in fact effectively inaccessible to most Saudi Arabians; and (3) that much of this content consists of sites that are popular elsewhere in the world.
Use of others' work to assist filtering. The ISU reports that it delegates to its filtering software provider the preparation of a list of pornographic sites to be blocked. Should the ISU choose to "farm out" such work, those reviewing sites or creating filtering lists can be anywhere in the world and still, from a technical perspective, effectively implement their blocks within Saudi Arabia. Such delegation also accords with a New York Times account from November 2001 which described the competition among nearly a dozen mostly American software companies to provide content filters and reported that Secure Computing's Smartfilter was currently in place. ("Companies Compete to Provide Internet Veil for the Saudis," New York Times, November 19, 2001. Archived at websense.com.) Accordingly, it is likely that the Saudi Arabian blocking system inherits whatever categorization errors are made by the current provider of proxy and filtering software; some such errors are documented in one author's previous Sites Blocked by Internet Filtering Programs. While ISU's "filtering procedure" page reports that Saudi Arabia blocks sexually-explicit content on the basis of determinations made by its filtering software provider, reviewing the list of specific blocked pages suggests that the ISU may also have engaged categories of the filtering program that pertain to drugs and to personal home pages. Smartfilter includes both of these categories in its control list.
Indeed, the Yahoo categories that provided the basis for a portion of our queries to the Saudi proxy servers could themselves be used to help determine sites and pages for blocking. However, review of Yahoo-listed sites blocked suggests that there has been no wholesale adoption by the Saudi filterers of Yahoo categories listing Web pages within sensitive substantive areas.
Effectiveness of the Web filtering regime. The significance of the contents of the Saudi filters depends in part on the robustness of the filtering system against those who seek to bypass it. One common method of bypassing a filtering system is via independent, non-filtered proxy servers that can intermediate access requests. For example, a Saudi user might request from megaproxy.com that megaproxy.com give the user a copy of some blocked page; if the Saudi user can access megaproxy, this approach ordinarily bypasses Saudi filtering since megaproxy's Internet access is unfettered by Saudi network policy. However, the Saudi filtering system blocks access to megaproxy.com as well as a large number of other well-known proxy servers, suggesting that Saudi filtering administrators are well aware of this loophole and have sought to close it. Such "loophole" sites include not just proxy servers but also privacy protection systems and web page translators; further testing shows that such services are also blocked in Saudi Arabia.
Since the best-known methods of circumventing filters are blocked in Saudi Arabia, our sense to date is that the Saudi filtering system is likely relatively effective in constraining the information accessed by most Saudis. At the same time, we expect that tech-savvy users can devise new methods to circumvent blocking. However, should savvy users share their methods with many additional users, Saudi network staff would likely work to close newly exposed loopholes; we therefore conclude that filtering is likely to remain effective over time. In addition, since Saudi network staff can review access logs of accepted web requests, even expert Internet users can never fully know whether a given circumvention method will later prompt an investigation or even criminal sanctions. It remains unknown whether other methods of circumventing filtering -- peer-to-peer applications, for example -- are successful or even usable on the Saudi network. The authors' tests were limited to ordinary HTTP requests sent to the default port 80 of the servers hosting the desired Web pages.
Popularity of sites blocked. The significance of the Saudi blocking system depends in part on the relative popularity of blocked sites; if blocked sites would be frequently accessed by Saudis (if accessible to them at all), the blocking is in a certain sense more constraining than if the blocked sites would be of little interest. Certain of the sites found to be blocked seem to be quite popular without specific reference to localized surfing variations, as measured by the number of inbound links from other Web pages. Google reports that 48,700 distinct links point to pages at the ivillage.com Women's Network (all of which appears to be blocked); 18,100 to the cards.webshots.com eCards site; 15,300 to the terra.es Spanish-language portal; 13,100 to the theonion.com humor magazine; and 9,470 to the systransoft.com translator. Furthermore, archive.org change-tracking histories report that many of the blocked sites change frequently; the rollingstone.com magazine site was found to offer at least 461 distinct front pages between 1997 and 2001; the hecklers.com comedy site, 263; the brutal.com news site, 150. While Saudi Internet users may seek access to sites other than those most linked by Internet authors worldwide and other than those that change most frequently, these link and change counts suggest that at least some of the blocked sites are of substantial interest to Internet users.
Future work might seek to investigate some or all of the following issues:
* Jack N. and Lillian R. Berkman Assistant Professor of Entrepreneurial Legal Studies, Harvard Law School.
** J.D. Candidate, Harvard Law School, 2005.
Support for this project was provided by the Berkman Center for Internet & Society at Harvard Law School.
Last Updated: September 12, 2002