Internet Filtering in China in 2004-2005
However, the filtering system can be bypassed: inserting an ampersand (&) into the HTTP GET
request, such as search?&q=cache, allowed access to Google's cache.
176
We could not access Web sites
with sensitive keywords, such as falun, in their URLs. The filtering mechanism appears specifically
designed to target Google's cache, since caches of other popular search engines, such as Yahoo!, worked
properly.
177
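The ampersand observation above can be illustrated with a short sketch. The report does not document how the filter actually matches requests, so the literal-substring rule below (BLOCKED_PATTERN and naive_filter_blocks) is an assumption made purely for illustration, and the example.com URLs are placeholders. The sketch shows that the two request forms differ as strings while typically parsing to the same query parameters on the server side.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical literal that a keyword filter might look for in a GET request.
# The real matching rule is not documented in the report; this is an assumption.
BLOCKED_PATTERN = "search?q=cache"


def naive_filter_blocks(url: str) -> bool:
    """Toy filter: block whenever the literal pattern appears in the URL."""
    return BLOCKED_PATTERN in url


def query_params(url: str) -> dict:
    """Query parameters as a standard parser would see them."""
    return parse_qs(urlsplit(url).query)


# Placeholder URLs; example.com stands in for any cached page.
plain = "http://www.google.com/search?q=cache:example.com"
padded = "http://www.google.com/search?&q=cache:example.com"

print(naive_filter_blocks(plain), naive_filter_blocks(padded))
print(query_params(plain) == query_params(padded))
```

Under this assumed rule, the extra ampersand changes the byte sequence the filter inspects without changing the parsed request, which is consistent with the behavior we observed.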
Although China no longer blocks Google entirely, a Chinese user will have a very different
experience when using the search engine for some queries due to the state's filtering practices. Accessing
Google's cache is a well-known method of ad hoc circumvention of Internet censorship, and China's
filtering mechanism seems designed specifically to close this loophole.
I. Filtering by Chinese Search Engines Baidu and Yisou
In July 2004, Reporters Sans Frontieres criticized Google and Yahoo! as complicit in China's
filtering practices because of the companies' holdings in two domestic Chinese search engines, Baidu.com
and Yisou.com.
178
We researched RSF's claims and confirmed that Baidu and Yisou filter by keyword and
remove certain search results from their lists, but found that some keyword searches were blocked by
China's gateway filtering rather than by the search engines themselves.
We searched Baidu and Yisou for various sensitive keywords, such as free Tibet and falun. By
refining our searches to look specifically within URLs as well as page content, we concluded that the
search engines index sensitive sites but do not list them among search results. We posit that the search
engine crawlers that compile results may be able to index content despite China's filtering, perhaps by
operating from a remote location outside China, since the crawlers did index some sensitive sites. We
also found that some cached versions of sensitive sites were sporadically available, leading us to conclude
that filtering occurs upstream, at the Internet infrastructure level.
Our Baidu and Yisou testing also provided important insight into the mechanics of
China's Web filtering. When a user requests a banned keyword, the filtering system terminates that user's
connection to the destination server by sending a TCP RST (reset) packet to the user and then
advertising a TCP window size of zero (ZeroWindow).
179
This technique uses TCP's flow control feature to prevent the
user's computer from transmitting additional data to the destination server (such as the Baidu.com search
engine). This disconnection persisted for prolonged periods despite multiple attempts to reconnect.
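The reset-then-zero-window pattern described above can be recognized in a packet trace with a few lines of code. The following is a minimal sketch, not the tooling used in our tests: it decodes only the fixed 20-byte TCP header, and the names Segment, parse_tcp_header, build_tcp_header, and reset_then_zero_window are hypothetical helpers introduced for illustration. The port numbers and window values in the sample trace are made up.

```python
import struct
from typing import Iterable, NamedTuple

# Fixed 20-byte TCP header: ports, seq, ack, data offset, flags,
# window, checksum, urgent pointer (network byte order).
TCP_HEADER = struct.Struct("!HHIIBBHHH")
FLAG_RST = 0x04
FLAG_ACK = 0x10


class Segment(NamedTuple):
    src_port: int
    dst_port: int
    flags: int
    window: int


def parse_tcp_header(raw: bytes) -> Segment:
    """Decode the fixed part of a TCP header; options and payload are ignored."""
    src, dst, _seq, _ack, _off, flags, window, _csum, _urg = TCP_HEADER.unpack(raw[:20])
    return Segment(src, dst, flags, window)


def build_tcp_header(src: int, dst: int, flags: int, window: int) -> bytes:
    """Build a synthetic header for experiments (checksum left at zero)."""
    return TCP_HEADER.pack(src, dst, 0, 0, 5 << 4, flags, window, 0, 0)


def reset_then_zero_window(segments: Iterable[Segment]) -> bool:
    """True if an RST segment is later followed by a segment advertising window 0."""
    seen_rst = False
    for seg in segments:
        if seen_rst and seg.window == 0:
            return True
        if seg.flags & FLAG_RST:
            seen_rst = True
    return False


# Synthetic trace: normal ACK, then RST, then a zero-window advertisement.
trace = [
    parse_tcp_header(build_tcp_header(80, 50000, FLAG_ACK, 5840)),
    parse_tcp_header(build_tcp_header(80, 50000, FLAG_RST, 5840)),
    parse_tcp_header(build_tcp_header(80, 50000, FLAG_ACK, 0)),
]
print(reset_then_zero_window(trace))
```

A zero window tells the receiving host that the peer has no buffer space, so a well-behaved TCP stack stops sending data until a larger window is advertised; paired with the reset, this keeps the user's machine from transmitting further requests.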
We partially confirmed Reporters Sans Frontieres' claims that the search engines Baidu and
Yisou, with which Google and Yahoo! have investment relationships, filter the Web content they return
when users search for certain sensitive keywords. However, this is only part of a set of complex,
176
See the enumeration report documenting our test results at
http://www.opennetinitiative.net/bulletins/006/googlecacheservers mod.html.
177
See the enumeration report documenting our test results at
http://www.opennetinitiative.net/bulletins/006/othersearchengines.html.
178
Reporters Sans Frontieres, Google Yahoo Market Battle Threatens Freedom of Expression, at
http://www.rsf.org/article.php3?id_article=11031 (July 26, 2004).
179
See OpenNet Initiative, Probing Chinese Search Engine Filtering, at
http://www.opennetinitiative.net/bulletins/005/#res (Aug. 19, 2004); see generally Von Welch, A User's Guide to TCP
Windows, at http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html (last updated June 19, 1996).