How the Great Firewall of China is Blocking Tor

[Figure: Chinese filtering infrastructure]

This study investigated how the Great Firewall of China (GFC) is blocking the Tor anonymity network. Tor is an overlay network which provides its users with anonymity on the Internet. A more detailed explanation is available on the project website or in the design paper.

A large number of so-called entry guards and bridge relays serve as the entry points to the network. If these entry points are not reachable, a user finds herself unable to connect to the Tor network. Because Tor is increasingly used to circumvent censorship systems, countries such as China are trying to block these very entry points.

According to recent reports, China's firewall is now able to dynamically recognise Tor usage and block the respective relays and bridges. The diagram to the right illustrates how the block works. In a nutshell, 1) the firewall searches for a byte pattern that identifies a network connection as Tor. If this pattern is found, 2) the firewall initiates a scan of the host that is believed to be a bridge. In particular, 3) the scan is run by seemingly arbitrary Chinese computers which connect to the bridge and try to “speak Tor” to it. If this succeeds, the bridge is blocked.
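The three steps can be made concrete with a toy model. The following Python sketch is only an illustration of the control flow described above, not the firewall's actual logic: the byte pattern `TOR_FINGERPRINT`, the `Firewall` class and the `speaks_tor` callback are hypothetical names invented for this example, and the real identifying bytes are not reproduced here.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder pattern; NOT the bytes the real firewall looks for.
TOR_FINGERPRINT = b"\x00TOR-LIKE\x00"


@dataclass
class Firewall:
    """Toy model of the fingerprint -> scan -> block sequence."""
    blocked: set = field(default_factory=set)

    def inspect(self, payload: bytes, dst: tuple, speaks_tor) -> bool:
        """Return True if this connection caused `dst` to be blocked."""
        # Step 1: search the passing traffic for the identifying bytes.
        if TOR_FINGERPRINT not in payload:
            return False
        # Steps 2 and 3: scan the suspected bridge. `speaks_tor` stands in
        # for a scanner that connects to `dst` and attempts a Tor handshake.
        if not speaks_tor(dst):
            return False
        # The scan succeeded, so the (address, port) pair is blocked.
        self.blocked.add(dst)
        return True


fw = Firewall()
bridge = ("192.0.2.10", 40123)  # documentation address, assumed for the demo
web_server = ("192.0.2.20", 443)

# Traffic without the pattern never triggers a scan.
ignored = fw.inspect(b"plain traffic", bridge, lambda dst: True)
# Matching traffic to a host that does not speak Tor is scanned, not blocked.
scanned_only = fw.inspect(TOR_FINGERPRINT, web_server, lambda dst: False)
# Matching traffic to a real bridge leads to a block.
blocked_now = fw.inspect(b"hello" + TOR_FINGERPRINT, bridge, lambda dst: True)
```

In this model a block requires both conditions: the pattern must appear in the connection and the follow-up scan must succeed. That is why traffic to a host that does not speak Tor is scanned but not blocked.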

Effective countermeasures build on a sound understanding of the filtering in place. For this reason, this study was conducted with the goal of revealing and understanding the inner workings of the blocking infrastructure. The contributions of this study are threefold:

  1. We reveal how Chinese users are hindered from accessing the Tor network.
  2. We conjecture how China's Tor blocking infrastructure is designed.
  3. We discuss evasion strategies.

This website contains the published papers, the developed software and the gathered data. For a short summary of our work, see the media articles in MIT Technology Review, V3.co.uk and The Verge.

Papers

We first published a technical report about our findings, which was then followed by a peer-reviewed workshop paper. We recommend reading the workshop paper listed below, as it contains several updates and corrections over the original technical report. The ;login: article below contains fewer technical details and is easier to read.

Software

All of the code listed below is licensed under the GPLv3.

Data

A large part of our study consisted of analysing data we gathered by attracting numerous Chinese scanners. We configured all our bridges to be private (i.e., only known to ourselves) and to listen on randomly chosen high TCP ports. That way, we can be sure that our data contains only automated scanners and no real users. The raw data is available below.

| Filename | Description |
|---|---|
| scanning_connections.csv | Connection data of all scanners which were found connecting to our bridge. The file contains four columns: IP address, source port, UNIX timestamp when the scanner connected and the sent data. |
| scanners_asn.txt | The autonomous system numbers of the scanning IP addresses. The ASN was resolved using Team Cymru's IP to ASN Lookup tool. |
| scanners_reverse_lookup.txt | All valid reverse DNS lookups of the observed scanners. The lookups were done using Google's open DNS server 8.8.8.8. |
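As a starting point for working with the connection data, the following Python sketch reads the four columns described for scanning_connections.csv and counts connections per scanner address. It rests on assumptions that the description above does not confirm: that the file is comma-separated, has no header row, and stores the columns in the listed order. The helper names `load_scans` and `scans_per_ip` and the sample rows are made up for illustration and are not taken from the data set.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone


def load_scans(fileobj):
    """Yield one record per row: IP address, source port, time, sent data."""
    for row in csv.reader(fileobj):
        if len(row) < 4:
            continue  # skip blank or incomplete lines
        ip, port, timestamp, data = row[0], row[1], row[2], row[3]
        yield {
            "ip": ip,
            "port": int(port),
            # UNIX timestamp converted to a timezone-aware UTC datetime.
            "time": datetime.fromtimestamp(float(timestamp), tz=timezone.utc),
            "data": data,
        }


def scans_per_ip(records):
    """Count how many connections each scanner IP address made."""
    return Counter(record["ip"] for record in records)


# Made-up sample rows in the assumed layout (documentation IP addresses).
sample = io.StringIO(
    "192.0.2.1,40001,1330000000,abcd\n"
    "192.0.2.1,40002,1330000900,abcd\n"
    "198.51.100.7,51515,1330001800,ef01\n"
)
records = list(load_scans(sample))
counts = scans_per_ip(records)

# For the real file, something like the following should work if the
# assumptions above hold:
#   with open("scanning_connections.csv", newline="") as f:
#       counts = scans_per_ip(load_scans(f))
```

If the real file turns out to use a different delimiter or to include a header line, the `csv.reader` call and the row handling would need to be adjusted accordingly.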

Contact

Feel free to contact Philipp using phw at nymity dot ch.
You can encrypt e-mails by using this OpenPGP public key.