Project 3: Building a BitTorrent Client


Overview:

In this assignment, your job is to design and implement a simple BitTorrent (BT) client that reads a .torrent file containing metadata for the file to be downloaded, connects to a tracker as well as multiple peers, and is able to upload and download to seeds and peers simultaneously. This means you will have to manage the state of several connections and work with multiple threads. Your client will connect to seeds, peers, and leeches to download a single video file (.mov) from that peer. When the file is finished downloading, you will save it to the hard disk. All communication will be done over TCP; and should also accept incoming connections from other students' clients.You will not have to worry about choking any peers, but you will have to make sure to unchoke them and keep them up to date on what pieces your client has downloaded. You only have to download a single file from the torrent file.

Design Reviews:

To make sure that you have a working design before you attempt to implement it, you should meet with a TA to review your design. These meetings will be held around ten days before the due date, and should take around 20-25 minutes. Please bring a copy of your design for the TA to read. In addition, you will turn in your final design document reflecting the actual implementation one day after the phase is due (might not change from the design review document if you design well).

Submitting the Project:

Your project should be submitted by email as either a .zip or .tar file., and the "main" starting method should be in a file called mybtclient.java.

Assignment Outline:

  1. Resources for this project .
  2. Java and torrent files
  3. BitTorrent client operation
  4. Bencoding
  5. Communicating with the Tracker
  6. Communicating with Peers
  7. Helpful Hints and Programming Style
  8. Tasks
  9. Design Questions

Resources for this Project:

It is strongly recommended that you bookmark or download the Sun Java 1.5.0 API , as well as read the following pages:

Java and Torrent Files:

The following files will allow you to concentrate on the networking portion of the project and without having to worry about how to bencode or "unbencode" the torrent file and some of the communication.

BitTorrent Client Operation:

Your BitTorrent client should basically do the following:

  1. Take as a command-line argument the name of the .torrent file to be loaded and the name of the file to save the data to. For example:
    java mybtclient project3.torrent myvideo.wmv
  1. Open the .torrent file and parse it using the Bencoder class to decode the data. 
  2. Allocate the total space for the file (named by the second command-line argument) on the disk before starting the download of the file.
  3. Open a TCP socket to the tracker at the IP address and port specified by the TorrentFile object and send an HTTP GET request to the tracker.
  4. Decode the response from the tracker and extract the IP adresses and ports of the list of peers. You must extract these IP addresses from the list -- hard-coding it is not acceptable.
  5. Open TCP connections to the peers and begin downloading simultaneously from several of them.
  6. Download each piece of the file, verify its SHA-1 hash against the hash stored in the .torrent file, and then writing the appropriate pieces into the file. The first time you begin the download, you need to contact the tracker and let it know you are starting download.
  7. After a piece is downloaded and verified, notify the peers that you have completed the piece. You can keep track of valid pieces using a java.util.BitSet.
  8. Other peers should be able to request pieces that you have verified from your client, and your client should send those pieces to the peers, according to the BT protocol.
  9. Repeat steps 7-8 (using the same TCP connections) for the rest of the file. Periodically, your client should update its status to the tracker by repeating steps 4-6 (using the same TCP connections). The update period should be no less than the interval returned by the tracker. This may change during execution, so it should be updated each time the tracker is contacted (if the interval value is excessively large, you may cap it at 180 seconds). 
  10. Make sure your client has uploaded at least 10% of the file size before closing your program.
  11. When the file is finished, contact the tracker and send it the completed event and properly close all connections.

In addition to the above basic run-through of your program's execution, your program should be able to detect input from the user at any point (you can decide what the input should be), and allow the user to close the program, suspending download of the file. The user should be able to restart the program at a later time and resume downloading. Any non-verified pieces/blocks should be discarded before suspending, and the program should also save how much it has uploaded and downloaded for that torrent. How you choose to implement saving the state of your program is up to you.

Bencoding (Pronounced "Bee Encoding"):

Bencoding a method of encoding binary data. Tracker responses and inter-peer communication are bencoded, according to the BT protocol as shown in the following list (taken from http://www.bittorrent.org/protocol.html)

Communicating with the Tracker: 

Your BT client uses the tracker's IP address and port number supplied by the TorrentFile object to contact the tracker, and send it an HTTP_GET request with the following key/value pairs. Note that these are NOT bencoded, but must be properly escaped [this list is taken from http://www.bittorrent.org/protocol.html ]:

The response from the tracker is a bencoded dictionary and contains two keys:

Communicating with Peers:

Handshaking between peers begins with character nineteen (decimal) followed by the string 'BitTorrent protocol'. After the fixed headers are 8 reserved bytes which are set to 0. Next is the 20-byte SHA-1 hash of the bencoded form of the info value from the metainfo (.torrent) file. The next 20-bytes are the peer id generated by the client. The info_hash should be the same as sent to the tracker, and the peer_id is the same as sent to the tracker. If the info_hash is different between two peers, then the connection is dropped.

After the handshake, messages between peers take the form of <length prefix><message ID><payload> , where length prefix is a 4-byte big-endian value and message ID is a single decimal character. The payload depends on the message. Please consult either of the BT-related resources for detailed information describing these messages. Below is a list of messages that need to be implemented for this project.

The bitfield message may only be sent immediately after the handshaking sequence is completed, and before any other messages are sent. It is optional, and need not be sent if a client has no pieces.

The bitfield message is variable length, where X is the length of the bitfield. The payload is a bitfield representing the pieces that have been successfully downloaded. The high bit in the first byte corresponds to piece index 0. Bits that are cleared indicated a missing piece, and set bits indicate a valid and available piece. Spare bits at the end are set to zero.

Below is an example of what would take place between two peers setting-up a connection and starting sharing.

  1. The local host opens a TCP Socket to the remote peer and sends the handshake packet. The local host then listens for the remote peer to respond.
  2. Upon receiving the handshake packet and verifying the info_hash, the remote peer responds with a similar handshake packet (except with its peer_id). The remote peer then listens for the local host to send a bitfield or other packet. The remote host can send a bitfield packet to the local host at this time.
  3. Upon receiving the handshake and verifying the info_hash, the local host then (optionally) sends a bitfield packet which tells the remote peer which pieces it has downloaded and verified so far.
  4. If the local host is interested in what the remote peer has downloaded, then it sends an interested packet, otherwise it sends an uninterested packet. If the remote peer is interested in what the local host has downloaded, then it sends an interested packet, otherwise it sends an uninterested packet.
  5. When the local host, or the remote peer, is ready to download/upload to the other, it will send an unchoke packet.

Please note that clients will, and your client should, ignore any request packets received while a remote peer is choked. A client should only upload to another client if the connection is unchoked AND the remote peer is interested. This means that a remote peer will not reply to the local hosts's request packets unless you have expressed interest AND the remote peer has sent an unchoke packet. If the remote peer sends a choke packet during data transfer, any outstanding requests will be discarded and unanswered - they should be re-requested after the next unchoke packet.

Helpful Hints and Programming Style:

Writing maintainable code is just as important as writing correct code. Consequently, you will be graded on your style. Below is a list of suggestions to make your code easier to understand and maintain:

                          /*
               Precondition: b is non-zero.
               Postcondition: returns a / b.
              */
             double divide(double a, double b) {
               return (a / b);
                      }               int i = 0;     //i is an integer and the initial value is 0

is not useful and unnecessarily clutters the code.

Tasks:

  1. (5%) Correctly setting up the TCP sockets/connections
  2. (20%) Correctly interfacing with the tracker -- Retrieving the peer list, periodically updating the tracker, updating the tracker when quitting the application, and updating the tracker after completing the download.
  3. (20%) Correctly interfacing with  peers -- Handshaking with peers, maintaining local state, sending and receiving keep-alive messages.
  4. (20%) Correctly downloading from at least 2 peers simultaneously
  5. (20%) Correctly uploading to at least 2 peers simultaneously
  6. (15%) Correctly downloading and verifying the file

Design Questions:

Your design document should include the following:

Please answer the following questions in your design document (due one day after the code for this phase is due).

  1. How long did you originally think this project would take you to finish?
  2. How much time did you actually spend on this project?
  3. What parts of the project were the most challenging and which were the easiest?
  4. Did you have trouble finding resources for the project? 
  5. Were any parts of the project confusing to you? 
  6. Any additional feedback about the program -- please provide constructive/useful comments or insights.  

This assignment is derived from one developed at Rutgers University for CS 352.