VIOLA measures the BitTorrent network in a distributed manner. In contrast to existing BitTorrent measurement systems (cf. Section II), which typically take snapshots of the overlay network, VIOLA is able to monitor a large number of swarms over an extended period of time. VIOLA is deployed on one master node, which gives instructions to slave nodes deployed on smaller machines. The data gathered by the slaves is returned to the master and stored in a database. VIOLA discovers torrents from a torrent portal and then measures the discovered torrents.
A paper using this data was presented at NOMS 2016.
If you have questions, contact Andri Lareida.
The data model used to persist the collected data follows the actual objects measured (see Fig. 1). The relational database consists of three tables: TORRENTS, ANNOUNCE RESULT, and PEERS. The TORRENTS table contains information about the torrent itself, such as title and size. It also contains metadata used in the measurement, such as the ACTIVE flag, which defines whether a torrent should still be measured. The ANNOUNCE RESULT table stores metadata from the announces executed by slaves, e.g., the IP address of the slave executing the announce, the tracker the reply was received from, and the number of seeders and leechers reported by that tracker. Since a torrent is identified by its info hash, the info hash is used as a foreign key to link the announce data to the torrent metadata. Finally, the IP addresses of peers returned by the tracker are stored in the PEERS table, which contains, among other fields, the IP address, port number, AS number, and country code. The AS number and country code are resolved through GeoIP databases from MaxMind.
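The relationships between the three tables can be sketched with an in-memory SQLite database. This is a minimal illustration only: the column names follow this page, but the column types, the constraints, and the sample rows are assumptions and do not reproduce VIOLA's actual schema or data.

```python
import sqlite3

# Hypothetical DDL: names follow the description on this page, while types
# and constraints are assumptions, not VIOLA's actual schema.
SCHEMA = """
CREATE TABLE torrents (
    info_hash     TEXT PRIMARY KEY,  -- hex form of the 160-bit info hash
    torrent_title TEXT,
    torrent_size  TEXT,
    active        INTEGER            -- 1 while the torrent is still measured
);
CREATE TABLE announce_result (
    id          INTEGER PRIMARY KEY,
    info_hash   TEXT REFERENCES torrents(info_hash),
    tracker_uri TEXT,
    slave_ip    TEXT,
    seeders     INTEGER,
    leechers    INTEGER
);
CREATE TABLE peers (
    id          INTEGER REFERENCES announce_result(id),
    info_hash   TEXT REFERENCES torrents(info_hash),
    hex_ip_hash TEXT,
    port        INTEGER,
    asnumber    INTEGER,
    country     TEXT
);
"""


def build_demo_db():
    """Create the sketch schema and fill it with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    info_hash = "ab" * 20  # 40 hex characters = 160 bits
    conn.execute(
        "INSERT INTO torrents VALUES (?, ?, ?, ?)",
        (info_hash, "demo", "1024", 1),
    )
    conn.execute(
        "INSERT INTO announce_result VALUES (?, ?, ?, ?, ?, ?)",
        (1, info_hash, "udp://tracker.example:80", "192.0.2.1", 5, 7),
    )
    conn.executemany(
        "INSERT INTO peers VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, info_hash, "aa", 6881, 64500, "CH"),
            (1, info_hash, "bb", 6882, 64501, "DE"),
        ],
    )
    return conn


def peers_per_announce(conn):
    """Join torrent -> announce result -> peers and count returned peers."""
    return conn.execute(
        """
        SELECT t.torrent_title, a.tracker_uri, COUNT(p.hex_ip_hash)
        FROM torrents t
        JOIN announce_result a ON a.info_hash = t.info_hash
        JOIN peers p ON p.id = a.id
        GROUP BY t.torrent_title, a.tracker_uri
        """
    ).fetchall()
```

Calling `peers_per_announce(build_demo_db())` walks the same path a query on the real data would take: the info hash links a torrent to its announce results, and the announce ID links each announce result to the peers it returned.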
Start | 07.04.2015 19:00 |
End | 20.04.2015 11:00 |
Slaves | 10 |
Interval | 20 min |
Index | Kickass Torrents |
Category | Movies |
The measurement period started at 19:00 hours on April 7, 2015 and lasted until 11:00 hours on April 20, 2015. Ten VIOLA slaves were used, all located at the premises of the University of Zurich. The announce interval, i.e., the period in which each slave queries the trackers of each torrent once, was 20 minutes. New torrents were discovered from the Kickass Torrents portal, and only torrents released after the start of the measurement were considered.
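As a quick sanity check, the length of the measurement period and the number of 20-minute announce intervals that fit into it can be computed from the values above. This is only arithmetic on the stated start, end, and interval. The number of rounds a given torrent actually received is lower whenever it was discovered later or deactivated earlier.

```python
from datetime import datetime, timedelta

# Values taken from the measurement parameters listed above.
start = datetime(2015, 4, 7, 19, 0)
end = datetime(2015, 4, 20, 11, 0)
interval = timedelta(minutes=20)

duration = end - start
duration_hours = duration.total_seconds() / 3600
# Dividing two timedeltas with // yields an integer count of full intervals.
max_rounds = duration // interval

print(duration_hours)  # 304.0
print(max_rounds)      # 912
```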
The data set is available as three comma-separated values (CSV) files; the download is about 50 GB in size. The columns of each file are described in the following tables:
TORRENTS
Column | Type | Description |
INFO_HASH | String | Hexadecimal representation of the 160-bit info hash. |
TORRENT_TITLE | String | Title of the download |
TORRENT_SIZE | String | Size of the download in bytes |
TORRENT_TRACKER_COUNT | String | Number of trackers used. |
TORRENT_COMMENT | String | Comment from the torrent portal. |
PUBLISH_DATE | Number | Timestamp at which the torrent was first published on the portal. |
MAGNET_URI | String | The magnet link for this torrent. |
TIME_ADDED | Number | Timestamp from which the torrent was measured. |
TIME_DEACTIVATED | Number | Timestamp at which VIOLA stopped measuring the torrent. |
TORRENT_LINK | String | Link to download the torrent file. |
ANNOUNCE RESULT
Column | Type | Description |
ID | Number | Used to link peer rows to announce result. |
INFO_HASH | String | Hexadecimal representation of the 160-bit info hash. |
TRACKER_URI | String | URI string of the tracker that returned this result. |
INTERVAL_NUMBER | Number | Count of the request round of the VIOLA system. |
ANNOUNCE_COMPLETED | Boolean | Indicates whether the announce request was completed successfully. |
SEEDERS | Number | The number of seeders in the swarm as reported by the tracker. |
LEECHERS | Number | The number of leechers in the swarm as reported by the tracker. |
TOTAL_PEERS | Number | Sum of reported seeders and leechers, which equals the swarm size. |
RETURNED_PEERS | Number | Number of IP addresses returned by the tracker. |
SLAVE_IP | String | IP address of the machine that queried the tracker. |
SLAVE_PORT | Number | Port number of the connection used by the slave to contact the master. |
TIMESTAMP | Number | Timestamp from the moment when this announce response was stored to disk. |
PEERS
Column | Type | Description |
ID | Number | Used to link peer rows to announce result. |
INFO_HASH | String | Hexadecimal representation of the 160-bit info hash. |
TIMESTAMP | Number | Timestamp from the moment when this announce response was stored to disk. |
HEX_IP_HASH | String | Hexadecimal representation of the salted hash of the IP address. |
PORT | Number | Port number of the peer. |
ASNUMBER | Number | Number of the Autonomous System (AS). |
CONTINENT | String | Continent code. |
COUNTRY | String | Country code. |
CITY | String | City name. |
The data collected by VIOLA combines aspects of different measurement systems. Those results can now be achieved with one single measurement run. Two examples of data contained in the set are given here.
Fig. 6 provides a detailed insight into the composition of the largest swarm, fast7, as reported by trackers. A swarm consists of seeders (peers that have the complete file) and leechers (peers that are still downloading the file). It took three days after the release of the torrent until the number of seeders and leechers broke even. The number of seeders increases constantly, while the number of leechers decreases after the initial peak. This means that leechers become seeders and do not immediately leave the system after they have completed their download. Furthermore, the total number of peers increases again after April 15. Peers show altruistic behavior, and free riding is not a problem in this case.
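The break-even point described above can be located in the ANNOUNCE RESULT rows with a few lines of Python. The sketch below assumes that the rows belong to a single torrent and a single tracker and that they were read from the CSV file as dictionaries keyed by column name, so all values arrive as strings. The sample rows are made up for illustration and are not values from the data set.

```python
def first_break_even(announce_rows):
    """Return the INTERVAL_NUMBER of the first round in which the reported
    seeders are at least as many as the reported leechers, or None."""
    ordered = sorted(announce_rows, key=lambda row: int(row["INTERVAL_NUMBER"]))
    for row in ordered:
        if int(row["SEEDERS"]) >= int(row["LEECHERS"]):
            return int(row["INTERVAL_NUMBER"])
    return None


# Made-up rows for illustration only; deliberately out of order.
demo_rows = [
    {"INTERVAL_NUMBER": "3", "SEEDERS": "40", "LEECHERS": "35"},
    {"INTERVAL_NUMBER": "1", "SEEDERS": "5", "LEECHERS": "90"},
    {"INTERVAL_NUMBER": "2", "SEEDERS": "20", "LEECHERS": "60"},
]

print(first_break_even(demo_rows))  # 3
```

How the reports of several trackers for the same torrent should be combined is left open here; the function simply inspects whatever series it is given.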
Fig. 4 depicts the number of unique IP addresses measured per continent over the 24 hours of April 13. Although India had the most unique IP addresses on this day, Europe in total had more. The time zone patterns are clearly visible, even for those continents with few IP addresses, e.g., Oceania (OC) and South America (SA). North America (NA) and SA are very much in sync, with their peak at 04:00 hours, followed by Asia (AS) and Europe (EU). Europe, spanning 3 hours in time difference, has the narrowest peak, while Asia, spanning 9 hours, has a very smooth peak. NA and the other continents with even fewer peers show smooth transitions as well.
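A count of this kind can be derived from the PEERS rows by collecting the distinct HEX_IP_HASH values per continent and hour. The sketch below is not the procedure used to produce the figure. It assumes that TIMESTAMP holds Unix epoch seconds and that hours are taken in UTC, neither of which is stated on this page, so the unit and time zone would need to be checked against the actual files. The sample rows are made up.

```python
from collections import defaultdict
from datetime import date, datetime, timezone


def unique_ips_per_continent_hour(peer_rows, day):
    """Count distinct hashed IP addresses per (continent, hour) on one day.

    Assumption: TIMESTAMP is Unix epoch seconds, interpreted in UTC.
    """
    hashes = defaultdict(set)
    for row in peer_rows:
        seen_at = datetime.fromtimestamp(int(row["TIMESTAMP"]), tz=timezone.utc)
        if seen_at.date() == day:
            hashes[(row["CONTINENT"], seen_at.hour)].add(row["HEX_IP_HASH"])
    return {key: len(values) for key, values in hashes.items()}


def _epoch(hour, minute):
    """Helper for the demo rows: epoch seconds on April 13, 2015 (UTC)."""
    moment = datetime(2015, 4, 13, hour, minute, tzinfo=timezone.utc)
    return str(int(moment.timestamp()))


# Made-up rows for illustration only.
demo_rows = [
    {"TIMESTAMP": _epoch(4, 0), "CONTINENT": "NA", "HEX_IP_HASH": "aa"},
    {"TIMESTAMP": _epoch(4, 20), "CONTINENT": "NA", "HEX_IP_HASH": "aa"},  # same peer, next round
    {"TIMESTAMP": _epoch(4, 40), "CONTINENT": "NA", "HEX_IP_HASH": "bb"},
    {"TIMESTAMP": _epoch(18, 0), "CONTINENT": "EU", "HEX_IP_HASH": "cc"},
]

counts = unique_ips_per_continent_hour(demo_rows, date(2015, 4, 13))
```

Using a set per (continent, hour) key means that a peer returned in several announce rounds within the same hour is counted once, which matches the notion of unique IP addresses used above.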