Figure 2: Visualization of Gnutella overlay topology: (a) unbiased Gnutella, (b) Gnutella with oracle.
Table 1: Number of exchanged Gnutella message types
This leads us to conclude that consulting the oracle for neighborhood selection, during the bootstrapping stage as well as the file-exchange stage, significantly increases the localization of P2P traffic.
After extensive simulations on general overlay graphs and the Gnutella system, we now confirm these results by modifying a P2P client, namely Gnutella, to take advantage of the oracle service in a controlled setting, a Testlab.
networks very efficiently, by traversing fewer overlay hops, which is reflected in Table 1. Thus information is propagated with fewer message hops, lower delays and reduced network overhead.
Localization of content exchange: The negotiation traffic stays within the set of connected Gnutella nodes, but the actual file content exchange happens outside the Gnutella network, using the standard HTTP protocol. When a Gnutella node gets multiple QueryHits for its search query, it chooses a node randomly and initiates an HTTP session with it to download the desired file content. Since the file content is often bulky, it is prudent to localize this traffic as well, as it relates directly to user experience. In the above experiments, we use the oracle to bias only the neighborhood selection. In other words, when a node comes online, it consults the oracle and sends connection requests to an oracle-recommended node selected from its Hostcache. However, when choosing a node from the QueryHits, it has so far not consulted the oracle. We now analyze how much of the file content exchange remains local in this case and how much one can gain by consulting the oracle again at this stage.
We observe that the intra-AS file exchange, which is 6.5% in the unbiased case, improves slightly to 7.3% with the oracle at list size 100, and to 10.02% with the oracle at list size 1000.
We then extend the modification so that a node consults the oracle again at the file-exchange stage, passing it the list of nodes from which it received QueryHits. After this change, we notice that 40.57% of the file transfers now occur within an AS. In other words, about 34% of the file content, which was available at a node within the querying node’s AS, was previously downloaded from a node outside that AS.
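The difference between the two source-selection policies can be illustrated with a minimal sketch. It is not the modified GTK-Gnutella code: the node names, the `AS_OF` mapping, and the toy `as_hops` function are hypothetical, and the oracle is reduced to picking the QueryHit sender with the smallest AS-hop distance.

```python
import random

def pick_source_unbiased(query_hits, rng):
    """Unbiased Gnutella: choose the download source uniformly at random."""
    return rng.choice(query_hits)

def pick_source_with_oracle(query_hits, own_as, as_of, as_hops):
    """Oracle-biased choice: prefer the QueryHit sender with the fewest AS hops.

    min() returns the first minimum, so ties keep the QueryHit arrival order.
    """
    return min(query_hits, key=lambda peer: as_hops(own_as, as_of[peer]))

def intra_as_share(transfers, as_of):
    """Fraction of (downloader, source) pairs that lie inside the same AS."""
    local = sum(1 for downloader, source in transfers
                if as_of[downloader] == as_of[source])
    return local / len(transfers)

# Hypothetical data: node -> AS number (assumption, for illustration only).
AS_OF = {"n1": 1, "n2": 1, "n3": 2, "n4": 3}

def as_hops(a, b):
    """Toy distance (assumption): 0 inside an AS, 1 otherwise."""
    return 0 if a == b else 1

hits_for_n1 = ["n3", "n2", "n4"]          # peers that answered n1's query
oracle_choice = pick_source_with_oracle(hits_for_n1, AS_OF["n1"], AS_OF, as_hops)
random_choice = pick_source_unbiased(hits_for_n1, random.Random(0))
share = intra_as_share([("n1", oracle_choice), ("n1", "n3")], AS_OF)
```

In this toy setting the oracle-biased policy always selects `n2`, the only candidate inside the querying node's AS, whereas the random policy selects it only one time in three.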
Using 5 routers, 6 switches, and 15 computers, we configure four different 5-AS topologies: ring, star, tree and random mesh. Each router is connected to 3 machines, and each machine runs 3 instances of Gnutella software, where one is an ultrapeer and the other two are leaf nodes. Thus, we have a network of 45 Gnutella nodes, each running the GTK-Gnutella software. A router is taken as an abstraction of an AS boundary.
We modify the source code of the Gnutella nodes, so that when a node wishes to join the network, it sends the contents of its Hostcache to the oracle. The Hostcache of each node is filled with a random subset of the network nodes’ IP addresses. The oracle is a central machine that is accessible to all Gnutella nodes and runs the oracle’s neighbor selection algorithm. When it gets a list of IP addresses from a node, it ranks the list according to AS-hop distance. Hence, the Gnutella node joins another node within its AS if such a node is present in its Hostcache; otherwise it joins a node from the nearest AS.
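The ranking step can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the oracle's actual implementation: the AS-level graph `RING`, the `IP_TO_AS` mapping, and both function names are hypothetical, and AS-hop distance is computed by a plain breadth-first search.

```python
from collections import deque

def as_hop_distances(as_graph, source_as):
    """Breadth-first search over the AS-level graph: hops from source_as."""
    dist = {source_as: 0}
    queue = deque([source_as])
    while queue:
        current = queue.popleft()
        for neighbor in as_graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist

def rank_hostcache(querying_ip, hostcache, ip_to_as, as_graph):
    """Sort candidate IPs by AS-hop distance from the querying node, nearest first.

    sorted() is stable, so candidates at equal distance keep their Hostcache order.
    """
    dist = as_hop_distances(as_graph, ip_to_as[querying_ip])
    return sorted(hostcache, key=lambda ip: dist.get(ip_to_as[ip], float("inf")))

# Hypothetical 5-AS ring and address plan (assumptions, for illustration only).
RING = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
IP_TO_AS = {"10.1.0.1": 1, "10.1.0.2": 1, "10.2.0.1": 2,
            "10.3.0.1": 3, "10.4.0.1": 4}

ranked = rank_hostcache("10.1.0.1",
                        ["10.3.0.1", "10.2.0.1", "10.1.0.2"],
                        IP_TO_AS, RING)
```

The joining node would then send its connection request to the first entry of `ranked`, which is a node in its own AS whenever the Hostcache contains one.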
We experiment with two schemes of file distribution. In the uniform scheme, each node shares 6 files. In the variable scheme, each ultrapeer shares 12 files, half of the leaf nodes share 6 files each, and the remaining leaf nodes share no content. We thus have 270 unique files with real content.
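As a quick check of the arithmetic, the following sketch recomputes the file count for both schemes from the Testlab dimensions (15 machines, each running one ultrapeer and two leaf nodes). The variable names are ours, and the sketch assumes that no file is shared by more than one node.

```python
MACHINES = 15
ULTRAPEERS = MACHINES * 1      # one ultrapeer instance per machine
LEAVES = MACHINES * 2          # two leaf instances per machine
NODES = ULTRAPEERS + LEAVES    # 45 Gnutella nodes in total

# Uniform scheme: every node shares 6 files.
uniform_total = NODES * 6

# Variable scheme: ultrapeers share 12 files, half of the leaves share 6,
# and the other half share nothing.
variable_total = ULTRAPEERS * 12 + (LEAVES // 2) * 6 + (LEAVES // 2) * 0

print(uniform_total, variable_total)
```

Both schemes give the same total of 270 files, so the two distributions differ only in where the content is placed, not in how much content exists.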
We run two sets of experiments: unbiased Gnutella and Gnutella using the oracle. We generate 45 unique search strings, one for each node, and allow each node to flood its search query in the network. Each node searches for the same query string in both experiments. We then calculate the total number of Query and QueryHit messages exchanged in the network and analyze whether biased neighbor selection causes any content search to fail that was successful in unbiased Gnutella. We
ACM SIGCOMM Computer Communication Review
Volume 37, Number 3, July 2007