Figure 1: Error plots showing comparison of metrics with increasing size of Oracle list: (a) P2P node degree (mean node degree), (b) overlay path length (mean path length), (c) underlay AS distance (mean AS distance), (d) intra-AS P2P connections; x-axis: Oracle list size.

We have coded the Gnutella protocol within the packet level network simulator SSFNet [40]. The Scalable Simulation Framework (SSF) [36] is an open standard for simulating large and complex networks. Written in Java, it supports discrete-event simulations. SSF Network models (SSFNet) are Java models of different network entities, built to achieve realistic multi-protocol, multi-domain Internet modeling and simulation at and above the IP packet level of detail. These entities include Internet protocols like IP, TCP, UDP, BGP and OSPF, network elements like hosts, routers, links, and LANs, and their various support classes. The network topologies are defined using the Domain Modeling Language (DML), and the SSFNet class instances autonomously configure and instantiate themselves by querying these DML configuration files. The coding for the lower layers of the IP stack is thus provided by SSFNet, while we implement the Gnutella protocol as an SSFNet application [40].

We modify the neighbor selection procedure of Gnutella to take advantage of the oracle [41]. Normally, when a Gnutella node connects to the network, it gets a list of popular Gnutella node addresses in its Hostcache [42], a locally maintained list of Gnutella hosts that typically contains a few hundred IP addresses. The node chooses a random subset of the Hostcache and initiates Gnutella peerings with these selected nodes. We modify this procedure so that the Gnutella node sends the contents of its Hostcache (a list of IP addresses) to the oracle, which then picks a node within the querying node's AS if one exists, or a random node otherwise. The node then establishes a Gnutella peering with this oracle-preferred node. This way, we influence the neighbor selection of the Gnutella network to prefer a peer within the same AS whenever one exists. Moreover, when a Gnutella node receives query results for its search requests, it again consults the oracle to select the nearest node, from which it then downloads the file content.
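To make the modified procedure concrete, here is a minimal Java sketch of the oracle's selection step (assuming a non-empty Hostcache). All names (Oracle, asOf, choosePeer) are ours and hypothetical, and the AS mapping is stubbed; the paper specifies only the behavior, not an implementation.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the oracle's selection step: prefer a peer inside
// the querying node's AS, otherwise fall back to a random Hostcache entry.
class Oracle {
    private final Random rng = new Random();

    // Maps an IP address to its AS number. Stubbed here; a real oracle run
    // by the ISP would consult its BGP/routing tables.
    private int asOf(String ip) {
        return ip.hashCode() & 0xff; // placeholder mapping for the sketch
    }

    // Picks the oracle-preferred peer from the querier's Hostcache contents.
    String choosePeer(String querierIp, List<String> hostcache) {
        int querierAs = asOf(querierIp);
        for (String candidate : hostcache) {
            if (asOf(candidate) == querierAs) {
                return candidate; // an intra-AS peer exists: prefer it
            }
        }
        // No intra-AS candidate: pick a random entry instead.
        return hostcache.get(rng.nextInt(hostcache.size()));
    }
}
```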


Simulation setup

Each leaf node can have between 2 and 4 connections to ultrapeers, while each ultrapeer itself initiates at least 10 connections to other Gnutella nodes, and stops accepting incoming connections from other nodes once it is connected to 45 nodes, be they leafs or ultrapeers. Each node shares between 0 and 100 files, uniformly distributed.
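As a minimal sketch, these connection limits can be stated as constants; the names below are ours, not from the Gnutella specification or our simulator code:

```java
// Sketch of the per-node connection limits used in the simulation.
class ConnectionPolicy {
    static final int LEAF_MIN_ULTRAPEERS = 2;     // a leaf keeps 2..4 ultrapeer links
    static final int LEAF_MAX_ULTRAPEERS = 4;
    static final int ULTRAPEER_MIN_INITIATED = 10; // connections an ultrapeer opens itself
    static final int TOTAL_CAP = 45;              // stop accepting once 45 are connected

    // A node accepts an incoming connection only while below the cap,
    // regardless of whether the remote node is a leaf or an ultrapeer.
    static boolean acceptsIncoming(int currentConnections) {
        return currentConnections < TOTAL_CAP;
    }
}
```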

To take churn in P2P systems into account, each node remains online for a minimum of 1 and a maximum of 1500 seconds. Once a node goes off-line, it may come online again after a time period of between 1 and 300 seconds. For a start, we take these time periods as uniformly distributed, but we are in the process of migrating to more precise distributions, as recently revealed in [39]. Furthermore, a leaf node must be online for at least 600 seconds before it can serve as an ultrapeer. At any given point of time in our simulations, we find that 20-25% of the nodes are off-line 3, and a quarter of the online nodes are functioning as ultrapeers.
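The churn model reduces to a few uniform draws. The sketch below assumes the uniform distributions stated above; class and method names are ours:

```java
import java.util.Random;

// Sketch of the churn model: uniformly distributed session times.
class ChurnModel {
    private static final Random RNG = new Random();

    // Online duration: uniform between 1 and 1500 seconds.
    static double onlineSeconds()  { return uniform(1, 1500); }

    // Offline duration before rejoining: uniform between 1 and 300 seconds.
    static double offlineSeconds() { return uniform(1, 300); }

    // A leaf may be promoted to ultrapeer only after 600 seconds online.
    static boolean ultrapeerEligible(double secondsOnline) {
        return secondsOnline >= 600;
    }

    private static double uniform(double lo, double hi) {
        return lo + RNG.nextDouble() * (hi - lo);
    }
}
```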

We ran multiple simulations of varying lengths and found that the startup phase of the simulation lasts about 500 seconds. After 5000 seconds of simulation time, the summary statistics do not show significant changes. Therefore, we run our simulations for 5000 seconds.



Results

We first analyze the Gnutella network graph according to the metrics explained in Section 2, followed by an evaluation of some Gnutella-specific metrics such as scalability of the network, number of messages exchanged, localization of file content exchange, and visualization of the topology.

We run three different experiments on five different topology instances with roughly the same number of search queries and the following parameters for the Gnutella nodes:

  • Cache size = 1000, without oracle

  • Cache size = 100, with oracle for neighbor selection

  • Cache size = 1000, with oracle for neighbor selection

The topologies are derived using the methodology explained in Section 2.2. The network consists of a total of 25 ASes and 1000 nodes. More specifically, it consists of 1 level-1 AS, 8 level-2 ASes and 16 level-3 ASes. We place 360 nodes within the level-1 AS, 40 nodes within each level-2 AS, and 20 nodes within each level-3 AS. Within each AS, all the nodes are connected in a star topology to an intra-AS router. Each node in the level-1 AS has a 1 Gbit network interface, each node in a level-2 AS has a 100 Mbit network interface, while each node in a level-3 AS has a 10 Mbit network interface. The links between level-1 and level-2 ASes have a delay of 2 ms, while the links between level-2 and level-3 ASes have a delay of 10 ms. Each AS has 2 routers, one for the intra-AS node connections, and one for the inter-AS connections between different ASes. Thus, we have a topology with 25 ASes, 50 routers and 1000 nodes running the Gnutella protocol.
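For reference, the topology parameters above can be collected as data. This sketch is ours and merely restates the numbers (1 + 8 + 16 = 25 ASes; 360 + 8·40 + 16·20 = 1000 nodes; 2 routers per AS, hence 50 routers):

```java
// One row per AS level: count of ASes, nodes per AS, node interface speed,
// and the delay of the link connecting each AS to its parent level.
record AsLevel(int level, int asCount, int nodesPerAs, long interfaceBps, int delayToParentMs) {}

class Topology {
    static final AsLevel[] LEVELS = {
        new AsLevel(1, 1,  360, 1_000_000_000L, 0),  // level-1: 1 Gbit nodes, no parent
        new AsLevel(2, 8,  40,  100_000_000L,   2),  // 100 Mbit nodes; 2 ms links to level-1
        new AsLevel(3, 16, 20,  10_000_000L,    10), // 10 Mbit nodes; 10 ms links to level-2
    };
}
```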

Note that in our implementation, each Gnutella node sends the contents of its Hostcache to the oracle, which ranks the list of IP addresses according to proximity to the querying node. In other words, the above three cases correspond to experiments with oracle list sizes of 1, 100, and 1000, respectively. The success rates of the search queries are similar in all three cases.
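A sketch of this ranking step follows; the asDistance helper is hypothetical (a real oracle would back it with the ISP's routing data), and the paper does not pin down the exact proximity metric:

```java
import java.util.Comparator;
import java.util.List;

// Sketch of the ranking variant: the oracle sorts the whole Hostcache by
// proximity to the querier instead of returning a single preferred peer.
class RankingOracle {
    // Hypothetical proximity metric: 0 for the same AS, larger for more
    // distant ASes. Stubbed; would be computed from the ISP's routing data.
    int asDistance(String a, String b) { return 0; }

    // Sorts in place so that the nearest candidates come first; the querying
    // node then peers with (or downloads from) the top-ranked entries.
    void rank(String querierIp, List<String> hostcache) {
        hostcache.sort(Comparator.comparingInt((String ip) -> asDistance(querierIp, ip)));
    }
}
```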

To explore the influence of consulting the oracle on the network topology, we visualize, in Figure 2 [41], the Gnutella overlay topology for the unbiased case and for the biased case with oracle list size 1000. At a particular instant in time, we sample the Gnutella overlay topology, display all the online nodes in the graph, and join two nodes with an edge if there exists a Gnutella peering between them at this point in time. Then, using the visualization library yWorks [44], we convert both graphs into a structured hierarchical format. The resulting graph structures are displayed in Figure 2.


3 This is more aggressive than other studies, e.g., [43], which assume that only half the nodes churn.

