that is equal to its edge bandwidth, and the destinations of the flows are weighted according to their bandwidth. The flow conductance C measures how well the network can handle this scenario, or more formally, the flow conductance is equal to the inverse of the largest value of so that there is a feasible multicommodity flow solution f o r t h e d e m a n d s d v , w i n G . I t i s e a s y t o s h o w t h a t f o r a n y n e t w G, 0 ≤ ≤ 1, and the larger is, the better is the network. As an example, for uniform link bandwidths the flow conductance of the n × n-mesh is (1/n) and the flow conductance of the hypercube o r k

of dimension n is (1/log n).

Interestingly, one can significantly lower the number of inter-AS edges without losing much on the flow conductance. Suppose we have m peers with bandwidth b that can have a maximum degree of d. Consider a class of networks G(n) of degree d and size n with monotonically increasing flow conductance C(n). Connecting the m peers by G(m) gives a network with flow conductance C(m). Suppose now that every peer can establish only one inter-AS edge with bandwidth b/2, and the remaining bandwidth can be used for intra-AS edges. In this case, let us organize the peers into cliques of size d −1 within the ASes (we assumed that the number of peers in each AS is a multiple of d − 1) and interconnect the cliques so that they form G(m/(d − 1)). Then it is not difficult to see that the resulting network has a flow conductance of 2C(m/(d − 1)). Hence, compared to arbitrary networks we lose a factor of at most two.

Summary: We propose measures that are useful for P2P systems and our re- sults demonstrate that it is possible to have a highly local topology with an AS diameter and a flow conductance that is comparable to the best non-local topologies. Hence, worst-case communica- tion scenarios can be handled by local topologies (i.e., topologies with many intra-AS connections) essentially as well as by non-local topologies. In addition, we expect local topologies to be far better cost-wise for serving P2P traffic in practice than non-local topolo- gies, which we aim to validate through experiments.

2.2

# Simulation Topologies

The simulation results can be heavily influenced by the topolo- gies used. Hence, we make the basis for our simulations the cur- rent AS topology of the Internet [31, 32], as it can be derived from the BGP routing information. We use BGP data from more than 1,300 BGP observation points including those provided by RIPE NCC, Routeviews, GEANT, and Abilene. This includes data from more than 700 ASes as on November 13, 2005. Our dataset con- tains routes with 4,730,222 different AS-paths between 3,271,351 different AS-pairs. We derive an AS-level topology from the AS- paths. If two ASes are next to each other on a path, we assume that they have an agreement to exchange data and are therefore neigh- bors. We are able to identify 58,903 such edges. We identify level- 1 providers by starting with a small list of providers that are known to be level-1. An AS is added to the list of level-1 providers if the resulting AS-subgraph between level-1 providers is complete, that is, we derive the AS-subgraph to be the largest clique of ASes including our seed ASes. This results in the following 10 ASes being referred to as level-1 providers: 174, 209, 701, 1239, 2914, 3356, 3549, 3561, 5511, 7018. While this list may not be complete, all found ASes are well-known level-1 providers. There are 7,994 ASes that are neighbors of a level-1 provider, which we refer to as level-2. All other 13,174 ASes are grouped together into the class level-3. We thus identify 21,178 ASes in all.

As it is not known how many P2P nodes are in each AS, and we may want to study smaller subsets to be able to compute the com-

## ACM SIGCOMM Computer Communication Review

34

plex graph properties in reasonable time, we randomly subsample the AS-topology by keeping all level-1 ASes and their interconnec- tions, and selecting a fraction of the level-2 and level-3 ASes while keeping their proportion the same as in the original data. Hereby, we first select the level-2 ASes and keep their interconnections. Only then do we select the level-3 ASes from among the ASes that are reachable in our subgraph.

Most level-1 ASes traditionally are expected to serve more cus- tomers than level-2 and level-3 ASes [33, 34]. At the same time there are more level-3 than level-2 than level-1 ASes. Thus we distribute the P2P clients among the ASes in the following ad-hoc manner: a P2P node has equal probability to pick an AS from each level. This results in a 1/3 : 1/3 : 1/3 split of the nodes among the AS levels. This way a level-1 AS serves many more P2P nodes than a level-3 AS. All the topologies used in our experiments have been derived in this manner by randomly subsampling the AS topology derived from the BGP table dumps. Indeed, sensitivity analysis of our results show that if we move more peers to level-2, level-3 ASes the results improve even more.

3.

# OVERLAY / UNDERLAY GRAPH PROPERTIES

In this section, we first evaluate how the use of the oracle changes the graph properties of the P2P overlay topology. Later, in Sec- tions 4 and 5 we explore the interactions of the two routing sys- tems, the impact of churn on the topology, and the benefits of the oracle for satisfying queries. For this purpose we in this section use a general graph simulator as it allows us to explore large topolo- gies. Namely, we rely on a simulation environment, the Subjects environment [35], that is very light-weight, such that we can run experiments on large topologies with many P2P nodes.

The Subjects environment is developed for the design of highly robust distributed systems and provides us with support for oper- ations on general overlay graphs. It is based on C++ and consists of three basic types of entities: subjects, objects, and relay points. Subjects are the base class for processes (that are used to emulate nodes in the overlay network), objects are the base class for mes- sages exchanged between subjects, and relay points are used by the subjects in order to establish connections to each other so that objects can be exchanged. In our experiments, the Internet class spawns multiple AS classes, each of which then spawns a number of overlay node classes. These nodes then establish peering con- nections with each other by exchanging messages (objects), and the relay points serve as an abstraction of network ports. The way these entities are set up ensures that subjects have a firm control on who can send information to them so that the consent and control principle can be strictly enforced.

For our evaluation we consider five graphs, each with 300 ASes and 4,372 P2P nodes, which results in an average of 14.6 nodes per AS. Each graph consists of 4 level-1 ASes, 100 level-2 ASes and 196 level-3 ASes. We place 375 nodes within each level-1 AS, 15 nodes within each level-2 AS, and 7 nodes within each level-3 AS. Increasing the number of nodes in the level-2, level-3 ASes only helps our case.

We establish P2P neighbor relationships by randomly picking one of the P2P nodes and let it establish a neighborship either

unbiased: to a single randomly chosen P2P node or biased: to one from a list of candidates.

The unbiased case corresponds to a P2P protocol with arbitrary neighbor selection, while the biased case corresponds to a P2P node giving a list of potential neighbors to the oracle, and the oracle

Volume 37, Number 3, July 2007