# III.A Random Chunk-based Switching

# In the random chunk-based switching, the file of inter-

est is divided into many small chunks. A user downloads chunks sequentially one at a time. Whenever a user com- pletes a chunk from its current source, the user randomly selects a new source and connects to it to retrieve a new chunk. In order to analyze its performance, we assume that Wald’s equation holds as in (2) for each given path. A file of size F is divided into m chunks of equal size, and let t_{j }be the download time for chunk j. Then, the total d o w n l o a d t i m e , T c h u n k , i s T c h u n k = m j = 1 t j . S i n c e chunk randomly chooses one of N sources (with equal e a c h

probability), the expected download time will be

E{T_{chunk }

}=

m

j=1

E{t_{j}} =

m

j=1

1

N

## N

i=1

# F/m

c_{i }

=

F . H(c)

(10) The result in (10) is identical to the download time given in (5) where a user downloads the entire file from an ini- tially randomly chosen source. Downloading the entire file from one randomly chosen source or switching sources for different chunks makes no difference. Hence, the ran- dom chunk-based switching would not give us better per- formance in terms of per-user average download time in a multi-source network.

So far in this section, we have assumed that the service capacities in the network are heterogeneous, and Wald’s equation holds true for a given path, i.e., the service ca- pacity is either constant or i.i.d. over time. In this case, the random chunk-based switching does not give us any benefit. However, if the service capacity is correlated over time for each given path, switching sources may still help reduce the correlations. When there exist strong corre- lations, if we get stuck in a source with very low service capacity, it is likely that the service capacity from that source remains low for a while. Thus, instead of wait- ing until we finish downloading a fixed amount of data (chunk or file), we may want to get out of that bad source after some fixed amount of time. In other words, we can also think about random switching based on time. In the subsequent section, we will investigate the performance of this random switching based on time, and show that it outperforms all the previous schemes in the presence of heterogeneity of service capacities over space and tempo- ral correlations of service capacity over each path.

# III.B Random Periodic Switching

In this section, we propose a very simple, distributed algorithm and show that it effectively removes correla- tions in the capacity fluctuation and the heterogeneity in space, thus greatly reducing the average download time. Figure 2 shows a simplified model with multiple down- loading users. We assume that each user can obtain a list of available sources through some search algorithm. As our algorithm will be implemented at each downloading peer in a distributed fashion, without loss of generality,

4

Figure 2: System model for file download operation in P2P networks

we only focus on a single downloader throughout this sec- tion.

In our model, there are N possible paths (source peers) for a fixed downloader. Let C_{i}(t) (t = 0, 1, 2, . . . and i = 1, 2, . . . , N) denote the available capacity during time slot t over path i (the connection between the fixed down-

loader and be a path

the i^{th }

source peer). Let U(t) ∈ {1, 2, . . . , N}

selection

function

for

the

downloader.

If

U(t) = i, this indicates that the i and the available capacity it the time slot t. For example, ure 2 represents a realization of

downloader selects path receives is C_{i}(t) during the solid arrow in Fig- the path selection func-

tion that

U (t) each

at time C_{i}(t) is

t for a fixed downloader C. We assume stationary in t and C_{i}(t) over different

paths i = 1, 2, . . . , N are independent.^{† }We that they have different distributions, i.e., are different for different i (heterogeneity).

however allow E{C_{i}(t)} = c_{i }For any given

i, the available capacity C_{i}(t) is As before, when each connection ity of being chosen, the average

correlated over time t. has the same probabil- service capacity of the

network is given by A(c) =

1 N

N i=1

c_{i}.

In this setup, we can consider the following two schemes: (i) permanent connection, and (ii) random pe- riodic switching. For the first case, the path selection function does not change in time t. When the search- ing phase is over and a list of available source peers is given, the downloader will choose one of them ran- domly with equal probability. In other words, U(t) = U where U is a random variable uniformly distributed over {1, 2, . . . , N}. For example, if the downloader chooses u (u ∈ {1, 2, . . . , N}) at time 0, then it will stay with that path permanently (U(t) = u) until the download com- pletes.

Time (correlated)

## Sources (Independent)

C C_{1}(t) 2 ( t )

U(1)=1

U(3)=2

U(4)

....

U(t)

### C_{N}(t)

U(2)=N

# Figure 3: The operation of path selection function U(t)

†

We note that different paths (overlay) may share the same link

at the network core, but still, the bottleneck is typically at the end of network, e.g., access network type, or CPU workload, etc. Thus, the independence assumption here is reasonable.