Degree-dependent intervertex separation in complex networks

We study the mean length $\ell(k)$ of the shortest paths between a vertex of degree $k$ and other vertices in growing networks, where correlations are essential. In a number of deterministic scale-free networks we observe a power-law correction to a logarithmic dependence, $\ell(k) = A\ln [N/k^{(\gamma-1)/2}] - C k^{\gamma-1}/N + ...$ in a wide range of network sizes. Here $N$ is the number of vertices in the network, $\gamma$ is the degree distribution exponent, and the coefficients $A$ and $C$ depend on a network. We compare this law with a corresponding $\ell(k)$ dependence obtained for random scale-free networks growing through the preferential attachment mechanism. In stochastic and deterministic growing trees with an exponential degree distribution, we observe a linear dependence on degree, $\ell(k) \cong A\ln N - C k$. We compare our findings for growing networks with those for uncorrelated graphs.


I. INTRODUCTION
The main objects of interest in the physics of complex networks [1,2,3,4,5,6] are extremely compact, infinite-dimensional nets, the so-called small worlds. The basic measure of the compactness of a network is the mean intervertex distance, or mean intervertex separation, $\ell$: the mean length of the shortest path between a pair of vertices. (The path runs along edges, and each edge has unit length.) Physicists often use another term for this characteristic, the diameter of a network, although in graph theory the term network diameter is reserved for the maximal separation of a pair of vertices in a net.
A network shows the small-world effect if its mean intervertex distance increases with the network size (the total number of vertices in the network, $N$) more slowly than any power-law function of $N$. This is in contrast to finite-dimensional objects, where the mean intervertex distance grows as $N^{1/d}$, $d$ being the dimension of the object. (We discuss sparse networks.) By definition, a small world is a network with the small-world effect. Note that this definition is not related to the presence of loops in a network. Small worlds may be loopy or clustered networks, or they may be trees, i.e., networks without loops.
The mean intervertex distances in networks have been extensively studied both empirically [7] and analytically [8,9,10,11]. The typical size dependence of the mean intervertex separation is logarithmic, $\ell(N) \propto \ln N$. However, the mean intervertex distance is an integrated, coarse characteristic. One may be interested in a more delicate issue: the position of an individual vertex in a network. Recently Holyst et al. [12] considered the question: how far are vertices of specific degrees from each other? They showed that in uncorrelated networks, the mean length of the shortest path between vertices of degrees $k$ and $k'$ is $\ell(k,k') \cong D + A\ln N - A\ln(kk')$, where $D$ is independent of $N$, $k$, and $k'$, and the coefficient $A$ depends only on the mean branching ratio of the network. Note the coincidence of the coefficients of $\ln N$ and $\ln(kk')$ in this result. The authors of Ref. [12] also calculated $\ell(k,k')$ for networks with nonzero clustering but without degree-degree correlations. In this case, they arrived at the same expression as above but with coefficients of $\ln N$ and $\ln(kk')$ additionally depending on the clustering. In the present paper we report our observations for another (though related) characteristic: the mean length of the shortest paths from a vertex of a given degree $k$ to the remaining vertices of the network, $\ell(k)$. This quantity is related to $\ell(k,k')$ in the following way:
$$\ell(k) = \sum_{k'} P(k')\,\ell(k,k'), \tag{1}$$
and so
$$\ell = \sum_{k} P(k)\,\ell(k), \tag{2}$$
where $P(k)$ is the degree distribution. In simple terms, we reveal the smallness of a network from the point of view of its vertex of a given degree. Our objects of interest are growing (and so inevitably correlated) networks. The basic property of most natural networks is a heavy-tailed degree distribution, so that vertex degrees are distributed over a wide range, in contrast to classical random graphs. This motivates the study of $\ell(k)$ in networks with various complex distributions of connections. The question is: how strong may the variation of $\ell(k)$ be?
One should note that this characteristic was measured recently in Ref. [13] in several networks, and noticeable variations of ℓ(k) were found. We observe nontrivial dependences ℓ(k) for networks with power-law and exponential degree distributions. We mostly consider growing networks, where correlations between the degrees of vertices are important, but for comparison, also discuss uncorrelated networks. In our study we use convenient deterministic growing graphs and compare our observations with simulations of stochastic models of growing networks.
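As an illustration of the quantity under study, the following minimal Python sketch computes $\ell(k)$ for a small graph by breadth-first search from every vertex, averaging over the vertices of each degree. The function names and the star-graph example are ours and purely illustrative; the normalization by $N-1$ (the "remaining vertices") is our reading of the definition above.

```python
from collections import deque, defaultdict
from fractions import Fraction


def bfs_distance_sum(adj, source):
    """Sum of shortest-path lengths from `source` to all other vertices.

    Every edge has unit length; the graph is assumed to be connected.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    if len(dist) != len(adj):
        raise ValueError("graph must be connected")
    return sum(dist.values())


def degree_dependent_separation(adj):
    """Return {k: l(k)}: mean distance from a degree-k vertex to the other N-1 vertices."""
    n = len(adj)
    sums = defaultdict(int)    # total distance sum over vertices of degree k
    counts = defaultdict(int)  # number of vertices of degree k
    for v in adj:
        k = len(adj[v])
        sums[k] += bfs_distance_sum(adj, v)
        counts[k] += 1
    return {k: Fraction(sums[k], counts[k] * (n - 1)) for k in sums}


# Illustrative example: a star with hub 0 and four leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
ell_star = degree_dependent_separation(star)
print(ell_star)  # the hub (k = 4) is closer to the rest than a leaf (k = 1)
```

In the star, the hub is at distance 1 from every other vertex, while a leaf is at distance 1 from the hub and 2 from the three other leaves, so the sketch already shows $\ell(k)$ decreasing with degree.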
In Sec. II we list our main observations, so that readers not interested in details may restrict themselves to the first two sections. Section III contains the discussion of the ℓ(k) dependence in uncorrelated networks for the sake of comparison. In Sec. IV we explain in detail how the results were obtained and describe particular cases. In Sec. V we make a few remarks on the degree-dependent intervertex separation in various networks and discuss relations of this quantity to centrality measures used in sociology [14,15].

II. MAIN OBSERVATIONS
For the purpose of the analytical description of $\ell(k)$ we use simple deterministic graphs. Deterministic small worlds were considered in a number of recent papers [16,17,18,19,20,21,22,23,24,25,26] and have turned out to be a useful tool. (We called these networks pseudofractals. Indeed, at first sight, they look like fractals. However, they are infinite-dimensional objects, so they are not fractals.) These graphs correctly reproduce practically all known network characteristics. We use a set of deterministic scale-free models with various values of the degree distribution exponent $\gamma$, $P(k) \propto k^{-\gamma}$ (see Fig. 1). We consider deterministic graphs with $\gamma$ in the range between 2 and $\infty$, where a graph with $\gamma = \infty$ has an exponentially decreasing (discrete) spectrum of degrees.
In the studied scale-free deterministic graphs, over a wide range of graph sizes, the mean separation of a vertex of degree $k$ from the remaining vertices of the network is found to follow the dependence
$$\ell(k) = A\ln\left[\frac{N}{k^{(\gamma-1)/2}}\right] - C\,\frac{k^{\gamma-1}}{N} + \ldots \tag{3}$$
The constants $A$ and $C$ (as well as the sign of $C$) depend on the particular network. In stochastic growing scale-free networks, we observe the dependence $\ell(k, N)$ shown in Fig. 2. This figure presents the results of simulations of networks growing by the preferential attachment mechanism with a linear preference function [27]. While the dependence on $\ln N$ is linear in practically the entire range of observation, $\ell(k)$ vs. $\ln k$ has a more complex form (see Fig. 2). The derivative $d\ell(k)/d\ln k$ is nonzero at $k = 1$, and at large degrees $\ell(k)$ is fitted by a linear function of $\ln k$ with a larger slope. One should note that in all growing networks considered in this paper, new connections cannot emerge between already existing vertices. These networks are often called "citation graphs".
At the specific point $\gamma = 3$, correlations between the degrees of the nearest neighbors in these graphs are anomalously low. In this situation, the main contribution to $\ell(k)$ reduces to $\ell(k) \propto \ln(N/k)$, which coincides with the result for equilibrium uncorrelated networks (see the next section).
Formula (3) fails as $\gamma \to \infty$; e.g., it cannot be applied to networks with an exponential degree distribution. In growing trees with this distribution, we observe the dependence
$$\ell(k) \cong A\ln N - Ck, \tag{4}$$
where the constants $A$ and $C$ depend on the network. In particular, we found that this law is exact in deterministic graphs (trees) with an exponential degree distribution [e.g., graph (e) in Fig. 1], at least up to very large sizes. Moreover, we observed the same dependence in a simulated stochastically growing tree with random attachment. In this tree (with an exponential degree distribution), at each time step, a new vertex is attached to a randomly selected vertex of the net. The result of the simulation of this network is shown in Fig. 3(a). In both networks, graph (e) in Fig. 1 and the corresponding stochastic net with random attachment, the slope of the degree dependence turned out to be $-1/2$. More generally, if in a growing tree of this kind, at each step, $n$ new vertices become attached to a vertex, the slope of the degree dependence equals $-1/(n+1)$ [see Fig. 3(b)].
All networks that we studied had the generic property
$$\ell_{\max} \cong 2\,\ell_{\min} \tag{5}$$
in the large network limit. As is natural, the maximum value of $\ell(k)$ is attained at the minimal degree of a vertex in a network and, vice versa, the minimum value of $\ell(k)$ is attained at the maximum degree.

III. ℓ(k) OF AN UNCORRELATED NETWORK
The configuration model [28,29,30,31] is a standard model of an uncorrelated (equilibrium) random network. In simple terms, these are maximally random graphs with a given degree distribution. In the large network limit, they have relatively few loops and are almost surely trees in any local environment of a given vertex. The mean intervertex distance $\ell$ in these networks is estimated in the following way, Ref. [8] (see also Refs. [9,11]). The mean number of $m$-th nearest neighbors of a vertex is
$$z_m = z_1\left(\frac{z_2}{z_1}\right)^{m-1}, \tag{6}$$
where $z_1 = \langle k \rangle$ is the mean number of the nearest neighbors of a vertex, i.e., the mean degree, $z_2 = \langle k^2 \rangle - \langle k \rangle$ is the mean number of the second nearest neighbors of a vertex, and $z_2/z_1$ is the branching coefficient. By using formula (6), one can get $\ell$ from the estimate $z_1 (z_2/z_1)^{\ell-1} \sim N$. Similarly, for the mean number of $m$-th nearest neighbors of a vertex of degree $k$, we have ...
... (e) A deterministic graph with an exponentially decreasing spectrum of degrees [18]. At each step, a new vertex is attached to each vertex of the graph. In all these graphs, a mean intervertex distance grows with the number N of vertices as ln N .
So, the estimate is $k (z_2/z_1)^{\ell(k)-1} \sim N$, and thus
$$\ell(k) \cong \frac{\ln(N/k)}{\ln(z_2/z_1)}. \tag{8}$$
Here we neglected an additional constant independent of $N$ and $k$; keeping it would be excess precision. Relation (7) is evident. It may also be obtained rigorously by using the Z-transformation technique, where $\phi_1(x) = \phi'(x)/z_1$ is the Z-transformation of the distribution of the number of edges of an end vertex of an edge, with the edge itself excluded, and $\phi(x) = \sum_k P(k)\,x^k$ is the Z-transformation of the degree distribution of the network. Relation (9) is a direct consequence of the following features of the configuration model: (i) the network has a locally tree-like structure, (ii) vertices of the network are statistically equivalent, (iii) correlations between degrees of the nearest neighbor vertices are absent. Relation (9) together with $\phi_1(1) = \phi(1) = 1$ readily leads to relation (8).
Note that expression (8) also follows from the above-mentioned result of Holyst et al., Ref. [12], that is, $\ell(k,k') \approx \ln[N/(kk')]/\ln(z_2/z_1)$ for the configuration model. Substituting this result into formula (1) and ignoring terms independent of $N$ and $k$ immediately gives expression (8). In turn, substituting expression (8) into formula (2) leads, up to a term independent of $N$, to the standard formula for the configuration model:
$$\ell \cong \frac{\ln N}{\ln(z_2/z_1)}. \tag{10}$$
One point should be emphasized. In the configuration model, the logarithmic size dependence of the (degree-independent) mean intervertex distance, $\ell(N) \sim \ln N$, is valid only for degree distributions with a finite second moment $\langle k^2 \rangle$. If $\langle k^2 \rangle$ diverges as $N \to \infty$, $\ell(N)$ grows more slowly than $\ln N$. One can see that result (8) may be generalized to any given form $\ell(N)$ of the size dependence of the mean intervertex distance. In this general case, the degree-dependent separation is expressed in terms of the function $\ell(N)$, namely, $\ell(k, N) \sim \ell(N/k)$.
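The estimate for uncorrelated networks can be sketched numerically. The snippet below is a minimal illustration, assuming the form $\ell(k) \approx \ln(N/k)/\ln(z_2/z_1)$ with $z_1 = \langle k \rangle$ and $z_2 = \langle k^2 \rangle - \langle k \rangle$ taken directly from a degree sequence. The function names and the regular degree sequence are hypothetical choices made for the example, and the constant term neglected in the text is neglected here too.

```python
import math


def neighbor_numbers(degrees):
    """Return (z1, z2): mean numbers of first and second nearest neighbors."""
    n = len(degrees)
    z1 = sum(degrees) / n                       # mean degree
    z2 = sum(k * k for k in degrees) / n - z1   # second moment minus mean degree
    return z1, z2


def estimate_separation(degrees, k):
    """Tree-like estimate of l(k) for an uncorrelated network of size len(degrees)."""
    z1, z2 = neighbor_numbers(degrees)
    if not z2 > z1:
        raise ValueError("branching coefficient z2/z1 must exceed 1")
    return math.log(len(degrees) / k) / math.log(z2 / z1)


# Illustrative degree sequence: 1000 vertices of degree 3, so z2/z1 = 2.
regular = [3] * 1000
print(estimate_separation(regular, 3))
```

Since the same coefficient $1/\ln(z_2/z_1)$ multiplies both $\ln N$ and $\ln k$, doubling the degree in this estimate shortens the separation by exactly as much as halving the network size.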
FIG. 2: Degree-dependent mean intervertex separation in a random scale-free network (tree) growing through the mechanism of preferential attachment. At each time step a new vertex is added. It becomes attached to a vertex selected with probability proportional to the sum of the degree of this vertex and a constant $A$, the "additional attractiveness" [27]. Here we use $A = 1$. (a) $\ell(k)$ vs. $\log_{10} k$ for networks of $N$ = 1000, 3000, 10 000, 30 000, 100 000, and 300 000 vertices. Each of the first four curves was obtained from 50 runs, while for the networks of 100 000 and 300 000 vertices, 20 and 5 runs were used, respectively. Binning was applied at large degrees to reduce noise. The inset demonstrates that in this network the difference $\ell(k=1) - \ell(k)$ does not depend on the size $N$. In the inset, for clarity, we do not show lines connecting points. The dashed lines highlight two limiting behaviors. As $k$ approaches its minimal value $k = 1$, $\ell(k=1) - \ell(k) \approx 1.0\log_{10} k \approx 0.43\ln k$ for all studied network sizes, while at large degrees, $\ell(k=1) - \ell(k) \approx \mathrm{const} + 4.1\log_{10} k \approx \mathrm{const} + 1.8\ln k$. (b) The dependence of $\ell(k=1)$ on $\log_{10} N$. For comparison, a line with slope 3 is shown.

IV. DERIVATIONS
In this section we study the degree-dependent intervertex separation in the deterministic graphs of Fig. 1. Graphs (a)-(d) have a discrete spectrum of vertex degrees with a power-law envelope. Graph (e) has a discrete spectrum of vertex degrees with an exponential envelope. We also list some basic characteristics of these graphs. We stress that the main structural characteristics (clustering, degree-degree correlations [32,33,34,35,36,37], etc.) of these deterministic networks are quite close to those of their stochastic analogs (see [17]).
(A) Graph (a) in Fig. 1.-This graph was proposed in Ref. [2] and extensively studied in Ref. [17]. The growth starts from a single edge ($t = 0$). At each time step, each edge of the graph transforms into a triangle. In effect, we have a deterministic version of a stochastic growing network with attachment of a new vertex to a randomly chosen edge, see Ref. [38]. The number of vertices of the graph is $N_t = 1 + (3^t + 1)/2$, where $t = 0, 1, 2, \ldots$ is the number of the generation. In the large network limit, the mean degree of the graph is $\langle k \rangle \to 4$.
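The growth rule of graph (a) is simple enough to sketch directly. The following Python fragment is an illustrative construction (the function names are ours), assuming the rule exactly as stated above: start from a single edge and, at each step, add one new vertex per existing edge, connected to both of its ends. It then checks the vertex count $N_t = 1 + (3^t + 1)/2$ and the approach of the mean degree to 4.

```python
def grow_graph_a(t):
    """Return (number of vertices, edge list) of graph (a) at generation t."""
    edges = [(0, 1)]      # generation t = 0: a single edge
    n_vertices = 2
    for _ in range(t):
        added = []
        for u, v in edges:
            # the edge (u, v) becomes a triangle (u, v, new vertex)
            added += [(u, n_vertices), (v, n_vertices)]
            n_vertices += 1
        edges += added
    return n_vertices, edges


def degree_sequence(n_vertices, edges):
    """Degrees of vertices 0 .. n_vertices - 1."""
    deg = [0] * n_vertices
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


for t in range(6):
    n, edges = grow_graph_a(t)
    print(t, n, 1 + (3**t + 1) // 2, round(2 * len(edges) / n, 4))
```

Each step triples the number of edges and doubles every existing degree, which is the origin of the exponent $\gamma = 1 + \ln 3/\ln 2$ quoted for this graph.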
Degrees of the vertices in the graph take values $k(s) = 2^s$, $s = 1, 2, \ldots, t$. The spectrum of degrees has a power-law envelope. This spectrum corresponds to a continuum scale-free spectrum $P(k) \propto k^{-\gamma}$ with exponent $\gamma = 1 + \ln 3/\ln 2 = 2.585\ldots$. Note that this network has numerous triangles, which suggests high clustering. In more detail, by definition, the average clustering coefficient of a vertex of degree $k$ is
$$C(k) = \frac{\langle c(k) \rangle_k}{k(k-1)/2}. \tag{11}$$
Here, $c(k)$ is the number of triangles attached to a vertex of degree $k$, and $\langle\;\rangle_k$ means the averaging over all vertices of degree $k$. One can see that in this graph (as well as in its stochastic version)
$$C(k) = \frac{2}{k}. \tag{12}$$
[Indeed, by construction, the number of triangles attached to a vertex of degree $k$ in the graph is $k - 1$. So, $C(k) = (k-1)/[k(k-1)/2] = 2/k$.] This gives, for the mean clustering,
$$\overline{C} = \sum_k P(k)\,C(k) \to \frac{4}{5}, \tag{13}$$
while the standard clustering coefficient (transitivity), i.e., the density of loops of length 3 in a network, approaches zero in the infinite network limit, $C = 0$. Note the difference between the finite mean clustering of the network and its zero clustering coefficient.
In principle, one may derive an exact analytical expression for the degree-dependent separation by using recursion relations and the Z-transformation technique. However, these calculations turn out to be cumbersome. Instead, here we only check that some analytical formula for $\ell(k)$ is valid in a sufficiently large number of generations of a deterministic graph, up to, say, $t \sim 10$ or 12. So, we confirm a guessed expression in networks of sizes up to $N \sim 10^5$. In fact, we implement the following approach: (i) compute the values of $\ell_t(s)$, the mean separation of a vertex of degree $k(s)$ in generation $t$, for the first few generations of the graph; (ii) by using this array of numbers, guess the form of $\ell_t(s)$; (iii) check this result by computing $\ell_t(s)$ directly for several extra generations of the graph.
There are few computations in stage (i): we have to find only t values of ℓ t (s) in a t generation of a graph. For sufficiently small networks, these values can be found even without a computer.
Step (ii) also turns out to be rather easy since we already know the structure of the analytical expressions for a mean intervertex distance in these networks (see Ref. [17]).
Step (iii) may be performed by using a computer to count paths. This approach is based on our experience with problems on these graphs and was checked in Ref. [17] for related quantities. Our guess actually exploits the underlying recursion relations without revealing them. Nonetheless, we can only claim that the analytical expressions obtained in this way are valid at the studied generations of our deterministic graphs. In principle, there is a (small) chance that at some higher generation (or generations) these formulas fail. Thus, the results of this section should be considered only as observations of $\ell(k)$ for a set of networks of modest size.
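The path-counting part of this procedure can be sketched as follows. The code below is a minimal illustration for graph (a), not the authors' program: it builds the graph, runs a breadth-first search from every vertex, and tabulates the mean separation for each degree as an exact fraction, which is the kind of array of numbers from which a closed form can be guessed. All function names are ours, and the normalization by $N - 1$ is an assumption.

```python
from collections import deque, defaultdict
from fractions import Fraction


def build_graph_a(t):
    """Adjacency lists of graph (a) at generation t (t = 0 is a single edge)."""
    edges = [(0, 1)]
    n_vertices = 2
    for _ in range(t):
        added = []
        for u, v in edges:
            # every existing edge (u, v) becomes a triangle (u, v, new vertex)
            added += [(u, n_vertices), (v, n_vertices)]
            n_vertices += 1
        edges += added
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def distance_sum(adj, source):
    """Sum of shortest-path lengths from `source` to every other vertex (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return sum(dist.values())


def separation_table(adj):
    """Return {degree: exact mean separation from a vertex of that degree}."""
    n = len(adj)
    sums, counts = defaultdict(int), defaultdict(int)
    for v in list(adj):
        k = len(adj[v])
        sums[k] += distance_sum(adj, v)
        counts[k] += 1
    return {k: Fraction(sums[k], counts[k] * (n - 1)) for k in sorted(sums)}


# Steps (i) and (iii): tabulate the separations generation by generation.
for t in range(1, 6):
    print(t, separation_table(build_graph_a(t)))
```

Working with exact fractions rather than floats makes it easier to recognize rational coefficients when guessing the analytical form in step (ii).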
In this way, we get an explicit expression for $\ell_t(s)$. This formula is valid for $t \ge 1$. We checked it up to $t = 12$, which corresponds to $N_t = 265\,722$. We also checked that this formula leads to the known exact formula for the mean intervertex distance $\ell$ for any $t$, and hence for any $N$ [17]. An asymptotic form of this expression is
$$\ell(k, N) = \frac{4}{9\ln 3}\ln N - \frac{2}{9\ln 2}\ln k - \frac{k^{\gamma-1}}{6N} + \frac{4\ln 2}{9\ln 3} + \frac{10}{9} + \ldots \tag{15}$$
at large $N$, where $N$ is the total number of vertices in the graph. This leads to formula (3).
(B) Graph (b) in Fig. 1.-This graph was proposed in Ref. [18]. At each time step, each edge of the graph transforms in the following way: each end vertex of the edge gets a new vertex attached [see Fig. 1, graph (b), instant 0 → instant 1]. This graph is very similar to graph (a). In particular, the exponent of its degree distribution is the same, $\gamma = 1 + \ln 3/\ln 2 = 2.585\ldots$. The difference is that the graph is a tree, so the mean degree $\langle k \rangle \to 2$ as $N \to \infty$.
The total number of vertices in the graph is $N_t = 3^t + 1$. The vertices have degrees $k(s) = 2^s$, where $s = 0, 1, 2, \ldots, t$. In the same way as for graph (a), we find an expression for $\ell_t(s)$ that holds starting with $t = 0$. Its asymptotic form leads to formula (3). The minimum value of $\ell(k)$ is $\ell_{\min} = \ell(k = 2^t) \cong t/3$, where $t \cong \ln N/\ln 3$. The maximum value is $\ell_{\max} = \ell(k = 1) \cong 2t/3$, i.e., again, we arrive at relation (5).
(C) Graph (c) in Fig. 1.-At each step, (i) a new vertex becomes attached to each end vertex of each edge of this graph and, simultaneously, (ii) a new vertex becomes attached to each vertex of the graph. This produces a growing deterministic scale-free tree with exponent γ = 3, which is a deterministic analog of the Barabási-Albert model [39,40] (for exact solution of the stochastic model, see Refs. [27,32,41]).
The number of vertices in the graph is $N_t = 1 + (4^{t+1} - 1)/3$. Their degrees take values $k(s) = 2^s - 1$, $s = 1, 2, 3, \ldots, t + 1$. The observed degree-dependent separation has an asymptotic form, valid for $k, N \gg 1$ (note that the maximum degree of a vertex in this graph is $k_{\max} \sim N^{1/2}$), that leads to expression (3) with $\gamma = 3$, which coincides with result (8) for uncorrelated networks. This is an understandable coincidence. Indeed, correlations between degrees of the nearest neighbor vertices in this deterministic graph, as well as in the Barabási-Albert model, are anomalously weak. So, the result must be close to that for an uncorrelated network.
(D) Graph (d) in Fig. 1.-At each step, (i) a pair of new vertices is attached to ends of each edge of the graph plus (ii) two new vertices are attached to each vertex of the graph. This results in the value of the γ exponent greater than 3, γ = 1 + ln 5/ ln 2 = 3.322 . . ..
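The three tree constructions (b), (c), and (d) differ only in how many extra leaves each existing vertex receives per step, so they can be sketched with one routine. The code below is our illustrative reading of the rules: every edge gives one new vertex to each of its two ends, and in addition every existing vertex receives `extra` new leaves, with `extra` = 0, 1, 2 for graphs (b), (c), (d). The function names, the single-edge starting graph for all three cases, and the closed forms used in the checks for graph (d) are assumptions derived from these rules rather than statements of the paper.

```python
import math


def grow_tree(t, extra):
    """Return (number of vertices, edge list) after t steps from a single edge.

    extra = 0, 1, 2 is assumed to correspond to graphs (b), (c), (d).
    """
    edges = [(0, 1)]
    n_vertices = 2
    for _ in range(t):
        n_old = n_vertices
        added = []
        for u, v in edges:
            # one new vertex attached to each end of every existing edge
            added.append((u, n_vertices))
            added.append((v, n_vertices + 1))
            n_vertices += 2
        for v in range(n_old):
            # `extra` further new vertices attached to every existing vertex
            for _leaf in range(extra):
                added.append((v, n_vertices))
                n_vertices += 1
        edges += added
    return n_vertices, edges


def degree_set(n_vertices, edges):
    """Set of distinct vertex degrees."""
    deg = [0] * n_vertices
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return set(deg)


def exponent(extra):
    """Exponent implied by the growth factors: vertices x(3 + extra), degrees x2."""
    return 1 + math.log(3 + extra) / math.log(2)


for name, extra in (("b", 0), ("c", 1), ("d", 2)):
    n, edges = grow_tree(4, extra)
    print(name, n, sorted(degree_set(n, edges)), round(exponent(extra), 3))
```

Under this reading the number of vertices grows by a factor of 3, 4, or 5 per step while degrees roughly double, which reproduces the exponents 2.585, 3, and 3.322 quoted in the text.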
The important feature of the expressions for $\ell(k, N)$ in deterministic scale-free networks with $\gamma \neq 3$ was the unequal coefficients of $\ln N$ and $\ln k$. For comparison we have measured $\ell(k, N)$ in a random scale-free network growing through the mechanism of preferential attachment with a linear preference function [27]. At each time step, a new vertex emerges and becomes attached to a vertex chosen with probability proportional to the sum of its degree and a constant $A$. The exponent is $\gamma = 3 + A$. We use $A = 1$, so that $\gamma = 4$. The resulting degree-dependent separations are shown in Fig. 2(a) for networks of up to 300 000 vertices. One can see in the inset that in these random networks the difference $\ell(k=1, N) - \ell(k, N)$ is independent of $N$, in contrast to the deterministic graphs (a)-(d). Furthermore, $[\ell(k=1, N) - \ell(k, N)]/\log_{10} k \approx 1.0$ as $\log_{10} k$ approaches zero [i.e., $d\ell(k, N)/d\ln k \approx -0.43$]. However, at large $k$, we find a linear dependence on $\log_{10} k$ with a larger slope, namely 4.1 [i.e., $d\ell(k, N)/d\ln k \approx -1.8$]. In turn, $\ell(k=1, N)$ is well fitted by a linear dependence on $\log_{10} N$ with a slope of approximately 3.1, see Fig. 2(b). The difference between these slopes, 4.1 and 3.1, is in sharp contrast to uncorrelated networks. The ratio of these slopes, 1.3, is close to what we had for deterministic graphs according to Eq. (3) with $\gamma = 4$ substituted, namely, $(\gamma - 1)/2 = 1.5$. Moreover, Fig. 2(a) shows that for each network size, $\ell_{\max} \approx 2\ell_{\min}$, as was observed in deterministic graphs.
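A small-scale version of such a simulation can be sketched as follows. This is a minimal illustration under our own assumptions, not the code used for Fig. 2: the tree starts from a single edge, the target of each new vertex is drawn with weight (degree + $A$) using the standard-library `random.choices`, and $\ell(k)$ is measured by breadth-first search from every vertex. The function names, the starting graph, and the network size are hypothetical, and no binning or averaging over runs is done.

```python
import random
from collections import deque, defaultdict


def grow_pa_tree(n_vertices, attractiveness=1.0, seed=0):
    """Grow a tree by linear preferential attachment; weight = degree + attractiveness."""
    rng = random.Random(seed)
    adj = {0: [1], 1: [0]}  # assumed starting graph: a single edge
    for new in range(2, n_vertices):
        old = list(adj)
        weights = [len(adj[v]) + attractiveness for v in old]
        target = rng.choices(old, weights=weights, k=1)[0]
        adj[new] = [target]
        adj[target].append(new)
    return adj


def distance_sum(adj, source):
    """Sum of shortest-path lengths from `source` to all other vertices (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return sum(dist.values())


def separation_by_degree(adj):
    """Return {k: mean separation of a degree-k vertex from the other vertices}."""
    n = len(adj)
    sums, counts = defaultdict(int), defaultdict(int)
    for v in adj:
        k = len(adj[v])
        sums[k] += distance_sum(adj, v)
        counts[k] += 1
    return {k: sums[k] / (counts[k] * (n - 1)) for k in sorted(sums)}


tree = grow_pa_tree(300, attractiveness=1.0, seed=1)
for k, ell in separation_by_degree(tree).items():
    print(k, round(ell, 3))
```

With a few hundred vertices the curve is noisy at large degrees, where only one or two vertices contribute to each point; this is the reason for the binning and the averaging over many runs mentioned in the caption of Fig. 2.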
One should note that the contribution $\sim k^{\gamma-1}/N$ to $\ell(k, N)$ for the deterministic graphs is noticeable only in a narrow neighborhood of $k_{\max}$ if the results are presented in the form $\ell(k, N)$ vs. $\ln k$. On the other hand, the linear dependence of $\ell(k, N)$ on $\ln k$ is realized in a much wider range of $\ln k$. In Eq. (15) for graph (a), it is valid for all degrees up to nearly $k_{\max}$, and in Eqs. (17), (19), and (21) for graphs (b), (c), and (d), respectively, this law is observable for $k \gg 1$. It is in this region that we compared the ratios of the coefficients of $\ln k$ and $\ln N$ in deterministic and stochastic growing scale-free networks.
(E) Graph (e) in Fig. 1.-At each time step, a new vertex becomes attached to each vertex of the graph. The growth starts with a single vertex ($t = -1$). The total number of vertices in the graph is $N_t = 2^{t+1}$. The degree distribution is exponential. One can check that the number of vertices of degree $k$ at time $t$ is $N_t(k \le t) = 2^{t+1-k}$, $N_t(k = t+1) = 2$ ($t$ is assumed to be greater than $-1$). By using the procedure described above, we find the exact expression
$$\ell_t(k) = \frac{2^{t+1}}{2^{t+1} - 1}\left(t + 1 - \frac{k}{2}\right). \tag{22}$$
This formula shows that the linear dependence on degree is valid for any $k$. For large graphs we have
$$\ell(k, N) \cong \frac{\ln N}{\ln 2} - \frac{k}{2}, \tag{23}$$
which confirms formula (4). In this graph, $\ell_{\min} \cong \ln N/(2\ln 2) \cong \ell_{\max}/2$, which coincides with relation (5).
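The linear law for this family of trees can be checked by direct path counting. The sketch below builds the tree in which every existing vertex receives `n` new leaves per step (`n = 1` is graph (e)) and compares the measured mean separation with the closed form $\ell_t(k) = \frac{N_t}{N_t - 1}\,\frac{2n(t+1) - k}{n+1}$, $N_t = (n+1)^{t+1}$. This closed form is our own derivation from the growth rule, with $\ell$ normalized by $N_t - 1$ and averaged over all vertices of degree $k$; it should be read as an assumption that the code tests, not as a formula quoted from the paper. The function names are likewise ours.

```python
from collections import deque, defaultdict
from fractions import Fraction


def grow_leaf_tree(t, n=1):
    """Tree at time t: start from one vertex (t = -1); each step adds n leaves per vertex."""
    adj = {0: []}
    for _step in range(t + 1):
        for v in range(len(adj)):      # only vertices present before this step
            for _leaf in range(n):
                new = len(adj)
                adj[new] = [v]
                adj[v].append(new)
    return adj


def distance_sum(adj, source):
    """Sum of shortest-path lengths from `source` to all other vertices (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return sum(dist.values())


def separation_table(adj):
    """Return {k: exact mean separation over all vertices of degree k}."""
    size = len(adj)
    sums, counts = defaultdict(int), defaultdict(int)
    for v in adj:
        k = len(adj[v])
        sums[k] += distance_sum(adj, v)
        counts[k] += 1
    return {k: Fraction(sums[k], counts[k] * (size - 1)) for k in sorted(sums)}


def closed_form(t, k, n=1):
    """Assumed closed form (our derivation): slope in k is -1/(n+1) up to N/(N-1)."""
    size = (n + 1) ** (t + 1)
    return Fraction(size, size - 1) * Fraction(2 * n * (t + 1) - k, n + 1)


for t, n in ((5, 1), (3, 2), (2, 3)):
    table = separation_table(grow_leaf_tree(t, n))
    print(n, all(table[k] == closed_form(t, k, n) for k in table))
```

For `n = 1` the slope in $k$ is $-1/2$ and the minimum separation, reached at $k = t + 1$, is half the $k$-independent term, in line with the values quoted in the text.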
Graph (e) has a close stochastic analog: a tree where, at each step, a new vertex is attached to a randomly chosen vertex. It is easy to obtain the asymptotic expression for the mean shortest path length $\ell(N)$ in this network. Let us consider an even more general model, in which at each time step $n$ new vertices are attached to a randomly selected vertex. Then the total number of vertices grows as $N_t \cong nt$. For the total length $L(t+1)$ of the shortest paths between vertices in the network at time $t + 1$ one can write
$$L(t+1) = L(t) + \frac{1}{N_t}\sum_{v}\left(1\cdot n + 2\cdot\frac{n(n-1)}{2} + n\sum_{u \neq v}[\ell(u,v) + 1]\right), \tag{24}$$
where the outer sum runs over the vertices $v$ of the network at time $t$ and $\ell(u,v)$ is the length of the shortest path between vertices $u$ and $v$. The first term on the right-hand side of this equation is the total length of the shortest paths in the network at time $t$. The second term is the increase of this total length due to the attachment of $n$ new vertices to a randomly chosen vertex. The factor $1/N_t$ is due to the random choice. The term $1 \cdot n$ is the sum of the paths connecting the new vertices to their "host". The term $2 \cdot n(n-1)/2$ is the total length of the paths between the new vertices. The last term in the large parentheses is the sum of the lengths of the paths connecting the $n$ new vertices and the $N_t - 1$ old vertices distinct from the vertex receiving new connections. In the large network limit, Eq. (24) is readily reduced to the following one:
$$\frac{\partial L}{\partial t} = \frac{2L}{t} + n^2 t, \tag{25}$$
and so, with $\ell \cong 2L/N^2$, we have
$$\ell(N) \cong 2\ln N \tag{26}$$
independently of $n$. The calculation of $\ell(k)$ is a more difficult problem. So, for comparison, we present here only the result of the simulation of this stochastic network. Figure 3(a) demonstrates that the dependence $\ell(k)$ in the stochastically growing network is a linear function with the same slope $-1/2$ as in the deterministic small world (e) in Fig. 1.
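The recursion for the total path length can be iterated numerically without simulating any network. The sketch below assumes our reading of the terms described above: when $n$ new vertices attach to a uniformly chosen host in a tree of $N$ vertices, the mean total path length $L$ changes as $L \to L(1 + 2n/N) + n^2 + n(N - 1)$, starting from a single vertex with $L = 0$. The function name, the starting condition, and the normalization $\ell = 2L/[N(N-1)]$ are our assumptions.

```python
import math


def mean_separation(n_target, n=1):
    """Iterate the averaged recursion until the tree has at least n_target vertices.

    Returns (N, l) with l = 2 L / (N (N - 1)).
    """
    size, total = 1, 0.0   # assumed start: a single vertex, no paths
    while n_target > size:
        # n host links + n(n-1) between new vertices + n(N-1) extra hops to old
        # vertices + 2nL/N from the average distance sum of the random host
        total = total * (1 + 2 * n / size) + n * n + n * (size - 1)
        size += n
    return size, 2 * total / (size * (size - 1))


for n in (1, 3):
    n1, ell1 = mean_separation(10_000, n)
    n2, ell2 = mean_separation(100_000, n)
    slope = (ell2 - ell1) / (math.log(n2) - math.log(n1))
    print(n, round(ell1, 3), round(ell2, 3), round(slope, 4))
```

The measured slope of $\ell$ against $\ln N$ comes out close to 2 for both values of $n$, which is the $n$-independent logarithmic growth stated in the text.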
We also considered more general deterministic graphs of this type, where n new vertices become attached to each vertex of a network at each time step. The resulting dependence ℓ(k) is a linear function but with slope −1/(n + 1). Figure 3(b) shows that ℓ(k) of the corresponding stochastically growing networks has the same form. We also checked that ℓ(k = 1, N ) ≈ 2 ln N , as in expression (26) for ℓ(N ).

V. DISCUSSION AND SUMMARY
Several points should be emphasized: (i) One can estimate a typical value of the correction term in formula (3). At the maximum degree $k_{\max} \sim N^{1/(\gamma-1)}$, this term is of the order of $k_{\max}^{\gamma-1}/N \sim \mathrm{const}$. This should be compared to $\ln[k_{\max}^{1/(\gamma-1)}] \sim \ln N$. (ii) One should indicate that law (4), i.e., a linear dependence $\ell(k)$, was obtained only for growing trees with an exponential degree distribution. In non-tree growing networks with random attachment (at each time step, a new vertex becomes attached to several randomly chosen vertices), we observed a nonlinear dependence.
(iii) The relative width of the distribution of the intervertex distance in infinite small worlds approaches zero [9,17]. In other words, vertices of an infinite small world are almost surely mutually equidistant. This circumstance does not allow one to measure ℓ(k) in an infinite network with the small-world effect. However, even in very large real-world networks (e.g., in the Internet [34]), the distribution of the intervertex distance is still broad enough. So, in real networks, ℓ(k) is a measurable characteristic.
(iv) The degree-dependent mean intervertex distance may be considered as a measure of the "centrality" of a vertex of a given degree in a network. How does this characteristic relate to other centrality characteristics [15], first of all to the centrality index of a vertex [14]? Recall that the centrality index of a vertex $v$ is defined as $c_v = (N-1)/\sum_u \ell(v,u)$, where $\ell(v,u)$ is the length of the shortest path between vertices $u$ and $v$, $N$ is the number of vertices in the graph, and the sum is over all vertices of the graph. (The centrality index is often given without the $N - 1$ factor.) One may see that the mean centrality index $c(k)$ of a vertex of degree $k$ is related (but not equal) to $1/\ell(k)$. Nevertheless, there is a special case: graphs where every vertex of a given degree $k$ has the same value of the sum of intervertex distances between this vertex and the rest of the vertices. Then this value is exactly $(N-1)\ell(k)$, and consequently $c(k) = 1/\ell(k)$. This situation is realized in our deterministic graphs. Thus, in the deterministic graphs, we actually found the inverse centrality index, but in random networks, $c(k)$ and $\ell(k)$ are different characteristics.
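The difference between $c(k)$ and $1/\ell(k)$ is easy to see on a toy example. The sketch below, with illustrative function names of our own, computes both quantities with exact fractions for a path of five vertices, where the three vertices of degree 2 have different distance sums.

```python
from collections import deque, defaultdict
from fractions import Fraction


def distance_sum(adj, source):
    """Sum of shortest-path lengths from `source` to all other vertices (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return sum(dist.values())


def centrality_and_separation(adj):
    """Return ({k: mean centrality index c(k)}, {k: mean separation l(k)})."""
    n = len(adj)
    sums_by_degree = defaultdict(list)
    for v in adj:
        sums_by_degree[len(adj[v])].append(distance_sum(adj, v))
    c = {k: sum(Fraction(n - 1, s) for s in sums) / len(sums)
         for k, sums in sums_by_degree.items()}
    ell = {k: Fraction(sum(sums), len(sums) * (n - 1))
           for k, sums in sums_by_degree.items()}
    return c, ell


# Path 0 - 1 - 2 - 3 - 4: the degree-2 vertices have distance sums 7, 6, 7.
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
c, ell = centrality_and_separation(path5)
for k in sorted(c):
    print(k, c[k], 1 / ell[k])
```

For the end vertices (degree 1) the two quantities coincide because both ends have the same distance sum, while for degree 2 the mean of the inverse sums differs slightly from the inverse of the mean sum.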
In conclusion, we have studied the mean length of the shortest paths between a vertex of degree k and the other vertices in growing networks with power-law and exponential degree distributions. In the investigated deterministic and random networks, we have observed dependences ℓ(k) which strongly differ from those for uncorrelated networks. Our results characterize the compactness of a network from the point of view of a vertex with a given number of connections.