Design/upgrade of a transparent optical network topology resilient to the simultaneous failure of its critical nodes

This paper addresses two related problems in the context of transparent optical networks. In the network design problem, the aim is to identify a set of fiber links to connect a given set of nodes. In the network upgrade problem, the aim is to identify a set of new fiber links to add to a given network topology. For a given fiber length budget, the aim in both problems is to maximize the network resilience to the simultaneous failure of its critical nodes. The resilience is evaluated by the average 2‐terminal reliability (A2TR) against a set of critical node failures and the critical nodes are the ones that minimize the A2TR of the network. So, the design/upgrade problem is a bi‐level max‐min optimization problem. Recently, a multi‐start greedy randomized heuristic was proposed for both problems. Here, we propose an alternative method based on a greedy deterministic algorithm and we provide computational results showing that the new method obtains better solutions. The results show that the resiliency difference between existing network topologies and the best network design solutions is very high but this difference can be significantly reduced by network upgrades with small fiber length budgets.


INTRODUCTION
Large-scale failures, whether caused by natural, technological, or malicious human activities, can seriously disrupt a telecommunications network [14]. Two recent surveys conducted within COST Action RECODIS are [10], on strategies to protect networks against large-scale natural disasters, and [9], on security challenges in communication networks. When dealing with large-scale failures, it is important not only to recover from failures as quickly as possible (the post-disaster problem), but also to prepare the network to minimize the impact of such failures (the pre-disaster problem).
This work deals with the pre-disaster problem by addressing the design/upgrade of telecommunication networks aiming to enhance their resilience to large-scale failures. To reach this goal, we first adopt a proper network resiliency metric and, then, we propose design methods aiming to optimize the network resiliency metric.
We address the design of resilient network topologies in the context of transparent optical networks. Note that, in general, multiple failures might involve either only links or both nodes and links (a node failure implies that its links also fail). For example, in malicious human attacks, node shutdowns are harder to carry out but they are the most rewarding from the attackers' perspective, since the shutdown of a single node also shuts down its incoming/outgoing fiber links. Moreover, power outages can only shut down nodes, since fiber links do not require power supply. Here, we consider as large-scale failures the case of multiple node failures, as they are the most harmful cases.
For a given network topology, if some nodes are known to be critical, the network design should take them into account, as in [3], where the approach proposed in [1] is adapted to the design of a transparent optical network minimizing the failure impact of a given set of critical nodes.
Here, we consider that the resilience of a network topology is evaluated by the average 2-terminal reliability (A2TR) against a set of critical node failures. The A2TR metric is defined as the number of node pairs that remain connected if all critical nodes fail. The critical nodes of a network are the nodes that minimize the A2TR of the network, an optimization problem commonly known as the critical node detection (CND) problem. So, the design problem is a bi-level max-min optimization problem.
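As a minimal sketch of this metric, the following Python function counts the node pairs that remain connected once a set of nodes is removed. It deliberately ignores the transparent reach (introduced later), and the function name, the 5-node path topology, and the chosen failed node are hypothetical illustrations, not taken from the paper.

```python
from collections import deque

def connected_pairs(nodes, links, removed=()):
    """Count node pairs still connected after deleting the `removed` nodes."""
    removed = set(removed)
    alive = [v for v in nodes if v not in removed]
    adj = {v: [] for v in alive}
    for i, j in links:
        if i in adj and j in adj:      # a failed node takes its links with it
            adj[i].append(j)
            adj[j].append(i)
    seen, pairs = set(), 0
    for start in alive:
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        pairs += size * (size - 1) // 2   # pairs inside this component
    return pairs

# Hypothetical 5-node path 1-2-3-4-5.
nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (2, 3), (3, 4), (4, 5)]
print(connected_pairs(nodes, links))               # 10
print(connected_pairs(nodes, links, removed=[3]))  # 2
```

Removing node 3 splits the path into {1, 2} and {4, 5}, so only 2 of the original 10 pairs survive; a CND solver would look for the removal set that makes this count as small as possible.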
CND problems have been considered in different contexts and are gaining special attention in the vulnerability evaluation of telecommunication networks to large-scale failures [10]. In [2], the CND problem is defined as the detection of a given number c of critical nodes aiming to minimize the number of connected node pairs. Recently, this and other variants of CND have been addressed [7,18,21,22], but none of these works addresses the CND problem in the context of transparent optical networks.
Other metrics have been used to evaluate the vulnerability of networks in other contexts [16] or assuming multiple geographically correlated failures [12]. There are also works on improving the preparedness of networks to multiple failures, some by changing the network topology [5,11,23], while others by proposing strategies to recover from failures [8,17]. None of these works, though, uses the optimal solution of the CND problem to assess the vulnerability of networks. On the other hand, CND was used in [6] but resiliency improvement is exploited by optimal robust node selection on a given network topology. The advantage of using CND is that it provides a worst case resiliency analysis, that is, in any failure involving the same number of failing nodes, the resulting A2TR is never worse than the value provided by the CND optimal solution.
In transparent optical networks, data is converted into light in the source node and transmitted through an all optical path, named lightpath, towards the destination node. Due to many optical degradation factors, like attenuation, dispersion, crosstalk, and other nonlinear factors, there is a maximum length, named transparent reach, for each lightpath to work properly. If a lightpath is required on a path whose length is higher than the transparent reach, regenerators must be placed at intermediate nodes to convert the optical signal into the electrical domain, regenerate the signal and reconvert it back to the optical domain (when regenerators are used, the network is referred to as a translucent network). Nevertheless, the use of regenerators is expensive and puts an additional burden on the network management and, so, they are avoided when possible. The design of translucent networks must take into account the cost imposed by the required regenerators, which is out of the scope of this work. The methods proposed in this work are applicable to transparent optical networks, that is, optical networks whose diameter (the highest optical length among the shortest paths of all node pairs) is not higher than the transparent reach. Note that the optical length of a path depends both on the length of its links and on its number of hops, that is, number of intermediate nodes. We model the optical degradation suffered by a lightpath while traversing an intermediate node as a fiber length value Δ, that is, by considering it equivalent to the degradation incurred due to the transmission over a fiber of length Δ. So, when accounting for the A2TR metric, the CND problem has to consider that two nodes are connected only if the surviving network provides them with at least one path whose optical length is within the required transparent reach.
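The optical length described above (fiber length plus Δ for each intermediate node) can be illustrated with a short Python sketch. The function name, link lengths, Δ = 60 and reach of 1500 are hypothetical values chosen only for the example.

```python
def optical_length(path, length, delta):
    """Fiber length of `path` plus `delta` for each intermediate node.

    `path` is a node sequence with at least two nodes; `length` maps an
    undirected link, stored as a frozenset of its end nodes, to its length.
    """
    fiber = sum(length[frozenset(edge)] for edge in zip(path, path[1:]))
    intermediate_nodes = len(path) - 2
    return fiber + delta * intermediate_nodes

# Hypothetical link lengths (e.g., in km).
length = {
    frozenset((1, 2)): 400.0,
    frozenset((2, 3)): 300.0,
    frozenset((3, 4)): 500.0,
}
d = optical_length([1, 2, 3, 4], length, delta=60.0)
print(d)             # 1320.0
print(d <= 1500.0)   # True
```

Here the path 1-2-3-4 has 1200 of fiber and two intermediate nodes, giving an optical length of 1320, which would fit within a (hypothetical) transparent reach of 1500.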
In [4], a multi-start greedy randomized method was proposed to generate network topologies, with a given fiber length budget, that are resilient to critical node failures. The method is also adapted in [4] to the upgrade of an existing network topology. Here, we propose an alternative method for the same network design/upgrade problem based on a greedy deterministic algorithm and provide computational results showing that the new method obtains better solutions than the one proposed in [4]. With the updated results, we review the conclusions drawn in [4] concerning the difference in resiliency between the network design and the network upgrade solutions. The computational results show that the resiliency difference between existing topologies and the best network design solutions is very high, but this difference can be significantly reduced by network upgrades with small fiber length budgets.
The paper is organized as follows. Section 2 describes a path-based mixed integer linear programming (MILP) model defining the CND problem in the context of transparent optical networks and a row generation algorithm that is used to solve the problem. Section 3 proposes deterministic algorithms to generate network topologies resilient to the simultaneous failure of their critical nodes. The computational results are presented and discussed in Section 4. Finally, Section 5 presents the main conclusions of the work.

CRITICAL NODE DETECTION PROBLEM
Consider a transparent optical network represented by an undirected graph G = (N, E) where N = {1, …, n} is the set of nodes and E ⊆ {(i, j) ∈ N × N : i < j} is the set of fiber links. For each link (i, j) ∈ E, parameter l ij represents its length. The transparent reach of the network is denoted by parameter T > 0 and the fiber length equivalent to the degradation suffered by a lightpath while traversing an intermediate node is denoted by parameter Δ > 0. We assume that l ij ≤ T for all (i, j) ∈ E; otherwise, such a link is useless and can be removed from G.
The set of all paths in G between i ∈ N and j ∈ N (with i < j and (i, j) ∉ E) with length not greater than T is denoted by P ij . Each path p ∈ P ij is defined by the binary parameters p k , indicating whether node k ∈ N (which can be an end node) is in p or not, and p kt indicating whether link (k, t) ∈ E is in p or not. So, P ij is composed of all paths p such that $\sum_{(k,t) \in E} p_{kt}\, l_{kt} + \Delta \left( \sum_{k \in N} p_k - 2 \right) \le T$. To model the CND problem, we consider for each node i ∈ N a binary variable v i indicating whether i is a critical node or not. We consider also for each node pair (i, j), with i, j ∈ N : i < j, a binary variable u ij which is 1 if nodes i and j can be connected through a path satisfying the transparent reach T, and 0 otherwise.
Then, for a given number c of critical nodes, a path-based formulation for the CND problem is given by the following integer linear programming (ILP) model:
$$\min \; z = \sum_{i,j \in N : i<j} u_{ij} \tag{1}$$
subject to
$$\sum_{i \in N} v_i \le c \tag{2}$$
$$u_{ij} + v_i + v_j \ge 1, \quad (i,j) \in E \tag{3}$$
$$u_{ij} + \sum_{k \in N} p_k v_k \ge 1, \quad (i,j) \notin E,\; p \in P_{ij} \tag{4}$$
$$v_i \in \{0,1\}, \quad i \in N \tag{5}$$
$$u_{ij} \in \{0,1\}, \quad i,j \in N : i<j \tag{6}$$
The objective function (1) value z is the A2TR value defined as the total number of connected node pairs in the surviving graph (i.e., the graph given by removing all critical nodes from G). Constraint (2) ensures that at most c nodes are selected as critical nodes (in any optimal solution, c critical nodes are selected). Constraints (3) guarantee that a pair of adjacent nodes (i.e., with a direct link between them) is connected if none of the two nodes is a critical node. Constraints (4) are the generalization of Constraints (3) for the node pairs that are not adjacent in G: node pair (i, j) is connected if there is one path p ∈ P ij such that none of its nodes is a critical node. Constraints (5) and (6) are the variable domain constraints.
Note that, since variables v i are binary, Constraints (3) and (4) impose u ij ≥ 1 when nodes i and j are connected, which then, due to the objective function, forces u ij = 1. Therefore, Constraints (6) can be replaced by u ij ≥ 0. The resulting MILP model will be considered henceforward and is referred to as the exact CND model.
The total number of Constraints (4) depends on the graph topology, the link lengths and the values of T and Δ. However, the exact CND model becomes too large (i.e., it has too many constraints) even for relatively small instances, so instances of reasonable size cannot be solved directly by any available solver. Instead, a row generation approach can be used to solve the exact CND model, as described in Algorithm 1.

Algorithm 1. Exact algorithm for the CND problem
1: Initialize and solve MILP′ model (1)-(6) without Constraints (4). Let (u*, v*) be the optimal solution
2: repeat
3:   Set NCuts ← 0 and K ← {k ∈ N : v*_k = 1}
4:   Compute graph G_K = (N∖K, E_K), adding Δ to the length of each edge in E_K
5:   for all i, j ∈ N∖K with i < j and (i, j) ∉ E_K do
6:     Compute shortest path p ∈ P_ij and its optical length d
7:     if d ≤ T and u*_ij + ∑_{k=1..n} p_k v*_k = 0 then
8:       Add to MILP′ Constraint (4) corresponding to path p
9:       NCuts ← NCuts + 1
10:     end if
11:   end for
12:   if NCuts > 0 then
13:     Solve MILP′. Let (u*, v*) be the optimal solution
14:   end if
15: until NCuts = 0
In line 1, the MILP′ model, given by the exact CND model without Constraints (4), is initialized and solved. Then, in the main cycle (lines 2-15), the separation problem associated with Constraints (4) is solved (lines 3-11), the identified violated constraints (whose number is counted in NCuts) are added to MILP′ (lines 7-10) and, finally, MILP′ is solved again (lines 12-14). The algorithm ends when no violated constraint is found (line 15), and the optimal solution is the solution of the last solved MILP′ model.
The separation problem associated with Constraints (4) is solved as follows. In line 4, the subgraph G K = (N∖K, E K ) is computed by removing from G the critical nodes of set K (determined in line 3) and the corresponding incident edges, and by adding Δ to the length of each edge in E K . Since the number of intermediate nodes of a path is equal to its number of edges minus one, the shortest path value in G K is equal to the optical path length plus Δ. So, between each pair of nodes i and j in N∖K such that (i, j) ∉ E K (line 5), the shortest path p in G K is determined (with Dijkstra's algorithm), its optical length d is computed as the length of p minus Δ (line 6), and the violation of the Constraint (4) associated with p is checked (line 7).
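The Δ-augmentation trick used in the separation step can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the function name, the 4-node topology, its lengths and Δ = 60 are hypothetical, and links are assumed to be stored as tuples keyed in a length dictionary.

```python
import heapq

def optical_distances(nodes, length, removed, delta, source):
    """Optical shortest-path lengths from `source` in the surviving graph.

    Every surviving edge is lengthened by `delta`. A path with h edges then
    costs (fiber length + h * delta), i.e. its optical length plus delta,
    so delta is subtracted once at the end.
    """
    removed = set(removed)
    adj = {v: [] for v in nodes if v not in removed}
    for (i, j), l in length.items():
        if i in adj and j in adj:
            adj[i].append((j, l + delta))
            adj[j].append((i, l + delta))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:                         # standard Dijkstra with lazy deletion
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for w, l in adj[v]:
            if d + l < dist.get(w, float("inf")):
                dist[w] = d + l
                heapq.heappush(heap, (d + l, w))
    return {v: d - delta for v, d in dist.items() if v != source}

# Hypothetical topology: path 1-2-3-4 plus a long direct link (1, 4).
length = {(1, 2): 400.0, (2, 3): 300.0, (3, 4): 500.0, (1, 4): 1400.0}
dist = optical_distances([1, 2, 3, 4], length, removed=[], delta=60.0, source=1)
print(dist[3], dist[4])   # 760.0 1320.0
dist = optical_distances([1, 2, 3, 4], length, removed=[3], delta=60.0, source=1)
print(dist[4])            # 1400.0
```

With all nodes alive, node 4 is reached through 1-2-3-4 (1200 of fiber plus two intermediate nodes); once node 3 fails, only the direct link remains. Comparing each value with T gives the check of line 7.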
Note that if the imposition of the transparent reach T is relaxed (i.e., considering T → +∞), we get one of the classical CND problem variants for which there are known efficient compact MILP models. One such model, proposed in [18], is as follows. Consider for each pair of nodes (i, j) the set N ij ⊂ N which represents the set of adjacent nodes to i (on graph G) if the node degree of i is not higher than the node degree of j, or the set of adjacent nodes to j, otherwise. Then, the previous Constraints (4) can be replaced by a polynomial sized set of constraints, referred to as Constraints (7). In these constraints, u {ik} represents variable u ik if i < k or variable u ki otherwise (and likewise for u {kj} ). Constraints (7) guarantee that for each pair of nodes i and j not adjacent in G, they are connected if there is a noncritical node k ∈ N ij connected to both i and j. The resulting MILP model, that is, replacing Constraints (4) by Constraints (7), is referred to as the compact CND model.
Note that the optimal solution value of the compact CND model is an upper bound on the optimal solution value of the exact CND model, since all node pairs that are connected are counted in the objective function of the compact CND model, while the ones whose shortest path length over the surviving network is higher than the transparent reach T are not counted in the exact problem. Nevertheless, in most cases, the optimal solution of the compact CND model is also the optimal solution of the exact problem, and it is solved much faster with a standard solver (it does not require the row generation method of Algorithm 1). As will be described in the next section, we use this fact to derive computationally efficient algorithms for the network design/upgrade problem.
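The upper-bound relation can be checked on a tiny instance by brute force. The sketch below enumerates every set of c nodes instead of solving either MILP model, so it is only a stand-in usable for very small graphs; the 5-node ring, its link lengths, Δ = 50 and T = 1000 are hypothetical.

```python
import heapq
from itertools import combinations

def pairs_within_reach(nodes, length, removed, delta, reach):
    """Count surviving node pairs joined by a path of optical length <= reach."""
    removed = set(removed)
    adj = {v: [] for v in nodes if v not in removed}
    for (i, j), l in length.items():
        if i in adj and j in adj:
            adj[i].append((j, l + delta))   # delta-augmented edge lengths
            adj[j].append((i, l + delta))
    count = 0
    for s in adj:
        dist = {s: 0.0}
        heap = [(0.0, s)]
        while heap:
            d, v = heapq.heappop(heap)
            if d > dist[v]:
                continue
            for w, l in adj[v]:
                if d + l < dist.get(w, float("inf")):
                    dist[w] = d + l
                    heapq.heappush(heap, (d + l, w))
        # Count each unordered pair once (t > s) and undo the extra delta.
        count += sum(1 for t, d in dist.items() if t > s and d - delta <= reach)
    return count

def brute_force_cnd(nodes, length, c, delta, reach):
    """Return (minimum A2TR value, critical node set) by full enumeration."""
    return min((pairs_within_reach(nodes, length, K, delta, reach), K)
               for K in combinations(nodes, c))

# Hypothetical 5-node ring with 400-long links.
nodes = [1, 2, 3, 4, 5]
length = {(1, 2): 400.0, (2, 3): 400.0, (3, 4): 400.0, (4, 5): 400.0, (1, 5): 400.0}
print(brute_force_cnd(nodes, length, c=1, delta=50.0, reach=1000.0))         # (5, (1,))
print(brute_force_cnd(nodes, length, c=1, delta=50.0, reach=float("inf")))   # (6, (1,))
```

After any single node failure the ring becomes a 4-node path whose end nodes are 1300 apart in optical length, so the reach-aware value is 5 while the reach-free value is 6, consistent with the bound stated above.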

NETWORK DESIGN/UPGRADE PROBLEM
Consider an existing network G = (N, E) with a total fiber length L and a fiber length budget B = L + L ′ , with L ′ ≥ 0. The network design problem aims to identify a new network topology connecting all nodes in N whose total fiber length is not higher than B. The network upgrade problem aims to identify a set of fiber links within the budget L ′ to be added to the existing topology. In both cases, the aim is to obtain a network (design or upgrade) topology that maximizes the A2TR value under the simultaneous failure of its critical nodes.
In [4], a multi-start greedy randomized algorithm was proposed for this network design/upgrade problem. The method randomly generates multiple network topologies, with a fiber length budget given by B. Each topology is generated by a greedy randomized algorithm that builds a network by randomly selecting one link at a time until no new link can be added within the fiber budget B. In that stochastic method, the evaluation of each network topology uses centrality based heuristics in a preliminary evaluation and, if necessary, Algorithm 1 to compute its exact A2TR value. At the end, the method outputs the best generated topology, that is, the one with highest A2TR value.
Here, we propose an alternative deterministic method based on a greedy approach. The main differences when compared with the previous approach are that a single greedy solution is generated and, on each greedy step, the selected link is based on the critical nodes of the current partial topology (resulting from all already selected links). As a consequence, the critical nodes must be computed at each step of the greedy algorithm.
The proposed method is composed of three tasks which are run in sequence. In the first task, a greedy deterministic algorithm is run so that a topology solution is computed. In the second task, a local search method is applied to the previous solution to try to find a better one. Both the first and second tasks consider the network A2TR evaluation imposed by the critical nodes based on the compact CND model. Finally, the third task evaluates the previous solution in terms of optical transparency, using the exact CND model.
If the previous solution is not optically transparent and/or the evaluation provided by the exact CND model is lower than the evaluation value provided by the compact CND model, the third task uses the unused budget to add new links so that the network topology becomes optically transparent and the exact A2TR value becomes as close as possible to the value provided by the compact CND. Next, we describe the tasks in three separate subsections and, then, describe the overall algorithms in the fourth subsection.

Greedy deterministic approach
In the first task, both network design and upgrade problems are conceptually modeled as the upgrade of a network topology represented by graph G = (N, E) where E = { } in the network design problem and E is the set of fiber links of an existing network in the network upgrade problem. In both cases, parameter l ij , with i < j, represents the length of the fiber link between nodes i and j (either an existing link or a possible new link).
Consider the following notation. For a general graph G = (N, ℰ) and a given set of critical nodes K ⊂ N, consider the surviving graph given by G K = (N∖K, ℰ K ), where ℰ K is the set of links of ℰ with both end nodes in N∖K. The surviving graph G K might be composed of different connected components, where a node in one component has no connectivity to a node of any other component. So, for a given set of critical nodes K ⊂ N, parameter m K indicates the number of connected components of the surviving graph G K and the binary parameters cp K (i, j) indicate whether nodes i and j belong to two different components of G K (cp K (i, j) = 1) or not (cp K (i, j) = 0).
Algorithm 2 presents a greedy deterministic algorithm to upgrade a network represented by graph G = (N, E) with a fiber budget B. This algorithm iteratively selects a candidate link among the ones whose end nodes belong to two different components in the current surviving network topology, that is, the topology that results from removing the critical nodes from the current partial topology.
Algorithm 2 has four input parameters (line 1): besides the graph G and the fiber budget B, the algorithm has a third integer parameter c representing the number of critical nodes of interest and a fourth parameter which is an input set of links E ′ considered as follows.
In the network upgrade problem, we consider the input link set E ′ = { }.
In the network design problem, we follow [4] considering the input link set E ′ given by the relative neighborhood graph (RNG) [20], which is defined as follows: nodes i, j ∈ N are connected by a link if and only if there is no other node k ∈ N∖{i, j} such that l ik ≤ l ij and l jk ≤ l ij . The preliminary tests have shown that this set of links provides a good initial balance between connectivity and amount of used fiber budget in the network design problem.
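The RNG rule stated above can be sketched directly in Python. The sketch assumes, for illustration only, that link lengths are Euclidean distances between node coordinates; the function name and the coordinates are hypothetical.

```python
from itertools import combinations
from math import dist

def relative_neighborhood_graph(coords):
    """Links (i, j) for which no third node k has l_ik <= l_ij and l_jk <= l_ij."""
    links = []
    for i, j in combinations(sorted(coords), 2):
        l_ij = dist(coords[i], coords[j])
        blocked = any(
            dist(coords[i], coords[k]) <= l_ij and dist(coords[j], coords[k]) <= l_ij
            for k in coords
            if k not in (i, j)
        )
        if not blocked:
            links.append((i, j))
    return links

# Hypothetical node coordinates.
coords = {1: (0.0, 0.0), 2: (2.0, 0.0), 3: (1.0, 0.5), 4: (5.0, 0.0)}
print(relative_neighborhood_graph(coords))   # [(1, 3), (2, 3), (2, 4)]
```

Link (1, 2) is rejected because node 3 is closer to both of its end nodes, which is how the RNG keeps short links while still connecting all nodes with little fiber.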

Algorithm 2. Greedy deterministic algorithm
7: Let z* be the optimal value and (u*, v*) be the optimal solution
In line 2, some relevant variables are initialized: z, which represents the CND optimal value of the current partial topology; B R , which represents the amount of fiber budget still available; and set 𝒦, which includes all computed sets of critical nodes.
Algorithm 2 is an iterative process (lines 3-18) where each iteration is composed of two phases: an adding phase, where links are added to set E′ in order to improve the CND value of the resulting topology until the available budget is not enough to add more links; and a removing phase, where all links in E′ are reevaluated and removed if their elimination does not decrease the CND optimal value of the resulting topology. In each iteration, set E′ add (and set E′ rem ) represents the set of links added to (and removed from) the network in the current iteration. These sets are initialized empty at the beginning of each iteration (line 4).
In the adding phase (lines 5-14), the compact CND model is solved for the current partial topology (N, E ∪ E′) (lines 6 and 7), its set of critical nodes is stored in set K (line 8) and the set of network components imposed by the critical node set K is determined (line 9). In line 10, if the number of components m K is one (the network remains connected under any c node failures) or if the shortest available link connecting two components is longer than the remaining budget, no new link is added and the adding phase ends. Otherwise, among the links (i, j) ∉ E ∪ E′ that connect two nodes belonging to different components, the one with the smallest value of a given function f (i, j) is selected (line 11) and added to the current partial topology (line 12). Function f (i, j) can be one of two possibilities: f 1 (i, j) = l ij , the link length, or f 2 (i, j), which depends on the degrees of nodes i and j in the current partial topology (N, E ∪ E′).
The adding phase is repeated until no new link is selected (line 14). In lines 15-17, if at least one link was added in the adding phase, the removing phase (described in Algorithm 3) is run. Algorithm 2 runs until no links are added in the adding phase, or no links are removed in the removing phase or the last links added in the adding phase are the links removed in the removing phase (line 18).
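The adding phase can be sketched in Python under simplifying assumptions: the compact CND model is replaced by a brute-force enumeration (viable only for tiny graphs), f(i, j) = l ij is used, ties among critical sets are broken by enumeration order, and the removing phase is omitted. All names and the 5-node instance are hypothetical.

```python
from itertools import combinations

def components(nodes, links, removed):
    """Map each surviving node to a component representative (union-find)."""
    parent = {v: v for v in nodes if v not in removed}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    for i, j in links:
        if i in parent and j in parent:
            parent[find(i)] = find(j)
    return {v: find(v) for v in parent}

def pairs(comp):
    """Number of connected node pairs given a node -> component map."""
    sizes = {}
    for root in comp.values():
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())

def critical_nodes(nodes, links, c):
    """Brute-force stand-in for the compact CND model."""
    return min(combinations(nodes, c),
               key=lambda K: pairs(components(nodes, links, set(K))))

def adding_phase(nodes, links, length, budget, c):
    links = list(links)
    while True:
        comp = components(nodes, links, set(critical_nodes(nodes, links, c)))
        # Candidate links join two different surviving components within budget.
        candidates = [(l, e) for e, l in length.items()
                      if e not in links and l <= budget
                      and e[0] in comp and e[1] in comp
                      and comp[e[0]] != comp[e[1]]]
        if not candidates:
            return links, budget
        l, e = min(candidates)              # smallest f1(i, j) = l_ij
        links.append(e)
        budget -= l

nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (2, 3), (3, 4), (4, 5)]                   # existing path
length = {(1, 3): 5, (3, 5): 4, (1, 5): 9, (2, 4): 3}      # candidate links
print(adding_phase(nodes, links, length, budget=8, c=1))
# ([(1, 2), (2, 3), (3, 4), (4, 5), (2, 4), (1, 3)], 0)
```

In this run node 3 is critical first, so link (2, 4) is added; node 2 then becomes critical and link (1, 3) is added, which exhausts the budget.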
Algorithm 3.
4: for all K ∈ 𝒦 do
5:   z′_K ← resiliency value given by set of critical nodes K in graph G′
6: end for
7: if min_{K∈𝒦} {z′_K} ≥ z then
8:   Solve compact CND model for graph G′ and set z′ as the optimal value
The removing phase (Algorithm 3) is an iterative process (lines 2-14) that evaluates each link (i, j) ∈ E′ in decreasing order of its length l ij . For each (i, j) ∈ E′, first, the algorithm computes (in line 3) graph G′, which represents the upgraded network without link (i, j). Then, the compact CND of G′ is solved (line 8) and, if its CND value z′ is not lower than the current resiliency value z (line 10), the link (i, j) is removed from E′ since its removal does not degrade the resilience of the current topology (line 11).
Our preliminary tests showed that most of the CND optimal solutions in this phase overlap with previous solutions (stored in set 𝒦, line 8 of Algorithm 2). So, in order to improve the computational efficiency of the removing phase, in lines 4-6 of Algorithm 3, the resiliency value z′ K is computed for each critical node set K ∈ 𝒦 in graph G′ (i.e., the number of connected node pairs in graph G′ without nodes K). Then, if the minimum of these values is not lower than the current resiliency value (line 7), the algorithm needs to solve the compact CND model. Otherwise, we know that the current link (i, j) cannot be removed and, therefore, we do not need to solve the compact CND model for graph G′ (line 8), which is the most time consuming part of the algorithm.
Figure 1 illustrates Algorithm 2 for a graph with 9 nodes shown in (A), a given fiber budget, c = 2 critical nodes and function f 1 . Initially, the compact CND model is solved for the initial graph to compute its critical nodes, highlighted in red in (A). Removing the critical nodes results in the surviving network in (B) with m K = 2 network components (with 3 and 4 nodes, respectively) and the selected link is the shortest one (recall that f 1 (i, j) = l ij ) that connects both components, highlighted in dashed blue in (B).
This link is added to the topology, resulting in the upgraded graph in (C), and the optimal critical node set is recomputed. This process is repeated once more, obtaining the upgraded graph in (E). Then, by removing its critical nodes from the graph, there is no candidate link to be added within the remaining fiber budget. At this stage, the removing phase starts and the first added link, highlighted in dashed red in (F), is now removed because the resulting topology in (G) has the same resiliency value as the one in (E). This removal increases the available fiber budget B R . The adding phase runs again which now adds a new link, highlighted in dashed blue in (H), obtaining the topology in (I). In this example, this last topology has maximal resiliency, that is, by removing its critical set of 2 nodes, the surviving network in (J) is fully connected.
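The shortcut used in the removing phase (testing the stored critical node sets before solving the compact CND model again) can be sketched as follows. The function names and the 4-node instance are hypothetical, and the stored sets are assumed to be non-empty.

```python
def connected_pairs(nodes, links, removed):
    """Number of connected node pairs after deleting the `removed` nodes."""
    parent = {v: v for v in nodes if v not in removed}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for i, j in links:
        if i in parent and j in parent:
            parent[find(i)] = find(j)
    sizes = {}
    for v in parent:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())

def must_solve_cnd(nodes, links, link, stored_sets, z):
    """Cheap test before solving the CND model for the graph without `link`.

    Returns False when some stored critical set already drives the value
    below z, which proves that `link` cannot be removed.
    """
    trial = [e for e in links if e != link]
    worst = min(connected_pairs(nodes, trial, K) for K in stored_sets)
    return worst >= z

# Hypothetical 4-node cycle with chord (1, 3); with c = 1 its value is z = 3.
nodes = [1, 2, 3, 4]
links = [(1, 2), (2, 3), (3, 4), (1, 4), (1, 3)]
stored = [{2}, {3}]
print(must_solve_cnd(nodes, links, (1, 3), stored, z=3))   # True
print(must_solve_cnd(nodes, links, (1, 2), stored, z=3))   # False
```

For link (1, 2), the stored set {3} already isolates node 2 in the trial graph, so the expensive CND solve is skipped; for the chord (1, 3) the stored sets are inconclusive and the model has to be solved.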
In Algorithm 2, each added link takes into account the set of critical nodes given by the optimal solution of the compact CND model. A potentially better algorithm is, for each partial topology, to consider the S best critical node sets, with S > 1, as in Algorithm 4.
Algorithm 4.
7: Let z* be the optimal value and (u*, v*) be the optimal solution. Set z ← z*
9: Compute the m_K components of graph (N∖K, (E ∪ E′)_K)
10: if m_K ≥ 2 and min_{(i,j)} {l_ij : cp_K(i, j) = 1} ≤ B_R then
11:   Initialize s ← 1 and c_ij ← 0, for all candidate links (i, j) ∉ E ∪ E′ such that l_ij ≤ B_R
12:   while s ≤ S and m_K ≥ 2 do
13:     for all candidate links (i, j) connecting two components of the current surviving graph do
14:       c_ij ← c_ij + (z/z*)
15:     end for
16:     if s < S then
17:       Add constraint excluding set K from the feasible solutions of the compact CND model
18:       Solve compact CND model. Let z* be the optimal value and (u*, v*) be the optimal solution.
20:       Compute the m_K components of graph (N∖K, (E ∪ E′)_K)
21:     end if
22:     s ← s + 1
23:   end while
24:   Select the link (i, j) ∉ E ∪ E′ with the largest value of c_ij / f(i, j)
In line 11 of Algorithm 4, a set of variables c ij is initialized to zero. These variables count, in a weighted manner, the number of times that each candidate link (i, j) ∉ E ∪ E′ connects two components in the S alternative surviving graphs, that is, the surviving graphs of the S best CND solutions. Note that in line 7, z is set to the solution value of the first compact CND (the lowest value among all S surviving graphs). Then, for each candidate link (i, j) that connects two components of the current surviving graph (line 13), c ij increases by z/z*, where z* is the solution value of the current surviving graph. Note that the ratio z/z* is always less than or equal to 1. So, the idea behind this weighted sum is to give a lower weight in the link selection to the links connecting two components of the surviving graphs whose CND solution values are worse. Finally, in line 24, the link (i, j) ∉ E ∪ E′ with the largest value of c ij / f (i, j) is selected (as before, f (i, j) is set to either f 1 (i, j) or f 2 (i, j)).
While the best CND solution of each partial topology is computed in line 6 of Algorithm 4, the additional alternative CND solutions are computed in the loop of lines 12-23.
This loop keeps computing compact CND solutions until either it has run S times or m K = 1 (line 12), that is, the current surviving graph does not have multiple components. The condition in line 16 ensures that the compact CND model is optimized at most S times.
Finally, given a set of critical nodes K, the added constraint (line 17) excludes this set from the set of feasible solutions of the compact CND model so that, when it is solved again (line 18), it will result in the next best set of critical nodes.
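The form of the exclusion constraint is not shown here. One common way to exclude a single binary solution, given only as an assumed illustration and not necessarily the constraint used in Algorithm 4, is a "no-good" cut over the v variables of the current critical node set K:

```latex
% Assumed illustrative form: forbids selecting all nodes of K simultaneously.
\sum_{k \in K} v_k \le |K| - 1
```

With |K| = c and Constraint (2) limiting the selection to at most c nodes, such a cut would remove exactly the solution K while leaving every other set of critical nodes feasible.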
Note that the S sets of critical nodes of each partial topology are computed in increasing order of their A2TR value (i.e., the number of connected node pairs). There might exist multiple optimal CND solutions, that is, solutions after the first one whose value z* is equal to z. Since these solutions are more damaging than the subsequent ones, a meaningful alternative is to consider only the optimal CND solutions and ignore the remaining ones. This is easily done by adding in line 12 of Algorithm 4 a third condition in the form z* = z. In the computational results, we also test this algorithm variant.
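The weighted link scoring of Algorithm 4 can be sketched in Python. As a simplifying assumption, the S best critical node sets are obtained by sorting a brute-force enumeration rather than by repeatedly solving the compact CND model with exclusion constraints; f(i, j) = l ij is used, and all names and the 5-node instance are hypothetical.

```python
from itertools import combinations

def components(nodes, links, removed):
    """Map each surviving node to a component representative (union-find)."""
    parent = {v: v for v in nodes if v not in removed}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for i, j in links:
        if i in parent and j in parent:
            parent[find(i)] = find(j)
    return {v: find(v) for v in parent}

def pairs(comp):
    sizes = {}
    for root in comp.values():
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())

def link_scores(nodes, links, length, budget, c, S):
    """Weighted count c_ij of how often each candidate joins two components."""
    value = lambda K: pairs(components(nodes, links, set(K)))
    ranked = sorted(combinations(nodes, c), key=value)   # best sets first
    z = value(ranked[0])
    score = {e: 0.0 for e, l in length.items() if e not in links and l <= budget}
    for K in ranked[:S]:
        comp = components(nodes, links, set(K))
        if len(set(comp.values())) < 2:
            break                       # surviving graph connected: m_K = 1
        z_star = pairs(comp)
        weight = 1.0 if z_star == 0 else z / z_star      # z / z* <= 1
        for i, j in score:
            if i in comp and j in comp and comp[i] != comp[j]:
                score[i, j] += weight
    return score

nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (2, 3), (3, 4), (4, 5)]
length = {(1, 3): 5, (3, 5): 4, (1, 5): 9, (2, 4): 3}
score = link_scores(nodes, links, length, budget=10, c=1, S=3)
best = max(score, key=lambda e: score[e] / length[e])    # largest c_ij / f1(i, j)
print(best)   # (2, 4)
```

Link (1, 5) collects the highest raw score because it reconnects the surviving graph of all three critical sets, but dividing by its length favors the much shorter link (2, 4).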

Local search approach
In the second task, a local search algorithm is applied to the solution provided by the first task. Note that in the removing phase of Algorithm 2 and Algorithm 4, a link can only be removed if the resiliency value of the current network topology remains the same. Here, a local search algorithm (described in Algorithm 5) is proposed where each link (i, j) ∈ E′ is removed and excluded from the set of candidate links while re-running either Algorithm 2 or Algorithm 4. Removing the link from the graph decreases the resiliency value z but increases the remaining fiber budget B R , which might make it possible to find an alternative topology with a better resiliency value.
Algorithm 5.
6: Run Algorithm 2 (or 4) starting from set E′∖{(i, j)}, fiber budget B_R + l_ij and excluding link (i, j) from the set of candidate links, obtaining a new set of selected links E′_ij, its resiliency value z_ij and fiber budget B_ij
7: if z_ij > z_LS or (z_ij = z_LS and B_ij > B_LS) then
The inner loop (lines 4-10) removes each link (i, j) ∈ E′ from graph (N, E ∪ E′) and runs the greedy deterministic algorithm (Algorithm 2 or Algorithm 4) while preventing the removed link from being selected again. If two alternative topologies have the same resiliency value, the one with the highest remaining fiber budget is selected (line 7). Finally, in lines 11-13, the main variables z, E′, and B R are updated if the best alternative topology has a higher resiliency value than the current topology.
After running some computational tests, we observed that, in the network design problem, the local search approach is not worthwhile: it does not provide relevant gains in the resiliency value and the running time becomes very high. Therefore, the second task is only included in the overall algorithm of the network upgrade problem.

Transparent optical networks application
In the previous tasks, the A2TR value was computed by solving the compact CND model. So, in the network design problem, it is necessary to check if the solution provided by the previous task is optically transparent, that is, if for each node pair the optical length of the shortest path between them does not exceed the transparent reach T (this is not an issue in the network upgrade problem since the optical transparency is guaranteed by the original network topology). If not, the aim is to use the unused fiber budget to make the solution optically transparent. Moreover, it is also necessary to compute the A2TR value with the exact CND model and, if the two values are not equal, we again use the unused fiber budget to add new links so that the two values become as close as possible.
In our preliminary tests, we observed that, of the three stopping criteria used in Algorithm 2 and Algorithm 4, the most common one is the last: E′ rem = {last |E′ rem | links added to set E′ add }, that is, the last added links do not improve the A2TR value of the solution and, therefore, they are removed. Thus, the solutions tend to have a reasonable amount of available fiber budget B R .
Algorithm 6 presents a deterministic method that uses the remaining budget B R to make an input topology G = (N, E) optically transparent. Initially, the shortest path distances between all node pairs without a direct link (i.e., (i, j) ∉ E) are computed in the loop of lines 3-5.
In line 6, variable M is set to the maximum distance between all node pairs (i, j) ∉ E and (i M , j M ) is set to such a node pair. If M exceeds the transparent reach (line 7), the algorithm selects the shortest link (i, j) that, when added to the current topology, brings the distance between i M and j M within the transparent reach. If the selected link is within the available fiber budget, it is added to the topology (lines 9-11). The algorithm continues until no new link is added (line 13).
Algorithm 6.
9: if l_ij ≤ B_R then
10:   Add link (i, j) to set E
11: end if
12: end if
13: until no new link added to set E
Algorithm 6 is run when we aim to design a new topology based on the fiber budget of an existing optical network and the existing optical network is not 2-connected (in the context of transparent optical networks, a topology is 2-connected if it is optically transparent for every single node deletion). On the other hand, when the existing network topology is 2-connected, we also require the solution obtained by the network design problem to be 2-connected.
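The idea of Algorithm 6 can be sketched in Python as follows. This is a hedged illustration rather than the authors' procedure: optical distances are computed with Floyd-Warshall on Δ-augmented link lengths, the farthest pair is taken over all node pairs (adjacent pairs cannot be the maximum when it exceeds T, since l ij ≤ T), and the names, topology, lengths, budget, Δ and T are all hypothetical.

```python
from itertools import combinations

INF = float("inf")

def optical_dist(nodes, links, length, delta):
    """All-pairs optical distances (Floyd-Warshall on delta-augmented links)."""
    d = {(i, j): (0.0 if i == j else INF) for i in nodes for j in nodes}
    for i, j in links:
        d[i, j] = d[j, i] = length[i, j] + delta
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i, k] + d[k, j] < d[i, j]:
                    d[i, j] = d[i, k] + d[k, j]
    # Each path was charged one delta too many (one per edge, not per hop node).
    return {p: (v - delta if p[0] != p[1] else 0.0) for p, v in d.items()}

def make_transparent(nodes, links, length, budget, delta, reach):
    links = list(links)
    while True:
        d = optical_dist(nodes, links, length, delta)
        far = max(combinations(nodes, 2), key=lambda p: d[p])   # pair (i_M, j_M)
        if d[far] <= reach:
            return links, budget            # topology is optically transparent
        # Candidate links whose addition brings the farthest pair within reach.
        fixes = [(length[e], e) for e in combinations(nodes, 2)
                 if e in length and e not in links
                 and optical_dist(nodes, links + [e], length, delta)[far] <= reach]
        if not fixes or min(fixes)[0] > budget:
            return links, budget            # no affordable link: stop
        l, e = min(fixes)                   # shortest such link
        links.append(e)
        budget -= l

nodes = [1, 2, 3, 4]
links = [(1, 2), (2, 3), (3, 4)]
length = {(1, 2): 400, (2, 3): 400, (3, 4): 400,
          (1, 3): 500, (2, 4): 700, (1, 4): 1000}
print(make_transparent(nodes, links, length, budget=1200, delta=50, reach=1000))
# ([(1, 2), (2, 3), (3, 4), (1, 3)], 700)
```

The pair (1, 4) starts at optical distance 1300; adding link (1, 3) reduces it to 950, so the shortest fixing link is chosen even though the direct link (1, 4) would also work.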
Algorithm 7 is a generalization of Algorithm 6 aiming to use the remaining budget to turn an input topology G = (N, E) into a 2-connected topology.
In Algorithm 7, instead of computing the distances between all node pairs without a direct link in G, these distances are computed in loop 3-8 for all reduced graphs G_k, that is, the graphs without node k and its links, for all nodes k ∈ N. Then, variable M is set (line 9) to the maximum distance between all node pairs over all reduced graphs G_k, and the link (i, j) is selected as in Algorithm 6 but now on the reduced graph G_k over which the maximum distance M was found.
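The search over reduced graphs G_k can be illustrated with the short Python sketch below. It only covers the computation of M, the node pair and the removed node; the link selection and budget handling are omitted. The function names and data layout are assumptions made for this illustration, and the optical length is again taken as fiber length plus Δ per intermediate node.

```python
import math


def optical_lengths(nodes, links, delta):
    """All-pairs optical lengths: fiber length + delta per intermediate node."""
    d = {u: {v: 0.0 if u == v else math.inf for v in nodes} for u in nodes}
    for (u, v), length in links.items():
        d[u][v] = d[v][u] = min(d[u][v], length + delta)
    for k in nodes:
        for u in nodes:
            for v in nodes:
                if d[u][k] + d[k][v] < d[u][v]:
                    d[u][v] = d[u][k] + d[k][v]
    # Each link was charged one delta; a path has one fewer intermediate node.
    return {u: {v: 0.0 if u == v else d[u][v] - delta for v in nodes}
            for u in nodes}


def worst_pair_over_reduced_graphs(nodes, links, delta):
    """Return (M, (i_M, j_M), k_M) over all graphs with one node removed."""
    worst = (-math.inf, None, None)
    for k in nodes:
        rest = [n for n in nodes if n != k]
        links_k = {e: l for e, l in links.items() if k not in e}
        d = optical_lengths(rest, links_k, delta)
        for a in range(len(rest)):
            for b in range(a + 1, len(rest)):
                u, v = rest[a], rest[b]
                if (u, v) in links_k or (v, u) in links_k:
                    continue  # only pairs without a direct link
                if d[u][v] > worst[0]:
                    worst = (d[u][v], (u, v), k)
    return worst
```

Under these assumptions, a topology is 2-connected in the sense used here when the returned M does not exceed T; a value of infinity means that some single node deletion disconnects the network.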
Algorithm 7 (fragment)
   5: for all node pairs (i, j) ∉ E_k with i, j ∈ N∖{k} do
   6:   Compute shortest path p^k_ij (on graph G_k) and its length
   7: end for
   8: end for
   9: Compute M, the maximum of these lengths over all (i, j) ∉ E_k and k ∈ N, the corresponding node pair (i_M, j_M) and removed node k_M
  10: if M > T then
  11:   Compute …
        …
  14: end if
  15: end if
  16: until no new link added to set E

Finally, in Algorithm 8, we present a method that simultaneously solves the exact CND model (using Algorithm 1) and uses the remaining fiber budget B_R to move the resiliency value provided by the exact CND solution as close as possible to the resiliency value of the compact CND model.
The inputs of Algorithm 8 are the network topology G, the remaining fiber budget B_R, the target CND optimal value z and all the sets of critical nodes previously generated by the greedy deterministic algorithm (either Algorithm 2 or Algorithm 4).
Similarly to Algorithm 1, the MILP′ model is initialized without the path constraints (line 2). Then, in order to accelerate the row generation process, the path constraints associated with each previously generated set of critical nodes K are added to the MILP′ model (lines 3-11). Next, the exact CND model is solved to optimality as in Algorithm 1. In lines 15-17, if the exact resiliency value z* is lower than the target value z, the algorithm uses the remaining budget to compute new links to be added to the graph so that each component of the surviving graph induced by the optimal critical node set becomes optically transparent (this is equivalent to running Algorithm 6 on this surviving graph, line 16). Algorithm 8 repeats these steps until no new link is added (line 18), which happens either if z* = z (no need to add new links) or if no new link can be added within the remaining fiber budget.
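To make the quantity being optimized concrete, the sketch below finds a critical node set by exhaustive enumeration on a small graph. This is a swapped-in technique for illustration only: the paper solves the CND problem with MILP models and row generation, not by brute force. The sketch also counts plain connectivity and ignores the transparent reach, and all names in it are hypothetical.

```python
from itertools import combinations


def connected_pairs(nodes, links, removed):
    """Number of node pairs still connected after deleting `removed`."""
    alive = [n for n in nodes if n not in removed]
    adj = {n: set() for n in alive}
    for u, v in links:
        if u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)
    seen, pairs = set(), 0
    for start in alive:
        if start in seen:
            continue
        # Depth-first search to measure the size of this component.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            x = stack.pop()
            size += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        pairs += size * (size - 1) // 2
    return pairs


def critical_nodes(nodes, links, c):
    """Exhaustive search for a c-node set minimizing the connected pairs."""
    best = min(combinations(nodes, c),
               key=lambda K: connected_pairs(nodes, links, set(K)))
    return set(best), connected_pairs(nodes, links, set(best))
```

On a five-node path 1-2-3-4-5 with c = 1, the middle node is the critical one: its removal leaves two components of two nodes each, so only two node pairs remain connected. The enumeration grows combinatorially with |N| and c, which is why it is only useful as a check on toy instances.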

Overall algorithm
Recall that, for a given network G = (N, E) with a total fiber length L and a fiber length budget of B = L + L′, with L′ ≥ 0, the network design problem aims to identify a new network topology connecting all nodes in N whose total fiber length is not higher than B and the network upgrade problem aims to identify a set of fiber links within the budget L′ to be added to the existing topology.
So, the overall algorithm is a combination of the previous algorithms that depends on the problem type (network design or network upgrade). Algorithm 9 and Algorithm 10 describe how the different algorithms are put together to solve the network design and the network upgrade problem, respectively. As previously explained, the network design algorithm (Algorithm 9) does not include the local search algorithm (Algorithm 5) and the network upgrade algorithm (Algorithm 10) does not need to run the network validation algorithms (Algorithm 6 and Algorithm 7).
In both Algorithm 9 and Algorithm 10, we consider six algorithm variants: (i) Algorithm 2, (ii) Algorithm 4 considering S CND solutions, and (iii) Algorithm 4 considering the optimal CND solutions, each case using either f_1(i, j) or f_2(i, j) as the criterion to select each new link.

COMPUTATIONAL RESULTS
All results reported in this section were obtained using the optimization software Gurobi Optimizer version 8.0.0, with programming language Julia version 0.6.2, running on a PC with an Intel Core i7-8700, 3.2 GHz and 16 GB RAM. Following [15], we have assumed a transparent reach T = 2000 km corresponding to the use of OTU-4 lightpaths with a line rate of 100 Gbps. Moreover, we have considered Δ = 60 km. This value considers an optical node architecture with an input and an output wavelength selective switch (WSS) per fiber port and assumes an attenuation of 6.0 dB inserted by each WSS. Then, assuming that the attenuation on each WSS is the main optical degradation factor suffered by a lightpath, an optical node introduces a total of 12.0 dB, equivalent to the attenuation on a fiber of 60 km, with an attenuation of 0.2 dB/km.

The network topologies used in these computational results are Germany50 [13], PalmettoNet [19], and Missouri Network Alliance (MissouriNA) [19]. Table 1 presents their topology characteristics in terms of number of nodes |N| and fiber links |E|, total number of node pairs, minimum, average and maximum node degree, and an indication of whether the topology is 2-connected.
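The value Δ = 60 km follows directly from the figures quoted above (two WSSs per traversed node, 6.0 dB each, and a fiber attenuation of 0.2 dB/km). Written out as a worked equation:

```latex
\Delta \;=\; \frac{2 \times 6.0\ \text{dB}}{0.2\ \text{dB/km}}
       \;=\; \frac{12.0\ \text{dB}}{0.2\ \text{dB/km}}
       \;=\; 60\ \text{km}
```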
In all cases, the geographical location of nodes is publicly available but the geographical routes of fiber links are not known. So, we have considered that each (existing or possible) link follows the shortest path over the surface of a sphere representing Earth. Table 2 presents the resulting length characteristics in terms of minimum (l_min), average (l), maximum (l_max) and total link length (L), and diameter (i.e., the highest optical length among the shortest paths of all node pairs, adding Δ for each intermediate node). Note that the three topologies are optically transparent for T = 2000 km since all diameter values are below 2000 km.
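The two length notions used above can be sketched as follows. The haversine formula is one standard way to compute the shortest path over a sphere; the Earth radius of 6371 km and the function names are assumptions made for this illustration and are not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (not from the paper)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def optical_length_km(link_lengths, delta=60.0):
    """Fiber length of a path plus delta for each intermediate node."""
    if not link_lengths:
        return 0.0
    return sum(link_lengths) + delta * (len(link_lengths) - 1)
```

For instance, a path made of three 700 km links traverses two intermediate nodes, so its optical length is 2100 + 2 × 60 = 2220 km, which exceeds a transparent reach of 2000 km even though the fiber length alone is only 2100 km.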
In the computational experiments, we have considered c ∈ {2, 3, 4, 5, 6} as the number of critical nodes. For each network and each c, we started by computing a topology with a fiber budget B equal to the total fiber length L of the original topology using Algorithm 9. Then, we computed an upgraded topology for each original topology assuming a fiber budget L ′ = p × L with p = 10% and 20% using Algorithm 10. Finally, we computed a topology with a fiber budget B = L + p × L also for p = 10% and 20% using Algorithm 9. These cases are the same as the ones considered in [4] so that we can compare the efficiency of the methods proposed here with the ones proposed in [4].
In all cases and in both types of problems (network design and network upgrade), we have run the six algorithm variants (see Section 3.4). In the variants with Algorithm 4 considering S CND solutions, we present the results with S = 10 as our preliminary tests have shown that this value is a good compromise between the running time and the algorithm efficiency.
Tables 3-5 present the resiliency values of the network upgrade solutions for the three network topologies. In these tables, in addition to the number of critical nodes c, column "MS" refers to the solutions obtained by using the multi-start greedy randomized method proposed in [4]. Columns "S1" and "S2" refer to the solutions obtained by Algorithm 2 (the values 1 and 2 mean the use of function f_1(i, j) and f_2(i, j), respectively). Columns "A1" and "A2" refer to the solutions obtained by Algorithm 4 considering the optimal CND solutions and columns "M1" and "M2" refer to the solutions obtained by Algorithm 4 considering S = 10 best CND solutions. Finally, the best values of each problem instance are highlighted in bold.

Recovered rows of one of these tables (p = 10% on the left, p = 20% on the right; the c = 5 row has only six of its seven p = 20% values and the c = 6 row is missing):

  c |  MS   S1   S2   A1   A2   M1   M2 |  MS   S1   S2   A1   A2   M1   M2
  2 | 821  821  821  821  821  821  821 | 861  861  861  861  861  821  821
  3 | 616  616  616  616  616  616  616 | 709  709  709  709  709  709  744
  4 | 427  400  389  400  389  406  412 | 510  510  532  524  532  532  532
  5 | 325  333  333  337  333  325  319 | 380  380  380  380  380  380

The first and most important observation of these computational results is that, with the exception of one instance (PalmettoNet topology for c = 4 and p = 10%), the resiliency value obtained by at least one of the proposed algorithm variants is not lower (in many cases, it is significantly higher) than the resiliency value of the method proposed in [4]. This means that the best obtained upgrade topologies, in general, have a higher resiliency to critical node failures when compared to the ones provided in [4]. Additionally, the comparison of the different algorithm variants (proposed in this work) does not provide clear evidence that one of them is consistently better than the others. This means that, in practice, we might need to run all of them to compute the best upgrade solution.
Tables 6-8 present the resiliency values of the network design solutions for the three network topologies (the meaning of each column is similar to the previous tables). In the Germany50 network, the strikeout values represent invalid solutions where the algorithm variant was not able to compute a 2-connected network design solution.
Comparing the different algorithm variants with the method in [4], we observe that, in general, the network topologies obtained in this work are much more resilient to critical node failures than the ones obtained in [4]. In the Germany50 network, all invalid topologies were obtained using function f_1(i, j). So, in this case, it is preferable to use function f_2(i, j) in the link selection. The reason for the superior performance of f_2(i, j) is that it favors the selection of candidate links connecting nodes with lower degrees. So, there is a lower chance that the generated topology has leaves (nodes with degree one) and, even when this happens, Algorithm 7 needs, in general, a lower amount of fiber budget to make it 2-connected. Nevertheless, in the other networks (PalmettoNet and MissouriNA), there are some cases where the variants using f_1(i, j) provide the best resiliency results. So, as in the network upgrade problem, in the network design problem there is no clear evidence that one of the variants is consistently better than the others.

Table 9 presents the resiliency value z of the best topologies obtained for each instance presented in Tables 3-8. Rows "Original" refer to the original topologies (in column "0%") and upgraded topologies (in columns "10%" and "20%") while […] and the original topology. The resiliency gaps between the best topologies and the upgraded topologies are presented in the red and green bars for p = 10% and 20%, respectively.

First, the blue bars of Figure 2 show that the resiliency gaps are lower for Germany50 (but still significant for a number of critical nodes c ≥ 3) and very large for PalmettoNet and MissouriNA. These results reinforce the previous conclusion that Germany50 is more resilient than the others but also show that, in general, existing network topologies are not resilient to critical node failures.
Second, the resiliency gaps shown in the red bars (corresponding to topology designs with 10% more total fiber length) represent, in all cases, a significant gap reduction when compared with the blue bars. This means that for all tested instances, adding new links to an existing topology with a fiber budget of 10% enables solutions with resiliency to critical node failures much closer to a topology designed to maximize this resilience with the same amount of fiber. Third, the results of the green bars (corresponding to topology designs with 20% more total fiber length) are mixed, that is, in some cases, the additional 10% fiber budget enables a significant gap reduction while in other cases, the reduction is negligible.
Finally, we can distinguish two cases. For a number of critical nodes c ≤ 3, the additional fiber budget of 20% allows in all cases the resiliency gap to become small (below 10%). For a number of critical nodes c ≥ 5 (in the Germany50 network) and c ≥ 4 (in the less resilient PalmettoNet and MissouriNA networks), the additional fiber budget of 20% is still not enough to make the resiliency gap small. This means that more fiber links are required in the upgrade of existing networks to reach the best resiliency to a higher number of critical nodes.
For illustrative purposes, Figure 3 presents, for c = 4 critical nodes, the original topologies, the best upgraded topologies with L′ = 10% L and L′ = 20% L, and the best topologies with the same fiber budget L. To highlight the differences, links of the best topology (and of the best upgraded topologies) that are not in the original topology are highlighted in blue and, in all cases, critical nodes are represented with red squares. The analysis of these topologies shows the following.

Germany50. The critical node set splits the original network in three components (1, 10, and 35 nodes each) while it only isolates a pair of nodes from the others in the best topology. Moreover, the critical node set isolates 4 nodes from the others in the 10% upgraded topology and a pair of nodes in the 20% upgraded topology. So, an upgrade of 20% has the same resilience to 4 critical nodes as the best topology.

PalmettoNet. The critical node set splits the original network in four components (2, 6, 13 and 20 nodes each) while it splits the best topology in only two components (8 and 33 nodes each). Moreover, the critical node set splits the 10% upgraded topology in four components (2, 5, 5, and 29 nodes) and the 20% upgraded topology in two components (9 and 32 nodes). In this case, the resilience to 4 critical nodes of the 20% upgraded topology is still slightly lower than the resilience of the best topology.

MissouriNA. The critical node set splits the original network in four components (8, 16, 17 and 19 nodes each) while it splits the best topology in only two components (9 and 51 nodes each). Moreover, the critical node set splits the 10% upgraded topology in two components (24 and 36 nodes) and the 20% upgraded topology in two components (10 and 50 nodes). As in the previous case, the resilience to 4 critical nodes of the 20% upgraded topology is still slightly lower than the resilience of the best topology.
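The component sizes listed above translate directly into a number of connected node pairs: a component with s nodes contributes s(s - 1)/2 pairs. The sketch below applies this to the PalmettoNet and MissouriNA cases. It assumes that the resiliency value z counts the node pairs that remain connected after the critical nodes fail, and that the A2TR normalizes this count by the number of node pairs of the full network; both are assumptions of this illustration, although the values 427 and 532 that it yields for the PalmettoNet upgrades do appear among the reported values for c = 4.

```python
def connected_pairs(component_sizes):
    """Node pairs that remain connected, given the surviving component sizes."""
    return sum(s * (s - 1) // 2 for s in component_sizes)


def a2tr(component_sizes, c):
    """Connected pairs divided by all node pairs of the full network.

    The network size is recovered as the surviving nodes plus the c
    failed critical nodes.
    """
    n = sum(component_sizes) + c
    return connected_pairs(component_sizes) / (n * (n - 1) // 2)


# Component sizes quoted in the text for c = 4 critical nodes.
palmetto = {
    "original": [2, 6, 13, 20],
    "upgrade 10%": [2, 5, 5, 29],
    "upgrade 20%": [9, 32],
    "best": [8, 33],
}
missouri = {
    "original": [8, 16, 17, 19],
    "upgrade 10%": [24, 36],
    "upgrade 20%": [10, 50],
    "best": [9, 51],
}

for name, cases in (("PalmettoNet", palmetto), ("MissouriNA", missouri)):
    for label, sizes in cases.items():
        print(name, label, connected_pairs(sizes))
```

Under these assumptions, PalmettoNet goes from 284 connected pairs (original) to 427 and 532 with the 10% and 20% upgrades, against 556 for the best topology, and MissouriNA goes from 455 to 906 and 1270, against 1311. This matches the remark that the 20% upgrades remain slightly below the best topologies in both networks.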
Concerning the running time of the proposed algorithm variants, Table 10 presents the average running time of the network upgrade problem (Algorithm 10), among all five values c = 2, …, 6 of each problem instance. In the instance name, "Ger," "Pal," and "Mis" refer to the Germany50, PalmettoNet, and MissouriNA networks, respectively, while the "10" and "20" refer to the fiber budget L′ = 10% L and 20% L, respectively.
Similarly, Table 11 presents the average running time of the network design problem (Algorithm 9), among all five values c = 2, …, 6 of each instance. In the instance name, the "0," "10," and "20" refer to the fiber budget B = L + 0% L, B = L + 10% L, and B = L + 20% L, respectively.

The analysis of the running times of Tables 10 and 11 lets us draw the following conclusions. First, all algorithm variants have higher running times when the problems consider higher fiber budget values. This was expected since more links are added with a higher fiber budget and, therefore, the algorithms run a larger number of iterations.
Second, even without using the local search algorithm (Algorithm 5) in the network design problem, this problem has much longer running times than the network upgrade problem. The main reason is that the network upgrade problem starts with a fixed set of fiber links (the links of the original topology), while the network design problem has to build a solution from scratch.
Third, note that at each iteration, Algorithm 2 solves a single CND model, Algorithm 4 considering the optimal CND solutions solves a variable number of CND models, and Algorithm 4 considering S CND solutions solves an even higher number of CND models (in our case, 10 CND models, as we consider S = 10). Moreover, solving the CND models is the most time-consuming part of all algorithms. As a consequence, the running times of both variants of Algorithm 4 are higher, on average, than the running times of Algorithm 2, and the running times of Algorithm 4 considering S CND solutions are higher, on average, than those of Algorithm 4 considering the optimal CND solutions. Note that the use of f_1(i, j) or f_2(i, j) as the criterion to select each new link does not have a significant impact on the obtained running times.

Another aspect of interest is the comparison of the node degree distributions between the original topologies and the best topologies with the same total fiber length L. Figure 4 shows these distributions for the three network cases with the best topologies obtained for c = 4 critical nodes (original topologies in blue and best topologies in green).
For example, in the original Germany50 topology, there are 10 nodes with the minimum degree of 2 and 11 nodes with the maximum degree of 5, while in the best topology all nodes have a degree between 3 and 4. In the other two networks, we observe from the original topology to the best topology that the number of nodes with degree 1 and 2 decreases and the maximum network degree also decreases from 5 to 4 in both cases.
So, the conclusion is that in the best topologies, there is a decrease in the number of nodes with the lowest and highest degrees and an increase in the number of nodes with degrees closer to the average. This observation also stands in the best topologies for the other values of c showing that resilient topologies tend to have more homogeneous node degrees.

CONCLUSIONS
In this work, we have addressed the topology design of transparent optical networks aiming to maximize their resilience to the simultaneous failure of their critical nodes. We have proposed different algorithm variants of a deterministic method that can be used both in the design of network topologies and in the upgrade of existing topologies. We have run the proposed algorithms on three network topologies with publicly available information, comparing the resiliency gap between the existing and upgraded topologies and the best topologies designed to maximize their resilience with the same fiber budget.
The results have shown that the resiliency gap of existing topologies is large but that network upgrades with L′ = 10% L can already reduce this gap significantly, provided that such upgrades are aimed at maximizing the network resiliency to critical node failures. Finally, comparing the best topologies with the existing ones, the best topologies are characterized by a decrease in the number of nodes with the lowest and highest degrees and an increase in the number of nodes with degrees closer to the average node degree. This shows that network topologies resilient to critical node failures tend to have more homogeneous degrees among all their nodes.