Internet DRAFT - draft-choudhury-manral-flooding-simulation
draft-choudhury-manral-flooding-simulation
Internet Engineering Task Force Gagan L. Choudhury
Internet Draft AT&T
Expires in May, 2003
draft-choudhury-manral-flooding-simulation-00.txt
Vishwas Manral
Netplane Systems
November, 2002
LSA Flooding Optimization Algorithms and Their Simulation Study
Status of this Memo
This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Distribution of this memo is unlimited.
Abstract
The full flooding of LSAs in OSPF may cause large CPU and memory
consumption at node processors of a network with large number of nodes,
links, adjacencies per node and LSDB size. An LSA storm, defined as
the near-simultaneous update of a large number of LSAs, in such
networks may cause network instability and outage.
We do a simulation study of four alternative algorithms to full
flooding to determine their ability to handle large LSA storms.
Algorithm 2 does full flooding but in case two neighbors are connected
by multiple interfaces, flooding is done over only one such interface.
The other algorithms are based on Algorithm 2 and employ
further flooding restrictions. In Algorithm 3 each node asks only
a subset of its one-hop neighbors, known as multipoint relays, to
flood further. In Algorithm 4 flooding is done only over a
Choudhury et. al. [Page 1]
Internet Draft Flooding Simulation May, 2003
minimum spanning tree. Algorithm 5 uses full flooding (as in Algorithm
2) for LSAs carrying intra-area topology information and restricted
flooding over a minimum spanning tree (as in Algorithm 4) for other
LSAs.
In terms of handling large LSA storms and in terms of robustness under
link failures, it appears that Algorithms 2 and 5 are the most
desirable ones and it is proposed that they be pursued further.
1. Introduction
In OSPF, LSAs (Link-State-Advertisements) received at a node
(router) are flooded over all interfaces except the one it came over.
This full flooding adds a lot of redundancy and ensures that LSAs do
reach all intended destinations reliably and quickly. However, this
may also consume a large amount of CPU and memory resources at the
node processors of a large network (largeness in terms of number of
nodes, number of links, adjacencies per node, and LSA Database size).
This is particularly true in the case of a LSA storm, which we define
as the simultaneous or near-simultaneous update of a large number of
LSAs. An LSA storm may be caused by (a) one or more link failures due
to fiber cuts, (b) one or more node failures for some reason, e.g.,
software crash or some type of disaster in an office complex hosting
many nodes, (c) requirement of taking down and later bringing back
many nodes during a software/hardware upgrade, (d) near-
synchronization of the once-in-30-minutes refresh instants of a subset
of the LSAs, (e) refresh of all LSAs in the system during a change in
software version.
If the CPU utilization at a node stays high for a
long time then that may result in many retransmissions (retransmission
timer is typically 5 seconds) and may also result in links being
declared down due to missed Hello packets (Router-Dead interval is
typically 40 seconds). The above events would be reinforced in the
case of heavy memory utilization and dropped packets. If a link is
declared down then that would generate additional LSAs to carry the
link-down information and if it results in rerouting of MPLS LSPs then
that may generate further LSAs carrying traffic-engineering information.
Eventually when the link comes back up, further LSAs would be generated
again to carry the link-up information and potentially LSAs carrying
traffic-engineering information due to a possible reroute of MPLS LSPs.
The retransmissions and additional LSA generations result in further
CPU and memory usage, essentially causing a positive feedback loop.
We define the LSA storm size as the number of LSAs in the original
storm and not counting any additional LSAs resulting from the feedback
loop described above. If the LSA storm is too large then the positive
Choudhury et. al. [Page 2]
Internet Draft Flooding Simulation May, 2003
feedback loop mentioned above may be large enough to indefinitely
sustain a large CPU and memory utilization at many network nodes,
thereby driving the network to an unstable state.
In the past, network
outage events have been reported in IP and ATM networks using
link-state protocols such as OSPF, IS-IS, PNNI or some proprietary
variants. See, for example [Ref1-Ref4]. In many of these examples,
large scale flooding of LSAs or other similar control messages (either
naturally or triggered by some bug or inappropriate procedure) have
been partly or fully responsible for network instability and outage.
In this contribution we consider several alternatives to the basic full
flooding algorithm in order to significantly increase the LSA storm
threshold that can cause network instability. Ideally, the LSA storm
threshold should be the same as the full LSDB (LSA Database) size since
normally this is the upper limit of the number of LSAs that can be
present in a LSA storm.
One of the algorithms we consider is similar in spirit to flooding
optimization mechanisms proposed in the past [Ref5,Ref6]. Also, our
work is synergistic with a broader set of scalability and stability
improvement proposals. [Ref7] proposes a mechanism for greatly
reducing LSA refreshes in stable topologies. [Ref8] proposes
prioritization of certain critical OSPF packets (e.g., Hellos and
LSA Acknowledgements) in order to ensure that links are not declared
down due to missed hellos and retransmission traffic is reduced
during periods of congestion. [Ref9] proposes a wide range of
congestion control and failure recovery mechanisms.
2. The Flooding Algorithms
All algorithms we describe in this Section are applicable to a single
OSPF Area. If there are multiple areas then the same flooding
algorithm may be run independently and in parallel within each area.
Algorithm 1 (Full Flooding): LSAs are flooded over all interfaces just
as in RFC 2328.
Algorithm 2 (Flooding over one of many parallel links between
neighbors): If two routers are connected by multiple parallel point-
to-point links then flooding is done over only one such link. If the
link designated for flooding goes down and at least one other parallel
link is still up then a different parallel link is designated for
flooding. Mechanisms similar in spirit to this were proposed in
[Ref5,Ref6]. PNNI, a Link-state protocol
Choudhury et. al. [Page 3]
Internet Draft Flooding Simulation May, 2003
used in ATM networks also uses a similar mechanism [Ref10].
(It is also to be
noted that other variants are possible. For example, LSAs may be
scheduled for transmission over all interfaces to a neighbor but with
random delays and as soon as an acknowledgement is received, the
remaining transmissions are disabled. For the purpose of simulation
we stick to the basic mechanism described by the first two
sentences of the algorithm description)
Algorithm 3 (Flooding over one of many parallel links between
neighbors plus Multipoint Relays): In addition to the flooding
restrictions in Algorithm 2 (i.e., if there are multiple parallel
links to an immediate neighbor then only one is used for flooding),
we also introduce a further restriction by using the distributed
multipoint relay mechanism. For a given node x, we define N1(x) as
the set of its immediate or one-hop neighbors. We also define N2(x),
the set of two-hop neighbors of x, as the set of nodes that are not
a one-hop neighbor of x but are a one-hop neighbor of at least one
member of N1(x). A subset of N1(x) whose one-hop neighbors include
all two-hop neighbors of x, i.e., all members of N2(x), is defined
as M(x), the "Multipoint Relay Set" of node x.
As node x floods an
LSA to its one-hop neighbors, it asks only the multipoint relays to
flood further. A one-hop neighbor of x that is not one of its
multipoint relays, i.e., does not belong to M(x), would receive the
LSA and acknowledge it but would not retransmit it any further. A
one-hop neighbor of x that is one of its multipoint relays, i.e.,
belongs to M(x), would flood further using the same algorithm, i.e.,
it will flood to all its one-hop neighbors (except Node N1) and ask
only its multipoint relays to flood further. This is a distributed
algorithm in the sense that each node determines the set of its
multipoint relays independent of the selection process at any other
node.
If for all node x, the set of multipoint relays is the same
as the set of all one-hop neighbors then Algorithm 3 would be
identical to Algorithm 2. In general, however, it would be possible
to choose a multipoint relay set that is smaller than the set of all
one-hop neighbors. Finding an optimal multipoint relay set has large
computational complexity and so we choose the following near-optimal
heuristic algorithm.
Step 1: Include a one-hop neighbor (say y) of x as part of M(x) if
there exists at least one member of N2(x) who has y as its only one-hop
neighbor in the set N1(x). Keep repeating this process until no more
one-hop neighbor y with above-mentioned property exists.
Choudhury et. al. [Page 4]
Internet Draft Flooding Simulation May, 2003
Step 2: If all nodes in N2(x) are covered as one-hop neighbors of
members of M(x) then stop. Otherwise, choose a member of N1(x) - M(x)
as a new member of M(x) which covers as one-hop neighbors the maximum
number of previously uncovered members of N2(x). Keep repeating this
step until all members of N2(x) are covered at which point the
selection of the multipoint relay set M(x) is complete.
For every node x in the network, perform the multipoint relay set
selection process in parallel and independently.
The multipoint relay
mechanism has been proposed for use in Mobile Wireless networks
[Ref11].
Algorithm 4 (Flooding over one of many parallel links between neighbors
plus flooding over a minimum spanning tree): In addition to the
flooding restrictions in Algorithm 2 (i.e., if there are multiple
parallel links to an immediate neighbor then only one is used for
flooding), we also introduce a further restriction by developing a
minimum spanning tree that connects all nodes and flooding only over
the minimum spanning tree. It is possible to create a tree spanning
all network nodes in a given area in many ways.
We create a spanning
tree with the objectives of minimizing the maximum delay between any
two network nodes in a given area and restricting the adjacency at a
node in the spanning tree to no more than a certain maximum value
adjmax. The first objective is to minimize the LSA convergence time
and the second objective is to reduce the impact of LSA storm on nodes
since the node with the highest adjacency is impacted the most. For
delay between two nodes we use just the sum of the propagation and
processing delays along the spanning tree between the source and the
destination. Other delay components such as queueing delays and
scheduling delays are not considered since they are dependent on the
traffic load, exact LSA generation time, etc. and difficult to
estimate for the purpose of optimization (these quantities are
nevertheless computed exactly as part of the simulation model to be
described later). Note that only a subset of the adjacencies are on
the spanning tree (also no more than one between immediate neighbors)
and only the sum of those is not to exceed adjmax.
We develop the following heuristic algorithm. Build the spanning tree
sequentially starting from a single node that is nearest to the
geographic center of the network area. In each stage add that link
and associated node to the tree that would minimize the maximum delay
(along the spanning tree) from the new node to any other node in the
existing spanning tree but with the restriction that the adjacency of
any node cannot exceed adjmax. (We have seen that a spanning tree
generated using this algorithm can have significantly lower maximum
Choudhury et. al. [Page 5]
Internet Draft Flooding Simulation May, 2003
end-to-end delay between nodes compared to a randomly generated
spanning tree with the same adjmax value).
After the formation of a spanning tree if a link on the spanning tree
goes down then it is replaced with a parallel link between the same
set of nodes, if any such exists. If no such parallel link exists then
the spanning tree is to be recomputed. The spanning tree
re-computation is also needed in the case of failure of a node on the
spanning tree. The spanning tree computation/re-computation may be
done independently at each node in a distributed fashion but using the
same algorithm, or it may be computed at one node and flooded to the
whole network periodically using Algorithm 2.
Algorithm 5 (Flooding over one of many parallel links between neighbors
for all LSAs plus flooding over a minimum spanning tree for LSAs that
do not carry intra-area topology information): This algorithm is a
hybrid between Algorithms 2 and 4. For LSAs that carry intra-area
topology information (e.g., Router and Network LSAs), flooding is done
as in Algorithm 2. We call these Type A LSAs. For LSAs that do not
carry area topology information, flooding is done as in Algorithm 4
and we call them Type B LSAs. It is to be noted that since Type A and
Type B LSAs are flooded differently, they should not be packed as part
Of the same LSU. Examples of Type B LSAs include ASE LSA,
summary LSA and ASBR-summary LSA. Another example may be Opaque LSAs
used to carry link bandwidth information as used in the Traffic
Engineering extension of OSPFV2 [Ref12].
The reason for introducing Algorithm 5 is that
in order to compute the minimum spanning tree (and re-compute it
under link/node failure) we need the intra-area topology information.
If we also rely on the minimum spanning tree to get the topology
information, as in Algorithm 4, then it may be difficult to get an
accurate picture of the topology under the failure of the minimum
spanning tree. In Algorithm 5, however, the topology information that
is needed to compute/re-compute the minimum spanning tree is
distributed using the more robust Algorithm 2 and so even if some
links/nodes of the minimum spanning tree fails but the rest of the
network remains connected then the topology information would be
accurately carried to all network nodes in the area and it would be
possible to re-compute the minimum spanning tree.
On the other hand,
Algorithm 4 is less message intensive compared to Algorithm 2 and Type
B LSAs can benefit from this reduced amount of messaging. In many
situations Type B LSAs may be many more in number compared to Type A
LSAs. In one example there may be many more ASE, Summary and
ASBR-Summary LSAs compared to Router and Network LSAs. In another
example, if a traffic-engineering-related Opaque LSA is used for each
Choudhury et. al. [Page 6]
Internet Draft Flooding Simulation May, 2003
direction of each link of the network then there would typically be
many more LSAs of this type compared to Router and Network LSAs.
Algorithm 5 has the same amount of robustness as Algorithm 2 for
Carrying topology-related information but it can be far less
message-intensive if majority of the LSAs are of Type B.
3. The Network under Simulation
We generate a random network over a rectangular grid using a modified
version of Waxman's algorithm [Ref13] that ensures that the network is
connected and has a pre-specified number of nodes, links, maximum
number of neighbors per node,
and maximum number of adjacencies per node. The rectangular grid
resembles the continental U.S.A. with maximum propagation delay of
30 ms in the East-West direction and maximum propagation delay of
15 ms in the North-South direction. We consider two different
network sizes as explained in Section 4.
The network has a flat, single-area topology.
Each node is a Router and each link is a point-to-point link
connecting two routers.
We assume that node CPU and memory (not the link bandwidth) is the main
bottleneck in the LSA flooding process. This will typically be true
for high speed links (e.g., OC3 or above) and/or links where OSPF
traffic gets an adequate Quality of Service (QoS) compared to other
traffic.
Different Timers (usually default values): LSA refresh interval =
1800 seconds, Hello refresh interval = 10 Seconds, Router-Dead
interval = 40 seconds, LSA retransmission interval = 5 Seconds (note
that a retransmission is disabled on the receipt of either an
explicit acknowledgment or a duplicate LSA over the same interface
that acts as an implicit acknowledgment),
Minimum Time Between generation of the same LSA = 5 Seconds, Minimum
time between successive Dijkstra SPF calculations is 1 second.
Packing of LSAs: It is assumed that for any given node, the LSAs
generated over a 1-second period are packed together to form an LSU
but no more than 3 LSAs are packed in one LSU.
LSU/Ack/Hello Processing Times: All processing times are expressed in
terms of the parameter T. Different values of T in the range of 0.5 ms
to 2 ms are considered.
In the case of a dedicated processor for processing OSPF packets the
Choudhury et. al. [Page 7]
Internet Draft Flooding Simulation May, 2003
processing time reported represents the true processing time. If the
processor does other work and only a fraction of its capacity can be
dedicated to OSPF processing then we have to inflate the processing
time appropriately to get the effective processing time and in that
case it is assumed that the inflation factor is already taken into
account as part of the reported processing time.
The fixed time to send or receive any LSU, Ack or Hello packet is T.
In addition, a variable processing time is used for LSU and Ack
depending on the number and types of LSAs packed. No variable
processing time is used for Hello.
Variable processing time per Router LSA is (0.5 + 0.17L)T where L is
the number of adjacencies advertised by the Router LSA. For other
LSA types (e.g., ASE LSA or a "Link" LSA carrying traffic engineering
information about a link), the variable processing time per LSA is
0.5T.
Variable processing time for an Ack is 25% that of the corresponding
LSA.
It is to be noted that if multiple LSAs are packed in a single LSU
packet then the fixed processing time is needed only once but the
variable processing time is needed for every component of the packet.
LSU/Ack/Hello Priority: All LSUs/Acks/Hellos received at a node are
queued at the same priority and processed in a FIFO manner. Any
packets generated internally to a node and usually based on a timer
are processed at a higher priority. This includes the initial LSA
storm, LSA refresh, Hello refresh, LSA retransmission and new LSA
generation after detection of a failure or recovery.
Buffer Size for Incoming LSUs/Acks/Hellos: Buffer size is assumed
to be 2000 packets where a packet is either an incoming LSU, Ack or
Hello. The buffer is managed in a FIFO manner and when the buffer is
full, new packet arrivals are dropped.
Processing Time for MPR/MST computation at a node (Algorithms
3, 4 and 5): 10 ms. This is needed once in the beginning and
otherwise only when the last live link between a pair of nodes in the
network goes down.
LSA Refresh: Each LSA is refreshed once in 1800 seconds and the
refresh instants of various LSAs in the LSDB are assumed to be
uniformly distributed over the 1800 seconds period, i.e., they are
completely unsynchronized. If however, an LSA is generated as part
of the initial LSA storm then it goes on a new refresh schedule of
once in 1800 seconds starting from its generation time.
LSA Storm Generation: As defined earlier, "LSA storm" is the
Choudhury et. al. [Page 8]
Internet Draft Flooding Simulation May, 2003
simultaneous or near simultaneous generation of a large number of LSAs.
In the case of only Router and ASE LSAs we normally assume that the
number of ASE LSAs in the storm is about 4 times that of the Router
LSAs, but the ratio is allowed to change if either the Router or the
ASE LSAs have reached their maximum possible value. In the case of
only Router and Link LSAs (carrying traffic engineering information)
we normally assume that the number of Link LSAs in the storm is about
4 times that of the Router LSAs, but the ratio is allowed to change
if either the Router or the Link LSAs have reached their maximum
possible value. For any given LSA storm we keep generating LSAs
starting from Node index 1 and moving upwards and stop until the
correct number of LSAs of each type have been generated. The LSAs
generated at any given node is assumed to start at an instant
uniformly distributed between 20 and 30 seconds from the start of the
simulation. Successive LSA generations at a node are assumed to be
spaced apart by 400 ms. It is to be noted that during the period of
observation there are other LSAs generated besides the ones in the
storm. These include refresh of LSAs that are not part of the storm
and LSAs generated due to possible link failures and subsequent
possible link recoveries.
Failure/Recovery of Links: If no Hello is received over a link (due to
CPU/memory congestion) for longer than Router-Dead Interval then
the link is declared down. At a later time, if Hellos are received
then the link would be declared up. Whenever a link is declared
up or down, one Router LSA is generated by each Router on the
two sides of the point-to-point link. If "Link LSAs" carrying
traffic engineering information is used then it is assumed that each
Router would also generate a Link LSA. In this case it is also assumed
that due to rerouting of LSPs, one other link in the network
(selected randomly in the simulation) would have significant change
in reserved bandwidth which would result in one Link LSA being
generated by the routers on the two ends of the link.
4. Simulation Results
In this section we study the relative performance of the five
algorithms with a range of Network sizes, LSA types, and processing
time values as explained below:
Network size: Two networks are considered. Network 1 has 100 nodes,
1200 links, maximum number of neighbors per node is 30 and maximum
number of adjacencies per node is 50 (same neighbor may have more
than one adjacencies). Network 2 has 50 nodes, 600 links, maximum
number of neighbors per node is 25 and maximum number of adjacencies
per node is 48. Dijkstra SPF calculation time for Network 1 is assumed
to be 100 ms and that for Network 2 is assumed to be 70 ms.
Choudhury et. al. [Page 9]
Internet Draft Flooding Simulation May, 2003
LSA Type: Each node has 1 Router LSA (Total of 100 for Network 1 and
50 for Network 2). There are no Network LSAs since all links are point-
to-point links and no Summary LSAs since the network has only one area.
Regarding other LSA types we consider two scenarios. In Scenario 1 we
assume that there are no ASE LSAs and each link has one "Link" LSA
carrying traffic engineering information (Total of 2400 for Network 1
and 1200 for Network 2). In Scenario 2 we assume that there are no
"Link" LSAs and half of the nodes are ASA-Border nodes and each border
node has 10 ASE LSAs (Total of 500 for Network 1 and 250 for
Network 2). We identify Scenario 1 as "Link LSAs" and Scenario 2 as
"ASE LSAs".
Processing time values: Processing times for LSUs, Acks and Hello
packets have been previously expressed in terms of a common parameter
T. Three values are considered for T, which are 1 ms, 0.5 ms and 2 ms
respectively.
Based on Network size, LSA type and Processing time values we develop
5 Test cases as follows:
Case 1: Network 1, Link LSAs and T = 1 ms.
Case 2: Network 1, ASE LSAs and T = 1 ms.
Case 3: Network 1, Link LSAs and T = 0.5 ms.
Case 4: Network 1, Link LSAs and T = 2 ms.
Case 5: Network 2, Link LSAs and T = 1 ms.
For each case and for each algorithm we study the network stability as
=========|==========================================================
| Number of Non-Converged LSUs in the Network at Time(in sec)
LSA |
STORM |==========================================================
SIZE |10s 20s 30s 35s 40s 50s 60s 80s 100s
=========|==========================================================
50 | 0 0 9 10 0 0 0 0 1
(Stable)|
---------|----------------------------------------------------------
100 | 0 0 24 29 26 21 8 1 1
(Stable)|
---------|----------------------------------------------------------
130 | 0 0 31 45 42 41 38 127 237
(Unstable)
=========|==========================================================
Table 1: Network Stability Vs. LSA Storm (Case 1, Algorithm 1)
Choudhury et. al. [Page 10]
Internet Draft Flooding Simulation May, 2003
a function of the size of the LSA storm. The stability is determined
by looking at the number of non-converged LSUs as a function of time.
An example is shown in Table 1 for Case 1 and Algorithm 1.
The LSA storm starts a little after 20 seconds and so for some period
of time after that the
number of non-converged LSUs should stay high and then come down for a
stable network. This happens for LSA storms of sizes 50 and 100. With
an LSA storm of size 130, the number of non-converged LSAs stay high
indefinitely due to repeated retransmissions, link failures due to
missed Hellos for more than the Router-Dead interval which generates
additional LSAs and also due to subsequent link recoveries which again
generate additional LSAs. It turns out that for this example the
maximum LSA storm size that still keeps the network stable is 115.
|===========|========================================================|
| | Maximum Allowable LSA Storm Size For |
|Algorithm |==========|==========|===========|==========|===========|
| Type | Case 1 | Case 2 | Case 3 | Case 4 | Case 5 |
| |(Net. 1, | (Net. 1, | (Net. 1, | (Net. 1, | (Net. 2, |
| |Link LSAs,| ASE LSAs,| Link LSAs,|Link LSAs,| Link LSAs,|
| | T = 1 ms)| T = 1 ms)|T = 0.5 ms)| T = 2 ms)| T = 1 ms) |
|===========|==========|==========|===========|==========|===========|
|Algorithm 1| 115 | 130 | 275 | 25 | 138 |
|(Full Fld.)| | | | | |
|___________|__________|__________|___________|__________|___________|
|Algorithm 2| | | | | |
|(Fld. Over | 238 | 243 | 610 | 85 | 450 |
|One Parall.| | | | | |
| Link) | | | | | |
|___________|__________|__________|___________|__________|___________|
|Algorithm 3| | | | | |
|(Alg. 2 + | 248 | 249 | 645 | 90 | 450 |
|Multipoint | | | | | |
|Relay) | | | | | |
|___________|__________|__________|___________|__________|___________|
|Algorithm 4| | | | | |
| (Alg. 2 + | Full LSDB| Full LSDB| Full LSDB | 1825 | Full LSDB |
|Fld. Over | (2500) | (600) | (2500) | | (1250) |
|Span. Tree)| | | | | |
|___________|__________|__________|___________|__________|___________|
|Algorithm 5| | | | | |
|(Alg. 2 For| Full LSDB| Full LSDB| Full LSDB | 625 | Full LSDB |
|Type A, Alg| (2500) | (600) | (2500) | | (1250) |
|4 For Type | | | | | |
|B LSAs | | | | | |
|___________|__________|__________|___________|__________|___________|
Table 2: Maximum Allowable LSA Storm for a Stable Network
Choudhury et. al. [Page 11]
Internet Draft Flooding Simulation May, 2003
This stability threshold depends on the Case and the Algorithm under
consideration. A better algorithm should allow a higher stability
threshold.
In Table 2 we show the maximum allowable LSA storm size that would
still keep the network stable for the five different cases and for
the five different algorithms.
5. Summary of Observations
Comparison of Algorithm 2 and Algorithm 1: We observe that in all
cases in Table 2, Algorithm 2 substantially increases (usually by a
factor of two or more) the maximum LSA storm threshold of the network
compared to Algorithm 2. This implies that flooding over only one of
many parallel links between nodes has clear beneficial impact in terms
of improving network scalability and stability and ought to be pursued.
Comparison of Algorithm 3 and Algorithm 2: The Multipoint relay
mechanism used in Algorithm 3 increases the LSA storm threshold
(compared to Algorithm 2)
but by a moderate amount. One reason for this moderate improvement
is that even though the multipoint relay mechanism can significantly
reduce the overall amount of flooding in the network, it appears that
the nodes with highest number of neighbors tend to be more likely
to be chosen as multipoint relays by their neighbors. As an example,
in Network 1 (Cases 1-4 in Table 2) on the average a node chooses
70% of its neighbors as multipoint relays. However, the node with
highest number of neighbors (30), is chosen as a multipoint relay
by 27 out of the 30 (i.e., 90%) of its neighbors. This node, due
to its large number of neighbors, also happens to be the bottleneck
in terms of determining LSA storm threshold and since its effective
number of neighbors (the ones that ask it to flood further) goes
down only slightly, the overall improvement in LSA storm threshold
is moderate. In Network 2 (Case 5 in Table 2) all neighbors of the
highest adjacency node elect this node as multipoint relays and
therefore there is no improvement in LSA storm threshold.
Comparison of Algorithm 4 and Algorithm 2: Compared to Algorithm 2,
Algorithm 4 always raises the LSA storm threshold substantially and
in four out of the five cases the LSA storm threshold is raised to
its maximum value, i.e., it is OK to flood all LSAs in the LSDB over
a short period of time. However, one issue with Algorithm 4 is that
if a link on the minimum spanning tree fails, many nodes may not get
the required LSAs carrying the topology information and it would be
difficult for them to recompute the new minimum spanning tree. In the
simulation we re-computed the minimum spanning tree based on global
knowledge under failure conditions.
Comparison of Algorithm 5 and Algorithm 2: Compared to Algorithm 2,
Algorithm 5 always raises the LSA storm threshold substantially and
Choudhury et. al. [Page 12]
Internet Draft Flooding Simulation May, 2003
in four out of the five cases the LSA storm threshold is raised to
its maximum value, i.e., the full LSDB size. This Algorithm is
worse than Algorithm 4 in terms of maximum LSA storm threshold (as
shown in Case 4) but much better than Algorithms 1, 2 and 3. It is
also more robust than Algorithm 4 under link failure conditions. Even
if a link on the minimum spanning tree fails, those carrying intra-
area topology information (defined earlier as Type A LSAs) would
continue to get to all destinations since they are flooded as in
Algorithm 2 (full flooding but over only one interface between
neighbors). Once all the nodes know about the link failure condition,
they would be able to re-compute the new minimum spanning tree.
Overall Observation: It appears that Algorithm 2 (full flooding but
only over one of many parallel interfaces between neighbors) and
Algorithm 5 (flooding as in Algorithm 2 for LSAs carrying intra-area
topology information and flooding over a minimum spanning tree for
other LSAs) are the best ones and should be pursued further for
detailed specification.
6. Acknowledgements
We would like to acknowledge Jerry Ash, Margaret Chiosi, Elie Francis,
Anurag Maunder, Beth Munson, Moshe Segal, Vera Sapozhnikova, and Pat
Wirth for collaboration and encouragement in our scalability
improvement efforts for Link-State-Protocol based networks. We also
thank Aman Shaikh and Moshe Segal for comments on this draft.
7. References
[Ref1] Pappalardo, D.,"AT&T, customers grapple with ATM net outage,"
Network World, February 26, 2001.
[Ref2] "AT&T announces cause of frame-relay network outage," AT&T
Press Release, April 22, 1998.
[Ref3] Cholewka, K., "MCI Outage Has Domino Effect," Inter@ctive
Week, August 20, 1999.
[Ref4] Jander, M., "In Qwest Outage, ATM Takes Some Heat," Light
Reading, April 6, 2001.
[Ref5] A. Zinin and M. Shand, "Flooding Optimizations in Link-State
Routing Protocols," Work in Progress.
[Ref6] J. Moy, "Flooding over Parallel Point-to-Point Links," Work in
progress.
[Ref7] P. Pillay-Esnault, "OSPF Refresh and flooding reduction in
Choudhury et. al. [Page 13]
Internet Draft Flooding Simulation May, 2003
stable topologies," Work in progress.
[Ref8] G. Choudhury, V. Sapozhnikova, A. Maunder and V. Manral,
"Explicit Marking and Prioritized Treatment of Specific IGP
Packets for Faster IGP Convergence and Improved Network Scalability
and Stability," Work in Progress.
[Ref9] J. Ash, G. Choudhury, V. Sapozhnikova, M. Sherif, A. Maunder,
V. Manral, "Congestion Avoidance & Control for OSPF Networks",
Work in Progress.
[Ref10] "Private Network-to-Network Interface", ATM Forum
Specification af-pnni-0055.000, March, 1996.
[REF11] A. Qayyum, L. Viennot, A. Laouiti, "Multipoint Relaying for
Flooding Broadcast Messages in Mobile Wireless Networks," Project
Hipercom, INRIA Rocquencourt, France.
[Ref12] D. Katz, D. Yeung, K. Kompella, "Traffic Engineering
Extensions to OSPF Version 2," Work in progress.
[Ref13] B. M. Waxman, "Routing of Multipoint Connections," IEEE
Journal on Selected Areas in Communications, 6(9):1617-1622, 1988.
8. Authors' Addresses
Gagan L. Choudhury
AT&T
Room D5-3C21
200 Laurel Avenue
Middletown, NJ, 07748
USA
Phone: (732)420-3721
email: gchoudhury@att.com
Vishwas Manral
Netplane
189, Prashasan Nagar,
Road Number 72
Jubilee Hills, Hyderabad
India
email: Vishwasm@netplane.com
Choudhury et. al. [Page 14]