Internet DRAFT - draft-choudhury-manral-flooding-simulation

draft-choudhury-manral-flooding-simulation




   Internet Engineering Task Force                 Gagan L. Choudhury
   Internet Draft                                  AT&T
   Expires in May, 2003                        
   draft-choudhury-manral-flooding-simulation-00.txt         
                                                   Vishwas Manral
                                                   Netplane Systems

                                                   November, 2002


  LSA Flooding Optimization Algorithms and Their Simulation Study



Status of this Memo

   This document is an Internet-Draft and is in full conformance
   with all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
        http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
        http://www.ietf.org/shadow.html.
   Distribution of this memo is unlimited.


Abstract

The full flooding of LSAs in OSPF may cause large CPU and memory
consumption at node processors of a network with large number of nodes,
links, adjacencies per node and LSDB size.  An LSA storm, defined as  
the near-simultaneous update of a large number of LSAs, in such  
networks may cause network instability and outage.

We do a simulation study of four alternative algorithms to full 
flooding to determine their ability to handle large LSA storms.
Algorithm 2 does full flooding but in case two neighbors are connected
by multiple interfaces, flooding is done over only one such interface.
The other algorithms are based on Algorithm 2 and employ
further flooding restrictions. In Algorithm 3 each node asks only
a subset of its one-hop neighbors, known as multipoint relays, to
flood further. In Algorithm 4 flooding is done only over a


 Choudhury et. al.                                            [Page 1]

 Internet Draft             Flooding Simulation              May, 2003


minimum spanning tree. Algorithm 5 uses full flooding (as in Algorithm
2) for LSAs carrying intra-area topology information and restricted
flooding over a minimum spanning tree (as in Algorithm 4) for other
LSAs.

In terms of handling large LSA storms and in terms of robustness under
link failures, it appears that Algorithms 2 and 5 are the most 
desirable ones and it is proposed that they be pursued further.         


1. Introduction

In OSPF, LSAs (Link-State-Advertisements) received at a node 
(router) are flooded over all interfaces except the one it came over.
This full flooding adds a lot of redundancy and ensures that LSAs do 
reach all intended destinations reliably and quickly. However, this  
may also consume a large amount of CPU and memory resources at the 
node processors of a large network (largeness in terms of number of 
nodes, number of links, adjacencies per node, and LSA Database size).

This is particularly true in the case of a LSA storm, which we define 
as the simultaneous or near-simultaneous update of a large number of 
LSAs.  An LSA storm may be caused by (a) one or more link failures due 
to fiber cuts, (b) one or more node failures for some reason, e.g., 
software crash or some type of disaster in an office complex hosting 
many nodes, (c) requirement of taking down and later bringing back 
many nodes during a software/hardware upgrade, (d) near-
synchronization of the once-in-30-minutes refresh instants of a subset
of the LSAs, (e) refresh of all LSAs in the system during a change in
software version.  

If the CPU utilization at a node stays high for a 
long time then that may result in many retransmissions (retransmission 
timer is typically 5 seconds) and may also result in links being 
declared down due to missed Hello packets (Router-Dead interval is 
typically 40 seconds).  The above events would be reinforced in the 
case of heavy memory utilization and dropped packets.  If a link is 
declared down then that would generate additional LSAs to carry the 
link-down information and if it results in rerouting of MPLS LSPs then 
that may generate further LSAs carrying traffic-engineering information.  
Eventually when the link comes back up, further LSAs would be generated
again to carry the link-up information and potentially LSAs carrying 
traffic-engineering information due to a possible reroute of MPLS LSPs.

The retransmissions and additional LSA generations result in further 
CPU and memory usage, essentially causing a positive feedback loop.  
We define the LSA storm size as the number of LSAs in the original 
storm and not counting any additional LSAs resulting from the feedback 
loop described above.  If the LSA storm is too large then the positive 


 Choudhury et. al.                                            [Page 2]

 Internet Draft             Flooding Simulation              May, 2003


feedback loop mentioned above may be large enough to indefinitely 
sustain a large CPU and memory utilization at many network nodes, 
thereby driving the network to an unstable state.  

In the past, network
outage events have been reported in IP and ATM networks using 
link-state protocols such as OSPF, IS-IS, PNNI or some proprietary 
variants.  See, for example [Ref1-Ref4].  In many of these examples,
large scale flooding of LSAs or other similar control messages (either
naturally or triggered by some bug or inappropriate procedure) have
been partly or fully responsible for network instability and outage. 

In this contribution we consider several alternatives to the basic full 
flooding algorithm in order to significantly increase the LSA storm 
threshold that can cause network instability. Ideally, the LSA storm 
threshold should be the same as the full LSDB (LSA Database) size since 
normally this is the upper limit of the number of LSAs that can be 
present in a LSA storm.

One of the algorithms we consider is similar in spirit to flooding 
optimization mechanisms proposed in the past [Ref5,Ref6]. Also, our
work is synergistic with a broader set of scalability and stability 
improvement proposals. [Ref7] proposes a mechanism for greatly 
reducing LSA refreshes in stable topologies. [Ref8] proposes
prioritization of certain critical OSPF packets (e.g., Hellos and 
LSA Acknowledgements) in order to ensure that links are not declared
down due to missed hellos and retransmission traffic is reduced
during periods of congestion. [Ref9] proposes a wide range of 
congestion control and failure recovery mechanisms.   
  

2. The Flooding Algorithms

All algorithms we describe in this Section are applicable to a single 
OSPF Area.  If there are multiple areas then the same flooding 
algorithm may be run independently and in parallel within each area.


Algorithm 1 (Full Flooding): LSAs are flooded over all interfaces just
as in RFC 2328.


Algorithm 2 (Flooding over one of many parallel links between 
neighbors): If two routers are connected by multiple parallel point-
to-point links then flooding is done over only one such link.  If the 
link designated for flooding goes down and at least one other parallel 
link is still up then a different parallel link is designated for 
flooding.  Mechanisms similar in spirit to this were proposed in 
[Ref5,Ref6]. PNNI, a Link-state protocol 


 Choudhury et. al.                                            [Page 3]

 Internet Draft             Flooding Simulation              May, 2003


used in ATM networks also uses a similar mechanism [Ref10].  

(It is also to be
noted that other variants are possible.  For example, LSAs may be 
scheduled for transmission over all interfaces to a neighbor but with
random delays and as soon as an acknowledgement is received, the
remaining transmissions are disabled. For the purpose of simulation
we stick to the basic mechanism described by the first two
sentences of the algorithm description)     


Algorithm 3 (Flooding over one of many parallel links between 
neighbors plus Multipoint Relays):  In addition to the flooding 
restrictions in Algorithm 2 (i.e., if there are multiple parallel 
links to an immediate neighbor then only one is used for flooding), 
we also introduce a further restriction by using the distributed 
multipoint relay mechanism.  For a given node x, we define N1(x) as 
the set of its immediate or one-hop neighbors.  We also define N2(x), 
the set of two-hop neighbors of x, as the set of nodes that are not 
a one-hop neighbor of x but are a one-hop neighbor of at least one 
member of N1(x).  A subset of N1(x) whose one-hop neighbors include 
all two-hop neighbors of x, i.e., all members of N2(x), is defined 
as M(x), the "Multipoint  Relay Set" of node x.  

As node x floods an 
LSA to its one-hop neighbors, it asks only the multipoint relays to 
flood further.  A one-hop neighbor of x that is not one of its 
multipoint relays, i.e., does not belong to M(x), would receive the 
LSA and acknowledge it but would not retransmit it any further.   A 
one-hop neighbor of x that is one of its multipoint relays, i.e., 
belongs to M(x), would flood further using the same algorithm, i.e., 
it will flood to all its one-hop neighbors (except Node N1) and ask 
only its multipoint relays to flood further.  This is a distributed 
algorithm in the sense that each node determines the set of its 
multipoint relays independent of the selection process at any other 
node.   

If for all node x, the set of multipoint relays is the same 
as the set of all one-hop neighbors then Algorithm 3 would be 
identical to Algorithm 2.  In general, however, it would be possible 
to choose a multipoint relay set that is smaller than the set of all 
one-hop neighbors.   Finding an optimal multipoint relay set has large 
computational complexity and so we choose the following near-optimal 
heuristic algorithm.

Step 1: Include a one-hop neighbor (say y) of x as part of M(x) if 
there exists at least one member of N2(x) who has y as its only one-hop 
neighbor in the set N1(x).  Keep repeating this process until no more 
one-hop neighbor y with above-mentioned property exists.


 Choudhury et. al.                                            [Page 4]

 Internet Draft             Flooding Simulation              May, 2003


Step 2: If all nodes in N2(x) are covered as one-hop neighbors of 
members of M(x) then stop.  Otherwise, choose a member of N1(x) - M(x) 
as a new member of M(x) which covers as one-hop neighbors the maximum 
number of previously uncovered members of N2(x).  Keep repeating this 
step until all members of N2(x) are covered at which point the 
selection of the multipoint relay set M(x) is complete.
For every node x in the network, perform the multipoint relay set 
selection process in parallel and independently.  

The multipoint relay 
mechanism has been proposed for use in Mobile Wireless networks 
[Ref11].


Algorithm 4 (Flooding over one of many parallel links between neighbors 
plus flooding over a minimum spanning tree):  In addition to the 
flooding restrictions in Algorithm 2 (i.e., if there are multiple 
parallel links to an immediate neighbor then only one is used for 
flooding), we also introduce a further restriction by developing a 
minimum spanning tree that connects all nodes and flooding only over 
the minimum spanning tree.   It is possible to create a tree spanning 
all network nodes in a given area in many ways.  

We create a spanning 
tree with the objectives of minimizing the maximum delay between any 
two network nodes in a given area and restricting the adjacency at a 
node in the spanning tree to no more than a certain maximum value 
adjmax.  The first objective is to minimize the LSA convergence time 
and the second objective is to reduce the impact of LSA storm on nodes 
since the node with the highest adjacency is impacted the most.  For 
delay between two nodes we use just the sum of the propagation and 
processing delays along the spanning tree between the source and the
destination.  Other delay components such as queueing delays and 
scheduling delays are not considered since they are dependent on the 
traffic load, exact LSA generation time, etc. and difficult to 
estimate for the purpose of optimization (these quantities are 
nevertheless computed exactly as part of the simulation model to be 
described later).  Note that only a subset of the adjacencies are on 
the spanning tree (also no more than one between immediate neighbors) 
and only the sum of those is not to exceed adjmax. 

We develop the following heuristic algorithm.  Build the spanning tree
sequentially starting from a single node that is nearest to the 
geographic center of the network area.  In each stage add that link 
and associated node to the tree that would minimize the maximum delay 
(along the spanning tree) from the new node to any other node in the 
existing spanning tree but with the restriction that the adjacency of 
any node cannot exceed adjmax. (We have seen that a spanning tree
generated using this algorithm can have significantly lower maximum


 Choudhury et. al.                                            [Page 5]

 Internet Draft             Flooding Simulation              May, 2003


end-to-end delay between nodes compared to a randomly generated
spanning tree with the same adjmax value).  

After the formation of a spanning tree if a link on the spanning tree 
goes down then it is replaced with a parallel link between the same 
set of nodes, if any such exists.  If no such parallel link exists then 
the spanning tree is to be recomputed.  The spanning tree 
re-computation is also needed in the case of failure of a node on the 
spanning tree.  The spanning tree computation/re-computation may be 
done independently at each node in a distributed fashion but using the 
same algorithm, or it may be computed at one node and flooded to the 
whole network periodically using Algorithm 2.


Algorithm 5 (Flooding over one of many parallel links between neighbors 
for all LSAs plus flooding over a minimum spanning tree for LSAs that 
do not carry intra-area topology information): This algorithm is a 
hybrid between Algorithms 2 and 4.  For LSAs that carry intra-area 
topology information (e.g., Router and Network LSAs), flooding is done 
as in Algorithm 2.  We call these Type A LSAs.  For LSAs that do not 
carry area topology information, flooding is done as in Algorithm 4 
and we call them Type B LSAs. It is to be noted that since Type A and
Type B LSAs are flooded differently, they should not be packed as part
Of the same LSU.  Examples of Type B LSAs include ASE LSA, 
summary LSA and ASBR-summary LSA.  Another example may be Opaque LSAs 
used to carry link bandwidth information as used in the Traffic 
Engineering extension of OSPFV2 [Ref12].  

The reason for introducing Algorithm 5 is that 
in order to compute the minimum spanning tree (and re-compute it 
under link/node failure) we need the intra-area topology information.  
If we also rely on the minimum spanning tree to get the topology 
information, as in Algorithm 4, then it may be difficult to get an 
accurate picture of the topology under the failure of the minimum 
spanning tree.  In Algorithm 5, however, the topology information that 
is needed to compute/re-compute the minimum spanning tree is 
distributed using the more robust Algorithm 2 and so even if some 
links/nodes of the minimum spanning tree fails but the rest of the 
network remains connected then the topology information would be 
accurately carried to all network nodes in the area and it would be 
possible to re-compute the minimum spanning tree.    

On the other hand, 
Algorithm 4 is less message intensive compared to Algorithm 2 and Type 
B LSAs can benefit from this reduced amount of messaging.   In many 
situations Type B LSAs may be many more in number compared to Type A 
LSAs.  In one example there may be many more ASE, Summary and 
ASBR-Summary LSAs compared to Router and Network LSAs.  In another 
example, if a traffic-engineering-related Opaque LSA is used for each 


 Choudhury et. al.                                            [Page 6]

 Internet Draft             Flooding Simulation              May, 2003


direction of each link of the network then there would typically be 
many more LSAs of this type compared to Router and Network LSAs.   
Algorithm 5 has the same amount of robustness as Algorithm 2 for  
Carrying topology-related information but it can be far less 
message-intensive if majority of the LSAs are of Type B.



3. The Network under Simulation

We generate a random network over a rectangular grid using a modified 
version of Waxman's algorithm [Ref13] that ensures that the network is 
connected and has a pre-specified number of nodes, links, maximum 
number of neighbors per node, 
and maximum number of adjacencies per node. The rectangular grid 
resembles the continental U.S.A. with maximum propagation delay of 
30 ms in the East-West direction and maximum propagation delay of 
15 ms in the North-South direction.  We consider two different 
network sizes as explained in Section 4.

The network has a flat, single-area topology.

Each node is a Router and each link is a point-to-point link 
connecting two routers.

We assume that node CPU and memory (not the link bandwidth) is the main 
bottleneck in the LSA flooding process.  This will typically be true 
for high speed links (e.g., OC3 or above) and/or links where OSPF
traffic gets an adequate Quality of Service (QoS) compared to other
traffic.
 
Different Timers (usually default values): LSA refresh interval = 
1800 seconds, Hello refresh interval = 10 Seconds, Router-Dead 
interval = 40 seconds, LSA retransmission interval = 5 Seconds (note
that a retransmission is disabled on the receipt of either an
explicit acknowledgment or a duplicate LSA over the same interface
that acts as an implicit acknowledgment), 
Minimum Time Between generation of the same LSA = 5 Seconds, Minimum 
time between successive Dijkstra SPF calculations is 1 second.

Packing of LSAs: It is assumed that for any given node, the LSAs 
generated over a 1-second period are packed together to form an LSU 
but no more than 3 LSAs are packed in one LSU.

LSU/Ack/Hello Processing Times: All processing times are expressed in 
terms of the parameter T.  Different values of T in the range of 0.5 ms
to 2 ms are considered. 

In the case of a dedicated processor for processing OSPF packets the 


 Choudhury et. al.                                            [Page 7]

 Internet Draft             Flooding Simulation              May, 2003


processing time reported represents the true processing time.  If the 
processor does other work and only a fraction of its capacity can be 
dedicated to OSPF processing then we have to inflate the processing 
time appropriately to get the effective processing time and in that 
case it is assumed that the inflation factor is already taken into 
account as part of the reported processing time. 

The fixed time to send or receive any LSU, Ack or Hello packet is T.  
In addition, a variable processing time is used for LSU and Ack 
depending on the number and types of LSAs packed.  No variable 
processing time is used for Hello.
Variable processing time per Router LSA is (0.5 + 0.17L)T where L is 
the number of adjacencies advertised by the Router LSA.  For other 
LSA types (e.g., ASE LSA or a "Link" LSA carrying traffic engineering
information about a link), the variable processing time per LSA is 
0.5T.

Variable processing time for an Ack is 25% that of the corresponding 
LSA.

It is to be noted that if multiple LSAs are packed in a single LSU 
packet then the fixed processing time is needed only once but the 
variable processing time is needed for every component of the packet.

LSU/Ack/Hello Priority: All LSUs/Acks/Hellos received at a node are 
queued at the same priority and processed in a FIFO manner.  Any 
packets generated internally to a node and usually based on a timer 
are processed at a higher priority.   This includes the initial LSA 
storm, LSA refresh, Hello refresh, LSA retransmission and new LSA 
generation after detection of a failure or recovery.

Buffer Size for Incoming LSUs/Acks/Hellos: Buffer size is assumed 
to be 2000 packets where a packet is either an incoming LSU, Ack or 
Hello.  The buffer is managed in a FIFO manner and when the buffer is 
full, new packet arrivals are dropped.

Processing Time for MPR/MST computation at a node (Algorithms
3, 4 and 5): 10 ms.  This is needed once in the beginning and 
otherwise only when the last live link between a pair of nodes in the 
network goes down.

LSA Refresh: Each LSA is refreshed once in 1800 seconds and the 
refresh instants of various LSAs in the LSDB are assumed to be 
uniformly distributed over the 1800 seconds period, i.e., they are 
completely unsynchronized.  If however, an LSA is generated as part 
of the initial LSA storm then it goes on a new refresh schedule of 
once in 1800 seconds starting from its generation time.   

LSA Storm Generation: As defined earlier, "LSA storm" is the 


 Choudhury et. al.                                            [Page 8]

 Internet Draft             Flooding Simulation              May, 2003


simultaneous or near simultaneous generation of a large number of LSAs.
In the case of only Router and ASE LSAs we normally assume that the 
number of ASE LSAs in the storm is about 4 times that of the Router 
LSAs, but the ratio is allowed to change if either the Router or the 
ASE LSAs have reached their maximum possible value.  In the case of 
only Router and Link LSAs (carrying traffic engineering information) 
we normally assume that the number of Link LSAs in the storm is about 
4 times that of the Router LSAs, but the ratio is allowed to change 
if either the Router or the Link LSAs have reached their maximum 
possible value.  For any given LSA storm we keep generating LSAs 
starting from Node index 1 and moving upwards and stop until the 
correct number of LSAs of each type have been generated.  The LSAs 
generated at any given node is assumed to start at an instant 
uniformly distributed between 20 and 30 seconds from the start of the
simulation.  Successive LSA generations at a node are assumed to be 
spaced apart by 400 ms. It is to be noted that during the period of 
observation there are other LSAs generated besides the ones in the 
storm.  These include refresh of LSAs that are not part of the storm 
and LSAs generated due to possible link failures and subsequent 
possible link recoveries.

Failure/Recovery of Links: If no Hello is received over a link (due to
CPU/memory congestion) for longer than Router-Dead Interval then
the link is declared down.  At a later time, if Hellos are received
then the link would be declared up.  Whenever a link is declared
up or down, one Router LSA is generated by each Router on the
two sides of the point-to-point link.  If "Link LSAs" carrying
traffic engineering information is used then it is assumed that each
Router would also generate a Link LSA.  In this case it is also assumed
that due to rerouting of LSPs, one other link in the network
(selected randomly in the simulation) would have significant change 
in reserved bandwidth which would result in one Link LSA being
generated by the routers on the two ends of the link.


4. Simulation Results

In this section we study the relative performance of the five
algorithms with a range of Network sizes, LSA types, and processing 
time values as explained below:

Network size: Two networks are considered.  Network 1 has 100 nodes, 
1200 links, maximum number of neighbors per node is 30 and maximum 
number of adjacencies per node is 50 (same neighbor may have more 
than one adjacencies).   Network 2 has 50 nodes, 600 links, maximum 
number of neighbors per node is 25 and maximum number of adjacencies 
per node is 48. Dijkstra SPF calculation time for Network 1 is assumed 
to be 100 ms and that for Network 2 is assumed to be 70 ms.



 Choudhury et. al.                                            [Page 9]

 Internet Draft             Flooding Simulation              May, 2003


LSA Type: Each node has 1 Router LSA (Total of 100 for Network 1 and 
50 for Network 2). There are no Network LSAs since all links are point-
to-point links and no Summary LSAs since the network has only one area.
Regarding other LSA types we consider two scenarios.  In Scenario 1 we 
assume that there are no ASE LSAs and each link has one "Link" LSA 
carrying traffic engineering information (Total of 2400 for Network 1 
and 1200 for Network 2). In Scenario 2 we assume that there are no 
"Link" LSAs and half of the nodes are ASA-Border nodes and each border 
node has 10 ASE LSAs (Total of 500 for Network 1 and 250 for 
Network 2).  We identify Scenario 1 as "Link LSAs" and Scenario 2 as 
"ASE LSAs".
Processing time values: Processing times for LSUs, Acks and Hello 
packets have been previously expressed in terms of a common parameter 
T.  Three values are considered for T, which are 1 ms, 0.5 ms and 2 ms
respectively.

Based on Network size, LSA type and Processing time values we develop
5 Test cases as follows:

Case 1: Network 1, Link LSAs and T = 1 ms.

Case 2: Network 1, ASE LSAs and T = 1 ms.

Case 3: Network 1, Link LSAs and T = 0.5 ms.

Case 4: Network 1, Link LSAs and T = 2 ms.

Case 5: Network 2, Link LSAs and T = 1 ms.

For each case and for each algorithm we study the network stability as

=========|==========================================================
         | Number of Non-Converged LSUs in the Network at Time(in sec)
    LSA  |                    
   STORM |==========================================================
   SIZE  |10s   20s   30s   35s   40s   50s   60s   80s   100s
=========|==========================================================
    50   | 0     0     9    10     0     0     0     0     1
 (Stable)|
---------|----------------------------------------------------------
    100  | 0     0    24    29    26    21     8     1     1
 (Stable)|
---------|----------------------------------------------------------
    130  | 0     0    31    45    42    41    38    127   237
(Unstable)
=========|==========================================================

    Table 1: Network Stability Vs. LSA Storm (Case 1, Algorithm 1)



 Choudhury et. al.                                            [Page 10]

 Internet Draft             Flooding Simulation              May, 2003


a function of the size of the LSA storm.  The stability is determined 
by looking at the number of non-converged LSUs as a function of time.  
An example is shown in Table 1 for Case 1 and Algorithm 1.  

The LSA storm starts a little after 20 seconds and so for some period
of time after that the 
number of non-converged LSUs should stay high and then come down for a 
stable network.  This happens for LSA storms of sizes 50 and 100.  With 
an LSA storm of size 130, the number of non-converged LSAs stay high
indefinitely due to repeated retransmissions, link failures due to 
missed Hellos for more than the Router-Dead interval which generates 
additional LSAs and also due to subsequent link recoveries which again 
generate additional LSAs.  It turns out that for this example the 
maximum LSA storm size that still keeps the network stable is 115.  

|===========|========================================================|
|           |    Maximum Allowable LSA Storm Size For                |
|Algorithm  |==========|==========|===========|==========|===========|
|  Type     | Case 1   |   Case 2 |  Case 3   |  Case 4  |  Case 5   |
|           |(Net. 1,  | (Net. 1, | (Net. 1,  | (Net. 1, | (Net. 2,  |
|           |Link LSAs,| ASE LSAs,| Link LSAs,|Link LSAs,| Link LSAs,|
|           | T = 1 ms)| T = 1 ms)|T = 0.5 ms)| T = 2 ms)| T = 1 ms) |
|===========|==========|==========|===========|==========|===========|
|Algorithm 1|   115    |   130    |   275     |     25   |    138    |
|(Full Fld.)|          |          |           |          |           |
|___________|__________|__________|___________|__________|___________|
|Algorithm 2|          |          |           |          |           |
|(Fld. Over |   238    |   243    |   610     |     85   |    450    |
|One Parall.|          |          |           |          |           |
| Link)     |          |          |           |          |           |
|___________|__________|__________|___________|__________|___________|
|Algorithm 3|          |          |           |          |           |
|(Alg. 2 +  |   248    |   249    |   645     |     90   |    450    |
|Multipoint |          |          |           |          |           |
|Relay)     |          |          |           |          |           |
|___________|__________|__________|___________|__________|___________|
|Algorithm 4|          |          |           |          |           |
| (Alg. 2 + | Full LSDB| Full LSDB| Full LSDB |    1825  | Full LSDB |
|Fld. Over  |   (2500) |  (600)   |   (2500)  |          |    (1250) |
|Span. Tree)|          |          |           |          |           |
|___________|__________|__________|___________|__________|___________|
|Algorithm 5|          |          |           |          |           |
|(Alg. 2 For| Full LSDB| Full LSDB| Full LSDB |    625   | Full LSDB |
|Type A, Alg|   (2500) |  (600)   |   (2500)  |          |    (1250) |  
|4 For Type |          |          |           |          |           |
|B LSAs     |          |          |           |          |           |
|___________|__________|__________|___________|__________|___________|

    Table 2: Maximum Allowable LSA Storm for a Stable Network


 Choudhury et. al.                                            [Page 11]

 Internet Draft             Flooding Simulation              May, 2003


This stability threshold depends on the Case and the Algorithm under
consideration. A better algorithm should allow a higher stability 
threshold.  

In Table 2 we show the maximum allowable LSA storm size that would 
still keep the network stable for the five different cases and for 
the five different algorithms.

5. Summary of Observations

Comparison of Algorithm 2 and Algorithm 1:  We observe that in all 
cases in Table 2, Algorithm 2 substantially increases (usually by a 
factor of two or more) the maximum LSA storm threshold of the network 
compared to Algorithm 2.  This implies that flooding over only one of 
many parallel links between nodes has clear beneficial impact in terms 
of improving network scalability and stability and ought to be pursued.
Comparison of Algorithm 3 and Algorithm 2: The Multipoint relay 
mechanism used in Algorithm 3 increases the LSA storm threshold
(compared to Algorithm 2)
but by a moderate amount. One reason for this moderate improvement
is that even though the multipoint relay mechanism can significantly
reduce the overall amount of flooding in the network, it appears that 
the nodes with highest number of neighbors tend to be more likely
to be chosen as multipoint relays by their neighbors.  As an example,
in Network 1 (Cases 1-4 in Table 2) on the average a node chooses
70% of its neighbors as multipoint relays.  However, the node with
highest number of neighbors (30), is chosen as a multipoint relay
by 27 out of the 30 (i.e., 90%) of its neighbors.  This node, due
to its large number of neighbors, also happens to be the bottleneck
in terms of determining LSA storm threshold and since its effective
number of neighbors (the ones that ask it to flood further) goes
down only slightly, the overall improvement in LSA storm threshold
is moderate.  In Network 2 (Case 5 in Table 2) all neighbors of the
highest adjacency node elect this node as multipoint relays and 
therefore there is no improvement in LSA storm threshold.

Comparison of Algorithm 4 and Algorithm 2: Compared to Algorithm 2,
Algorithm 4 always raises the LSA storm threshold substantially and
in four out of the five cases the LSA storm threshold is raised to
its maximum value, i.e., it is OK to flood all LSAs in the LSDB over
a short period of time.  However, one issue with Algorithm 4 is that
if a link on the minimum spanning tree fails, many nodes may not get
the required LSAs carrying the topology information and it would be 
difficult for them to recompute the new minimum spanning tree.  In the 
simulation we re-computed the minimum spanning tree based on global 
knowledge under failure conditions.

Comparison of Algorithm 5 and Algorithm 2: Compared to Algorithm 2,
Algorithm 5 always raises the LSA storm threshold substantially and


 Choudhury et. al.                                            [Page 12]

 Internet Draft             Flooding Simulation              May, 2003


in four out of the five cases the LSA storm threshold is raised to
its maximum value, i.e., the full LSDB size.  This Algorithm is
worse than Algorithm 4 in terms of maximum LSA storm threshold (as 
shown in Case 4) but much better than Algorithms 1, 2 and 3.  It is 
also more robust than Algorithm 4 under link failure conditions. Even 
if a link on the minimum spanning tree  fails, those carrying intra-
area topology information (defined earlier as Type A LSAs) would 
continue to get to all destinations since they are flooded as in 
Algorithm 2 (full flooding but over only one interface between 
neighbors). Once all the nodes know about the link failure condition, 
they would be able to re-compute the new minimum spanning tree.

Overall Observation: It appears that Algorithm 2 (full flooding but
only over one of many parallel interfaces between neighbors) and
Algorithm 5 (flooding as in Algorithm 2 for LSAs carrying intra-area
topology information and flooding over a minimum spanning tree for
other LSAs) are the best ones and should be pursued further for
detailed specification.             


6. Acknowledgements

We would like to acknowledge Jerry Ash, Margaret Chiosi, Elie Francis,
Anurag Maunder, Beth Munson, Moshe Segal, Vera Sapozhnikova, and Pat 
Wirth for collaboration and encouragement in our scalability 
improvement efforts for Link-State-Protocol based networks.  We also
thank Aman Shaikh and Moshe Segal for comments on this draft.

7. References

   [Ref1] Pappalardo, D.,"AT&T, customers grapple with ATM net outage,"
   Network World, February 26, 2001.

   [Ref2] "AT&T announces cause of frame-relay network outage," AT&T 
   Press Release, April 22, 1998.

   [Ref3] Cholewka, K., "MCI Outage Has Domino Effect," Inter@ctive 
   Week, August 20, 1999.

   [Ref4] Jander, M., "In Qwest Outage, ATM Takes Some Heat," Light
   Reading, April 6, 2001.

   [Ref5] A. Zinin and M. Shand, "Flooding Optimizations in Link-State
   Routing Protocols," Work in Progress.

   [Ref6] J. Moy, "Flooding over Parallel Point-to-Point Links," Work in
   progress.
   
   [Ref7] P. Pillay-Esnault, "OSPF Refresh and flooding reduction in  


 Choudhury et. al.                                            [Page 13]

 Internet Draft             Flooding Simulation              May, 2003


   stable topologies," Work in progress.

   [Ref8] G. Choudhury, V. Sapozhnikova, A. Maunder and V. Manral,
   "Explicit Marking and Prioritized Treatment of Specific IGP
   Packets for Faster IGP Convergence and Improved Network Scalability 
   and Stability," Work in Progress.

   [Ref9] J. Ash, G. Choudhury, V. Sapozhnikova, M. Sherif, A. Maunder, 
   V. Manral, "Congestion Avoidance & Control for OSPF Networks",
   Work in Progress.

   [Ref10] "Private Network-to-Network Interface", ATM Forum 
   Specification af-pnni-0055.000, March, 1996.

   [REF11] A. Qayyum, L. Viennot, A. Laouiti, "Multipoint Relaying for 
   Flooding Broadcast Messages in Mobile Wireless Networks," Project
   Hipercom, INRIA Rocquencourt, France.

   [Ref12] D. Katz, D. Yeung, K. Kompella, "Traffic Engineering 
   Extensions to OSPF Version 2," Work in progress.

   [Ref13] B. M. Waxman, "Routing of Multipoint Connections," IEEE
   Journal on Selected Areas in Communications, 6(9):1617-1622, 1988.



8. Authors' Addresses

   Gagan L. Choudhury

   AT&T
   Room D5-3C21
   200 Laurel Avenue
   Middletown, NJ, 07748
   USA
   Phone: (732)420-3721
   email: gchoudhury@att.com
   
   
   Vishwas Manral
   Netplane
   189, Prashasan Nagar,
   Road Number 72
   Jubilee Hills, Hyderabad
   India
   email: Vishwasm@netplane.com
     




 Choudhury et. al.                                            [Page 14]