Internet DRAFT - draft-corson-fastrestore
draft-corson-fastrestore
Personal M. Scott Corson
Ansible Systems
Internet Draft A. O'Neill
BT
G. Tsirtsis
Flarion
Document: draft-corson-fastrestore-00.txt
Category: Informational November 2000
IP Fast Restoration
<draft-corson-fastrestore-00.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This memo describes a generalized approach for achieving IP fast
restoration in response to link and router failures by using IP
routing to support high network availability. The approach relies
on the utilization of flat, yet highly-localized routing technology
to reduce far-reaching control message propagation, thereby limiting
the effects of a failure and enabling sub-second restoration. It
then outlines how the Temporally-Ordered Routing Algorithm (TORA) (a
Mobile Ad hoc Network (MANET) routing protocol), when suitably
modified for use in fixed networks, can be used within this
framework. The general approach to routing topology management
within the fast restoration framework is immediate reaction on link
failure (to preserve existing flows) and gradually integration of
new network elements on activation into the domain (to further
assure stability).
Internet Draft Fast Restore August 2000
INDEX
1. Introduction . . . . . . . . . . . . . . . . . . . . . 2
1.1 Terminology. . . . . . . . . . . . . . . . . . . 2
2. Motivation . . . . . . . . . . . . . . . . . . . . . . 3
3. The Fast Restoration Framework . . . . . . . . . . . . 4
3.1 Highly Localized Routing Protocol. . . . . . . . 5
3.2 Link Status Assessment Protocol. . . . . . . . . 5
3.3 Reaction to Failure. . . . . . . . . . . . . . . 6
4. The Temporally-Ordered Routing Algorithm . . . . . . . 7
4.1 Overview . . . . . . . . . . . . . . . . . . . . 7
4.2 Router Activation (Route Creation) . . . . . . . 8
4.3 Link Failure . . . . . . . . . . . . . . . . . . 9
4.4 Link Activation. . . . . . . . . . . . . . . . . 9
4.5 Router Failure (Route Erasure) . . . . . . . . . 10
5. Detecting Router Failures with Single-Homed Prefixes . 10
5.1 Neighbor Advertisement Protocol. . . . . . . . . 11
5.2 Neighbor Reachability Protocol . . . . . . . . . 12
6. Conclusion . . . . . . . . . . . . . . . . . . . . . . 14
7. References . . . . . . . . . . . . . . . . . . . . . . 15
1. Introduction
This memo describes a generalized approach for achieving IP fast
restoration in response to link and router failures by using IP
routing to support high network availability. The approach relies
on the utilization of flat, yet highly-localized routing technology
to reduce far-reaching control message propagation, thereby limiting
the effects of a failure and enabling sub-second restoration. It
then outlines how the Temporally-Ordered Routing Algorithm (TORA) (a
Mobile Ad hoc Network (MANET) routing protocol), when suitably
modified for use in fixed networks, can be used within this
framework. The general approach to routing topology management
within the fast restoration framework is an immediate reaction on
link failure (to preserve existing flows) and gradually integration
of new network elements on activation into the domain (to further
assure stability).
The draft first discusses the motivation for this draft and its
relevance to a recent IAB report [IAB99] on routing convergence. It
then outlines a framework of protocol mechanisms, and then gives a
fuller description of the main mechanisms. It then describes a
candidate routing algorithm that can be used within this framework.
It then additionally describes the mechanisms within the framework
to address efficiency and localization issues associated with the
failure of routers advertising single-homed prefixes. The draft
concludes with a discussion of the benefits of this fast restoration
framework.
1.1 Terminology
Internet Draft Fast Restore August 2000
Here are some new terms defined in this draft.
RID Router ID. A unique ID used by a routing protocol to identify
a router.
RPLF Routing Protocol Link Failure. A RPLF packet is generated as
a routing protocol's reaction to a link failure for each
destination prefix affected by the failure
RPDF Routing Protocol Destination Failure. RPDF is a packet
generated to eliminate routing to a destination that has been
partitioned from the rest of the network
LFDT Link Failure Detection Timer. Failure to receive an FRLS
packet with the neighbor heard flag set within the LFDT
setting indicates link failure.
LFUB Link Failure Upper Bound. This is the upper bounds of the
time
required to detect failure.
FRNA Fast Restore Neighbor Advertisement. A router's FRNA
packet also contains the RIDs and corresponding prefixes of
the router's one-hop neighbors.
FRLS Fast Restore Link Status. A FRLS packet contains a router's
RID and an indication of whether or not the router has
recently heard its neighbor.
2. Motivation
The report from the recent IAB Network Layer Workshop [IAB99] raised
several routing issues. Regarding routing convergence, it noted
that "at least one backbone operator has concerns about the
convergence time of internetwork-wide routing during a failover.
This operator believes that current convergence times are on the
order of half a minute, and possibly getting worse. Others in the
routing community did not believe that the convergence times are a
current issue. Some, who believe that real-time applications (e.g.
telephony) require sub-second convergence, are concerned about the
implications of convergence times of a half minute on such
applications. [IAB99]"
It is agreed that existing Internet intra-domain routing protocols
(e.g. [RIP,OSPF]) have limited responsiveness in reacting to
topological changes (i.e. slow convergence) and consequently have
limitations on the size of domains they may support. These
limitations stem directly from the use of shortest-path routing
based on global topological information exchange.
For any premium service with critically tight availability
requirements, operators are left with the use of hot standby routers
and link level restoration to improve restoration times. Both
Internet Draft Fast Restore August 2000
mechanisms are however very expensive, and applicable mainly to
large-scale core elements. However, these cannot be cost-effectively
applied in edge networks such as IP enabled Radio Access Networks
and base stations and flat enterprise networks. These will likely be
composed of large numbers of small and necessarily cheap IP routers
which may be interconnected by ad-hoc, non-resilient or unprotected
link layers such as ethernet and various serial lines. The need for
a fast restoring IGP is therefore apparent. In addition, if a cost-
effective IP layer solution could be found which could be made
sympathetic to traffic engineering requirements then the requirement
for hot standby and link layer restoration could be reduced
resulting in cost savings. The traffic engineering implications are
not however considered in this draft.
By relaxing the requirement for shortest path routing only slightly
(with acceptable impact on forwarding efficiency), one can
approximate shortest-path routing while enabling IP fast restoration
and higher levels of domain scalability. This can be achieved by
utilizing adaptive routing algorithms that localize their response
to topological changes while ensuring loop-free paths. It is
localization that yields the desired benefits of reduced restoration
time and increased domain size.
The proposed fast restoration approach therefore offers a potential
intra-domain solution to this convergence problem in reaction to
failure. It is presently unclear whether the fast restoration
techniques discussed in this memo have potential applicability to
the associated inter-domain routing problem.
3. Fast Restoration Framework
Four general protocol mechanisms are required in the proposed fast
restoration framework. The first two are mandatory in that they
provide the basic fault detection and reaction mechanisms. The
second two are optional and illustrate one way to enable the
reaction to be tailored to whether the failure point is a link or a
neighbor router.
1) A highly localized routing protocol
This is required to enable the protocol to be able to react
extremely quickly to failures, with minimal impact on other
routers within the domain.
2) A link status assessment protocol
This is required to enable a router to rapidly detect link or
router failures.
3) A neighbor advertisement protocol
This is used to enable a router to learn about its local two
hop topology including its neighbors' neighbors and the degree
of connectivity (single/multi-homed) of subnets.
Internet Draft Fast Restore August 2000
4) A neighbor reachability protocol
This is used to enable a router to identify whether it is the
link or the neighbor router that has failed through
collaboration with its two hop neighbors.
3.1 Highly Localized Routing
The general approach to routing topology management within the fast
restoration framework is immediate reaction on link failure (to
preserve existing flows) and gradually integration of new network
elements on activation into the domain (to further assure
stability).
The method by which the aforementioned route adjustment occurs is a
function of the routing algorithm. It is essential that a routing
algorithm have the following properties to operate within this fast
restoration framework.
1. The protocol reaction to a link failure that does not partition
the network be localized for all destination prefixes.
2. The protocol reaction to a router failure that does not partition
the network should be localized for all destination prefixes.
(i.e. it should be treated as a link failure with respect to
these other prefixes).
3. The protocol reaction to a link failure that does partition the
network be capable of detecting the partition (and erasing routes
if desired).
4. The protocol reaction to a router failure that does partition the
network be capable of erasing routes for these prefixes (if
desired). Single-homed prefixes on the failed router will by
definition create a partition.
The preceding description assumes that a routing protocol reacts to
a link or router failure by sending generic Routing Protocol Link
Failure (RPLF). Routing Protocol Destination Failure (RPDF) packets
as generated as necessary in response to partitions.
3.2 Link Status Assessment
The link status assessment protocol is principally concerned with
fast detection of link failures. Work within the IETF on a
generalized IP level link status assessment algorithm was recently
proposed [FLIP]. A similar capability is required here where a link
consists of a pair of Router IDs (RIDs). After discovery of a
neighbor (identified by its RID) via the neighbor discovery
protocol, the objective of a link status assessment algorithm would
be to monitor with high resolution (in time) the bi-directional
connectivity of a link with an adjacent neighbor router.
Internet Draft Fast Restore August 2000
It is desired that a link failure be detectable on millisecond time
scales enabling the higher level mechanisms (to be described) to
quickly react thereby preserving ongoing sessions. The protocol
operates by periodically sending Fast Restore Link Status (FRLS)
packets between adjacent interfaces to reconfirm link bi-
directionality. The FRLS retransmission period is set to
LINK_PROBE_INTERVAL time units. A FRLS packet contains a router's
RID and an indication of whether or not the router has recently
heard its neighbor.
FRLS packets are initially sent indicating that the sending router
has not heard its neighbor. Reception of a FRLS packet at a router
from a neighbor causes the router to set the "neighbor heard" flag
to TRUE for ROUTER_HEARD TIME (perhaps 3.5 times
LINK_PROBE_INTERVAL) during which time it sends FRLS packets back
towards its neighbor indicating that it has heard its neighbor.
Reception of any packet from its neighbor causes the router to reset
the ROUTER_HEARD_TIMER associated with this flag setting. Reception
of a FRLS packet with the neighbor heard flag set to TRUE causes the
router to consider the link with its neighbor as bi-directional. At
that time the Link Failure Detection Timer (LFDT) is also set to
ROUTER_HEARD_TIME. Reception of a subsequent FRLS packet with the
neighbor heard flag set resets the LFDT. Failure to receive an FRLS
packet with the neighbor heard flag set within the LFDT setting
indicates link failure. Also, if the router considers the link to
be bi-directional, subsequent reception of a FRLS packet without the
neighbor heard flag set immediately indicates link failure (i.e.
this indicates that its neighbor no longer hears the router). Thus
the sum of ROUTER_HEARD_TIME plus LINK_PROBE_INTERVAL equals Link
Failure Upper Bound (LFUB) and effectively upper bounds the time
required to detect failure.
Alternatively, the protocol may also rely on a link layer
notification of link failure (e.g. loss of carrier) if available.
Such a signal may provide a more accurate and timely indication of
link status.
3.3 Reaction to Failure
After the link status assessment protocol detects a link failure
(possible router failure) at a router X with a neighbor Y, the
following processing steps occur at router X:
1) Router X initially drops (or buffers) data packets for
destination routes whose next hop was over the failed link with
neighbor Y. This clearly includes any single-homed or multi-
homed destination prefixes on the (potentially failed)
neighboring router Y.
2) Router X adjusts routes for all affected destination prefixes
aiming to locally route around the failed link and/or router. Let
Internet Draft Fast Restore August 2000
us assume that the routing protocol's reaction to a link failure
generates a Routing Protocol Link Failure (RPLF) packet for each
destination prefix. Note that the RPLF activity for a particular
destination will definitely fail in either of two cases.
3) The first would be if the link failure had generated a partition
in the topology towards that destination which places an obvious
requirement that restoration requires a sufficient level of
connectivity between nodes to guarantee the availability of an
alternate path. Partition detection would lead to the generation
of a Routing Protocol Destination Failure (RPDF) packet that
would be used to eliminate the routing for that destination
throughout the domain.
4) The second more specific example occurs if the destination prefix
is single-homed on a router, and that router has itself failed,
rendering restoration around the failure futile. Therefore, for
singly homed prefixes it may be useful to react differently to
the failure of the router. This firstly requires a method to
discover which routers have single-homed prefixes, a method to
discover when such routers fail, and an alternative strategy to
react to this type of failure. These are discussed in section 5.
One such strategy though would be to hold back on pointless RPLF
processing to give the router time to re-boot. This avoids the
RPLF processing causing partition detection and subsequent RPDF
generation, resulting in the elimination of the routing for that
destination throughout the domain. In this case, the domain
relies on interim ICMP unreachable messages to inform
applications of the loss of connectivity to the destination. If
re-boot does not occur within a fixed period then the routes to
that destination can be then be reasonably removed from the
domain through RPDF propagation. This is somewhat similar to the
`graceful' strategy for avoiding routing flaps in BGP
[BGPrestart] by suppressing route erasure on router failure in
anticipation of fast reboot and graceful BGP restart.
4 The Temporally-Ordered Routing Algorithm
4.1 Overview
We now map these generic packets to a specific routing protocol ---
the Temporally-Ordered Routing Algorithm (TORA) [TORA, TORAdraft].
In TORA, RPLF packets map to Update (UPD) packets while RPDF packets
map to Erase (ERS) packets. This latter packet type does not yet
appear in [TORA, TORAdraft].
TORA is not a shortest-path routing protocol. It is a "link
reversal" routing protocol that differs significantly from existing
routing technology. TORA is similar to distance vector protocols
(e.g. RIP) in that a separate version of the protocol runs
independently for each destination prefix. Thus its storage
Internet Draft Fast Restore August 2000
requirements scale with the number of destination prefixes. But its
similarity ends there. In its original usage mode, TORA is an on-
demand protocol that builds routes reactively in response to route
requests from a MANET IP kernel. However, for usage in fixed
networks it is transformed into a more traditional, proactive
protocol.
The key attribute of the protocol remains the same for both
versions, and this is the protocol's aggressive, optimistic reaction
to link failures which is critical within the MANET environment but
is becoming equally important in fixed network scenario's. On link
failure, when necessary, the protocol immediately reverses the
directions of links in the area near the failure effectively
searching for alternative routes to the destination. If such exist,
the protocol typically finds them in a time and communication-
efficient, single-pass reaction. Otherwise (in case of partition),
the reaction detects the partition enabling route erasure if
desired.
We now briefly describe how TORA would function within the fast
restoration framework.
4.2 Router Activation (Route Creation)
When a router boots up, it establishes new links with its neighbors.
The router proactively initiates a localized process of height
database synchronization with each of its new neighbors. Its
neighbors transmit the contents of their height databases to the new
router. From this and any received routing advertisements (if any),
a router can determine which of its own prefixes are single or
multi-homed.
It then generates an Optimization (OPT) packet for each destination
prefix it wishes to advertise. If the prefix is single-homed and
heights exist in the neighboring routers, the OPT packet stops at
these neighbors and is not propagated further. Otherwise, the OPT
packet floods fully (or partially) throughout a domain installing
(or reorienting) routing entries for its destination prefix. The
collection of routing entries forms a Directed Acyclic Graph (DAG).
From an individual router's perspective, this means that it will
have one or more feasible next hops for the destination, all of
which are guaranteed to be loop-free.
Loop-free multipath routing is achieved through the use of
"heights". Each router maintains a height with respect to the
destination, and the destination has the lowest height. A link is
directed from a higher height to a lower height, and packets may
only be forwarded from routers with higher heights to routers with
lower heights. Thus packets flow "downhill" towards the
destination. Each height is guaranteed to be unique and thus the
collection of heights forms a total order. The protocol is
Internet Draft Fast Restore August 2000
sometimes also referred to as the "Totally-Ordered Routing
Algorithm".
From a router's perspective, a neighbor that is higher is referred
to as an "upstream" neighbor while a neighbor that is lower is
referred to as a "downstream" neighbor. A destination router always
has only upstream neighbors, and no other router is allowed to have
upstream neighbors without having at least one downstream neighbor.
It should be understood that all heights and associated link
directions are per IP destination prefix DAG.
4.3 Link Failure
A router may lose an upstream neighbor (due to a link failure) at
any time without requiring protocol reaction. Similarly, it may
lose a downstream neighbor provided it has other downstream
neighbors, or has no downstream neighbors and no upstream neighbors,
without requiring protocol reaction. However, if due to a link
failure a router finds itself with at least one upstream neighbor
without having any downstream neighbors (i.e. it just lost its last
downstream neighbor and is a black hole), the router must adjust its
height so that it has at least one downstream neighbor. The manner
in which this occurs is described in [TORA,TORAdraft]. The
adjustment results in transmission of an UPD (RPLF) packet to the
router's neighbors advertising its new height.
The effect of this height adjustment is to reverse the direction of
some or all of the router's links; hence the name "link reversal"
algorithm. Reception of an UPD (RPLF) packet at a neighbor may
cause that neighbor to lose its last downstream link, which is cause
for subsequent height modification and UPD packet transmission.
Examination of the algorithm's reaction on mesh-like topologies
reveals that the number of subsequent UPD (RPLF) transmissions is
typically quite low; often only one UPD (RPLF) transmission is
required. The resultant reaction is thus highly localized. It is
also strongly de-correlated with the size of the domain.
4.4 Link Activation
When a router restarts, it re-establishes links with its neighbors.
The router initiates a localized process of height database
synchronization with each of its new neighbors. Its neighbors
transmit the contents of their height databases to the new router.
By default, the router's new heights are NULL for each destination.
(Note: A NULL height is higher than any non-NULL height, and packets
may be forwarded from NULL to non-NULL heights. A link between two
routers with NULL heights for a destination is marked as
"unassigned", and packet forwarding over such links is disallowed.)
The re-started router then performs its own non-NULL height
selection with respect to the existing heights of its neighbors.
Internet Draft Fast Restore August 2000
The algorithm for height selection is not discussed here, but it
will be described in an upcoming draft detailing modification of
TORA for use in fixed networks. The procedure only involves a
router's one-hop neighbors. The height selection procedure occurs
slowly, and a link is gradually brought back into active usage
within the routing domain.
4.5 Router Failure (Route Erasure)
When a router fails, then the default behavior is that each neighbor
individually reacts to the router failure as a link failure, sending
UPD (RPLF) packets as described in 3.3 above. This will restore
paths around the failed router if sufficient physical connectivity
exists.
The failed router itself may have been advertising destination
prefixes into the network and in the case of multi-homed
subnets/prefixes the TORA reaction will result in the paths towards
the failed routing being redirected towards the nearest alternate
router also advertising that prefix.
In the case of single-homed prefixes, the restoration attempt will
clearly fail as those subnets are now partitioned from the core
network. The collective UPD (RPLF) reaction of the routers connected
to the failed router, followed by partition detection via the normal
TORA reflection, would eventually clear the network of routing state
for these single-homed destination prefixes. This is wasteful if
the router is simply undergoing a re-boot and therefore additional
protocol mechanisms may be warranted as highlighted in 3.3 and
described in section5 below. The aim of these mechanisms in TORA
would be to avoid sending UPD (RPLF) packets for the single-homed
destination prefixes, wait to see if the associated router recovers
in a set time, and if not, only then use ERS (RPDF) packets to
efficiently clear this routing state from the network.
5. Detecting Router Failures with Single-Homed Prefixes
As has previously been discussed, it may be useful to use additional
protocol mechanisms to be able to distinguish router failures and
associated single-homed destinations so that futile restoration
processing and premature route elimination can be avoided. One such
approach is outlined below, but the necessity and some of the
mechanics for solving this problem may be critically dependent on
the particular IGP.
In this approach, a bootstrapping neighbor advertisement protocol
enables a router to learn its one-hop neighbors, as well as its
neighbors' one-hop neighbors, called the two-hop neighbor set. It
works in combination with IPv6 router advertisements (or equivalent
ICMP router messages in IPv4 (RFC1256)) to enable a router to assess
its routing neighbors. Once a link failure is detected, the neighbor
reachability protocol attempts to determine whether the router at
Internet Draft Fast Restore August 2000
the other end of the failed link is still reachable, and whether it
is advertising single-homed and/or multi-homed destination prefixes.
With input from the preceding protocols, the highly localized
routing protocol then reacts differently for router failures
involving single-homed prefixes.
5.1 Neighbor Advertisement Protocol
This protocol is principally concerned with neighbor discovery and
protocol bootstrapping. As with OSPF, each router is identified by a
Router ID (RID). It also is associated with at least one IP
destination prefix by which it is globally reachable. Routes for
this prefix are advertised out of all its interfaces and built by
the routing algorithm. It may also advertise other destination
prefixes. These prefixes may be single-homed (i.e. only this router
advertises itself as the last hop for these prefixes) or multi-homed
(i.e. they are advertised by multiple routers as being last hop
routers).
The neighbor advertisement protocol requires that each router
periodically advertises its RID and destination prefix(es) in Fast
Restore Neighbor Advertisement (FRNA) packets to its one-hop routing
neighbors out interfaces on which routing is activated. FRNA packets
would likely be multicast (TTL scope == 1) to a well-known multicast
address. The transmission period would be relatively long (on the
order of 30 seconds). A router's FRNA packet also contains the RIDs
and corresponding prefixes of the router's one-hop neighbors that it
has learned from their previous FRNA transmissions. In this way, by
receiving its neighbors' FRNA packets, a router learns the
identities of its neighbors' neighbors in a fashion very similar to
the neighbor advertisement algorithm of [OLSR]. Associated with
each link a router has with a routing neighbor is an instance of a
link status assessment protocol (see section 2.2). For each
instance a router has a Link Failure Detection Timer (LFDT) for that
link. This is the time after which a router will declare a link to
its neighbor down if no packets are received from its neighbor. A
router also computes an Link Failure Upper Bound (LFUB) on the
overall time required to detect a failure on each link, which is
equal to the LFDT value plus a link polling period (to be
described). A router X also advertises in its FRNA transmissions the
maximum computed value over all its links of these bounds, known as
the maximum LFUB (LFUBmax(X)).
The neighbor advertisement protocol operates in conjunction with
IPv6 Router Advertisement (RA) messages (or an equivalent in IPv4)
which are sent out interfaces on which routing is not activated
(e.g. over subnets with attached hosts). Router advertisement
messages convey the destination prefixes being advertised by other
routers on a given subnet. By listening to these advertisements, a
router can learn if other routers are advertising destination
Internet Draft Fast Restore August 2000
prefixes that itself is also advertising. It thus determines which
of its own last-hop prefixes are single-homed or multi-homed.
5.2 Neighbor Reachability Protocol
We now describe the basic operation of a neighbor reachability
protocol. The protocol is assumed to run at a router X, and is
triggered by link failure detection on a link to router Y. After a
short time (estimated sufficient for localized route re-
computation), router X executes a distributed neighbor reachability
protocol to determine reachability of the single-homed destination
prefixes advertised by neighbor Y as follows.
The (potentially failed) neighbor Y's maximum LFUB (LFUBmax) is
known from its previous FRNA transmissions. With this information a
router X creates a RPLF packet as it would if this were a link
failure, but delays transmission of the packet for a time T somewhat
greater than its neighbor Y's LFUBmax; thus T > LFUBmax(Y). In the
meantime it immediately sends a Fast Restore Link Failure (FRLF)
packet to each of the (potentially failed) neighbor Y's one-hop
neighbors. Reception of a FRLF packet at such a neighbor Z
instructs that neighbor to mark the link between the indicated
router Y and the sender X as "failed". But the receiver Z keeps the
sender X in its list of the neighbor Y's one-hop neighbors. This is
to ensure that Z will still send a FRLF packet to X if it also
detects a link failure with Y. Z will change its notion of Y's
neighbors only on reception of a FRNA message from Y itself.
Subsequent reception of a FRLS packet at router Z from neighbor Y
indicates to the receiver Z that the link between routers X and Y
may indeed have failed, but that router Y itself is still
functioning.
Let us assume that router Y has indeed failed, and that its neighbor
X was the first neighbor to detect this failure, thus sending FRLF
packets to router Y's previous one-hop neighbors. Subsequently
others of Y's previous neighbor set begin to declare link failures
with Y, each scheduling delayed RPLF transmissions and immediately
sending an FRLF packet to each of Y's past neighbors. Let us also
assume that Y's neighbor Z is the last neighbor to detect a link
failure with router Y. It also schedules a RPLF packet for
transmission (suitably delayed) and sends FRLF packets to each of
Y's previous neighbors. By this time (or very shortly thereafter) Z
will likely to be the first of Y's past neighbors to both declare
link failure with Y and have received FRLF messages from all of Y's
previous neighbors (without having heard a subsequent message from
Y).
At that point router Z declares "router failure" of previously
adjacent neighbor Y, and cancels its pending RPLF transmission for
Y. The router Z then queues a Routing Protocol Destination Failure
(RPDF) packet for each of neighbor Y's single-homed destination
prefixes (Note that for efficiency, multiple RPDF packets may be
aggregated and sent within a single aggregate RPDF packet).
Internet Draft Fast Restore August 2000
Assuming T (T > LFUBmax(Y)) is set appropriately at each of Y's
previous neighbors, none of Y's previous neighbors would have yet
transmitted their pending RPLF packets. The router also sets a
NEIGHBOR_RESTART_TIMER to some value (possibly 30 seconds) while it
waits for neighbor Y to reboot. During this time packets sent to
Y's single-homed prefixes will be undeliverable resulting in "host
unreachable" ICMP message being returned to the sources. Assuming
neighbor Y reboots within this time, the queued RPDF transmission is
cancelled. Otherwise the queued RPDF packet is transmitted when the
timer expires and the routes for Y's single-homed destination
prefixes are erased from the domain. The objective of this timer is
to suppress route erasure in the case of fast router reboot in a
fashion similar to that proposed in [BGPrestart].
If the NEIGHBOR_RESTART_TIMER expires and T (T > LFUBmax(Y)) is not
set appropriately at each of Y's previous neighbors, possibly others
may also have queued and then issued RPDF packets prior to receiving
an FRDF message originating from router Z. This is not detrimental.
RPDF packets are destination-specific and source-independent. Their
collective propagation results in only a single aggregated flood,
regardless of the number of routers originating RPDF packets. An
RPDF packet floods throughout the network immediately, efficiently
clearing state information regarding the effected destinations. An
RPDF packet is sent to the ALL_ROUTERS multicast address.
For maximal efficiency the aforementioned protocol assumes a domain
topology with the characteristic that, after any single router or
link failure, the remaining domain remains well connected. Thus
router failure does not partition the remaining domain permitting
the failed neighbors to exchange FRLF messages. This is required so
fast restoration can be achieved.
Nevertheless, if the network becomes partitioned the above algorithm
need not result in undo harm provided the routing protocol is
capable of detecting partitions. A router/link failure, in such a
case, will result in multiple link failures being detected since the
failed router Y's neighbors on one side of the partitioned network
will not be able to reach Y's one hop neighbor's on the other side
of the partitioned domain (i.e. FRLF will fail to reach some
neighbors) to reach a more definite conclusion. Consequently each
such neighbor will transmit a RPLF packet for each single-homed
destination prefix on the potentially failed router. Each packet
will initiate a distributed computation that performs "distributed
partition detection" and eventually clears the domain of routing
state for its destination. This approach is less efficient than
that using RPDF packets.
The approach also assumes the FRLF messages are not frequently
dropped. Loss of at least one FRLF message destined for each of the
failed destination's neighbors would ensure that no neighbor
generates an RPDF packet. If there are k such neighbors with each
needing to receive all k-1 FRLF packets from the other neighbors,
and the probability of packet dropping is p, then an upper bound on
Internet Draft Fast Restore August 2000
the probability of this undesirable, joint packet dropping event is
p^k. That is, each of the k neighbors must have missed at least one
packet from some other neighbor. If control packets are given
reasonable priority (suitably small p), then this will not occur
very frequently (i.e. only with probability p^k).
If it does occur, each of these neighbors would start to
independently transmit their queued RPLF packets (which is what this
router failure detection mechanism intends to avoid). This results
in the aforementioned distributed partition detection process and
the worst-case consequences regarding convergence in this case
depend on the particular routing algorithm.
6. Conclusions
The fast restoration processing occurs entirely at layer 3 and does
not require special layer 2 considerations, although a layer 2 link
activation/failure notification trigger may be used if available.
The overall restoration approach is constrained only by the speed at
which a router can create, transmit and process the localized
routing updates. Increases in processing and transmission speeds
will only improve restoration performance over time.
It is anticipated that restoration can frequently be achieved with
little or no noticeable effect on voice over IP sessions. Some
degradation would likely be experienced by higher rate sessions
given current hardware capabilities.
The approach offers immunity to link instability (such as that
triggered by malfunctioning interface cards). A link that flip-
flops between UP and DOWN will effectively be removed from the
forwarding paths of its adjacent routers without triggering domain-
wide routing updates or routing re-convergence computations. Thus
the network will not melt down due to localized link problems.
The approach offers reasonable immunity to router instability as
well. A router that frequently goes UP and DOWN will not create
domain-wide routing messages (route building followed by erasure) if
it can quickly reboot. If it cannot, then these messages still only
effect its own prefixes. With regards to the remaining network
prefixes, the routing protocol will treat such router instability as
a set of individually unstable links, effectively removing them from
the forwarding paths for these prefixes and isolating the network
from harm.
The net effect then is that one can deploy much larger domains with
greatly reduced risk of domain melt down while enabling fast
restoration of IP flows when necessary.
Internet Draft Fast Restore August 2000
7. References
[BGPrestart] S. Ramachandra et al., "Graceful Restart mechanism for
BGP", Internet-Draft, draft-ramachandra-bgp-restart-00.txt, (work in
progress), August 2000.
[FLIP] H.Sandick et al., "Fast LIveness Protocol (FLIP)", Internet
Draft, draft-sandiick-flip-00.txt, February 2000.
[IAB99] M. Kaat, "Overview of 1999 IAB Network Layer Workshop",
Internet-Draft, draft-iab-ntwlyrws-over-03.txt, July 2000.
[MobileIP] C.E. Perkins, ``IP Mobility Support," Internet RFC 2002,
Oct 1996.
[OLSR] P. Jacquet, P. Muhlethaler and A. Qayyum, "Optimized Link
State Routing Protocol", Internet-Draft, draft-ietf-manet-olsr-
01.txt, February 2000.
[OSPF] J. Moy, "OSPF Version 2", Internet RFC 2328, April 1998.
[RIP] G. Malkin, "RIP Version 2", Internet RFC 2453, November 1998.
[TORA] V. Park, M. S. Corson, "The Temporally-Ordered Routing
Algorithm", Proc. IEEE INFOCOM '97, Kobe, Japan, April 1997.
[TORAdraft] V. Park, M. S. Corson, "Temporally-Ordered Routing
Algorithm (TORA) Version 1 Functional Specification", Internet-Draft
(work in progress), draft-ietf-manet-tora-spec-02.txt, October 1999.
Author's Addresses
M. Scott Corson
University of Maryland
Ansible Systems
(+1) 301-405-6630
corson@isr.umd.edu
mscorson@ix.netcom.com
Alan O'Neill
BT
(+44) 1473-646459
alan.w.oneill@bt.com
George Tsirtsis
Flarion Technologies
(+44) 020 88260073
g.tsirtsis@Flarion.com
gtsirt@hotmail.com
Internet Draft Fast Restore August 2000
Copyright Notice
Placeholder for ISOC copyright.