Internet DRAFT - draft-chavali-bgp-prefixlimit
Network Working Group (Editor) Srikanth Chavali
INTERNET DRAFT Vasile Radoaca
Expiration Date: October 2004 Nortel Networks, Inc.
Mo Miri
BellSouth
Luyuan Fang
AT&T
(Editor) Susan Hares
NextHop Technologies
April 2004
Peer Prefix Limits Exchange in BGP
draft-chavali-bgp-prefixlimit-02.txt
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document proposes a mechanism to allow BGP peers to coordinate
the setting of a limit on the number of prefixes which one BGP
speaker will send to its peer. Coordination can prevent disruption
of the peering session or discarding of routes, which can occur when
a maximum prefix limit is configured on the "receiving" peer, and the
Srikanth Chavali et.al. Expires October 2004 [Page 1]
Internet Draft draft-chavali-bgp-prefixlimits-02.txt September 2003
"sending" peer exceeds the limit.
1. Terms
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.
In this document we use the term "BGP sender" to refer to a BGP
speaker which is advertising prefixes to its peer. We use the term
"BGP receiver" to refer to a BGP speaker which is receiving prefixes
from its peer. Although it is clear that in reality each peer is
usually both a "BGP sender" and a "BGP receiver", we emphasize a
unidirectional relationship in this document for clarity.
2. Introduction
There are many scenarios where BGP [BGP-4] peering may be established
between two speakers in which there is an expectation that some
limited number of prefixes will be announced by a given speaker.
Several implementations of BGP offer a configuration option that
allows a BGP receiver to provision a limit to the number of prefixes
it will accept from a specific peer. When the limit is exceeded, then
there are generally two options: the prefixes exceeding the limit can
be dropped by the BGP receiver, or the peering session may be
terminated by the BGP receiver and restarted at a later time. Neither
of these options is desirable.
Dropping prefixes leads to network unreliability. Terminating the BGP
session is probably worse, since all traffic between the peers will
typically be disrupted, even for those prefixes which were advertised
before the limit was reached. These effects may be due to network
changes, misconfigurations, miscommunications, or other factors where
the number of prefixes advertised from a BGP sender to the receiver
exceeds the expected number, and the configurations must be revised.
Some of the effects are described in detail in [BGP-STUDY].
The basic functionality proposed here is for the BGP speakers to
exchange three prefix limits: "warning", "stop receiving" and
"disconnect". Of these, the "stop receiving" limit is required for
this functionality, while the others are OPTIONAL. The limits are
exchanged as capabilities in the OPEN message when the session is
established, and via the dynamic capability exchange while the BGP
session is up.
3. Definition of Prefix based limits
Prefix limits are encoded as an optional capability parameter
[BGP-CAP] in the BGP OPEN message [BGP-4] by each BGP speaker, as
shown below:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   type-code   |    length     |              AFI              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     SAFI      |                 Must be Zero                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             TLVs                              |
.                                                               .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Mandatory TLV (Type-Length-Value):
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  sub code 2   |    length     |limit indicator| Must be zero  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             maximum prefix limit (Stop Receiving)             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Optional TLVs:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  sub code 1   |    length     |limit indicator| Must be zero  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     warning prefix limit                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  sub code 3   |    length     |limit indicator| Must be zero  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      reset prefix limit                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The meaning of each of the capability fields shown above is as
follows:
Type-Code (1 octet):
code identifying this capability (TBD)
Length (1 octet):
Length of the capability value fields
Address Family Identifier AFI (2 octets):
This along with the Subsequent Address Family Identifier field
identifies the Network Layer Protocol associated with the Network
Address.
Subsequent Address Family Identifier SAFI (1 octet):
This along with the Address Family Identifier field identifies the
Network Layer Protocol associated with the Network Address.
sub code 1 (1 octet):
This OPTIONAL sub code identifies the number of routes sent before a
warning is raised. The warning is raised by the BGP speaker that
detects the condition.
Length (1 octet):
Length of the subcode. It has the same semantics attached to it in
all the subcodes.
Limit Indicator (1 octet):
This octet can be assigned a value of 0 or 1. A value of 0 means that
the sender SHOULD NOT raise any warning. A value of 1 means that the
warning indication is required and SHOULD be raised by the sender
when the number of routes it has advertised equals the limit carried
in the subcode. It has the same meaning in all the subcodes. However,
in subcode 2 it can take a value of 1 only. The warning mechanisms
are described in the operation section of this draft.
warning prefix limit (4 octets):
Number of routes sent by the BGP sender. The value of this field
depends on the maximum prefix limit and SHOULD always be less than
it.
sub code 2 (1 octet):
This mandatory sub code is used to identify the number of routes sent
before the sender BGP speaker needs to stop advertising routes to its
receiving BGP speaker.
maximum prefix limit (4 octets):
Number of routes sent by the sender BGP speaker. When the advertising
BGP speaker hits this limit, it stops the route advertisement.
sub code 3 (1 octet):
This OPTIONAL sub code is used to identify the number of routes
received after which the BGP speaker will reset the peering session.
Note that this situation is never encountered when both speakers
adhere to this document; it arises only under error conditions. The
error conditions are beyond the scope of this document.
reset prefix limit (4 octets):
Number of routes sent by the sender BGP speaker. The value of this
field depends on the maximum prefix limit and SHOULD always be
greater than it.
We refer to the warning prefix limit, maximum prefix limit and the
reset prefix limit as prefix limits in this document for the ease of
illustration.
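
As an illustration only, the layout above can be packed and parsed as
in the following Python sketch. It is not part of the proposal. The
type-code value 0xF0 is a hypothetical placeholder (the draft leaves
it TBD), and the sketch assumes that the TLV length counts the six
octets that follow the length field, which the draft does not state
explicitly. All function and constant names are invented for the
example.

```python
import struct

# Hypothetical placeholder: the draft leaves the capability type-code TBD.
PREFIX_LIMIT_CAP_CODE = 0xF0

# Sub codes from section 3.
SUB_WARNING, SUB_MAXIMUM, SUB_RESET = 1, 2, 3

# Assumption: the TLV length counts the octets after the length field
# (limit indicator + must-be-zero octet + 4-octet limit = 6).
TLV_BODY_LEN = 6


def encode_tlv(sub_code, indicator, limit):
    """Pack one TLV: sub code, length, limit indicator, zero octet, limit."""
    if sub_code == SUB_MAXIMUM and indicator != 1:
        raise ValueError("limit indicator must be 1 in sub code 2")
    return struct.pack("!BBBBI", sub_code, TLV_BODY_LEN, indicator, 0, limit)


def encode_capability(afi, safi, maximum, warning=None, reset=None):
    """Build the capability: header, AFI, SAFI, 3 zero octets, then TLVs.

    warning and reset are optional (indicator, limit) pairs.
    """
    tlvs = b""
    if warning is not None:
        tlvs += encode_tlv(SUB_WARNING, *warning)
    tlvs += encode_tlv(SUB_MAXIMUM, 1, maximum)
    if reset is not None:
        tlvs += encode_tlv(SUB_RESET, *reset)
    value = struct.pack("!HB3x", afi, safi) + tlvs
    return struct.pack("!BB", PREFIX_LIMIT_CAP_CODE, len(value)) + value


def decode_capability(data):
    """Return (afi, safi, {sub_code: (indicator, limit)})."""
    if len(data) < 8:
        raise ValueError("capability too short")
    code, length = struct.unpack_from("!BB", data, 0)
    if code != PREFIX_LIMIT_CAP_CODE or length != len(data) - 2:
        raise ValueError("not a well-formed prefix limit capability")
    afi, safi = struct.unpack_from("!HB", data, 2)
    limits = {}
    offset = 8  # 2-octet header + AFI + SAFI + 3 zero octets
    while offset < len(data):
        if len(data) - offset < 2 + TLV_BODY_LEN:
            raise ValueError("truncated TLV")
        sub_code, tlv_len, indicator, _zero, limit = struct.unpack_from(
            "!BBBBI", data, offset)
        if tlv_len != TLV_BODY_LEN:
            raise ValueError("unexpected TLV length")
        limits[sub_code] = (indicator, limit)
        offset += 2 + tlv_len
    if SUB_MAXIMUM not in limits:
        raise ValueError("mandatory sub code 2 is missing")
    return afi, safi, limits
```

For example, encode_capability(1, 1, 1000, warning=(1, 800)) would
describe limits for IPv4 unicast (AFI 1, SAFI 1) with a warning at 800
routes and a stop at 1000 routes.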
4. Operation
4.1 Exchanging the configured prefix limits
BGP speakers exchange the prefix limits as an optional capability
parameter [BGP-CAP] as described in section 3.
+--------+ +--------+
| A | <-----------------> | B |
+--------+ +--------+
Figure 1
In figure 1, both BGP speakers A and B exchange the prefix limits
(defined in section 3) to indicate support for this capability. Each
of A and B sets these limits, along with the actions associated with
each of them, in the capability message before exchanging them. The
warning prefix limit and reset limit values are determined based on
the configured maximum prefix limit. They are typically a percentage
of the maximum prefix limit. The exact percentage values are beyond
the scope of this document. The maximum prefix limit configured on A
for the peer B is the maximum number of prefixes that A expects to
receive from B. B is informed of this limit through the new
capability described in section 3. The same interpretation applies to
B too.
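
The derivation of the warning and reset limits from the configured
maximum can be sketched as follows. The percentages (80 and 120) and
all names are hypothetical, since the exact values are out of scope
for this document. The sketch only enforces the ordering required in
section 3: warning below the maximum, reset above it.

```python
from typing import NamedTuple


class PrefixLimits(NamedTuple):
    warning: int
    maximum: int
    reset: int


def derive_limits(maximum, warning_pct=80, reset_pct=120):
    """Derive warning and reset limits as percentages of the maximum.

    The default percentages are illustrative assumptions only.
    """
    if maximum <= 0:
        raise ValueError("maximum prefix limit must be positive")
    if not 0 < warning_pct < 100:
        raise ValueError("warning percentage must be between 0 and 100")
    if reset_pct <= 100:
        raise ValueError("reset percentage must be above 100")
    # Round down so the warning limit stays below the maximum.
    warning = maximum * warning_pct // 100
    # Round up, and keep the reset limit strictly above the maximum.
    reset = max(maximum + 1, (maximum * reset_pct + 99) // 100)
    return PrefixLimits(warning, maximum, reset)
```

With these assumed defaults, a configured maximum of 1000 prefixes
gives a warning limit of 800 and a reset limit of 1200.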
4.2 Route processing after prefix limits exchange
In figure 1, both A and B maintain a count of the routes that they
receive from each other. Route processing is illustrated using the
case where B sends route advertisements to A. The same operational
procedures apply to the other case of A sending route advertisements
to B. As shown in figure 1, B applies the outbound route policies to
the Adjacent-Rib-Out and then checks the prefix limits before
advertising routes.
4.2.1 Processing When Warning Limit Encountered
+--------+ +--------+
| A | <-----------------> | B |
+--------+ +--------+
B detects warning
prefix limit
<------ generates dynamic
capability message
to A
Figure 2
In figure 2, in the course of advertising routes to A, B hits the
warning prefix limit and generates a dynamic capability message
[BGP-DYN-CAP] destined to A (if the warning limit indicator is turned
on). This message comprises the capability received from A and serves
as a warning indicator. Either A or B, or both of them, could
generate this message, depending on the timing of warning limit
detection. B and A MAY choose to raise an internal warning when this
condition is detected. Following the warnings, both A and B continue
advertising routes normally to each other.
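
A minimal sketch of the warning check on B follows. The names are
hypothetical, and the already_warned flag, used to send the message
only once per crossing, is an assumption of the sketch rather than
something the draft specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WarningLimit:
    indicator: int  # limit indicator octet from sub code 1 (0 or 1)
    limit: int      # warning prefix limit received from the peer


def should_send_warning(advertised: int,
                        warning: Optional[WarningLimit],
                        already_warned: bool) -> bool:
    """Return True when B should send the dynamic capability to A."""
    # The warning TLV is OPTIONAL, and an indicator of 0 disables it.
    if warning is None or warning.indicator != 1:
        return False
    # Assumption: warn once, when the count first reaches the limit.
    return advertised >= warning.limit and not already_warned
```

Route advertisement continues regardless of the result. Only the
"stop receiving" limit suppresses further advertisements.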
4.2.2 Processing When Stop Limit Encountered
+--------+ +--------+
| A | <-----------------> | B |
+--------+ +--------+
B detects maximum
prefix limit
<------ generates dynamic
capability message
to A and stop route
advertisement to A
Figure 3
In figure 3, B during route advertisement detects that the maximum
prefix limit for route advertisement is reached. It SHOULD stop
further route advertisements to A. B will discard any route received
with a new prefix once the stop limit has been hit.
B then SHOULD send a Dynamic Capability [BGP-DYN-CAP] to A indicating
the current capability if the limit indicator is set. As in the case
of the warning prefix limit condition, either A or B, or both, could
send the dynamic capability [BGP-DYN-CAP]. Any route withdrawal to A
is automatically recorded and SHOULD implicitly restore the
configured announce policy (if one is configured).
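
The stop behavior on B can be sketched as follows. The class and
method names are hypothetical. The sketch assumes that an update for
a prefix already advertised to A does not count against the limit,
which the draft implies by speaking of routes "with a new prefix" but
does not state outright.

```python
from typing import Set


class LimitedSender:
    """Tracks what B has advertised to A under A's maximum prefix limit."""

    def __init__(self, maximum: int) -> None:
        self.maximum = maximum
        self.advertised: Set[str] = set()

    @property
    def stopped(self) -> bool:
        """True once the number of advertised prefixes hits the limit."""
        return len(self.advertised) >= self.maximum

    def advertise(self, prefix: str) -> bool:
        """Return True if the route may be sent to A, False if discarded."""
        if prefix in self.advertised:
            return True  # assumption: updates to known prefixes still flow
        if self.stopped:
            return False  # stop limit hit: new prefixes are discarded
        self.advertised.add(prefix)
        return True

    def withdraw(self, prefix: str) -> None:
        """Record a withdrawal, which frees room under the limit."""
        self.advertised.discard(prefix)
```

Because stopped is computed from the current count, a withdrawal
implicitly lifts the stop condition, mirroring the implicit restore
described above.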
4.3 Prefix limit changes
If the prefix limits need to change, each BGP speaker whose
configuration changes for a peer SHOULD dynamically [BGP-DYN-CAP]
inform that peer of the change. Such changes SHOULD be handled as
described in the following sub-sections.
4.3.1 Processing when maximum prefix limit is increased
When the prefix limits are increased in the configuration of A in
figure 1, A SHOULD inform B as described in 4.3. B SHOULD then
restart the route advertisements, and it MAY choose either to do so
incrementally from the Adjacent-Rib-Out for A or to make use of the
Route Refresh mechanism [BGP-RREFRESH]. This avoids a restart of the
BGP peering and the associated network traffic and service
disruption. If the maximum prefix limit has not been reached when B
receives the increased prefix limits, then B SHOULD note them and
continue with its advertisements to A until these limits are reached.
4.3.2 Processing when the maximum prefix limit is decreased
When the prefix limits are decreased in the configuration of A (refer
to figure 1), B SHOULD be informed as described in 4.3. B SHOULD then
note this information and SHOULD stop route advertisement immediately
if the number of route advertisements exceeds the new maximum prefix
limit for A. By doing so, B avoids processing routes which would be
discarded by A when it detects the maximum prefix limit condition.
From that point, B follows the process described in 4.2 for route
processing.
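
The two cases in 4.3.1 and 4.3.2 can be sketched from B's point of
view as a single function. The names are hypothetical, and the sketch
assumes that B keeps a list of prefixes it suppressed earlier
(pending, without duplicates). It models only the incremental restart
from the Adjacent-Rib-Out, not the Route Refresh alternative.

```python
from typing import List, Set, Tuple


def apply_new_maximum(advertised: Set[str],
                      pending: List[str],
                      new_maximum: int) -> Tuple[List[str], bool]:
    """Return (prefixes B may advertise now, whether B stays stopped)."""
    room = new_maximum - len(advertised)
    if room <= 0:
        # Limit decreased to or below the current count: stop at once.
        return [], True
    # Limit increased: resume incrementally with the suppressed prefixes.
    to_send = [p for p in pending if p not in advertised][:room]
    stopped = len(advertised) + len(to_send) >= new_maximum
    return to_send, stopped
```

The draft does not ask B to withdraw routes when the limit drops
below the number already advertised, so the sketch only stops new
advertisements in that case.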
5. Error Handling
New error codes along with the sub-codes are defined (TBD).
If a BGP peer does not support this capability and receives it, then
the peer sends a NOTIFICATION with the appropriate error code and the
sub-code. The BGP speaker then SHOULD re-initiate the peering session
without the unsupported capability.
5.1 Open Message responded to with Notification
OPEN messages can be rejected by the BGP speakers, listing the
unsupported capabilities. The OPEN Message Error sub-code for an
unsupported capability is 7 [BGP-CAP]. The maximum prefix TLV will be
included in the list of capabilities.
5.2 Capability Message responded to with Notification
For errors in Dynamic Capabilities, a NOTIFICATION message may be
sent with the Capability message error code (7) [BGP-DYN-CAP] set.
The current sub-codes for this error are:

Subcode    Symbolic Name
   1       Invalid Action Value
   2       Invalid Capability Length
   3       Malformed Capability Value
   4       Unsupported Capability Code

Support for the Maximum Prefix value negotiations will require the
addition of the following sub-code:

   5       Invalid Capability Value
If the Maximum Prefix capability code is not supported, the
NOTIFICATION message will be returned with an error code of 7 and a
sub-code of 4 (Unsupported Capability Code). If the Maximum Prefix
capability is supported but the value is not acceptable to the
receiving node, the NOTIFICATION can be sent with sub-code 5 (Invalid
Capability Value) and the data field set to the Maximum Prefix TLVs
that are not acceptable.
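
The choice between the two sub-codes can be sketched as follows. The
draft does not define what makes a value unacceptable, so the sketch
assumes a hypothetical local ceiling (acceptable_maximum). Leaving
the data field empty in the unsupported case is also an assumption.

```python
from typing import Optional, Tuple

CAPABILITY_MESSAGE_ERROR = 7     # error code given in section 5.2
UNSUPPORTED_CAPABILITY_CODE = 4  # existing sub-code
INVALID_CAPABILITY_VALUE = 5     # sub-code proposed by this draft


def notification_for(supported: bool,
                     offered_maximum: int,
                     acceptable_maximum: int,
                     offending_tlvs: bytes) -> Optional[Tuple[int, int, bytes]]:
    """Return (error code, sub-code, data), or None if no error applies."""
    if not supported:
        # Assumption: no data is carried for an unsupported capability.
        return (CAPABILITY_MESSAGE_ERROR, UNSUPPORTED_CAPABILITY_CODE, b"")
    if offered_maximum > acceptable_maximum:
        # The data field carries the TLVs that are not acceptable.
        return (CAPABILITY_MESSAGE_ERROR, INVALID_CAPABILITY_VALUE,
                offending_tlvs)
    return None
```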
5.3 Cease message for peering reset
When the reset maximum prefix value is exceeded, the peering session
SHOULD be dropped. In that case the Cease code in the NOTIFICATION
message will be used. The [CEASECODE] draft assigns subcode 1 to the
case where the maximum number of prefixes is exceeded. The data field
carries the maximum prefix upper bound. This field should be followed
by an optional 1-octet field that allows the maximum prefix sub-codes
to be encoded.
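
A Cease NOTIFICATION of this kind could be built as sketched below.
The sketch assumes the data field layout used by the cease subcode
specification (AFI, SAFI, then a 4-octet prefix upper bound) and
appends the optional sub-code octet proposed above. The function name
and the limit_sub_code parameter are hypothetical.

```python
import struct

BGP_MARKER = b"\xff" * 16   # all-ones marker in the BGP message header
MSG_NOTIFICATION = 3        # BGP message type for NOTIFICATION
ERR_CEASE = 6               # Cease error code
CEASE_MAX_PREFIXES = 1      # subcode 1 from [CEASECODE]


def build_cease(afi, safi, upper_bound, limit_sub_code=None):
    """Build a Cease NOTIFICATION for an exceeded prefix limit."""
    # Assumed data layout: AFI (2), SAFI (1), prefix upper bound (4).
    data = struct.pack("!HBI", afi, safi, upper_bound)
    if limit_sub_code is not None:
        # Optional 1-octet field proposed in section 5.3.
        data += struct.pack("!B", limit_sub_code)
    body = struct.pack("!BB", ERR_CEASE, CEASE_MAX_PREFIXES) + data
    length = len(BGP_MARKER) + 3 + len(body)  # header is 19 octets
    return BGP_MARKER + struct.pack("!HB", length, MSG_NOTIFICATION) + body
```

For example, build_cease(1, 1, 1200, limit_sub_code=3) would report
that the reset prefix limit of 1200 was exceeded for IPv4 unicast.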
6. Security Considerations
This document does not change the underlying security issues in the
BGP protocol. It does, however, provide an additional mechanism to
protect against Denial of service attacks based on exceeding
configured maximum prefix limits.
7. References
[BGP-4] Rekhter, Y., and T. Li, "A Border Gateway Protocol 4 (BGP-
4)", draft-ietf-idr-bgp4-20.txt. Work in progress.
[BGP-CAP] Chandra, R., Scudder, J., "Capabilities Advertisement with
BGP-4", RFC 3392, November 2002.
[BGP-RREFRESH] Chen, E., "Route Refresh Capability for BGP-4", RFC
2918, September 2000.
[BGP-DYN-CAP] Chen, E., Sangli, S. R., "Dynamic Capability for BGP-
4", draft-ietf-idr-dynamic-cap-03.txt. Work in progress.
[BGP-STUDY] Chang, D., Govindan, R., Heidemann, J., "An Empirical
Study of Router Response to Large BGP Routing Table Load", ACM
SIGCOMM Internet Measurement Workshop, pp. 203-208, Marseille,
France, November 2002.
[CEASECODE] Chen, E., "Subcodes for BGP Cease Notification Message",
draft-ietf-idr-cease-subcode-05.txt. Work in progress.
8. IANA Considerations
This document uses a new capability type for the support of prefix
limits and the corresponding NOTIFICATION code along with the
sub-codes for non-support. These must be assigned by IANA.
9. Acknowledgements
The authors would like to thank George Matey, Marten Terpstra, Yakov
Rekhter, Enke Chen, Rob Thomas, Manish Gupta, Dan Joyal, Rajesh
Saluja and Elwyn Davies for their review and comments.
10. Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
11. Authors' Addresses:
Srikanth Chavali
Vasile Radoaca
Paul Knight
Nortel Networks
600 Technology Park Drive
Billerica, MA 01821 USA
Email: schavali@nortelnetworks.com
vasile@nortelnetworks.com
paul.knight@nortelnetworks.com
Mo Miri
BellSouth
575 Morosgo Drive
4A62
Atlanta, GA 3032
home: +1 404-499-5526
email: mohammad.miri@bellsouth.com
Luyuan Fang
ATT Labs
200 Laurel Avenue,
Room C2-3B35,
Middletown, NJ 07748
Phone: +1 732 420 1921
Email: luyuanfang@att.com
Susan Hares
NextHop Technologies
825 Victors Way
Suite 100
Ann Arbor, MI 48108
Phone: +1 734 222 1610
Email: skh@nexthop.com