Internet Engineering Task Force                       Sally Floyd/ACIRI
INTERNET DRAFT                                      Steve Bellovin/AT&T
draft-floyd-pushback-messages-00.txt                John Ioannidis/AT&T
                                               Kireeti Kompella/Juniper
                                                       Ratul Mahajan/UW
                                                      Vern Paxson/ACIRI
                                                             July, 2001
Expires: January, 2002
Pushback Messages for Controlling Aggregates in the Network
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
Floyd et al. Standards Track [Page 1]
Internet Draft Pushback Messages July 2001
1. Introduction
Pushback [MB01] is designed to detect and control high bandwidth
aggregates in the network. An aggregate is a collection of packets
with a common property. For instance, with the destination prefix as
the common property, all packets with a matching prefix define an
aggregate. During a time of severe congestion from a flash crowd or
from a denial of service (DoS) attack, a router might enforce a rate-
limit on the traffic aggregate responsible for the congestion. In
addition, the congested router could ask adjacent upstream routers to
limit the amount of traffic they send for that aggregate. This
upstream rate-limiting is called pushback and can be recursively
propagated to routers further upstream. It serves to spatially
isolate the traffic aggregate so that other traffic sharing the same
downstream links is not impaired by the aggregate.
By imposing only a rate limit, rather than a complete blockage, of
the aggregate, pushback aims to minimize "collateral damage" suffered
by the non-hostile traffic matching the aggregate during a DoS
attack. In general, the hope is that during a DoS attack, pushback
will propagate sufficiently far in the network so that non-hostile
traffic fits within the rate limit imposed on its specific path to
the destination, and accordingly does not have its performance
limited.
This document specifies messages passed between cooperating routers.
It does not address procedures in routers for identifying aggregates
to be rate-limited, or for determining the rate-limits for those
identified aggregates. The goal is to specify an experimental
standard for pushback messages so that we can learn from the
experimental use of pushback. We expect that the specifications for
pushback messages will evolve over time, as we gain more experience
with their use.
There are two main pushback messages - REQUEST and STATUS. Pushback
REQUEST messages are sent to upstream routers asking them to rate-
limit the aggregate. Such a request for rate-limiting is only
advisory; the upstream router is not compelled to follow the request.
As part of rate-limiting on behalf of the downstream router, the
upstream router sends periodic STATUS messages to the downstream
router. The STATUS messages report the arrival rate for that
aggregate at the upstream router, and enable the congested router to
take decisions regarding the continuance of pushback. In addition to
REQUEST and STATUS messages, REFRESH messages reinforce the soft-
state rate-limiting, and CANCEL messages terminate it.
Pushback messages can be used in two ways. In one pushback type,
pushback messages are used to request upstream rate-limiting for the
specified aggregate. In a second pushback type (DUMMY_PROP),
pushback messages are used simply to get information about the
arrival rates of an aggregate at upstream routers.
A pushback in progress can be visualized as a tree (or, with
multipath routing, possibly a graph), where the congested router
initiating the pushback is the root. The parent of a router in the
tree is the downstream router from which it got the pushback REQUEST.
Routers that do not propagate pushback further are leaves of the
tree.
The following sections specify the format for pushback messages and
the timing of REFRESH and STATUS messages. This document also
specifies the procedures for propagating pushback REQUEST and REFRESH
messages further upstream, and for merging the resulting STATUS
messages from upstream routers.
2. The Common Header
All pushback messages have the following fields prepended in the
header.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version |AdF| Msg Type | Rate-Limiting Session ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Pushback Initiating Router's IP Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender's IP Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The first field specifies the version of the pushback protocol the
sender speaks. The protocol described in this document is Version=0.
"AdF" specifies the type of address family used. Currently defined
values are IPv4=0 and IPv6=1. Other values are reserved for future
definition.
The message type is one of REQUEST (= 0), REFRESH (= 1), STATUS (=
2), or CANCEL (= 3). The fields following the common header are
dependent on the type of the message.
The Rate-limiting Session ID (RLSID) is generated by the congested
router initiating pushback. It MUST be unique among all current rate-
limiting sessions initiated by this router. The RLSID combined with
the IP address of the congested router defines a pushback session
over the whole network. A router receives both these fields from the
downstream router requesting pushback. These fields enable the
routers to map incoming messages to the appropriate rate-limiting
session. A router MAY use its different addresses when initiating
different pushback sessions.
Note that if the router's address reflects a private addressing
realm, then it MUST be altered upon crossing into a different
addressing realm. Ideally this transformation uses a new address
unique to the router; if not available, then the address of the
router propagating pushback (by sending the message) into the
different realm is used.
The sender's IP address has been included in the pushback message,
making message interpretation independent of the IP source address
field. This eliminates any confusion regarding which interface's
address is included in that field (that is, whether it is the IP
address of the message-sending interface, or of some other interface
at the router). It also helps when pushback messages traverse
between routers that are not directly connected. If a sender sends
pushback messages to two peers in two different addressing realms, so
that the sender doesn't have a unique address to send to both peers,
then the sender will use different values for the sender's IP address
in the two messages.
Both IP addresses are 128-bit fields for IPv6.
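As an illustration only, the common header could be serialized as in the following Python sketch. The field widths are an assumption, since the text does not state them: Version is taken as 6 bits, AdF as 2 bits, Msg Type as 8 bits, and the RLSID as 16 bits. The helper name pack_common_header is hypothetical and is not part of this specification.

```python
import ipaddress
import struct

# Message types and address families as listed in the text above.
REQUEST, REFRESH, STATUS, CANCEL = 0, 1, 2, 3
ADF_IPV4, ADF_IPV6 = 0, 1


def pack_common_header(msg_type, rlsid, initiator, sender, version=0):
    """Pack the common header (hypothetical helper).

    ASSUMED widths: Version 6 bits, AdF 2 bits, Msg Type 8 bits,
    RLSID 16 bits. The draft text does not state these widths.
    """
    init_addr = ipaddress.ip_address(initiator)
    send_addr = ipaddress.ip_address(sender)
    if init_addr.version != send_addr.version:
        raise ValueError("both addresses must use the same address family")
    adf = ADF_IPV4 if init_addr.version == 4 else ADF_IPV6
    # Version occupies the high 6 bits, AdF the low 2 bits (assumption).
    first_octet = ((version & 0x3F) << 2) | adf
    fixed = struct.pack("!BBH", first_octet, msg_type, rlsid)
    # Addresses are 4 bytes each for IPv4 and 16 bytes each for IPv6.
    return fixed + init_addr.packed + send_addr.packed


header = pack_common_header(REQUEST, 7, "192.0.2.1", "192.0.2.2")
```

Deriving AdF from the addresses, rather than passing it separately, keeps the AdF value and the address lengths from disagreeing.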
2. The Pushback REQUEST Message
Pushback requests (type REQUEST=0) are sent upstream when a router
wants the aggregate to be rate-limited upstream. The fields in a
pushback REQUEST, in addition to the common header, are shown below.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| PType |SRMode | Max Depth | Depth in Tree | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Bandwidth Limit |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Expiration Time |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Status Frequency |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
| Congestion Signature |
...............................................
...............................................
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
"PType" denotes the type of pushback and determines the upstream
router's behavior in various ways. PType=0 (HI_DROP_PROP) requests
the upstream router to propagate pushback if the restricted aggregate
suffers a high drop rate due to the restriction. (This is the usual
mode of pushback.)
PType=1 (ALWAYS_PROP) requests that the upstream router propagate
pushback irrespective of the drop rate experienced by the aggregate.
It would typically be used when the aggregate is known with high
confidence to be malicious.
PType=2 (DUMMY_PROP) indicates no actual rate-limiting should take
place---the downstream router is just interested in the arrival rate
estimate of this aggregate. The extent of propagation of these
pushback messages is controlled by the congested router using "Max
Depth" (explained below) to determine the arrival rate of an
aggregate several hops upstream.
Other values of PType are reserved for future definition.
The pushback requester can specify the mode in which it wants
feedback with the "SRMode" (Status Reporting Mode) field. SRMode=0
(COMPACT) specifies the feedback should be just the total arrival-
rate estimate of the aggregate. SRMode=1 (CLOSEST) specifies the
feedback should include per-router feedback for upstream routers, and
if there is not room for all of them, then those closest (lowest hop
count) should be preferred. SRMode=2 (FURTHEST) is similar to
CLOSEST, but prefers routers further away from the congested router.
SRMode=3 (SAMPLE) specifies that instead a pseudo-random subset of
the upstream routers should be included. These last two modes
facilitate different forms of mapping to aid with tracing back the
attack sources. Other values are reserved for future definition.
Using "Max Depth" the congested router can control the maximum number
of hops pushback will propagate to. The special value of 255
indicates unrestricted propagation; 254 indicates that pushback
should be propagated up to, but not across, the AS boundary.
The depth in the tree is the distance of the requesting node from the
root of the pushback tree. The depth of the root is zero, and a
child's depth is one more than the depth of its parent. Depth
information is useful in setting timers for sending feedback. If a
router receives overlapping pushback requests from multiple peers,
its depth is one more than the minimum depth of its parents.
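The depth rule and the special Max Depth values can be sketched in Python as follows. The function names are hypothetical. The comparison used for ordinary Max Depth values (a router at depth d propagates further only while d is less than Max Depth) is an assumption about how the hop limit is applied, as is the upstream_in_same_as flag used to model the AS boundary.

```python
UNRESTRICTED = 255  # Max Depth value: no hop limit
AS_BOUNDARY = 254   # Max Depth value: propagate up to, not across, the AS boundary


def own_depth(parent_depths):
    """Depth is one more than the minimum depth among the router's parents."""
    return min(parent_depths) + 1


def may_propagate(depth, max_depth, upstream_in_same_as=True):
    """Decide whether a router at `depth` may send pushback further upstream."""
    if max_depth == UNRESTRICTED:
        return True
    if max_depth == AS_BOUNDARY:
        # Hypothetical flag: True when the upstream peer is in the same AS.
        return upstream_in_same_as
    # ASSUMPTION: the upstream neighbor would sit at depth + 1, which must
    # not exceed Max Depth.
    return depth < max_depth
```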
The bandwidth limit is a single precision (32-bit) floating point
number in IEEE format, as described in [SPG97]. It expresses the
rate in bytes per second. It is only a requested upper bound for the
bandwidth to be given to the aggregate. If congested, the upstream
router could send less than that upper bound. In addition, the
upstream router is not *required* to observe the requested bound.
The expiration time is the time period after which the pushback
request expires if no REFRESH messages arrive. The status frequency
gives the time after which the upstream routers should send STATUS
messages downstream. Both the times are specified in milliseconds,
and represented using 32-bit integers in network byte order.
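A minimal Python sketch of the fixed REQUEST fields follows. It assumes PType and SRMode are 4 bits each and that Max Depth, Depth in Tree, and Reserved are one octet each; the text does not state these widths. It also assumes the floating point bandwidth limit is sent in network byte order like the integer fields. The function names are hypothetical.

```python
import struct

# PType and SRMode values as listed in the text above.
HI_DROP_PROP, ALWAYS_PROP, DUMMY_PROP = 0, 1, 2
COMPACT, CLOSEST, FURTHEST, SAMPLE = 0, 1, 2, 3

# ASSUMED layout: PType/SRMode share one octet, then Max Depth, Depth in
# Tree, Reserved, a 32-bit float, and two 32-bit unsigned integers.
_REQUEST_FMT = "!BBBBfII"


def pack_request_fields(ptype, srmode, max_depth, depth,
                        limit_bytes_per_sec, expiration_ms, status_ms):
    """Pack the REQUEST fields that follow the common header."""
    first_octet = ((ptype & 0x0F) << 4) | (srmode & 0x0F)
    return struct.pack(_REQUEST_FMT, first_octet, max_depth, depth, 0,
                       limit_bytes_per_sec, expiration_ms, status_ms)


def unpack_request_fields(data):
    """Inverse of pack_request_fields; returns a dict of field values."""
    first, max_depth, depth, _reserved, limit, exp_ms, status_ms = \
        struct.unpack(_REQUEST_FMT, data[:struct.calcsize(_REQUEST_FMT)])
    return {
        "ptype": first >> 4,
        "srmode": first & 0x0F,
        "max_depth": max_depth,
        "depth": depth,
        "limit_bytes_per_sec": limit,
        "expiration_ms": exp_ms,
        "status_ms": status_ms,
    }


body = pack_request_fields(HI_DROP_PROP, CLOSEST, 255, 1, 125000.0, 5000, 2000)
fields = unpack_request_fields(body)
```

The congestion signature TLVs would follow these 16 bytes.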
2.0.1 The Congestion Signature
The congestion signature is the description of the aggregate that is
to be rate-limited. Its specification is a major open question. As
an example, the congestion signature might consist of the destination
prefix(es) characterizing the aggregate. At the other extreme, we
could allow arbitrary ACLs of fields in the packet header. For this
first experimental use of pushback, we will limit the congestion
signature to depend only on the source and destination IP addresses
in the packet header. This excludes the use of aggregates defined by
the protocol field in the IP header and the port numbers in the
transport header; an example is an aggregate consisting of all DNS
traffic.
Congestion signatures are specified as TLVs (type-length-value):
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length | Value .... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...............................................
.....................Value.....................
...............................................
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| .... Value and final Padding .... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The Length field includes the Type and Length fields, as well as the
length of the Value and any Padding.
Values are padded up to 32-bit alignment. If, after doing so, more
data remains in the datagram, then it's interpreted as another TLV.
Type=0 (SRC_PREFIX) indicates a source address prefix. Its Length is
4 bytes plus the length of an address in the format specified by AdF
above. The first octet of Value (bits 16 through 23 in the first
line of the above figure) is reserved. The second octet gives the
prefix length, in the range 1-32 (IPv4) or 1-128 (IPv6). Other
values are reserved.
Type=1 (DST_PREFIX) is the same but for destination address prefix.
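A prefix TLV could be encoded as in the Python sketch below. It assumes that Type and Length are one octet each (consistent with Value starting at bit 16) and that Length is counted in bytes (consistent with "4 bytes plus the length of an address"). The helper name pack_prefix_tlv is hypothetical.

```python
import ipaddress
import struct

SRC_PREFIX, DST_PREFIX = 0, 1


def pack_prefix_tlv(tlv_type, prefix):
    """Encode one SRC_PREFIX or DST_PREFIX TLV (hypothetical helper)."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen < 1:
        raise ValueError("prefix length must be at least 1")
    addr = net.network_address.packed  # 4 bytes (IPv4) or 16 bytes (IPv6)
    # Length covers Type, Length, the reserved octet, the prefix-length
    # octet, and the address itself.
    length = 4 + len(addr)
    return struct.pack("!BBBB", tlv_type, length, 0, net.prefixlen) + addr


tlv = pack_prefix_tlv(DST_PREFIX, "128.95.0.0/16")
```

Under these assumptions a prefix TLV is 8 bytes for IPv4 and 20 bytes for IPv6, so it is already 32-bit aligned and needs no padding.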
When all prefixes in the list are of the same type, the congestion
signature describes packets that have the corresponding field (source
or destination) matching one of the prefixes in the list. In the
presence of both source and destination prefixes, packets belonging
to the aggregate are those destined for one of the destination
prefixes *and* coming from one of the source prefixes.
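The matching rule can be sketched in Python as follows: a packet belongs to the aggregate if it matches any prefix of each type that is present. The function name in_aggregate is hypothetical, and rejecting an empty signature is an assumption, since the text does not say how one should be treated.

```python
import ipaddress


def in_aggregate(src, dst, src_prefixes, dst_prefixes):
    """Return True if a packet with addresses (src, dst) matches the signature.

    Prefixes of the same type are ORed together; the source and
    destination lists, when both present, are ANDed.
    """
    if not src_prefixes and not dst_prefixes:
        # ASSUMPTION: an empty signature is treated as an error.
        raise ValueError("congestion signature has no prefixes")
    src_addr = ipaddress.ip_address(src)
    dst_addr = ipaddress.ip_address(dst)
    src_ok = not src_prefixes or any(
        src_addr in ipaddress.ip_network(p) for p in src_prefixes)
    dst_ok = not dst_prefixes or any(
        dst_addr in ipaddress.ip_network(p) for p in dst_prefixes)
    return src_ok and dst_ok
```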
2.1 Propagating a Pushback REQUEST Message
When propagating a pushback request upstream, the router MUST insert
the correct depth information, which is one more than the depth of
its parent(s).
In addition, the destination prefixes in the congestion signature
MUST be checked to see whether they have to be *narrowed*, to
restrict the rate-limiting only to traffic headed for the downstream
router that requested pushback, as follows. Suppose the congested
router X identifies a certain aggregate A with destination prefix
128.95/16. X will ask its upstream router Y (among others) to rate-
limit traffic from aggregate A (128.95/16). However, Y cannot use
the same specification directly because while Y could be forwarding
128.95.1/24 to X, it might not be forwarding the rest of 128.95/16 to
X. If Y (and routers upstream of Y) started rate-limiting all of
128.95/16, the network would drop traffic which would not have
reached the congested router X.
To avoid this unnecessary packet-dropping, it is important that Y
look at its routing table to find prefixes within 128.95/16 that are
forwarded to X. Y has to check all extensions of the given prefix in
the routing table.
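One way to perform this narrowing is sketched below in Python. The routing table is modeled as a plain dictionary from prefix to next hop, which is a hypothetical simplification of a real forwarding table, and the function name narrow_prefix and the next-hop name "Z" are illustrative only. The sketch splits the prefix until no more-specific routes remain inside a piece, then keeps the piece if its longest-match route points to the requesting downstream router.

```python
import ipaddress


def narrow_prefix(prefix, routes, downstream):
    """Return the parts of `prefix` whose longest-match route is `downstream`.

    `routes` maps ipaddress network objects to next-hop names; all
    prefixes are assumed to be in the same address family.
    """
    more_specific = [r for r in routes
                     if r != prefix and r.subnet_of(prefix)]
    if not more_specific:
        # No extensions of this prefix: one route decides for all of it.
        covering = [r for r in routes if prefix.subnet_of(r)]
        if not covering:
            return []  # no route at all for this prefix
        best = max(covering, key=lambda r: r.prefixlen)
        return [prefix] if routes[best] == downstream else []
    # Extensions exist: examine each half of the prefix separately.
    pieces = []
    for half in prefix.subnets(prefixlen_diff=1):
        pieces.extend(narrow_prefix(half, routes, downstream))
    return list(ipaddress.collapse_addresses(pieces))


# The example from the text: Y forwards only 128.95.1/24 to X.
routes = {
    ipaddress.ip_network("128.95.0.0/16"): "Z",
    ipaddress.ip_network("128.95.1.0/24"): "X",
}
narrowed = narrow_prefix(ipaddress.ip_network("128.95.0.0/16"), routes, "X")
```

Here narrowed contains only 128.95.1.0/24, so Y would rate-limit and propagate pushback for that prefix rather than for all of 128.95/16.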
The issue of narrowing the congestion signature occurs when a
pushback request is propagated upstream by a router (thus becoming a
non-leaf in the tree), or when the pushback request is passed from
the output interface to an input interface at a router. The above
algorithm for narrowing the congestion signature works only for
congestion signatures with a destination address component in them.
It cannot be applied to other signatures, pure source-based ones, for
instance. We do not deal with the issue of narrowing non-
destination-based signatures in this document except noting that it
can be done given the right routing information at the upstream
router.
A router could receive requests from different downstream routers
with overlapping congestion signatures. Future work might address
the possibility of merging two different rate-limiting sessions in
this case.
2.2 Pushback REFRESH Messages
Pushback REFRESH messages are initiated by the congested router that
started the pushback, if it wants the pushback to continue. For
uninterrupted rate-limiting, these messages should be generated
before the rate-limiting expires at the upstream routers.
The REFRESH message has the same format as the REQUEST message, so
that if the upstream router has crashed in the meantime, the state can
be reestablished from it. The message type is set to REFRESH so that
a receiving router that still holds state can match the message
against the RLSID and initiating router address fields, and update
that state rather than setting it up from scratch.
REFRESH messages can change any field specified earlier in the
pushback REQUEST. On receiving the pushback REFRESH message the
upstream routers update the expiration time for the rate-limit
session and the limit imposed on the aggregate, and set the timer for
the STATUS message. Non-leaf routers in the pushback tree SHOULD
send REFRESH messages further upstream after dividing the rate limit
among upstream neighbors. If the aggregate specification has
changed, the router MUST check if the new aggregate needs to be
narrowed, using the process described above, before propagating the
pushback REFRESH.
2.3 Pushback CANCEL Messages
The pushback CANCEL message is sent upstream to stop rate-limiting
the aggregate. It SHOULD be propagated upstream by routers that have
propagated pushback requests (non-leaf routers in the pushback tree).
The CANCEL message has no fields beyond those present in the common
header.
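The soft-state handling of REQUEST, REFRESH, and CANCEL messages at an upstream router might look like the following Python sketch. The class SessionTable and its method names are hypothetical. Sessions are keyed by the initiating router's address and the RLSID, as described for the common header. Dividing the limit among upstream neighbors and narrowing the signature are deliberately left out.

```python
import time


class SessionTable:
    """Soft-state rate-limiting sessions keyed by (initiator, RLSID)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable clock, in seconds
        self._sessions = {}

    def on_request_or_refresh(self, initiator, rlsid,
                              limit_bytes_per_sec, expiration_ms):
        """Create or update a session; return True if it was newly created.

        A REFRESH for unknown state (for example after a crash) simply
        recreates the session, since it carries the same fields as a REQUEST.
        """
        key = (initiator, rlsid)
        created = key not in self._sessions
        self._sessions[key] = {
            "limit": limit_bytes_per_sec,
            "expires_at": self._clock() + expiration_ms / 1000.0,
        }
        return created

    def on_cancel(self, initiator, rlsid):
        """Remove a session; return True if it existed."""
        return self._sessions.pop((initiator, rlsid), None) is not None

    def expire(self):
        """Drop sessions whose expiration time has passed; return their keys."""
        now = self._clock()
        dead = [k for k, s in self._sessions.items() if s["expires_at"] <= now]
        for key in dead:
            del self._sessions[key]
        return dead

    def limit_for(self, initiator, rlsid):
        """Return the current limit for a session, or None if it is unknown."""
        session = self._sessions.get((initiator, rlsid))
        return session["limit"] if session else None
```

Passing the clock in as a parameter is a design choice that lets expiration be exercised without waiting in real time.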
3. Pushback STATUS Messages
Upstream routers that receive a pushback REQUEST send pushback STATUS
messages to the router from whom they got their REQUEST. The
additional fields in the STATUS message are:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Arrival Rate Estimate |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SRMode| Rsrvd | Height | NumElem |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| <Router ID> |
| <Router Info> |
| <Arrival Rate at Router> |
...............................................
................................................
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The arrival rate estimate is a single precision floating-point number
in IEEE format, as described in [SPG97]. It expresses the arrival
rate of the aggregate in bytes per second if there was no rate-
limiting upstream of the STATUS sender. The arrival rate for the
first STATUS message is computed over the interval since the receipt
of the pushback REQUEST. For the subsequent messages, it is computed
over the interval since the last STATUS message.
The SRMode field specifies the mode in which status is reported. It
is the same as that in the pushback REQUEST message. The supported
modes and their semantics are described in Section 2.
"Height" denotes the height of the sender in the pushback tree. It is
zero for leaf nodes, and one more than the maximum height among
children for non-leaf nodes. This field tells the receiver how far
pushback has propagated upstream of it.
For status modes that return a list of routers, NumElem gives the
number of listed routers. Then, for each router
* Router ID gives an address identifying the router
* Router Info gives information associated with the router.
Currently, this consists of:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S| Reserved | Depth in Tree |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
S is a bit that if set means that the entry is Stale, meaning that no
new STATUS message has been received from the router since the last
time the sending router sent a STATUS message.
Depth in Tree reflects the depth of the router in the pushback tree.
3.1 Aggregating STATUS Messages
A parent node uses the STATUS messages from its children to construct
the STATUS message it sends downstream. The merger technique used
depends on the mode of the STATUS message. In COMPACT mode, the
parent node adds the arrival rate estimates received from the
children and its estimate from upstream routers that were not sent
pushback messages. In other modes, the lists obtained from the
various children are merged, and appended with the parent node's own
estimate, as indicated in the previous section. The height is one
more than the maximum height received from the children.
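The merging rules can be sketched in Python as follows. The function names and the tuple layouts are hypothetical. The parameter room (the number of per-router entries that fit in a message) is an assumption, since the text does not say how available room is determined.

```python
import random

COMPACT, CLOSEST, FURTHEST, SAMPLE = 0, 1, 2, 3


def merge_compact(child_reports, local_rate):
    """Merge COMPACT reports.

    child_reports: list of (arrival_rate, height) tuples from children.
    local_rate: this router's estimate for upstream routers that were
    not sent pushback messages.
    Returns (total_arrival_rate, height).
    """
    total = local_rate + sum(rate for rate, _ in child_reports)
    # Leaves have height zero; otherwise one more than the tallest child.
    height = (1 + max(h for _, h in child_reports)) if child_reports else 0
    return total, height


def select_entries(entries, srmode, room, rng=random):
    """Choose which per-router entries to report when not all of them fit.

    entries: list of (router_id, depth, arrival_rate) tuples.
    """
    if len(entries) <= room:
        return list(entries)
    if srmode == CLOSEST:
        return sorted(entries, key=lambda e: e[1])[:room]
    if srmode == FURTHEST:
        return sorted(entries, key=lambda e: e[1], reverse=True)[:room]
    if srmode == SAMPLE:
        return rng.sample(list(entries), room)
    raise ValueError("mode does not carry a per-router list")
```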
3.2 Timing of STATUS Messages
The status frequency specified in the pushback REQUEST and REFRESH
messages is the rate at which the originating router would like to
receive status reports. Since the upstream routers are at different
distances from the root, the timer values they set have to be
different. In particular, routers further away from the root should
set smaller timer values because they get messages late and their
STATUS messages take time getting to the root node.
On receiving a REQUEST or REFRESH message, the routers set a timer to
send the STATUS message. The value of this timer is the status
frequency minus the _depth_ * _k_, where _depth_ is the router's
depth in the pushback tree, and _k_ is a constant that signifies the
maximum round trip time for a message over a pushback tree edge
(including message processing time). _k_ should be configured to
some comfortable upper bound like 100 ms (it is the same for all the
routers in the pushback tree). For satellite hops or other links
with round-trip times greater than the configured value _k_, the
consequences will simply be stale STATUS messages.
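The initial timer computation is simple enough to state directly. In this Python sketch the function name is hypothetical, and clamping the result at zero for very deep routers is an assumption; the text does not say what to do when _depth_ * _k_ exceeds the status frequency.

```python
def initial_status_timer_ms(status_frequency_ms, depth, k_ms=100):
    """Timer set on receipt of a REQUEST or REFRESH.

    status_frequency_ms: Status Frequency field from the message.
    depth: this router's depth in the pushback tree.
    k_ms: configured per-edge round-trip bound (100 ms is the example
    value given in the text).
    """
    # ASSUMPTION: never return a negative timer value.
    return max(0, status_frequency_ms - depth * k_ms)
```

For example, with a status frequency of 2000 ms and k = 100 ms, a router at depth 3 would set its first timer to 1700 ms.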
Setting timers in this fashion means that parents are likely to
obtain fresh STATUS messages from their children before their own
STATUS message timer goes off. This in turn means that fresh STATUS
messages are sent further downstream after aggregation. If a parent
router's timer fires before it has received a STATUS message from one
of its children, it MUST send its own STATUS message downstream using
the last value received from this child or its own estimate, and, if
including an individual rate report for this child, marking it with
S=1 to indicate it is Stale.
The status timer is set again immediately after sending the STATUS
messages downstream. The value of the timer is the same for all the
routers in this case, and is equal to the status frequency, since the
required offset has already been achieved. If a router receives a
REFRESH message before its status timer expires, new timers are set
as described above.
A small jitter can be applied to status timers so that the downstream
router receives STATUS messages from its children at different times.
In some cases, the original sender of the pushback REQUEST might want
some variation in the status timers to provide some degree of
protection against gaming adversaries that try to time their bursts
to avoid detection. This variation could be achieved by the original
sender by making changes to the Status Frequency specified in the
pushback REFRESH messages.
4. Authentication for Pushback Messages
Pushback messages require some form of authentication, even if the
pushback messages are between adjacent routers. However, this
document currently does not specify the form of authentication to be
used.
5. Messages between Routers and Local Agents
Some routers might send packet headers from a sample of the traffic
to an agent for outboard processing, and receive control messages
back from the agent about identified aggregates to be rate-limited.
The router and local agent will also exchange control messages, for
example, to control the sampling at the router. The formats for
these messages will probably be addressed in a separate document.
Because this is a purely local conversation between a router and an
attached local agent, it is not necessary that a router and its
attached local agent follow the protocol suggested in that document.
6. Messages exchanged with the NOC
In some cases the NOC (Network Operations Center) will want to have
final approval before an aggregate is rate-limited. Thus, one
category of pushback messages will be the messages exchanged with the
NOC. This draft currently does not specify these messages.
Conclusions
Acknowledgements
There is a list of people who can be either co-authors, or can be
acknowledged in this section. So far, this list includes the
following. The pushback authors: Ratul Mahajan, Steven M. Bellovin,
Sally Floyd, John Ioannidis, Vern Paxson, and Scott Shenker. From
Juniper: Kireeti Kompella. From Cisco: Barbara Fraser, David Meyer.
Other: Randy Bush.
References
[MB01] Ratul Mahajan, Steven M. Bellovin, Sally Floyd, John
Ioannidis, Vern Paxson, and Scott Shenker, Controlling High Bandwidth
Aggregates in the Network, February 2001. URL:
"http://www.aciri.org/pushback/".
[SPG97] S. Shenker, C. Partridge, R. Guerin. Specification of
Guaranteed Quality of Service. RFC 2212. September 1997.
Security Considerations
We will eventually address the potential DoS features and security
vulnerabilities of pushback in detail here.
IANA Considerations
AUTHORS' ADDRESSES
Sally Floyd
Phone: +1 510 666 2989
ACIRI
Email: floyd@aciri.org
URL: http://www.aciri.org/floyd/
Steve Bellovin
Phone: +1.973.360.8656
AT&T Labs - Research
Email: smb@research.att.com
John Ioannidis
Phone: +1.973.360.7012
AT&T Labs - Research
Email: ji@research.att.com
Kireeti Kompella
Juniper Networks
1994 N. Mathilda Ave
Sunnyvale, CA 94089
Email: kireeti@juniper.net
Ratul Mahajan
Phone: +1 206 616 1853
University of Washington
Email: ratul@cs.washington.edu
URL: http://www.cs.washington.edu/homes/ratul/
Vern Paxson
Phone: +1 510 666 2882
ACIRI
Email: vern@aciri.org
URL: http://www.aciri.org/vern/
This draft was created in July 2001.
It expires January 2002.