Internet DRAFT - draft-hummel-te-oct
draft-hummel-te-oct
TE-WG Working Group Heinrich Hummel
Internet Draft Siemens AG
Expiration Date: October 2000
April 2000
Orchestrally conducted Traffic ( OCT )
draft-hummel-te-oct-02.txt
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet- Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This draft proposes to engineer traffic such that we can speak of an
Orchestrally Conducted Traffic (OCT): Any traffic stream (given by
its source and destination nodes and by a priority class) may use
several differently routed LSPs. Each, traffic stream ingress
reports, periodically, the currently measured traffic load (in
bitrate) to a common TE Conductor (TEC) who evaluates them as to
forecast what traffic load change might occur in the immediate time
ahead. Accordingly the TEC computes well synchronized likelihood
values by which to take which route/LSP. These values will be such
that any traffic stream will be served as good as possible, as often
as possible, while equally ranked traffic streams are treated fair,
higher ranked streams are prioritized and the network thruput is
maximized. The TEC sends the likelihood values to the respective
ingress node, who will distribute the received packets of the
traffic stream to the pertaining LSPs accordingly.
Hummel April 2000 [Page 1]
Orchestrally conducted Traffic (OCT) Exp. Oct. 2000
1. Introduction and summary
This draft proposes to engineer traffic such that we can speak of an
Orchestrally Conducted Traffic (OCT): Any traffic stream (given by
its source and destination nodes and by a priority class) may use
several differently routed Label Switched Pathes (LSPs). Each,
traffic stream ingress reports, periodically, the currently measured
traffic load (=bitrate value) to a common Traffic Engineering
Conductor (TEC) who evaluates them as to forecast the immediately
upcoming traffic load. Accordingly the TEC computes well
synchronized likelihood values by which to take which route/LSP.
These values will be such that any traffic stream will be served as
good as possible, as often as possible, while equally ranked traffic
streams are treated fair, higher ranked streams are prioritized and
the network thruput is maximized. The TEC sends the likelihood values
to the respective ingress node, who will distribute the received
packets of the traffic stream to the pertaining LSPs accordingly.
2. OCT concept in detail
2.1 Goals and Basics
The OCT concept is based on a periodic information exchange between
the ingress nodes of all traffic streams and a central Traffic
Engineering Conductor (TEC). For now, it is assumed that
communication channels are established (i.e. special LSPs) to cater
this information exchange.
It is also assumed that one or several differently routed LSPs per
traffic stream are established for carrying the user data. The OCT
concept solves the question "by which likelihood should which LSP
been taken so that the entire traffic is balanced best onto the given
network ?"
Best balancing includes many aspects, some of which have already been
mentioned: Higher ranked traffic streams shall be prioritized versus
lower ranked traffic streams. Equally ranked traffic streams shall be
serviced equally fair. The thruput should be maximized.
But implicitly there are even more properties of "best balancing":
avoiding congestions as best as possible, resolving a congestion
without creating a new congestion at just another location.
Maximizing the thruput: It means: Among equally ranked traffic
streams, give preference nevertheless to that portion of a traffic
stream that can be serviced by a shorter LSP (using less hops).
Eventually such a preferenced handling is beneficial to all traffic
streams (e.g. so that all traffic streams can be completely serviced
rather than some are more, some are less deteriorated). Eventually
however the traffic streams that use shorter LSPs are indeed better
Hummel April 2000 [Page 2]
Orchestrally conducted Traffic (OCT) Exp. Oct. 2000
serviced than the others. Conclusion: Maximizing the thruput is more
important than equal treatment.
Let us partition the whole traffic into traffic streams T(i,e,p),
each of which is identified by an information-triplet {i-ngress node,
e-gress node, p-riority class}. By other words, a traffic stream is
the sum of all IP-Micro flows which enter the MPLS network domain at
the same node (= ingress node), which will leave the MPLS network
domain at the same node (=egress node) and which share the same
priority class, based on their DiffServ parameters. For each traffic
stream T(i,e,p) there shall be n(i,e,p) pre-established label
switched pathes, i.e. LSP-k; k=1,...,n(i,e,p), for transmitting its
packets. LSP-1 shall be the most preferrable one, LSP-n(i,e,p) the
least preferrable one.
An ingress node i knows all traffic streams T(i,e,p) it originates.
For each of them it knows all respective n(i,e,p) LSPs, incl. all
their link sequences, and measures, continuously, the respective
current total traffic load as well as the individual current traffic
load share per LSP. Periodically the ingress node i reports the
total traffic load (in bitrate) as well as the LSP-specific traffic
load shares to the TEC -with respect to all traffic streams it
originates.
Besides these reports, the TEC shall know the network topology incl.
the maximum bandwidth BWj of each network link j. Furthermore it
shall know the link sequences of all LSPs for all traffic streams in
the network.
The sending of the reports to the TEC by all ingress nodes may be
slightly jittered, but by neglecting the jitter, the TEC shall
receive all reports with respect to the same time of day. It is
advisable that any report also contain either the exact time of day
it has been sent or that time of day the report is meant for.
2.2 Forecasting traffic load
The TEC may apply different mathematical/heuristical methods,
algorithms, concepts, theories as to compute / forecast the total
traffic load TTL (i,e,p) for each traffic stream T(i,e,p).
For example, two methods are shown by the following:
Method 1:
Make a linear extrapolation based on the youngest traffic load value
at time t-1 ("t minus one") in the past and the present traffic load
value at time t0 ("t zero"), as to determine the future traffic load
at time t1 ("t one"). If total traffic load TTL(i,e,p)(t1) exceeds
Hummel April 2000 [Page 3]
Orchestrally conducted Traffic (OCT) Exp. Oct. 2000
the highest traffic load ever reported for traffic stream T(i,e,p)
then replace the extrapolated value by the highest total traffic load
ever reported for T(i,e,p).
If total traffic load TTL (i,e,p) (t1) turns out negative then
replace this value by the value 0 + epsilon.
Method 2 (very promising, if upstream from ingress node i* there is a
webpage of general interest):
Build the average traffic load AVTL (i*,e,p) (t0) out of all
currently measured traffic loads TTL (i*,e,p) (t0) and forecast for
all traffic streams TTL(i*,e,p) (t1) = AVTL(i*,e,p) (t0).
Certainly there will be more methods as to forecast any total
traffic load for time t1.
Learning from the success:
Note that one report-timer-interval later the prognosticated traffic
load can be verified, how well it turned out. We can make statistics
which method is more often more successful than the other methods. We
may apply the more successful method more often ! So we may learn
from the success.
2.3 Prosaic description how to compute likelihood values
Imagine the TEC as a person sitting in front of a casino table. On
the table however, the topology graph of his network is drawn. On
each ingress node i the TEC has placed, with respect to traffic
stream T(i,e,p), a pile of uniformly sized (traffic load) chips: The
height of the pile amounts to the absolut value of the forecasted
total traffic load TTL (i,e,p) (t1). Let's call it total traffic
load-pile. (Eventually, at the bottom of any such pile, there may be
a chip whose size is smaller than the uniformly sized chips.)
On any link j of the network there is also a pile of (bandwidth)
chips whose height amounts to the maximum bandwidth BWj of that link.
Let's call it link-j-bandwidth pile. (Eventually, at the bottom of
any link-j-bandwidth pile, there may be a chip whose size is smaller
than the uniformly sized chips.)
The goal is to distribute the chips of each total traffic load-pile
by piling up respective LSP-specific traffic load piles. Eventually,
a remainder of the total traffic load-pile cannot be distributed in a
way which is shown next. It will reflect that proportion of the
Hummel April 2000 [Page 4]
Orchestrally conducted Traffic (OCT) Exp. Oct. 2000
traffic load which cannot be transmitted.
The TEC processes one chip after the other, each time,at least in
general, from a different total traffic load-pile:
The TEC tries to move the chip to the k-th LSP's pile;
(k=1,2,...,n(i,e,p) ). At the beginning k=1.
Whenever a traffic load chip is moved to the k-th LSP-specific pile,
a (same sized) bandwidth chip is removed from each link-j-bandwidth
pile of all those links j which this k-th LSP traverses. If this is
not feasable because at least one of these links lacks such a
bandwidth chip, the same is tried with the (k+1)st LSP. If it is not
feasable with any of the n(i,e,p) LSPs then there will be a remainder
total traffic load pile whose (normed) height will indicate the
likelihood by which a packet from traffic stream T(i,e,p) shall not
be transmitted.
The sequence by which the traffic load chips of all total traffic
load -piles are handled as described above is very important as to:
achieve fairness among equally ranked traffic streams,
priortize/discriminate higher/lower ranked traffic streams, and get
best thruput:
Higher ranked traffic streams will be preferred versus the lower
ranked traffic streams by processing ALL their traffic load chips
first.
Fairness among equally ranked traffic streams will be achieved by
processing their traffic load chips alternatingly. However, make also
the following exception: Among equally ranked traffic streams process
at first those traffic load chips which can be moved to "shorter"
LSPs, i.e. to LSPs with fewer hops.
When all the traffic load chips have been processed, the heights of
the resulting LSP-specific piles and of the remainder pile need to
be normed i.e. divided by the sum of all these heights as to get
likelihood values.
The TEC shall send back to any ingress node i these likelihood values
with respect to all traffic streams T(i,e,p) which are originated at
that node i.
2.4 Distributing the traffic load onto the LSPs
An ingresss node must distribute the received traffic load onto the
respective LSPs according to the likelihood values. Ingress node i
whichs receives a new IP packet may identify the traffic stream to
Hummel April 2000 [Page 5]
Orchestrally conducted Traffic (OCT) Exp. Oct. 2000
which this packet belongs. I.e. it may identify its egress node
based on its IP destination address and may identify its priority
class based on its Diffserv parameters.
(a):
In accordance with the actual likelihood values it will select an
appropriate LSP for transmitting this packet, respectively will
dishonor its forwarding request: The ingress node may build a hash
with respect to the IP source addresses of the IP packets of
T(i,e,p). The simplest hash is such that it intersects the entire IP
UNICAST address space into n(i,e,p) +1 intervalls. The sizes of
these intervalls shall be proportional to the likelihood values. The
IP source address of a received IP packet of a particular traffic
stream will be examined as to determine from which interval it is
part of. If it is from the k-th interval, then it is forwarded along
the k-th LSP; k=1,...,n(i,e,p). If it is from the n(i,e,p)+1 st
intervall, then it is not forwarded at all.
This very simple hash mechanism suits best if the ingress node is
fairly in the "center of the internet", i.e. receives IP packets
containing any existing IP UNICAST source address.
(b):
An ingress node i (e.g. of an access network) may know, by means of
network management configuration or by learning while operating, the
appropriate candidate IP source address range which may be
intersected as to form the right hash-intervals.
(c):
If an ingress node receives IP packets from a fairly limited number
of sources, e.g. given by less than 100 different IP source
addresses, it may be feasable to queue up information pertaining to
all currently active IP micro flows: A queue element may consist of
an IP source address and the LSP to be selected. If the first packet
of a new IP micro flow is received, i.e. if no queued information can
be found that matches the received packet's IP source address then a
new queue element must be allocated whose LSP information may be
determined by this: Generate a random number which hits/selects any
column of a statistical stage function which is built from the
likelihood values. The k-th column will select the k-th LSP. It
shall be used and also entered in the new queue element.
Any succeeding packet of a particular IP micro flow may be sent along
the same LSP. Advantage: all packets of the same IP micro flow will
arrive in proper sequence at the egress node. A mechanism needs to be
added to determine the end of the lifetime of an IP micro flow,
i.e. the point in time when a queue element is deleted: Maintain an
information in the queue element which is refreshed/incremented by
the transmission request of each packet of the IP micro flow. Employ
Hummel April 2000 [Page 6]
Orchestrally conducted Traffic (OCT) Exp. Oct. 2000
a periodic timer. Whenever it expires check all queue elements for
all traffic streams originated in this ingress node whether they are
aged out. If so, delete them.
3. Further Aspects
3.1 Other traffic load input:
The TEC may also receive information about SCHEDULED traffic from
other senders than the ingress nodes. Such senders may be a policy
agent or a Network Management station/agent.
3.2 Adding/Removing another LSP per traffic stream
3.2.1 Adding an LSP:
In the preceding it was assumed that there are some LSPs per traffic
stream, without saying when and why an(other) LSP may be added or
removed. An obvious reason for establishing another LSP is given,
when the TEC determines a considerable likelihood for NOT forwarding
some packets of the traffic stream along any of the existing LSPs.
Indeed, the TEC may compute the route for another LSP while taking
into account the currently available bandwidth in the network due to
the present traffic loads and may send to the right ingress node
the computed route for the new LSP - as well as the likelihood by
which to use it.
3.2.1 Removing a LSP:
Over a longer period of time, an ingress node just like the TEC may
notice that an LSP is hardly used. It may be aborted. TEC and
ingress node must inform each other.
3.3 Delaying instead of dishonoring packet transmission
Despite of best traffic balancing and even despite of 3.2 there
may be IP packets whose transmission request might be dishonored.
Alternatively, their transmission may only be delayed. Further study
is required how to do it and what are the consequences with respect
to the entire "traffic balancing result".
3.4 On-demand LSPs:
Another variation would be to establish LSPs just on demand. The
ingress node may have for a traffic stream, instead of a set of
existing LSPs, a set of routes to choose from. The described
procedure as to determine the right LSP for transmitting an IP packet
may be applied analogously however as to determine the explicit route
to use when the on-demand LSP is to be established.
Hummel April 2000 [Page 7]
Orchestrally conducted Traffic (OCT) Exp. Oct. 2000
3.5 Long term results
Over a longer period of time, while and though the total traffic is
orchestrally conducted, the TEC may find out which links are
permanently under-utilized resp. over-utilized and may conclude which
links should be stocked up by which amount of bandwidth respectively
may conclude which links may be removed or used differently by
becoming re-configured.
Determining to which extent a link is permanently under-utilized:
Simplest method: determine the smallest available bandwidth ever of
that link.
Determining to which extent the bandwidth of a link should be stocked
up: After the TEC has computed and sent out the likelihood values
for all traffic streams, it may compute likelihood values for a
fictitious "should be"-network by the following: In one additional
cycle the TEC may process each remainder TTL (i,e,p)-pile like
being one big chip and by moving this "big chip" onto its most
preferred LSP-1. Hereby substract from each affected link-j-bandwidth
pile the size of this "big chip" and do not mind if the height of
any link-j-bandwidth pile becomes a negative number. Indeed,after
this cycle any resulting negative link-j-bandwidth pile, indicates
the absolute amount of bandwidth by which link j should be stocked
up.
Of course, average values should be built, over a longer period of
time, as to determine the permanent over/under-utilization of any
link in the network.
3.6 Equally ranked LSPs
So far it has been assumed that per traffic stream T(i,e,p) there are
n(i,e,p) LSPs and that they are in ranking order: LSP-1 > LSP-2
>....> LSP-n(i,e,p) . LSP-1 ist the most favorable LSP, LSP-n(i,e,p)
is the least favorable LSP e.g. in the sense that the more favorable
LSP consists of fewer hops than the less favorable LSP. However among
the n(i,e,p) LSPs there may be subsets of LSPs which are equally
favorable and should be treated as equally ranked.
Accordingly, when another traffic load chip shall be distributed to
any LSP-specific traffic load pile, we should at first try any LSP
from the best ranked subset of LSPs. That LSP should be selected such
that an even distribution among the LSPs of this subset is warrented
(which is accomplished by proper alternation).
Hummel April 2000 [Page 8]
Orchestrally conducted Traffic (OCT) Exp. Oct. 2000
Proceed analogously with the next ranked subset of LSPs, in case no
LSP of the better ranked subset can accommodate this chip.
3.7 Central TEC versus de-centralized solutions
This OCT concept operates with a central TEC. There may also be other
alternative concepts which operate in a de-centralized fashion. In
case the TE-WG is interested in standardizing a protocol according to
this OCT concept, I would like to point out one important advantage
of the centralized solution, i.e. one associated standardization
objective:
The ingress nodes have to be provided with some basic initial
software - just like the TEC. Later on the TEC may get smarter and
smarter by getting newer software versions. At the same time however
the ingress nodes can keep their old initial software.
3.8. Standardization - required / not required
It is possible to install the OCT concept in a completely proprietary
fashon, i.e. without standardizing any new messages , TLVs ,
codepoints or well-known TCP port number.
In a simplest implementation special p2p LSPs are established from
each ingress node to the TEC and vice versa for exchanging all
relevant information in proprietary syntax. Alternatively the TEC may
send out the likelihood values down a spanning tree, which is
provided by a tree-structured source routing information called TREE
ROUTE TLV, see [3], whereby each node i* will become aware which part
of the received info is to forward along which child link and which
part is just targetted to itself - which are all likelihood values
for all traffic streams T(i*,e,p).
It is up to the TE-WG (and maybe others) whether or not to
standardize all required protocols and formats with respect to the
communication between any ingress node and the TEC, as well as
between any Network Management and/or Policy Agent and the TEC.
4. Intellectual Property Considerations
Siemens AG may seek patent or other intellectual property protection
for some or all of the technologies disclosed in this document. If
any standards arising from this document are or become protected by
one or more patents assigned to Siemens AG. Siemens AG intend to
disclose those patents and license them under openly specified and
non-discriminatory terms.
5. References
Hummel April 2000 [Page 9]
Orchestrally conducted Traffic (OCT) Exp. Oct. 2000
[1] "Traffic Engineering Concept",
draft-awduche-mpls-traffic-eng-00.txt, Awduche
[2] "Provider Architecture for Differentiated Services
and Traffic Engineering (PASTE)", 1/1998, draft-li-paste-00.txt,
Li, Rekhter
[3] "Explicit Tree Routing",draft-hummel-mpls-tree-route-01.txt,
Hummel, Loke
6. Author's Address
Heinrich Hummel
Siemens AG
Hofmannstrasse 51
81379 Munich, Germany
Tel: +49 89 722 32057
Email: heinrich.hummel@icn.siemens.de
Full Copyright Statement
"Copyright (C) The Internet Society (March 2000). All Rights
Reserved. This document and translations of it may be copied and
furnished to others, and derivative works that comment on or
otherwise explain it or assist in its implmentation may be prepared,
copied, published and distributed, in whole or in part, without
restriction of any kind, provided that the above copyright notice and
this paragraph are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such as
by removing the copyright notice or references to the Internet
Society or other Internet organizations, except as needed for the
purpose of developing Internet standards in which case the procedures
for copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
Hummel April 2000 [Page 10]