Internet DRAFT - draft-crowcroft-mat
draft-crowcroft-mat
Internet Engineering Task Force Jon Crowcroft
INTERNET DRAFT
draft-crowcroft-mat-00.txt
November, 2001
Expires: April, 2002
Multicast Address Translation
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet- Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
Network Address Translation is a technique for reducing the
complexity of address re-assignment. It can also lead to a reduction
in the routing state in the core.
This draft is about using the technique for the same reasons for
multicast. The approach is complementary to IP-in-IP tunnels for
multicast, but avoids the overhead of encapsulation. It has the same
state setup requirements at the edges, as does NAT.
For multicast sessions, there is often an aggregate known as the
"cross section" of traffic which is of sufficient common interest
that it is carried across the core. In this case, MAT can be used to
reduce the multicast olist routing state in the access routers, and
the full (S,G) state in the core.
Some promising candidate algorithms for this are described in the
paper [NGC2001]. As well as extracting the tree from the routers and
computing the overlaps and so on, clearly, a protocol to establish
the state for mapping would be required in exactly the same way that
a tunnel setup protocol is needed. This is for further work.
This technique is not necessary for Source Specific Multicast
(although it could be used). The interaction with IGMPv3 is to be
studied.
1. Introduction
In this document we propose the possible use of address translation
as a means to aggregate state for multicast groups that have
overlapping coverage in the core.
This type of thing has been proposed before but typically by use of
tunnels [Infocom 1998], or implicitly by examination of the
possibility of state aggregation merely within a router [Infocom
2000]. Both of these approaches have drawbacks.
The first has the problem with the additional header overhead and
subsequent problems for MTU discovery (or configuration). The second
is difficult since many service providers wish to account for traffic
on a per (S,G, iif/oif) basis, and the aggregation technique
potentially masks this. We do not intend to solve this latter problem
(if providers want fine grain accounting they have to provision
routers for fine grain state). We are proposing something that might
avoid the former problem.
Recently, aggregation of state amongst multiple trees was examined
for realistic group numbers, membership distribution and network
topologies [NGC 2001]. In that work, no specific mechanism other than
encapsulation was proposed. As above, this has drawbacks of the
additional header overhead.
However, there is no particular reason why one could not use address
translation to perform the aggregation, instead of tunnels.
If two trees are congruent in a region, then we can translate all the
destination addresses, Gi, into one G (e.g. G0). Usually, sources are
distinct for distinct groups. In the case where a source is sending
to more than group for which address translation is being performed
in a region, we propose translating the source address(es) for the
later created (or detected) group(s) at the ingress, and then re-
translating at the egress point from the core. (Si to Si' mapping).
A translation management protocol would piggy back on the routing
protocol, and would contain for the various base Group (G0), the list
of Gi, and the Si to Si' mappings.
To preserve the RPF safety feature of multicast trees, we suggest
that the mapping is from Si to an Si', drawn from an equal or more
specific prefix that matches.
Where G' has near coverage, but not perfect congruence, add links to
egress routers where there are receivers for Gi, but not G0, where
additional load is small, and add reverse re-translators back to Si
for traffic where there are receivers for G0 but not Gi so that
normal filtering removes the traffic (or add a filter - same h/w and
s/w support often involved in many router operating systems anyhow).
The rules for "near" coverage could be the same as those used to
configure the RP to source tree switch in PIM-SM (or similar traffic
threshold derived rules).
An alternative approach to traffic based tree merging, would be to
use actual topological similarity. Two approaches suggest themselves
1. Code the tree topologies as a data structure such as a depth then
breadth, and do a pure length comparison: if the structures map to a
certain threshold point, say they are "equivalent" the threshold to
be set by managers at each node in the tree.
2. Use information from neighbouring routers in the tree- e.g. we can
split neighbours into "same versus different AS", "same versus
different level of OSPF or ISIS", "same versus different prefix",
"same prefix up to some point". A Bloom filter could be applied with
the "leakiness" deliberately set high (high is a TBD parameter!),
which can then allow for approximate matching.
2. MAT and NAT
One view of of address translation [RFC2663] is that it provides a
valuable way to conserve addresses. Another view might be that it
reduces the globally visible routing state since it means that only
active hosts at sites use globally visible prefixes, and thus require
forwarding entries. This is, of course, not free - it incurs both the
translation state at the edges, and the protocol messages (e.g. via
an ALG) to setup and maintain the translation state. In the case of
Real Specific IP, the idea is extended to multiple consenting peer
realms, rather than between a single global visible address space and
any of a set of private address spaces. There is potentially an
analogy between Real Specific IP and the use of MATs for
Administratively Scoped multicast addresses, which should be explored
in the interests of address conservation, as well as the main goal of
core router multicast forwarding state reduction.
An important aspect of NATs that has to be understood for Multicast
Address Translation is the relationship between packet flows and
session flows. In unicast applications to date, most the common cases
are client-server, and the TU approach can be used to model how a
session initiation relates to the subsequent packet flow, and
therefore can be used to trigger (or figure out where to add
explicitly) the NAT state instantiation.
Multicast is somewhat different. We need to look at both the receiver
oriented nature of multicast (i.e. IGMP, and the prune/graft messages
in PIM), and the data driven aspects of some tree building (PIM SM
and DVMRP and PIM DM flood and prune).
Finally, higher level protocols that carry IP addresses are also
important. In fact, NATs must be aware of them (or else near by
firewalls and application layer gateways must coordinate with NATs to
translate the application layer messages to, subject to all the
security problems of being a "man in the middle", albeit with
"permission").
Note that there are already many examples of problems caused by the
fact that multicast applications do not rely on the 5-tuple that
unicast applications employ, (often because of the liberal use of
multiple groups, and also because of other levels of multiplexing).
Hence RTP uses CSRC and SSRC fields to indicate the contributer,
while some reliable multicast protocols use their own payload
multiplex. This means that a simple port-MAP is not simple, or indeed
effective (or even feasible in some cases as already discovered for
port NATs with unicast RTP and RTCP on two ports!).
In multicast, this means that we must understand SIP and SAP
messages, as well as potentially RTSP and several others. The problem
here is that we do not currently have the luxury of using the FQDN as
a stable namespace to hang the translations from, as NATs do. MATs
have to contend with a variety of session end point naming schemes.
Last but not least, in inter-domain multicast, current practice is to
use MSDP, which entails source advertisement. However, we envisage a
set of MAT servers which coordinate around the edge of a core domain,
and can typically be co-located with the MSDP servers in any case, so
that the appropriate translations to source advertisements can be
easily achieved.
3. Deployment
A key problem with this is deployment. However, we have NAT capable
routers in many places. I see this as a minor mod to a family of
protocols.
One interesting possibility to consider is that one could multicast
the mappings. However, note that this would have to be done reliably
(e.g. using PGM or SRM or similar). The same idea has been discussed
in terms of scaling BGP (instead of a mesh of TCP connections, use a
reliable multicast protocol) and also for link state information
flooding.
If we run out of Si's from the same prefix space, fall back to no
translation, or else to tunnels. This, obviously should be
configurable, as should the thresholds used to set the MAT
aggregation.
4. Discussion
This document is a stake in the turf concerning the use of address
translation for multicast.
A number of obvious questions need clarifying before the work can be
continued (or discontinued).
0. Failure modes (the system should fail safe through appropriate use
of soft state and timers).
1. Protocols that mention multicast addresses. We've mentioned some.
We are sure there are many others!
2. Feedback - many protocols use feedback, even multicast protocols.
For example, reliable multicast transport protocols such as PGM, RLC,
and RMTP use feedback messages. Layered coded multicast protocols
("multi-rate congestion control") may have very interesting
interactions with the proposed scheme here. Most notably, the
aggregation here may make the reverse path trees different again.
However, many multicast applications and transport protocols already
have to deal with asymmetry (e.g. inter-domain unicast routes through
BGP is asymmetric, and non-bi-dir PIM creates asymmetric routes).
Note some systems have several levels of feedback (e.g. RTP, RTCP,
where ports differ). If a port MAT was required, this would need
further consideration.
3. Unicast - One can imagine a case of ultimate aggregation ,where a
unicast translation is done. This might enable, for example,
multicast islands to communicate via unicast only cores. Some
researchers have commented that really fast (partially optical)
future IP routers may be hard to build if they have to support packet
replication for multicast fan out. Its possible that this scheme
could provide a workaround for that.
Note also that one could use a MAT as an alternative to tunnels for
edge access from dial-up or other edge networks (e.g. non
IGMP/multicast capable DSLAM or Cable Modem access nets), into a
multicast capable core. Given these are the same places NATs get
deployed, this could easily leverage that.
4. SSM- Source Specific multicast propagates IGMPv3 information. The
interaction between that and this needs close examination.
5. The idea of Realm Specific Multicast IP (administrative scoped
addresses and MATs) needs further exploration.
etc
5. Acknowledgements
Thanks are due to Colin Perkins for a discussion at NGC in November
in London, England.
6. References
[RFC2663] IP Network Address Translator (NAT) Terminology and
Considerations, P. Srisuresh, M. Holdrege, Aug 1999.
[RFC3102] Realm Specific IP: Framework, M. Borella, J. Lo, D.
Grabelsky, G. Montenegro, October 2001.
[RFC3022] Traditional IP Network Address Translator (Traditional
NAT), P. Srisuresh, K. Egevang, Jan 2001.
[Infocom 1998] Forwarding State Reduction for sparse mode multicast
communications, J. Tian, G Neufeld, in Proc of IEEE Infocom 1998,
March 1998.
[Infocom 2000] On the aggregatability of multicast forwarding state,
D. Thaler, M. Handley in Proc of IEEE Infocom 2000, March 2000
[NGC 2001] Aggregated Multicast with Inter-Group Tree Sharing, Aiguo
Fei, Junhong Cei, Mario Gerla, Michalis Faloutsos in Proc NGC 2001,
November, 2001, available online via
http://www.cs.ucla.edu/NRL/hpi/papers.html
7. Security Considerations
The E2E mantra is violated badly.
Black holing attacks are very possible
DDOS on the MATs is clearly quite a possibility.
As usual, all the caveats of intermediate devices that require some
information about higher levels apply.
8. IANA Considerations
There are no IANA considerations regarding this document yet. If
this note sees any subsequent work, then I would expect at least one
protocol to emerge, in which case code points will be needed.
AUTHORS' ADDRESSES
Jon Crowcroft
Marconi Professor of Communications Systems
University of Cambridge
Computer Laboratory
William Gates Building
J J Thomson Avenue
Cambridge
CB3 0FD
UK
E-Mail: Jon.Crowcroft@cl.cam.ac.uk
Tel: +44 (0)1223 763633
Fax: +44 (0)1223 334 678
This draft was created in November 2001.
It expires April 2002.