Internet Engineering Task Force                A. Ballardie
INTERNET DRAFT                                 Consultant
                                                 C. Fletcher
                                               London Internet Exchange

                                               26 January 2002



                Multicast Router-Switch Protocol (MRSP)
                     <draft-ballardie-mrsp-01.txt>

Status of this memo

     This document is an Internet-Draft and is in full conformance
     with all provisions of Section 10 of RFC2026.

     Internet-Drafts are working documents of the Internet Engineering
     Task Force (IETF), its areas, and its working groups.  Note that
     other groups may also distribute working documents as
     Internet-Drafts.

     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other
     documents at any time.  It is inappropriate to use Internet-
     Drafts as reference material or to cite them other than as
     "work in progress."

     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/ietf/1id-abstracts.txt

     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html


Copyright Notice

   Copyright (C) The Internet Society (2001). All Rights Reserved.



Abstract

   MRSP is a layer 2 protocol that runs over Ethernet. Together with
   minor enhancements to PIM-SM [PIM] and PIM-SSM [SSM] it can be used
   to build uni-directional multicast forwarding state in Ethernet
   Switches that interconnect IP multicast routers. MRSP's forwarding
   state a) restricts (S,G) multicast traffic to only those routers that



Ballardie/Fletcher                                              [Page 1]







INTERNET-DRAFT             Expires: July 2002              January 2002


   request it, and b) allows multiple ingress points for (S,G) multicast
   traffic by building multiple distribution trees per (S,G). This
   allows network operators to restrict multicast traffic to a subset of
   routers on the switched Ethernet network, and provides the capability
   to implement AS based multicast routing policies.  These features are
   particularly relevant and important to switch based multicast
   Internet Exchange points.

   Note: worked examples accompanying this draft are available at
   www.linx.net/chris/mrsp/

1. Introduction

   IGMP "snooping" Switches [SNOOP], layer 2 protocols such as GMRP
   [IEEE], and some proprietary protocols (e.g. Cisco's CGMP) constrain
   IP multicast traffic flow from Switches to group member hosts, but no
   IETF protocol exists to restrict IP multicast traffic between
   switches or from switches to routers.  This draft addresses these
   issues, and the issue of inter-domain multicast policy routing.

   This draft builds on the Multicast-Friendly Internet Exchange
   architecture draft [MIX] published in 1999. That draft was motivated
   by the observation that the multicast-enabled part of the Internet,
   the MBONE [MBONE], which comprised one flat, virtual, and largely
   tunnelled routing domain, had already outgrown itself from a
   manageability point of view. There was a need to document a set of
   conventions to transition the MBONE to a network architecture
   similar to the unicast Internet architecture. This architecture
   comprises multicast capable Autonomous Systems (ASs) interconnecting
   at Internet Exchange points.

   It is IETF recommended practice that multicast routing between ASs is
   source based, explicit join (e.g. PIM-SM [PIM]); flood & prune
   multicast routing is not recommended between ASs. MRSP ensures that
   multicast traffic flow across the Exchange (layer 2) switched
   infrastructure reflects multicast routing policy. Multicast routing
   policy expression and enforcement are currently limited in the
   Switched Internet Exchange environment for reasons explained in
   section 2.1.

   To these ends, we are revisiting the issues of running IP multicast
   over Internet Exchange points.



2. Review of MIX

   The goal of MIX [MIX] was to define a set of protocols and operating
   procedures to enable native, scalable, policy based multicast routing
   and traffic forwarding over an Internet Exchange point, without
   imposing any constraints on intra-domain multicast. MIX recommended
   FDDI or ATM PVCs rather than switches as the layer 2 medium due to "a
   number of unresolved issues" with switches at that time.  Layer 2
   Switching technology has since advanced considerably, particularly in
   respect of its multicast "awareness" and multi-layer capabilities,
   and there appears to be a growing trend towards deploying high speed
   Switching technology at Internet Exchange points. The MRSP protocol
   we propose is fully dependent on the use of Ethernet based Switches
   at the Exchange points.

   In support of its objectives, MIX proposed that Exchange members use
   M-BGP [BGP-4+] in support of policy based multicast paths
   (reachability), PIM-SM [PIM] as the multicast routing protocol to use
   across the Exchange, and MSDP [MSDP] for announcing active sources
   between domains.



2.1 Limitations of the Current (MIX) Architecture

   All routers attached to an Exchange point switch typically belong to
   the same (Virtual) LAN / subnet. With multiple routers reachable via
   a single interface, an interface based RPF check cannot guarantee
   that a multicast packet was forwarded by the RPF M-BGP peer for the
   packet's source, and therefore multicast routing policy cannot be
   enforced.  Consequently, Exchanges implicitly impose fully meshed
   M-BGP peerings between multicast Exchange members.

   This problem is compounded by the use of PIM's ASSERT mechanism
   across the Exchange. On a multi-access LAN, the ASSERT mechanism
   elects a single forwarder per source S when multiple routers would
   otherwise forward traffic from S onto the LAN.  Consequently, PIM is
   dictating multicast policy when M-BGP was designed for that purpose.
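
   The RPF limitation can be illustrated with a minimal sketch. It is
   not part of any protocol; the router names, the interface name and
   the source address are hypothetical, and the "neighbour" check is
   shown only for contrast with the interface based check.

```python
# Hypothetical Exchange LAN: three member routers share one VLAN, so
# this router reaches all of them through the same interface.
NEIGHBOUR_IFACE = {"rtr-A": "eth0", "rtr-B": "eth0", "rtr-C": "eth0"}

# Hypothetical M-BGP result: the RPF peer for source S is rtr-A.
RPF_PEER = {"192.0.2.10": "rtr-A"}


def interface_rpf_ok(source: str, arrival_iface: str) -> bool:
    """Interface based RPF: compare only the arrival interface."""
    return NEIGHBOUR_IFACE[RPF_PEER[source]] == arrival_iface


def neighbour_rpf_ok(source: str, sender: str) -> bool:
    """Stricter check: the packet must come from the RPF peer itself."""
    return RPF_PEER[source] == sender


# A packet from S forwarded by rtr-B (not the RPF peer) arrives on eth0.
print(interface_rpf_ok("192.0.2.10", "eth0"))   # True: policy bypassed
print(neighbour_rpf_ok("192.0.2.10", "rtr-B"))  # False
```

   Because every member is reached via the same interface, the
   interface based check accepts traffic from any of them.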



3. MRSP


   On a multi-access LAN the issues of multicast traffic containment and
   implementation of multicast routing policy can only be addressed
   if the layer 2 infrastructure is capable of supporting multiple layer
   2 multicast distribution trees per source. MRSP is a layer 2
   (Ethernet) protocol designed for building and maintaining layer 2
   multicast distribution trees.




3.1 MRSP Requirements

   MRSP must be administratively enabled on switch ports and multicast
   router interfaces.  In an infrastructure containing multiple
   switches, every switch must enable MRSP on all of its router-facing
   ports, and every router must enable MRSP on all of its switch-facing
   interfaces.

   Enabling MRSP on one or more switch ports changes the default
   multicast forwarding behaviour on those ports to: filter all
   received multicast traffic unless MRSP forwarding state exists for
   the particular (S,G) combination.

   Router interfaces with MRSP enabled are required to run a layer 3
   explicit join multicast routing protocol supporting (S,G)
   joins/prunes (e.g. PIM-SM, PIM-SSM).

   MRSP implementations should not restrict multicast traffic for link
   local groups (224.0.0.0/24). MRSP implementations should be
   configurable so as NOT to restrict other groups or group ranges if so
   desired.
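
   A minimal sketch of the exemption rule follows. The function name
   and the "unrestricted" parameter are illustrative assumptions; the
   draft does not define a configuration interface.

```python
import ipaddress

# Link local groups are never restricted by MRSP.
LINK_LOCAL = ipaddress.ip_network("224.0.0.0/24")


def is_restricted(group: str, unrestricted=()) -> bool:
    """Return True if MRSP filtering applies to this group address.

    `unrestricted` is a hypothetical, operator-configured list of
    additional group ranges that MRSP should leave alone.
    """
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    exempt = [LINK_LOCAL, *map(ipaddress.ip_network, unrestricted)]
    return not any(addr in net for net in exempt)


print(is_restricted("224.0.0.13"))                   # False
print(is_restricted("232.1.1.1"))                    # True
print(is_restricted("239.1.2.3", ["239.0.0.0/8"]))   # False
```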

   MRSP is a layer 2 protocol encapsulated by Ethernet, and will require
   the assignment of a new Ethertype.



3.2 MRSP Functional Overview

   MRSP is designed to build uni-directional layer 2 multicast
   distribution trees.  Within the scope of a switched Ethernet network,
   these trees are source based.  To satisfy multicast routing policy,
   it may be necessary to build/maintain multiple trees per source.

   The layer 2 multicast distribution trees are realised by MRSP, which
   is used to add/modify/delete multicast forwarding state in Switches
   that interconnect routers. MRSP joins/prunes serve this purpose.  No
   MRSP forwarding state is maintained by routers.

   MRSP's actions (events) are initially driven by the layer 3 explicit
   join multicast routing protocol, e.g. PIM-SM or PIM-SSM. This implies
   that (most) MRSP messages are triggered by routers.

   For MRSP to function as described, minor enhancements to PIM-SM (and
   PIM-SSM) are required. These are as follows:


o    As already explained, PIM ASSERTs elect a single ingress domain per
     source, thereby limiting multicast routing policy. Multicast
     routing protocols, including PIM, were not designed as policy
     routing protocols; M-BGP was designed for this purpose. We
     therefore recommend that PIM's ASSERT mechanism be disabled when
     running PIM over a multicast Exchange point.  MRSP Switch
     forwarding state allows multiple ingresses per multicast source
     without causing duplicates, and thereby puts multicast routing
     policy back in the control of M-BGP.


o    A multicast Exchange member router MAY (and indeed SHOULD) apply
     policy verification to a received L3 Join, and have the ability to
     reject as well as accept the join. In order to provide this
     capability, we recommend PIM be enhanced with a JOIN_NACK for use
     in this type of environment.


o    Finally, since MRSP makes the shared LAN behave more like a series
     of point-to-point links, it is no longer necessary (or desirable at
     an Exchange point) to multicast L3 joins/prunes. We therefore
     suggest PIM be modified to incorporate (perhaps as a configurable
     option) unicast joins/prunes.
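
   The policy verification step in the second enhancement above might
   be sketched as follows. The policy predicate, the peer set and the
   returned action names are hypothetical; the draft proposes
   JOIN_NACK but does not specify how policy is expressed.

```python
from typing import Callable

# A policy is any predicate over (joining router, source, group).
Policy = Callable[[str, str, str], bool]


def handle_l3_join(joiner: str, source: str, group: str,
                   policy: Policy) -> str:
    """Return the action taken for a received L3 (S,G) Join."""
    if policy(joiner, source, group):
        # Accepted: the router goes on to trigger a layer 2 MRSP Join.
        return "MRSP_JOIN"
    # Rejected: the router answers with the proposed JOIN_NACK.
    return "JOIN_NACK"


# Hypothetical policy: only serve joins from routers we peer with.
PEERS = {"rtr-B", "rtr-C"}


def peers_only(joiner: str, source: str, group: str) -> bool:
    return joiner in PEERS


print(handle_l3_join("rtr-B", "192.0.2.10", "232.1.1.1", peers_only))
print(handle_l3_join("rtr-D", "192.0.2.10", "232.1.1.1", peers_only))
```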


   MRSP is a soft-state protocol; MRSP Joins must be refreshed
   periodically, otherwise the Switch state they previously
   instantiated will expire.
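
   Soft state of this kind can be sketched as a table of expiry times.
   The class name and the holdtime value are assumptions; the draft
   does not specify timer values. Time is passed in explicitly so the
   example is deterministic.

```python
class SoftState:
    """(S,G) entries that expire unless refreshed by an MRSP Join."""

    def __init__(self, holdtime: float):
        self.holdtime = holdtime   # seconds; value is an assumption
        self.expires = {}          # (source, group) -> expiry time

    def refresh(self, sg, now: float) -> None:
        """Called for every MRSP Join received for this (S,G)."""
        self.expires[sg] = now + self.holdtime

    def expire(self, now: float) -> list:
        """Delete and return all entries whose holdtime has elapsed."""
        dead = [sg for sg, t in self.expires.items() if t <= now]
        for sg in dead:
            del self.expires[sg]
        return dead


state = SoftState(holdtime=90.0)
sg = ("192.0.2.10", "232.1.1.1")
state.refresh(sg, now=0.0)
state.refresh(sg, now=60.0)        # periodic refresh extends the entry
print(state.expire(now=100.0))     # []
print(state.expire(now=150.0))     # the entry has timed out
```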



3.3 MRSP Protocol

   There is one MRSP protocol component for Routers, and another for
   Switches.

   MRSP devices use a HELLO protocol (tbd) to monitor "liveness" of
   adjacent MRSP devices.

   The layer 2 MRSP Joins/Prunes are explicitly addressed to a layer 2
   group address listened to by all Switches (assignment to be
   requested). Each L2 join/prune identifies both its originator
   router and its intended recipient router, and is processed by all
   Switches on the L2 spanning tree path between them. It follows that
   L2 joins/prunes are forwarded hop-by-hop over the spanning tree path
   leading to the L2 join/prune's intended L2 destination.

   An MRSP (layer 2) Join is triggered when a router receives (and
   accepts) a Layer 3 (S,G) Join on an MRSP interface. The MRSP Join
   travels in the reverse direction of the corresponding L3 Join. The
   MRSP Join establishes/augments MRSP forwarding state in the switches
   it traverses.  A switch's MRSP forwarding entry has one upstream port
   (the port over which the MRSP Join arrives) and one or more
   downstream ports (the port(s) over which an MRSP Join is forwarded).
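
   Join processing at a single switch might be sketched as below. Two
   points are assumptions rather than statements of the draft: that
   the switch already knows the spanning tree port leading to each
   router (modelled by "mac_table"), and that the Join originator's
   address is the L2 source address recorded in the forwarding entry.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    upstream_port: int                        # port the Join arrived on
    downstream_ports: set = field(default_factory=set)


class Switch:
    def __init__(self, mac_table: dict):
        self.mac_table = mac_table   # router address -> port (assumed)
        self.state = {}              # (l2_source, S, G) -> Entry

    def on_join(self, in_port, originator, recipient, source, group):
        """Install or augment state; return the port to forward on."""
        out_port = self.mac_table[recipient]
        key = (originator, source, group)
        entry = self.state.setdefault(key, Entry(upstream_port=in_port))
        entry.downstream_ports.add(out_port)
        return out_port


sw = Switch({"mac-B": 2, "mac-C": 3})
sw.on_join(1, "mac-A", "mac-B", "192.0.2.10", "232.1.1.1")
sw.on_join(1, "mac-A", "mac-C", "192.0.2.10", "232.1.1.1")
print(sw.state[("mac-A", "192.0.2.10", "232.1.1.1")])
```

   The second Join augments the existing entry rather than creating a
   new one, so the entry ends up with one upstream port and two
   downstream ports.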

   Since MRSP Joins travel downstream with respect to the source, MRSP
   Joins can be aggregated at their upstream point of origin, thus
   reducing the MRSP message overhead.  An aggregated MRSP Join carries
   a list of intended recipient routers which are downstream. As this
   join travels towards the recipient routers, the spanning tree paths
   to different recipient routers will diverge. At diverging points,
   the MRSP Join is duplicated and modified as necessary.
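
   One way to picture the duplication at a diverging point is to group
   the recipient list by outgoing port. This is only a sketch: the
   "mac_table" mapping from router address to spanning tree port is
   assumed, and the draft does not define the Join's wire format.

```python
from collections import defaultdict


def split_join(recipients, mac_table):
    """Group an aggregated Join's recipients by outgoing port.

    Returns {port: [recipients reached via that port]}; one copy of
    the Join, carrying only that sub-list, is sent on each port.
    """
    per_port = defaultdict(list)
    for mac in recipients:
        per_port[mac_table[mac]].append(mac)
    return dict(per_port)


mac_table = {"mac-B": 2, "mac-C": 3, "mac-D": 3}
print(split_join(["mac-B", "mac-C", "mac-D"], mac_table))
# {2: ['mac-B'], 3: ['mac-C', 'mac-D']}
```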

   A (layer 2) MRSP Prune is triggered by a router immediately after the
   router sends an L3 Prune. An MRSP Prune flows in the same direction
   as the corresponding L3 Prune. An MRSP Prune removes the port over
   which it arrived from the corresponding MRSP forwarding entry.
   Similar to a layer 3 Prune, an MRSP Prune is not forwarded upstream
   by a switch if, after processing the MRSP Prune, that switch still
   has downstream forwarding state for the same (S,G).
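
   The Prune rule can be sketched as follows. State is modelled here
   as a mapping from a forwarding entry key to its set of downstream
   ports. Dropping a Prune for which no state exists, and deleting an
   entry once its last downstream port is removed, are assumptions
   that the draft does not spell out.

```python
def on_prune(state: dict, key, in_port: int) -> bool:
    """Process an MRSP Prune that arrived on in_port.

    Returns True if the Prune should be forwarded upstream, i.e. if
    no downstream forwarding state remains for this entry.
    """
    ports = state.get(key)
    if ports is None:
        return False              # no state (handling is assumed)
    ports.discard(in_port)        # remove the arrival port
    if ports:
        return False              # other downstream ports remain
    del state[key]                # last port gone: drop the entry
    return True


key = ("mac-A", "192.0.2.10", "232.1.1.1")
state = {key: {2, 3}}
print(on_prune(state, key, 2))    # False: port 3 still wants traffic
print(on_prune(state, key, 3))    # True: propagate the Prune upstream
```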

   If an MRSP Prune is lost, the corresponding Switch state will
   eventually time out through lack of MRSP Join refresh (since the L3
   joins have ceased, so too do the L2 MRSP joins).

   When MRSP is disabled on a router's interface, the last MRSP message
   the router sends on that interface is an MRSP BYE message. This
   message may be unicast or multicast to the Switch, but the Switch
   does not forward it. On receipt of the BYE message, the Switch
   resumes its default multicast forwarding behaviour on that port.
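
   BYE handling at a switch might look like the sketch below. We read
   "default" here as the behaviour the port had before MRSP was
   enabled; that reading, the class name, and the absence of any state
   clean-up are assumptions.

```python
class PortModes:
    """Tracks which switch ports currently have MRSP enabled."""

    def __init__(self, mrsp_ports):
        self.mrsp_ports = set(mrsp_ports)

    def on_bye(self, port: int) -> bool:
        """Handle a BYE locally; return False as it is not forwarded."""
        self.mrsp_ports.discard(port)
        return False

    def filters_by_default(self, port: int) -> bool:
        """True if unmatched multicast is filtered on this port."""
        return port in self.mrsp_ports


modes = PortModes([1, 2])
modes.on_bye(1)
print(modes.filters_by_default(1))   # False: MRSP filtering is off
print(modes.filters_by_default(2))   # True
```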



3.4 MRSP Multicast Forwarding State

   An MRSP Switch multicast forwarding entry consists of:

   Layer 2 source address, layer 3 source address, layer 3 group
   address, out-port-list.

   A Switch uses its MRSP forwarding state as follows: when a multicast
   frame arrives via an MRSP Switch port, the frame's L2 source address
   is used as the primary index into the forwarding table. Several
   entries may exist per L2 source address. The switch RPF checks the
   frame to ensure it arrived on the correct port for its L2 source.
   The Layer 3 source and destination IP addresses in the datagram must
   be matched with the L3 (IP) source and group fields in the
   forwarding table to uniquely identify the correct forwarding entry.
   If all these tests are successful, a copy of the frame is forwarded
   over each port listed in the out-port-list, otherwise the frame is
   discarded.
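
   The lookup described above can be sketched as follows. The entry
   fields follow the list given in this section; the "rpf_port"
   mapping (the port on which each L2 source is expected) is an
   assumption about how a switch would record the RPF information.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    l2_source: str
    l3_source: str
    l3_group: str
    out_ports: frozenset


def forward(table, rpf_port, in_port, l2_src, ip_src, ip_dst):
    """Return the ports to copy the frame to (empty set = discard)."""
    candidates = table.get(l2_src, [])     # primary index: L2 source
    if not candidates or rpf_port.get(l2_src) != in_port:
        return frozenset()                 # no state, or RPF failure
    for entry in candidates:               # then match on (S,G)
        if (entry.l3_source, entry.l3_group) == (ip_src, ip_dst):
            return entry.out_ports
    return frozenset()


table = {"mac-A": [Entry("mac-A", "192.0.2.10", "232.1.1.1",
                         frozenset({2, 3}))]}
rpf_port = {"mac-A": 1}
print(sorted(forward(table, rpf_port, 1, "mac-A",
                     "192.0.2.10", "232.1.1.1")))    # [2, 3]
print(sorted(forward(table, rpf_port, 4, "mac-A",
                     "192.0.2.10", "232.1.1.1")))    # []
```

   Indexing first on the L2 source is what allows two routers to
   inject the same (S,G) without duplicates: each ingress router has
   its own entry and therefore its own out-port-list.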



3.5 MRSP Message Types


   There are 5 types of MRSP message:

o    type 1: HELLO, used to establish MRSP "liveness" of a neighbouring
     MRSP device.

o    type 2: JOIN, instigated by routers, used for
     establishing/refreshing Switch multicast forwarding state.

o    type 3: PRUNE, instigated by routers, used for modifying/deleting
     Switch multicast forwarding state.

o    type 4: ERROR, used for signalling MRSP error conditions to a
     neighbouring MRSP device.

o    type 5: BYE, used to inform an MRSP neighbour that MRSP is being
     disabled on the interface facing that neighbour.
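
   An implementation might represent the type codes as an enumeration.
   The class name is illustrative, and no message layout is implied
   since the draft does not yet define one.

```python
from enum import IntEnum


class MrspType(IntEnum):
    """MRSP message type codes as listed in this section."""
    HELLO = 1
    JOIN = 2
    PRUNE = 3
    ERROR = 4
    BYE = 5


def parse_type(code: int) -> MrspType:
    """Map a received type code to MrspType; unknown codes raise."""
    return MrspType(code)   # raises ValueError for codes outside 1..5


print(parse_type(2).name)   # JOIN
```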



4. Summary

   This draft has described a new Router-Switch protocol for Ethernet,
   MRSP, which when combined with our suggested enhancements to PIM (SM
   and SSM), can be used to build uni-directional multicast forwarding
   state in Ethernet Switches that interconnect IP multicast routers.
   The advantages of MRSP (combined with the suggested enhancements to
   PIM) are that a) (S,G) multicast traffic is restricted to only those
   routers that request it, and b) it allows multiple ingress points per
   multicast source. The end result is that multicast traffic flows
   only where it is wanted, saving network resources, and multicast
   policy can be applied and enforced. These features are currently not
   supported, either separately or together, by any existing IETF
   protocol(s). However, these features are particularly relevant and
   important to inter-domain multicasting, given that multicast domains
   increasingly interconnect at switch based Internet Exchange points.






References

   [MIX] H. LaMaster, S. Schulz, J. Meylor, D. Meyer. Multicast-Friendly
   Internet Exchange (MIX). Work in progress, June 1999.

   [SNOOP] M. Christensen, F. Solensky. IGMPv3 and IGMP Snooping
   Switches. Work in progress, February 2001.
   draft-ietf-idmr-snoop-00.txt

   [IEEE] IEEE 802.1D, see http://www.ieee802.org/1/pages/802.1D.html

   [MBONE]

   [PIM] W. Fenner, M. Handley, H. Holbrook, I. Kouvelas. Protocol
   Independent Multicast - Sparse Mode Protocol Specification. Work in
   progress, March 2001.  draft-ietf-pim-sm-v2-new-02.txt,ps.

   [SSM] S. Bhattacharyya et al. An Overview of Source-Specific
   Multicast (SSM) Deployment. Work in progress, July 2000.
   draft-bhattach-pim-ssm-00.txt

   [BGP-4+] T. Bates, R. Chandra, D. Katz, Y. Rekhter. Multiprotocol
   Extensions for BGP-4. RFC 2283.

   [MSDP] D. Farinacci et al. Multicast Source Discovery Protocol. Work
   in progress, January, 2000. draft-ietf-msdp-spec-02.txt




Author Information

   Tony Ballardie
   Consultant

   ABallardie@acm.org


   Chris Fletcher
   London Internet Exchange
   3 Park Road
   Peterborough
   PE1 2UX
   UK

   chris@linx.net





Full Copyright Statement

   Copyright (C) The Internet Society (2001). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this docu-
   ment itself may not be modified in any way, such as by removing the
   copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of develop-
   ing Internet standards in which case the procedures for copyrights
   defined in the Internet Standards process must be followed, or as
   required to translate it into languages other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MER-
   CHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.



Acknowledgement

   Funding for the RFC editor function is currently provided by the
   Internet Society.

















