    Internet Engineering Task Force                           MMUSIC WG
    Internet Draft
                                                      Philippe Gentric,
                                                    Philips Electronics
                                                    
                                                               May 2003
                                                  expires November 2003

           draft-gentric-mmusic-stream-switching-req-00.txt


           Requirements and Use Cases for Stream Switching




STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance 
   with all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that 
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six 
   months and may be updated, replaced, or obsoleted by other 
   documents at any time.  It is inappropriate to use Internet-
   Drafts as reference material or to cite them other than as "work 
   in progress".

   The list of current Internet-Drafts can be accessed at 
   http://www.ietf.org/ietf/1id-abstracts.txt

   To view the list of Internet-Draft Shadow Directories, see 
   http://www.ietf.org/shadow.html.


Abstract

   Stream switching is a technique used to change the data rate of a 
   media being streamed, typically for the purpose of adaptation to 
   the effectively available bandwidth of the network. This memo 
   lists the use cases and the requirements for stream switching.







Gentric                                                      [page 1]

Internet Draft       Stream Switching Requirements      March 2003



1. Introduction

   Stream switching is a technique used to change the data rate of a 
   media being streamed, typically for the purpose of adaptation to 
   the effectively available bandwidth of the network.
   
   The aim is that a real time streaming system can switch from 
   stream to stream in order to vary the data rate. This requires 
   that the same content is encoded as multiple streams at various 
   bit rates. 
   
   This memo lists a number of use cases in section 2 and 
   requirements in section 3.   
   
1.1 Typical usage context

   The typical scenario is video distributed on demand, also known 
   as "Video On Demand" (VOD). The situation is depicted in figure 1. 
   This is the domain of RTSP [RTSP] servers. HTTP is typically used 
   for the service/application, i.e. it provides the entry point, 
   usually an RTSP URL. The media can be pre-recorded on file or can 
   be a "live" source, in which case the RTSP/RTP server acts as a 
   relay.
   
   
   
             *****************                        *****************
             *               *        HTTP            *               *
             *  HTTP Server  *  <------------------>  *  HTTP Client  *
             *               *                        *               *
             *****************                        *****************
                
             *****************                        *****************
             *               *        RTSP            *               *
             *  RTSP Server  *  <------------------>  *  RTSP Client  *
             *               *                        *               *
             *****************                        *****************
                
             *****************                        *****************
             *               *      RTP on UDP UC     *               *
             *  RTP Sender   *  ------------------->  *  RTP Receiver *
             *               *                        *               *
   media     *               *      RTCP SR           *               *
    on   --> *               *  ------------------->  *               *
   file      *               *                        *               *
    or       *               *     RTCP feedback      *               *
   live      *               *  <-------------------  *               *
             *****************                        *****************
   
   Figure 1: video on demand
   

1.2 Generalities

   The rationale of stream switching is based on the following 
   premises:

   . With the emergence of streaming on wide band networks the lack 
   of congestion control tools for streaming has been creating an 
   increasing level of concern among users and operators. Obviously 
   it is highly desirable that these tools should provide a 
   "generalized" stream switching framework i.e. should not depend 
   on a given codec technology or particular network configuration.
   
   . With the emergence of streaming on wireless networks where 
   bandwidth fluctuations are the rule the need for such tools is 
   becoming vital, which is acknowledged by some dedicated fora 
   activity [3GPP-alt-attr] [3GPP-BWS].
  
   . While codec based schemes (scalable coding schemes, fine 
   grained scalable video etc) have been promoted for years in 
   various standardization bodies, they have not succeeded in 
   passing to the next stage, i.e. entering the fora closer to 
   product specifications, for the following reasons:
   
   . In the meantime the classical "constant bit rate" paradigm of 
   codecs has led to (ever improving) compression efficiencies ... 
   at constant bit rates.
   
   . For a media distribution service (as opposed to a 
   conferencing service where the situation is significantly 
   different) there are only two paradigms:
   
   . On demand (point to point). In that case since the distribution 
   is point to point between the media server and the client, the 
   key argument will be the coding efficiency i.e. the perceptual-
   quality versus bit-rate ratio, which favors switching between 
   hyper-optimized constant bit rate streams.

   . Live (broadcast). In that case changing the rate of one unique 
   encoder based on reports from a number of receivers is difficult 
   to imagine. On the other hand simultaneously encoding the same 
   program at various bit rates is easy to deploy either with a 
   point-to-point relay (which makes it equivalent to the previous 
   case) or using some type of multicasting.
   

1.3 Vocabulary

   We define a "program" as a set of "tracks", for example a movie 
   is composed of an audio and a video track. 
   
   We define a "stream" as an encoded instance of a track. For 
   example the video track of a movie may be encoded at 50 kb/s, 
   150 kb/s and 400 kb/s using respectively H263 baseline SQCIF 
   7.5 fps, MPEG-4 SP@L3 QCIF 15 fps and MPEG-4 ASP@L3 CIF 30 fps; 
   the audio track may be encoded at 5 kb/s, 20 kb/s, 48 kb/s and 
   80 kb/s using respectively AMR, AMR WB, AAC mono and AAC stereo.
   
   We define one "flavor" of a program as a given set of streams (a 
   pair for a movie, usually consisting of audio and video). For 
   example 400 kb/s video and 80 kb/s AAC stereo is the high 
   quality flavor in the example above, for which we have 12 
   different flavors (3 video streams times 4 audio streams), 
   although some flavors may not always make sense.
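
   As an illustration only, the flavors of the example program can 
   be enumerated mechanically. The sketch below is not part of any 
   specification; the names VIDEO_STREAMS, AUDIO_STREAMS and 
   enumerate_flavors are hypothetical, and the rates are those 
   quoted in the example above.

```python
from itertools import product

# Streams from the example above, as (description, rate in kb/s).
VIDEO_STREAMS = [
    ("H263 baseline SQCIF 7.5 fps", 50),
    ("MPEG-4 SP@L3 QCIF 15 fps", 150),
    ("MPEG-4 ASP@L3 CIF 30 fps", 400),
]
AUDIO_STREAMS = [
    ("AMR", 5),
    ("AMR WB", 20),
    ("AAC mono", 48),
    ("AAC stereo", 80),
]


def enumerate_flavors(video, audio):
    """Return every (video, audio, total kb/s) combination."""
    return [(v, a, v[1] + a[1]) for v, a in product(video, audio)]


flavors = enumerate_flavors(VIDEO_STREAMS, AUDIO_STREAMS)
print(len(flavors))  # number of video streams times number of audio streams
for v, a, total in sorted(flavors, key=lambda f: f[2]):
    print(f"{total:4d} kb/s  {v[0]} + {a[0]}")
```

   A deployment would typically prune this list, since some 
   combinations (e.g. the highest audio rate with the lowest video 
   rate) may not make sense for a given service.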
      
   We define a "switch-set" as the set of all the streams for a 
   track or a program. A switch-set can be organized either as 
   ordered first by track or first by flavor. Obviously switch-sets 
   are prepared during the content production or deployment phase. 
   
   We define "down-switch" as a switch toward a lower rate.
   
   We define "up-switch" as a switch toward a higher rate.
   
   We define the "effective rate" as the data rate that the network 
   can sustain at a given moment. This is the smallest data rate 
   among all the links in the path from sender to receiver, usually 
   that of the last hop. The situation where several links would 
   "compete" for this position makes modeling more complex but does 
   not affect the overall rationale.
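
   The definitions above can be restated as a short sketch. It 
   assumes that the sustainable rate of every link is known, which 
   is rarely the case in practice; the function names are 
   hypothetical.

```python
def effective_rate(link_rates_kbps):
    """Smallest sustainable rate among the links of the path."""
    if not link_rates_kbps:
        raise ValueError("the path must contain at least one link")
    return min(link_rates_kbps)


def classify_switch(old_rate_kbps, new_rate_kbps):
    """Name a switch according to the vocabulary of this section."""
    if new_rate_kbps < old_rate_kbps:
        return "down-switch"
    if new_rate_kbps > old_rate_kbps:
        return "up-switch"
    return "no rate change"


# Hypothetical path: backbone, access network, wireless last hop.
path = [100000, 2000, 50]
print(effective_rate(path))
print(classify_switch(400, 150))
```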
   

2. Use Cases

2.1 Home VOD service

   A service provider has deployed a VOD service (RTSP+RTP) as part 
   of some wired home Internet access (DSL, cable).

   The service is designed to sustain N concurrent users watching N 
   different movies (i.e. has a bandwidth of N*BR where BR is the 
   nominal bit rate for the movies). However the same network is 
   also used for Web browsing.
   
   At some point in time there is a peak in traffic (TCP and/or 
   streaming) on the last hop from a given head-end; as a result 
   users experience congestion.
   
2.1.1 Without stream switching

   The router queues overflow, and all TCP traffic falls back to 
   sharing whatever bandwidth is left by the streaming service. 
   
   TCP users are unhappy because of the small bandwidth.
   
   Video users may be slightly unhappy: because TCP sessions are 
   constantly probing by increasing their rate, TCP causes a 
   constant packet loss rate of several percent, which slightly 
   affects video playback. Video users who have good (error 
   resilient) decoders are almost not affected at all.
   
2.1.2 With stream switching

   In this case the detection of packet losses causes all or most of 
   the streaming systems to switch down to lower rates, leaving more 
   bandwidth for the TCP traffic.

   TCP users are happy because of the relatively high remaining 
   bandwidth.
   
   Video users are moderately unhappy because the lower video rates 
   cause lower objective quality. Also, both types of traffic (TCP 
   and RTP) still cause a constant packet loss rate of several 
   percent, because there are constantly TCP and RTSP/RTP sessions 
   probing by increasing their rate, which causes video playback to 
   be slightly affected (but again this is decoder dependent).
   

2.2 GPRS wireless VOD service

   The video service is a news service for GPRS handsets based on 
   the availability of 5 GPRS slots (roughly 50 kb/s) for download. 
   This bandwidth is divided into 5 kb/s for AMR speech and whatever 
   is left for video which can be: (1) no video (2) 25 kb/s video 
   "slide show" (3) 45 kb/s video at 10 fps.

   The policy implemented in the radio system is that each 
   authenticated user (or non-authenticated emergency user) must at 
   least have 1 slot (i.e. voice has precedence over data).
   
   Another policy implemented in the radio system is that all IP 
   traffic is handled the same way and the system is tuned for 
   minimum error rate, specifically the low level layers will try 
   the maximum number of attempts for each radio packet while 
   increasing the redundancy, before giving up.
   
2.2.1 Static user 

   The user is static in a very busy cell i.e. there are many people 
   popping in and out of the cell (either physically moving or making 
   short calls). They are causing rapid fluctuations in the 
   effective GPRS bandwidth for streaming.
   
2.2.1.1 Without stream switching

   When there are too many calls in progress in the cell the video 
   user gets only 1 slot and the video session causes congestion (in 
   the base station router). Catastrophic degradation follows with 
   player freeze, difficulties in reconnecting etc...
   
2.2.1.2 With stream switching

   When the bandwidth falls to 1 slot the decoder immediately 
   detects it (e.g. by measuring the jitter). The feedback causes 
   the source to switch off the video, allowing the system to 
   recover without having caused congestion.
   
   If the bandwidth falls to 3 or 4 slots the system will have the 
   opportunity to switch to the 25 kb/s "slide show" alternative.
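
   The behavior described for this use case can be sketched as a 
   simple selection rule. The sketch assumes roughly 10 kb/s per 
   GPRS slot (from "5 GPRS slots (roughly 50 kb/s)") and reserves 
   5 kb/s for AMR speech; the names and the decision rule itself 
   are hypothetical and only illustrate the scenario.

```python
KBPS_PER_SLOT = 10               # assumption: 5 slots give roughly 50 kb/s
AUDIO_KBPS = 5                   # AMR speech
VIDEO_ALTERNATIVES = [0, 25, 45] # no video, "slide show", 10 fps video


def pick_video_rate(slots):
    """Highest video alternative fitting beside the audio stream.

    Returns None when not even the audio stream fits.
    """
    budget = slots * KBPS_PER_SLOT - AUDIO_KBPS
    candidates = [r for r in VIDEO_ALTERNATIVES if r <= budget]
    return max(candidates) if candidates else None


for slots in range(6):
    print(slots, pick_video_rate(slots))
```

   With these assumptions 5 slots select the 45 kb/s video, 3 or 4 
   slots select the slide show, and 1 slot leaves audio only.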
   
2.2.2 Mobile user, fully transparent hand over.

   The user is moving from cell to cell, some are very busy some are 
   empty. We assume hand over from cell to cell is fully transparent.
   
   This case is similar to the previous one.

2.2.3 Mobile user, non-transparent hand over.

   In this case a hand over may cause the player to receive nothing 
   during a certain time. We assume that the network does not lose 
   any data i.e. that the data accumulated in the network during the 
   hand over will eventually reach the player. The lack of data may 
   cause an underflow depending on how the player de-jittering 
   buffer has been configured. This type of problem is the primary 
   reason why large de-jittering buffers (many seconds of data) are 
   required for this type of network.
   
2.2.4 Mobile user, radio problems

   In this case the motion of the users causes radio problems 
   (obstacles between the antenna(s) and the phone).
   
   Assuming per configuration (see above) the radio network is 
   "almost" lossless the congestion effect is multiplied i.e. when 
   the radio bandwidth decreases packets will pile up instead of 
   being discarded.
   
   Apart from that (which will make the problem worse) this case is 
   similar to 2.2.1. Switching both streams (audio and video) off if 
   the effective bandwidth becomes really small may be an 
   interesting behavior.

2.3 Feedback from congested routing device

   An interesting use case to consider is the case where, by some 
   means that we do not need to specify here, the routing device 
   experiencing congestion has a way to signal it to the RTP source 
   (RTSP server in our case).
   
   This case is similar to 2.2.1. with the advantage that the 
   reaction will be faster.
   
   
2.4 GPRS video with server initiated video stream switching

   The video configuration for GPRS as in 2.2 is documented by the 
   3GPP Packet Switched Streaming specification.
   
   In that case the whole bit rate range is covered by a single 
   video codec with the same configuration (MPEG-4 video Simple 
   Profile Level 0). This means that the scenario depicted in 2.2 is 
   possible as soon as the server receives feedback about the 
   network conditions.
   
   RTCP feedback or extended RTCP feedback can be used for this 
   purpose. Direct feedback from the congested node as described in 
   2.3 is another possibility.
   
   No other signaling is needed assuming that the video packets are 
   sent as belonging to the same single RTP session.
   
   This configuration is called "client-transparent" for that 
   reason, i.e. client implementations are expected to be robust to 
   bit rate changes, including a complete cut off for many seconds 
   (however some time-outs may occur).
   
  
2.5 GPRS audio with codec change

   The audio service is a music on demand service based on the 
   availability of 5 GPRS slots (roughly 50 kb/s) for download. The 
   available bandwidth is expected to vary down to 5 kb/s. The 
   switch set prepared for the service is as follows: 
   
   (1) 5 kb/s AMR 
   
   (2) 12 kb/s AMR WB 
   
   (3) 20 kb/s AMR WB 
   
   (4) 30 kb/s AAC mono 
   
   (5) 40 kb/s AAC stereo 
   
   (6) 50 kb/s AAC stereo.
   
   The specific problem in that case, when compared to the previous 
   one, is that there are several different decoders and/or decoder 
   configurations involved (specifically there are 4 of them: AMR, 
   AMR WB, AAC mono and AAC stereo, which must be processed as 
   different codecs). Therefore the server MUST signal a switch to 
   the client since feeding AAC into an AMR decoder (or vice versa) 
   may crash it.
   
   This configuration is called "non-client-transparent".
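
   The distinction between the two configurations can be 
   illustrated by checking whether a switch changes the decoder 
   configuration. The sketch below uses the switch set listed 
   above; the function names are hypothetical.

```python
# Switch set from the list above: (rate in kb/s, decoder configuration).
SWITCH_SET = [
    (5, "AMR"),
    (12, "AMR WB"),
    (20, "AMR WB"),
    (30, "AAC mono"),
    (40, "AAC stereo"),
    (50, "AAC stereo"),
]


def decoder_configurations(switch_set):
    """Distinct decoder configurations present in the switch set."""
    return sorted({config for _, config in switch_set})


def needs_signaling(switch_set, old_index, new_index):
    """True when the switch changes the decoder configuration."""
    return switch_set[old_index][1] != switch_set[new_index][1]


print(decoder_configurations(SWITCH_SET))
print(needs_signaling(SWITCH_SET, 1, 2))  # 12 kb/s AMR WB to 20 kb/s AMR WB
print(needs_signaling(SWITCH_SET, 2, 3))  # 20 kb/s AMR WB to 30 kb/s AAC mono
```

   Only switches for which needs_signaling returns False could be 
   handled in a client-transparent way.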

   [Note: although a Requirement section should not hint at the 
   solution, the next paragraph will, in order to explore some 
   additional requirements with respect to synchronization] The 
   obvious thing to do is that the switch set should be set up 
   within a single RTSP session between client and server using a 
   different RTP session for each stream. This means that each 
   stream will at least have a different Payload Type and in 
   addition may be transported toward a different UDP port. This 
   ensures that the receiver can "perceive" that the server switched 
   simply because one RTP session will not receive any packet 
   anymore while another one will start to receive some packets. One 
   key question is how the client will be able to seamlessly 
   synchronize. RTP time stamps will be used but since they have a 
   different random offset for each RTP session additional 
   information is required.
   
   Notes:
   
   1) The way this is handled in RTSP is to use RTP-Info in 
   responses to PLAY commands in order to convey the mapping between 
   the Normal Play Time (media time) and the RTP time stamps.
   
   2) The way this is handled in RTP is to send RTCP sender reports 
   containing the mapping between the sender wall clock and the RTP 
   time stamps. Doing this has two drawbacks: first, such packets 
   may be lost; second, the timing may be late.
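
   As an illustration of note 1, the sketch below maps an RTP time 
   stamp to Normal Play Time given the rtptime value conveyed in 
   RTP-Info and the start of the range returned with the PLAY 
   response. It assumes normal-speed playback and a packet that is 
   not older than rtptime; the function name is hypothetical. Since 
   each RTP session has its own random offset, a receiver would 
   need one such (rtptime, range start) pair per stream of the 
   switch set.

```python
RTP_TS_MODULUS = 1 << 32  # RTP time stamps are 32 bits wide and wrap around


def rtp_to_npt(rtp_ts, rtptime, npt_start, clock_rate):
    """Map an RTP time stamp to Normal Play Time in seconds.

    rtptime    -- RTP time stamp corresponding to npt_start
    npt_start  -- start of the played range, in seconds
    clock_rate -- RTP clock rate of the payload format, in Hz
    """
    elapsed_ticks = (rtp_ts - rtptime) % RTP_TS_MODULUS
    return npt_start + elapsed_ticks / clock_rate


# Two streams of the same track with different random offsets map
# to the same media time (90 kHz clock, as commonly used for video).
print(rtp_to_npt(1000 + 90000, 1000, 10.0, 90000))
print(rtp_to_npt(777000 + 90000, 777000, 10.0, 90000))
```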
   

3. Requirements

   Requirements listed here are characterized by a number (e.g. 
   "R23"), a description (a sentence), "Utility" (Always, config 
   specific, player specific, server specific, rare) and 
   "Importance" (Critical, high, medium, low).
   
3.1 Out of scope requirements
   
   This memo does not address the requirement for a way to convey 
   the description of the switch set(s) (it would typically need a 
   memo of its own).
   
   This memo does not address requirements affecting the rate 
   control algorithm itself, i.e. it is considered here as given 
   that the rate control must provide a suitable target for the 
   switch in terms of bit rate (for more on rate control algorithms 
   for streaming see [TFRC]). In particular the rate control system 
   must obey the following rules:
   
   . For down-switches the target rate should be substantially lower 
   than the effective bandwidth in order for the streaming system to 
   "recover", i.e. to compensate for the negative effects of running 
   an excessive data rate for the amount of time Tsw needed to 
   detect the problem, compute a target and execute the switch. The 
   time it takes to "recover", i.e. to flush routing buffers and 
   replenish the receiver buffers, increases with Tsw and is to 
   first order inversely proportional to the difference between the 
   effective bandwidth and the (new) rate.





   
   . Subsequently up-switches will be performed in order to 
   "explore" and find the ceiling, i.e. the effective bandwidth. 
   This exploration MUST follow extremely strict rules in order to 
   avoid congestion explosions.
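
   The reason why a down-switch target should sit well below the 
   effective bandwidth can be illustrated with a simple fluid 
   model. This model is an assumption of the sketch, not part of 
   this memo: the backlog queued in the network grows while the 
   sender exceeds the effective bandwidth during Tsw, and then 
   drains at the rate left over between the effective bandwidth and 
   the new stream rate. Replenishing the receiver buffer is 
   ignored. The function name is hypothetical.

```python
def recovery_time(old_rate, new_rate, effective_rate, t_sw):
    """Seconds needed to drain the backlog built up during t_sw.

    Rates are in kb/s, t_sw in seconds (simple fluid model).
    """
    if new_rate >= effective_rate:
        raise ValueError("the new rate must be below the effective rate")
    backlog_kbit = max(old_rate - effective_rate, 0) * t_sw
    return backlog_kbit / (effective_rate - new_rate)


# 400 kb/s stream on a path that sustains 300 kb/s, switch after 2 s.
print(recovery_time(400, 250, 300, 2))  # target close to the ceiling
print(recovery_time(400, 200, 300, 2))  # lower target, faster recovery
```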
   
   In a similar fashion this memo does not address "application 
   policy" issues such as:
   
   . For some applications a complete switch off may be better 
   perceived (or easier to bill).
   
   . For some applications media type may create preferences; for 
   example a music service will first reduce the video to a slide 
   show while a news service would first switch from high quality 
   stereo audio to low bit rate mono audio.
   
   This memo also does not cover the multicast cases (i.e. 
   simulcast) for which switching is performed by routers.
    
3.2 Minimal receiver perturbation: Seamless switching requirements

   Seamless stream switching is obtained when the switch is 
   performed in such a fashion that media playback is minimally 
   disturbed from a player point of view.
   
   This requirement divides in several key issues as follows.

3.1.2. Preventing gaps in media

   Gaps in the media can have two causes: packet losses and 
   discarded packets.

3.1.2.1 Packet losses

   Losses (whatever the cause) create gaps. We assume here that 
   retransmission and FEC are out of scope inasmuch as the 
   solution should work without them. We will assume in the next 
   section that losses occur due to routing buffer overflow, which 
   is due to sending data at a rate that is higher than the link 
   bandwidth.

   R1: Prevent packet losses
   Utility: Always
   Importance: high

3.1.2.2  Discarded packets

   The receiver may be unable to process incoming data for two 
   reasons:

3.1.2.2.1 Random Access Point Required

   Some decoders (typically video decoders) may need a Random Access 
   Point (usually in video this is an "Intra" frame) in order to 
   start decoding; a stream switching system that would switch 
   "anywhere" in a stream would cause such receivers to discard data 
   until such a RAP is found. 
   
   However, thanks to recent video compression technologies, "well 
   implemented" video decoders can restart decoding "anywhere". This 
   is a by-product of implementing error concealment and resilience 
   techniques. Note that for some extremely resilient 
   implementations the capability to minimize the visible artifact 
   when jumping "anywhere" is surprisingly good, while for more 
   naive implementations the result can be awful.
   
   R2: Switch on RAP
   Utility: player and config specific
   Importance: medium
   
3.1.2.2.2 Synchronization information unavailable

   For audio and video playback accurate relative synchronization 
   (a.k.a. "lip-sync") is a key requirement. In some applications 
   one may even prefer to switch the audio off rather than playing 
   out of sync. The name "lip-sync" indicates the type of content 
   for which this is an extremely critical feature: video displaying 
   people talking (unfortunately videos not displaying people caught 
   in the action of talking are the exception rather than the 
   rule!). The issue at stake is that for RTP streaming in the 
   context of an RTSP session the receiver expects to receive the 
   required lip-sync information in response to a PLAY command 
   thanks to the RTP-Info field.
   
   Quote from RFC2326 section 12.33:

   "A mapping from RTP time stamps to NTP time stamps (wall clock) 
   is available via RTCP. However, this information is not 
   sufficient to generate a mapping from RTP time stamps to NPT. 
   Furthermore, in order to ensure that this information is 
   available at the necessary time (immediately at startup or after 
   a seek), and that it is delivered reliably, this mapping is 
   placed in the RTSP control channel."
     
   R3: Send sync info after switch
   Utility: Client non-transparent
   Importance: Critical
   
3.1.2. Preventing pauses in playback

   Playback pauses are caused by buffer underflows: the receiver 
   simply does not have data to decode and must therefore wait for 
   some. There can be a number of causes (as follows) but all causes 
   share the same precondition: the sender is pushing packets 
   corresponding to a data rate higher than some hop (very often the 
   last one) in the path to the client can sustain. The obvious 
   strategy then is to perform a down-switch.

3.1.2.1 There was no down-switch 

   The buffers in the network will eventually fill up with high 
   data rate packets, and these packets take longer to arrive at 
   the decoder than it takes to decode them.

   R4: Switch down to compensate bandwidth decrease
   Utility: Always
   Importance: Critical
   
   The next issue is obviously to make sure that the switch-down 
   signal arrives at the sender.
   
   R5: Make sure the switch-down signal is not lost
   Utility: Always
   Importance: Critical
 
3.1.2.2 The down-switch occurred too late

   If the switch occurs too late the result is similar: the routing 
   buffers are still full of high rate packets, which take a long 
   time to flush (this may depend on the router discard policy: if 
   the router has the policy to discard the oldest data first this 
   is not true, but then this policy will create gaps, see above).

   R6: Switch down as soon as possible when congestion detected
   Utility: Always
   Importance: Critical
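
   R6 presupposes a way to detect congestion early. Section 2.2.1.2 
   mentions measuring the jitter as one possibility. The sketch 
   below implements the interarrival jitter estimator defined by 
   the RTP specification (running average with gain 1/16); how the 
   estimate is turned into a switch decision is a policy matter not 
   covered here. The class name is hypothetical.

```python
class JitterEstimator:
    """Running interarrival jitter estimate, in RTP time stamp units."""

    def __init__(self):
        self.jitter = 0.0
        self._prev_transit = None

    def update(self, arrival_ts, rtp_ts):
        """Account for one packet; both arguments use the RTP clock."""
        transit = arrival_ts - rtp_ts
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        return self.jitter


est = JitterEstimator()
est.update(1000, 0)            # first packet only sets the reference
print(est.update(4160, 3000))  # this packet is delayed by 160 extra ticks
print(est.update(7160, 6000))  # same transit time, the estimate decays
```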

   Note that the urgency to switch increases roughly with the 
   difference between the (old) rate and the effective rate, which 
   leads to another requirement:
   
   R7: Down-switch to the smallest bit rate available when 
   congestion detected
   Utility: Always
   Importance: high
   
   This last requirement can be interpreted by considering that the 
   smallest bit rate is zero, i.e. one media is suppressed. An 
   example is a video news service where video is cut off while 
   audio remains.

3.1.3 Preventing visible quality changes

   Media quality is directly a function of the data rate. The 
   obvious requirement is therefore to always use the highest 
   possible data rate (which is in exact opposition with the 
   previous item!):

   R7bis: Down-switch to the highest bit rate available (but below 
   the effective rate)
   Utility: Always
   Importance: high

   A way to solve the contradiction is to defer to the rate control 
   algorithm the responsibility to compute a low target rate so as 
   to cause a fast recovery but to prepare for an up-switch just 
   below the estimated available bandwidth as soon as the network 
   conditions show signs of recovery. This effectively eliminates R7 
   and R7bis.
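
   The two-phase behavior described above can be sketched as 
   follows. The factors 0.5 and 0.9 applied to the bandwidth 
   estimate are purely hypothetical placeholders for the output of 
   the rate control algorithm, which is out of scope for this memo; 
   the function name is hypothetical as well.

```python
def select_stream(rates_kbps, target_kbps):
    """Highest available rate not exceeding the target.

    Returns None when no stream fits, i.e. the media is switched off.
    """
    candidates = [r for r in rates_kbps if r <= target_kbps]
    return max(candidates) if candidates else None


VIDEO_RATES = [50, 150, 400]   # switch set of the example in 1.3
estimate = 200                 # estimated available bandwidth, kb/s

# Phase 1: low target computed by the rate control, for fast recovery.
print(select_stream(VIDEO_RATES, 0.5 * estimate))
# Phase 2: up-switch just below the estimate once the network recovers.
print(select_stream(VIDEO_RATES, 0.9 * estimate))
```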


3.2 Minimize network perturbation  
 
   Another key requirement is to disturb the network as little as 
   possible.
   
3.2.1 Avoid accumulating data in network (routing) devices

   Since the amount of storage in routing devices is limited, 
   streaming traffic should behave and avoid using too much of this 
   storage too often. Since data accumulates in routers in case of 
   congestion, this requirement is equivalent to the one above 
   (R6), i.e. the key parameter is to switch down as soon as 
   possible when congestion is detected.
   
3.2.2 Avoid sending redundant data

   There are several possible reasons why the sender may send 
   redundant data. Obviously sending more data when the system is 
   experiencing congestion is a very bad idea; on the other hand it 
   is less important when switching up.
   
3.2.2.1 RTP session using retransmission or FEC

   Obviously RTP sessions using some type of retransmission scheme 
   or some type of adaptive FEC (Forward Error Correction) scheme 
   will cause additional traffic in case losses are detected, which 
   may worsen congestion.
   
   R8: Avoid retransmission and adaptive FEC 
   Utility: Always
   Importance: High
   
3.2.2.2 Back track to RAP

   As mentioned above decoders may need a RAP to start decoding. The 
   hypothesis explored above was that the decoder would discard data 
   until a RAP is received. The reverse solution consists in having 
   the sender back track in the stream until a RAP is found and 
   start sending the new stream at this point. This solution can be 
   extremely costly in case the stream has few RAPs and the previous 
   one is many seconds away.

   R9: Avoid back tracking to RAP for down-switch
   Utility: video and configuration specific
   Importance: variable

3.2.2.3 Packetization overlap

   Two streams encoding the same media at different rates may have 
   packetization overlap. This is typical for audio in VOD where 
   each packet contains as many frames as possible, i.e. up to the 
   path MTU or some safe smaller value (in order to reduce the 
   packet header overhead). In this case the time stamps of packets 
   from streams at different rates coincide very rarely. This means 
   that up to 1 packet equivalent of redundant media will be sent at 
   the switch, which is not a lot of data except for very low bit 
   rates (e.g. 4 kb/s audio packetized in 1500 octet datagrams has 
   a packet rate of one packet every 3 seconds; an additional packet 
   then represents a 30% peak rate increase!).
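
   The packet interval quoted in this example follows directly from 
   the packet size and the bit rate, as the short calculation below 
   shows. The function name is hypothetical, and packet header 
   overhead is ignored.

```python
def packet_interval_seconds(rate_kbps, packet_octets):
    """Media duration carried by one packet filled at the given rate."""
    return packet_octets * 8 / (rate_kbps * 1000)


# 4 kb/s audio in 1500 octet datagrams: 12000 bits at 4000 bit/s.
print(packet_interval_seconds(4, 1500))
# The same datagram size at 80 kb/s carries far less media time.
print(packet_interval_seconds(80, 1500))
```

   The result is also an upper bound on the duration of redundant 
   media sent at the switch when packet boundaries do not coincide.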
   
   R10: Avoid packetization overlap 
   Utility: Audio
   Importance: Low
   
3.3 Minimize receiver resource usage

   It is a requirement to minimize the amount of resources necessary 
   to implement stream switching in the players. This is especially 
   true for mobile clients. This requirement however is pretty weak 
   due to the comparatively vast amount of resources required for 
   media decoding.

   R11: Avoid large receiver resource requirements 
   Utility: Embedded players
   Importance: Low
   
3.4 Minimize sender resource usage

   It is a requirement to minimize the amount of resources necessary 
   to implement stream switching in the servers. This is only true 
   for high volume servers, but it is extremely important in that 
   case. Indeed high volume VOD servers are dedicated machines 
   optimized for thousands of concurrent sessions. Cost 
   effectiveness then depends on the ability of the implementers to 
   produce more concurrent sessions for the same hardware 
   configuration, which ultimately comes down to the ability to 
   switch context, which in turn depends critically on the amount of 
   memory and CPU cycles each individual context requires (see also 
   section 1 of [TFRC]).
   
   R12: Avoid increasing sender resource requirements 
   Utility: High volume servers
   Importance: Critical

3.5 Minimize receiver security risk

   The key risk for the receiver is to be the victim of an 
   unexpected switch or a switch that it does not support.
   
3.5.1 Switch with change in decoder configuration

   Changes in decoder configuration are in general either not 
   covered or explicitly excluded by compression standards. For 
   example in MPEG-4 video it is explicitly forbidden to change the 
   screen size in the middle of a stream (e.g. by sending a VO-VOL 
   update), more generally nobody would expect a given decoder to 
   detect that the content it is receiving has changed in nature 
   (say from AMR to AAC!).

   R13: No unsignaled or unprepared switches involving decoder 
   configuration changes 
   Utility: client non transparent 
   Importance: Critical
   
3.5.2 Switch without change in decoder configuration

   This is the case when nothing changes but the bit rate (many 
   codecs support this, but unfortunately usually over a restricted 
   bit rate range). 
   
   In theory no signaling is required. 
   
   In practice there is an extremely high risk that some part of 
   most existing implementations relies on the assumption that 
   streaming is performed at a constant (average) rate. However, 
   adding explicit signaling would not solve this backward 
   compatibility issue either.
   
   R14: Avoid unsignaled switches even if decoder configuration 
   does not change 
   Utility: client transparent 
   Importance: low to very low   

3.6 Minimize network security risk

   The key security issue for the network is directly related to 
   congestion avoidance. In this respect stream switching is a 
   benefit (compared with streaming without stream switching), 
   provided that it uses a correct rate control algorithm. In case 
   the congestion problem is not handled correctly by the rate 
   control system, a useful safety feature would be the ability to 
   authoritatively limit the output bandwidth of servers.
   
   R15: Use proven rate control algorithms
   Utility: Always
   Importance: Critical
   
   R16: Allow servers to deny an up-switch
   Utility: Always
   Importance: Critical
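
   R15 and R16 can be illustrated together. The Python sketch 
   below uses TFRC [TFRC] only as one example of a proven rate 
   control algorithm: tfrc_rate() follows the throughput equation 
   of RFC 3448 with the simplification t_RTO = 4*R. The function 
   allow_up_switch() and its server_cap argument are hypothetical 
   names for the authoritative output limit described above.

```python
import math


def tfrc_rate(s: float, rtt: float, p: float, b: int = 1) -> float:
    """TCP-friendly sending rate in bytes/s (RFC 3448 throughput equation).

    s:   packet size in bytes
    rtt: round trip time in seconds
    p:   loss event rate, 0 < p <= 1
    b:   packets acknowledged by a single TCP ACK
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("loss event rate must be in (0, 1]")
    if s <= 0 or rtt <= 0:
        raise ValueError("packet size and RTT must be positive")
    t_rto = 4.0 * rtt  # simplification suggested by RFC 3448
    denom = (rtt * math.sqrt(2.0 * b * p / 3.0)
             + t_rto * 3.0 * math.sqrt(3.0 * b * p / 8.0)
             * p * (1.0 + 32.0 * p * p))
    return s / denom


def allow_up_switch(target_rate: float, computed_rate: float,
                    server_cap: float) -> bool:
    """Deny an up-switch that exceeds the computed rate or the cap.

    All three rates must be expressed in the same unit.
    """
    return target_rate <= min(computed_rate, server_cap)
```

   A server would evaluate allow_up_switch() before switching to a 
   higher rate stream, so that a misbehaving rate control loop can 
   never push the output above the configured cap.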
   
3.7 Minimize sender security risk

   The key security issue for the sender is denial of service (DOS) 
   in various forms, for which the defenses are simple:
   
   R17: Allow servers to deny a switch
   Utility: Always
   Importance: high
   
   R18: Recommend that servers implement safe limits (maximum 
   switch rate, etc.) 
   Utility: Always
   Importance: high
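
   A maximum switch rate, as suggested by R18, could for instance 
   be enforced per session with a sliding window. The Python 
   sketch below is one possible approach; the SwitchLimiter class 
   and its parameters are hypothetical, and suitable limit values 
   are not specified by this memo.

```python
from collections import deque


class SwitchLimiter:
    """Deny switch requests beyond max_switches per window_s seconds."""

    def __init__(self, max_switches: int, window_s: float) -> None:
        self.max_switches = max_switches
        self.window_s = window_s
        self._granted = deque()  # timestamps of granted switches

    def request(self, now: float) -> bool:
        """Return True if a switch requested at time `now` is granted."""
        # Forget switches that have left the sliding window.
        while self._granted and now - self._granted[0] >= self.window_s:
            self._granted.popleft()
        if len(self._granted) >= self.max_switches:
            return False  # R17: the server denies the switch
        self._granted.append(now)
        return True
```

   The caller supplies the current time (for example from 
   time.monotonic()), which keeps the limiter easy to test.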
   
3.8 Backward compatibility

   Another key requirement is maximal backward compatibility with 
   the relevant IETF standards: RTSP, RTP/RTCP and SDP.

   R19: Backward compatibility
   Utility: Always
   Importance: critical
   
3.9 Forward compatibility

   Another requirement is maximal forward compatibility with the 
   relevant future IETF standards, for example RTSPv2 and SDPNG.

   R20: Forward compatibility
   Utility: Always
   Importance: high
   
3.10 Table of Requirements

   ******************************************************************
   * R#  | Utility   | Importance | Description                     *
   ******************************************************************
   | R1  | Always    |  High      | Prevent packet losses           |
   +----------------------------------------------------------------+
   | R2  | Video     |  Medium    | Switch on RAP                   |
   +----------------------------------------------------------------+
   | R3  |  Client   |  Critical  | Send sync info after switch     |
   |     |   Non     |            |                                 |
   |     |Transparent|            |                                 |
   +----------------------------------------------------------------+
   | R4  | Always    |  Critical  | Switch down to compensate       |
   |     |           |            | bandwidth decrease              |
   +----------------------------------------------------------------+
   | R5  | Always    |  Critical  | Make sure the switch down       |
   |     |           |            | signal is not lost              |
   +----------------------------------------------------------------+
   | R6  | Always    |  Critical  | Switch down as soon as possible |
   +----------------------------------------------------------------+
   | R8  | Always    |  High      | Avoid RTX and adaptive FEC      |
   +----------------------------------------------------------------+
   | R9  | Video     |  Variable  | Avoid back tracking to RAP      |
   +----------------------------------------------------------------+
   | R10 | Audio     |  Low       | Avoid packetization overlap     |
   +----------------------------------------------------------------+
   | R11 | Embedded  |  Low       | Avoid large receiver resource   |
   |     | Clients   |            | requirements                    |
   +----------------------------------------------------------------+
   | R12 | Large VOD |  Critical  | Avoid large sender resource     |
   |     | Servers   |            | requirements                    |
   +----------------------------------------------------------------+
   | R13 |  Client   |  Critical  | No unsignaled or unprepared     |
   |     |   Non     |            | switches involving decoder      |
   |     |Transparent|            | configuration changes           |
   +----------------------------------------------------------------+
   | R14 |  Client   |  Very Low  | No unsignaled switches even if  |
   |     |Transparent|            | decoder configuration does not  |
   |     |           |            | change                          |
   +----------------------------------------------------------------+
   | R15 | Always    |  Critical  | Use proven rate control         |
   |     |           |            | algorithms                      |
   +----------------------------------------------------------------+
   | R16 | Always    |  Critical  | Allow servers to deny an        |
   |     |           |            | up-switch                       |
   +----------------------------------------------------------------+
   | R17 | Always    |  Critical  | Allow servers to deny any       |
   |     |           |            | switch (DOS resistance)         |
   +----------------------------------------------------------------+
   | R18 | Always    |  High      | Servers should implement safe   |
   |     |           |            | limits                          |
   +----------------------------------------------------------------+
   | R19 | Always    |  Critical  | Backward compatible             |
   +----------------------------------------------------------------+
   | R20 | Always    |  High      | Forward compatible              |
   +----------------------------------------------------------------+
 
4. Security considerations

   See the security-specific requirements (R13 to R18) in the 
   section above.
   
5. References
     
   [RTP]           http://www.ietf.org/rfc/RFC1889.txt
      
   [RTSP]          http://www.ietf.org/rfc/RFC2326.txt
   
   [TFRC]          http://www.ietf.org/rfc/RFC3448.txt
   
   [3GPP-alt-attr] 
   http://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_22/Docs/S4-020407.zip
   
   [3GPP-BWS] 
   http://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_25/Docs/S4-030024.zip


6. Author's Address

   Philippe Gentric 
   Philips MP4Net 
   51 rue Carnot 
   92156 Suresnes 
   France 
   e-mail: philippe.gentric@philips.com