                                                             Alan Duric 
                                                    Soren Vang Andersen 
   Internet Draft                                                       
   draft-duric-rtp-ilbc-01.txt                          Global IP Sound 
   July 1st, 2002                               
   Expires: Jan. 1st, 2003                      
 
 
                    RTP Payload Format for iLBC Speech 
 
 
Status of this Memo 
 
   This document is an Internet-Draft and is in full conformance 
   with all provisions of Section 10 of RFC2026. 
 
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups. Note that 
   other groups may also distribute working documents as Internet-
   Drafts. 
    
   Internet-Drafts are draft documents valid for a maximum of six 
   months and may be updated, replaced, or obsoleted by other documents 
   at any time.  It is inappropriate to use Internet-Drafts as 
   reference material or to cite them other than as "work in progress." 
    
   The list of current Internet-Drafts can be accessed at 
   http://www.ietf.org/ietf/1id-abstracts.txt 
   The list of Internet-Draft Shadow Directories can be accessed at 
   http://www.ietf.org/shadow.html. 
    
    
Abstract 
    
   This document describes the RTP payload format for the internet Low 
   Bit Rate Coder (iLBC) Speech [1] developed by Global IP Sound 
   (GIPS). It also includes the details necessary for the use of iLBC 
   with MIME and SDP. 
 
 
Table of Contents 
    
   Status of this Memo................................................1 
   Abstract...........................................................1 
   Table of Contents..................................................1 
   1. INTRODUCTION....................................................2 
   2. BACKGROUND......................................................2 
   3. RTP PAYLOAD FORMAT..............................................2 
   3.1 Multiple iLBC frames in a RTP packet...........................3 
   4. IANA CONSIDERATIONS.............................................4 
   4.1 Storage Mode...................................................4 
   4.2 MIME registration of iLBC......................................4 
   5. MAPPING TO SDP PARAMETERS.......................................5 
   6. SECURITY CONSIDERATIONS.........................................5 
   INTERNET DRAFT RTP Payload format for iLBC Speech         July 2002 
    
   7. REFERENCES......................................................6 
   8. ACKNOWLEDGEMENTS................................................7 
   9. AUTHOR'S ADDRESSES..............................................7 
 
 
1. INTRODUCTION  
 
   This document describes how compressed iLBC speech as produced by 
   the iLBC codec [1] may be formatted for use as an RTP payload type. 
   Methods are provided to packetize the codec data frames into RTP 
   packets. The sender may send one or more codec data frames per 
   packet, depending on the application scenario or based on the 
   transport network condition, bandwidth restriction, delay 
   requirements and packet-loss tolerance. 
    
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in 
   this document are to be interpreted as described in RFC 2119 [2]. 
 
    
2. BACKGROUND 
 
   Global IP Sound (GIPS) has developed and defined a freeware speech 
   compression algorithm for use in IP-based communications [1]. The 
   iLBC codec enables graceful degradation of speech quality when 
   frames are lost, which occurs in connection with lost or delayed IP 
   packets. 
    
   Some of the applications for which this coder is suitable are: real 
   time communications such as telephony and videoconferencing, 
   streaming audio, archival and messaging. 
    
   The iLBC codec [1] compresses each 30 ms block of input speech, 
   sampled at 8000 Hz with 16 bits per sample, into a fixed-size 
   output frame of 416 bits. 
    
   The codec has a bit rate of 13.867 kbit/s and uses a 
   block-independent linear-predictive coding (LPC) algorithm. The 
   codec operates at block lengths of 30 ms and produces 416 bits per 
   block, which can be packetized in 52 bytes. The described algorithm 
   results in a speech coding system with a controlled response to 
   packet losses similar to what is known from pulse code modulation 
   (PCM) with packet loss concealment (PLC), such as the ITU-T G.711 
   standard [10], which operates at a fixed bit rate of 64 kbit/s. At 
   the same time, the described algorithm enables fixed bit rate coding 
   with a quality-versus-bit rate tradeoff close to what is known from 
   code-excited linear prediction (CELP). 
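
   As an illustration only, the figures quoted above can be checked 
   with a few lines of Python. The helper names are hypothetical and 
   are not part of the codec specification; the sketch simply derives 
   the bit rate and the octet count from the block length and the 
   number of bits per block. 

```python
# Frame parameters as stated in this section.
FRAME_BITS = 416          # bits produced per encoded block
FRAME_DURATION_MS = 30    # speech covered by one block, in milliseconds


def bit_rate_bps(frame_bits: int, frame_duration_ms: int) -> float:
    """Return the codec bit rate in bits per second."""
    return frame_bits * 1000 / frame_duration_ms


def frame_octets(frame_bits: int) -> int:
    """Return the number of whole octets needed to carry one frame."""
    return (frame_bits + 7) // 8


# Bit rate in kbit/s, rounded to three decimals.
print(round(bit_rate_bps(FRAME_BITS, FRAME_DURATION_MS) / 1000, 3))  # 13.867
# Octets per frame.
print(frame_octets(FRAME_BITS))  # 52
```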
 
    
    
3. RTP PAYLOAD FORMAT 
    
   The iLBC codec uses 30 ms frames and a sampling rate clock of 8 kHz, 
   so the RTP timestamp MUST be in units of 1/8000 of a second. The RTP 
   Duric, Andersen                                            [Page 2] 
   payload for iLBC has the format shown in the figure below. No 
   additional header specific to this payload format is required. 
    
   This format is intended for situations where each endpoint sends 
   one or more codec data frames per packet. The RTP packet looks as 
   follows: 
    
   0                   1                   2                   3 
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                      RTP Header [4]                           | 
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 
   |                                                               | 
   +                 one or more frames of iLBC [1]                | 
   |                                                               | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
 
   The RTP header of the packetized encoded iLBC speech has the 
   expected values as described in [4]. The M bit should be used as 
   specified in the applicable RTP profile. For example, RFC 1890 [5] 
   specifies that if the sender does not suppress silence (i.e., sends 
   a frame in every 30 millisecond interval), the M bit will always be 
   zero. When more than one codec data frame is present in a single 
   RTP packet, the timestamp is, as always, that of the oldest data 
   frame represented in the RTP packet. 
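
   The timestamp rule can be sketched as follows. This is a minimal 
   illustration, assuming a sender that does not suppress silence and 
   that places a fixed number of frames in every packet; the function 
   name and its parameters are hypothetical. With an 8000 Hz clock, 
   one 30 ms frame spans 240 timestamp units, and the 32-bit RTP 
   timestamp wraps around modulo 2^32. 

```python
CLOCK_RATE_HZ = 8000      # RTP timestamp clock for iLBC
FRAME_DURATION_MS = 30    # duration of one iLBC frame
TICKS_PER_FRAME = CLOCK_RATE_HZ * FRAME_DURATION_MS // 1000  # 240 units


def packet_timestamps(first_timestamp, frames_per_packet, packet_count):
    """Return the RTP timestamps of consecutive packets.

    Each value is the timestamp of the oldest frame in that packet.
    Assumes continuous transmission (no silence suppression) and the
    same number of frames in every packet.
    """
    step = TICKS_PER_FRAME * frames_per_packet
    return [(first_timestamp + i * step) % 2**32
            for i in range(packet_count)]


# Two frames per packet: each packet advances the timestamp by 480.
print(packet_timestamps(0, 2, 3))  # [0, 480, 960]
```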
    
   The assignment of an RTP payload type for this new packet format is 
   outside the scope of this document, and will not be specified here. 
   It is expected that the RTP profile for a particular class of 
   applications will assign a payload type for this encoding, or if 
   that is not done, then a payload type in the dynamic range shall be 
   chosen by the sender. 
    
 
3.1 Multiple iLBC frames in a RTP packet 
    
   More than one iLBC frame may be included in a single RTP packet by a 
   sender. 
    
   Senders are subject to the following additional restrictions: 
    
   o A sender SHOULD NOT include more iLBC frames in a single RTP 
   packet than will fit in the MTU of the RTP transport protocol. 
    
   o Frames MUST NOT be split between RTP packets. 
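
   As a rough sketch of the first restriction, the following Python 
   fragment estimates how many whole frames fit in one packet. It 
   assumes RTP over UDP over IPv4 with no IP options, no RTP header 
   extension and no CSRC entries (20 + 8 + 12 octets of headers); 
   other transports would need different constants. The function name 
   is hypothetical. 

```python
FRAME_OCTETS = 52          # one iLBC frame
RTP_HEADER_OCTETS = 12     # fixed RTP header, assuming no CSRC list
UDP_HEADER_OCTETS = 8
IPV4_HEADER_OCTETS = 20    # assuming no IP options


def max_frames_per_packet(mtu_octets: int) -> int:
    """Return how many whole iLBC frames fit in one packet for this MTU."""
    overhead = IPV4_HEADER_OCTETS + UDP_HEADER_OCTETS + RTP_HEADER_OCTETS
    # Frames must not be split, so only whole frames are counted.
    return max(0, (mtu_octets - overhead) // FRAME_OCTETS)


print(max_frames_per_packet(1500))  # 28
```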
    
   It is RECOMMENDED that the number of frames contained within an RTP 
   packet be consistent with the application. For example, in 
   telephony and other real-time applications where delay is 
   important, the fewer frames per packet, the lower the delay, 
   whereas for bandwidth-constrained links or delay-insensitive 
   streaming or messaging applications, more than one frame, or many 
   frames, per packet would be acceptable. 
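
   The tradeoff can be made concrete with a small calculation. The 
   sketch below is illustrative only: it assumes 40 octets of 
   IPv4/UDP/RTP headers per packet (a value not taken from this 
   document) and reports, for a given number of frames per packet, 
   the speech duration carried by one packet and the resulting bit 
   rate on the wire. 

```python
FRAME_OCTETS = 52
FRAME_DURATION_MS = 30


def packet_tradeoff(frames_per_packet: int, header_octets: int = 40):
    """Return (packet duration in ms, on-the-wire bit rate in bit/s).

    header_octets is an assumed per-packet overhead (IPv4 + UDP + RTP).
    """
    duration_ms = FRAME_DURATION_MS * frames_per_packet
    packet_octets = FRAME_OCTETS * frames_per_packet + header_octets
    bit_rate = packet_octets * 8 * 1000 / duration_ms
    return duration_ms, bit_rate


for n in (1, 2, 4):
    duration, rate = packet_tradeoff(n)
    print(n, duration, round(rate))
```

   More frames per packet lower the header overhead per frame, at the 
   cost of a longer packetization delay. 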
    
   Information describing the number of frames contained in an RTP 
   packet is not transmitted as part of the RTP payload. The way to 
   determine the number of iLBC frames is to count the total number of 
   octets within the RTP payload, and divide the octet count by the 
   number of expected octets per frame (52 per frame). 
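
   A receiver might apply this rule as in the following sketch. The 
   function name is hypothetical, and rejecting payloads whose length 
   is not a multiple of 52 octets is a design choice of this example 
   rather than a requirement stated here; it follows from the rule 
   that frames are never split between packets. 

```python
from typing import List

FRAME_OCTETS = 52  # expected octets per iLBC frame


def split_frames(payload: bytes) -> List[bytes]:
    """Split an RTP payload (header already removed) into iLBC frames."""
    if len(payload) == 0 or len(payload) % FRAME_OCTETS != 0:
        raise ValueError("payload length is not a whole number of frames")
    return [payload[i:i + FRAME_OCTETS]
            for i in range(0, len(payload), FRAME_OCTETS)]


frames = split_frames(bytes(156))
print(len(frames))  # 3
```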
    
    
4. IANA CONSIDERATIONS 
    
   One new MIME sub-type as described in this section is to be 
   registered. 
    
4.1 Storage Mode 
    
   The storage mode is used for storing speech frames (e.g. as a file 
   or e-mail attachment). 
    
   +------------------+ 
   | Header           | 
   +------------------+ 
   | Speech frame 1   | 
   +------------------+ 
   :                  : 
   +------------------+ 
   | Speech frame n   | 
   +------------------+ 
 
   The file begins with a header that consists only of a magic number 
   identifying it as an iLBC file. The magic number for an iLBC file 
   MUST correspond to the ASCII character string "#!iLBC\n", that is, 
   "0x23 0x21 0x69 0x4C 0x42 0x43 0x0A" in hexadecimal form. The 
   speech frames follow the header in consecutive order. 
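
   The storage layout can be sketched in Python as below. This is a 
   minimal example, assuming 52-octet frames as described in Section 
   3.1; the function names are hypothetical, and treating a truncated 
   trailing frame as an error is a choice made for this example. 

```python
import io

MAGIC = b"#!iLBC\n"   # 0x23 0x21 0x69 0x4C 0x42 0x43 0x0A
FRAME_OCTETS = 52


def write_ilbc(stream, frames):
    """Write the magic number followed by the frames in order."""
    stream.write(MAGIC)
    for frame in frames:
        if len(frame) != FRAME_OCTETS:
            raise ValueError("each frame must be 52 octets")
        stream.write(frame)


def read_ilbc(stream):
    """Check the magic number and return the list of stored frames."""
    if stream.read(len(MAGIC)) != MAGIC:
        raise ValueError("not an iLBC file")
    frames = []
    while True:
        frame = stream.read(FRAME_OCTETS)
        if not frame:
            break
        if len(frame) != FRAME_OCTETS:
            raise ValueError("truncated frame at end of file")
        frames.append(frame)
    return frames


# Round trip through an in-memory buffer.
buffer = io.BytesIO()
write_ilbc(buffer, [bytes(FRAME_OCTETS), bytes(FRAME_OCTETS)])
buffer.seek(0)
print(len(read_ilbc(buffer)))  # 2
```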
 
 
4.2 MIME registration of iLBC 
    
   MIME media type name: audio 
    
   MIME subtype: iLBC 
    
   Optional parameters: 
    
   This parameter applies to RTP transfer only. 
    
        maxptime: The maximum amount of media which can be 
                  encapsulated in a payload packet, expressed 
                  as time in milliseconds. The time is 
                  calculated as the sum of the time the media 
                  present in the packet represents. The time SHOULD 
                  be a multiple of the frame size. If this parameter 
                  is not present, the sender MAY encapsulate any 
                  number of speech frames into one RTP packet. 
    
   Encoding considerations: 
                  This type is defined for transfer via both RTP (RFC 
                  1889) and stored-file methods as described in Section 
                  4.1, of RFC XXXX. Audio data is binary data, and must 
                  be encoded for non-binary transport; the Base64 
                  encoding is suitable for Email.  
    
   Security considerations: 
                  See Section 6 of RFC XXXX. 
    
   Public specification: 
                  Please refer to RFC XXXX [1]. 
    
   Additional information: 
                  The following applies to stored-file transfer 
                  methods: 
    
                  Magic number:  
                  ASCII character string "#!iLBC\n"  
                  (or 0x23 0x21 0x69 0x4C 0x42 0x43 0x0A in  
                  hexadecimal) 
    
                  File extensions: lbc, LBC 
                  Macintosh file type code: none 
                  Object identifier or OID: none 
    
   Person & email address to contact for further information: 
                  alan.duric@globalipsound.com 
    
   Intended usage: COMMON. 
                  It is expected that many VoIP applications will use 
                  this type. 
    
   Author/Change controller: 
                  alan.duric@globalipsound.com 
                  IETF Audio/Video transport working group 
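
   As an illustration of the maxptime parameter defined above, a 
   sender could derive its frame limit as sketched below. The function 
   name is hypothetical, and returning None to mean "no limit 
   signalled" is a convention of this example only. 

```python
from typing import Optional

FRAME_DURATION_MS = 30  # one iLBC frame


def max_frames_for_maxptime(maxptime_ms: Optional[int]) -> Optional[int]:
    """Return the largest number of whole frames allowed per packet.

    None means the parameter was absent, so no limit applies.
    """
    if maxptime_ms is None:
        return None
    return maxptime_ms // FRAME_DURATION_MS


print(max_frames_for_maxptime(60))    # 2
print(max_frames_for_maxptime(None))  # None
```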
 
5. MAPPING TO SDP PARAMETERS 
    
   Parameters are mapped to SDP [7] in a standard way. When conveying 
   information by SDP, the encoding name SHALL be "iLBC" (the same as 
   the MIME subtype). An example of the media representation in SDP for 
   describing iLBC might be: 
    
     m=audio 49120 RTP/AVP 97 
     a=rtpmap:97 iLBC/8000 
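
   A receiver of such a description needs to learn which payload type 
   number was bound to iLBC. The following sketch shows one way to do 
   this, assuming the SDP syntax of [7], in which no whitespace 
   surrounds "=" and the rtpmap attribute carries the encoding name 
   followed by "/" and the clock rate. The function name is 
   hypothetical, and the encoding name is compared without regard to 
   case on the assumption that it follows MIME subtype conventions. 

```python
from typing import Optional


def find_ilbc_payload_type(sdp_text: str) -> Optional[int]:
    """Return the payload type mapped to iLBC, or None if there is none."""
    prefix = "a=rtpmap:"
    for line in sdp_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split()
        if len(fields) < 2:
            continue
        # fields[1] looks like "iLBC/8000"; keep only the encoding name.
        encoding = fields[1].split("/")[0]
        if encoding.lower() == "ilbc":
            return int(fields[0])
    return None


sdp = "m=audio 49120 RTP/AVP 97\r\na=rtpmap:97 iLBC/8000\r\n"
print(find_ilbc_payload_type(sdp))  # 97
```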
 
 
6. SECURITY CONSIDERATIONS 
 
   RTP packets using the payload format defined in this specification 
   are subject to the general security considerations discussed in [4] 
   and any appropriate profile (e.g. [5]). 
    
   As this format transports encoded speech, the main security issues 
   include confidentiality and authentication of the speech itself. The 
   payload format itself does not have any built-in security 
   mechanisms. Confidentiality of the media streams is achieved by 
   encryption; external mechanisms, such as SRTP [9], MAY therefore be 
   used for that purpose. The data compression used with this payload 
   format is applied end-to-end; hence encryption may be performed 
   after compression with no conflict between the two operations. 
    
   A potential denial-of-service threat exists for data encoding using 
   compression techniques that have non-uniform receiver-end 
   computational load. The attacker can inject pathological datagrams 
   into the stream which are complex to decode and cause the receiver 
   to become overloaded. However, the encodings covered in this 
   document do not exhibit any significant non-uniformity. 
    
 
7. REFERENCES 
    
   [1] Andersen, et al., "Internet Low Bit Rate Codec (iLBC)", 
      draft-andersen-ilbc-01.txt, July 2002. 
    
   [2] S. Bradner, "Key words for use in RFCs to Indicate Requirement 
      Levels", BCP 14, RFC 2119, March 1997. 
    
   [3] S. Bradner, "The Internet Standards Process -- Revision 3", BCP 
      9, RFC 2026, October 1996. 
    
   [4] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: 
      A Transport Protocol for Real-Time Applications", IETF RFC 1889, 
      January 1996. 
    
   [5] H. Schulzrinne, "RTP Profile for Audio and Video Conferences 
      with Minimal Control", IETF RFC 1890, January 1996. 
    
   [6] M. Handley and C. Perkins, "Guidelines for Writers of RTP 
      Payload Formats", BCP 36, RFC 2736, December 1999. 
    
   [7] M. Handley and V. Jacobson, "SDP: Session Description Protocol", 
      IETF RFC 2327, April 1998. 
    
   [8] N. Freed and N. Borenstein, "Multipurpose Internet Mail 
      Extensions (MIME) Part One: Format of Internet Message Bodies", 
      RFC 2045, November 1996. 
    
   [9] Baugher, et al., "The Secure Real Time Transport Protocol", IETF 
      Draft (Work in Progress), November 2001. 
 
   [10] ITU-T Recommendation G.711, available online from the ITU 
      bookstore at http://www.itu.int. 
    
   [11] J. Sjoberg, M. Westerlund, A. Lakaniemi, Q. Xie, "RTP payload 
      format and file storage format for the Adaptive Multi-Rate (AMR) 
      and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs", 
      draft-ietf-avt-rtp-amr-13.txt, February 2002. 
 
8. ACKNOWLEDGEMENTS 
 
   The authors wish to thank Henry Sinnreich and Patrik Faltstrom for 
   their great support of the iLBC initiative and for their valuable 
   feedback and comments. 
    
 
9. AUTHOR'S ADDRESSES 
    
   Alan Duric 
   Global IP Sound AB 
   Rosenlundsgatan 54  
   Stockholm, S-11863 
   Sweden           
   Phone:  +46 8 54553040 
   Email:  alan.duric@globalipsound.com 
    
   Soren Vang Andersen 
   Global IP Sound AB 
   Rosenlundsgatan 54  
   Stockholm, S-11863 
   Sweden           
   Phone:  +46 8 54553040 
   Email:  soren.andersen@globalipsound.com 
    