Internet DRAFT - draft-heath-rohc-sip-v44

draft-heath-rohc-sip-v44







Internet Engineering Task Force                              ROHC/SIP WG
Internet Draft                                                Jeff Heath
draft-heath-rohc-sip-v44-00.txt                   Hughes Network Systems
Sept 28, 2001
Expires: March 2002


       SIP Compression using ITU-T V.44 with Pre-loaded Dictionary	 

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and it working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Status of this Memo

   This memo is an Internet-Draft that provides information for the
   Internet community.  Distribution of this memo is unlimited.

   The file name for this Internet-Draft is 
   draft-heath-rohc-sip-v44-00.txt.  Comments are invited and should be 
   addressed to the author whose contact information is in Section 8.

Abstract

   This document describes a compression method based on the data
   compression algorithm described in ITU-T Recommendation V.44 [V44].
   Recommendation V.44 is a modem standard but Annex B of the 
   recommendation describes the implementation of V.44 in packet
   networks.  This document defines the application of V.44 to packets
   or messages where some or all of the contents of the data is known
   ahead of time.  For such applications the V.44 dictionary is 
   partially pre-loaded after intialization with words and/or phrases 
   that are expected to appear in the data.  This Internet Draft
   describes the mechanism of the dictionary pre-load and its 
   advantages for applications such as SIP/SDP.
 
   The V.44 algorithm is unique among data compression algorithms in


J. Heath                                                        [Page 1]

Internet Draft           SIP Compression using V.44            Sept 2001

  
   that it combines excellent compression ratios with very fast 
   execution times and low memory requirements.  It learns and adapts
   to the input data very quickly which is one reason it provides
   superior compression ratios on relatively small blocks of data.  
   The partial pre-load of the dictionary with expected words and 
   phrases will allow even better compression ratios since it will 
   allow V.44 to encode each pre-loaded word or phrase with between
   7 to 10 bits each time it is encountered in the data. 
   
   V.44 is lossless and is based upon the LZJH data compression
   algorithm.  As described in Annex B of [V44], LZJH operates in
   packet networks using one of two methods: Packet Method where each
   individual IP packet is compressed/decompressed separately, or 
   Multi-Packet Method where several IP packets are
   compressed/decompressed as a continuation using a reliable transport.
      
   Thoughout the remainder of this document the terms V.44 and LZJH
   are synonomous.

Table of Contents

   1. Introduction...................................................2
      1.1 Advantages of V.44 Data Compression........................3
      1.2 Background of LZJH Compression.............................4
   2. LZJH Design and Operation for SIP..............................4
      2.1 LZJH Modifications for Pre-loaded Dictionaries.............5
      2.2 Operation and Performance..................................6
      2.3 Data Integrity.............................................7
   3. LZJH Pre-load Test Results.....................................7
   3.1 Test 1........................................................7
   3.2 Test 1A.......................................................7
   3.3 Test 2........................................................7
   3.4 Test 3........................................................8
   3.5 Test Conclusions..............................................8
   4. Algorithm Comparison...........................................9
   4.1 Comparison Example 1..........................................9
   4.2 Comparison Example 2..........................................9
   4.3 Comparison Example 3..........................................9
   4.4 Comparison Conclusions........................................10
   5. Security Considerations........................................10
   6. Intellectual Property Right Considerations.....................10
   7. References.....................................................10
   8. Authors' Address...............................................11
   9. Full Copyright Statement.......................................11

1. Introduction

   Cellular and wireless networks are planning on implementing more
   and more Internet, IP based, services in the future.  One problem 
   with the set of IP protocols invented with the Internet in mind is
   the high overhead of additional headers and verbose data fields.
   Since bandwidth is at a premium in cellular, satellite, and other    
   

J. Heath                                                        [Page 2]

Internet Draft           SIP Compression using V.44            Sept 2001

      
   wireless networks the future implementation of these protocols
   poses a bandwidth efficiency problem.

   For example, application signaling protocols such as SIP [SIP], 
   SDP [SDP] and RTSP [RTSP] will typically be used to setup and
   control applications in a mobile Internet based on a cellular
   network. However, the generous size of the protocol messages 
   combined with their request/response nature create delays and 
   waste bandwidth in a cellular network.

   In order to reduce the delays incurred and bandwidth used by
   these protocols it is necessary to use compression techniques in
   cellular, satellite, and other wireless networks.  Some of these
   protocols contain messages that contain data that is known in 
   advance and is highly redundant.  Since the same protocols also
   have messages containing ASCII free form text, a general purpose
   data compression algorithm which can take advantage of words or
   phrases known to appear in the text is the best solution.
   
1.1 Advantages of V.44 Data Compression

   Below are some of the reasons ITU-T V.44 data compression should be 
   considered for this environment:
   
   A. The primary reason for using compression for SIP/SDP is to
      reduce the amount of data going over the RAN.  LZJH gets 
      the best compression in this environment and therefore will
      the most effective in data reduction.
      
   B. LZJH is a general purpose algorithm.  If operating in a network
      entity for one application, such as SIP/SDP, it can be used for
      other applications such as the compression of IP data packets 
      [IPCOMP] to improve the performance of the Radio Access Network.

   C. LZJH is the basis of the latest ITU-T data compression standard,
      ITU-T V.44 and will become widely used in networks.  New
      developments, such as SIP/SDP in UMTS, will benefit from using 
      the latest standarized compression technology.
      
   D. As defined in Annex B of V.44, the algorithm operates in packet
      networks on one packet at a time or on several packets using a
      reliable transport.  LZJH source code handles both methods on a
      per session basis at run time.  Thus, each network entity can
      indepedently determine, based on its resources, which method to
      use at any time.
      
   E. LZJH is superior to the other algorithms being considered for
      the SIP/SDP type application.  In this application, LZJH will get
      about twice the compression as LZW using the same MIPs and memory.  
      LZJH will get 20% to 35% better compression than LZSS while using 
      about one fifth the MIPs and similar memory.  Refer to section 4
      for algorithm comparisons.
      

J. Heath                                                        [Page 3]

Internet Draft           SIP Compression using V.44            Sept 2001
   
   
   F. LZJH was designed for packetized data.  It can get good
      compression on packets of 200 bytes and decent compression on 
      packets as small as 50 bytes.

1.2 Background of LZJH Compression

   LZJH is similar to the algorithm described in [LZ78] although is has
   aspects which are similar to the algorithm described in [LZ77].  As
   such, it provides the execution speed and low memory requirements
   of [LZ78] with compression ratios that are better than [LZ77].
   Originally developed for the satellite industry to compress IP
   datagrams independantly, it is ideal for the SIP/SDP application.
   The LZJH algorithm was modified to compress a continuous stream of
   data for a modem environment and this modified version is the basis
   for Recommendation V.44.  LZJH is an adaptive, general purpose,
   lossless data compression algorithm that provides excellent
   performance across a wide variety of data types, particularly ASCII
   text with high redundancy.  

   A typical [LZ78] compression algorithm, such as LZW, is not
   suitable for application in a packet network since it takes too long
   to build up its dictionary, resulting in poor compression ratios on
   IP datagrams that are compressed separately.  

   A typical [LZ77] compression algorithm, such as LZSS, suffers in the
   packet network application due to poor execution times.  
   
   LZJH not only has superior compression ratios when encoding or 
   decoding packetized data, but it uses little memory and provides 
   very fast execution times.  
   
   The details of LZJH compression can be found in [V44].

2. LZJH Design and Operation for SIP

   As stated earlier, LZJH operates using one of two methods in a
   packet network.  Packet Method where each IP packet, or protocol
   message, is compressed separately and the dictionaries are 
   re-initialized between messages.  Multi-Packet Method where 
   several messages are compressed and data from previous messages
   remains in the dictionary for use in compressing the current message.
   
   Multi-Packet Method requires that the messages, or packets, be
   delivered in order without any packet loss.  In other words, it
   requires a reliable transport.  The definition of that transport
   in the SIP/SDP or other similar application is beyond the scope of
   this document.
    
   Multi-Packet Method requires a separate history buffer to hold the
   data from the messages that were previouly processed.  This 
   method also requires that each separate thread or session 
   supported by a network entitiy, such as the P-CSCF, have a separate
   
   
J. Heath                                                        [Page 4]

Internet Draft           SIP Compression using V.44            Sept 2001 
   
   
   context for compression.  This context includes information for the 
   reliable link and compression dictionaries, including history buffer.
   
   In contrast, Packet Method does not require a reliable transport,
   does not require in-order delivery of messages, and does not
   require a separate history.  Packet Method is implemented in
   a serving entity, supporting hundreds of simultaneous sessions, 
   with a single instance of encoder and decoder dictionaries.  For 
   this application the total memory requirement for both encoder
   and decoder is a trivial 16K.  About 11K for the encoder, and 
   about 5K for the decoder.
   
   Multi-Packet Method provides better compression ratios than 
   Packet Method, but at the cost of considerable memory for a serving 
   network entity.  In addition, the cost of complexity and bandwidth to
   implement the underlying reliable transport may offset the better 
   compression ratios.  

2.1 LZJH Modifications for Pre-loaded Dictionaries 
   
   The modifications required to pre-load the LZJH dictionaries are
   minimal and will not affect the basic algorithm.  Thus, the same LZJH
   object module operating in a network entity will be able to operate
   normally for general purpose compression or operate using pre-loaded
   dictionaries.
   
   Currently the LZJH algorithm consists of several functions which are
   called by a user supplied control function which controls the
   operation of the algorithm.  They are:
   - initialize Packet Method
   - initialize Multi-Packet Method
   - re-initialize compressor (encoder) dictionary
   - re-initialize decompressor (decoder) dictionary
   - compress packet
   - decompress packet
     
   New functions to support the pre-load of the dictionaries have been  
   added as follows:
   - pre-load compressor dictionary
   - pre-load decompressor dictionary
   - re-initialize compressor pre-loaded dictionary
   - re-initialize decompressor pre-loaded dictionary
   
   The dictionary pre-load procedures will take as an input into the
   procedure two arrays that define the words or phrases (strings) to be
   pre-loaded.  Note that the encoder and its peer decoder must be 
   pre-loaded with exactly the same arrays for compression/decompression
   to work correctly.  The content, origin, and distribution of the 
   pre-load arrays is beyond the scope of this document and there could 
   conceivably be different pre-load arrays for each message type.
   
   The pre-load arrays consists of an array of the strings to be 


J. Heath                                                        [Page 5]

Internet Draft           SIP Compression using V.44            Sept 2001 


   pre-loaded and an array of string lengths corresponding to the
   strings as follows:

   char pl_data [] = "INVITE"
                     "SIP"
                     "/2.0"
                     "sip:"
                     "Via: "
                     "SIP/2.0/UDP"
                     "From: "
                     "To: "
                     "Call-ID: "
                     "CSeq: ";
   
   byte pl_len [] = {6,3,4,4,5,11,6,4,9,6,0};

   The above defines 10 strings to be pre-loaded.  Note that a string
   length of zero indicates the end of the strings.  Also note this 
   mechanism is subject to change as better ideas come along.
   
   Once the encoder and decoder dictionaries are pre-loaded with a set 
   of strings, the re-initialize pre-loaded dictionary procedures will
   re-initialize only the dynamic portion of the dictionaries and
   histories, the static, pre-loaded, strings will be maintained.  Thus,
   the LZJH encoder does not have to keep compressing the same 
   pre-loaded data on each message which saves processing time.  To 
   remove the pre-loaded information, the control function uses either 
   the re-initialize compressor (or decompressor) function to clear the
   information or uses the pre-load compressor (or decompressor)
   dictionary functions to pre-load different information.
   
   It may be desireable to place a byte into the message header or as
   the first byte of compressed message data to indicate which pre-load 
   table was used by the compressor.  Alternatively, the type of 
   message, INVITE, etc., can explicitly indicate which pre-load
   table was used.
   
2.2 Operation and Performance

   During the pre-load of either the compressor (encoder) or 
   decompressor (decoder) dictionary, the information required to define 
   the strings is loaded into the dictionary structures and the strings 
   themselves are cancatenated into one array of data that occupies the 
   first N locations of the packet buffer to be processed.  Thus, the 
   data to be compressed is copied into the first location following the
   "static" data in the packet buffer prior to compression.
     
   Once pre-loaded, the dictionary can be re-initialized while
   maintaining the pre-loaded portion of the dictionary and static data,  
   a major savings in execution speed.  With LZSS, it will be necessary
   to compress the static data each time to rebuild the binary tree.     



J. Heath                                                        [Page 6]

Internet Draft           SIP Compression using V.44            Sept 2001 


2.3 Data Integrity

   If a CRC or other such mechanism is not used by the underlying link
   layer to insure the integrity of the messages transferred over the
   RAN, then it may be necessary to prepend a 16 bit CRC or LRC to the
   compressed data to insure its integrity prior to decompression.

3. LZJH Pre-load Test Results
  
   The LZJH algorithm was modified as described in section 2 and tests
   were run on the SIP messages described in section 3.1.1 of
   <draft-ietf-sip-call-flows-05.txt> [FLOWS].
   
3.1 Test 1

   The first INVITE message and ACK message sent from User A to User
   B in section 3.1.1 of [FLOWS] are compressed separately with the LZJH
   dictionary pre-loaded with expected strings. 

   message    uncompressed bytes    compressed bytes    ratio     
   ----------------------------------------------------------
   INVITE            423                     152         2.8
   ACK               212                      88         2.4
      
   Note in section 4.1 the LZJH algorithm without a pre-load reduced
   the same INVITE message to 266 bytes.  The pre-load improved the 
   compression ratio by 75% on the INVITE.
   
   This is the compression ratio on an INVITE and ACK messages 
   compressed separately, using LZJH Packet Method, where a reliable 
   transport or any other mechanism to maintain data between messages
   is not required.  Both used the same set of pre-loaded strings.           

3.2 Test 1A

   The same INVITE message and ACK message from Test 1 are compressed
   using Multi-Packet Method where the dictionary is maintained 
   between the two messages. 

   message    uncompressed bytes    compressed bytes    ratio     
   ----------------------------------------------------------
   INVITE            423                     152         2.8
   ACK               212                      29         7.3
      
3.3 Test 2

   The first INVITE message in section 3.1.1 of [FLOWS] is repeated 5 
   times with certain fields changed for each repitition, such as To URL 
   and display name, rtpmap, etc.  This is to provide a ballpark 
   comparison with the "friendly" test in section 5 of UDPCOMP SIP 
   compression [UDP]. 



J. Heath                                                        [Page 7]

Internet Draft           SIP Compression using V.44            Sept 2001 


   message    uncompressed bytes    compressed bytes    ratio     
   ----------------------------------------------------------
   INVITE 1            423                     152       2.8
   INVITE 2            418                      48       8.7  
   INVITE 3            426                      43       9.9
   INVITE 4            425                      37      11.5
   INVITE 5            429                      37      11.6
   
   As can be seen, compression ratio improvements in the first 2 INVITES 
   are dramatic compared to UDPCOMP and even the last 3 INVITES are 
   better even though the INVITE messages are larger.  The total 
   compression ratio of all 5 INVITES is 6.7 compared to the total
   ratio of 2.7 (2009 / 737) for UDPCOMP, more than double.
   
   This test shows the compression ratio on a series of 5 INVITE 
   messages using LZJH Multi-Packet Method over a reliable transport.

3.4 Test 3

   Each of the 5 INVITE messages compressed in Test 2, above, is 
   compressed separately using LZJH Packet Method without a reliable
   transport or any other mechanism to maintain dictionaries between
   each message.

   message    uncompressed bytes    compressed bytes    ratio     
   ----------------------------------------------------------
   INVITE 1            423                     152       2.8
   INVITE 2            418                     155       2.7  
   INVITE 3            426                     158       2.7
   INVITE 4            425                     158       2.7
   INVITE 5            429                     159       2.7      
   
   The average of the 5 INVITES is a compression ratio of 2.7 which is
   exactly the same total compression ratio achieved by UDPCOMP on 5
   INVITES.  Each compressed message stands on its own and will obtain
   the same compression regardless if previous messages are received.
   
3.5 Test Conclusions

   In an enviroment where the codebook or dictionary of SIP messages
   is maintained between messages via a reliable transport or other
   mechanism, LZJH with a pre-loaded dictionary can obtain compression
   ratios that are more than double those of UDPCOMP (refer to Test 2).  
   
   In environment where each message stands on its own, LZJH with a 
   pre-loaded dictionary can reduce a SIP message by some 60% where
   UDPCOMP actually expands the first INVITE message in [UDP] section 5.
   
   SIP signalling requires the exchange of just a few messages, if it 
   takes a few messages for a complicated mechanism to obtain good 
   compression ratios over the total of all messages then any advantage 
   is negated.


J. Heath                                                        [Page 8]

Internet Draft           SIP Compression using V.44            Sept 2001 


   LZJH can obtain very good compression ratios using Multi-Packet
   Method over a reliable transport.  However, since LZJH can reduce SIP
   messages by more than half without any mechanism where the data in
   one message is used to compress following messages, the complication
   of a reliable transport or other such mechanisms may offset the extra
   savings.  Since LZJH supports both methods, it is up to network
   designers to determine which method to use on each call flow.
   
4. Algorithm Comparison

   Compression ratio comparisons between three algorithms, LZJH, LZSS,
   and LZW, are provided in this section.
   
4.1 Comparison Example 1

   The first INVITE message in section 3.1.1 of [FLOWS] is compressed
   by each algorithm without the pre-load of any algorithm.  This shows
   the raw ability of the algorithm to compress a single SIP message.
   
   algorithm    uncompressed bytes    compressed bytes     delta
   -------------------------------------------------------------
   LZJH            423                     266               -
   LZSS            423                     329              23.7%
   LZW             423                     329              23.7%
   
   In the above example, LZJH obtains a compression ratio of 1.55 on a 
   single SIP message without a pre-load of the dictionary.
   
4.2 Comparison Example 2

   The first INVITE message in section 3.1.1 of [FLOWS] is duplicated 
   into a single message and then compressed by each algorithm without 
   the pre-load of any algorithm.  This compares the ability of the 
   algorithm to compress a SIP message when the same SIP message is 
   already in the dictionary or history memory.
   
   algorithm    uncompressed bytes    compressed bytes     delta
   -------------------------------------------------------------
   LZJH            846                     273               -
   LZSS            846                     357              30.8%
   LZW             846                     536              96.3%
   
   This example shows that LZJH uses just 7 additional bytes to encode
   the entire duplicated 2nd SIP message (273 - 266 = 7).  LZSS uses
   28 additional bytes, and LZW 207 additional bytes.

4.3 Comparison Example 3

   An HTML file with significant redundancy is compressed.  This is
   to test both the compression ratio and encoder speed of each
   algorithm.  The SIP messages are too small for accurate encoder speed
   measurements.  The file is compressed as separate packets of 1500


J. Heath                                                        [Page 9]

Internet Draft           SIP Compression using V.44            Sept 2001 

   
   bytes each with compression dictionaries re-initialized between each
   packet, the compression ratio is the average for all packets.  The
   encoder time is total for all packets.
   
   Note that at this writing, the author does not have a version of
   LZSS using binary trees for speed measurement, thus, a variant of
   LZSS is used for the speed measurements, LZS which uses hash tables
   and is faster than LZSS.  Compression ratio measurements are LZSS.
   
   algorithm    encoder    uncompressed    compressed      delta
                  time        bytes          bytes
   -------------------------------------------------------------
   LZJH            2          97030          35081           -              
   LZS/LZSS        13         97030          42562         21.3%     
   LZW             3          97030          62586         78.4%
   
   The above example shows that both LZJH and LZW encoders are
   several times faster than that of LZSS.  Decoders are not 
   measured here, however, previous testing has shown that the
   LZJH decoder is about 3 times faster than the LZJH encoder.  The
   LZSS decoder should be as fast as the LZJH decoder and the LZW
   decoder should be slower by about a facter of 2.
   
4.4 Comparison Conclusions
   
   Among the 3 algorithms, LZJH is the obvious choice for both speed
   and compression performance.                              
   
5. Security Considerations

   There are no specific security considerations regarding the 
   proposal in this Internet Draft.  However, the application of
   data compression, as described herein, should always be done at a
   layer above any encryption, such as IPSec. 
   
6. Intellectual Property Right Considerations
   
   The LZJH data compression algorithm is referenced in one or more
   patents or patent applications.  To the extent that some of the 
   concepts in this document are adopted in a specification, Hughes 
   Network Systems agrees to license patents technically necessary to 
   implement the specification on fair, reasonable, and 
   nondiscriminatory terms.  Since HNS is a potential manufacturer of 
   3rd generation satellite terminals, this may be based on 
   reciprocity where possible.

7. References

   [IPCOMP]   Shacham, A., "IP Payload Compression Protocol (IPComp)",
              RFC 2393, December 1998.

   [LZ77]     Lempel, A., and Ziv, J., "A Universal Algorithm for


J. Heath                                                       [Page 10]

Internet Draft           SIP Compression using V.44            Sept 2001 


              Sequential Data Compression", IEEE Transactions On
              Information Theory, Vol.  IT-23, No. 3, May 1977.

   [LZ78]     Lempel, A., and Ziv, J., "Compression of Individual
              Sequences via Variable Rate Coding", IEEE Transactions
              On Information Theory, Vol.  IT-24, No. 5, Sep 1978.

   [V44]      ITU Telecommunication Standardization Sector (ITU-T)
              Recommendation V.44 "Data Compression Procedures",
              November 2000.
	      
   [FLOWS]    A. Johnston, S. Donovan, C. Cunningham, D. Willis,
              J. Rosenberg, K. Summers, H. Schulzrinne, "SIP Call Flow
	      Examples" Internet Draft, Internet Engineering Task Force,
	      June 2001.

   [UDP]      J. Rosenberg, "Compression of SIP" Internet Draft, 
              Internet Engineering Task Force, July, 2001. 

8. Authors' Address

   Jeff Heath
   Hughes Network Systems
   10450 Pacific Center Ct.
   San Diego, CA  92121

   voice: 858-452-4826
   fax: 858-597-8979
   e-mail: jheath@hns.com

9.  Full Copyright Statement

   Copyright (C) The Internet Society (1998).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.
   
   This document and the information contained herein is provided on an   


J. Heath                                                       [Page 11]

Internet Draft           SIP Compression using V.44            Sept 2001 


   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

















































J. Heath                                                       [Page 12]