                                                                         
    Internet Draft                                            S.Fujimoto 
    Document: draft-fujimoto-sipping-header-lang-00.    Fujitsu Labs LTD 
    Expires: March 2003                                   September 2002 
     
     
                 SIP Header Language Information Extension 
     
     
 Status of this Memo 
     
    This document is an Internet-Draft and is in full conformance with 
    all provisions of Section 10 of RFC2026 [1].  
     
    Internet-Drafts are working documents of the Internet Engineering 
    Task Force (IETF), its areas, and its working groups.  Note that      
    other groups may also distribute working documents as Internet-
    Drafts. 
     
    Internet-Drafts are draft documents valid for a maximum of six 
    months and may be updated, replaced, or obsoleted by other 
    documents at any time.  It is inappropriate to use Internet-Drafts 
    as reference material or to cite them other than as "work in 
    progress." 
     
    The list of current Internet-Drafts can be accessed at 
         http://www.ietf.org/ietf/1id-abstracts.txt 
    The list of Internet-Draft Shadow Directories can be accessed at 
         http://www.ietf.org/shadow.html. 
     
 Abstract 
     
    This draft explains a problem that arises when the UTF-8 character 
    set is used to represent text in languages other than English. It 
    also describes the requirements for solving that problem and 
    proposes an extension to the SIP message syntax specification. 
     
 Conventions used in this document 
     
    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
    "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in 
    this document are to be interpreted as described in RFC-2119 [2]. 
     
 Table of Contents 
     
    1. Introduction..................................................2 
    2. Scope.........................................................2 
    3. Requirements..................................................3 
       3.1 Backward Compatibility....................................3 

  
  
 Fujimoto                 Expires - March 2003                 [Page 1] 
                 draft-fujimoto-sipping-header-lang-00  September 2002 
  
  
       3.2 Data Size.................................................3 
       3.3 Multiple Languages on one field value.....................3 
       3.4 Extensibility.............................................3 
       3.5 Country Information.......................................4 
    4. SIP Header I18N...............................................4 
       4.1 Language Tag..............................................4 
       4.2 SIP Header i18n Extension.................................4 
    5. Formal Syntax.................................................5 
    6. Examples......................................................5 
       6.1 Display Name..............................................5 
       6.2 Generic Header Value......................................5 
    7. Security Considerations.......................................5 
    8. References....................................................6 
    9. Author's Addresses............................................6 
     
     
 1. Introduction 
     
    SIP [3] allows users to use any UTF-8 [4] characters in SIP message 
    header fields, e.g. the display-name in the From: field. However, 
    there is no standard way to add language information that gives 
    UAs a hint for choosing an appropriate font set when presenting 
    the text on the user interface. The problem arises because the 
    same CJK (Chinese, Japanese, Korean) Kanji characters are used to 
    express different letters in different languages. Additionally, 
    people who use alphabetic letters pronounce the same word 
    differently in their own languages. We believe that adding 
    language information to UTF-8 text will solve these kinds of 
    problems. 
     
    UTF-8 can represent any UCS-2/4 character, including the recently 
    proposed "language tag" characters, which specify the language of 
    the character string that follows them. However, deployed SIP 
    implementations may not have enough knowledge to handle them 
    correctly. 
     
    RFC2231 [5] defines a standard way to add language information to 
    MIME Part Three [6] headers. However, that specification does not 
    provide any encoding type that allows UTF-8 characters to be 
    represented as is, since mail delivery infrastructures are still 
    "8-bit unsafe". 
     
    This Internet-Draft describes the requirements for adding language 
    information to SIP message headers, and proposes a method of adding 
    language information to UTF-8 header field values. 
     
 2. Scope 
     
    General requirements for header field values with non-UTF-8 
    characters are out of scope of this draft. 
     
  
  
  
  
     
 3. Requirements 
     
 3.1 Backward Compatibility 
     
    This requirement states: 
    The extensions for SIP header internationalization (i18n-ext) MUST 
    be fully conforming to the current SIP specifications. 
     
    Even if an implementation is not compliant with i18n-ext, these 
    headers MUST be recognized as valid SIP headers. 
     
    This means that every SIP implementation can relay SIP messages 
    that carry i18n-ext headers. 
      
 3.2 Data Size 
     
    This requirement states: 
     
    The i18n-ext SHOULD NOT increase the size of header data 
    drastically. 
     
    If i18n-ext is achieved by adding some information to the original 
    header value, the result MUST work correctly with line folding. 
     
 3.3 Multiple Languages on one field value 
     
    This requirement states: 
     
    The i18n-ext SHOULD be applicable to a header value string which 
    consists of two or more languages, e.g. a combination of Chinese 
    and Japanese. 
     
 3.4 Extensibility 
     
    This requirement states: 
     
    It MUST be easy to add new languages for header values to the 
    i18n-ext. 
     
    It SHOULD be easy to apply the i18n-ext to new headers that are 
    added in the future. 
     
    The i18n-ext MUST provide a means to determine whether an 
    implementation can handle a specific part of an i18n-ext header 
    value. 
     
     

  
  
  
  
 3.5 Country Information 
     
    This requirement states: 
     
    The i18n-ext MUST provide enough information for culture-dependent 
    processing. 
     
    This means that even when the same characters and wording are 
    used, it is possible to switch between, for example, US and UK 
    pronunciation when reading the text. 
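     
    As an illustration of this requirement, the following minimal 
    Python sketch shows how a UA might pick a pronunciation profile 
    from a language tag. It assumes that a two-letter second subtag is 
    a country code, as in RFC3066 [7]. The function names and the 
    voice identifiers in VOICES are hypothetical and are not defined 
    by this draft. 

```python
def split_language_tag(tag):
    """Return (primary language, country or None) for an RFC3066-style tag."""
    subtags = tag.split("-")
    primary = subtags[0].lower()
    country = None
    # Assumption: a two-letter second subtag is a country code.
    if len(subtags) > 1 and len(subtags[1]) == 2 and subtags[1].isalpha():
        country = subtags[1].upper()
    return primary, country


# Hypothetical voice table; a real UA would supply its own identifiers.
VOICES = {
    ("en", "US"): "voice-en-us",
    ("en", "GB"): "voice-en-gb",
    ("en", None): "voice-en-default",
}


def pick_voice(tag):
    """Prefer a country-specific voice, then fall back to the language default."""
    primary, country = split_language_tag(tag)
    return VOICES.get((primary, country)) or VOICES.get((primary, None))
```

    With this table, "en-GB" and "en-US" select different voices for 
    the same English text, while an unlisted country such as "en-AU" 
    falls back to the language default. 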
     
 4. SIP Header I18N 
  
 4.1 Language Tag 
     
    This specification uses 'Language-Tag' to identify the language of 
    a UTF-8 string so that the string can be handled appropriately. The 
    'Language-Tag' syntax is imported from RFC3066 "Tags for the 
    identification of language"[7]. 
     
 4.2 SIP Header i18n Extension 
     
    This document defines a new syntax component 'i18n-text', which can 
    be used in place of 'qdtext' or 'UTF8-TRIM' as defined in RFC3261 
    [3]. 
     
    'qdtext' is used for the display-name in the From and To headers, 
    and 'UTF8-TRIM' is used for other UTF-8 enabled headers. 
     
    This specification reserves the characters "=", "?", and """ for 
    special purposes, and those characters MUST be escaped within 
    'i18n-text'. 
     
    All compliant implementations MUST replace the "?" character with 
    "=3F" wherever 'i18n-text' is used. 
     
    'i18n-text' starts with the preamble sequence "=?", followed by a 
    language tag, the delimiter "?", the escaped UTF-8 string, and 
    finally the epilogue sequence "?=". 
     
    Escaped UTF-8 strings are represented as follows: 
     
    1) Any octet of a UTF-8 printable character MAY be represented as 
    an "=" character followed by a 2-digit hexadecimal value. 
     
    2) The "=", "?", and """ characters MUST be represented in the 
    hexadecimal form defined in 1). 
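     
    The two escaping rules above can be sketched in Python as follows. 
    This is a minimal illustration, not a normative implementation. 
    The function names are hypothetical. The sketch escapes only the 
    three reserved characters, emits uppercase hexadecimal digits, and 
    does not address whitespace or line folding. 

```python
RESERVED = {0x22, 0x3D, 0x3F}   # double quote, "=", "?"
HEX = b"0123456789ABCDEF"       # assumption: uppercase hex digits only


def escape_i18n(text):
    """Rule 2: replace each reserved character with "=" plus two hex digits."""
    out = []
    for ch in text:
        if ord(ch) in RESERVED:
            out.append("=%02X" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def unescape_i18n(escaped):
    """Rule 1: turn every "=XX" sequence back into one octet, then decode UTF-8."""
    data = escaped.encode("utf-8")
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x3D:  # "=" always introduces an escaped octet
            pair = data[i + 1:i + 3]
            if len(pair) != 2 or pair[0] not in HEX or pair[1] not in HEX:
                raise ValueError("malformed escape at octet %d" % i)
            out.append(int(pair, 16))
            i += 3
        else:
            out.append(data[i])
            i += 1
    return out.decode("utf-8")
```

    Because rule 1 works on octets, a multi-octet UTF-8 character may 
    appear either literally or as a run of escapes, and the decoder 
    above accepts both forms. 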
      
     

  
  
  
  
 5. Formal Syntax 
     
    The following syntax specification uses the augmented Backus-Naur 
    Form (BNF) as described in RFC-2234 [8]. 
     
    'i18n-text' MAY be used where 'qdtext' or 'UTF8-TRIM' is used in 
    RFC-3261. However, using both 'qdtext' and 'i18n-text' within 
    'quoted-string' is not allowed. 
     
    i18n-text = "=?" Language-Tag "?" escaped-utf8-string "?=" 
     
    'Language-Tag' is imported from RFC3066 "Tags for the 
    identification of language"[7] 
     
    Language-Tag = Primary-subtag *( "-" Subtag ) 
    Primary-subtag = 1*8ALPHA 
    Subtag = 1*8(ALPHA / DIGIT) 
     
    escaped-utf8-char = "=" 2*2HEXDIGIT  
    HEXDIGIT = %x30-39 / %x41-46 
     
    escaped-utf8-string = 1*(UTF8-NONASCII 
                             / %x21 / %x23-3C / %x3E / %x40-7E 
                             / escaped-utf8-char ) 
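     
    As a rough illustration, the grammar above can be transcribed into 
    a regular expression. The sketch below is in Python and the name 
    parse_i18n_text is hypothetical. It assumes that the literal 
    character ranges are meant to cover every printable US-ASCII 
    character except the three reserved ones (%x22, %x3D, %x3F), and 
    that hexadecimal digits are uppercase. A literal space is not part 
    of the grammar, so this strict pattern rejects it. 

```python
import re

# Language-Tag = Primary-subtag *( "-" Subtag )
LANGUAGE_TAG = r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*"

# escaped-utf8-string: printable ASCII except '"', "=", "?",
# or any non-ASCII character, or "=" followed by two hex digits.
ESCAPED_STRING = (
    r"(?:[\x21\x23-\x3C\x3E\x40-\x7E]"
    r"|[^\x00-\x7F]"
    r"|=[0-9A-F]{2})+"
)

I18N_TEXT = re.compile(r"=\?(%s)\?(%s)\?=" % (LANGUAGE_TAG, ESCAPED_STRING))


def parse_i18n_text(token):
    """Return (language tag, escaped string), or None if token is not i18n-text."""
    match = I18N_TEXT.fullmatch(token)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

    A receiver could use the None result to fall back to treating the 
    value as ordinary 'qdtext' or 'UTF8-TRIM' text. 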
     
                                              
 6. Examples 
  
 6.1 Display Name 
     
    From: "=?ja-JP?(escaped-japanese-display-name in UTF8)?=" 
          <sip:shingo_fujimoto@jp.fujitsu.com> 
     
    From: "=?en-GB?Alice in Wonderland?=" 
          <sip:alice@wonderland.com> 
      
     
 6.2 Generic Header Value 
     
    Subject: =?en-AU?How are you today=20=3F?= 
     
    Organization: =?it?Ciao?= =?en?Travel Inc.?= 
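     
    The following Python sketch shows one way a receiver might split a 
    header value such as the Organization example above into 
    per-language segments. It is a lenient illustration only: unlike 
    the formal syntax, it tolerates literal spaces inside a segment, as 
    the examples in this section do. The names split_segments and 
    decode_escapes are hypothetical. 

```python
import re

# "=?" Language-Tag "?" text "?=", where text contains no "?" or double quote.
SEGMENT = re.compile(
    r'=\?([A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*)\?([^?"]*)\?='
)


def decode_escapes(text):
    """Replace each "=XX" escape with its octet, then decode the result as UTF-8."""
    raw = re.sub(
        rb"=([0-9A-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        text.encode("utf-8"),
    )
    return raw.decode("utf-8")


def split_segments(value):
    """Return a list of (language tag, decoded text) pairs found in value."""
    return [
        (m.group(1), decode_escapes(m.group(2)))
        for m in SEGMENT.finditer(value)
    ]
```

    Applied to the Organization value above, the sketch yields one 
    segment tagged "it" and one tagged "en", so a UA can choose a font 
    or pronunciation for each part separately. 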
     
     
 7. Security Considerations 
     
    This draft does not discuss security issues and is not believed to 
    raise any security issues on fully conforming implementations of 
    SIP. 
  
  
  
  
     
 8. References 
                      
    [1]  Bradner, S., "The Internet Standards Process -- Revision 3", 
       BCP 9, RFC 2026, October 1996. 
     
    [2]  Bradner, S., "Key words for use in RFCs to Indicate 
       Requirement Levels", BCP 14, RFC 2119, March 1997 
     
    [3]  J.Rosenberg, H.Schulzrinne,  G.Camarillo, A.Johnston, 
       J.Peterson, R.Sparks,M.Handley, E.Schooler, "SIP: Session 
       Initiation Protocol", RFC3261, June 2002 
     
    [4]  Yergeau, F., "UTF-8, a transformation format of ISO 10646", 
       RFC2279, January 1998 
     
    [5]  Freed, N., and Moore, K., "MIME Parameter Value and Encoded 
       Word Extensions: Character Sets, Languages, and Continuations", 
       RFC2231, November 1997 
     
    [6]  Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part 
       Three: Message Header Extensions for Non-ASCII Text", RFC2047, 
       November 1996 
     
    [7]  Alvestrand, H., "Tags for the Identification of Languages", 
       BCP 47, RFC3066, January 2001. 
     
    [8]  Crocker, D. and Overell, P.(Editors), "Augmented BNF for 
       Syntax Specifications: ABNF", RFC 2234, Internet Mail Consortium 
       and Demon Internet Ltd., November 1997 
     
    
    
 9. Author's Addresses 
    
   Shingo Fujimoto 
   Fujitsu Laboratories LTD 
   Okubocho Nishiwaki 64 
   Akashi HYOGO JAPAN 
   Phone: +81 78-934-8248 
   Email: shingo_fujimoto@jp.fujitsu.com 
     







  
  