Internet Draft                                            S. Fujimoto
Document: draft-fujimoto-sipping-header-lang-00      Fujitsu Labs LTD
Expires: March 2003                                    September 2002
SIP Header Language Information Extension
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026 [1].
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This draft explains the problems that arise when the UTF-8 character
set is used to represent text in languages other than English. It
also describes the requirements for solving these problems, and
proposes an extension to the SIP message syntax specification.
Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in RFC-2119 [2].
Table of Contents
1. Introduction..................................................2
2. Scope.........................................................2
3. Requirements..................................................3
3.1 Backward Compatibility....................................3
Fujimoto Expires - March 2003 [Page 1]
draft-fujimoto-sipping-header-lang-00 September 2002
3.2 Data Size.................................................3
3.3 Multiple Languages on one field value.....................3
3.4 Extensibility.............................................3
3.5 Country Information.......................................4
4. SIP Header I18N...............................................4
4.1 Language Tag..............................................4
4.2 SIP Header i18n Extension.................................4
5. Formal Syntax.................................................5
6. Examples......................................................5
6.1 Display Name..............................................5
6.2 Generic Header Value......................................5
7. Security Considerations.......................................5
8. References....................................................6
9. Author's Addresses............................................6
1. Introduction
SIP [3] allows any UTF-8 [4] characters in SIP message header
fields, e.g. the display-name in the From: field. However, there is
no standard way to add language information that gives UAs a hint
for choosing an appropriate font set when rendering the text on a
user interface. The problem arises because the same CJK (Chinese,
Japanese, Korean) Kanji characters are used to express different
letters in different languages. Additionally, people who use
alphabetic letters pronounce the same word differently in their
respective languages. We believe that adding language information
to UTF-8 text will solve these kinds of problems.
UTF-8 can represent any UCS-2/4 character, including the recently
proposed "language tag" characters, which specify the language of
the character string that follows. However, deployed SIP
implementations may not have enough knowledge to handle them
correctly.
RFC2231 [5] defines a standard way to add language information to
MIME Part Three [6] encoded words. However, that specification does
not provide any encoding type that allows UTF-8 characters to be
represented as is, since mail delivery infrastructures are still
"8-bit unsafe".
This Internet-Draft describes the requirements for adding language
information to SIP message headers, and proposes a method of adding
language information to UTF-8 header field values.
2. Scope
General requirements for header field values containing non-UTF-8
characters are out of scope of this draft.
3. Requirements
3.1 Backward Compatibility
This requirement states:
The extensions for SIP header internationalization (i18n-ext) MUST
fully conform to the current SIP specifications.
Even when an implementation does not support i18n-ext, these
headers MUST be recognized as valid SIP headers.
This means that any SIP implementation can relay SIP messages that
carry i18n-ext headers.
3.2 Data Size
This requirement states:
The i18n-ext SHOULD NOT increase the size of header data
drastically.
If i18n-ext is achieved by adding information to the original
header value, the result MUST work correctly with line folding.
3.3 Multiple Languages on one field value
This requirement states:
The i18n-ext SHOULD be applicable to a header value string that
consists of two or more languages, e.g. a combination of Chinese
and Japanese.
3.4 Extensibility
This requirement states:
The i18n-ext MUST make it easy to add new sets of languages for
header values.
The i18n-ext SHOULD be easy to apply to new headers that are added
in the future.
The i18n-ext MUST provide a means to determine whether an
implementation can handle a specific part of an i18n-ext header
value.
3.5 Country Information
This requirement states:
The i18n-ext MUST provide enough information for culturally
dependent processing.
This means that even when the same characters and wording are used,
it is possible to switch the pronunciation between, for example, US
and UK style when the text is read aloud.
4. SIP Header I18N
4.1 Language Tag
This specification uses a 'Language-Tag' to identify the language
of a UTF-8 string. The 'Language-Tag' syntax is imported from
RFC3066, "Tags for the Identification of Languages" [7].
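As a non-normative illustration, the following Python sketch checks
a string against the RFC3066 'Language-Tag' syntax and splits it
into its primary subtag and any remaining subtags, so that a UA
could, for example, tell "en-US" from "en-GB". The function name
split_language_tag is hypothetical and is not defined by this draft.

```python
import re

# Language-Tag = Primary-subtag *( "-" Subtag )   (RFC3066)
# Primary-subtag = 1*8ALPHA, Subtag = 1*8(ALPHA / DIGIT)
LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def split_language_tag(tag):
    """Return (primary_subtag, [subtags]); raise ValueError if malformed.

    Hypothetical helper for illustration only.
    """
    if not LANGUAGE_TAG.fullmatch(tag):
        raise ValueError(f"not an RFC3066 Language-Tag: {tag!r}")
    primary, *subtags = tag.split("-")
    # RFC3066 tags are case-insensitive, so normalize to lowercase.
    return primary.lower(), [s.lower() for s in subtags]
```

A UA might use the primary subtag to select a font set and the
second subtag, when present, to select country-specific behavior.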
4.2 SIP Header i18n Extension
This document defines a new syntax component, 'i18n-text', which
can be used in place of 'qdtext' or 'UTF8-TRIM' in RFC3261 [3].
'qdtext' is used for the display-name in the From and To headers,
and 'UTF8-TRIM' is used for other UTF-8 enabled headers.
This specification reserves the characters "=", "?", and """ for
special purposes, and these characters MUST be escaped within
'i18n-text'.
All compliant implementations MUST replace the "?" character with
"=3F" wherever 'i18n-text' is allowed.
'i18n-text' starts with the preamble sequence "=?", followed by a
language tag, followed by the delimiter "?", followed by an escaped
UTF-8 string, and ends with the epilogue sequence "?=".
Escaped UTF-8 strings are represented as follows:
1) Any octet of a UTF-8 printable character MAY be represented as
an "=" character followed by a 2-letter hex value.
2) The "=", "?", and """ characters MUST be represented using the
hex value representation defined in 1).
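The construction rules above can be sketched in Python as follows.
This is a non-normative illustration: the names escape_utf8 and
make_i18n_text are hypothetical, the terminator is the "?=" sequence
given in the formal syntax of Section 5, and the sketch assumes that
spaces and control characters are also escaped (which rule 1
permits) while non-ASCII UTF-8 characters are sent as is.

```python
# Characters that rule 2 requires to be escaped.
RESERVED = {"=", "?", '"'}


def escape_utf8(text):
    """Escape reserved characters, spaces and controls as "=XX"."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in RESERVED or code <= 0x20 or code == 0x7F:
            # Only ASCII characters reach this branch, so each one
            # is a single octet and needs a single "=XX" group.
            out.append(f"={code:02X}")
        else:
            # Printable ASCII and non-ASCII characters pass through.
            out.append(ch)
    return "".join(out)


def make_i18n_text(language_tag, text):
    """Build "=?" Language-Tag "?" escaped-utf8-string "?="."""
    if not text:
        raise ValueError("the escaped string must not be empty")
    return f"=?{language_tag}?{escape_utf8(text)}?="
```

For instance, make_i18n_text("en-AU", "How are you today ?") yields
"=?en-AU?How=20are=20you=20today=20=3F?=".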
5. Formal Syntax
The following syntax specification uses the augmented Backus-Naur
Form (BNF) as described in RFC-2234 [8].
'i18n-text' MAY be used where 'qdtext' or 'UTF8-TRIM' is used in
RFC-3261. However, using both 'qdtext' and 'i18n-text' within
'quoted-string' is not allowed.
i18n-text = "=?" Language-Tag "?" escaped-utf8-string "?="
'Language-Tag' is imported from RFC3066 "Tags for the
identification of language"[7]
Language-Tag = Primary-subtag *( "-" Subtag )
Primary-subtag = 1*8ALPHA
Subtag = 1*8(ALPHA / DIGIT)
escaped-utf8-char = "=" 2*2HEXDIGIT
HEXDIGIT = %x30-39 / %x41-46
escaped-utf8-string = 1*(UTF8-NON-ASCII
                      / %x21 / %x23-3C / %x3E / %x40-7E
                      / escaped-utf8-char )
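As a non-normative aid, the grammar can be approximated by a Python
regular expression. The sketch below assumes that the unescaped
characters are any printable ASCII or non-ASCII character other than
the three reserved characters "=", "?", and """ named in Section
4.2, and that hex digits are uppercase as in 'HEXDIGIT'. The name
parse_i18n_text is hypothetical.

```python
import re

LANGUAGE_TAG = r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*"
ESCAPED_CHAR = r"=[0-9A-F]{2}"
# Anything except controls, space, DEL and the reserved characters.
LITERAL_CHAR = r'[^\x00-\x20\x7F"=?]'

I18N_TEXT = re.compile(
    rf"=\?(?P<lang>{LANGUAGE_TAG})\?"
    rf"(?P<body>(?:{ESCAPED_CHAR}|{LITERAL_CHAR})+)\?="
)


def parse_i18n_text(token):
    """Return (language_tag, escaped_body), or None if token is invalid.

    The body is returned still escaped; this only checks the syntax.
    """
    match = I18N_TEXT.fullmatch(token)
    if match is None:
        return None
    return match.group("lang"), match.group("body")
```

A token with an unescaped "?" inside the body, such as
"=?it?Ci?ao?=", is rejected because "?" may only appear as a
delimiter.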
6. Examples
6.1 Display Name
From: "=?ja-JP?(escaped-japanese-display-name in UTF8)?="
    <sip:shingo_fujimoto@jp.fujitsu.com>
From: "=?en-GB?Alice in Wonderland?="
    <sip:alice@wonderland.com>
6.2 Generic Header Value
Subject: =?en-AU?How are you today=20=3F?=
Organization: =?it?Ciao?= =?en?Travel Inc.?=
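To illustrate how a receiving UA might process such values, the
following non-normative Python sketch extracts every 'i18n-text'
segment from a header value and returns (language tag, decoded
text) pairs. The names i18n_segments and unescape are hypothetical.
Because the examples contain literal spaces, the sketch tolerates
unescaped spaces inside a segment; that leniency is an assumption of
this sketch.

```python
import re

# "=?" Language-Tag "?" body "?=", where the body may hold "=XX"
# escapes or any character other than the reserved ones and CR/LF.
SEGMENT = re.compile(
    r"=\?([A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*)\?"
    r'((?:=[0-9A-F]{2}|[^"=?\r\n])+)\?='
)
HEX_ESCAPE = re.compile(rb"=([0-9A-F]{2})")


def unescape(body):
    """Turn each "=XX" group back into its octet, then decode UTF-8."""
    raw = body.encode("utf-8")
    octets = HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return octets.decode("utf-8")


def i18n_segments(header_value):
    """List (language_tag, text) for each i18n-text in header_value."""
    return [
        (m.group(1), unescape(m.group(2)))
        for m in SEGMENT.finditer(header_value)
    ]
```

Applied to the Organization value above, i18n_segments returns one
pair tagged "it" and one pair tagged "en", which lets a UA render or
pronounce each part according to its own language.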
7. Security Considerations
This draft does not discuss security issues and is not believed to
raise any security issues on fully conforming implementations of
SIP.
8. References
[1] Bradner, S., "The Internet Standards Process -- Revision 3",
BCP 9, RFC 2026, October 1996.
[2] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[3] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
Peterson, J., Sparks, R., Handley, M., and Schooler, E., "SIP:
Session Initiation Protocol", RFC 3261, June 2002.
[4] Yergeau, F., "UTF-8, a transformation format of ISO 10646",
RFC 2279, January 1998.
[5] Freed, N. and Moore, K., "MIME Parameter Value and Encoded
Word Extensions: Character Sets, Languages, and Continuations",
RFC 2231, November 1997.
[6] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part
Three: Message Header Extensions for Non-ASCII Text", RFC 2047,
November 1996.
[7] Alvestrand, H., "Tags for the Identification of Languages",
BCP 47, RFC 3066, January 2001.
[8] Crocker, D. and Overell, P. (Editors), "Augmented BNF for
Syntax Specifications: ABNF", RFC 2234, Internet Mail Consortium
and Demon Internet Ltd., November 1997.
9. Author's Addresses
Shingo Fujimoto
Fujitsu Laboratories LTD
Okubocho Nishiwaki 64
Akashi HYOGO JAPAN
Phone: +81 78-934-8248
Email: shingo_fujimoto@jp.fujitsu.com