Internet DRAFT - draft-deng-idn-tsmodule

Internet Draft                                     Authors: Xiang Deng
<draft-deng-idn-tsmodule-00.txt>                                 CNNIC
September 2001
Expires in six months


              The Selective Module for the Conversion
         Between Traditional/Simplified Characters in DNS

Status of this Memo

This document is an Internet-Draft and is in full conformance with all 
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task 
Force (IETF), its areas, and its working groups. Note that other groups 
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months 
and may be updated, replaced, or obsoleted by other documents at any 
time. It is inappropriate to use Internet-Drafts as reference material 
or to cite them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html

Terminology

The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED", and
"MAY" in this document are to be interpreted as described in RFC 2119
[RFC2119].


Abstract

This document proposes a practical scheme for implementing a selective
module for the conversion between Traditional and Simplified Chinese
characters in DNS. The scheme is derived by stating a set of assumed
conditions and analyzing their consequences.

1. The Origin of the Issue
The original intention in establishing the IDN WG was to permit
Internet users, regardless of nationality or race, to locate and
access Internet resources in the languages they are familiar with.
Achieving this objective is the common goal of all members of the
IDN WG.

The problem of Traditional/Simplified Chinese Conversion ("TSconv"
for short) is not simply a matter of using Chinese characters in
DNS. The essential point is how to truly embody the intention behind
the establishment of the IDN WG.

From the viewpoint of character encoding, the conversion deals with
how to build a mapping between the different glyphs of a single
character in a particular language environment (e.g. the Chinese
language environment).

2. The Language Relativity of the Internationalized Domain Name
Whether the solution is the ACE-Nameprep-IDNA ("ANI" for short)
approach adopted by the IETF IDN WG or one based on UTF-8 or UNICODE,
we must properly resolve the problem of converting the native-language
text that users type into the standard format (e.g. ACE, UTF-8 or
UNICODE).

In the ANI solution, the flow is expected to be:
        +-----------------------------------------+
        | User Typing the Code of Native Language |
        +-----------------------------------------+
                              |
                              V
        +----------------------------------------------+
        | the Code of Native Language -> UNICODE (NtoU)|
        +----------------------------------------------+
                              |
                              V
        +-----------------------------------------+
        |               +------------+            |
        |  IDNA         |  Nameprep  |            |
        |               +------------+            |
        |                     |                   |
        |                     V                   |
        |               +------------+            |
        |               | Resolver   |            |
        |               +------------+            |
        |                     |                   |
        |                     V                   |
        |               +------------+            |
        |               |    DNS     |            |
        |               +------------+            |
        +-----------------------------------------+

Although each of the world's languages has its own distinguishing
characteristics, this does not change the goal of the IDN WG: to
bring all the languages of the world onto the Internet. Therefore,
the WG shall first settle the "NtoU" step (Native Language to
UNICODE Conversion), in both technical and strategic terms, before
carrying out the IDNA.

Suppose that NtoU is performed by the user's operating system.
Engineers must first know the coding type of the language (i.e. the
language coding type) that the user inputs; only then can they
convert the input into UNICODE accurately.
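
As a minimal sketch of why the language coding type must be known
before NtoU, the following Python fragment converts the same
two-character label from two native encodings. The helper names ntou
and ntou_or_none are hypothetical and are not taken from any IDN
specification; the codec names "gb2312" and "big5" are those of the
Python standard library and serve only as stand-ins for the user's
native encoding.

```python
def ntou(raw: bytes, coding_type: str) -> str:
    """Hypothetical NtoU step: native-encoded bytes -> Unicode string."""
    return raw.decode(coding_type)


def ntou_or_none(raw: bytes, coding_type: str):
    """Same as ntou(), but returns None when the bytes cannot be decoded."""
    try:
        return ntou(raw, coding_type)
    except UnicodeDecodeError:
        return None


label = "中文"                      # present in both GB2312 and Big5
gb_bytes = label.encode("gb2312")   # what a GB2312 system would supply
big5_bytes = label.encode("big5")   # what a Big5 system would supply

# The same text has different byte sequences in the two native encodings.
print(gb_bytes != big5_bytes)

# With the correct coding type, both inputs reach the same Unicode string.
print(ntou(gb_bytes, "gb2312") == ntou(big5_bytes, "big5") == label)

# With the wrong coding type, the original label is not recovered.
print(ntou_or_none(gb_bytes, "big5") == label)
```

The last check illustrates the point of the paragraph above: without
the coding type, the bytes alone do not identify the intended
characters.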

Since the language coding type does exist, it is possible to define
a specific NtoU in accordance with the characteristics of a given
language. This is also the point at which IDN technology can fully
demonstrate its creativity.

The examples presented below concern the T/S Conversion. They are
intended to illustrate different technical solutions.

3. The Placement of Language Specificity in the ANI Solution
(1). Situation A: Define the language specificity in NtoU, and
     accomplish NtoU in application programs

Assumptions:
  a. NtoU does not belong to the Nameprep; NtoU is accomplished by
     application programs;
  b. If one registers a domain name in the ".xx" domain, the system
     strictly follows the TSconv rule. All the data loaded in the
     DNS server will be Simplified Chinese domain names complying with
     this rule;
  c. If one registers a domain name in the ".yy" domain, the system
     strictly follows the "yyRule". All the data loaded in the DNS
     server will be domain names complying with the yyRule;
  d. The domain ".zz" does not comply with any conversion rule;
  e. Application 1 (App.1) complies with the TSconv rule in NtoU;
     App.2 complies with the yyRule in NtoU; App.3 does not
     comply with any rule.

Conclusions:
  a. Users of App.1 can access domain names in the domains ".xx",
     ".yy" and ".zz" accurately by typing Simplified Chinese
     characters ("SC" for short).

     These users can access the ".xx" domain accurately by typing
     Traditional Chinese characters ("TC" for short).

     But TC domain names in the domains ".yy" and ".zz" can never be
     accessed by users of App.1.

  b. Users of App.2 can access domain names in the domains ".xx",
     ".yy" and ".zz" accurately by typing characters that comply
     with yyRule.

     These users can access the ".yy" domain accurately by typing
     characters that do not comply with yyRule.

     But domain names in the domains ".xx" and ".zz" that do not
     comply with yyRule can never be accessed by users of App.2.

  c. Users of App.3 can access domain names in the domains ".xx",
     ".yy" and ".zz" by typing characters that comply with both the
     TSconv rule and yyRule.

     These users cannot access ".xx" domain names by typing
     characters that do not comply with the TSconv rule.

     These users cannot access ".yy" domain names by typing
     characters that do not comply with yyRule.

     These users can access domain names in the domain ".zz" by
     typing characters that do not comply with either TSconv or
     yyRule.

Summary:
  Separating NtoU from Nameprep in this way leads to divergent
  behavior among applications and confusion among users.
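
The divergence described for Situation A can be sketched with a small
Python simulation. Everything in it is an illustrative assumption: the
two-entry TS_TABLE stands in for a real TSconv table, the zone contents
are invented, and yyRule and App.2 are left out for brevity.

```python
# Toy TSconv table (Traditional -> Simplified); a real table is far larger.
TS_TABLE = str.maketrans({"國": "国", "學": "学"})


def tsconv(label: str) -> str:
    """Apply the toy TSconv mapping to a label."""
    return label.translate(TS_TABLE)


# ".xx" stores only Simplified names (assumption b).
# ".zz" applies no rule, so a Traditional name is stored as registered
# (assumption d).
ZONES = {
    "xx": {"中国"},
    "zz": {"中國"},
}

# Each application performs its own NtoU step (assumption e).
APPS = {
    "App.1": tsconv,          # applies TSconv before lookup
    "App.3": lambda s: s,     # applies no rule
}


def can_access(app: str, typed: str, tld: str) -> bool:
    """True if the label, as converted by the given application, is in the zone."""
    return APPS[app](typed) in ZONES[tld]


for app in APPS:
    for tld in ZONES:
        print(app, tld, can_access(app, "中國", tld))
```

With the same Traditional input, App.1 reaches the ".xx" record but not
the ".zz" record, while App.3 behaves the other way round. The result
therefore depends on which application the user happens to run.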

(2). Situation B: Define the language specificity in the Nameprep

Assumptions:
  a. NtoU belongs to the Nameprep; NtoU is accomplished in the Nameprep.
  b. If one registers a domain name in the ".xx" domain, the system
     strictly follows TSconv. The data loaded in the DNS server will
     be only the Simplified Chinese domain names that comply with TSconv.
  c. If one registers a domain name in the ".yy" domain, the system
     strictly follows yyRule. The data loaded in the DNS server will be
     only the domain names that comply with yyRule.
  d. The domain ".zz" does not comply with any rule.

Conclusions:
  a. Users can access domain names in the domains ".xx", ".yy" and
     ".zz" accurately by typing SC that complies with yyRule;

     Users can access the ".xx" domain accurately by typing TC;

     Users can access the ".yy" domain accurately by typing characters
     that do not comply with yyRule;

     But domain names in the domains ".xx" and ".zz" that do not comply
     with yyRule can never be accessed by users;

     Domain names in the domains ".yy" and ".zz" that do not comply with
     the TSconv rule can never be accessed by users.

Summary:
  a. Coherence among applications is achieved.
  b. Domain names that do not follow the rules cannot be accessed
     by users.
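
A minimal Python sketch of Situation B follows, under the same kind of
illustrative assumptions: a two-entry toy table in place of TSconv,
invented zone contents, and a nameprep() function that is only a
stand-in for the real Nameprep step. Because the rule sits in one
shared step, there is no per-application variation left to model.

```python
# Toy TSconv table (Traditional -> Simplified), for illustration only.
TS_TABLE = str.maketrans({"國": "国", "學": "学"})


def nameprep(label: str) -> str:
    """Stand-in for Nameprep with the language rule built in (assumption a)."""
    return label.translate(TS_TABLE)


# ".xx" holds only Simplified names; ".zz" follows no rule and here
# holds a Traditional name exactly as it was registered.
ZONES = {"xx": {"中国"}, "zz": {"中國"}}


def can_access(typed: str, tld: str) -> bool:
    """Every application goes through the same nameprep() before lookup."""
    return nameprep(typed) in ZONES[tld]


print(can_access("中國", "xx"))  # TC input reaches the SC record in ".xx"
print(can_access("中国", "xx"))  # SC input reaches it as well
print(can_access("中國", "zz"))  # the TC record in ".zz" is never matched
```

The behavior is now the same for every application, but the record in
".zz" that does not follow the rule cannot be reached by any input.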

(3). Situation C: Registration solution using a language encoding tag

Assumptions:
  a. NtoU belongs to the Nameprep; NtoU is accomplished in the Nameprep;
  b. If one registers a Chinese domain name in the ".xx" or ".zz"
     domain, the system strictly requires the user to register two
     domain names, according to the language encoding tag and two
     different rules (TSconv and STconv).

Conclusions:
  a. Two records are created whenever a Chinese domain name is
     registered. In the domain ".xx" or ".zz", one record complies
     with TSconv and the other complies with STconv.

  b. Users can access domain names in the domains ".xx" and ".zz" by
     typing SC or TC, whether or not the input complies with TSconv.

     Users can access domain names in the domains ".xx" and ".zz" by
     typing SC or TC, whether or not the input complies with STconv.

  c. By analogy, rules for other languages can be applied in the
     same way.

Summary:
  a. Coherence among applications is achieved.
  b. Coherence of the access results is achieved.

3. Implementation Scheme
(1). Registration Scheme
        +--------+ +--------+
        |  .xx   | |  .zz   |
        +--------+ +--------+
          ^     ^   ^    ^
          |      \ /     |
          |      / \     |
     +------------+-----------+
     |    TSconv  |  STconv   |
     +------------+-----------+
     | Register simultaneously|
     +------------------------+
               ^     ^
                \   /
     +------------+-----------+
     |   User (SC)|(TC)       |
     +------------+-----------+

(2). Resolution process
    +-------+  +-------+  +-------+
    |  .xx  |  |  .zz  |  |  .yy  |
    +-------+  +-------+  +-------+
      ^            ^            ^
      |            |            |
      +------------+------------+
        ^          ^          ^
        |          |          |
    +--------+ +--------+ +--------+
    | TSconv | | STconv | | yyRule |
    +--------+ +--------+ +--------------------+
    |user(SC)| |user(TC)| |user(other language)|
    +--------+ +--------+ +--------------------+
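
The registration and resolution flows drawn above can be sketched in
Python as follows. This is only an illustration under stated
assumptions: the two-entry tables stand in for the real TSconv and
STconv rules, STconv is taken to be the Simplified-to-Traditional
direction, and the "SC"/"TC" keys are a hypothetical form of the
language encoding tag.

```python
# Toy conversion tables; real TSconv/STconv tables are far larger.
TS_TABLE = str.maketrans({"國": "国", "學": "学"})   # Traditional -> Simplified
ST_TABLE = str.maketrans({"国": "國", "学": "學"})   # Simplified -> Traditional

# Rule applied for each (hypothetical) language encoding tag.
RULES = {
    "SC": lambda s: s.translate(TS_TABLE),   # SC users go through TSconv
    "TC": lambda s: s.translate(ST_TABLE),   # TC users go through STconv
}


def register(zone: set, label: str) -> None:
    """Create both records for a single registration request."""
    for convert in RULES.values():
        zone.add(convert(label))


def resolve(zone: set, typed: str, tag: str) -> bool:
    """Convert the typed label by the rule for its tag, then look it up."""
    return RULES[tag](typed) in zone


zz = set()
register(zz, "中國大学")        # registrant typed a mix of TC and SC forms

print(sorted(zz))               # the two records that were created
print(resolve(zz, "中国大学", "SC"))
print(resolve(zz, "中國大學", "TC"))
print(resolve(zz, "中國大学", "SC"), resolve(zz, "中國大学", "TC"))
```

Since both forms are stored and each input is converted by the rule
that matches its tag, SC, TC and mixed input all resolve. A rule for
another language, such as yyRule, could be added as a further entry in
RULES.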


4. Authors' Address
Xiang Deng
China Internet Network Information Center 
No. 4 South 4th St., Beijing, P.R. China, 100080, PO Box 349
Tel: +86-10-62619750 

5. Acknowledgement
Yang Yu         <yuyang@cnnic.net.cn>
YunGang Chen    <chen@cnnic.net.cn>
Yanfeng WANG    <wyf@cnnic.net.cn>
XiaoDong Li     <lee@cnnic.net.cn>
GuoNian Sun     <sun@cnnic.net.cn>


6. References

[IDNREQ]   Zita Wenzel, James Seng, Requirements of Internationalized
           Domain Names, draft-ietf-idn-requirements.

[NAMEPREP] Paul Hoffman, Marc Blanchet, Preparation of
           Internationalized Host Names, draft-ietf-idn-nameprep.

[RFC2119]  Scott Bradner, Key words for use in RFCs to Indicate
           Requirement Levels, March 1997, RFC 2119.

[STD13]    Paul Mockapetris, Domain names - implementation and
           specification, November 1987, STD 13 (RFC 1034 and 1035).

[UNAME]    Li Ming TSENG, Jan Ming HO, Hua Lin QIAN, Kenny HUANG,
           Internationalized Domain Names and Unique Identifiers/Names,
           draft-ietf-idn-uname.

[TSCONV]   Xiao Dong Lee, Nai Wen Hsu, Erin Chen, Guo Nian Sun,
           Traditional and Simplified Chinese Conversion,
           draft-ietf-idn-tsconv.

[ISO10646] ISO/IEC 10646-1:2000. International Standard -- Information
           technology -- Universal Multiple-Octet Coded Character Set
           (UCS) -- Part 1: Architecture and Basic Multilingual Plane.

[Unicode3] The Unicode Consortium, "The Unicode Standard -- Version 3.0",
           ISBN 0-201-61633-5.