Internet Draft Authors: Xiang Deng
<draft-deng-idn-tsmodule-00.txt> CNNIC
September 2001
Expires in six months
The Selective Module for The Conversion
Between Traditional/Simplified Characters in DNS
Status of this Memo
This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Terminology
The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED", and
"MAY" in this document are to be interpreted as described in RFC 2119
[RFC2119].
Abstract
This document proposes a practical scheme for implementing a selective
module for conversion between Traditional and Simplified Chinese
characters in DNS. The scheme is derived by positing sets of assumptions
and analyzing their consequences.
1. The Origin of the Issue
The original intention in establishing the IDN WG was to permit
Internet users, regardless of nationality or race, to locate and
access Internet resources using the languages they are familiar
with. Achieving this objective is the common goal of all members
of the IDN WG.
The problem of Traditional/Simplified Chinese Conversion ("TSconv"
for short) is not simply a matter of using Chinese characters in
DNS. The essential point is how to truly realize the intention
behind establishing the IDN WG.
From the perspective of character encoding, the conversion deals
with how to build a mapping between the different glyphs of a single
character in a particular language environment (e.g. the Chinese
language environment).
2. The Language Relativity of the Internationalized Domain Name
Whether the ACE-Nameprep-IDNA ("ANI" for short) solution adopted
by the IETF IDN WG is used, or a solution based on UTF-8 or UNICODE,
we must properly resolve the problem of converting the native-language
input that users type into the standard format (e.g. ACE, UTF-8,
UNICODE, etc.).
In the ANI solution, the flow chart is supposed to be:
+-----------------------------------------+
| User Typing the Code of Native Language |
+-----------------------------------------+
|
V
+----------------------------------------------+
| the Code of Native Language -> UNICODE (NtoU)|
+----------------------------------------------+
|
V
+-----------------------------------------+
| +------------+ |
| IDNA | Nameprep | |
| +------------+ |
| | |
| V |
| +------------+ |
| | Resolver | |
| +------------+ |
| | |
| V |
| +------------+ |
| | DNS | |
| +------------+ |
+-----------------------------------------+
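The flow above can be illustrated with a minimal Python sketch. It is
not part of the ANI specification: the function names are hypothetical,
NFKC normalization plus case folding is only a rough stand-in for
Nameprep, and Python's built-in "punycode" codec stands in for whichever
ACE is chosen. The native encoding is assumed to be known to the caller.

```python
import unicodedata


def ntou(raw: bytes, native_encoding: str) -> str:
    """NtoU: decode bytes in the user's native encoding into Unicode."""
    return raw.decode(native_encoding)


def nameprep_stand_in(label: str) -> str:
    """Rough stand-in for Nameprep (assumption): NFKC plus case folding."""
    return unicodedata.normalize("NFKC", label).casefold()


def to_ace_stand_in(label: str) -> bytes:
    """Stand-in for the ACE step (assumption): Python's punycode codec."""
    return label.encode("punycode")


def prepare_query(raw: bytes, native_encoding: str) -> bytes:
    """Run the typed bytes through NtoU, then Nameprep, then ACE."""
    unicode_label = ntou(raw, native_encoding)
    prepared = nameprep_stand_in(unicode_label)
    return to_ace_stand_in(prepared)


# The same name typed under two different native encodings.
ace_from_gb = prepare_query("中国".encode("gb2312"), "gb2312")
ace_from_utf8 = prepare_query("中国".encode("utf-8"), "utf-8")
print(ace_from_gb, ace_from_utf8)
```

Under these assumptions, the same name typed in GB2312 or in UTF-8
yields the same prepared label, which is the property the NtoU step
has to guarantee before the resolver is involved.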
Although each of the world's languages has its own distinguishing
characteristics, this does not change the goal of the IDN WG: to
bring all the languages of the world onto the Internet. Therefore,
the WG shall first accomplish "NtoU" (Native Language to UNICODE
Conversion) in both technical and strategic terms before carrying
out IDNA.
Suppose that NtoU is accomplished by the user's operating system.
Engineers must first know the coding type of the language (i.e. the
language coding type) that the user inputs; only then can they
convert the input into UNICODE accurately.
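A short Python sketch shows why the language coding type must be known
before NtoU. The sample string and the choice of GB2312 and Big5 are
illustrative assumptions; both codecs ship with Python's standard
library.

```python
# Bytes as a GB2312 input method might deliver them (illustrative sample).
raw = "中文".encode("gb2312")

# Decoding with the correct coding type recovers the intended characters.
as_gb2312 = raw.decode("gb2312")

# Decoding the same bytes as Big5 yields different text; errors="replace"
# keeps the sketch from raising if a byte pair is not valid Big5.
as_big5 = raw.decode("big5", errors="replace")

print(repr(as_gb2312), repr(as_big5))
```

The byte sequence alone does not identify the intended characters, so
an NtoU step that guesses the coding type can silently produce a
different name.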
Since the language coding type does exist, it is possible to define
a specific NtoU in accordance with the characteristics of a given
language. This is also the point at which IDN technology can fully
demonstrate its creativity.
The examples presented below concern T/S Conversion. They are
intended to illustrate different technical solutions.
3. The Placement of Language Specificity in the ANI Solution
(1). Situation A: Define the language specificity in NtoU, and
accomplish NtoU by application programs
Assumptions:
a. NtoU does not belong to Nameprep; NtoU is accomplished by
application programs;
b. If one registers a domain name in the ".xx" domain, the system
strictly follows the TSconv rule. All data loaded into the
DNS server are Simplified Chinese domain names complying with
this rule;
c. If one registers a domain name in the ".yy" domain, the system
strictly follows the "yyRule". All data loaded into the DNS
server are domain names complying with the yyRule;
d. The domain ".zz" does not apply any conversion rule;
e. Application 1 (App.1) applies the TSconv rule in NtoU;
App.2 applies the yyRule in NtoU; App.3 does not apply
any rule.
Conclusions:
a. Users of App.1 can accurately access domain names in the ".xx",
".yy" and ".zz" domains by typing Simplified Chinese characters
("SC" for short);
Users can accurately access the ".xx" domain by typing Traditional
Chinese characters ("TC" for short).
But TC domain names in the ".yy" and ".zz" domains can never be
accessed by users of App.1.
b. Users of App.2 can accurately access domain names in the ".xx",
".yy" and ".zz" domains by typing characters that comply with yyRule.
Users can accurately access the ".yy" domain by typing characters
that do not comply with yyRule;
But domain names in the ".xx" and ".zz" domains that do not comply
with yyRule can never be accessed by users of App.2.
c. Users of App.3 can access domain names in the ".xx", ".yy" and
".zz" domains by typing characters that comply with both the TSconv
rule and yyRule.
Users cannot access ".xx" domain names by typing characters that
do not comply with the TSconv rule.
Users cannot access ".yy" domain names by typing characters that
do not comply with yyRule.
Users can access domain names in the ".zz" domain by typing
characters that do not comply with either TSconv or yyRule.
Summary:
Separating NtoU from Nameprep in this way will lead to divergent
behavior among applications and confusion among users.
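Situation A can be made concrete with a small simulation. This is a
sketch under stated assumptions: the three-entry T2S table is a
hypothetical toy subset and not the TSconv table, the zones are plain
Python sets, and only App.1 and App.3 are modeled because yyRule is
left unspecified in this document.

```python
# Toy Traditional -> Simplified table (hypothetical subset for illustration).
T2S = {"國": "国", "學": "学", "網": "网"}


def tsconv(label: str) -> str:
    """Map each Traditional character to its Simplified form, if listed."""
    return "".join(T2S.get(ch, ch) for ch in label)


def no_rule(label: str) -> str:
    """An application that applies no conversion in NtoU."""
    return label


# ".xx" applies TSconv at registration, so it holds SC labels only.
# ".zz" applies no rule, so the label is stored exactly as registered.
ZONES = {
    "xx": {tsconv("中國")},  # registered in TC, stored in SC
    "zz": {"中國"},          # registered in TC, stored in TC
}

# Each application performs its own conversion during NtoU.
APPS = {"App.1": tsconv, "App.3": no_rule}


def can_resolve(app: str, typed: str, tld: str) -> bool:
    """True if the label the application sends exists in the zone."""
    return APPS[app](typed) in ZONES[tld]


for app in APPS:
    for tld in ZONES:
        print(app, tld, can_resolve(app, "中國", tld))
```

With the same TC input, App.1 reaches the ".xx" name but not the ".zz"
name, while App.3 shows the opposite result. This is the divergence
among applications that the summary describes.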
(2). Situation B: Define the language specificity in Nameprep
Assumptions:
a. NtoU belongs to Nameprep; NtoU is accomplished in Nameprep.
b. If one registers a domain name in the ".xx" domain, the system
strictly follows TSconv. The data loaded into the DNS server are
only the Simplified Chinese domain names that comply with TSconv.
c. If one registers a domain name in the ".yy" domain, the system
strictly follows yyRule. The data loaded into the DNS server are
only the domain names that comply with yyRule.
d. The domain ".zz" does not apply any rule.
Conclusions:
a. Users can accurately access domain names in the ".xx", ".yy" and
".zz" domains by typing SC characters that comply with yyRule;
Users can accurately access the ".xx" domain by typing TC;
Users can accurately access the ".yy" domain by typing characters
that do not comply with yyRule;
But domain names in the ".xx" and ".zz" domains that do not comply
with yyRule can never be accessed by users;
Domain names in the ".yy" and ".zz" domains that do not comply with
the TSconv rule can never be accessed by users.
Summary:
a. Coherence across applications is achieved.
b. Domain names that do not follow the rules cannot be accessed
by users.
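The trade-off in Situation B can be sketched in the same way. The
assumptions are again illustrative: a hypothetical toy T2S table, zones
modeled as sets, and a single shared Nameprep function that applies
only TSconv (yyRule is omitted because this document does not specify
it).

```python
# Toy Traditional -> Simplified table (hypothetical subset for illustration).
T2S = {"國": "国", "學": "学", "網": "网"}


def nameprep_with_tsconv(label: str) -> str:
    """Shared Nameprep stand-in: every application runs the same TSconv."""
    return "".join(T2S.get(ch, ch) for ch in label)


# ".xx" stores the converted (SC) label; ".zz" stores the label as registered.
ZONE_XX = {nameprep_with_tsconv("中國")}
ZONE_ZZ = {"中國"}


def resolves(typed: str, zone: set) -> bool:
    """All applications share this path, so results cannot differ by app."""
    return nameprep_with_tsconv(typed) in zone


for typed in ("中国", "中國"):
    print(typed, resolves(typed, ZONE_XX), resolves(typed, ZONE_ZZ))
```

Both SC and TC input reach the ".xx" name, and every application gets
the same answer. The TC name stored in ".zz" is unreachable whichever
form is typed, because the shared conversion always rewrites the query.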
(3). Situation C: Registration solution using a language encoding tag
Assumptions:
a. NtoU belongs to Nameprep; NtoU is accomplished in Nameprep;
b. If one registers a Chinese domain name in the ".xx" or ".zz"
domain, the system strictly requires the user to register two
domain names, according to the language encoding tag and two
different rules (TSconv and STconv).
Conclusions:
a. Two records are created whenever a Chinese domain name is
registered. In the ".xx" or ".zz" domain, one record complies
with TSconv and the other complies with STconv.
b. Users can access domain names in the ".xx" and ".zz" domains by
typing SC or TC, whether or not the input complies with TSconv.
Users can access domain names in the ".xx" and ".zz" domains by
typing SC or TC, whether or not the input complies with STconv.
c. By analogy, other language rules can be applied in the same way.
Summary:
a. Coherence across applications is achieved.
b. Coherence of the access results is achieved.
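The dual registration of Situation C can be sketched as follows. The
assumptions are illustrative: STconv is read here as the
Simplified-to-Traditional direction, the T2S and S2T tables are a
hypothetical one-to-one toy subset (real mappings are not always
one-to-one), and register() is a hypothetical helper rather than a
registry interface.

```python
# Toy conversion tables (hypothetical one-to-one subset for illustration).
T2S = {"國": "国", "學": "学", "網": "网"}
S2T = {simp: trad for trad, simp in T2S.items()}


def tsconv(label: str) -> str:
    """Traditional -> Simplified, character by character."""
    return "".join(T2S.get(ch, ch) for ch in label)


def stconv(label: str) -> str:
    """Simplified -> Traditional (assumed meaning of STconv)."""
    return "".join(S2T.get(ch, ch) for ch in label)


def register(zone: set, label: str) -> None:
    """Create two records per registration: the SC form and the TC form."""
    zone.add(tsconv(label))
    zone.add(stconv(label))


zone_xx = set()
register(zone_xx, "中國")

# The lookup succeeds whether the client applies TSconv, STconv, or nothing.
for typed in ("中国", "中國"):
    for convert in (tsconv, stconv, lambda s: s):
        print(typed, convert(typed) in zone_xx)
```

Because both forms are present in the zone, the result no longer
depends on which conversion, if any, the client side applies. The cost
is one additional record per registered name.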
3. Implementation Scheme
(1). Registration Scheme
+--------+ +--------+
| .xx | | .zz |
+--------+ +--------+
^ ^ ^ ^
| \ / |
| / \ |
+------------+-----------+
| TSconv | STconv |
+------------+-----------+
| Register simultaneously|
+------------------------+
^ ^
\ /
+------------+-----------+
| User (SC)|(TC) |
+------------+-----------+
(2). Resolution process
+-------+ +-------+ +-------+
| .xx | | .zz | | .yy |
+-------+ +-------+ +-------+
^ ^ ^
| | |
+------------+------------+
^ ^ ^
| | |
+--------+ +--------+ +--------+
| TSconv | | STconv | | yyRule |
+--------+ +--------+ +--------------------+
|user(SC)| |user(TC)| |user(other language)|
+--------+ +--------+ +--------------------+
4. Authors' Address
Xiang Deng
China Internet Network Information Center
NO.4 South 4th ST. Beijing, P.R.China, 100080, PO BOX 349
Tel: +86-10-62619750
5. Acknowledgement
Yang Yu <yuyang@cnnic.net.cn>
YunGang Chen <chen@cnnic.net.cn>
Yanfeng WANG <wyf@cnnic.net.cn>
XiaoDong Li <lee@cnnic.net.cn>
GuoNian Sun <sun@cnnic.net.cn>
6. References
[IDNREQ] Zita Wenzel, James Seng, Requirements of Internationalized
Domain Names, draft-ietf-idn-requirements.
[NAMEPREP] Paul Hoffman, Marc Blanchet, Preparation of
Internationalized Host Names, draft-ietf-idn-nameprep.
[RFC2119] Scott Bradner, Key words for use in RFCs to Indicate
Requirement Levels, March 1997, RFC 2119.
[STD13] Paul Mockapetris, Domain names - implementation and
specification, November 1987, STD 13 (RFC 1034 and 1035).
[UNAME] Li Ming TSENG, Jan Ming HO, Hua Lin QIAN, Kenny HUANG,
Internationalized Domain Names and Unique Identifiers/Names,
draft-ietf-idn-uname.
[TSCONV] Xiao Dong Lee, Nai Wen Hsu, Erin Chen, Guo Nian Sun,
Traditional and Simplified Chinese Conversion,
draft-ietf-idn-tsconv.
[ISO10646] ISO/IEC 10646-1:2000. International Standard -- Information
technology -- Universal Multiple-Octet Coded Character Set
(UCS) -- Part 1: Architecture and Basic Multilingual Plane.
[Unicode3] The Unicode Consortium, "The Unicode Standard -- Version 3.0",
ISBN 0-201-61633-5.