Internet DRAFT - draft-hurtta-eai-encapsulation
Email Address Internationalization K. Hurtta
(EAI) March 17, 2007
Internet-Draft
Intended status: Experimental
Expires: September 18, 2007
Encapsulation mechanism for Internationalized Email
draft-hurtta-eai-encapsulation-01
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 18, 2007.
Copyright Notice
Copyright (C) The IETF Trust (2007).
Abstract
Email Address Internationalization (EAI) is implemented by allowing
UTF-8 characters in the SMTP envelope and in mail headers. To
deliver email that uses UTF-8 in its headers through an EAI non-
compliant environment, a conversion (i.e. downgrading) or
encapsulation mechanism is required. Some UTF-8 email may sign email
headers or email header fields. This document describes a mechanism
for encapsulation when conversion cannot be used because of signed email
Hurtta Expires September 18, 2007 [Page 1]
Internet-Draft EAI Encapsulation March 2007
headers. Encapsulation may also be used to forward EAI email through
an EAI non-compliant environment in such a way that the original EAI
email can be recovered.
Table of Contents
1. Introduction
2. Terminology
3. Addition to internationalized email header
3.1. "Downgrade-Method" header field
3.2. "I18N-Received" header field
3.3. Registration of Downgrade-Method header field
3.4. Registration of I18N-Received header field
4. Encapsulation format
4.1. "multipart/utf8-encapsulated" media type
4.2. Registration of media type multipart/utf8-encapsulated
4.3. Registration of media type text/utf8-header
5. Encapsulation
5.1. Generic encapsulation
5.1.1. Encapsulation of recursive part
5.1.2. Encapsulation example
5.1.3. Multipart encapsulation example
5.1.4. Unknown top level type encapsulation example #1
5.1.5. Unknown top level type encapsulation example #2
5.2. Downgrading of internationalized email message
5.2.1. Encapsulation example
5.2.2. Multipart/signed encapsulation example
5.3. Attaching internationalized email message
5.3.1. Attaching example
6. Decoding encapsulation
6.1. Generic decoding
6.1.1. Decoding of recursive part
6.2. Upgrading of internationalized email message
6.2.1. Upgrading example
6.3. Retrieving attached internationalized email message
7. IANA Considerations
8. Security Considerations
9. Acknowledgements
10. References
10.1. Normative References
10.2. Informative References
Author's Address
Intellectual Property and Copyright Statements
1. Introduction
Internationalized email includes UTF-8 characters [RFC3629] in email
headers. When internationalized email is delivered to an EAI non-
compliant environment, its email header fields are converted (i.e.
downgraded) to an ASCII-compatible form. When the email comes back
to an EAI compliant environment, it is upgraded to internationalized
form by decoding the ASCII-compatible encodings.
When internationalized email is downgraded to ASCII-compatible form
and then upgraded to internationalized form, the result is not
necessarily the original mail. For example, some header fields may
have originally used an ASCII-compatible form, but upgrading converts
them to UTF-8 form.
Sometimes, however, it may be required that the original
internationalized email header part can be recovered. This document
describes a mechanism for encapsulation that allows the original
internationalized email to be recovered.
If mail headers, or some mail header fields and message parts, are
cryptographically signed, the original mail may have to be recovered
before the signature of the mail is checked.
This document provides an encapsulation method which has the
following properties:
o The encapsulation does not produce nested encodings.
o Content of an encapsulated mail is accessible to EAI non-compliant
user agents.
o The encapsulation does not hide original MIME parts, although the
original MIME structure may be obscured.
o The encapsulation provides a way to recover the original
internationalized email.
Media types "multipart/utf8-encapsulated" and "text/utf8-header" are
introduced.
This document provides markup which indicates when downgrading to an
EAI non-compliant environment should be done with this encapsulation.
That is done by adding a "Downgrade-Method: encapsulate" header field.
If internationalized email has been encapsulated, a "Downgrade-Method:
encapsulated" header field is used.
Only a minimal number of header fields are generated for, or left in,
the header part of the encapsulated message. This hides signatures
that are placed in header fields during encapsulation. For example,
Domain Keys Identified Mail [DKIM-Charter] uses this kind of
signature. Original header fields are stored in the "text/utf8-header"
MIME part.
The "multipart/signed" media type [RFC1847] signs header fields from
the MIME header. After encapsulation the signature check fails,
because the MIME header has changed. The signature is therefore
hidden by replacing "multipart/signed" in the "Content-Type" header
field with the "multipart/mixed" value. The original "Content-Type"
header field is stored in the "text/utf8-header" MIME part.
This encapsulation copies all header fields of the internationalized
email to the "text/utf8-header" MIME part. The saved header fields
from the "text/utf8-header" MIME part and the "Received" header
fields from the encapsulating email are used when upgrading. Because
this may cause duplication of "Received" header fields, the original
"Received" header fields are renamed to "I18N-Received" during
encapsulation.
2. Terminology
Terminology for this document is defined in [ietf-eai-framework] and
[RFC2045].
3. Addition to internationalized email header
New header fields are introduced.
3.1. "Downgrade-Method" header field
The "Downgrade-Method" header field is added to Internet Message
Format [RFC2822] as specified below:
fields /= downgrade-method
downgrade-method = "Downgrade-Method:" downgrade-code CRLF
downgrade-code = "encapsulate" / "encapsulated"
; term <fields> is defined in [RFC2822]
The value "encapsulate" requests that downgrading be done with the
encapsulation defined in this document. The value "encapsulated"
indicates that downgrading has been done with the encapsulation
defined in this document.
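The grammar above can be checked with a few lines of code. The following is a minimal sketch, not part of this specification; the function name parse_downgrade_method is hypothetical, and the tolerance for surrounding whitespace and for upper-case letters is an assumption of the sketch.

```python
from typing import Optional

# The two values allowed by the downgrade-code rule.
DOWNGRADE_CODES = {"encapsulate", "encapsulated"}


def parse_downgrade_method(field_line: str) -> Optional[str]:
    """Return the downgrade code of a "Downgrade-Method" header field line.

    Returns None when the line is some other field or carries a value
    that the grammar does not allow.
    """
    name, sep, value = field_line.partition(":")
    if not sep or name.strip().lower() != "downgrade-method":
        return None
    code = value.strip().lower()
    return code if code in DOWNGRADE_CODES else None
```

A caller would typically treat "encapsulate" as a request and "encapsulated" as a statement that the message is already in encapsulated form.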
3.2. "I18N-Received" header field
The "I18N-Received" header field is added to Internet Message Format
[RFC2822] as specified below:
received /= "I18N-Received:" name-val-list ";" date-time CRLF
; terms <received>, <name-val-list> and <date-time>
; are defined in [RFC2822]
Original "Received" header fields are renamed to "I18N-Received"
during encapsulation. "Received" header fields are saved under their
original name in the "text/utf8-header" MIME part.
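The renaming step can be sketched as follows. This is an illustrative sketch only; the function name rename_received is hypothetical, the header is assumed to be given as a list of lines without line terminators, and where the renamed fields are finally placed is outside the scope of the sketch.

```python
from typing import List


def rename_received(header_lines: List[str]) -> List[str]:
    """Rename every "Received" field to "I18N-Received".

    Continuation lines (lines that start with white space) and all other
    header fields are returned unchanged.
    """
    renamed = []
    for line in header_lines:
        name, sep, rest = line.partition(":")
        is_field_start = bool(sep) and not line[:1].isspace()
        if is_field_start and name.lower() == "received":
            renamed.append("I18N-Received:" + rest)
        else:
            renamed.append(line)
    return renamed
```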
3.3. Registration of Downgrade-Method header field
This section provides the header field registration application (as
per [RFC3864]).
Header field name: Downgrade-Method
Applicable protocol: mail
Status: experimental
Author/Change controller:
Kari Hurtta
hurtta-ietf@elmme-mailer.org
Specification document(s): RFC XXXX
Related information:
Downgrade-Method is used for signaling encapsulation with
multipart/utf8-encapsulated media type.
3.4. Registration of I18N-Received header field
This section provides the header field registration application (as
per [RFC3864]).
Header field name: I18N-Received
Applicable protocol: mail
Status: experimental
Author/Change controller:
Kari Hurtta
hurtta-ietf@elmme-mailer.org
Specification document(s): RFC XXXX
Related information:
I18N-Received is used together with multipart/utf8-encapsulated
media type.
4. Encapsulation format
The "multipart/utf8-encapsulated" media type splits an
internationalized email message [ietf-eai-utf8headers] or a MIME body
part into two parts:
o The header part of the internationalized email message, or the
header part of the MIME body part, is put into the first body part
of the "multipart/utf8-encapsulated" media type. The media type of
the first body part is "text/utf8-header".
o The body part of the internationalized email message, or the body
part of the MIME body part, is put into the second body part of the
"multipart/utf8-encapsulated" media type. The media type of the
second body part is the same as the media type of the original
internationalized email message or the original MIME body part.
However, if the media type of the original internationalized email
message or the original MIME body part was "multipart/signed", the
media type of the second body part is "multipart/mixed". In some
cases the media type is "application/octet-stream".
NOTE: This encapsulation assumes that the "preamble" and "epilogue"
areas of multipart media types include only ASCII. If these areas
include UTF-8 text, that text is lost if the encapsulating
"multipart/utf8-encapsulated" entity is converted to an ASCII-
compatible format (i.e. during 8BITMIME downgrading [RFC1652]).
The loss of UTF-8 text in the "preamble" and "epilogue" areas of
multipart media types could be solved by adding a third and a fourth
body part to the "multipart/utf8-encapsulated" media type.
However, the author believes that this would unnecessarily
complicate the encapsulation format and algorithm. The author
assumes that messages which use signing do not put UTF-8 text in the
"preamble" and "epilogue" areas of multipart media types. If the
message is not signed, lost "preamble" and "epilogue" areas do not
cause harm.
4.1. "multipart/utf8-encapsulated" media type
The "multipart/utf8-encapsulated" media type can be used in three
different roles. The "type" parameter is defined for the
"multipart/utf8-encapsulated" media type. The value of the "type"
parameter is defined as follows:
type-value = "encapsulated" / "message" / "part"
o The value "encapsulated" is used when the "multipart/
utf8-encapsulated" media type is used as the downgrading format of
internationalized email. The value of "type" is set to
"encapsulated" when internationalized email is downgraded because
of a "Downgrade-Method: encapsulate" header field.
o The value "message" is used when the "multipart/utf8-encapsulated"
media type serves the same purpose for internationalized email that
the "message/rfc822" media type serves for non-EAI content. The
value of "type" is set to "message" when internationalized email is
attached to or included in a message. Roughly, "multipart/
utf8-encapsulated; type=message" is equivalent to "message/rfc822",
except that the format of the attachment is different.
o The value "part" is used when the "multipart/utf8-encapsulated"
media type is used as the downgrading format of a MIME body part.
The value of "type" is set to "part" when the MIME structure of an
internationalized email or MIME body part is recursively downgraded
and a MIME body part with a UTF-8 header is found.
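The mapping from role to "type" value can be written down as a small table. The sketch below is only illustrative; the role names "downgrade", "attach" and "recursive-part" and the function name encapsulated_content_type are hypothetical labels introduced here, not terms of this specification.

```python
# Hypothetical role names mapped to the three type-value alternatives.
TYPE_FOR_ROLE = {
    "downgrade": "encapsulated",   # whole message downgraded
    "attach": "message",           # message attached or included
    "recursive-part": "part",      # MIME body part with UTF-8 header
}


def encapsulated_content_type(role: str, boundary: str) -> str:
    """Build a Content-Type value for multipart/utf8-encapsulated."""
    # The sketch only accepts boundaries that can be quoted without escaping.
    if not boundary.isascii() or '"' in boundary or "\\" in boundary:
        raise ValueError("boundary must be plain ASCII without quotes")
    return 'multipart/utf8-encapsulated; type=%s; boundary="%s"' % (
        TYPE_FOR_ROLE[role], boundary)
```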
4.2. Registration of media type multipart/utf8-encapsulated
This section provides the media type registration application (as per
[RFC4288]).
Type name: multipart
Subtype name: utf8-encapsulated
Required parameters:
The "boundary" parameter is required as per RFC 2046.
The "type" parameter is required as per RFC XXXX.
Optional parameters:
Encoding considerations: 8bit or binary
Security considerations:
This media type provides a method to encapsulate mail data. In
particular, this media type provides a method to smuggle mail header
fields so that mail scanners do not see them. This may cause new
security threats.
This encapsulation does not hide original MIME parts. However,
original MIME structure may be obscured. This may provide a
method to smuggle MIME parts so that mail scanners do not see
them. This may cause new security threats.
This encapsulation preserves only "Received" header fields from
encapsulating message. This may hide information when
encapsulated message is upgraded to internationalized email
format.
Interoperability considerations:
This media type provides a method to encapsulate internationalized
email. The recipient of an encapsulated email must decode the
encapsulation before the email is fully accessible. However, the
original MIME parts are not hidden from mail agents which do not
know the encapsulation used by this media type.
Published specification: RFC XXXX
Applications that use this media type:
Internationalized mail user agents (MUAs), mail transport agents
(MTAs) and IMAP servers.
Additional information:
Magic number(s):
File extension(s):
Macintosh file type code(s):
Person & email address to contact for further information:
Kari Hurtta
hurtta-ietf@elmme-mailer.org
Intended usage: common
Restrictions on usage:
Author: Kari Hurtta
Change controller: Kari Hurtta
4.3. Registration of media type text/utf8-header
This section provides the media type registration application (as per
[RFC4288]).
Type name: text
Subtype name: utf8-header
Required parameters:
The "charset" parameter with value "UTF-8" is required if a UTF-8
header is in fact encapsulated.
Optional parameters:
charset
Encoding considerations: 7bit or 8bit
"8bit" is used if a UTF-8 header is in fact encapsulated.
Security considerations:
This media type provides a method to encapsulate mail data. In
particular, this media type provides a method to smuggle mail header
fields so that mail scanners do not see them. This may cause new
security threats.
Interoperability considerations:
Mail agents which do not know this media type treat it as the
"text/plain" media type.
Published specification: RFC XXXX
Applications that use this media type:
Internationalized mail user agents (MUAs), mail transport agents
(MTAs) and IMAP servers.
Additional information:
In some cases an ASCII header part is encapsulated instead of a UTF-8
header part.
Magic number(s):
File extension(s):
Macintosh file type code(s):
Person & email address to contact for further information:
Kari Hurtta
hurtta-ietf@elmme-mailer.org
Intended usage: common
Restrictions on usage:
Author: Kari Hurtta
Change controller: Kari Hurtta
5. Encapsulation
On encapsulation, the internationalized email message or MIME body
part is split into the two MIME body parts of the "multipart/
utf8-encapsulated" media type.
There are three cases of encapsulation:
o When an internationalized email message is downgraded.
o When an internationalized email message is attached to or included
in a message.
o When a MIME body part is encapsulated because there is UTF-8 text
in its header. This is the recursive part of the algorithm.
"Generic encapsulation" (Section 5.1) describes the parts common to
these three encapsulations.
5.1. Generic encapsulation
Both an email message and a MIME body part follow the same general
syntax:
o Both have a header part and a body part.
o The header part is followed by the body part, and these are
separated by an empty line.
The term "entity" refers to both an email message and a MIME body
part.
NOTE: Subtypes of "message" (i.e. media type is message/*) other
than "message/rfc822", and composite types other than "multipart"
or "message", are treated specially. This is done so that this
algorithm is stable and its result does not change when new subtypes
of "message" are registered or new composite top-level types are
standardised. Unknown top-level types are treated the same way
because it is not possible to know whether the top-level type is
composite. In these cases the type is also treated as "application/
octet-stream" if the body part of the original entity includes non-
ASCII characters. Type information is not lost, because the whole
header part of the original entity is stored in the body of the first
MIME body part of the encapsulating entity. Processing as
"application/octet-stream" is done because the algorithm does not
know how to encapsulate the entity as a composite type. If the body
part of the original entity includes only ASCII characters, it cannot
contain UTF-8 headers (even when it is treated as a composite type).
NOTE: This algorithm looks complex. The complexity results from the
requirement that the so-called "nested encoding rule" is not
violated. Because of this requirement, composite media types must
be processed recursively.
Special handling of unknown composite types that include non-ASCII
characters as "application/octet-stream" means that the "nested
encoding rule" is violated. In the case of unknown types this is
unavoidable, because it is not possible to parse the internal
structure of unknown types.
The encapsulating entity is generated from the original
internationalized email message or MIME body part in the following
way:
o A new header part for the encapsulating entity is generated.
* The media type of the encapsulating entity is "multipart/
utf8-encapsulated".
o A new body part for the encapsulating entity is generated. This
body part consists of two MIME body parts.
* The media type of the first MIME body part is "text/utf8-header".
+ The value of the "charset" parameter is "UTF-8" if the header
part of the original entity includes UTF-8 characters.
+ NOTE: This seems strange, but in certain cases an ASCII-only
header part is also encapsulated. In that case the "charset"
parameter is not required.
* The body part of the first MIME body part is the header part of
the original entity (the original internationalized email message
or MIME body part).
* It is strongly recommended that the body of the first MIME body
part is base64 encoded (and that the "Content-Transfer-Encoding"
header field is updated correspondingly).
* Generation of the second MIME body part is described below.
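The first MIME body part can be produced with a few lines of code. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function name build_header_part is hypothetical, the original header is assumed to be available as raw bytes, and the base64 body is wrapped at 76 characters because that is what the Python standard library does.

```python
import base64


def build_header_part(original_header: bytes) -> bytes:
    """Return a text/utf8-header body part that carries the original header.

    The "charset" parameter is added only when the header is not pure
    ASCII; the body is always base64 encoded, as recommended.
    """
    try:
        original_header.decode("ascii")
        content_type = b"Content-Type: text/utf8-header"
    except UnicodeDecodeError:
        content_type = b"Content-Type: text/utf8-header; charset=UTF-8"
    # encodebytes() ends every output line with "\n"; MIME needs CRLF.
    body = base64.encodebytes(original_header).replace(b"\n", b"\r\n")
    return (content_type + b"\r\n"
            + b"Content-Transfer-Encoding: base64\r\n"
            + b"\r\n"
            + body)
```

Decoding the base64 body returns the original header bytes unchanged, which is what later recovery of the original entity relies on.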
NOTE: The Unix mailbox format changes "From" at the beginning of a
line to ">From". Therefore it is useful to encode "text/utf8-header"
with base64 even when it includes only ASCII header fields.
Actually it is more common to replace "From " with ">From ". This
does not touch the "From" header field (if there is no space between
"From" and ":").
The second MIME body part of the encapsulating entity is generated in
the following way:
1. The original entity is checked for the following cases:
* The media type value (type/subtype) of the original entity
includes characters other than ASCII characters [ASCII]. This is
an error condition.
* The media type of the original entity is a subtype of "message"
(i.e. media type is message/*), it is not "message/rfc822", and
the body part of the original entity includes non-ASCII
characters.
* The top-level type of the original entity is unknown, the body
part of the original entity includes non-ASCII characters, and
the encoding of the original entity is identity (i.e. "Content-
Transfer-Encoding" is "8bit" or "binary").
* The top-level type of the original entity is a composite type
other than "multipart" or "message" and the body part of the
original entity includes non-ASCII characters.
* The media type of the original entity is a subtype of "multipart"
(i.e. media type is multipart/*) and the "boundary" parameter is
missing. This is an error condition.
If one of these cases is found,
* The media type of the second MIME body part is "application/
octet-stream".
* The body part of the second MIME body part is the body part of the
original entity.
* The "Content-Transfer-Encoding" value for the second MIME body
part is copied from the original entity if it includes only ASCII
characters. Otherwise it is set to "7bit", "8bit" or "binary"
as appropriate. A non-ASCII value is an error condition.
2. Otherwise, if the media type of the original entity is "multipart/
signed", then
* The media type of the second MIME body part is "multipart/mixed".
+ Generation of the "boundary" parameter is described below.
* The body part of the second MIME body part is the "Composite
encapsulated body part". Generation of this is described below.
* "Content-Transfer-Encoding" is set to "7bit", "8bit" or
"binary" as appropriate.
+ If the "Content-Transfer-Encoding" value of the original entity
is other than "7bit", "8bit" or "binary", this is an error
condition.
3. Otherwise the original entity is checked for the following cases:
* The top-level type of the original entity is a type other than
"multipart" or "message".
+ This includes all discrete media types.
+ This includes all unknown top-level types.
* The media type of the original entity is a subtype of "message"
(i.e. media type is message/*) and it is not "message/rfc822".
If one of these cases is found,
* The media type of the second MIME body part is the same as the
media type of the original entity.
+ Copying of media type parameters from the original entity to
the second MIME body part is described below.
* The body part of the second MIME body part is the body part of
the original entity.
* The "Content-Transfer-Encoding" value for the second MIME body
part is copied from the original entity if it includes only ASCII
characters. Otherwise it is set to "7bit", "8bit" or "binary"
as appropriate. A non-ASCII value is an error condition.
4. Otherwise, if the original entity is of a composite type
("multipart" or "message/rfc822"),
* The media type of the second MIME body part is the same as the
media type of the original entity.
+ Copying of media type parameters from the original entity to
the second MIME body part is described below.
* Generation of the body part for the second MIME body part is
handled specially when the media type of the original entity is
composite.
+ The body of the original entity is scanned when the entity is
composite.
+ Generally this means that processing is recursive.
+ The body part of the second MIME body part is called the
"composite encapsulated body part" if the media type of the
original entity is composite. Generation of this body part
is described below.
* "Content-Transfer-Encoding" is set to "7bit", "8bit" or
"binary" as appropriate.
+ If the "Content-Transfer-Encoding" value of the original entity
is other than "7bit", "8bit" or "binary", this is an error
condition.
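The four steps above amount to a classification of the original entity. The sketch below illustrates only the choice of media type for the second MIME body part. It is an assumption-laden illustration: the function name second_part_media_type and its arguments are hypothetical, the set of known discrete top-level types is taken from RFC 2046, and because RFC 2046 defines no composite top-level types other than "multipart" and "message", that case is folded into the handling of unknown top-level types.

```python
from typing import Tuple

# Discrete top-level types defined by RFC 2046 (assumed "known" here).
KNOWN_DISCRETE = {"text", "image", "audio", "video", "application"}


def second_part_media_type(media_type: str, body: bytes, cte: str = "7bit",
                           has_boundary: bool = True) -> Tuple[int, str]:
    """Return (step number, media type of the second MIME body part)."""
    if not media_type.isascii():
        return 1, "application/octet-stream"      # error condition
    top, _, sub = media_type.lower().partition("/")
    non_ascii_body = any(octet > 127 for octet in body)
    identity = cte.lower() in ("8bit", "binary")
    other_message = top == "message" and sub != "rfc822"
    unknown_top = top not in KNOWN_DISCRETE and top not in ("multipart", "message")
    if ((other_message and non_ascii_body)
            or (unknown_top and non_ascii_body and identity)
            or (top == "multipart" and not has_boundary)):
        return 1, "application/octet-stream"
    if (top, sub) == ("multipart", "signed"):
        return 2, "multipart/mixed"               # hides the signature
    if top not in ("multipart", "message") or other_message:
        return 3, media_type                      # body is copied as is
    return 4, media_type                          # composite, recursive
```

Steps 2 and 4 additionally require the "Composite encapsulated body part" to be generated; the sketch does not cover that.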
Media type parameters are copied from the original entity to the
second MIME body part in the following way:
o This copying is done when the media type of the second MIME body
part is the same as the media type of the original entity.
o ASCII parameters are copied (however, see the special note about
"boundary" below).
o UTF-8 comments are removed.
o Parameters which have a UTF-8 value are encoded according to
[RFC2231] when copied.
* If the required parameters of the media type are known and a
parameter is not required for the media type, it need not be
copied (and encoded according to [RFC2231]).
o If a parameter name has UTF-8 characters, this is an error
condition and the parameter is not copied.
o If the "boundary" parameter value of a multipart media type has
UTF-8 characters, it is handled specially. This is described
below.
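The copying rules can be sketched with the RFC 2231 helper from the Python standard library. This is a sketch only: the function name copy_parameters is hypothetical, the parameters are assumed to be already parsed into a dictionary of name and value strings, and removal of comments and the optional dropping of non-required parameters are not modelled.

```python
from email.utils import encode_rfc2231
from typing import Dict


def copy_parameters(params: Dict[str, str]) -> Dict[str, str]:
    """Copy media type parameters for the second MIME body part."""
    copied = {}
    for name, value in params.items():
        if not name.isascii():
            continue      # error condition: the parameter is not copied
        if name.lower() == "boundary" and not value.isascii():
            continue      # a new ASCII boundary is selected elsewhere
        if value.isascii():
            copied[name] = value
        else:
            # RFC 2231 extended syntax: name*=charset'language'value
            copied[name + "*"] = encode_rfc2231(value, "utf-8")
    return copied
```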
The "Composite encapsulated body part" is generated in the following
way:
o If "application/octet-stream" was assigned as the media type of the
second MIME body part, the body part of the original entity is the
resulting "Composite encapsulated body part". This case is
mentioned above.
o If the media type of the original entity is a subtype of "message"
(i.e. media type is message/*) and it is not "message/rfc822", the
body part of the original entity is the resulting "Composite
encapsulated body part". This case is mentioned above.
o If the media type of the original entity is "message/rfc822", the
body part of the original entity is parsed (into header and body
part) and processed as described in "Encapsulation of recursive
part" (Section 5.1.1). The result of processing is the "Composite
encapsulated body part".
o If the media type of the original entity is a subtype of
"multipart" (i.e. media type is multipart/*), the body part of the
original entity is processed as described below. The result of
processing is the "Composite encapsulated body part".
o If the top-level type of the original entity is a composite type
other than "multipart" or "message", it is treated as an unknown
type. This processing is described above.
For multipart types the "Composite encapsulated body part" is
generated as follows:
1. The "boundary" parameter value from the original entity is
remembered.
* Handling of a missing "boundary" parameter is described above.
2. A "boundary" parameter value for the second MIME body part is
selected.
* The selected "boundary" parameter value must include only ASCII
characters.
* In general this can be the same as the "boundary" parameter
value from the original entity.
* If the "boundary" parameter value from the original entity
includes UTF-8 characters, a new ASCII-only value must be
selected.
3. The "preamble" area from the body of the original entity is copied
to the "Composite encapsulated body part".
* If the "preamble" area includes non-ASCII characters, this is an
error condition.
4. The body parts of the multipart (from the body of the original
entity) are handled:
1. A boundary delimiter line is copied to the "Composite
encapsulated body part", but in such a way that the boundary of
the original entity is replaced with the selected boundary of
the second MIME body part.
2. A body part is parsed (into header and body part) and is
processed as described in "Encapsulation of recursive part"
(Section 5.1.1). The result is copied to the "Composite
encapsulated body part".
5. The final boundary delimiter line is copied (from the body of the
original entity) to the "Composite encapsulated body part", but in
such a way that the boundary of the original entity is replaced
with the selected boundary of the second MIME body part.
* A final boundary delimiter line is not generated in the
"Composite encapsulated body part" if the final boundary
delimiter line is missing from the original entity. This is an
error condition.
6. The "epilogue" area from the body of the original entity is copied
to the "Composite encapsulated body part".
* If the "epilogue" area includes non-ASCII characters, this is an
error condition.
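The steps for multipart types can be sketched as one pass over the lines of the multipart body. The following is a minimal sketch under stated assumptions: the function name rewrite_multipart is hypothetical, lines are assumed to end with CRLF, the process_part callback stands in for "Encapsulation of recursive part" (Section 5.1.1), and raising an exception is simply this sketch's way of reporting the error conditions. The sketch does not check that the selected boundary is absent from the body parts.

```python
from typing import Callable


def rewrite_multipart(body: bytes, old: bytes, new: bytes,
                      process_part: Callable[[bytes], bytes] = lambda p: p) -> bytes:
    """Copy a multipart body, replacing boundary `old` with `new`."""
    delim, close = b"--" + old, b"--" + old + b"--"
    out, part, state = [], [], "preamble"
    for line in body.split(b"\r\n"):
        stripped = line.rstrip(b" \t")
        if state != "epilogue" and stripped in (delim, close):
            if state == "part":
                # The CRLF before the delimiter belongs to the boundary,
                # so it is not handed to the recursive step.
                out.append(process_part(b"\r\n".join(part)))
                part = []
            is_final = stripped == close
            out.append(b"--" + new + (b"--" if is_final else b""))
            state = "epilogue" if is_final else "part"
        elif state == "part":
            part.append(line)
        else:
            if not line.isascii():
                raise ValueError("non-ASCII text in preamble or epilogue")
            out.append(line)
    if state != "epilogue":
        raise ValueError("final boundary delimiter line is missing")
    return b"\r\n".join(out)
```

Because the body is split and joined on CRLF, every CRLF of the input, including the one that precedes each boundary delimiter line, appears exactly once in the output.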
NOTE: The CRLF preceding the boundary delimiter line is conceptually
attached to the boundary (as per [RFC2046]). That CRLF is not part
of the body part of the multipart. If encapsulation and decoding of
the encapsulation process this CRLF in different ways, the
encapsulation does not preserve all CRLFs or adds extra CRLFs.
NOTE: If the original "Content-Transfer-Encoding" includes non-ASCII
characters, this algorithm is not able to decode the resulting
encapsulation. Therefore it is recommended that the
internationalized email message is bounced or rejected on that error
condition.
5.1.1. Encapsulation of recursive part
Encapsulation of a recursive part is done in the following way:
1. If the header part of the recursive part includes UTF-8 characters
or if the media type of the recursive part is "multipart/signed", then
* The recursive part is considered to be the "original entity" and
"Generic encapsulation" (Section 5.1) is applied.
* The value of the "type" parameter is set to "part" for the
resulting "multipart/utf8-encapsulated" encapsulating entity.
* The resulting encapsulating entity is the result of
"Encapsulation of recursive part".
2. Otherwise, if the media type of the recursive part is discrete,
the result of "Encapsulation of recursive part" is the recursive
part itself.
3. Otherwise, if the media type of the recursive part is "message/
rfc822", then
* The header part of the result of "Encapsulation of recursive
part" is the header part of the recursive part.
* The body part of the recursive part is parsed (into header and
body part) and is processed as described in "Encapsulation of
recursive part" (Section 5.1.1). The body part of the result of
"Encapsulation of recursive part" is the result of this
processing.
4. Otherwise, if the top-level type is multipart (i.e. media type is
multipart/*) and the "boundary" parameter exists, handling is
described below.
* A missing "boundary" parameter on multipart types is an error
condition.
5. Otherwise, if the recursive part is ASCII only (body is ASCII,
i.e. "Content-Transfer-Encoding" is "7bit"), the result of
"Encapsulation of recursive part" is the recursive part itself.
6. Otherwise, if the encoding of the recursive part is not identity
(i.e. "Content-Transfer-Encoding" is not "8bit" or "binary"), the
result of "Encapsulation of recursive part" is the recursive part
itself.
7. Otherwise
* The recursive part is considered to be the "original entity" and
"Generic encapsulation" (Section 5.1) is applied.
* For the resulting "multipart/utf8-encapsulated" encapsulating
entity, the "type" parameter is set to "part".
* The resulting encapsulating entity is the result of
"Encapsulation of recursive part".
* NOTE: This seems strange, but unknown composite media types
are always encapsulated if there is a possibility that they
include embedded UTF-8 headers.
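The seven steps above form a decision chain that can be sketched as a single function. This sketch only names the action to take; it does not perform it. The function name recursive_part_action and the action labels are hypothetical, the list of discrete top-level types is taken from RFC 2046, and a multipart with a missing "boundary" parameter simply falls through to the later steps, which is one possible reading of step 4.

```python
# Discrete top-level types defined by RFC 2046.
DISCRETE = {"text", "image", "audio", "video", "application"}


def recursive_part_action(header: bytes, media_type: str, cte: str = "7bit",
                          has_boundary: bool = True) -> str:
    """Name the action the numbered steps select for a recursive part."""
    top, _, sub = media_type.lower().partition("/")
    encoding = cte.lower()
    if not header.isascii() or (top, sub) == ("multipart", "signed"):
        return "encapsulate"            # step 1: type=part
    if top in DISCRETE:
        return "keep"                   # step 2
    if (top, sub) == ("message", "rfc822"):
        return "recurse-message"        # step 3
    if top == "multipart" and has_boundary:
        return "recurse-multipart"      # step 4
    if encoding == "7bit":
        return "keep"                   # step 5: ASCII only
    if encoding not in ("8bit", "binary"):
        return "keep"                   # step 6: not identity encoding
    return "encapsulate"                # step 7: type=part
```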
If the top-level type is multipart, the result of "Encapsulation of
recursive part" is generated in the following way:
1. The header part of the "Encapsulation of recursive part" result
is the header part of the recursive part.
2. The "boundary" parameter value from the recursive part is
remembered.
* Handling of a missing "boundary" parameter is described above.
3. The body part of the "Encapsulation of recursive part" result is
initialised.
4. A "preamble" area from body of recursive part is copied to body
part for "Encapsulation on recursive part" result.
* If a "preamble" area includes non-ASCII characters, this is an
error condition.
5. Body parts of the multipart (from the body of the recursive part)
are handled:
1. A boundary delimiter line is copied to the body part for the
"Encapsulation on recursive part" result.
2. A body part is parsed (into header and body part) and is
processed as described in "Encapsulation on recursive part"
(Section 5.1.1). The result is copied to the body part for the
"Encapsulation on recursive part" result.
6. The final boundary delimiter line is copied (from the body of the
recursive part) to the body part for the "Encapsulation on recursive
part" result.
* A final boundary delimiter line is not generated to the body
part for the "Encapsulation on recursive part" result if the final
boundary delimiter line is missing from the recursive part. This
is an error condition.
7. The "epilogue" area from the body of the recursive part is copied
to the body part for the "Encapsulation on recursive part" result.
* If the "epilogue" area includes non-ASCII characters, this is
an error condition.
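The multipart steps above can be illustrated with a small sketch. It
is not part of this specification: the function names
split_multipart and encapsulation_errors are hypothetical, the body
is assumed to be already available as text, and recursive processing
of each body part is left out.

```python
from typing import List, Tuple


def split_multipart(body: str, boundary: str) -> Tuple[str, List[str], str, bool]:
    """Split a multipart body into preamble, body parts, epilogue and a
    flag telling whether the final boundary delimiter line was present."""
    delimiter = "--" + boundary
    final = delimiter + "--"
    preamble: List[str] = []
    parts: List[List[str]] = []
    epilogue: List[str] = []
    current = preamble
    has_final = False
    for line in body.split("\n"):
        stripped = line.rstrip()  # tolerate CR and trailing padding
        if has_final:
            epilogue.append(line)
        elif stripped == final:
            has_final = True
        elif stripped == delimiter:
            current = []
            parts.append(current)
        else:
            current.append(line)
    return ("\n".join(preamble), ["\n".join(p) for p in parts],
            "\n".join(epilogue), has_final)


def encapsulation_errors(preamble: str, epilogue: str, has_final: bool) -> List[str]:
    """List the error conditions named in steps 4, 6 and 7."""
    errors = []
    if not preamble.isascii():
        errors.append("non-ASCII preamble")
    if not has_final:
        errors.append("missing final boundary delimiter line")
    if not epilogue.isascii():
        errors.append("non-ASCII epilogue")
    return errors
```

The sketch only reports error conditions. Whether the caller bounces
the message, rejects it, or continues with a warning is a separate
decision.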
5.1.2. Encapsulation example
An encapsulation example
Original internationalized entity:
==========================================
Some-Header: { UTF-8 content }
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
==========================================
Encapsulated entity:
==========================================
Content-Type: multipart/utf8-encapsulated;
type={specified later}; boundary="12345"
Content-Transfer-Encoding: 8bit
--12345
Content-Type: text/utf8-header; charset=UTF-8
Content-Transfer-Encoding: 8bit
Some-Header: { UTF-8 content }
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
--12345
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
--12345--
==========================================
The empty line at the end of the "text/utf8-header" body is not copied
from the original (encapsulated) headers. It is part of the next
boundary delimiter line of "multipart/utf8-encapsulated".
NOTE: In this example the "text/utf8-header" part is not base64
encoded for clarity. Base64 encoding is recommended.
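The example above can be reproduced with a short sketch. It is only
an illustration under stated assumptions: the function name
encapsulate_discrete is hypothetical, the original entity is assumed
to have a known discrete media type, and the "text/utf8-header" part
is always labelled UTF-8 and base64 encoded (as recommended), even
when the header part is ASCII only.

```python
import base64
from typing import List


def encapsulate_discrete(header_fields: List[str], body: str,
                         boundary: str, type_param: str) -> str:
    """Build a "multipart/utf8-encapsulated" entity from the header
    fields (without line terminators) and body of a discrete entity."""
    # The terminating empty line of the header part is not copied.
    header_text = "".join(field + "\r\n" for field in header_fields)
    encoded = base64.encodebytes(header_text.encode("utf-8")).decode("ascii")
    encoded = encoded.strip().replace("\n", "\r\n")
    # Content fields of the original entity label the second body part.
    content_fields = [
        field for field in header_fields
        if field.lower().startswith(("content-type:",
                                     "content-transfer-encoding:"))
    ]
    lines = [
        "Content-Type: multipart/utf8-encapsulated;",
        ' type=%s; boundary="%s"' % (type_param, boundary),
        "Content-Transfer-Encoding: 8bit",
        "",
        "--" + boundary,
        "Content-Type: text/utf8-header; charset=UTF-8",
        "Content-Transfer-Encoding: base64",
        "",
        encoded,
        "--" + boundary,
        *content_fields,
        "",
        body,
        "--" + boundary + "--",
        "",
    ]
    return "\r\n".join(lines)
```

The caller supplies the boundary and is assumed to have checked that
it does not occur in the body.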
5.1.3. Multipart encapsulation example
A multipart encapsulation example
Original internationalized entity:
==========================================
Content-Type: Multipart/mixed; boundary=12345
Content-Transfer-Encoding: 8bit
--12345
Some-Header: { UTF-8 content }
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
--12345--
==========================================
Encapsulated entity:
==========================================
Content-Type: multipart/utf8-encapsulated;
type={specified later}; boundary="67890"
Content-Transfer-Encoding: 8bit
--67890
Content-Type: text/utf8-header; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Type: Multipart/mixed; boundary=12345
Content-Transfer-Encoding: 8bit
--67890
Content-Type: Multipart/mixed; boundary=12345
Content-Transfer-Encoding: 8bit
--12345
Content-Type: multipart/utf8-encapsulated;
type=part; boundary="abcde"
Content-Transfer-Encoding: 8bit
--abcde
Content-Type: text/utf8-header; charset=UTF-8
Content-Transfer-Encoding: 8bit
Some-Header: { UTF-8 content }
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
--abcde
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
--abcde--
--12345--
--67890--
==========================================
"Generic encapsulation" always encapsulates the top level header part,
even when it is US-ASCII only. In general it is assumed that
internationalized email always has some header fields which require
this encapsulation.
5.1.4. Unknown top level type encapsulation example #1
An encapsulation example for unknown top level type with 7-bit body
Original internationalized entity:
==========================================
Some-Header: { UTF-8 content }
Content-Type: X-message8/plain
Content-Transfer-Encoding: 7bit
{ ASCII text }
==========================================
Encapsulated entity:
==========================================
Content-Type: multipart/utf8-encapsulated;
type={specified later}; boundary="12345"
Content-Transfer-Encoding: 8bit
--12345
Content-Type: text/utf8-header; charset=UTF-8
Content-Transfer-Encoding: 8bit
Some-Header: { UTF-8 content }
Content-Type: X-message8/plain
Content-Transfer-Encoding: 7bit
--12345
Content-Type: X-message8/plain
Content-Transfer-Encoding: 7bit
{ ASCII text }
--12345--
==========================================
5.1.5. Unknown top level type encapsulation example #2
An encapsulation example for unknown top level type with 8-bit body
Original internationalized entity:
==========================================
Some-Header: { UTF-8 content }
Content-Type: X-message8/plain
Content-Transfer-Encoding: 8bit
{ non-ASCII text }
==========================================
Encapsulated entity:
==========================================
Content-Type: multipart/utf8-encapsulated;
type={specified later}; boundary="12345"
Content-Transfer-Encoding: 8bit
--12345
Content-Type: text/utf8-header; charset=UTF-8
Content-Transfer-Encoding: 8bit
Some-Header: { UTF-8 content }
Content-Type: X-message8/plain
Content-Transfer-Encoding: 8bit
--12345
Content-Type: application/octet-stream
Content-Transfer-Encoding: 8bit
{ non-ASCII text }
--12345--
==========================================
5.2. Downgrading of internationalized email message
When an internationalized email [ietf-eai-utf8headers] leaves an EAI
compliant environment, downgrade is required. [ietf-eai-downgrade]
describes when downgrade occurs.
This document defines the "Downgrade-Method" header field. The
downgrading method is selected in the following way:
o If the "Downgrade-Method" header field value is "encapsulate",
downgrading of the header part (and body) of the mail is done as
described in this section.
o Otherwise all message headers (including header fields from MIME
body parts) may need to be parsed to discover whether the message is
an internationalized email and is a downgrading candidate.
* If a downgrading gateway is configured for tunneling operation
for some recipients of the mail, downgrading of the header part
(and body) of the mail for these recipients is done as described
in this section.
* If the "Downgrade-Method" header field does not exist and a
downgrading gateway is not configured for tunneling operation,
downgrading of the header part (and body) of the mail is done
according to [ietf-eai-downgrade].
* If the "Downgrade-Method" header field exists and its value is
not "encapsulate", this specification is not used for downgrading
of the header part (and body) of the mail.
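The selection rules above can be summarized in a small sketch. The
function name and the returned labels are hypothetical. The sketch
also makes two assumptions which this document does not state: the
header field value is compared case-insensitively, and the tunneling
configuration is checked before the "other value" case (following
the order of the list above).

```python
from typing import Optional


def select_downgrade(downgrade_method: Optional[str],
                     tunneling_configured: bool) -> str:
    """Pick a downgrading method.

    downgrade_method is the value of the "Downgrade-Method" header
    field, or None when the field does not exist.
    """
    if (downgrade_method is not None
            and downgrade_method.strip().lower() == "encapsulate"):
        return "encapsulate"
    if tunneling_configured:
        # Gateway tunnels mail for these recipients.
        return "encapsulate"
    if downgrade_method is None:
        return "ietf-eai-downgrade"
    # Field exists with some other value.
    return "not-this-specification"
```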
Downgrading of internationalized email is done in the following way:
o The internationalized email is considered to be the "original
entity" and "Generic encapsulation" (Section 5.1) is applied.
o For the resulting "multipart/utf8-encapsulated" encapsulating
entity, the "type" parameter is set to "encapsulated".
o The resulting encapsulating entity is the downgraded
internationalized email.
* Addition of email header fields to the downgraded
internationalized email is described below.
When mail is downgraded, some email header fields must be added.
"Generic encapsulation" (Section 5.1) does not produce these email
header fields.
o "Downgrade-Method" header field with value "Encapsulated" is
added.
o "Received" header fields are copied from the original
internationalized email under a new header field name:
"I18N-Received". If the "for" clause of a "Received" header field
includes non-ASCII characters, the clause is removed when the
"Received" header field is copied to the "I18N-Received" header
field. If a "Received" header field includes non-ASCII characters
outside the "for" clause, it is not copied.
o "Mime-Version" header field with value "1.0" is added.
o "Date" header field is copied from the original internationalized
email, if it includes only ASCII characters. Otherwise it is
generated.
o "From" header field is added. Several different values can be
used for the "From" header field:
* The "From" header field from the original internationalized
email can be used, if it includes only ASCII characters.
* The algorithm from [ietf-eai-downgrade] can be used.
* The value for the "From" header field can be taken from the
downgraded envelope sender address.
* An ASCII address which refers to the downgrading gateway can be
used.
o "Subject" header field is copied from original internationalized
email, if it includes only ASCII characters. Otherwise several
different values for "Subject" header field can be used:
* Algorithm from [ietf-eai-downgrade] or from [RFC2047] can be
used.
* ASCII subject which refers to downgrading operation, can be
used.
o If "From" and "Subject" are from original internationalized email
and "Message-ID" header field on original internationalized email
includes only ASCII characters, "Message-ID" header field is
copied (from original internationalized email). Otherwise it is
optionally generated.
o Optionally a "To" header field is added. Several different values
can be used for the "To" header field:
* The "To" header field from the original internationalized email
can be used, if it includes only ASCII characters.
* The algorithm from [ietf-eai-downgrade] can be used.
o Optionally a "Cc" header field is added. Several different values
can be used for the "Cc" header field:
* The "Cc" header field from the original internationalized email
can be used, if it includes only ASCII characters.
* The algorithm from [ietf-eai-downgrade] can be used.
o It is important NOT to copy every ASCII header field. Some header
fields may be used for signatures. If such a signature is checked
against the encapsulated form, it fails. For example, DomainKeys
Identified Mail [DKIM-Charter] uses this kind of signature.
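The rule for copying "Received" header fields to "I18N-Received" in
the list above can be sketched as follows. This is an illustration
only: the function name is hypothetical, and the regular expression
is a simplified assumption about how a "for" clause looks (the word
"for" followed by a single token that ends before whitespace or
";"), not a full parser for the "Received" syntax.

```python
import re
from typing import Optional

# Simplified "for" clause: whitespace, "for", whitespace, one token.
_FOR_CLAUSE = re.compile(r"\s+for\s+[^\s;]+", re.IGNORECASE)


def to_i18n_received(value: str) -> Optional[str]:
    """Return the value for an "I18N-Received" header field, or None
    when the "Received" header field must not be copied."""
    match = _FOR_CLAUSE.search(value)
    if match and not match.group(0).isascii():
        # Drop a non-ASCII "for" clause before copying.
        value = value[:match.start()] + value[match.end():]
    # Non-ASCII characters elsewhere mean the field is not copied.
    return value if value.isascii() else None
```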
"Encapsulation on recursive part" (Section 5.1.1) mentions several
error conditions. Although it defines the output in those cases, a
converting MTA is permitted to bounce (return an NDN) or reject (at
the SMTP level) the internationalized email message. A downgrading
MUA can refuse to downgrade the internationalized email message and
give an error message to the user, or produce the downgraded message
and give a warning message to the user. Silent operation is not
recommended when an error condition happens (on a downgrading MUA).
NOTE: Only "Date" and "From" header fields are required on email (as
per [RFC2822]).
However, the "multipart/utf8-encapsulated" format is also usable for
non-EAI compliant MUAs, assuming that they support MIME, especially
if the original internationalized email message was using UTF-8
characters only in the main header part and not in the header parts
of MIME body parts. Therefore it is useful if the "From", "Subject",
"To" and "Cc" header fields are derived from the original
internationalized email according to [ietf-eai-downgrade]. This
allows reply commands to work on non-EAI compliant MUAs.
5.2.1. Encapsulation example
An encapsulation example
Original internationalized email:
==========================================
Received: from {idn-encoded-name}
by downgrade.example.org with ESMTP
id JGR17356;
Wed, 13 Sep 2006 22:27:25 +0300
Downgrade-Method: encapsulate
From: { UTF-8 address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { UTF-8 subject }
X-Foobar: XvrT
Mime-Version: 1.0
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
==========================================
Encapsulated internationalized email:
==========================================
I18N-Received: from {idn-encoded-name}
by downgrade.example.org with ESMTP
id JGR17356;
Wed, 13 Sep 2006 22:27:25 +0300
Downgrade-Method: Encapsulated
From: { downgraded address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { RFC 2047 encoded subject }
Mime-Version: 1.0
Content-Type: multipart/utf8-encapsulated;
type=encapsulated; boundary="12345"
Content-Transfer-Encoding: 8bit
--12345
Content-Type: text/utf8-header; charset=UTF-8
Content-Transfer-Encoding: 8bit
Received: from {idn-encoded-name}
by downgrade.example.org with ESMTP
id JGR17356;
Wed, 13 Sep 2006 22:27:25 +0300
Downgrade-Method: encapsulate
From: { UTF-8 address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { UTF-8 subject }
X-Foobar: XvrT
Mime-Version: 1.0
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
--12345
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
--12345--
==========================================
NOTE: In this example the "text/utf8-header" part is not base64
encoded for clarity. Base64 encoding is recommended, especially
because the part includes a line starting with "From".
5.2.2. Multipart/signed encapsulation example
Multipart/signed [RFC1847] encapsulation example
Original internationalized email:
==========================================
Received: from {idn-encoded-name}
by downgrade.example.org with ESMTP
id JGR17356;
Wed, 13 Sep 2006 22:27:25 +0300
Downgrade-Method: encapsulate
From: { UTF-8 address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { UTF-8 subject }
X-Foobar: XvrT
Mime-Version: 1.0
Content-Type: multipart/signed;
protocol="application/XYZ-signature";
micalg="ABC"; boundary=12345
Content-Transfer-Encoding: 8bit
--12345
Content-Description: { UTF-8 description }
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
--12345
Content-Type: application/XYZ-signature
{ signature data }
--12345--
==========================================
Encapsulated internationalized email:
==========================================
I18N-Received: from {idn-encoded-name}
by downgrade.example.org with ESMTP
id JGR17356;
Wed, 13 Sep 2006 22:27:25 +0300
Downgrade-Method: Encapsulated
From: { downgraded address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { RFC 2047 encoded subject }
Mime-Version: 1.0
Content-Type: multipart/utf8-encapsulated;
type=encapsulated; boundary="45678"
Content-Transfer-Encoding: 8bit
--45678
Content-Type: text/utf8-header; charset=UTF-8
Content-Transfer-Encoding: 8bit
Received: from {idn-encoded-name}
by downgrade.example.org with ESMTP
id JGR17356;
Wed, 13 Sep 2006 22:27:25 +0300
Downgrade-Method: encapsulate
From: { UTF-8 address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { UTF-8 subject }
X-Foobar: XvrT
Mime-Version: 1.0
Content-Type: multipart/signed;
protocol="application/XYZ-signature";
micalg="ABC"; boundary=12345
Content-Transfer-Encoding: 8bit
--45678
Content-Type: multipart/mixed;
boundary=12345
Content-Transfer-Encoding: 8bit
--12345
Content-Type: multipart/utf8-encapsulated;
type=part; boundary="abcde"
Content-Transfer-Encoding: 8bit
--abcde
Content-Type: text/utf8-header; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-Description: { UTF-8 description }
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
--abcde
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
--abcde--
--12345
Content-Type: application/XYZ-signature
{ signature data }
--12345--
--45678--
==========================================
Media type "multipart/signed" is replaced with "multipart/mixed" in
the encapsulated message. This allows encapsulation of a signed
header on a MIME body part.
NOTE: Again, "text/utf8-header" should be base64 encoded. This is
not done here for clarity.
NOTE: Normally the various multipart/signed protocols define that
the body of signed content must be quoted-printable or base64
encoded if it includes 8-bit characters. If it includes 8-bit
characters, the signature is broken when the email is 8BITMIME
downgraded [RFC1652]. Note that in general this encapsulation
algorithm does not protect against breaking of the signature in
that case. In this example it may protect it, but that is a side
effect of the protection required for encapsulation of the
"Content-Description" header field.
5.3. Attaching internationalized email message
"Message/rfc822" can be used to attach internationalized email
messages in EAI compliant environments if "message/rfc822" allows
UTF-8 header fields. "Multipart/utf8-encapsulated" with "type"
parameter value "message" can be used to attach internationalized
email messages in EAI non-compliant environments.
NOTE: A "message/rfc822" entity which contains a
"multipart/utf8-encapsulated" entity with "type" parameter value
"encapsulated" also represents an attached internationalized email
message. However, the author believes that
"multipart/utf8-encapsulated" with "type" parameter value "message"
provides a useful shorthand.
NOTE: If an internationalized email was stored inside the
"message/rfc822" media type and the "message/rfc822" is inside a
MIME structure which is encapsulated, "Encapsulation on recursive
part" (Section 5.1.1) produces a "message/rfc822" entity which
contains a "multipart/utf8-encapsulated" entity with "type"
parameter value "part".
A "multipart/utf8-encapsulated" media type which represents an
internationalized email message is produced in the following way:
o The internationalized email is considered to be the "original
entity" and "Generic encapsulation" (Section 5.1) is applied.
o The value of the "type" parameter is set to "message" for the
resulting "multipart/utf8-encapsulated" encapsulating entity.
5.3.1. Attaching example
In the following example the mail from the earlier example
(Section 5.2.1) is attached to a message which is sent outside of
the EAI compliant environment.
Encapsulating message:
==========================================
From: someone@example.org
To: A@CC.example.org
Subject: { UTF-8 subject } (fwd)
Mime-Version: 1.0
Content-Type: multipart/mixed;
boundary="12345"
Content-Transfer-Encoding: 8bit
--12345
Content-Type: Text/plain
See attached message.
--12345
Content-Type: multipart/utf8-encapsulated;
type=message; boundary="67890"
Content-Transfer-Encoding: 8bit
--67890
Content-Type: text/utf8-header; charset=UTF-8
Content-Transfer-Encoding: 8bit
Downgrade-Method: encapsulate
From: { UTF-8 address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { UTF-8 subject }
X-Foobar: XvrT
Mime-Version: 1.0
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
--67890
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
--67890--
--12345--
==========================================
In this example it is assumed that the MUA knows that
A@CC.example.org does not handle UTF8SMTP messages and therefore
encapsulates the message. The recipient (A@CC.example.org) may need
a helper application for media type multipart/utf8-encapsulated,
although the message is mostly readable without a helper.
6. Decoding encapsulation
There are three cases of encapsulation:
o When an internationalized email message is tunneled through an EAI
non-compliant environment, the media type of the message is
"multipart/utf8-encapsulated" with "type" parameter value
"encapsulated". The original message is inside of that type.
o When an internationalized email message is an included or attached
message, media type "multipart/utf8-encapsulated" with "type"
parameter value "message" represents the included or attached
message.
o When a MIME body part is encapsulated, media type
"multipart/utf8-encapsulated" with "type" parameter value "part"
encapsulates the original MIME body part.
"Generic decoding" (Section 6.1) describes the common parts of
decoding these three encapsulations.
6.1. Generic decoding
On decoding, an internationalized email message or a MIME body part
is extracted from "multipart/utf8-encapsulated". Both an email
message and a MIME body part are referred to with the term "entity".
On error conditions the encapsulating entity is not decoded. Instead
the original encapsulating entity is returned.
The decoded internationalized entity is generated from the
encapsulating entity (multipart/utf8-encapsulated) in the following
way:
o If media type of an encapsulating entity is not "multipart/
utf8-encapsulated", this is an error condition.
o If number of MIME body parts on encapsulating entity is not two
(2), this is an error condition.
o If media type of first MIME body part is not "text/utf8-header",
this is an error condition.
o If value of "charset" parameter of first MIME body part is not
"UTF-8" or "US-ASCII", this is an error condition. Missing
"charset" parameter is treated as equivalent of "US-ASCII" as per
[RFC2046].
o Body of first MIME body part forms header part of decoded entity.
Encoding (as given on "content-transfer-encoding" header field) is
decoded.
o The body of the second MIME body part forms the body part of the
decoded entity. Generation of the body part of the decoded entity is
described below.
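The checks in the list above can be sketched with the Python
standard "email" package. This is an illustration, not a normative
decoder: the function name extract_header_part is hypothetical, and
the sketch returns None on every error condition instead of
returning the original encapsulating entity.

```python
import email
from typing import Optional


def extract_header_part(raw: bytes) -> Optional[bytes]:
    """Return the header part of the decoded entity (the decoded body
    of the first MIME body part), or None on an error condition."""
    msg = email.message_from_bytes(raw)
    if msg.get_content_type() != "multipart/utf8-encapsulated":
        return None
    if not msg.is_multipart():
        return None
    parts = msg.get_payload()
    if len(parts) != 2:
        return None
    first = parts[0]
    if first.get_content_type() != "text/utf8-header":
        return None
    # A missing "charset" parameter is treated as US-ASCII.
    charset = first.get_content_charset("us-ascii")
    if charset not in ("utf-8", "us-ascii"):
        return None
    # decode=True undoes the Content-Transfer-Encoding of the part.
    return first.get_payload(decode=True)
```

Generating the body of the decoded entity from the second MIME body
part is not covered by this sketch.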
The body of the decoded entity is generated in the following way:
o If both the media type of the second MIME body part is discrete and
the media type for the decoded entity (from the body of the first
MIME body part) is discrete, then
* If the encoding for the decoded entity (from the body of the first
MIME body part) is identity (i.e. "content-transfer-encoding" is
"7bit", "8bit" or "binary")
+ The body of the second MIME body part forms the body part of the
decoded entity.
+ The encoding (as given in the "content-transfer-encoding" header
field of the second MIME body part) is decoded.
* If the encoding for the decoded entity (from the body of the first
MIME body part) is the same as the encoding of the second MIME body
part,
+ The body of the second MIME body part forms the body part of the
decoded entity.
+ The encoding is not decoded.
* Otherwise this is an error condition.
o Otherwise if both top level type of second MIME body part is
"multipart" and top level type for decoded entity (from body of
first MIME body part) is "multipart", then
* Generating of body part of decoded entity is described on next
chapters ("Decoding of multipart").
o Otherwise if both media type of second MIME body part is "message/
rfc822" and media type for decoded entity (from body of first MIME
body part) is "message/rfc822", then
* A body of second MIME body part is parsed (to header and body
part) and is processed as described on "Decoding of recursive
part" (Section 6.1.1). Result is copied to body of decoded
entity.
o Otherwise if media type of second MIME body part is "application/
octet-stream", then
* If encoding for decoded entity (from body of first MIME body
part) is identity (i.e. "content-transfer-encoding" is "7bit",
"8bit" or "binary")
+ Body of second MIME body parts forms body part of decoded
entity.
+ Encoding (as given on "content-transfer-encoding" header
field on second MIME body part) is decoded.
* If the encoding for the decoded entity (from the body of the first
MIME body part) is the same as the encoding of the second MIME body
part,
+ Body of second MIME body part forms body part of decoded
entity.
+ Encoding is not decoded.
* Otherwise this is an error condition.
o Otherwise if the media type of the second MIME body part is the same
as the media type for the decoded entity (from the body of the first
MIME body part), then
* If the encoding for the decoded entity (from the body of the first
MIME body part) is identity (i.e. "content-transfer-encoding" is
"7bit", "8bit" or "binary")
+ The body of the second MIME body part forms the body part of the
decoded entity.
+ The encoding (as given in the "content-transfer-encoding" header
field of the second MIME body part) is decoded.
* If the encoding for the decoded entity (from the body of the first
MIME body part) is the same as the encoding of the second MIME body
part,
+ The body of the second MIME body part forms the body part of the
decoded entity.
+ The encoding is not decoded.
* Otherwise this is an error condition.
+ NOTE: This algorithm does not handle cases where the body part is
re-encoded (for example from quoted-printable to base64).
Reversing the re-encoding is of course possible, but it does not
necessarily give exactly the same representation.
* NOTE: This handles unknown media types. But unknown composite
media types were stored as "application/octet-stream" if they
include non-ASCII characters, so this handles mostly discrete
media types. It is possible that the generator of the
encapsulation knows that a type is discrete, but the decoder of
the encapsulation does not know it.
o Otherwise this is an error condition.
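The encoding rule which is repeated in several branches above can be
sketched as one small function. The names body_action, declared_cte
and carried_cte are hypothetical: declared_cte is the
"content-transfer-encoding" of the decoded entity (taken from the
body of the first MIME body part) and carried_cte is the encoding of
the second MIME body part. Case-insensitive comparison is an
assumption of this sketch.

```python
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}


def body_action(declared_cte: str, carried_cte: str) -> str:
    """Decide how the body of the second MIME body part is used.

    Returns "decode" when the carried encoding must be decoded,
    "keep" when the body is copied as is, and "error" otherwise.
    """
    declared = declared_cte.strip().lower()
    carried = carried_cte.strip().lower()
    if declared in IDENTITY_ENCODINGS:
        return "decode"
    if declared == carried:
        return "keep"
    return "error"
```

For example, an entity which declared "8bit" but arrives as
quoted-printable after an 8BITMIME downgrade gives "decode".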
The body of the decoded entity is generated in the following way when
the media type is multipart (both on the second MIME body part and on
the decoded entity):
1. The "boundary" parameter value from the second MIME body part is
remembered.
* If the "boundary" parameter is missing, this is an error
condition.
2. The "boundary" parameter value from the decoded entity (from the
body of the first MIME body part) is remembered. This is the new
boundary, which is used in the generated body of the decoded entity.
* If the "boundary" parameter is missing, this is an error
condition.
3. A "preamble" area from body of second MIME body part is copied to
body of decoded entity.
* It is not an error condition on decoding if "preamble" area
includes non-ASCII characters.
4. Body parts of multipart (from body of second MIME body part) are
handled:
1. A boundary delimiter line is copied to the body of the decoded
entity, with the boundary of the second MIME body part replaced
by the boundary of the decoded entity.
2. A body part is parsed (into header and body part) and is
processed as described in "Decoding of recursive part"
(Section 6.1.1). The result is copied to the body of the decoded
entity.
5. The final boundary delimiter line is copied to the body of the
decoded entity, with the boundary of the second MIME body part
replaced by the boundary of the decoded entity.
* A final boundary delimiter line is not generated to the decoded
entity if the final boundary delimiter line is missing from the
second MIME body part. This is not an error condition on
decoding.
6. An "epilogue" area from body of second MIME body part is copied
to body of decoded entity.
* It is not an error condition on decoding if "epilogue" area
includes non-ASCII characters.
NOTE: The CRLF preceding the boundary delimiter line is conceptually
attached to the boundary (as per [RFC2046]). That CRLF is not
part of body part of multipart.
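The boundary replacement in the steps above can be sketched as
follows. The function name rewrite_boundaries is hypothetical, CRLF
line endings are assumed, and the recursive decoding of each body
part (Section 6.1.1) is left out so that only the delimiter handling
is shown.

```python
def rewrite_boundaries(body: str, old: str, new: str) -> str:
    """Copy a multipart body, replacing the boundary of the second
    MIME body part (old) with the boundary of the decoded entity
    (new) on delimiter lines only."""
    out = []
    seen_final = False
    for line in body.split("\r\n"):
        stripped = line.rstrip(" \t")  # ignore trailing padding
        if seen_final:
            out.append(line)  # epilogue is copied unchanged
        elif stripped == "--" + old + "--":
            out.append("--" + new + "--")
            seen_final = True
        elif stripped == "--" + old:
            out.append("--" + new)
        else:
            out.append(line)
    return "\r\n".join(out)
```

If the final boundary delimiter line is missing from the input, the
sketch does not generate one, which matches the rule above.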
6.1.1. Decoding of recursive part
Decoding of recursive part is done in the following way:
1. If the media type of the recursive part is
"multipart/utf8-encapsulated" and the value of the "type" parameter
is "part":
1. Recursive part is considered to be "encapsulating entity" and
"Generic decoding" (Section 6.1) is applied.
2. Resulting decoded entity is result for "Decoding of recursive
part".
2. If media type of recursive part is discrete, result for "Decoding
of recursive part" is recursive part itself.
3. Otherwise if media type of recursive part is "message/rfc822",
then
* The header part of the result for "Decoding of recursive part"
is the header part of the recursive part.
* The body part of the recursive part is parsed (into header and
body part) and is processed as described in "Decoding of
recursive part" (Section 6.1.1). The body part of the result for
"Decoding of recursive part" is the result of that processing.
4. Otherwise if the top level type of the recursive part is multipart
(i.e. media type is multipart/*) and the "boundary" parameter exists,
handling of it is described below.
* A missing "boundary" parameter on multipart types is not an
error condition on decoding.
5. Otherwise if recursive part is ASCII only and encoding of
recursive part is identity (i.e. "content-transfer-encoding" is
"7bit", "8bit" or "binary") result for "Decoding of recursive
part" is recursive part itself.
6. Otherwise this is an error condition.
* NOTE: This means that a missing "boundary" parameter is an error
condition for decoding if the body is not ASCII only (or does not
use the required encoding).
* NOTE: This means that an unknown composite type is an error
condition if the body is not ASCII only (or does not use the
required encoding).
If top level type is multipart, the result for "Decoding of recursive
part" is generated in the following way:
1. The header part of the result for "Decoding of recursive part" is
the header part of the recursive part.
2. The "boundary" parameter value from the recursive part is
remembered.
* Handling of a missing "boundary" parameter is described above.
3. The body part for the "Decoding of recursive part" result is
started.
4. A "preamble" area from body of recursive part is copied to body
part for "Decoding of recursive part" result.
* It is not an error condition on decoding if "preamble" area
includes non-ASCII characters.
5. Body parts of multipart (from body of recursive part) are
handled:
1. A boundary delimiter line is copied to body part for
"Decoding of recursive part" result.
2. A body part is parsed (to header and body part) and is
processed as described on "Decoding of recursive part"
(Section 6.1.1). Result is copied to body part for "Decoding
of recursive part" result.
6. A final boundary delimiter line is copied (from body of recursive
part) to body part for "Decoding of recursive part" result.
* A final boundary delimiter line is not generated to the body
part for the "Decoding of recursive part" result if the final
boundary delimiter line is missing from the recursive part. This
is not an error condition on decoding.
7. An "epilogue" area from body of recursive part is copied to body
part for "Decoding of recursive part" result.
* It is not an error condition on decoding if "epilogue" area
includes non-ASCII characters.
6.2. Upgrading of internationalized email message
When a downgraded internationalized email enters an EAI compliant
environment, upgrade is allowed. [ietf-eai-downgrade] describes when
upgrade occurs.
This document defines "Encapsulated" value to "Downgrade-Method"
header field. "Header-Type" header field defines how upgrade occurs.
o If the header field "Downgraded" exists, upgrading of the header
part (and body) of the mail is done according to
[ietf-eai-downgrade].
o If the header field "Downgrade-Method" exists with the value
"Encapsulated", upgrading of the header part (and body) of the mail
is done as described in this section.
o If both header fields "Downgraded" and "Downgrade-Method" exist,
this is an error condition and upgrading is not done.
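The three rules above can be sketched as a small selection function.
The function name and the returned labels are hypothetical, header
fields are assumed to be available as (name, value) pairs, and
case-insensitive comparison of the field value is an assumption of
this sketch.

```python
from typing import List, Tuple


def select_upgrade(headers: List[Tuple[str, str]]) -> str:
    """Pick an upgrading method from the top level header fields."""
    fields = {name.lower(): value for name, value in headers}
    has_downgraded = "downgraded" in fields
    method = fields.get("downgrade-method")
    if has_downgraded and method is not None:
        return "error"  # both fields exist: upgrading is not done
    if has_downgraded:
        return "ietf-eai-downgrade"
    if method is not None and method.strip().lower() == "encapsulated":
        return "decapsulate"  # upgrade as described in this section
    return "none"
```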
Encapsulating entity is not decoded on error conditions. Instead
original encapsulating entity is returned.
Upgrading of internationalized email is done following way:
o If the media type of the downgraded internationalized email is not
"multipart/utf8-encapsulated" or if the "type" parameter does not
have the value "encapsulated", this is an error condition.
o Downgraded internationalized email is considered to be
"encapsulating entity" and "Generic decoding" (Section 6.1) is
applied.
o The resulting decoded internationalized entity is the upgraded
internationalized email.
o "Received" header fields from the downgraded internationalized
email are prepended to the upgraded internationalized email.
* The upgraded internationalized email already includes all
original header fields. This step adds the trace header fields
which were inserted into the mail after it was downgraded. It does
not re-add trace header fields which were added before
downgrading, because they are renamed to "I18N-Received" in the
downgraded internationalized email.
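The last step can be sketched as follows. The function name
upgrade_headers is hypothetical, and header fields are assumed to be
available as ordered (name, value) pairs.

```python
from typing import List, Tuple

HeaderList = List[Tuple[str, str]]


def upgrade_headers(outer: HeaderList, decoded: HeaderList) -> HeaderList:
    """Prepend the "Received" header fields of the downgraded email
    (outer) to the header fields of the decoded entity (decoded)."""
    # "I18N-Received" fields are skipped: the decoded entity already
    # carries them under their original "Received" name.
    trace = [(name, value) for name, value in outer
             if name.lower() == "received"]
    return trace + list(decoded)
```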
6.2.1. Upgrading example
This upgrading example uses the mail from the earlier example
(Section 5.2.1). The mail is assumed to have been 8BITMIME downgraded
afterwards. That process also added some extra header fields to the
MIME parts.
Downgraded internationalized email:
==========================================
Received: from fw.example.org
by upgrade.example.org with ESMTP
id JAX77356;
Wed, 13 Sep 2006 22:27:32 +0300
Received: from downgrade.example.org
by fw.example.org with ESMTP
id JAX77356;
Wed, 13 Sep 2006 22:27:29 +0300
I18N-Received: from {idn-encoded-name}
by downgrade.example.org with ESMTP
id JGR17356;
Wed, 13 Sep 2006 22:27:25 +0300
Downgrade-Method: Encapsulated
From: { downgraded address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { RFC 2047 encoded subject }
Mime-Version: 1.0
Content-Type: multipart/utf8-encapsulated;
type=encapsulated; boundary="12345"
Content-Transfer-Encoding: 7bit
--12345
Content-Type: text/utf8-header; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-MIME-Autoconverted: from 8bit to quoted-printable
by downgrade.example.org id JGR17356
Received: from {idn-encoded-name}
by downgrade.example.org with ESMTP
id JGR17356;
Wed, 13 Sep 2006 22:27:25 +0300
Downgrade-Method: encapsulate
From: { q-p encoded UTF-8 address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { q-p encoded UTF-8 subject }
X-Foobar: XvrT
Mime-Version: 1.0
Content-Type: Text/plain; charset=3DUTF-8
Content-Transfer-Encoding: 8bit
--12345
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-MIME-Autoconverted: from 8bit to quoted-printable
by downgrade.example.org id JGR17356
{ q-p encoded UTF-8 text }
--12345--
==========================================
Upgraded internationalized email:
==========================================
Received: from fw.example.org
by upgrade.example.org with ESMTP
id JAX77356;
Wed, 13 Sep 2006 22:27:32 +0300
Received: from downgrade.example.org
by fw.example.org with ESMTP
id JAX77356;
Wed, 13 Sep 2006 22:27:29 +0300
Received: from {idn-encoded-name}
by downgrade.example.org with ESMTP
id JGR17356;
Wed, 13 Sep 2006 22:27:25 +0300
Downgrade-Method: encapsulate
From: { UTF-8 address }
To: someone@example.org
Date: Wed, 13 Sep 2006 22:27:25 +0300
Subject: { UTF-8 subject }
X-Foobar: XvrT
Mime-Version: 1.0
Content-Type: Text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
{ UTF-8 text }
==========================================
Note the handling of "Received" header fields. They are the only
header fields preserved from the downgraded internationalized email.
All other header fields are taken from the "text/utf8-header" MIME
part. This also means that upgrading does not need to add the
"Header-Type" header field, because it is necessarily already present
in the "text/utf8-header" MIME part. In the downgraded email there
was an empty line after "UTF-8 text", but in the upgraded email it
has disappeared because it was part of the multipart boundary.
6.3. Retrieving attached internationalized email message
"Multipart/utf8-encapsulated" with the "type" parameter value
"message" can be used to attach internationalized email messages in
EAI non-compliant environments.
Retrieving the internationalized email can be done in the following
way:
o If the media type is "message/rfc822", then
* It is parsed (into header and body parts).
* The body part is processed as described in "Upgrading of
internationalized email message" (Section 6.2).
* The result is the internationalized email.
o If the media type is "Multipart/utf8-encapsulated" and the "type"
parameter value is "message", then
* It is considered to be the "encapsulating entity", and "Generic
decoding" (Section 6.1) is applied.
* The result is the internationalized email.
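The two retrieval cases above can be sketched as a dispatch on the
media type. This is an illustrative sketch only: the function name is
hypothetical, the upgrading procedure (Section 6.2) and "Generic
decoding" (Section 6.1) are passed in as hypothetical callables, bare
LF line endings are assumed, and returning None for any other media
type is an assumption of this sketch.

```python
from email import message_from_bytes
from typing import Callable, Optional


def retrieve_attached(entity: bytes,
                      upgrade: Callable[[bytes], bytes],
                      generic_decode: Callable[[bytes], bytes]
                      ) -> Optional[bytes]:
    """Return the attached internationalized email, or None."""
    msg = message_from_bytes(entity)
    media_type = msg.get_content_type()  # always lower case

    if media_type == "message/rfc822":
        # Split the entity into header and body; the body is the
        # attached message, processed as in Section 6.2.
        _, _, body = entity.partition(b"\n\n")
        return upgrade(body)

    type_param = str(msg.get_param("type", "")).lower()
    if media_type == "multipart/utf8-encapsulated" and type_param == "message":
        # The entity itself is the encapsulating entity (Section 6.1).
        return generic_decode(entity)

    # Assumption: other media types carry no attached message.
    return None
```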
7. IANA Considerations
IANA is requested to register the I18N-Received and Downgrade-Method
header fields and the multipart/utf8-encapsulated and
text/utf8-header media types as given in the registration
applications in this document.
8. Security Considerations
The "multipart/utf8-encapsulated" media type provides a method to
encapsulate mail data. In particular, this media type provides a
method to smuggle mail header fields so that mail scanners do not see
them. This may introduce new security threats.
This encapsulation does not hide the original MIME parts. However,
the original MIME structure may be obscured. This may provide a
method to smuggle MIME parts so that mail scanners do not see them.
This may introduce new security threats.
This encapsulation preserves only "Received" header fields from the
encapsulating message. This may hide information when the
encapsulated message is upgraded to internationalized email format.
9. Acknowledgements
This encapsulation format was originally suggested in discussions on
the former IMAA mailing list.
Various ideas were suggested in discussions on the IMA mailing list.
John C. Klensin strongly encouraged the author to write this
document.
10. References
10.1. Normative References
[ASCII] American National Standards Institute (formerly United
States of America Standards Institute), "USA Code for
Information Interchange", ANSI X3.4-1968, 1968.
ANSI X3.4-1968 has been replaced by newer versions with
slight modifications, but the 1968 version remains
definitive for the Internet.
[ietf-eai-framework]
Klensin, J. and Y. Ko, "Overview and Framework for
Internationalized Email", draft-ietf-eai-framework-05
(work in progress), February 2007.
[ietf-eai-downgrade]
YONEYA, Y., Ed. and K. Fujiwara, Ed., "Downgrading
mechanism for Email Address Internationalization",
draft-ietf-eai-downgrade-03 (work in progress),
March 2007.
[ietf-eai-utf8headers]
Yeh, J., Ed. and Abel, Ed., "Internationalized Email
Headers", draft-ietf-eai-utf8headers-04 (work in
progress), March 2007.
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, November 1996.
[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046,
November 1996.
[RFC2047] Moore, K., "Multipurpose Internet Mail Extensions (MIME)
Part Three: Message Header Extensions for Non-ASCII Text",
RFC 2047, November 1996.
[RFC2822] Resnick, P., "Internet Message Format", RFC 2822,
April 2001.
[RFC2231] Freed, N. and K. Moore, "MIME Parameter Value and Encoded
Word Extensions: Character Sets, Languages, and
Continuations", RFC 2231, November 1997.
[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
10646", RFC 3629, November 2003.
10.2. Informative References
[DKIM-Charter]
IETF, "Domain Keys Identified Mail (dkim)", October 2006,
<http://www.ietf.org/html.charters/dkim-charter.html>.
[RFC1847] Galvin, J., Murphy, S., Crocker, S., and N. Freed,
"Security Multiparts for MIME: Multipart/Signed and
Multipart/Encrypted", RFC 1847, October 1995.
[RFC1652] Freed, N., Ed., Rose, M., Stefferud, E., and D. Crocker,
"SMTP Service Extension for 8bit-MIMEtransport", RFC 1652,
July 1994.
[RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and
Registration Procedures", RFC 4288, BCP 13, December 2005.
[RFC3864] Klyne, G., Nottingham, M., and J. Mogul, "Registration
Procedures for Message Header Fields", RFC 3864, BCP 90,
September 2004.
Author's Address
Kari Hurtta
Kala-Matti 4 B 24
02230 Espoo
FI
Email: hurtta-ietf@elmme-mailer.org
URI: http://iki.fi/keh/
Full Copyright Statement
Copyright (C) The IETF Trust (2007).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).