Email Address Internationalization                             K. Hurtta
(EAI)                                                     March 17, 2007
Internet-Draft
Intended status: Experimental
Expires: September 18, 2007


          Encapsulation mechanism for Internationalized Email
                   draft-hurtta-eai-encapsulation-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 18, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   Email Address Internationalization (EAI) is implemented by allowing
   UTF-8 characters in the SMTP envelope and in mail headers.  To
   deliver email that uses UTF-8 in its headers through an EAI non-
   compliant environment, a converting (i.e. downgrading) or
   encapsulation mechanism is required.  In some UTF-8 email the email
   headers or individual header fields are signed.  This document
   describes a mechanism for encapsulation for use when converting
   cannot be used because of signed email



Hurtta                 Expires September 18, 2007               [Page 1]

Internet-Draft              EAI Encapsulation                 March 2007


   headers.  Encapsulation may also be used to forward EAI email through
   an EAI non-compliant environment in such a way that the original EAI
   email can be recovered.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  4
   3.  Addition to internationalized email header . . . . . . . . . .  4
     3.1.  "Downgrade-Method" header field  . . . . . . . . . . . . .  4
     3.2.  "I18N-Received" header field . . . . . . . . . . . . . . .  5
     3.3.  Registration of Downgrade-Method header field  . . . . . .  5
     3.4.  Registration of I18N-Received header field . . . . . . . .  5
   4.  Encapsulation format . . . . . . . . . . . . . . . . . . . . .  6
     4.1.  "multipart/utf8-encapsulated" media type . . . . . . . . .  7
     4.2.  Registration of media type multipart/utf8-encapsulated . .  7
     4.3.  Registration of media type text/utf8-header  . . . . . . .  9
   5.  Encapsulation  . . . . . . . . . . . . . . . . . . . . . . . . 10
     5.1.  Generic encapsulation  . . . . . . . . . . . . . . . . . . 10
       5.1.1.  Encapsulation of recursive part  . . . . . . . . . . . 15
       5.1.2.  Encapsulation example  . . . . . . . . . . . . . . . . 18
       5.1.3.  Multipart encapsulation example  . . . . . . . . . . . 19
       5.1.4.  Unknown top level type encapsulation example #1  . . . 21
       5.1.5.  Unknown top level type encapsulation example #2  . . . 22
     5.2.  Downgrading of internationalized email message . . . . . . 22
       5.2.1.  Encapsulation example  . . . . . . . . . . . . . . . . 25
       5.2.2.  Multipart/signed encapsulation example . . . . . . . . 26
     5.3.  Attaching internationalized email message  . . . . . . . . 29
       5.3.1.  Attaching example  . . . . . . . . . . . . . . . . . . 29
   6.  Decoding encapsulation . . . . . . . . . . . . . . . . . . . . 31
     6.1.  Generic decoding . . . . . . . . . . . . . . . . . . . . . 31
       6.1.1.  Decoding of recursive part . . . . . . . . . . . . . . 34
     6.2.  Upgrading of internationalized email message . . . . . . . 35
       6.2.1.  Upgrading example  . . . . . . . . . . . . . . . . . . 36
     6.3.  Retrieving attached internationalized email message  . . . 38
   7.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 39
   8.  Security Considerations  . . . . . . . . . . . . . . . . . . . 39
   9.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 39
   10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 40
     10.1. Normative References . . . . . . . . . . . . . . . . . . . 40
     10.2. Informative References . . . . . . . . . . . . . . . . . . 41
   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 41
   Intellectual Property and Copyright Statements . . . . . . . . . . 42

1.  Introduction

   Internationalized email includes UTF-8 characters [RFC3629] in email
   headers.  When internationalized email is delivered to an EAI non-
   compliant environment, its email header fields are converted (i.e.
   downgraded) to an ASCII compatible form.  When the email comes back
   to an EAI compliant environment, it is upgraded to the
   internationalized form by decoding the ASCII compatible encodings.

   When internationalized email is downgraded to an ASCII compatible
   form and then upgraded to the internationalized form, the result is
   not necessarily the original mail.  For example, some header fields
   may have originally used an ASCII compatible form, but upgrading
   converts them to UTF-8 form.
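
   The loss of the original form can be illustrated with a minimal
   Python sketch.  It assumes, purely for illustration, that the ASCII
   compatible form is an RFC 2047 encoded-word.  The function name
   upgrade_header_value and the sample value are hypothetical and are
   not part of this specification.

```python
from email.header import decode_header, make_header


def upgrade_header_value(value: str) -> str:
    """Decode RFC 2047 encoded-words into the UTF-8 (Unicode) form."""
    return str(make_header(decode_header(value)))


# Hypothetical case: the sender already used the ASCII compatible form.
original = "=?UTF-8?Q?p=C3=A4iv=C3=A4?="
upgraded = upgrade_header_value(original)

# Upgrading yields UTF-8 text, so the header field as originally written
# (and anything signed over it) is no longer reproduced.
print(upgraded == original)
```

   A signature computed over the original field would therefore no
   longer match the upgraded field.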

   Sometimes, however, it may be required that the original
   internationalized email header part can be recovered.  This document
   describes a mechanism for encapsulation which allows recovering the
   original internationalized email.

   If mail headers, or some mail header fields and message parts, are
   cryptographically signed, the original mail may have to be recovered
   before the signature of the mail is checked.

   This document provides an encapsulation method which has the
   following properties:
   o  The encapsulation does not produce nested encodings.
   o  The content of an encapsulated mail is accessible to EAI non-
      compliant user agents.
   o  The encapsulation does not hide the original MIME parts, although
      the original MIME structure may be obscured.
   o  The encapsulation provides a way to recover the original
      internationalized email.
   The media types "multipart/utf8-encapsulated" and "text/utf8-header"
   are introduced.

   This document provides markup which indicates when downgrading to an
   EAI non-compliant environment should be done with this encapsulation.
   That is done by adding a "Downgrade-Method: encapsulate" header
   field.

   If internationalized email has been encapsulated, a "Downgrade-
   Method: encapsulated" header field is used.

   Only a minimal number of header fields are generated for, or left
   in, the header part of the encapsulated message.  This is done to
   hide signatures that are placed in header fields while the message
   is encapsulated.  For example, DomainKeys Identified Mail
   [DKIM-Charter] uses this kind of signature.  The original header
   fields are stored in the "text/utf8-header" MIME part.

   The "multipart/signed" media type [RFC1847] signs header fields from
   the MIME header.  After encapsulation the signature fails, because
   the MIME header has changed.  That signature is hidden by replacing
   "multipart/signed" in the "Content-Type" header field with the value
   "multipart/mixed".  The original "Content-Type" header field is
   stored in the "text/utf8-header" MIME part.

   This encapsulation copies all header fields of the internationalized
   email to the "text/utf8-header" MIME part.  The saved header fields
   from the "text/utf8-header" MIME part and the "Received" header
   fields from the encapsulating email are used when upgrading.  Because
   this may cause duplication of "Received" header fields, the original
   "Received" header fields are renamed to "I18N-Received" during
   encapsulation.


2.  Terminology

   Terminology for this document is defined in [ietf-eai-framework] and
   [RFC2045].


3.  Addition to internationalized email header

   New header fields are introduced.

3.1.  "Downgrade-Method" header field

   The "Downgrade-Method" header field is added to Internet Message
   Format [RFC2822] as specified below:

   fields /= downgrade-method

   downgrade-method = "Downgrade-Method:" downgrade-code CRLF

   downgrade-code = "encapsulate" / "encapsulated"

   ; term <fields> is defined in [RFC2822]

   The value "encapsulate" requests that downgrading be done with the
   encapsulation defined in this document.  The value "encapsulated"
   indicates that downgrading has been done with the encapsulation
   defined in this document.
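
   As an illustration only, the ABNF above could be checked with a
   small Python sketch.  The function name downgrade_code is
   hypothetical.  Tolerating optional white space around the value is
   an assumption of this sketch and is not stated by the grammar.

```python
import re

# Mirrors: downgrade-method = "Downgrade-Method:" downgrade-code CRLF
# Optional white space around the value is an assumption of this sketch.
_DOWNGRADE_METHOD = re.compile(
    r"^Downgrade-Method:[ \t]*(encapsulated?)[ \t]*$",
    re.IGNORECASE,
)


def downgrade_code(header_line: str):
    """Return "encapsulate", "encapsulated", or None if nothing matches."""
    match = _DOWNGRADE_METHOD.match(header_line.rstrip("\r\n"))
    return match.group(1).lower() if match else None
```

   A result of "encapsulate" is a request to the downgrading agent,
   while "encapsulated" tells the receiver that decoding is needed.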







3.2.  "I18N-Received" header field

   The "I18N-Received" header field is added to Internet Message Format
   [RFC2822] as specified below:

   received /= "I18N-Received:" name-val-list ";" date-time CRLF

   ; terms <received>, <name-val-list> and <date-time>
   ; are defined in [RFC2822]

   The original "Received" header fields are renamed to "I18N-Received"
   during encapsulation.  The "Received" header fields are saved under
   their original name in the "text/utf8-header" MIME part.
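
   The renaming could be sketched in Python as follows.  The function
   name rename_received is hypothetical, and the sketch assumes a
   header block with CRLF line endings and no white space between the
   field name and the colon.

```python
def rename_received(header_block: str) -> str:
    """Rename "Received" header fields to "I18N-Received".

    Folded continuation lines start with white space, so they never
    match the field name and are passed through unchanged.
    """
    out = []
    for line in header_block.split("\r\n"):
        if line[:9].lower() == "received:":
            line = "I18N-Received:" + line[9:]
        out.append(line)
    return "\r\n".join(out)
```

   The unmodified header block would be stored separately in the
   "text/utf8-header" MIME part, so the original field names remain
   recoverable.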

3.3.  Registration of Downgrade-Method header field

   This section provides the header field registration application (as
   per [RFC3864]).

   Header field name: Downgrade-Method

   Applicable protocol: mail

   Status: experimental

   Author/Change controller:

      Kari Hurtta
      hurtta-ietf@elmme-mailer.org

   Specification document(s): RFC XXXX

   Related information:

      Downgrade-Method is used for signaling encapsulation with
      multipart/utf8-encapsulated media type.

3.4.  Registration of I18N-Received header field

   This section provides the header field registration application (as
   per [RFC3864]).

   Header field name: I18N-Received

   Applicable protocol: mail

   Status: experimental

   Author/Change controller:

      Kari Hurtta
      hurtta-ietf@elmme-mailer.org

   Specification document(s): RFC XXXX

   Related information:

      I18N-Received is used together with multipart/utf8-encapsulated
      media type.


4.  Encapsulation format

   The "multipart/utf8-encapsulated" media type splits an
   internationalized email message [ietf-eai-utf8headers] or MIME body
   part into two parts:
   o  The header part of the internationalized email message or of the
      MIME body part is put into the first body part of the "multipart/
      utf8-encapsulated" media type.  The media type of the first body
      part is "text/utf8-header".
   o  The body part of the internationalized email message or of the
      MIME body part is put into the second body part of the "multipart/
      utf8-encapsulated" media type.  The media type of the second body
      part is the same as the media type of the original
      internationalized email message or the original MIME body part.
      However, if the media type of the original internationalized
      email message or original MIME body part was "multipart/signed",
      the media type of the second body part is "multipart/mixed".  In
      some cases the media type is "application/octet-stream".

   NOTE:  This encapsulation assumes that the "preamble" and "epilogue"
      areas of multipart media types include only ASCII.  If these areas
      include UTF-8 text, that text is lost if the encapsulating
      "multipart/utf8-encapsulated" entity is converted to an ASCII
      compatible format (i.e. during 8BITMIME downgrading [RFC1652]).

      The loss of UTF-8 text in the "preamble" and "epilogue" areas of
      multipart media types could be solved by adding a third and a
      fourth body part to the "multipart/utf8-encapsulated" media type.
      However, the author believes that this would unnecessarily
      complicate the encapsulation format and algorithm.  The author
      assumes that messages which use signing do not put UTF-8 text
      into the "preamble" and "epilogue" areas of multipart media types.
      If a message is not signed, lost "preamble" and "epilogue" areas
      do not cause harm.

4.1.  "multipart/utf8-encapsulated" media type

   The "multipart/utf8-encapsulated" media type can be used in three
   different roles.  The "type" parameter is defined for the
   "multipart/utf8-encapsulated" media type.  The value of the "type"
   parameter is defined as follows:

   type-value = "encapsulated" / "message" / "part"

   o  The value "encapsulated" is used when the "multipart/
      utf8-encapsulated" media type is used as the downgrading format of
      an internationalized email.  The value of "type" is set to
      "encapsulated" when an internationalized email is downgraded
      because of a "Downgrade-Method: encapsulate" header field.
   o  The value "message" is used when the "multipart/utf8-encapsulated"
      media type serves the same purpose for internationalized content
      that the "message/rfc822" media type serves for non-EAI content.
      The value of "type" is set to "message" when an internationalized
      email is attached to or included in a message.  Roughly,
      "multipart/utf8-encapsulated; type=message" is the equivalent of
      "message/rfc822", except that the format of the attachment is
      different.
   o  The value "part" is used when the "multipart/utf8-encapsulated"
      media type is used as the downgrading format of a MIME body part.
      The value of "type" is set to "part" when the MIME structure of an
      internationalized email or MIME body part is recursively
      downgraded and a MIME body part with a UTF-8 header is found.
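
   The three roles can be summarised in a short Python sketch.  The
   names Role and content_type_for are hypothetical, and the order and
   quoting of the parameters are choices of this sketch only.

```python
from enum import Enum


class Role(Enum):
    """The three uses of multipart/utf8-encapsulated."""
    DOWNGRADED_MESSAGE = "encapsulated"   # whole message was downgraded
    ATTACHED_MESSAGE = "message"          # counterpart of message/rfc822
    RECURSIVE_BODY_PART = "part"          # MIME body part with UTF-8 header


def content_type_for(role: Role, boundary: str) -> str:
    """Build the Content-Type header field for the encapsulating entity."""
    return ("Content-Type: multipart/utf8-encapsulated; "
            f'type={role.value}; boundary="{boundary}"')
```

   The "boundary" value is assumed to be ASCII-only and free of quote
   characters.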

4.2.  Registration of media type multipart/utf8-encapsulated

   This section provides the media type registration application (as per
   [RFC4288]).

   Type name: multipart

   Subtype name: utf8-encapsulated

   Required parameters:

      The "boundary" parameter is required as per RFC 2046.

      The "type" parameter is required as per RFC XXXX.

   Optional parameters:

   Encoding considerations: 8bit or binary

   Security considerations:

      This media type provides a method to encapsulate mail data.
      In particular, this media type provides a method to smuggle mail
      header fields so that mail scanners do not see them.  This may
      cause new security threats.

      This encapsulation does not hide the original MIME parts.
      However, the original MIME structure may be obscured.  This may
      provide a method to smuggle MIME parts so that mail scanners do
      not see them.  This may cause new security threats.

      This encapsulation preserves only the "Received" header fields of
      the encapsulating message.  This may hide information when the
      encapsulated message is upgraded to internationalized email
      format.

   Interoperability considerations:

      This media type provides a method to encapsulate internationalized
      email.  The recipient of an encapsulated email must decode the
      encapsulation before the email is fully accessible.  However, the
      original MIME parts are not hidden from mail agents which do not
      know the encapsulation used by this media type.

   Published specification: RFC XXXX

   Applications that use this media type:

      Internationalized mail user agents (MUAs), mail transport agents
      (MTAs) and IMAP servers.

   Additional information:

      Magic number(s):
      File extension(s):
      Macintosh file type code(s):

   Person & email address to contact for further information:

      Kari Hurtta
      hurtta-ietf@elmme-mailer.org

   Intended usage: common

   Restrictions on usage:

   Author: Kari Hurtta

   Change controller: Kari Hurtta


4.3.  Registration of media type text/utf8-header

   This section provides the media type registration application (as per
   [RFC4288]).

   Type name: text

   Subtype name: utf8-header

   Required parameters:

      The "charset" parameter with value "UTF-8", if a UTF-8 header is
      in fact encapsulated.

   Optional parameters:

      charset

   Encoding considerations: 7bit or 8bit

      "8bit", if a UTF-8 header is in fact encapsulated.

   Security considerations:

      This media type provides a method to encapsulate mail data.
      In particular, this media type provides a method to smuggle mail
      header fields so that mail scanners do not see them.  This may
      cause new security threats.

   Interoperability considerations:

      Mail agents which do not know this media type treat it as the
      "text/plain" media type.

   Published specification: RFC XXXX

   Applications that use this media type:

      Internationalized mail user agents (MUAs), mail transport agents
      (MTAs) and IMAP servers.

   Additional information:

      In some cases an ASCII header part is encapsulated instead of a
      UTF-8 header part.

      Magic number(s):
      File extension(s):
      Macintosh file type code(s):

   Person & email address to contact for further information:

      Kari Hurtta
      hurtta-ietf@elmme-mailer.org

   Intended usage: common

   Restrictions on usage:

   Author: Kari Hurtta

   Change controller: Kari Hurtta



5.  Encapsulation

   On encapsulation, the internationalized email message or MIME body
   part is split into the two MIME body parts of the "multipart/
   utf8-encapsulated" media type.

   There are three cases of encapsulation:
   o  When an internationalized email message is downgraded.
   o  When an internationalized email message is attached to or included
      in a message.
   o  When a MIME body part is encapsulated because there is UTF-8 text
      in its header.  This happens in the recursive part of the
      algorithm.
   The common parts of these three encapsulations are described in
   "Generic encapsulation" (Section 5.1).

5.1.  Generic encapsulation

   Both an email message and a MIME body part follow the same general
   syntax:
   o  Both have a header part and a body part.
   o  The header part is followed by the body part, and these are
      separated by an empty line.
   The term "entity" refers to both an email message and a MIME body
   part.
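
   Splitting an entity at the empty line could be sketched in Python as
   follows.  The function name split_entity is hypothetical, and the
   sketch assumes CRLF line endings throughout.

```python
def split_entity(entity: bytes):
    """Split an entity into (header part, body part) at the empty line."""
    if entity.startswith(b"\r\n"):
        # The entity begins with the empty line: no header fields.
        return b"", entity[2:]
    head, sep, body = entity.partition(b"\r\n\r\n")
    if not sep:
        # No empty line found: the entity consists of a header part only.
        return entity, b""
    # Give the last header line its CRLF back; drop the empty line.
    return head + b"\r\n", body
```

   The same function applies to a whole message and to a MIME body
   part, which is why the text can speak of a generic "entity".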

   NOTE:  Subtypes of "message" (i.e. the media type is message/*) other
      than "message/rfc822", and composite types other than "multipart"
      or "message", are treated specially.  This is done so that this
      algorithm is stable and its result does not change when new
      subtypes of "message" are registered or new composite top level
      types are standardised.  Unknown top level types are treated the
      same way because it is not possible to know whether the top level
      type is composite.  In these cases the type is also treated as
      "application/octet-stream" if the body part of the original entity
      includes non-ASCII characters.  Type information is not lost,
      because the whole header part of the original entity is stored in
      the body of the first MIME body part of the encapsulating entity.
      Processing as "application/octet-stream" is done because the
      algorithm does not know how to encapsulate the entity as a
      composite type.  If the body part of the original entity includes
      only ASCII characters, there cannot be UTF-8 headers inside it
      (if it is treated as a composite type).

   NOTE:  This algorithm looks complex.  This complexity is the result
      of the requirement that the so-called "nested encoding rule" is
      not violated.  Because of this requirement, composite media types
      must be processed recursively.

      The special handling of unknown composite types which include
      non-ASCII characters as "application/octet-stream" does violate
      the "nested encoding rule".  In the case of unknown types this is
      unavoidable, because it is not possible to parse the internal
      structure of unknown types.

   The encapsulating entity is generated from the original
   internationalized email message or MIME body part in the following
   way:
   o  A new header part for the encapsulating entity is generated.
      *  The media type of the encapsulating entity is "multipart/
         utf8-encapsulated".
   o  A new body part for the encapsulating entity is generated.  This
      body part consists of two MIME body parts.
      *  The media type of the first MIME body part is "text/
         utf8-header".
         +  The value of the "charset" parameter is "UTF-8" if the
            header part of the original entity includes UTF-8
            characters.
         +  NOTE: This seems strange, but in certain cases an ASCII-only
            header part is also encapsulated.  In that case the
            "charset" parameter is not required.
      *  The body part of the first MIME body part is the header part of
         the original entity (the original internationalized email
         message or MIME body part).
      *  It is strongly recommended that the body of the first MIME body
         part is base64 encoded (and, of course, that the "content-
         transfer-encoding" header field is updated correspondingly).
      *  Generation of the second MIME body part is described below.
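
   Building the first MIME body part could be sketched in Python as
   follows.  The function name utf8_header_part is hypothetical.  The
   sketch only tests for bytes above 0x7F and does not validate that
   the header part is well-formed UTF-8.

```python
import base64


def utf8_header_part(original_header: bytes) -> bytes:
    """Build the first MIME body part from the original header part."""
    has_non_ascii = any(byte > 0x7F for byte in original_header)
    content_type = b"Content-Type: text/utf8-header"
    if has_non_ascii:
        # "charset" is only needed when UTF-8 is actually present.
        content_type += b"; charset=UTF-8"
    # encodebytes() wraps lines at 76 characters but ends them with LF,
    # so the line endings are converted to CRLF here.
    encoded = base64.encodebytes(original_header).replace(b"\n", b"\r\n")
    return (content_type + b"\r\n"
            b"Content-Transfer-Encoding: base64\r\n"
            b"\r\n" + encoded)
```

   Because the result is base64 encoded, the part stays intact even if
   the encapsulating message is later restricted to 7bit transport.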








   NOTE:  The Unix mailbox format changes "From" at the beginning of a
      line to ">From".  Therefore it is useful to encode "text/
      utf8-header" with base64 even when it includes only ASCII header
      fields.
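
      The effect can be sketched in Python.  The function mbox_escape
      is a hypothetical model of the strict rule that prefixes every
      line beginning with "From"; it is not taken from any particular
      mailbox implementation.

```python
import base64


def mbox_escape(text: bytes) -> bytes:
    """Prefix every line that starts with "From" with ">"."""
    lines = text.split(b"\r\n")
    return b"\r\n".join(b">" + line if line.startswith(b"From") else line
                        for line in lines)


header = b"From: user@example.org\r\nSubject: test\r\n"
encoded = base64.encodebytes(header).replace(b"\n", b"\r\n")

# The plain header block is altered by the mailbox rule ...
print(mbox_escape(header) != header)
# ... while the base64 form of this sample passes through unchanged.
print(mbox_escape(encoded) == encoded)
```

      Under a rule that only matches "From " followed by a space, a
      base64 line can never match, because the space character is not
      in the base64 alphabet.  Under the strict rule modelled here, a
      base64 line could still begin with the letters "From" by chance.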

      Actually, it is more common to replace "From " with ">From ".
      This does not touch the "From" header field (if there is no space
      between "From" and ":").

   The second MIME body part of the encapsulating entity is generated in
   the following way:
   1.  The original entity is checked for the following cases:
       *  The media type value (type/subtype) of the original entity
          includes characters other than ASCII [ASCII].  This is an
          error condition.
       *  The media type of the original entity is a subtype of
          "message" (i.e. the media type is message/*), it is not
          "message/rfc822", and the body part of the original entity
          includes non-ASCII characters.
       *  The top level type of the original entity is unknown, the body
          part of the original entity includes non-ASCII characters, and
          the encoding of the original entity is identity (i.e.
          "content-transfer-encoding" is "8bit" or "binary").
       *  The top level type of the original entity is a composite type
          other than "multipart" or "message", and the body part of the
          original entity includes non-ASCII characters.
       *  The media type of the original entity is a subtype of
          "multipart" (i.e. the media type is multipart/*) and the
          "boundary" parameter is missing.  This is an error condition.
       If one of these cases is found,
       *  The media type of the second MIME body part is "application/
          octet-stream".
       *  The body part of the second MIME body part is the body part of
          the original entity.
       *  The "Content-transfer-encoding" value for the second MIME body
          part is copied from the original entity if it includes only
          ASCII characters.  Otherwise it is set to "7bit", "8bit" or
          "binary" as appropriate.  A non-ASCII value is an error
          condition.
   2.  Otherwise, if the media type of the original entity is
       "multipart/signed", then
       *  The media type of the second MIME body part is "multipart/
          mixed".
          +  Generation of the "boundary" parameter is described below.
       *  The body part of the second MIME body part is the "Composite
          encapsulated body part".  Generation of this is described
          below.
       *  "Content-transfer-encoding" is set to "7bit", "8bit" or
          "binary" as appropriate.
          +  If the "Content-transfer-encoding" value of the original
             entity is other than "7bit", "8bit" or "binary", this is an
             error condition.
   3.  Otherwise the original entity is checked for the following cases:
       *  The top level type of the original entity is a type other than
          "multipart" or "message".
          +  This includes all discrete media types.
          +  This includes all unknown top level types.
       *  The media type of the original entity is a subtype of
          "message" (i.e. the media type is message/*) and it is not
          "message/rfc822".
       If one of these cases is found,
       *  The media type of the second MIME body part is the same as the
          media type of the original entity.
          +  Copying of media type parameters from the original entity
             to the second MIME body part is described below.
       *  The body part of the second MIME body part is the body part of
          the original entity.
       *  The "Content-transfer-encoding" value for the second MIME body
          part is copied from the original entity if it includes only
          ASCII characters.  Otherwise it is set to "7bit", "8bit" or
          "binary" as appropriate.  A non-ASCII value is an error
          condition.
   4.  Otherwise, if the original entity is of a composite type
       ("multipart" or "message/rfc822"),
       *  The media type of the second MIME body part is the same as the
          media type of the original entity.
          +  Copying of media type parameters from the original entity
             to the second MIME body part is described below.
       *  Generation of the body part of the second MIME body part is
          handled specially when the media type of the original entity
          is composite.
          +  The body of the original entity is scanned when the entity
             is composite.
          +  Generally this makes the processing recursive.
          +  The body part of the second MIME body part is called the
             "composite encapsulated body part" if the media type of the
             original entity is composite.  Generation of this body part
             is described below.
       *  "Content-transfer-encoding" is set to "7bit", "8bit" or
          "binary" as appropriate.
          +  If the "Content-transfer-encoding" value of the original
             entity is other than "7bit", "8bit" or "binary", this is an
             error condition.
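
   The choice of media type in steps 1 to 4 could be sketched in Python
   as follows.  The function name second_part_media_type and the set
   KNOWN_TOP_LEVEL are hypothetical.  The sketch knows no composite top
   level type other than "multipart" and "message", so that case of
   step 1 is omitted, and error conditions are not reported separately.

```python
from typing import Optional

# Top level types this sketch treats as known (an assumption).
KNOWN_TOP_LEVEL = {"text", "image", "audio", "video", "application",
                   "model", "multipart", "message"}
OCTET_STREAM = "application/octet-stream"


def second_part_media_type(media_type: str, body: bytes,
                           cte: str = "7bit",
                           boundary: Optional[str] = None) -> str:
    """Return the media type chosen for the second MIME body part."""
    if not media_type.isascii():
        return OCTET_STREAM                      # step 1, error condition
    top, _, sub = media_type.lower().partition("/")
    body_non_ascii = any(byte > 0x7F for byte in body)

    # Step 1: cases that fall back to application/octet-stream.
    if top == "message" and sub != "rfc822" and body_non_ascii:
        return OCTET_STREAM
    if (top not in KNOWN_TOP_LEVEL and body_non_ascii
            and cte.lower() in ("8bit", "binary")):
        return OCTET_STREAM
    if top == "multipart" and boundary is None:
        return OCTET_STREAM                      # error condition
    # Step 2: hide the signature of multipart/signed.
    if top == "multipart" and sub == "signed":
        return "multipart/mixed"
    # Steps 3 and 4: the media type is kept.  For multipart/* and
    # message/rfc822 (step 4) the body is additionally processed
    # recursively, which this sketch does not do.
    return media_type
```

   Only the media type decision is modelled here.  Copying of
   parameters and generation of the body are separate steps.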

   Media type parameters are copied from the original entity to the
   second MIME body part in the following way:
   o  This copying is done when the media type of the second MIME body
      part is the same as the media type of the original entity.
   o  ASCII parameters are copied (however, see the special note about
      "boundary" below).
   o  UTF-8 comments are removed.
   o  Parameters which have a UTF-8 value are encoded according to
      [RFC2231] when copied.
      *  If the required parameters of the media type are known and a
         parameter is not required for the media type, it need not be
         copied (and encoded according to [RFC2231]).
   o  If a parameter name has UTF-8 characters, this is an error
      condition and the parameter is not copied.
   o  If the "boundary" parameter value of a multipart media type has
      UTF-8 characters, it is handled specially.  This is described
      below.
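
   The parameter copying rules could be sketched in Python using the
   standard library function email.utils.encode_rfc2231.  The function
   name copy_parameters is hypothetical.  The sketch takes already
   parsed (name, value) pairs, so removal of comments is not modelled,
   and it assumes that ASCII values contain no quote characters.

```python
from email.utils import encode_rfc2231


def copy_parameters(params):
    """Copy (name, value) media type parameters to the second part."""
    copied = []
    for name, value in params:
        if not name.isascii():
            continue          # error condition: parameter is not copied
        if name.lower() == "boundary":
            continue          # "boundary" is selected separately
        if value.isascii():
            copied.append(f'{name}="{value}"')
        else:
            # RFC 2231 extended form: name*=charset'language'%XX...
            copied.append(f"{name}*={encode_rfc2231(value, 'UTF-8')}")
    return copied
```

   For example, a "name" parameter with the UTF-8 value "päivä.txt"
   would be copied as name*=UTF-8''p%C3%A4iv%C3%A4.txt.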

   The "Composite encapsulated body part" is generated in the following
   way:
   o  If "application/octet-stream" was assigned as the media type of
      the second MIME body part, the body part of the original entity is
      the resulting "Composite encapsulated body part".  This case is
      mentioned above.
   o  If the media type of the original entity is a subtype of "message"
      (i.e. the media type is message/*) and it is not "message/rfc822",
      the body part of the original entity is the resulting "Composite
      encapsulated body part".  This case is mentioned above.
   o  If the media type of the original entity is "message/rfc822", the
      body part of the original entity is parsed (into header and body
      part) and is processed as described in "Encapsulation of recursive
      part" (Section 5.1.1).  The result of the processing is the
      "Composite encapsulated body part".
   o  If the media type of the original entity is a subtype of
      "multipart" (i.e. the media type is multipart/*), the body part of
      the original entity is processed as described below.  The result
      of the processing is the "Composite encapsulated body part".
   o  If the top level type of the original entity is a composite type
      other than "multipart" or "message", it is treated as an unknown
      type.  This processing is described above.
   For multipart types the "Composite encapsulated body part" is
   generated as follows:
   1.  The "boundary" parameter value of the original entity is
       remembered.
       *  Handling of a missing "boundary" parameter is described above.
   2.  A "boundary" parameter value for the second MIME body part is
       selected.
       *  The selected "boundary" parameter value must include only
          ASCII characters.
       *  In general this can be the same as the "boundary" parameter
          value of the original entity.
       *  If the "boundary" parameter value of the original entity
          includes UTF-8 characters, a new ASCII-only value must be
          selected.
   3.  The "preamble" area of the body of the original entity is copied
       to the "Composite encapsulated body part".
       *  If the "preamble" area includes non-ASCII characters, this is
          an error condition.
   4.  The body parts of the multipart (from the body of the original
       entity) are handled as follows:
       1.  A boundary delimiter line is copied to the "Composite
           encapsulated body part", but in such a way that the boundary
           of the original entity is replaced with the selected boundary
           of the second MIME body part.
       2.  The body part is parsed (into header and body part) and is
           processed as described in "Encapsulation of recursive part"
           (Section 5.1.1).  The result is copied to the "Composite
           encapsulated body part".
   5.  The final boundary delimiter line is copied (from the body of the
       original entity) to the "Composite encapsulated body part", but
       in such a way that the boundary of the original entity is
       replaced with the selected boundary of the second MIME body part.
       *  A final boundary delimiter line is not generated in the
          "Composite encapsulated body part" if the final boundary
          delimiter line is missing from the original entity.  This is
          an error condition.
   6.  The "epilogue" area of the body of the original entity is copied
       to the "Composite encapsulated body part".
       *  If the "epilogue" area includes non-ASCII characters, this is
          an error condition.

   NOTE:  The CRLF preceding the boundary delimiter line is conceptually
      attached to the boundary (as per [RFC2046]).  That CRLF is not
      part of the body part of the multipart.  If encapsulation and
      decoding of the encapsulation treat this CRLF differently, the
      encapsulation does not preserve all CRLFs or adds extra CRLFs.

   NOTE:  If the original "Content-transfer-encoding" includes non-ASCII
      characters, this algorithm is not able to decode the resulting
      encapsulation.  Therefore it is recommended that the
      internationalized email message is bounced or rejected on that
      error condition.
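
   The multipart steps above can be sketched in Python as follows.  This
   is a minimal illustration under stated assumptions, not part of the
   specification: the names select_boundary, rewrite_multipart and
   EncapsulationError and the fallback boundary string are hypothetical,
   the body is handled as bytes with CRLF line endings, every error
   condition is reported as an exception, and each body part is handed
   to a caller-supplied function that stands in for "Encapsulation of
   recursive part" (Section 5.1.1).

```python
class EncapsulationError(Exception):
    """Raised for the error conditions named in the steps above."""


def select_boundary(original: str, fallback: str = "eai-encap-0001") -> str:
    """Step 2: reuse the original boundary only if it is ASCII-only.

    The fallback is an arbitrary placeholder (an assumption); a real
    implementation would also check that it does not occur in the body.
    """
    return original if original.isascii() else fallback


def _require_ascii(data: bytes, area: str) -> None:
    if not data.isascii():
        raise EncapsulationError(f"non-ASCII characters in {area}")


def rewrite_multipart(body: bytes, old: str, new: str,
                      encapsulate_part=lambda part: part) -> bytes:
    """Steps 3-6: copy a multipart body, replacing boundary old by new."""
    old_delim = b"--" + old.encode("utf-8")
    new_delim = b"--" + new.encode("ascii")
    out, chunk, state = [], [], "preamble"
    for line in body.splitlines(keepends=True):
        stripped = line.rstrip(b" \t\r\n")
        is_delim = stripped in (old_delim, old_delim + b"--")
        if state == "epilogue" or not is_delim:
            chunk.append(line)
            continue
        data = b"".join(chunk)
        chunk = []
        if state == "preamble":
            _require_ascii(data, "preamble")
            out.append(data)
        elif data.endswith(b"\r\n"):
            # The CRLF before a delimiter belongs to the delimiter, not
            # to the body part, so it is kept out of the part content.
            out.append(encapsulate_part(data[:-2]) + b"\r\n")
        else:
            out.append(encapsulate_part(data))
        final = stripped == old_delim + b"--"
        out.append(new_delim + (b"--" if final else b"") + line[len(stripped):])
        state = "epilogue" if final else "part"
    if state != "epilogue":
        raise EncapsulationError("final boundary delimiter line is missing")
    epilogue = b"".join(chunk)
    _require_ascii(epilogue, "epilogue")
    out.append(epilogue)
    return b"".join(out)
```

   Raising an exception corresponds to the bounce or reject option; an
   implementation that prefers to produce output on error conditions
   would record the condition and continue instead.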

5.1.1.  Encapsulation of recursive part

   The encapsulation of a recursive part is done in the following way:
   1.  If the header part of the recursive part includes UTF-8
       characters, or if the media type of the recursive part is
       "multipart/signed", then
       *  The recursive part is considered to be the "original entity"
          and "Generic encapsulation" (Section 5.1) is applied.
       *  The value of the "type" parameter is set to "part" for the
          resulting "multipart/utf8-encapsulated" encapsulating entity.
       *  The resulting encapsulating entity is the result of
          "Encapsulation of recursive part".
   2.  Otherwise, if the media type of the recursive part is discrete,
       the result of "Encapsulation of recursive part" is the recursive
       part itself.
   3.  Otherwise, if the media type of the recursive part is
       "message/rfc822", then
       *  The header part of the result of "Encapsulation of recursive
          part" is the header part of the recursive part.
       *  The body part of the recursive part is parsed (into header and
          body) and is processed as described in "Encapsulation of
          recursive part" (Section 5.1.1).  The body part of the result
          of "Encapsulation of recursive part" is the result of that
          processing.
   4.  Otherwise, if the top level type is multipart (i.e. the media
       type is multipart/*) and the "boundary" parameter exists, its
       handling is described in the next paragraph.
       *  A missing "boundary" parameter on multipart types is an error
          condition.
   5.  Otherwise, if the recursive part is ASCII only (the body is
       ASCII, i.e. "content-transfer-encoding" is "7bit"), the result of
       "Encapsulation of recursive part" is the recursive part itself.
   6.  Otherwise, if the encoding of the recursive part is not identity
       (i.e. "content-transfer-encoding" is not "8bit" or "binary"), the
       result of "Encapsulation of recursive part" is the recursive part
       itself.
   7.  Otherwise
       *  The recursive part is considered to be the "original entity"
          and "Generic encapsulation" (Section 5.1) is applied.
       *  For the resulting "multipart/utf8-encapsulated" encapsulating
          entity, the "type" parameter is set to "part".
       *  The resulting encapsulating entity is the result of
          "Encapsulation of recursive part".
       *  NOTE: This may seem strange, but unknown composite media types
          are always encapsulated if there is a possibility that they
          include embedded UTF-8 headers.
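
   The seven-way decision above can be sketched as a small Python
   function.  This is only an illustration: the names Action and
   classify_recursive_part are hypothetical, the set of discrete top
   level types is taken from [RFC2046], and it is assumed that the
   caller has already parsed the media type, the
   "content-transfer-encoding" value and the presence of a "boundary"
   parameter from the header of the recursive part.

```python
from enum import Enum

# Discrete top level media types as listed in RFC 2046.
DISCRETE_TYPES = {"text", "image", "audio", "video", "application"}


class Action(Enum):
    ENCAPSULATE = "apply Generic encapsulation with type=part"
    COPY = "result is the recursive part itself"
    RECURSE_MESSAGE = "process the embedded message recursively"
    RECURSE_MULTIPART = "process each body part recursively"
    ERROR = "error condition"


def classify_recursive_part(header: bytes, media_type: str,
                            encoding: str = "7bit",
                            has_boundary: bool = False) -> Action:
    """Return which step of the list above applies to a recursive part."""
    media_type = media_type.lower()
    top_level = media_type.split("/", 1)[0]
    encoding = encoding.lower()
    if not header.isascii() or media_type == "multipart/signed":  # step 1
        return Action.ENCAPSULATE
    if top_level in DISCRETE_TYPES:                               # step 2
        return Action.COPY
    if media_type == "message/rfc822":                            # step 3
        return Action.RECURSE_MESSAGE
    if top_level == "multipart":                                  # step 4
        return Action.RECURSE_MULTIPART if has_boundary else Action.ERROR
    if encoding == "7bit":                                        # step 5
        return Action.COPY
    if encoding not in ("8bit", "binary"):                        # step 6
        return Action.COPY
    return Action.ENCAPSULATE                                     # step 7
```

   For instance, an unknown top level type such as "X-message8/plain"
   is copied unchanged with a "7bit" or "base64" encoding, but is
   encapsulated when its encoding is "8bit".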

   If the top level type is multipart, the result of "Encapsulation of
   recursive part" is generated in the following way:
   1.  The header part of the result of "Encapsulation of recursive
       part" is the header part of the recursive part.
   2.  The "boundary" parameter value from the recursive part is
       remembered.
       *  Handling of a missing "boundary" parameter is described above.
   3.  The body part of the result of "Encapsulation of recursive part"
       is initialized.
   4.  The "preamble" area from the body of the recursive part is copied
       to the body part of the result.
       *  If the "preamble" area includes non-ASCII characters, this is
          an error condition.
   5.  The body parts of the multipart (from the body of the recursive
       part) are handled:
       1.  The boundary delimiter line is copied to the body part of the
           result.
       2.  The body part is parsed (into header and body) and is
           processed as described in "Encapsulation of recursive part"
           (Section 5.1.1).  The result of that processing is copied to
           the body part of the result.
   6.  The final boundary delimiter line is copied (from the body of the
       recursive part) to the body part of the result.
       *  A final boundary delimiter line is not generated in the body
          part of the result if the final boundary delimiter line is
          missing from the recursive part.  This is an error condition.
   7.  The "epilogue" area from the body of the recursive part is copied
       to the body part of the result.
       *  If the "epilogue" area includes non-ASCII characters, this is
          an error condition.

5.1.2.  Encapsulation example

   An encapsulation example

       Original internationalized entity:

       ==========================================
       Some-Header: { UTF-8 content }
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }
       ==========================================

       Encapsulated entity:

       ==========================================
       Content-Type: multipart/utf8-encapsulated;
         type={specified later}; boundary="12345"
       Content-Transfer-Encoding: 8bit


       --12345
       Content-Type: text/utf8-header; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       Some-Header: { UTF-8 content }
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       --12345
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }

       --12345--
       ==========================================


   The empty line at the end of the "text/utf8-header" body is not
   copied from the original encapsulated headers.  It is part of the
   next boundary line of "multipart/utf8-encapsulated".

   NOTE:  In this example the "text/utf8-header" part is not base64
      encoded, for clarity.  Base64 encoding is recommended.
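
   The recommended base64 form of the "text/utf8-header" part can be
   sketched in Python as follows.  This is an illustration only: the
   function name encode_header_part is hypothetical, the header is
   assumed to be available as raw bytes, and the choice of "US-ASCII"
   for an ASCII-only header follows the multipart example in
   Section 5.1.3.

```python
import base64


def encode_header_part(header: bytes) -> bytes:
    """Build a text/utf8-header body part carrying header in base64."""
    charset = "US-ASCII" if header.isascii() else "UTF-8"
    # encodebytes() wraps its output at 76 characters using bare LF,
    # so the line ends are converted to CRLF for use in a message.
    encoded = base64.encodebytes(header).replace(b"\n", b"\r\n")
    return (
        f"Content-Type: text/utf8-header; charset={charset}\r\n".encode("ascii")
        + b"Content-Transfer-Encoding: base64\r\n"
        + b"\r\n"
        + encoded
    )
```

   The resulting part is ASCII-only, so it does not depend on how the
   transport treats 8-bit data or lines that start with "From".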

5.1.3.  Multipart encapsulation example

   A multipart encapsulation example

       Original internationalized entity:

       ==========================================
       Content-Type: Multipart/mixed; boundary=12345
       Content-Transfer-Encoding: 8bit


       --12345
       Some-Header: { UTF-8 content }
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }

       --12345--
       ==========================================

       Encapsulated entity:

       ==========================================
       Content-Type: multipart/utf8-encapsulated;
         type={specified later}; boundary="67890"
       Content-Transfer-Encoding: 8bit


       --67890
       Content-Type: text/utf8-header; charset=US-ASCII
       Content-Transfer-Encoding: 7bit

       Content-Type: Multipart/mixed; boundary=12345
       Content-Transfer-Encoding: 8bit

       --67890
       Content-Type: Multipart/mixed; boundary=12345
       Content-Transfer-Encoding: 8bit


       --12345
       Content-Type: multipart/utf8-encapsulated;
         type=part; boundary="abcde"
       Content-Transfer-Encoding: 8bit


       --abcde
       Content-Type: text/utf8-header; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       Some-Header: { UTF-8 content }
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       --abcde
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }

       --abcde--

       --12345--

       --67890--
       ==========================================

   "Generic encapsulation" always encapsulates the top level header
   part, even when it is US-ASCII only.  In general it is assumed that
   in internationalized email there are always some header fields which
   require this encapsulation.

5.1.4.  Unknown top level type encapsulation example #1

   An encapsulation example for unknown top level type with 7-bit body

       Original internationalized entity:

       ==========================================
       Some-Header: { UTF-8 content }
       Content-Type: X-message8/plain
       Content-Transfer-Encoding: 7bit

       { ASCII text }
       ==========================================


       Encapsulated entity:

       ==========================================
       Content-Type: multipart/utf8-encapsulated;
         type={specified later}; boundary="12345"
       Content-Transfer-Encoding: 8bit


       --12345
       Content-Type: text/utf8-header; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       Some-Header: { UTF-8 content }
       Content-Type: X-message8/plain
       Content-Transfer-Encoding: 7bit

       --12345
       Content-Type: X-message8/plain
       Content-Transfer-Encoding: 7bit

       { ASCII text }

       --12345--
       ==========================================

5.1.5.  Unknown top level type encapsulation example #2

   An encapsulation example for unknown top level type with 8-bit body

       Original internationalized entity:

       ==========================================
       Some-Header: { UTF-8 content }
       Content-Type: X-message8/plain
       Content-Transfer-Encoding: 8bit

       { non-ASCII text }
       ==========================================

       Encapsulated entity:

       ==========================================
       Content-Type: multipart/utf8-encapsulated;
         type={specified later}; boundary="12345"
       Content-Transfer-Encoding: 8bit


       --12345
       Content-Type: text/utf8-header; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       Some-Header: { UTF-8 content }
       Content-Type: X-message8/plain
       Content-Transfer-Encoding: 8bit

       --12345
       Content-Type: application/octet-stream
       Content-Transfer-Encoding: 8bit

       { non-ASCII text }

       --12345--
       ==========================================

5.2.  Downgrading of internationalized email message

   When an internationalized email [ietf-eai-utf8headers] leaves an EAI
   compliant environment, a downgrade is required.
   [ietf-eai-downgrade] describes when the downgrade occurs.

   This document defines the "Downgrade-Method" header field.  The
   downgrading method is selected in the following way:
   o  If the "Downgrade-Method" header field value is "encapsulate",
      downgrading of the header part (and body) of the mail is done as
      described in this section.
   o  Otherwise all message headers (including header fields from MIME
      body parts) may need to be parsed to discover that the message is
      an internationalized email and is a downgrading candidate.
      *  If a downgrading gateway is configured for tunneling operation
         for some recipients of the mail, downgrading of the header part
         (and body) of the mail for these recipients is done as
         described in this section.
      *  If the "Downgrade-Method" header field does not exist and a
         downgrading gateway is not configured for tunneling operation,
         downgrading of the header part (and body) of the mail is done
         according to [ietf-eai-downgrade].
      *  If the "Downgrade-Method" header field exists and its value is
         not "encapsulate", this specification is not used for
         downgrading of the header part (and body) of the mail.

   Downgrading of internationalized email is done in the following way:
   o  The internationalized email is considered to be the "original
      entity" and "Generic encapsulation" (Section 5.1) is applied.
   o  For the resulting "multipart/utf8-encapsulated" encapsulating
      entity, the "type" parameter is set to "encapsulated".
   o  The resulting encapsulating entity is the downgraded
      internationalized email.
      *  Addition of email header fields to the downgraded
         internationalized email is described in the next paragraph.

   When mail is downgraded, some email header fields must be added.
   "Generic encapsulation" (Section 5.1) does not produce these email
   header fields.
   o  A "Downgrade-Method" header field with the value "Encapsulated" is
      added.
   o  "Received" header fields are copied from the original
      internationalized email under a new header field name:
      "I18N-Received" is used as the name of the copied header fields.
      If the "for" clause of a "Received" header field includes
      non-ASCII characters, the clause is removed when the "Received"
      header field is copied to an "I18N-Received" header field.  If a
      "Received" header field includes non-ASCII characters outside the
      "for" clause, it is not copied.
   o  A "Mime-Version" header field with the value "1.0" is added.
   o  The "Date" header field is copied from the original
      internationalized email if it includes only ASCII characters.
      Otherwise it is generated.
   o  A "From" header field is added.  Several different values can be
      used for the "From" header field:
      *  The "From" header field from the original internationalized
         email can be used if it includes only ASCII characters.
      *  The algorithm from [ietf-eai-downgrade] can be used.
      *  The value for the "From" header field can be taken from the
         downgraded envelope sender address.
      *  An ASCII address which refers to the downgrading gateway can be
         used.
   o  The "Subject" header field is copied from the original
      internationalized email if it includes only ASCII characters.
      Otherwise several different values can be used for the "Subject"
      header field:
      *  The algorithm from [ietf-eai-downgrade] or from [RFC2047] can
         be used.
      *  An ASCII subject which refers to the downgrading operation can
         be used.
   o  If "From" and "Subject" are taken from the original
      internationalized email and the "Message-ID" header field of the
      original internationalized email includes only ASCII characters,
      the "Message-ID" header field is copied (from the original
      internationalized email).  Otherwise it is optionally generated.
   o  Optionally a "To" header field is added.  Several different values
      can be used for the "To" header field:
      *  The "To" header field from the original internationalized email
         can be used if it includes only ASCII characters.
      *  The algorithm from [ietf-eai-downgrade] can be used.
   o  Optionally a "Cc" header field is added.  Several different values
      can be used for the "Cc" header field:
      *  The "Cc" header field from the original internationalized email
         can be used if it includes only ASCII characters.
      *  The algorithm from [ietf-eai-downgrade] can be used.
   o  It is important NOT to copy all ASCII header fields.  Some header
      fields may be covered by signatures.  If such a signature is
      checked against the encapsulated form, it fails.  For example,
      DomainKeys Identified Mail [DKIM-Charter] uses this kind of
      signature.
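
   The handling of "Received" header fields in the list above can be
   sketched in Python as follows.  This is an illustration under stated
   assumptions: the function names and the regular expression are
   hypothetical, header field values are assumed to be already unfolded
   into a single string, and the "for" clause is located with a simple
   pattern rather than a full parser for the "Received" syntax.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical pattern for a "for" clause in an unfolded Received value.
_FOR_CLAUSE = re.compile(r"\s+for\s+[^\s;]+", re.IGNORECASE)


def i18n_received_value(received: str) -> Optional[str]:
    """Return the value for an I18N-Received field, or None to skip it."""
    match = _FOR_CLAUSE.search(received)
    if match and not match.group().isascii():
        # A non-ASCII "for" clause is removed before copying.
        received = received[:match.start()] + received[match.end():]
    # Non-ASCII characters anywhere else mean the field is not copied.
    return received if received.isascii() else None


def added_received_fields(values: List[str]) -> List[Tuple[str, str]]:
    """Map original Received values to the I18N-Received fields to add."""
    copied = (i18n_received_value(value) for value in values)
    return [("I18N-Received", value) for value in copied if value is not None]
```

   An ASCII-only "for" clause is left in place; only a clause that
   contains non-ASCII characters is dropped.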

   "Encapsulation on recursive part" (Section 5.1.1) mentions several
   error conditions.  Although it defines the output in those cases, a
   converting MTA is permitted to bounce (return an NDN) or reject (at
   the SMTP level) the internationalized email message.  A downgrading
   MUA can refuse to downgrade the internationalized email message and
   give an error message to the user, or produce the downgraded message
   and give a warning message to the user.  Silent operation is not
   recommended when an error condition occurs (in a downgrading MUA).

   NOTE:  Only "Date" and "From" header fields are required in email (as
      per [RFC2822]).

      However, the "multipart/utf8-encapsulated" format is also usable
      by non-EAI compliant MUAs, assuming that they support MIME,
      especially if the original internationalized email message used
      UTF-8 characters only in the main header part and not in the
      header parts of MIME body parts.  Therefore it is useful if the
      "From", "Subject", "To" and "Cc" header fields are derived from
      the original internationalized email according to
      [ietf-eai-downgrade].  This allows reply commands to work in
      non-EAI compliant MUAs.

5.2.1.  Encapsulation example

   An encapsulation example

       Original internationalized email:

       ==========================================
       Received: from {idn-encoded-name}
           by downgrade.example.org with ESMTP
           id JGR17356;
           Wed, 13 Sep 2006 22:27:25 +0300
       Downgrade-Method: encapsulate
       From: { UTF-8 address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { UTF-8 subject }
       X-Foobar: XvrT
       Mime-Version: 1.0
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }
       ==========================================

       Encapsulated internationalized email:

       ==========================================
       I18N-Received: from {idn-encoded-name}
           by downgrade.example.org with ESMTP
           id JGR17356;
           Wed, 13 Sep 2006 22:27:25 +0300
       Downgrade-Method: Encapsulated
       From: { downgraded address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { RFC 2047 encoded subject }
       Mime-Version: 1.0
       Content-Type: multipart/utf8-encapsulated;
         type=encapsulated; boundary="12345"
       Content-Transfer-Encoding: 8bit


       --12345
       Content-Type: text/utf8-header; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       Received: from {idn-encoded-name}
           by downgrade.example.org with ESMTP
           id JGR17356;
           Wed, 13 Sep 2006 22:27:25 +0300
       Downgrade-Method: encapsulate
       From: { UTF-8 address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { UTF-8 subject }
       X-Foobar: XvrT
       Mime-Version: 1.0
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       --12345
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }

       --12345--
       ==========================================


   NOTE:  In this example the "text/utf8-header" part is not base64
      encoded, for clarity.  Base64 encoding is recommended, especially
      because the part includes a line starting with "From".

5.2.2.  Multipart/signed encapsulation example

   Multipart/signed [RFC1847] encapsulation example

       Original internationalized email:

       ==========================================
       Received: from {idn-encoded-name}
           by downgrade.example.org with ESMTP
           id JGR17356;
           Wed, 13 Sep 2006 22:27:25 +0300
       Downgrade-Method: encapsulate
       From: { UTF-8 address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { UTF-8 subject }
       X-Foobar: XvrT
       Mime-Version: 1.0
       Content-Type: multipart/signed;
               protocol="application/XYZ-signature";
               micalg="ABC"; boundary=12345
       Content-Transfer-Encoding: 8bit


       --12345
       Content-Description: { UTF-8 description }
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }

       --12345
       Content-Type: application/XYZ-signature

       { signature data }

       --12345--
       ==========================================

       Encapsulated internationalized email:

       ==========================================
       I18N-Received: from {idn-encoded-name}
           by downgrade.example.org with ESMTP
           id JGR17356;
           Wed, 13 Sep 2006 22:27:25 +0300
       Downgrade-Method: Encapsulated
       From: { downgraded address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { RFC 2047 encoded subject }
       Mime-Version: 1.0
       Content-Type: multipart/utf8-encapsulated;
         type=encapsulated; boundary="45678"
       Content-Transfer-Encoding: 8bit


       --45678
       Content-Type: text/utf8-header; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       Received: from {idn-encoded-name}
           by downgrade.example.org with ESMTP
           id JGR17356;
           Wed, 13 Sep 2006 22:27:25 +0300
       Downgrade-Method: encapsulate
       From: { UTF-8 address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { UTF-8 subject }
       X-Foobar: XvrT
       Mime-Version: 1.0
       Content-Type: multipart/signed;
               protocol="application/XYZ-signature";
               micalg="ABC"; boundary=12345
       Content-Transfer-Encoding: 8bit

       --45678
       Content-Type: multipart/mixed;
               boundary=12345
       Content-Transfer-Encoding: 8bit


       --12345
       Content-Type: multipart/utf8-encapsulated;
         type=part; boundary="abcde"
       Content-Transfer-Encoding: 8bit


       --abcde
       Content-Type: text/utf8-header; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       Content-Description: { UTF-8 description }
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       --abcde
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }

       --abcde--

       --12345
       Content-Type: application/XYZ-signature

       { signature data }

       --12345--

       --45678--
       ==========================================

   The media type "multipart/signed" is replaced with "multipart/mixed"
   in the encapsulated message.  This allows encapsulation of a signed
   header in a MIME body part.

   NOTE:  Again, "text/utf8-header" should be base64 encoded.  This is
      not done here, for clarity.

   NOTE:  Normally the various multipart/signed protocols require that
      the body of signed content be quoted-printable or base64 encoded
      if it includes 8-bit characters.  If it includes 8-bit characters,
      the signature is broken when the email is 8BITMIME downgraded
      [RFC1652].  Note that in general this encapsulation algorithm does
      not protect against breaking of the signature in that case.  In
      this example it may protect the signature, but that is a side
      effect of the encapsulation required for the "Content-Description"
      header field.

5.3.  Attaching internationalized email message

   "Message/rfc822" can be used to attach internationalized email
   messages in EAI compliant environments if "message/rfc822" allows
   UTF-8 header fields.  "Multipart/utf8-encapsulated" with the "type"
   parameter value "message" can be used to attach internationalized
   email messages in EAI non-compliant environments.

   NOTE:  A "message/rfc822" entity that contains
      "Multipart/utf8-encapsulated" with the "type" parameter value
      "encapsulated" also represents an attached internationalized email
      message.  However, the author believes that
      "multipart/utf8-encapsulated" with the "type" parameter value
      "message" provides a useful shorthand.

   NOTE:  If an internationalized email was stored inside a
      "message/rfc822" media type and that "message/rfc822" is inside a
      MIME structure which is encapsulated, "Encapsulation of recursive
      part" (Section 5.1.1) produces a "message/rfc822" that contains
      "Multipart/utf8-encapsulated" with the "type" parameter value
      "part".

   The "Multipart/utf8-encapsulated" media type which represents an
   internationalized email message is produced in the following way:
   o  The internationalized email is considered to be the "original
      entity" and "Generic encapsulation" (Section 5.1) is applied.
   o  The value of the "type" parameter is set to "message" for the
      resulting "multipart/utf8-encapsulated" encapsulating entity.
5.3.1.  Attaching example

   In the following example the mail from the earlier example
   (Section 5.2.1) is attached to a message which is sent outside of the
   EAI compliant environment.

       Encapsulating message:

       ==========================================
       From: someone@example.org
       To: A@CC.example.org
       Subject: { UTF-8 subject } (fwd)
       Mime-Version: 1.0
       Content-Type: multipart/mixed;
          boundary="12345"
       Content-Transfer-Encoding: 8bit


       --12345
       Content-Type: Text/plain

       See attached message.

       --12345
       Content-Type: multipart/utf8-encapsulated;
          type=message; boundary="67890"
       Content-Transfer-Encoding: 8bit


       --67890
       Content-Type: text/utf8-header; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       Downgrade-Method: encapsulate
       From: { UTF-8 address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { UTF-8 subject }
       X-Foobar: XvrT
       Mime-Version: 1.0
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       --67890
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }

       --67890--

       --12345--
       ==========================================

   In this example it is assumed that the MUA knows that
   A@CC.example.org does not handle UTF8SMTP messages and therefore
   encapsulates the attached message.  The recipient (A@CC.example.org)
   may need a helper application for the media type
   multipart/utf8-encapsulated, although the message is mostly readable
   without a helper.


6.  Decoding encapsulation

   There are three cases of encapsulation:
   o  When an internationalized email message is tunneled through an EAI
      non-compliant environment, the media type of the message is
      "multipart/utf8-encapsulated" with the "type" parameter value
      "encapsulated".  The original message is inside that entity.
   o  When an internationalized email message is an included or attached
      message, the media type "multipart/utf8-encapsulated" with the
      "type" parameter value "message" represents the included or
      attached message.
   o  When a MIME body part is encapsulated, the media type
      "multipart/utf8-encapsulated" with the "type" parameter value
      "part" encapsulates the original MIME body part.
   "Generic decoding" (Section 6.1) describes the parts of decoding
   that are common to these three encapsulations.
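
   Telling the three cases apart can be sketched with the Python
   standard "email" package as follows.  This is an illustration only:
   the function name encapsulation_kind is hypothetical, and comparing
   the "type" parameter value without regard to case is an assumption
   of the sketch, not a rule stated in this document.

```python
from email import message_from_string
from typing import Optional

KINDS = {"encapsulated", "message", "part"}


def encapsulation_kind(header_text: str) -> Optional[str]:
    """Return the "type" parameter value of a multipart/utf8-encapsulated
    entity, or None if the header does not describe one of the three cases.
    """
    msg = message_from_string(header_text)
    if msg.get_content_type() != "multipart/utf8-encapsulated":
        return None
    kind = msg.get_param("type")
    if isinstance(kind, str) and kind.lower() in KINDS:
        return kind.lower()
    return None
```

   A decoder could use the returned value to decide whether the
   decoded entity replaces the whole message ("encapsulated"), is
   shown as an attached message ("message"), or replaces a single MIME
   body part ("part").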

6.1.  Generic decoding

   In decoding, an internationalized email message or a MIME body part
   is extracted from "multipart/utf8-encapsulated".  Both an email
   message and a MIME body part are referred to with the term "entity".

   On error conditions the encapsulating entity is not decoded.  Instead
   the original encapsulating entity is returned.

   The decoded internationalized entity is generated from the
   encapsulating entity (multipart/utf8-encapsulated) in the following
   way:
   o  If the media type of the encapsulating entity is not
      "multipart/utf8-encapsulated", this is an error condition.
   o  If the number of MIME body parts in the encapsulating entity is
      not two (2), this is an error condition.
   o  If the media type of the first MIME body part is not
      "text/utf8-header", this is an error condition.
   o  If the value of the "charset" parameter of the first MIME body
      part is not "UTF-8" or "US-ASCII", this is an error condition.  A
      missing "charset" parameter is treated as equivalent to "US-ASCII"
      as per [RFC2046].
   o  The body of the first MIME body part forms the header part of the
      decoded entity.  Its encoding (as given in the
      "content-transfer-encoding" header field) is decoded.
   o  The body of the second MIME body part forms the body part of the
      decoded entity.  Generation of the body part of the decoded entity
      is described in the next paragraph.
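
   The checks in the list above, up to the recovery of the header part,
   can be sketched with the Python standard "email" package.  This is a
   minimal illustration, not a complete decoder: the names
   extract_header_part and NotDecodable are hypothetical, the body of
   the second MIME body part is not processed, and a caller is expected
   to return the original encapsulating entity when NotDecodable is
   raised.

```python
from email import message_from_bytes


class NotDecodable(Exception):
    """Signals one of the error conditions listed above."""


def extract_header_part(entity: bytes) -> bytes:
    """Return the header part of the decoded entity as raw bytes."""
    msg = message_from_bytes(entity)
    if msg.get_content_type() != "multipart/utf8-encapsulated":
        raise NotDecodable("media type is not multipart/utf8-encapsulated")
    parts = msg.get_payload()
    if not isinstance(parts, list) or len(parts) != 2:
        raise NotDecodable("encapsulating entity must have exactly two parts")
    first = parts[0]
    if first.get_content_type() != "text/utf8-header":
        raise NotDecodable("first part is not text/utf8-header")
    # A missing charset parameter is treated as US-ASCII.
    charset = first.get_param("charset", "US-ASCII")
    if not isinstance(charset, str) or charset.upper() not in ("UTF-8", "US-ASCII"):
        raise NotDecodable("unsupported charset on text/utf8-header")
    # decode=True undoes the content-transfer-encoding of the first part.
    return first.get_payload(decode=True)
```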

   The body of the decoded entity is generated in the following way:
   o  If both the media type of the second MIME body part and the media
      type for the decoded entity (from the body of the first MIME body
      part) are discrete, then
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is identity (i.e.
         "content-transfer-encoding" is "7bit", "8bit" or "binary")
         +  The body of the second MIME body part forms the body part of
            the decoded entity.
         +  Its encoding (as given in the "content-transfer-encoding"
            header field of the second MIME body part) is decoded.
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is the same as the encoding of the second
         MIME body part,
         +  The body of the second MIME body part forms the body part of
            the decoded entity.
         +  The encoding is not decoded.
      *  Otherwise this is an error condition.
   o  Otherwise, if both the top level type of the second MIME body part
      and the top level type for the decoded entity (from the body of
      the first MIME body part) are "multipart", then
      *  Generation of the body part of the decoded entity is described
         in the following paragraphs ("Decoding of multipart").
   o  Otherwise, if both the media type of the second MIME body part and
      the media type for the decoded entity (from the body of the first
      MIME body part) are "message/rfc822", then
      *  The body of the second MIME body part is parsed (into header
         and body) and is processed as described in "Decoding of
         recursive part" (Section 6.1.1).  The result is copied to the
         body of the decoded entity.
   o  Otherwise, if the media type of the second MIME body part is
      "application/octet-stream", then
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is identity (i.e. its
         "content-transfer-encoding" is "7bit", "8bit" or "binary"),
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding (as given in the "content-transfer-encoding"
            header field of the second MIME body part) is decoded.
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is the same as the encoding of the
         second MIME body part,
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding is not decoded.
      *  Otherwise this is an error condition.
   o  Otherwise, if the media type of the second MIME body part is the
      same as the media type for the decoded entity (from the body of
      the first MIME body part), then
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is identity (i.e. its
         "content-transfer-encoding" is "7bit", "8bit" or "binary"),
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding (as given in the "content-transfer-encoding"
            header field of the second MIME body part) is decoded.
      *  If the encoding for the decoded entity (from the body of the
         first MIME body part) is the same as the encoding of the
         second MIME body part,
         +  The body of the second MIME body part forms the body part
            of the decoded entity.
         +  The encoding is not decoded.
      *  Otherwise this is an error condition.
         +  NOTE: This algorithm does not handle cases where the body
            part has been re-encoded (for example, from
            quoted-printable to base64).  Reversing the re-encoding is
            of course possible, but it does not necessarily give
            exactly the same representation.
      *  NOTE: This rule handles unknown media types.  However,
         unknown composite media types were stored as
         "application/octet-stream" if they included non-ASCII
         characters, so this rule mostly handles discrete media types.
         It is possible that the generator of the encapsulation knows
         that a type is discrete while the decoder of the
         encapsulation does not.
   o  Otherwise this is an error condition.
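
   The rule selection above can be sketched as follows.  This is a
   minimal illustration only, not part of the encapsulation format.
   The function names and the returned labels are hypothetical, and
   treating every top-level type other than "multipart" and "message"
   as discrete is an assumption of the sketch.

```python
IDENTITY = {"7bit", "8bit", "binary"}


def top_level(media_type):
    """Return the top-level type, e.g. "text" for "text/plain"."""
    return media_type.lower().split("/", 1)[0]


def is_discrete(media_type):
    # Assumption: every top-level type other than the RFC 2046 composite
    # types ("multipart", "message") is treated as discrete.
    return top_level(media_type) not in ("multipart", "message")


def encoding_action(target_cte, stored_cte):
    """Decide whether the stored encoding is decoded or copied as is."""
    if target_cte.lower() in IDENTITY:
        return "decode"
    if target_cte.lower() == stored_cte.lower():
        return "copy"
    return "error"


def body_action(stored_type, stored_cte, target_type, target_cte):
    """Select the rule used to generate the body of the decoded entity.

    "stored" refers to the second MIME body part, "target" to the values
    recorded for the decoded entity in the first MIME body part.
    """
    stored, target = stored_type.lower(), target_type.lower()
    if is_discrete(stored) and is_discrete(target):
        return encoding_action(target_cte, stored_cte)
    if top_level(stored) == "multipart" and top_level(target) == "multipart":
        return "multipart"   # handled by the multipart procedure
    if stored == "message/rfc822" and target == "message/rfc822":
        return "message"     # handled by "Decoding of recursive part"
    if stored == "application/octet-stream" or stored == target:
        return encoding_action(target_cte, stored_cte)
    return "error"
```

   For instance, a quoted-printable "text/plain" part whose decoded
   entity is recorded as "8bit" yields "decode", while a base64 part
   whose decoded entity is recorded as quoted-printable yields "error",
   matching the NOTE about re-encoded body parts.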

   When the media type is multipart (both on the second MIME body part
   and on the decoded entity), the body of the decoded entity is
   generated in the following way:
   1.  The "boundary" parameter value from the second MIME body part
       is remembered.
       *  If the "boundary" parameter is missing, this is an error
          condition.
   2.  The "boundary" parameter value for the decoded entity (from the
       body of the first MIME body part) is remembered.  This is the
       new boundary, which is used in the generated body of the
       decoded entity.
       *  If the "boundary" parameter is missing, this is an error
          condition.
   3.  The "preamble" area from the body of the second MIME body part
       is copied to the body of the decoded entity.
       *  It is not an error condition on decoding if the "preamble"
          area includes non-ASCII characters.
   4.  The body parts of the multipart (from the body of the second
       MIME body part) are handled:
       1.  The boundary delimiter line is copied to the body of the
           decoded entity, with the boundary of the second MIME body
           part replaced by the boundary of the decoded entity.
       2.  The body part is parsed (into header and body parts) and
           processed as described in "Decoding of recursive part"
           (Section 6.1.1).  The result is copied to the body of the
           decoded entity.
   5.  The final boundary delimiter line is copied to the body of the
       decoded entity, with the boundary of the second MIME body part
       replaced by the boundary of the decoded entity.
       *  A final boundary delimiter line is not generated in the
          decoded entity if the final boundary delimiter line is
          missing from the second MIME body part.  This is not an
          error condition on decoding.
   6.  The "epilogue" area from the body of the second MIME body part
       is copied to the body of the decoded entity.
       *  It is not an error condition on decoding if the "epilogue"
          area includes non-ASCII characters.

   NOTE:  The CRLF preceding the boundary delimiter line is conceptually
      attached to the boundary (as per [RFC2046]).  That CRLF is not
      part of the body part of the multipart.
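
   The boundary replacement in steps 4 and 5 can be sketched as
   follows.  This is an illustration under stated assumptions, not a
   complete decoder: the function name rewrite_boundaries is
   hypothetical, the body is assumed to use CRLF line endings, and the
   recursive processing of each body part (step 4.2) is omitted, so
   body parts are copied unchanged.

```python
def rewrite_boundaries(body: bytes, old: str, new: str) -> bytes:
    """Replace boundary delimiter lines that use `old` with `new`.

    Preamble, body parts and epilogue are copied unchanged.  Lines after
    the final boundary delimiter line belong to the epilogue and are
    never rewritten, even if they happen to look like a delimiter.
    """
    delim = b"--" + old.encode("ascii")
    close = delim + b"--"
    new_delim = b"--" + new.encode("ascii")
    out = []
    closed = False
    for line in body.split(b"\r\n"):
        # RFC 2046 allows trailing whitespace after the boundary.
        stripped = line.rstrip(b" \t")
        if not closed and stripped == close:
            out.append(new_delim + b"--")
            closed = True
        elif not closed and stripped == delim:
            out.append(new_delim)
        else:
            out.append(line)
    return b"\r\n".join(out)
```

   If the final boundary delimiter line is missing from the input, none
   is produced in the output, which mirrors the rule that this is not
   an error condition on decoding.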

6.1.1.  Decoding of recursive part

   Decoding of a recursive part is done in the following way:
   1.  If the media type of the recursive part is
       "multipart/utf8-encapsulated" and its "type" parameter has the
       value "part":
       1.  The recursive part is considered to be the "encapsulating
           entity" and "Generic decoding" (Section 6.1) is applied.
       2.  The resulting decoded entity is the result of "Decoding of
           recursive part".
   2.  If the media type of the recursive part is discrete, the result
       of "Decoding of recursive part" is the recursive part itself.
   3.  Otherwise, if the media type of the recursive part is
       "message/rfc822", then
       *  The header part of the "Decoding of recursive part" result
          is the header part of the recursive part.
       *  The body part of the recursive part is parsed (into header
          and body parts) and processed as described in "Decoding of
          recursive part" (Section 6.1.1).  The body part of the
          "Decoding of recursive part" result is the result of that
          processing.
   4.  Otherwise, if the top-level type of the recursive part is
       multipart (i.e. the media type is multipart/*) and the
       "boundary" parameter exists, its handling is described below.
       *  A missing "boundary" parameter on multipart types is not in
          itself an error condition on decoding.
   5.  Otherwise, if the recursive part is ASCII only and the encoding
       of the recursive part is identity (i.e. its
       "content-transfer-encoding" is "7bit", "8bit" or "binary"), the
       result of "Decoding of recursive part" is the recursive part
       itself.
   6.  Otherwise this is an error condition.
       *  NOTE: This means that a missing "boundary" parameter is an
          error condition for decoding if the body is not ASCII only
          (or does not use the required encoding).
       *  NOTE: This means that unknown composite types are an error
          condition if the body is not ASCII only (or does not use the
          required encoding).
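
   The six cases above can be sketched as a single dispatch function.
   This is a hedged illustration: the function name, its parameters
   and the returned labels are hypothetical, and every top-level type
   other than "multipart" and "message" is assumed to be discrete.

```python
IDENTITY = {"7bit", "8bit", "binary"}


def recursive_action(media_type, type_param, has_boundary, cte, ascii_only):
    """Classify a recursive part according to cases 1 to 6.

    media_type   -- media type of the recursive part
    type_param   -- value of its "type" parameter, or None
    has_boundary -- whether a "boundary" parameter exists
    cte          -- its content-transfer-encoding
    ascii_only   -- whether the part contains only ASCII characters
    """
    media_type = media_type.lower()
    top = media_type.split("/", 1)[0]
    if (media_type == "multipart/utf8-encapsulated"
            and (type_param or "").lower() == "part"):
        return "generic-decoding"      # case 1
    if top not in ("multipart", "message"):
        return "as-is"                 # case 2: discrete type (assumed)
    if media_type == "message/rfc822":
        return "message"               # case 3
    if top == "multipart" and has_boundary:
        return "multipart"             # case 4
    if ascii_only and cte.lower() in IDENTITY:
        return "as-is"                 # case 5
    return "error"                     # case 6
```

   A multipart part without a "boundary" parameter therefore falls
   through to case 5 or case 6, which is the behaviour the two NOTEs
   describe.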

   If the top-level type is multipart, the result of "Decoding of
   recursive part" is generated in the following way:
   1.  The header part of the "Decoding of recursive part" result is
       the header part of the recursive part.
   2.  The "boundary" parameter value from the recursive part is
       remembered.
       *  Handling of a missing "boundary" parameter is described
          above.
   3.  The body part of the "Decoding of recursive part" result is
       initialized.
   4.  The "preamble" area from the body of the recursive part is
       copied to the body part of the "Decoding of recursive part"
       result.
       *  It is not an error condition on decoding if the "preamble"
          area includes non-ASCII characters.
   5.  The body parts of the multipart (from the body of the recursive
       part) are handled:
       1.  The boundary delimiter line is copied to the body part of
           the "Decoding of recursive part" result.
       2.  The body part is parsed (into header and body parts) and
           processed as described in "Decoding of recursive part"
           (Section 6.1.1).  The result of that processing is copied
           to the body part of the "Decoding of recursive part"
           result.
   6.  The final boundary delimiter line is copied (from the body of
       the recursive part) to the body part of the "Decoding of
       recursive part" result.
       *  A final boundary delimiter line is not generated in the body
          part of the "Decoding of recursive part" result if the final
          boundary delimiter line is missing from the recursive part.
          This is not an error condition on decoding.
   7.  The "epilogue" area from the body of the recursive part is
       copied to the body part of the "Decoding of recursive part"
       result.
       *  It is not an error condition on decoding if the "epilogue"
          area includes non-ASCII characters.
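
   Separating a multipart body into the "preamble" area, the body
   parts and the "epilogue" area can be sketched as follows.  This is
   only an illustration: the function name split_multipart is
   hypothetical, CRLF line endings are assumed, and the recursive
   processing of each body part is left to the caller.

```python
def split_multipart(body: bytes, boundary: str):
    """Split a multipart body into (preamble, parts, epilogue).

    The CRLF preceding each boundary delimiter line is treated as part
    of the boundary, so it is not included in the returned body parts.
    A missing final boundary delimiter line is tolerated; the epilogue
    is then empty.
    """
    delim = b"--" + boundary.encode("ascii")
    close = delim + b"--"
    sections = [[]]
    closed = False
    for line in body.split(b"\r\n"):
        stripped = line.rstrip(b" \t")
        if not closed and stripped in (delim, close):
            sections.append([])        # a delimiter starts a new section
            closed = stripped == close
        else:
            sections[-1].append(line)
    chunks = [b"\r\n".join(section) for section in sections]
    if closed:
        return chunks[0], chunks[1:-1], chunks[-1]
    return chunks[0], chunks[1:], b""
```

   Each returned body part would then be parsed into header and body
   parts and passed through "Decoding of recursive part" again.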

6.2.  Upgrading of internationalized email message

   When a downgraded internationalized email enters an EAI-compliant
   environment, upgrading is allowed.  [ietf-eai-downgrade] describes
   when upgrading occurs.

   This document defines the "Encapsulated" value for the
   "Downgrade-Method" header field.  The "Header-Type" header field
   defines how upgrading occurs.
   o  If the header field "Downgraded" exists, upgrading of the header
      part (and body) of the mail is done according to
      [ietf-eai-downgrade].
   o  If the header field "Downgrade-Method" exists with the value
      "Encapsulated", upgrading of the header part (and body) of the
      mail is done as described in this section.
   o  If both header fields "Downgraded" and "Downgrade-Method" exist,
      this is an error condition and upgrading is not done.

   The encapsulating entity is not decoded on error conditions.
   Instead, the original encapsulating entity is returned.
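
   The choice between the two upgrade methods can be sketched with
   Python's standard email package.  The function name choose_upgrade
   and the returned labels are hypothetical, and comparing the
   "Downgrade-Method" value case-insensitively is an assumption of the
   sketch, not something this document states.

```python
from email import message_from_string


def choose_upgrade(msg):
    """Return which upgrade procedure applies to a parsed message."""
    has_downgraded = msg["Downgraded"] is not None
    method = msg["Downgrade-Method"]
    if has_downgraded and method is not None:
        # Both header fields present: error, the message is left as is.
        return "error"
    if has_downgraded:
        return "ietf-eai-downgrade"
    # Assumption: the value is compared case-insensitively.
    if method is not None and method.strip().lower() == "encapsulated":
        return "encapsulated"
    return "none"


example = message_from_string(
    "Downgrade-Method: Encapsulated\n"
    "To: someone@example.org\n"
    "\n"
    "body\n"
)
```

   Here choose_upgrade(example) selects the procedure of this section.
   On "error" a caller would return the original encapsulating entity
   unchanged.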

   Upgrading of internationalized email is done in the following way:
   o  If the media type of the downgraded internationalized email is
      not "multipart/utf8-encapsulated", or if the parameter "type"
      does not have the value "encapsulated", this is an error
      condition.
   o  The downgraded internationalized email is considered to be the
      "encapsulating entity" and "Generic decoding" (Section 6.1) is
      applied.
   o  The resulting decoded internationalized entity is the upgraded
      internationalized email.
   o  "Received" header fields from the downgraded internationalized
      email are prepended to the upgraded internationalized email.
      *  The upgraded internationalized email already includes all
         original header fields.  This step adds the trace header
         fields that were inserted into the mail after it was
         downgraded.  It does not re-add trace header fields that were
         added before downgrading, because they are renamed to
         "I18N-Received" in the downgraded internationalized email.
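
   The last step, prepending the trace header fields, can be sketched
   on plain lists of (name, value) pairs.  The function name
   prepend_received and the list representation are assumptions made
   for illustration only.

```python
def prepend_received(outer_headers, decoded_headers):
    """Prepend the outer "Received" fields to the decoded header list.

    Both arguments are lists of (name, value) tuples in message order.
    "I18N-Received" fields of the outer message are not copied, because
    the decoded header already contains the original "Received" fields.
    """
    trace = [(name, value) for (name, value) in outer_headers
             if name.lower() == "received"]
    return trace + list(decoded_headers)


outer = [
    ("Received", "from fw.example.org by upgrade.example.org"),
    ("Received", "from downgrade.example.org by fw.example.org"),
    ("I18N-Received", "from {idn-encoded-name} by downgrade.example.org"),
    ("Downgrade-Method", "Encapsulated"),
]
decoded = [
    ("Received", "from {idn-encoded-name} by downgrade.example.org"),
    ("To", "someone@example.org"),
]
upgraded = prepend_received(outer, decoded)
```

   The resulting list starts with the two "Received" fields added
   after downgrading, followed by the header fields recovered from the
   encapsulation in their original order.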

6.2.1.  Upgrading example

   This example upgrades the mail from the earlier example
   (Section 5.2.1).  The mail is assumed to have been 8BITMIME
   downgraded afterwards, a process which also added some extra header
   fields to the MIME parts.


       Downgraded internationalized email:

       ==========================================
       Received: from fw.example.org
           by upgrade.example.org with ESMTP
           id JAX77356;
           Wed, 13 Sep 2006 22:27:32 +0300
       Received: from downgrade.example.org
           by fw.example.org with ESMTP
           id JAX77356;
           Wed, 13 Sep 2006 22:27:29 +0300
       I18N-Received: from {idn-encoded-name}
           by downgrade.example.org with ESMTP
           id JGR17356;
           Wed, 13 Sep 2006 22:27:25 +0300
       Downgrade-Method: Encapsulated
       From: { downgraded address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { RFC 2047 encoded subject }
       Mime-Version: 1.0
       Content-Type: multipart/utf8-encapsulated;
         type=encapsulated; boundary="12345"
       Content-Transfer-Encoding: 7bit


       --12345
       Content-Type: text/utf8-header; charset=UTF-8
       Content-Transfer-Encoding: quoted-printable
       X-MIME-Autoconverted: from 8bit to quoted-printable
           by downgrade.example.org id JGR17356

       Received: from {idn-encoded-name}
           by downgrade.example.org with ESMTP
           id JGR17356;
           Wed, 13 Sep 2006 22:27:25 +0300
       Downgrade-Method: encapsulate
       From: { q-p encoded UTF-8 address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { q-p encoded UTF-8 subject }
       X-Foobar: XvrT
       Mime-Version: 1.0
       Content-Type: Text/plain; charset=3DUTF-8
       Content-Transfer-Encoding: 8bit

       --12345
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: quoted-printable
       X-MIME-Autoconverted: from 8bit to quoted-printable
           by downgrade.example.org id JGR17356

       { q-p encoded UTF-8 text }

       --12345--
       ==========================================

       Upgraded internationalized email:

       ==========================================
       Received: from fw.example.org
           by upgrade.example.org with ESMTP
           id JAX77356;
           Wed, 13 Sep 2006 22:27:32 +0300
       Received: from downgrade.example.org
           by fw.example.org with ESMTP
           id JAX77356;
           Wed, 13 Sep 2006 22:27:29 +0300
       Received: from {idn-encoded-name}
           by downgrade.example.org with ESMTP
           id JGR17356;
           Wed, 13 Sep 2006 22:27:25 +0300
       Downgrade-Method: encapsulate
       From: { UTF-8 address }
       To: someone@example.org
       Date: Wed, 13 Sep 2006 22:27:25 +0300
       Subject: { UTF-8 subject }
       X-Foobar: XvrT
       Mime-Version: 1.0
       Content-Type: Text/plain; charset=UTF-8
       Content-Transfer-Encoding: 8bit

       { UTF-8 text }
       ==========================================



   Note the handling of the "Received" header fields.  They are the
   only header fields preserved from the downgraded internationalized
   email.  All other header fields are taken from the
   "text/utf8-header" MIME part.  This also means that upgrading does
   not need to add a "Header-Type" header field, because it is
   necessarily already present in the "text/utf8-header" MIME part.  In
   the downgraded email there was an empty line after "UTF-8 text", but
   in the upgraded email it has disappeared because it was part of the
   multipart boundary.
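
   In the example, the "text/utf8-header" MIME part is quoted-printable
   encoded, which is why its body shows "charset=3DUTF-8".  Decoding
   that body recovers the original header lines.  The sketch below uses
   Python's standard quopri module.  The function name
   decode_header_part is hypothetical, and the "Subject" input is an
   invented sample value, not taken from the example.

```python
import quopri


def decode_header_part(body: bytes) -> str:
    """Decode the quoted-printable body of a header part to UTF-8 text."""
    return quopri.decodestring(body).decode("utf-8")


content_type = decode_header_part(
    b"Content-Type: Text/plain; charset=3DUTF-8"
)
subject = decode_header_part(b"Subject: =C3=A4")
```

   After decoding, content_type reads "Content-Type: Text/plain;
   charset=UTF-8", as shown in the upgraded message.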

6.3.  Retrieving attached internationalized email message

   "Multipart/utf8-encapsulated" with the "type" parameter value
   "message" can be used to attach internationalized email messages in
   EAI non-compliant environments.

   Retrieving the internationalized email can be done in the following
   way:
   o  If the media type is "message/rfc822", then
      *  It is parsed (into header and body parts).
      *  The body part is processed as described in "Upgrading of
         internationalized email message" (Section 6.2).
      *  The result is the internationalized email.
   o  If the media type is "Multipart/utf8-encapsulated" and the value
      of the "type" parameter is "message", then
      *  It is considered to be the "encapsulating entity" and "Generic
         decoding" (Section 6.1) is applied.
      *  The result is the internationalized email.
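
   The two retrieval cases can be told apart with Python's standard
   email package, as in the sketch below.  The function name
   retrieval_action and the returned labels are hypothetical, and
   comparing the "type" parameter case-insensitively is an assumption.

```python
from email import message_from_string


def retrieval_action(part):
    """Select how an attached internationalized email is retrieved."""
    ctype = part.get_content_type()    # returned in lower case
    if ctype == "message/rfc822":
        return "upgrade-body"          # process as in Section 6.2
    if ctype == "multipart/utf8-encapsulated":
        type_param = part.get_param("type")
        if isinstance(type_param, str) and type_param.lower() == "message":
            return "generic-decoding"  # process as in Section 6.1
    return "not-applicable"


attached = message_from_string(
    'Content-Type: multipart/utf8-encapsulated;\n'
    '  type=message; boundary="12345"\n'
    '\n'
    '--12345--\n'
)
```

   For the attached sample, retrieval_action(attached) selects
   "Generic decoding".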


7.  IANA Considerations

   IANA is requested to register the I18N-Received and Downgrade-Method
   header fields and the multipart/utf8-encapsulated and
   text/utf8-header media types as given in the registration
   applications in this document.


8.  Security Considerations

   The "multipart/utf8-encapsulated" media type provides a method to
   encapsulate mail data.  In particular, this media type provides a
   method to smuggle mail header fields so that mail scanners do not
   see them.  This may introduce new security threats.

   This encapsulation does not hide the original MIME parts.  However,
   the original MIME structure may be obscured.  This may provide a
   method to smuggle MIME parts so that mail scanners do not see them,
   which may introduce new security threats.

   This encapsulation preserves only "Received" header fields from the
   encapsulating message.  This may hide information when the
   encapsulated message is upgraded to the internationalized email
   format.


9.  Acknowledgements

   This encapsulation format was originally suggested in discussions on
   the former IMAA mailing list.

   Various ideas were suggested in discussions on the IMA mailing list.

   John C. Klensin strongly encouraged the author to write this
   document.


10.  References

10.1.  Normative References

   [ASCII]    American National Standards Institute (formerly United
              States of America Standards Institute), "USA Code for
              Information Interchange", ANSI X3.4-1968, 1968.

              ANSI X3.4-1968 has been replaced by newer versions with
              slight modifications, but the 1968 version remains
              definitive for the Internet.

   [ietf-eai-framework]
              Klensin, J. and Y. Ko, "Overview and Framework for
              Internationalized Email", draft-ietf-eai-framework-05
              (work in progress), February 2007.

   [ietf-eai-downgrade]
              YONEYA, Y., Ed. and K. Fujiwara, Ed., "Downgrading
              mechanism for Email Address Internationalization",
              draft-ietf-eai-downgrade-03 (work in progress),
              March 2007.

   [ietf-eai-utf8headers]
              Yeh, J., Ed. and Abel, Ed., "Internationalized Email
              Headers", draft-ietf-eai-utf8headers-04 (work in
              progress), March 2007.

   [RFC2045]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part One: Format of Internet Message
              Bodies", RFC 2045, November 1996.

   [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part Two: Media Types", RFC 2046,
              November 1996.

   [RFC2047]  Moore, K., "Multipurpose Internet Mail Extensions (MIME)
              Part Three: Message Header Extensions for Non-ASCII Text",
              RFC 2047, November 1996.

   [RFC2822]  Resnick, P., "Internet Message Format", RFC 2822,
              April 2001.

   [RFC2231]  Freed, N. and K. Moore, "MIME Parameter Value and Encoded
              Word Extensions: Character Sets, Languages, and
              Continuations", RFC 2231, November 1997.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", RFC 3629, November 2003.

10.2.  Informative References

   [DKIM-Charter]
              IETF, "Domain Keys Identified Mail (dkim)", October 2006,
              <http://www.ietf.org/html.charters/dkim-charter.html>.

   [RFC1847]  Galvin, J., Murphy, S., Crocker, S., and N. Freed,
              "Security Multiparts for MIME: Multipart/Signed and
              Multipart/Encrypted", RFC 1847, October 1995.

   [RFC1652]  Freed, N., Ed., Rose, M., Stefferud, E., and D. Crocker,
              "SMTP Service Extension for 8bit-MIMEtransport", RFC 1652,
              July 1994.

   [RFC4288]  Freed, N. and J. Klensin, "Media Type Specifications and
              Registration Procedures", RFC 4288, BCP 13, December 2005.

   [RFC3864]  Klyne, G., Nottingham, M., and J. Mogul, "Registration
              Procedures for Message Header Fields", RFC 3864, BCP 90,
              September 2004.


Author's Address

   Kari Hurtta
   Kala-Matti 4 B 24
   02230 Espoo
   FI

   Email: hurtta-ietf@elmme-mailer.org
   URI:   http://iki.fi/keh/


Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).




