This file documents MIME features of FLIM-LB, a fundamental library to process Internet Messages for GNU Emacsen.
FLIM is a library that provides basic features for message representation and encoding. FLIM-LB is a variant of FLIM that adds support for recent versions of Emacs.
Evaluate the following to use the MIME features provided by FLIM:
(require 'mime)
According to RFC 2045, ‘The term “entity”, refers specifically to the MIME-defined header fields and contents of either a message or one of the parts in the body of a multipart entity.’ In this document, the term entity refers to the header fields and the body taken together.
The definition in RFC 2045 implies that a MIME message is a tree, and each node of the tree is an entity. In other words, MIME extends a message into a tree structure.
FLIM uses the mime-entity structure to represent information about an entity. In this document, it is simply called ‘mime-entity’.
Open an entity and return it.
type is the representation-type (cf. Entity representations and implementations).
location is the location of the entity. Its specification depends on the representation-type.
Parse buffer as a message, and set the result, as a mime-entity, to the buffer-local variable mime-message-structure of buffer.
If buffer is omitted, the current buffer is used.
type is the representation-type of the created mime-entity (cf. Entity representations and implementations). The default value is buffer.
The structure of a MIME message is a tree.
In the tree, the root node is the entity that represents the whole message. In this document, it is called the root-entity or message. In FLIM, it is held in the buffer-local variable mime-message-structure.
Each entity except the root-entity has a parent. An entity may have children. An entity can therefore be identified by its position relative to a base entity, following the parent-child relationship.
In addition, an entity can be identified by its absolute position in the message.
Each entity, being a node of the tree, can be numbered by its depth and its left-to-right order at that depth.
                       +-------+
                       |  nil  |
                       +---+---+
       +-------------------+-------------------+
     +-+-+               +-+-+               +-+-+
     | 0 |               | 1 |               | 2 |
     +-+-+               +-+-+               +-+-+
       |         +---------+---------+         |
    +--+--+   +--+--+   +--+--+   +--+--+   +--+--+
    | 0.0 |   | 1.0 |   | 1.1 |   | 1.2 |   | 2.0 |
    +-----+   +-----+   +-----+   +-----+   +-----+
That is, a node at depth n has a node number consisting of n integers. In this document, it is called the entity-number. An entity-number is represented by a list of integers, like (1 2 3).
A mime-entity also has a node-id. A node-id is the entity-number in reversed order. For example, the node-id corresponding to 1.2.3 is (3 2 1).
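The relation between the two forms can be illustrated without FLIM itself. The following is a minimal sketch in plain Emacs Lisp; the names my-entity-number-to-node-id and my-entity-number-to-string are hypothetical helpers introduced only for illustration and are not part of FLIM.

```elisp
;; Hypothetical helpers for illustration; not part of FLIM.
(defun my-entity-number-to-node-id (entity-number)
  "Return the node-id for ENTITY-NUMBER, i.e. the same integers reversed."
  (reverse entity-number))

(defun my-entity-number-to-string (entity-number)
  "Format ENTITY-NUMBER, a list of integers, in dotted form."
  (mapconcat #'number-to-string entity-number "."))

(my-entity-number-to-node-id '(1 2 3))   ; (3 2 1)
(my-entity-number-to-string '(1 2 3))    ; "1.2.3"
```

Because a node-id starts with the innermost index, taking its cdr yields the node-id of the parent, which is one plausible reason for keeping the reversed form.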
Each entity can be identified by its entity-number or node-id in mime-message-structure.
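Addressing by entity-number amounts to walking down the tree, choosing one child per integer. The sketch below uses a toy tree of property lists that mirrors the numbering figure earlier in this section; this representation and the names my-sample-tree and my-find-node-from-number are assumptions made for illustration, not FLIM's own data structure or API.

```elisp
;; A toy tree; each node is a plist with :name and :children.
;; This representation is an assumption, not FLIM's mime-entity.
(defvar my-sample-tree
  '(:name "nil"
    :children ((:name "0" :children ((:name "0.0" :children nil)))
               (:name "1" :children ((:name "1.0" :children nil)
                                     (:name "1.1" :children nil)
                                     (:name "1.2" :children nil)))
               (:name "2" :children ((:name "2.0" :children nil))))))

(defun my-find-node-from-number (tree entity-number)
  "Return the node of TREE addressed by ENTITY-NUMBER, or nil if absent.
An empty ENTITY-NUMBER addresses the root."
  (let ((node tree))
    ;; Consume one integer per level, descending into that child.
    (while (and node entity-number)
      (setq node (nth (car entity-number) (plist-get node :children))
            entity-number (cdr entity-number)))
    node))

(plist-get (my-find-node-from-number my-sample-tree '(1 2)) :name) ; "1.2"
```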
Buffer-local variable that stores the mime-entity structure of a message.
Return the list of entities included in the entity.
Return the parent entity of the entity.
If message is specified, it is regarded as the root instead of mime-message-structure.
Return non-nil if entity is the root entity (message).
Return the node-id of entity.
Return the entity-number of entity.
Return the entity identified by entity-number in message.
If message is not specified, mime-message-structure is used.
Return the entity identified by entity-node-id in message.
If message is not specified, mime-message-structure is used.
Return the entity identified by cid in message.
If message is not specified, mime-message-structure is used.
Return content-type of entity. (cf. mime-content-type structure)
Return content-disposition of entity. (cf. mime-content-disposition structure)
Return file name of entity.
Return content-transfer-encoding of entity. (cf. Encoding Method)
If the entity does not have a Content-Transfer-Encoding field, this function returns default-encoding. If that is nil, "7bit" is used as the default value.
Return non-nil if the contents of entity have already been code-converted.
Return the field-body of the field-name field in the header of entity.
The result is in network representation.
If entity is omitted, mime-message-structure is used as the default value.
If the field-name field is not found, this function returns nil.
Parse the field-name field in the header of entity, and return the result.
The format of the result depends on the kind of field. For a non-structured field, this function returns a string. For a structured field, it returns a list corresponding to the structure of the field.
Strings in the result are converted to the internal representation of Emacs.
If entity is omitted, mime-message-structure is used as the default value.
If the field-name field is not found, this function returns nil.
Insert the decoded contents of the header of entity before point.
invisible-fields is a list of regexps matching field-names to hide. visible-fields is a list of regexps matching field-names to show.
If a field-name matches some element of invisible-fields and matches none of visible-fields, this function does not insert the field.
Each encoded-word (cf. Network representation of header) in the header is decoded. “Raw non us-ascii characters” are also decoded, as default-mime-charset.
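The rule for hiding a field can be written as a small predicate. This is only a sketch of the rule as stated above; my-matches-any-p and my-field-hidden-p are hypothetical names, the case-insensitive matching is an assumption, and FLIM's own implementation may differ in detail.

```elisp
;; Hypothetical helpers; they restate the rule, not FLIM's code.
(defun my-matches-any-p (regexps string)
  "Return non-nil if STRING matches at least one of REGEXPS."
  (let ((case-fold-search t)   ; assumption: field names compare case-insensitively
        (found nil))
    (dolist (regexp regexps found)
      (when (string-match-p regexp string)
        (setq found t)))))

(defun my-field-hidden-p (field-name invisible-fields visible-fields)
  "Return non-nil if FIELD-NAME should be left out.
It is hidden when it matches INVISIBLE-FIELDS and none of VISIBLE-FIELDS."
  (and (my-matches-any-p invisible-fields field-name)
       (not (my-matches-any-p visible-fields field-name))))

(my-field-hidden-p "Received" '("^Received$" "^X-") '("^X-Mailer$")) ; t
(my-field-hidden-p "X-Mailer" '("^Received$" "^X-") '("^X-Mailer$")) ; nil
```

In the second call the visible list acts as an exception to the broader "^X-" pattern, which is the usual reason to supply both lists.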
Insert the contents of entity before point as a text entity.
The contents of the entity are decoded according to its MIME charset. If the entity does not have a charset parameter in its Content-Type field, default-mime-charset is used as the default value.
Symbol indicating the default MIME charset.
It is used when the MIME charset is not specified.
It was originally a variable of APEL.
Return content of entity as byte sequence.
Insert content of entity at point.
Write content of entity into filename.
Insert header and body of entity at point.
Write representation of entity into filename.
Write body of entity into filename.
Return the buffer that contains entity.
Return the start point of entity in the buffer which contains entity.
Return the end point of entity in the buffer which contains entity.
Return the start point of header of entity in the buffer which contains entity.
Return the end point of header of entity in the buffer which contains entity.
Return the start point of body of entity in the buffer which contains entity.
Return the end point of body of entity in the buffer which contains entity.
An entity is an abstraction. It is designed so that various data representations can be used according to purpose.
Each entity has a representation-type, which must be specified when the entity is created (cf. Functions to create mime-entity).
Functions on an entity are implemented by sending a processing request to the entity. Each entity knows its representation-type and calls the processing function that corresponds to it. Such a function is called an entity processing method. A module consisting of the methods for one representation-type is called an mm-backend.
The module name of each mm-backend consists of the prefix mm followed by its representation-type. The module is required automatically the first time an entity of that type is created.
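The naming rule can be shown with a one-line helper. The function my-mm-backend-module-name is hypothetical and only illustrates how the prefix and the representation-type combine; it is not a FLIM function.

```elisp
;; Hypothetical helper showing the naming rule only.
(defun my-mm-backend-module-name (representation-type)
  "Return the module name for REPRESENTATION-TYPE, a symbol."
  (intern (concat "mm" (symbol-name representation-type))))

(my-mm-backend-module-name 'buffer)   ; mmbuffer
```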
Send message to entity with args, and return the result.
args are the arguments of the message.
Define type as an mm-backend.
If parents is specified, type inherits from parents. Each parent must be a representation-type.
Example:
(mm-define-backend chao (generic))
Define name as a method function of the (nth 1 (car args)) backend.
args is like the argument list of a lambda, but (car args) must be a specialized parameter. (car (car args)) is the name of the variable and (nth 1 (car args)) is the name of the backend (representation-type).
Example:
(mm-define-method entity-cooked-p ((entity chao)) nil)
The Content-Type field indicates the kind of contents or data format, such as the media-type and MIME charset. It is defined in RFC 2045.
[Memo]
Historically, the Content-Type field was proposed in RFC 1049. There, Content-Type did not distinguish type and subtype, and there was no mechanism such as MIME charset to represent the character code.
FLIM provides a parser for the Content-Type field and the structure mime-content-type to store the information of a Content-Type field.
The format of the Content-Type field is defined as follows:
“Content-Type” “:” type “/” subtype *( “;” parameter )
For example:
Content-Type: image/jpeg
Content-Type: text/plain; charset=iso-2022-jp
‘type’ and ‘subtype’ indicate the format of an entity. In this document, the pair is called the ‘media-type’. ‘image/jpeg’ and ‘text/plain’ are media-types.
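To make the field format concrete, here is a deliberately simplified parser sketch in plain Emacs Lisp. It is not FLIM's parser: my-parse-content-type is a hypothetical name, the returned list shape is an assumption, and quoted parameter values, comments and values containing ‘=’ or ‘;’ are not handled.

```elisp
;; Simplified illustration; not FLIM's Content-Type parser.
(defun my-parse-content-type (field-body)
  "Parse FIELD-BODY such as \"text/plain; charset=us-ascii\".
Return a list (TYPE SUBTYPE PARAMETERS), where TYPE and SUBTYPE are
symbols and PARAMETERS is an alist of (NAME . VALUE) strings."
  (let* ((parts (split-string field-body ";" t "[ \t]+"))
         (media (split-string (car parts) "/"))
         (params (mapcar (lambda (parameter)
                           (let ((pair (split-string parameter "=")))
                             (cons (downcase (car pair)) (cadr pair))))
                         (cdr parts))))
    (list (intern (downcase (car media)))
          (intern (downcase (cadr media)))
          params)))

(my-parse-content-type "text/plain; charset=iso-2022-jp")
;; (text plain (("charset" . "iso-2022-jp")))
```

Type, subtype and parameter names are lowercased here because they are case-insensitive in MIME; parameter values are left untouched.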
[Memo]
If an entity does not have a Content-Type field, it is treated as if it had the following (cf. us-ascii):
Content-Type: text/plain; charset=us-ascii
Structure to store the information of a Content-Type field.
Applications should use the reference functions mime-content-type-SLOT to access information in the structure.
The slots of the structure are as follows:
primary type of media-type (symbol).
subtype of media-type (symbol).
parameters of Content-Type field (association-list).
Constructor of content-type. The parameters argument is optional.
Return the value of parameter of content-type.
Parse string as the field-body of a Content-Type field, and return the result as a mime-content-type structure.
Parse the Content-Type field of the current buffer, and return the result as a mime-content-type structure.
Return nil if the Content-Type field is not found.
Content-Disposition field is an optional field to specify presentation of an entity or attributes of an entity, such as file name.
[RFC 2183]
S. Dorner, K. Moore and R. Troost, “Communicating Presentation Information in Internet Messages: The Content-Disposition Header”, August 1997, Standards Track.
FLIM provides a parser for the Content-Disposition field and the structure mime-content-disposition to store the information of a Content-Disposition field.
Structure to store the information of a Content-Disposition field.
Applications should use the reference functions mime-content-disposition-SLOT to access information in the structure.
The slots of the structure are as follows:
disposition-type (symbol).
parameters of Content-Disposition field (association-list).
Return the value of parameter of content-disposition.
Return the filename of content-disposition.
Parse string as the field-body of a Content-Disposition field, and return the result as a mime-content-disposition structure.
Parse the Content-Disposition field of the current buffer, and return the result as a mime-content-disposition structure.
Return nil if the Content-Disposition field is not found.
The Content-Transfer-Encoding field is a header field that indicates the body encoding of an entity.
FLIM provides parser functions for the Content-Transfer-Encoding field. They represent the information of the field as a string.
In addition, FLIM provides encoder/decoder functions selected by Content-Transfer-Encoding.
Parse string as the field-body of a Content-Transfer-Encoding field, and return the result.
Parse the Content-Transfer-Encoding field of the current buffer, and return the result.
Return default-encoding if the Content-Transfer-Encoding field is not found. If default-encoding is not specified, nil is used as the default value.
Encode the region from start to end of the current buffer using encoding.
Decode the region from start to end of the current buffer using encoding.
Decode string, which is encoded in encoding, and return the result.
Insert the file filename, encoded in the encoding format.
Decode the current region, which is encoded in encoding, and write it into filename.
start and end are buffer positions.
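The idea of selecting a decoder by the encoding name can be sketched with Emacs built-ins alone. The function my-decode-string below is hypothetical and is not FLIM's decoder; it handles only Base64 (through the built-in base64-decode-string) and the identity encodings, and signals an error for anything else.

```elisp
;; Hypothetical dispatcher; FLIM's own functions cover more encodings.
(defun my-decode-string (string encoding)
  "Decode STRING according to ENCODING, a Content-Transfer-Encoding name."
  (let ((name (downcase encoding)))
    (cond ((string= name "base64")
           (base64-decode-string string))
          ;; These encodings leave the data unchanged.
          ((member name '("7bit" "8bit" "binary"))
           string)
          (t
           (error "Unsupported encoding: %s" encoding)))))

(my-decode-string "SGVsbG8=" "base64")   ; "Hello"
(my-decode-string "Hello" "7bit")        ; "Hello"
```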
Return the list of Content-Transfer-Encodings.
If service is specified, return the list of Content-Transfer-Encodings available for it.
Return a table of Content-Transfer-Encodings for completion.
If service is specified, return the table of Content-Transfer-Encodings available for it.
Define name as a method function of the (nth 1 (car (last args))) backend.
args is like the argument list of a lambda, but (car (last args)) must be a specialized parameter. (car (car (last args))) is the name of the variable and (nth 1 (car (last args))) is the name of the backend (encoding).
Example:
(mel-define-method mime-write-decoded-region (start end filename
                                                    (nil "base64"))
  "Decode and write current region encoded by base64 into FILENAME.
START and END are buffer positions."
  (interactive
   (list (region-beginning) (region-end)
         (read-file-name "Write decoded region to file: ")))
  (let ((str (buffer-substring start end)))
    (with-temp-buffer
      (insert (decode-base64-string str))
      (write-region-as-binary (point-min) (point-max) filename))))
Set the function definition of spec to function.
The first element of spec is the service.
The rest of spec is like the argument list of a lambda, but (car (last args)) must be a specialized parameter. (car (car (last args))) is the name of the variable and (nth 1 (car (last args))) is the name of the backend (encoding).
Example:
(mel-define-method-function (mime-encode-string string (nil "base64")) 'encode-base64-string)
Define name as a service for Content-Transfer-Encodings.
If args is specified, name is defined as a generic function for the service.
Example:
(mel-define-service encoded-text-encode-string (string encoding)
  "Encode STRING as encoded-text using ENCODING.
ENCODING must be string.")
RFC 2047 defines the encoded-word, a format for representing non-ASCII characters in a header.
[RFC 2047]
K. Moore, “MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text”, November 1996, Standards Track (obsoletes RFC 1521, 1522, 1590).
The encoded-word is the only valid format for representing non-ASCII characters in a header, but invalid styles also exist. Such malformed messages put non-ASCII characters in headers without using encoded-words (these are called "raw" non-ASCII characters).
FLIM provides encoding/decoding features for both encoded-words and invalid "raw" non-ASCII characters.
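An encoded-word has the form =?charset?encoding?encoded-text?=, where the encoding is B (Base64) or Q. The following sketch decodes a single B-encoded word using only Emacs built-ins. It is an illustration, not FLIM's decoder: my-decode-b-encoded-word is a hypothetical name, Q-encoding is not handled, and the charset is assumed to be usable directly as an Emacs coding system name.

```elisp
;; Illustration only; handles one B-encoded word and nothing else.
(defun my-decode-b-encoded-word (word)
  "Decode WORD if it is a single B-encoded encoded-word.
Return WORD unchanged if it has another form or an unknown charset."
  (if (string-match "\\`=\\?\\([^?]+\\)\\?[Bb]\\?\\([^?]*\\)\\?=\\'" word)
      (let ((coding (intern (downcase (match-string 1 word))))
            (text (match-string 2 word)))
        (if (coding-system-p coding)
            ;; Base64 yields raw bytes; the charset turns them into text.
            (decode-coding-string (base64-decode-string text) coding)
          word))
    word))

(my-decode-b-encoded-word "=?US-ASCII?B?SGVsbG8=?=")   ; "Hello"
```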
Decode MIME encoded-words in header fields.
If code-conversion is nil, only encoded-words are decoded. If code-conversion is a MIME charset, non-ASCII bit patterns are decoded as that MIME charset. Otherwise non-ASCII bit patterns are decoded as default-mime-charset (cf. Text presentation of entity).
If separator is not nil, it is used as the header separator.
Encode header fields into network representation, such as MIME encoded-words.
Each field is encoded by the method specified for it in the variable mime-field-encoding-method-alist.
Association list specifying the field encoding method. Each element looks like (FIELD . METHOD).
If METHOD is mime, the FIELD is encoded into MIME format (encoded-word).
If METHOD is nil, the FIELD is not encoded.
If METHOD is a MIME charset, the FIELD is encoded in that charset when it must be converted into network code.
Otherwise the FIELD is encoded in the charset given by the variable default-mime-charset when it must be converted into network code.
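The shape of such an association list, and how a METHOD might be interpreted, can be sketched as follows. The variable my-field-encoding-method-alist and the function my-field-encoding-kind are hypothetical; the entries are made up for illustration and are not FLIM's defaults. For brevity, any METHOD other than mime or nil is treated here as a charset, so the final fallback to default-mime-charset is not modeled.

```elisp
;; A hypothetical alist in the (FIELD . METHOD) shape; not FLIM's default.
(defvar my-field-encoding-method-alist
  '(("Subject" . mime)
    ("Message-ID" . nil)
    ("X-Note" . iso-2022-jp)))

(defun my-field-encoding-kind (field)
  "Describe how FIELD is treated under `my-field-encoding-method-alist'.
Return `encoded-word', `not-encoded', a charset symbol, or `unlisted'."
  (let ((entry (assoc-string field my-field-encoding-method-alist t)))
    (cond ((null entry) 'unlisted)
          ((eq (cdr entry) 'mime) 'encoded-word)
          ((null (cdr entry)) 'not-encoded)
          (t (cdr entry)))))

(my-field-encoding-kind "subject")      ; encoded-word
(my-field-encoding-kind "Message-ID")   ; not-encoded
```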
The group associated with functions related to MIME.
It belongs to mail and news.
7bit means any integer between 0 .. 127.
Any data represented by 7bit integers is called 7bit data.
A textual string consisting of control characters between 0 .. 31 and 127, the space represented by 32, and graphic characters between 33 .. 126 is called a 7bit (textual) string.
A conventional Internet MTA can transfer 7bit data, so 7bit data does not need to be encoded by Quoted-Printable or Base64.
However, if the data contains overly long lines, a 7bit MTA cannot transfer it even if it is 7bit data. RFC 822 and RFC 2045 require that lines in 7bit data be no longer than 998 bytes. So if “7bit data” has a line longer than 998 bytes, it is regarded as binary. For example, a PostScript file should be encoded by Quoted-Printable.
8bit means any integer between 0 .. 255.
Any data represented by 8bit integers is called 8bit data.
A textual string consisting of control characters between 0 .. 31, 127, and 128 .. 159, the space represented by 32, and graphic characters between 33 .. 126 and 160 .. 255 is called an 8bit (textual) string.
For example, iso-8859-1 and euc-kr are coded character sets represented by 8bit textual strings.
A traditional Internet MTA can transfer only 7bit data, so if 8bit data is to be transferred by such an MTA, it must be encoded by Quoted-Printable or Base64.
However, 8bit MTAs are becoming more common.
Even so, if the data contains overly long lines, an 8bit MTA cannot transfer it even if it is 8bit data. RFC 2045 requires that lines in 8bit data be no longer than 998 bytes. So if “8bit data” has a line longer than 998 bytes, it is regarded as binary and must be encoded by Base64 or Quoted-Printable.
ASCII is a 94-character set that contains the basic Latin letters (A-Z, a-z), digits and some other characters. It is a standard of the United States of America and a variant of ISO 646.
[ASCII]
“Coded Character Set – 7-Bit American Standard Code for Information Interchange”, ANSI X3.4:1986.
Base64 is a transfer encoding method of MIME defined in RFC 2045.
The encoding process represents 24-bit groups of input bits as output strings of 4 encoded characters. Each encoded character represents an integer 0 .. 63 or the pad. Base64 data must be 4 * n bytes long, so the pad is used to adjust the size.
These 65 characters are a subset of all versions of ISO 646, including US-ASCII, and of all versions of EBCDIC. So the data is safe even if it passes through non-Internet gateways.
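The grouping of 3 input bytes into 4 output characters, and the role of the pad, can be observed with the Emacs built-in base64-encode-string. The helper my-base64-encoded-length is hypothetical and only restates the 4 * n size rule; it does not count line breaks.

```elisp
;; Hypothetical helper restating the size rule.
(defun my-base64-encoded-length (nbytes)
  "Return the length of the Base64 encoding of NBYTES input bytes.
Line breaks are not counted."
  (* 4 (/ (+ nbytes 2) 3)))

(base64-encode-string "abc")   ; "YWJj", one full 24-bit group
(base64-encode-string "a")     ; "YQ==", padded to 4 characters
(my-base64-encoded-length 4)   ; 8
```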
Any byte stream is called binary.
It does not require a line structure. In this it differs from 8bit.
In addition, if line-structured data contains an overly long line (more than 998 bytes), it is regarded as binary.
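The distinction between 7bit, 8bit and binary data can be sketched as a small classifier. This is a simplified illustration with hypothetical function names: STRING is assumed to be a unibyte string, only the 998-byte line limit and the presence of bytes above 127 are checked, and other conditions (such as NUL bytes or bare CR characters) are ignored.

```elisp
;; Simplified classifier; assumes STRING is unibyte.
(defun my-longest-line-length (string)
  "Return the length of the longest line in STRING, excluding line ends."
  (apply #'max 0 (mapcar #'length (split-string string "\r?\n"))))

(defun my-has-8bit-byte-p (string)
  "Return non-nil if STRING contains a byte greater than 127."
  (catch 'found
    (mapc (lambda (byte) (when (> byte 127) (throw 'found t))) string)
    nil))

(defun my-classify-data (string)
  "Return \"7bit\", \"8bit\" or \"binary\" for STRING."
  (cond ((> (my-longest-line-length string) 998) "binary")
        ((my-has-8bit-byte-p string) "8bit")
        (t "7bit")))

(my-classify-data "hello\nworld")         ; "7bit"
(my-classify-data "caf\351")              ; "8bit"
(my-classify-data (make-string 999 ?a))   ; "binary"
```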
A set of unambiguous rules that establishes a character set and the one-to-one relationship between the characters of the set and their bit combinations.
media-type specifies the nature of the data in the body of a MIME entity. It consists of a type and a subtype. It is defined in RFC 2046.
Currently the following standard primary types are defined: text, image, audio, video, application, multipart and message.
There are also various subtypes, for example, application/octet-stream, audio/basic, image/jpeg, multipart/mixed, text/plain, video/mpeg...
You can find the registered media types at MEDIA TYPES (ftp://ftp.isi.edu/in-notes/iana/assignments/media-types).
In addition, you can use a private type or subtype by using an x-token, which has the prefix ‘x-’. However, you cannot use them in public.
(cf. Format of Content-Type field)
In this document, it means mail as defined in RFC 822 and news messages as defined in RFC 1036.
MIME stands for Multipurpose Internet Mail Extensions. It is an extension of RFC 822.
According to RFC 2045:
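STD 11, RFC 822, defines a message representation protocol specifying considerable detail about US-ASCII message headers, and leaves the message content, or message body, as flat US-ASCII text. This set of documents, collectively called the Multipurpose Internet Mail Extensions, or MIME, redefines the format of messages to allow for (1) textual message bodies in character sets other than US-ASCII, (2) an extensible set of different formats for non-textual message bodies, (3) multi-part message bodies, and (4) textual header information in character sets other than US-ASCII.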
It is defined in RFC 2045, RFC 2046, RFC 2047, RFC 2048 and RFC 2049.
A coded character set used in the Content-Type field (cf. Format of Content-Type field) or in the charset parameter of an encoded-word (cf. Network representation of header).
It is defined in RFC 2045.
iso-2022-jp and euc-kr are examples. (In this document, MIME charsets are written in lowercase letters to distinguish them from graphic character sets. For example, ISO 8859-1 is a graphic character set, and iso-8859-1 is a MIME charset.)
Message Transfer Agent. It means mail transfer programs (e.g. sendmail) and news servers.
(cf. MUA)
Quoted-Printable is a transfer encoding method of MIME defined in RFC 2045.
If the data being encoded are mostly US-ASCII text, the encoded form of the data remains largely recognizable by humans.
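A much reduced sketch shows why mostly-ASCII data stays readable. The function my-qp-encode-bytes is hypothetical and covers only the core rule of writing a byte as ‘=’ followed by two hexadecimal digits; soft line breaks, trailing whitespace and the special treatment of line ends required by RFC 2045 are deliberately left out, and STRING is assumed to be a unibyte string without line breaks.

```elisp
;; Reduced illustration; not a complete Quoted-Printable encoder.
(defun my-qp-encode-bytes (string)
  "Encode unibyte STRING byte by byte in Quoted-Printable style.
Printable ASCII other than `=' is kept; other bytes become =XX."
  (mapconcat (lambda (byte)
               (if (or (= byte ?=) (< byte 32) (> byte 126))
                   (format "=%02X" byte)
                 (char-to-string byte)))
             string ""))

(my-qp-encode-bytes "caf\351")   ; "caf=E9"
(my-qp-encode-bytes "a=b")       ; "a=3Db"
```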
(cf. Base64)
An RFC that defines the format of Internet mail messages, mainly the message header.
[Memo]
News messages are based on RFC 822, so “Internet message” may be a more suitable term than “Internet mail”.
[RFC 822]
D. Crocker, “Standard for the Format of ARPA Internet Text Messages”, August 1982, STD 11.
An RFC that defines the format of USENET messages. It is a subset of RFC 822. It is not an Internet standard, but many netnews systems other than Usenet also use it.
[USENET: RFC 1036]
M. Horton and R. Adams, “Standard for Interchange of USENET Messages”, December 1987, (obsoletes RFC 850).
[RFC 2045]
N. Freed and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies”, November 1996, Standards Track (obsoletes RFC 1521, 1522, 1590).
[RFC 2046]
N. Freed and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types”, November 1996, Standards Track (obsoletes RFC 1521, 1522, 1590).
[RFC 2048]
N. Freed, J. Klensin and J. Postel, “Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures”, November 1996, Standards Track (obsoletes RFC 1521, 1522, 1590).
[RFC 2049]
N. Freed and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Five: Conformance Criteria and Examples”, November 1996, Standards Track (obsoletes RFC 1521, 1522, 1590).
Textual data represented only by a coded character set. It carries no information about font or typesetting.
A MIME charset for the basic Latin script, used mainly for English and some other languages.
It is a 7bit coded character set based on ISO 2022. It contains only ASCII, and code extension is not allowed.
It is the standard coded character set of Internet mail. If the MIME charset is not specified, us-ascii is used as the default.
In addition, ASCII in RFC 822 should be interpreted as us-ascii.
Topics related to FLIM-LB are discussed on the following mailing lists. The latest version is also announced there.
Wanderlust Mailing List <wl@ml.gentei.org>
Discussion on this list is mainly in Japanese. There is also a list for discussion in English:
Wanderlust List in English <wl-en@ml.gentei.org>
(Messages posted to this list are also forwarded to the former one.)
A guide can be obtained automatically by sending mail to wl-ctl@ml.gentei.org (or to wl-en-ctl@ml.gentei.org for the English list) with the body
# guide
Please send bug reports or patches to one of those lists. You have to subscribe to a mailing list to post a message to it.
Note that bug reports about very old versions are not welcome. Bugs in an old version may already have been fixed, so please try the latest version first.
Please write a good bug report. If you write only “FLIM does not work”, we cannot reproduce the situation. At the least, you should state the name, type, variant and version of your OS, Emacs, APEL, FLIM, SEMI and MUA, and your settings. In addition, if an error occurs, sending a backtrace is very important. (cf. Reporting Bugs in GNU Emacs Manual)
A bug usually appears not only in your environment but in many environments (otherwise it might not be a bug). Therefore, if you send mail to the author directly, the author must write the same answer many times. So please send mail to the address of the EMACS-MIME Mailing List instead of to the author.
FLIM-LB’s repository is published on GitHub.
If you send a pull request, please embed unindented ChangeLog entries in the commit message, as Emacs does. See the “Commit messages” section of Emacs’s CONTRIBUTE file [1].
If you send a bug report, please attach a backtrace to it [2].
The oldest part of FLIM’s code originates in mime.el, written by ENAMI Tsugutomo. This small program was a decoder for encoded-words that ran on Nemacs and handled only the B-encoding of iso-2022-jp.
Later, MORIOKA Tomohiko wrote a program called tiny-mime.el based on mime.el. It was an encoder/decoder for encoded-words that ran on Nemacs and Mule. tiny-mime.el supported Q-encoding as well as B-encoding, and could handle the various MIME charsets available in MULE at the same time. The techniques used to support both Nemacs and Mule were later gathered into the emu package.
Around this time, MORIOKA Tomohiko also distributed a collection of settings for using tiny-mime.el with various MUAs. These were later bundled with tiny-mime.el into a single package, which was distributed under the name tm.
MORIOKA Tomohiko eventually wrote tm-body.el, a program for viewing MIME messages. It was soon renamed tm-view.el, and in time it replaced tiny-mime.el as the core of tm.
tm-view.el naturally needs to handle Content-Transfer-Encoding. MEL began to be developed for this purpose. For Base64, the code from tiny-mime.el was moved over, and new code for Quoted-Printable was added. These became mel-b.el and mel-q.el.
Later, MORIOKA Tomohiko added mel-u.el for uuencode, and after that KOBAYASHI Shuhei added mel-g.el for x-gzip64.
Later in tm, MORIOKA Tomohiko reimplemented tiny-mime.el, and in the process a parser for STD 11 was written. It corresponds to the present std11.el. Also in this process, tiny-mime.el was split into tm-ew-d.el, which decodes, and tm-ew-e.el, which encodes. These two are the ancestors of the present eword-decode.el and eword-encode.el.
Later, MORIOKA Tomohiko and others rewrote tm entirely, and in the process tm was split into APEL, MEL, SEMI, EMH, RMAIL-MIME, Gnus-MIME and others. Of these, MEL is the direct ancestor of FLIM.
Later, std11.el was moved in from APEL, and mailcap.el, eword-decode.el and eword-encode.el were moved in from SEMI, and the package was named FLIM.
Shortly before this, TANAKA Akira began writing an implementation more faithful to the RFCs, which is now “FLIM-FLAM”, a branch of FLIM.
Index Entry | Section | |
---|---|---|
D | ||
default-mime-charset | entity formatting | |
M | ||
mime-field-encoding-method-alist | Header encoder/decoder | |
mime-message-structure | Entity hierarchy | |
[1] https://git.savannah.gnu.org/cgit/emacs.git/plain/CONTRIBUTE
[2] http://www.jpl.org/elips/BUGS-ja.html describes how to do so, in Japanese.