<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd' []>
<rfc ipr="trust200902" category="std" docName="draft-gondwana-imap-uniqueid-00">
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc private=""?>
<?rfc topblock="yes"?>
<?rfc comments="no"?>
<front>
<title abbrev="IMAP UniqueID">IMAP Extension for Unique Identifiers</title>

<author role="editor" initials="B." surname="Gondwana" fullname="Bron Gondwana">
<organization>FastMail</organization>
<address>
<postal>
<street>Level 2, 114 William St</street>
<city>Melbourne</city>
<code>VIC 3000</code>
<country>Australia</country>
<region></region>
</postal>
<phone></phone>
<email>brong@fastmailteam.com</email>
<uri>https://www.fastmail.com</uri>
</address>
</author>
<date year="2018" month="March" day="19"/>

<area>Applications</area>
<workgroup>EXTRA</workgroup>
<keyword>IMAP</keyword>
<keyword>email</keyword>


<abstract>
<t>This document adds new properties to IMAP mailboxes and messages to allow
clients to more efficiently re-use cached data for resources which have
changed location on the server.
</t>
</abstract>


</front>

<middle>

<section anchor="introduction" title="Introduction">
<t>IMAP stores are often used by many clients, which each cache information
locally about the server state so that they don't need to download anything
again.  <xref target="RFC3501"/> defines that a mailbox can be uniquely referenced by
its name and UIDVALIDITY, and a message within that mailbox can be uniquely
referenced by its mailbox (name + UIDVALIDITY) and UID.
</t>
<t>Further, <xref target="RFC4315"/> defines a COPYUID response which allows a client which
copies messages between folders to know the mapping between the UIDs in the
source and destination mailboxes, and hence update its local cache.
</t>
<t>So a client which copies (or <xref target="RFC6851"/> moves) messages or renames folders
can update its local cache, but any other client connected to the same store
cannot know with certainty that the messages are identical, and so will
re-download everything.
</t>
<t>This extension adds new properties to a message (MSGID) and mailbox (UNIQUEID)
which allow a client to quickly identify messages or mailboxes which have been
moved or renamed by another client.
</t>
<t>This extension also adds an optional thread identifier (THRID) to messages,
which can be used by the server to indicate messages which it has identified
to be related.
</t>
</section>

<section anchor="conventions-used-in-this-document" title="Conventions Used In This Document">
<t>In examples, &quot;C:&quot; indicates lines sent by a client that is connected
to a server. &quot;S:&quot; indicates lines sent by the server to the client.
</t>
<t>The key words &quot;MUST&quot;, &quot;MUST NOT&quot;, &quot;REQUIRED&quot;, &quot;SHALL&quot;,
&quot;SHALL NOT&quot;, &quot;SHOULD&quot;, &quot;SHOULD NOT&quot;, &quot;RECOMMENDED&quot;,
&quot;MAY&quot;, and &quot;OPTIONAL&quot; in this document are to be interpreted as
described in <xref target="RFC2119"/> when they appear in ALL CAPS.  These words may
also appear in this document in lower case as plain English words,
absent their normative meanings.
</t>
</section>

<section anchor="capability-identification" title="CAPABILITY Identification">
<t>IMAP servers that support this extension MUST include &quot;UNIQUEID&quot; in
the response list to the CAPABILITY command.
</t>
</section>

<section anchor="status-command-and-response-extensions" title="STATUS Command and Response Extensions">
<t>This extension defines one new status data item for the STATUS
command and response:
</t>
<t>UNIQUEID
      A unique identifier for the mailbox.  This identifier SHOULD be
      retained when the mailbox is renamed.  This identifier MUST NOT
      be identical if the mailbox does not meet the invariants for a
      mailbox with the same name and UIDVALIDITY as a mailbox
      previously reported to have this UIDVALIDITY.  A server MUST NOT
      return the same UNIQUEID for two different mailboxes.
</t>
<t>The value of the UNIQUEID is an opaque string of 1..255 bytes in length.
The UNIQUEID is server assigned and read-only.
</t>
<t>The server MAY choose to create a UNIQUEID value in a way that does not
survive RENAME.  For example, a digest of mailbox name and UIDVALIDITY could
be used as a UNIQUEID and would be legal, though clients would not get the
full benefit of this extension from such a server.
</t>
<t>Example:
</t>

<figure align="center"><artwork align="center">
C: 3 create foo
S: 3 OK Completed
C: 4 create bar
S: 4 OK Completed
C: 5 status foo (uniqueid uidvalidity)
S: * STATUS foo (UIDVALIDITY 1521472287 UNIQUEID 7uijf0bg4yeo51a7)
S: 5 OK Completed
C: 6 status bar (uniqueid uidvalidity)
S: * STATUS bar (UIDVALIDITY 1521472288 UNIQUEID u8vhi0uy16v5k99p)
S: 6 OK Completed
C: 7 rename foo renamed
S: * OK rename foo renamed
S: 7 OK Completed
C: 8 status renamed (uniqueid uidvalidity)
S: * STATUS renamed (UIDVALIDITY 1521472287 UNIQUEID 7uijf0bg4yeo51a7)
S: 8 OK Completed
C: 9 status bar (uniqueid uidvalidity)
S: * STATUS bar (UIDVALIDITY 1521472288 UNIQUEID u8vhi0uy16v5k99p)
S: 9 OK Completed
</artwork></figure>
<t>When the LIST-STATUS IMAP capability <xref target="RFC5819"/> is also available,
the STATUS command can be combined with the LIST command to further
improve efficiency.  This way, the unique ids of many mailboxes can be
queried with just one LIST command.
</t>
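<t>As an illustration only, and not part of this specification, the following
minimal Python sketch shows how a client might use UNIQUEID to detect renames.
It assumes the client keeps a cache keyed by UNIQUEID and has collected
untagged STATUS responses like those in the example above.  The names
parse_status and reconcile are hypothetical, and the parser handles only
unquoted mailbox names.
</t>

<figure align="center"><artwork align="center"><![CDATA[
```python
import re

# Matches an untagged STATUS response such as:
#   * STATUS foo (UIDVALIDITY 1521472287 UNIQUEID 7uijf0bg4yeo51a7)
# Simplifying assumption: the mailbox name is an unquoted atom.
STATUS_RE = re.compile(r"^\* STATUS (\S+) \((.*)\)$")


def parse_status(line):
    """Return (mailbox_name, {ITEM: value}) for one STATUS line."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        raise ValueError("not a STATUS response: %r" % (line,))
    name, body = match.groups()
    tokens = body.split()
    # Status items arrive as alternating name/value pairs.
    names = [token.upper() for token in tokens[0::2]]
    return name, dict(zip(names, tokens[1::2]))


def reconcile(cache, status_lines):
    """Update a cache keyed by UNIQUEID; return detected renames.

    cache maps UNIQUEID -> {"name": str, "uidvalidity": str}.
    """
    renames = []
    for line in status_lines:
        name, items = parse_status(line)
        uniqueid = items["UNIQUEID"]
        uidvalidity = items["UIDVALIDITY"]
        entry = cache.get(uniqueid)
        if entry is not None and entry["uidvalidity"] == uidvalidity:
            if entry["name"] != name:
                # Same mailbox under a new name: keep the cached data.
                renames.append((entry["name"], name))
                entry["name"] = name
        else:
            # Unknown mailbox (or changed UIDVALIDITY): start afresh.
            cache[uniqueid] = {"name": name, "uidvalidity": uidvalidity}
    return renames
```
]]></artwork></figure>
<t>Fed the STATUS responses from before and after the RENAME in the example
above, reconcile would report that "foo" became "renamed" and leave the entry
for "bar" untouched.
</t>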
</section>

<section anchor="fetch-command-and-response-extensions" title="FETCH Command and Response Extensions">
<t>This extension defines two additional FETCH items on messages:
</t>
<t>MSGID
    A server-allocated opaque string value (1..255 bytes) which
    uniquely identifies the content of a single message, that is,
    the exact bytes of the RFC822 FETCH item.  The server MUST NOT
    return the same MSGID for two different sets of bytes.  The
    server SHOULD return the same MSGID for the same set of bytes.
</t>

<t>The server SHOULD retain the same INTERNALDATE for messages with
the same MSGID.
</t>
<t>THRID
    A server-allocated opaque string value (1..255 bytes) which
    is the same for messages which the server has, with its own
    algorithm, decided are &quot;related&quot; in some way.  This is generally
    based on some combination of References, In-Reply-To and Subject,
    but the exact logic is left up to the server implementation.
    If the mailbox does not support THRID, the server returns NIL
    for this FETCH item.
</t>

<t>THRID MUST NOT change if MSGID is the same.
</t>
<t>Example:
</t>

<figure align="center"><artwork align="center">
C: 5 append inbox "20-Mar-2018 03:07:37 +1100" {733}
[...]
Subject: Message A
Message-ID: &lt;fake.1521475657.54797@hotmail.com&gt;
[...]
S: 5 OK [APPENDUID 1521475658 1] Completed

C: 11 append inbox "20-Mar-2018 03:07:37 +1100" {793}
[...]
Subject: Re: Message A
Message-ID: &lt;fake.1521475657.21213@gmail.com&gt;
References: &lt;fake.1521475657.54797@hotmail.com&gt;
[...]
S: 11 OK [APPENDUID 1521475658 2] Completed

C: 17 append inbox "20-Mar-2018 03:07:37 +1100" {736}
[...]
Subject: Message C
Message-ID: &lt;fake.1521475657.60280@hotmail.com&gt;
[...]
S: 17 OK [APPENDUID 1521475658 3] Completed

C: 22 fetch 1:* (msgid thrid)
S: * 1 FETCH (MSGID Md8976d99ac3275bb4e918af4 THRID T4964b478a75b7ea9)
S: * 2 FETCH (MSGID Mdd3c288836c4c7a762b2d2b9 THRID T4964b478a75b7ea9)
S: * 3 FETCH (MSGID Mf2e25fdc09b49ea703b05cef THRID T6311863d02dd95b5)
S: 22 OK Completed (0.000 sec)

C: 23 move 2 foo
S: * OK [COPYUID 1521475659 2 1] Completed
S: * 2 EXPUNGE
S: 23 OK Completed

C: 24 fetch 1:* (msgid thrid)
S: * 1 FETCH (MSGID Md8976d99ac3275bb4e918af4 THRID T4964b478a75b7ea9)
S: * 2 FETCH (MSGID Mf2e25fdc09b49ea703b05cef THRID T6311863d02dd95b5)
S: 24 OK Completed (0.000 sec)

C: 25 select "foo"
[...]
S: 25 OK [READ-WRITE] Completed
C: 26 fetch 1:* (msgid thrid)
S: * 1 FETCH (MSGID Mdd3c288836c4c7a762b2d2b9 THRID T4964b478a75b7ea9)
S: 26 OK Completed (0.000 sec)
</artwork></figure>
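<t>As an illustration only, the following minimal Python sketch shows how a
client might use MSGID to avoid re-downloading a message that has been moved.
It assumes the client keeps a body cache keyed by MSGID and has issued a
FETCH for exactly (msgid thrid), as in the example above.  The name
messages_to_download is hypothetical, and the pattern does not handle other
FETCH items.
</t>

<figure align="center"><artwork align="center"><![CDATA[
```python
import re

# Matches an untagged FETCH response carrying exactly MSGID and THRID:
#   * 1 FETCH (MSGID Mdd3c288836c4c7a762b2d2b9 THRID T4964b478a75b7ea9)
FETCH_RE = re.compile(r"^\* (\d+) FETCH \(MSGID (\S+) THRID (\S+)\)$")


def messages_to_download(fetch_lines, body_cache):
    """Return sequence numbers whose MSGID is missing from body_cache.

    body_cache is any container of MSGID strings already held locally.
    Lines that are not MSGID/THRID FETCH responses are ignored.
    """
    missing = []
    for line in fetch_lines:
        match = FETCH_RE.match(line.strip())
        if match is None:
            continue
        seq, msgid, _thrid = match.groups()
        if msgid not in body_cache:
            missing.append(int(seq))
    return missing
```
]]></artwork></figure>
<t>In the example above, a client that cached message 2 of the inbox before
the MOVE would find its MSGID again in "foo" and would have nothing to
download there.
</t>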
</section>

<section anchor="search-command-extension" title="SEARCH Command Extension">
<t>This extension defines two new search keys for the SEARCH command:
</t>
<t>MSGID blob
    Messages whose MSGID exactly matches the given value.  The
    comparison is byte-for-byte: it does not depend on the charset,
    and case is significant.
</t>
<t>THRID blob
    Messages whose THRID exactly matches the given value.  The
    comparison is byte-for-byte: it does not depend on the charset,
    and case is significant.
</t>
<t>Example: (as if run before the MOVE above when the mailbox had 3 messages)
</t>

<figure align="center"><artwork align="center">
C: 27 search msgid Md8976d99ac3275bb4e918af4
S: * SEARCH 1
S: 27 OK Completed (1 msgs in 0.000 secs)
C: 28 search thrid T4964b478a75b7ea9
S: * SEARCH 1 2
S: 28 OK Completed (2 msgs in 0.000 secs)
</artwork></figure>
</section>

<section anchor="implementation-considerations" title="Implementation considerations">
<t>The case of RENAME INBOX may need special handling for unique ids.
</t>
<t>It is acceptable to change the UNIQUEID on a folder RENAME, but a server
MUST NOT ever re-use a UNIQUEID which has been shown to a client.
</t>
<t>It is advisable (though not required) to have UNIQUEID be globally
unique, but it is only required to be unique within a single
server.
</t>
<t>If you have unique IDs larger than 255 bytes in a data store, it is
safe to use a cryptographically strong hash to convert your IDs into
a UNIQUEID value to display for this extension.  It may be worth
caching that value, as STATUS UNIQUEID is expected to be cheap for
the server to calculate.
</t>
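<t>One way this conversion might look is sketched below in Python.  This is
an illustration under stated assumptions, not a required implementation:
the function name uniqueid_for, the choice of SHA-256 and the cache size are
all hypothetical.  The sketch hashes every internal ID, not only long ones,
so that a short ID passed through unchanged can never equal the digest of a
long one.
</t>

<figure align="center"><artwork align="center"><![CDATA[
```python
import functools
import hashlib


# Memoise results so that repeated STATUS UNIQUEID requests for the
# same mailbox do not recompute the digest.  The size is arbitrary.
@functools.lru_cache(maxsize=4096)
def uniqueid_for(internal_id):
    """Map an internal mailbox ID (bytes, any length) to a UNIQUEID.

    The SHA-256 hex digest is always 64 ASCII characters, which fits
    inside the 1..255 byte limit on UNIQUEID values.
    """
    return hashlib.sha256(internal_id).hexdigest()
```
]]></artwork></figure>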
<t>Ideas for implementing MSGID:
</t>
<t>
<list style="symbols">
<t>Digest of (MailboxName/UIDVALIDITY/UID) - is not kept when moving
messages, but is guaranteed to be unique.</t>
<t>Digest of message content (RFC822 bytes) - expensive unless cached</t>
<t>ID allocated at creation time - very efficient but requires storage
of an additional value.</t>
</list>
</t>
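<t>The first two ideas above can be sketched as follows, to show the
trade-off between them.  This is a minimal Python illustration, not a
required implementation: the function names, the use of SHA-256 and the "M"
prefix (which merely mirrors the example values in this document) are
assumptions.
</t>

<figure align="center"><artwork align="center"><![CDATA[
```python
import hashlib


def msgid_from_location(mailbox_name, uidvalidity, uid):
    """Derive a MSGID from where the message is stored.

    Cheap to compute, but the value changes when the message is
    moved, because the mailbox name and UID change.
    """
    key = "%s/%d/%d" % (mailbox_name, uidvalidity, uid)
    return "M" + hashlib.sha256(key.encode("utf-8")).hexdigest()


def msgid_from_content(rfc822_bytes):
    """Derive a MSGID from the exact message bytes.

    The value is the same wherever the message is stored, but every
    byte must be read, so a server would probably cache the result.
    """
    return "M" + hashlib.sha256(rfc822_bytes).hexdigest()
```
]]></artwork></figure>
<t>With these functions, moving a message from "inbox" to "foo" changes the
location-based value, while the content-based value stays the same.
</t>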
<t>Ideas for implementing THRID:
</t>
<t>
<list style="symbols">
<t>Derive from MSGID of first seen message in the thread.</t>
<t>ID allocated at creation time.</t>
</list>
</t>
<t>References and In-Reply-To data needs to be indexed at message creation
so that matching messages can be found efficiently for threading.
Threading may be either across folders, or within each folder only.
The server has significant leeway here.
</t>
</section>

<section anchor="future-considerations" title="Future considerations">
<t>This extension is intentionally defined to be compatible with the data
model in JMAP. (XXX: ref)
</t>
<t>A future extension could be proposed to give a way to SELECT a mailbox
by uniqueid rather than name.
</t>
<t>An extension to allow fetching message content directly via MSGID and
fetching message listings by THRID could also be proposed.
</t>
</section>

<section anchor="iana-considerations" title="IANA Considerations">
<t>The IANA is requested to add &quot;UNIQUEID&quot; to the &quot;IMAP Capabilities&quot;
registry located at <eref target="http://www.iana.org/assignments/imap-capabilities"/>.
</t>
</section>

<section anchor="security-considerations" title="Security Considerations">
<t>If globally unique identifiers are used as UNIQUEIDs on IMAP folders, then
it may be possible to tell when an account or folder has been renamed,
even if all the mail has been deleted, if the folders themselves are
retained.
</t>
</section>

<section anchor="acknowledgments" title="Acknowledgments">
<t>The EXTRA working group at IETF.
</t>
<t>The Gmail team's X-GM-THRID and X-GM-MSGID implementation.
</t>
</section>

</middle>
<back>
<references title="Normative References">
<?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"?>
<?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.3501.xml"?>
<?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4315.xml"?>
<?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.5819.xml"?>
<?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.6851.xml"?>
</references>

</back>
</rfc>
