MBOX-Line: From slusarz at curecanti.org  Wed Nov 21 14:54:17 2012
To: imap-protocol@u.washington.edu
From: Michael M Slusarz <slusarz@curecanti.org>
Date: Fri Jun  8 12:34:49 2018
Subject: [Imap-protocol] Re: Suspend/Restore feature proposal
In-Reply-To: <e122fa25-9110-4b36-855d-0e7e273c5805@flaska.net>
References: <20121115215854.Horde.zz6B0O0tmt3ylHiEXzXhQQ4@bigworm.curecanti.org>
	<CABa8R6tHP2My0k2LqT1RzHoLQZA+X_jwUMU0cydm2sAwo8f4Fg@mail.gmail.com>
	<20121116143137.Horde.7Kb3aEW5DhM6CnuA2hAx4Q8@bigworm.curecanti.org>
	<e122fa25-9110-4b36-855d-0e7e273c5805@flaska.net>
Message-ID: <20121121155417.Horde.ZeW7JqTPNxTAI-hTtrAT-Q9@bigworm.curecanti.org>

Jan,

Thanks for the input.  My responses are below.

Quoting Jan Kundrát <jkt@flaska.net>:

> Hi Michael,
> I've read your draft, it's an interesting extension. However, it
> seems to me that the whole point here is to save a few roundtrips by
> skipping the process of activating/configuring various optional
> features. I'll discuss each extension separately.

I would strongly disagree with this statement.  As written, the draft
is only minimally concerned with saving on network round-trips.

E.g. a webmail implementation: for any even reasonably sized setup,
interaction between the webmail backend and the IMAP server will
almost certainly be done through a private network.  Such a setup has
the added benefit that the IMAP connection does not need any sort of
security [TLS] overhead.  I have assumed in the draft that the
client/server round-trip is negligible or, at the very least, not the
bottleneck in the IMAP interaction.

However, client/server round-trip *is* most likely an issue for a
whole category of disconnected-like clients: those running on mobile
hardware.  Pipelining in this environment in no way guarantees that a
server can or will return the response in the same network packet.  In
other words, this draft becomes *more* important the more that
client/server round-trip time becomes the bottleneck/limiting factor,
whether pipelined or not.

Sidebar: I'm not a huge fan in general of pipelining as a performance
optimization, since it is not always a feasible option for clients.
For example, a client may use an OO library to connect to the IMAP
server.  This library may not provide a reasonable (or any) way of
sending multiple commands at once via the API.  To start compression,
enable QRESYNC, and set the language, it is more than reasonable to
expect this kind of pseudocode:

$result = $imap->useCompression(true);
// Check for success
$imap->useQresync(true);
// Check for success
$imap->setLanguage([LANGUAGE]);
// Check for success

In any OO IMAP interface, neither the command ordering needed for
efficient pipelining nor the fact that pipelining even exists should
be a part of the API.  Thus pipelining is fairly useless in the real
world as a way to guarantee an increase in performance.

There are other, more important reasons why a mechanism to restore
configuration is useful:

- It prevents the need to re-parse the CAPABILITY list.  Note that
parsing the CAPABILITY list involves *MUCH* more than just the actual
string tokenization of the list, although this alone may not be a
trivial task (see below).

A client may, depending on the capabilities returned, need to perform
various internal initialization tasks.  For example - if
CONDSTORE/QRESYNC is listed, a client may have to then parse a
separate configuration file to grab the details of the local cache
where it is storing this information, and then connect to this cache,
etc.  Or if language is listed, a client might have to parse a local
list of language availability to determine if it can/should change the
language.

And CAPABILITY parsing is more than just determining what capabilities
are listed.  It is also determining which capabilities SHOULD not be
listed.  Just today, Cyrus was fixed due to a bug that our code was
triggering: APPENDing binary data via a literal8 caused Cyrus to
immediately terminate the connection with a BYE response.  Our code is
smart enough to catch this broken behavior by removing BINARY
appending from the list of available capabilities.  But without a way
to ensure that every subsequent connection is a continuation of the
current session, we have to do this detection EVERY SINGLE TIME.  This
is potentially a huge performance hit, since we may be appending MBs
of data to the server before the BYE response can be returned (e.g.
appending a sent-mail message containing attachments).

- As mentioned above, sending an initialization command to the server
may take quite a bit of work on the client side to prepare.  It's not
as easy as hardcoding 'ENABLE QRESYNC' in client code - it may take
quite a few CPU cycles to get to that point in a given client.

Another example: a client keeps all of its imap initialization code in
a separate dynamically-loadable module.  If the session is
successfully resumed, this module does not need to be
loaded/interpreted/run.

- From the server side, it may be much more expensive to initiate an
IMAP session as compared with resuming one.  This draft allows the
server to optimize if possible.  I believe Timo's post indicates that
resuming in Dovecot is more efficient than creating a new session.

- Even when pipelining commands, they still need to be sent, the
incoming command needs to be tokenized (server), the command is
performed (server), the response sent back, any untagged responses are
tokenized (client), the untagged responses are interpreted (client),
the tagged response is tokenized (client), and the tagged response is
processed (client).  None of this is "free".  Pipelining eliminates
none of this.
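
To make the CAPABILITY point above concrete, here is a rough Python
sketch of remembering a broken capability so the expensive probe is
not repeated on every connection.  All names here (known_broken,
mark_broken, effective_capabilities, the session key) are hypothetical
and come from neither the draft nor any real client; it assumes the
client has some stable key that identifies a resumed session.

```python
from __future__ import annotations

# Hypothetical client-side cache: capabilities the server advertises but
# that were observed to be broken, remembered per resumable session.
known_broken: dict[str, set[str]] = {}


def mark_broken(session_key: str, capability: str) -> None:
    """Record that a capability misbehaved (e.g. a BINARY append caused a BYE)."""
    known_broken.setdefault(session_key, set()).add(capability.upper())


def effective_capabilities(session_key: str, advertised: str) -> set[str]:
    """Tokenize an advertised CAPABILITY list and drop known-broken entries."""
    tokens = {tok.upper() for tok in advertised.split()}
    return tokens - known_broken.get(session_key, set())


# Illustrative values only; the session key format is an assumption.
advertised = "IMAP4rev1 LITERAL+ BINARY ENABLE CONDSTORE QRESYNC"
mark_broken("joe@example", "BINARY")
caps = effective_capabilities("joe@example", advertised)
```

Without a way to tie a new connection to the earlier session, the
known_broken entry cannot be trusted and the detection has to run
again.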

>> COMPRESS=DEFLATE
>
> I was wondering if this one actually provides any benefit for a
> webmail client. But you're right that it indeed has an overhead and
> requires a full roundtrip to set up. However, please note that your
> extension also requires a full roundtrip, so you aren't any better
> here.

First a point of clarification: the draft is not specific to webmail
clients.  It is intended for any disconnected client that may need to
initiate multiple IMAP connections during the client's lifetime.

Granted, it is extremely useful for webmail clients due to the
extremely disconnected nature of their connections, but it would also
be highly useful for clients on any device that does not have a
constant (or reliable) network connection to the server, e.g.
smartphone/mobile clients or ActiveSync polling.

Whether or not COMPRESS is beneficial to a webmail implementation is
beyond the scope of this discussion.

Second, you are partially right.  A successful restoration of the
configuration state does require a round-trip to the server.  But a
RESUME command sent before initialization is an example of a command
that CAN be easily pipelined with an authentication command.  And the
full round-trip is offset somewhat by the fact that upon a successful
RESUME, the CAPABILITY string will not be sent back to the client if
the server normally does this automatically on authentication.  And if
the server doesn't normally return CAPABILITY information, then this
is a complete win (RESUME/tagged OK vs. CAPABILITY command/CAPABILITY
untagged response/tagged OK).
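
As a rough illustration of pipelining RESUME with authentication, the
following Python sketch builds both commands into one buffer so they
can go out in a single write.  The RESUME syntax is the proposed one
from the draft's examples, not a published standard, and the helper
name pipelined_resume_login is hypothetical.  It also assumes the user
name and password need no IMAP quoting or literals.

```python
def pipelined_resume_login(token: str, user: str, password: str) -> bytes:
    """Return RESUME and LOGIN as one buffer, each line ending in CRLF."""
    lines = [
        f"A1 RESUME {token}",           # proposed command, per the draft
        f"A2 LOGIN {user} {password}",  # assumes atoms; no quoting done here
    ]
    return "".join(line + "\r\n" for line in lines).encode("ascii")


buf = pipelined_resume_login("c3RhdGUgdG9rZW4=", "joe", "passwd")
```

The client would then read the two tagged responses in order; whether
they arrive in one packet is up to the server and the network.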

>> ENABLE (CONDSTORE/QRESYNC)
>> LANGUAGE
>> COMPARATOR
>
> It looks to me that you can easily pipeline all of these and that
> you do not risk anything by doing so. Yes, I'm aware of the wording
> of the ENABLE RFC which sounds like one really MUST check its return
> code, but a subsequent thread on this list indicated that this was
> not the desired outcome and that it is completely legal to pipeline
> ENABLE QRESYNC with SELECT ... QRESYNC.

I would argue that the language of the RFC still controls despite what
an e-mail on this list says.  A client shouldn't be punished for
interpreting it that way either.

> As of the LANGUAGE -- how often do you expect to hit an error
> condition which is not described by an appropriate response code? I
> don't think that blocking for its result would be a good design
> choice.

That could be your decision as a client author.  I would vehemently disagree.

> And finally, what IMAP servers support the LANGUAGE extension?

Why does this matter?  RFC 5255 is a Standards Track extension.  A
year from now, every IMAP server and 200 new ones may support it.

>> CONVERSIONS
>> saved CONTEXTs
>> NOTIFY
>
> Are you actually aware of a single IMAP server supporting any of
> these (besides CONTEXT=SEARCH, which again can easily be pipelined
> without any race conditions, and is specific to a mailbox state
> anyway, which is outside of scope of your extensions)?

Again, why does this matter?  All of these are Standards Track
extensions (your argument might hold a bit more water if these were
Experimental documents).

And what about future extensions?  Those obviously aren't supported by
ANY server yet.

A given client may not support any of these extensions.  This client
could make the decision that SUSPEND/RESUME is pointless.  That
doesn't mean the SUSPEND feature is pointless since another client may
support ALL of these extensions.

> In general, all of the items which you included as an example look
> like easily pipelineable items. Have you tried to use pipelining for
> these? What was the total time spent waiting for their completion in
> that case? What would be the best theoretical time which you could
> get by RESTORE?

It would be impossible to determine benchmarks since there is no
defined protocol yet.  And, as mentioned above, any given
client/server interaction may provide different results based on their
own internal optimizations and extension support.

About the only thing you could do is look at network traffic savings.
The following is an example of the possible savings given a moderate
use of IMAP configuration state (this is a more real-world example of
Examples 1 & 2 in the draft):

Initial session:

[User authenticated]
A1 CAPABILITY
* CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE
SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT
MULTIAPPEND UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED
I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN
CONTEXT=SEARCH LIST-STATUS SPECIAL-USE ACL RIGHTS=texk
[This is the CAPABILITY list from Dovecot 2.1.10]
A1 OK Capability completed.
A2 ENABLE QRESYNC
* ENABLED QRESYNC
A2 OK Enabled.
A3 LANGUAGE DE
* LANGUAGE (DE)
* NAMESPACE (("" "/")) (("Other Users/" "/" "TRANSLATION" ("Andere
Ben&APw-tzer/"))) (("Public Folders/" "/" "TRANSLATION" ("Gemeinsame
Postf&AM8-cher/")))
A3 OK Sprachwechsel durch LANGUAGE-Befehl ausgefuehrt
[...]
A20 SUSPEND
* SUSPEND c3RhdGUgdG9rZW4=
* BYE Server logging out.
A20 OK Logout completed.

Additional network data required by SUSPEND commands: 29 bytes (LOGOUT
vs. SUSPEND; SUSPEND untagged response)
[However, this command will only normally be run the FIRST time the
session is accessed, so this is a one-time only hit]
Additional round-trips required: 0

Subsequent sessions:

A1 RESUME c3RhdGUgdG9rZW4=
A1 OK
A2 LOGIN joe passwd
A2 OK [RESUME c3RhdGUgdG9rZW4=] LOGIN completed and configuration restored.
[...]

Additional network data required by RESUME commands: 61 bytes (RESUME
command, RESUME response code)
Additional round-trips required: 1
Network data saved by RESUME: ~650 bytes
Round-trips saved: 3
Server parsed commands saved: 3
Client issued commands saved: 3
Untagged responses that do not need to be re-parsed: 4


In this example, the one-time addition of 29 bytes of network traffic
(1 additional untagged response parse) results in a net savings of 2
round-trips (3 saved, 1 added), ~600 bytes of network traffic (~650
saved, 61 added), and 3 commands that no longer need to be issued by
the client and parsed by the server.  And remember this doesn't factor
in any initialization code that needs to be run within the
server/client to perform these commands.
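
For anyone who wants to check the 29- and 61-byte overhead figures,
here is a small Python sketch that counts the bytes from the example
lines above.  The breakdown is an assumption about how the numbers
were derived (CRLF line endings, and the "A1 OK" line counted as part
of the RESUME overhead); wire_len is a hypothetical helper.

```python
CRLF = "\r\n"
TOKEN = "c3RhdGUgdG9rZW4="  # state token from the example session


def wire_len(line: str) -> int:
    """Bytes on the wire for one protocol line, including the CRLF."""
    return len((line + CRLF).encode("ascii"))


# SUSPEND: the command replaces LOGOUT, plus one extra untagged response.
suspend_overhead = (
    wire_len("A20 SUSPEND") - wire_len("A20 LOGOUT")
    + wire_len(f"* SUSPEND {TOKEN}")
)

# RESUME: the command, its tagged OK, and the response code (with its
# trailing space) added to the tagged OK for LOGIN.
resume_overhead = (
    wire_len(f"A1 RESUME {TOKEN}")
    + wire_len("A1 OK")
    + len(f"[RESUME {TOKEN}] ")
)

print(suspend_overhead, resume_overhead)  # prints "29 61"
```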

To me, those are substantial savings, especially when the connection
may be re-established every 10 seconds.

I hope this response explains the reasoning behind, and the need for,
the proposal.  Thanks again for the constructive input.

michael