MBOX-Line: From slusarz at curecanti.org Mon Nov 26 16:32:24 2012
To: imap-protocol@u.washington.edu
From: Michael M Slusarz <slusarz@curecanti.org>
Date: Fri Jun 8 12:34:49 2018
Subject: [Imap-protocol] Re: Suspend/Restore feature proposal
In-Reply-To: <68035860-387d-43a9-8bb7-00744a7868b9@flaska.net>
References: <20121115215854.Horde.zz6B0O0tmt3ylHiEXzXhQQ4@bigworm.curecanti.org>
 <CABa8R6tHP2My0k2LqT1RzHoLQZA+X_jwUMU0cydm2sAwo8f4Fg@mail.gmail.com>
 <20121116143137.Horde.7Kb3aEW5DhM6CnuA2hAx4Q8@bigworm.curecanti.org>
 <e122fa25-9110-4b36-855d-0e7e273c5805@flaska.net>
 <20121121155417.Horde.ZeW7JqTPNxTAI-hTtrAT-Q9@bigworm.curecanti.org>
 <68035860-387d-43a9-8bb7-00744a7868b9@flaska.net>
Message-ID: <20121126173224.Horde.BbqbGly8D0JG4aqG7fxoMw1@bigworm.curecanti.org>

Quoting Jan Kundrát <jkt@flaska.net>:

> On Wednesday, 21 November 2012 23:54:17 CEST, Michael M Slusarz wrote:
>> I would strongly disagree with this statement. As written, the
>> draft is only minimally concerned with saving on network
>> round-trips.
>
> That's quite different from what I've understood from your draft --
> I'd suggest making the motivation clearer, then. But point
> understood, and I've now purged the "let's save roundtrips" from my
> understanding of the draft :). OK.

No need to purge the understanding - saving roundtrips remains a
useful goal. It's just not the primary motivating factor behind the
proposal.

>> $result = $imap->useCompression(true);
>> // Check for success
>> $imap->useQresync(true);
>> // Check for success
>> $imap->setLanguage([LANGUAGE]);
>> // Check for success
>
> It is pretty obvious that if you use synchronous primitives for
> enabling individual sub-features in a serialized fashion, your
> performance will be limited by the round trip times. To put it more
> bluntly, you cannot have code like the one shown above and expect
> good performance.
>
> Coming from that background, I see that it is tempting to replace
> this endless row of synchronous calls, each enabling a single
> optional feature, with a quick way to side-step this process by
> quickly jumping into a pre-negotiated state where everything which
> was enabled before is enabled now as well. However, my point is that
> clients already exist proving that the same efficiency can be
> achieved with the existing facilities. You're right that this
> requires abolishing the serial, synchronized code, but IMAP is not
> particularly friendly with synchronous APIs.

I realize that the API argument is not my strongest one. It becomes
less strong considering that, yes, you could do all this configuration
in a single API call - i.e., when creating the IMAP interaction
object, you configure everything in there.

I still maintain that writing an API that requires advanced knowledge
of IMAP is not that useful. Things like QRESYNC and LIST-STATUS can
be entirely abstracted so a client coder does not need to know
anything about them to take advantage of them.

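As a rough illustration of that kind of abstraction, here is a minimal Python sketch of a client object that takes its options once, at construction time, and works out by itself which extension commands to send. The class name `MailSession`, its fields, and `setup_commands` are hypothetical and belong to no real library; only the capability and command names (ENABLE, QRESYNC, LANGUAGE) are real IMAP ones.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class MailSession:
    """Hypothetical client object: the caller states intent, not IMAP details."""
    capabilities: FrozenSet[str]      # capabilities advertised by the server
    fast_resync: bool = True          # caller just asks for "fast resync"
    language: Optional[str] = None    # preferred language tag, if any

    def setup_commands(self) -> List[str]:
        """Return the setup commands this session would send, in order."""
        commands = []
        # QRESYNC is only switched on if the server offers both ENABLE and QRESYNC.
        if self.fast_resync and {"ENABLE", "QRESYNC"} <= self.capabilities:
            commands.append("ENABLE QRESYNC")
        # LANGUAGE is only sent if the caller asked for one and the server supports it.
        if self.language and "LANGUAGE" in self.capabilities:
            commands.append(f"LANGUAGE {self.language}")
        return commands


session = MailSession(frozenset({"IMAP4rev1", "ENABLE", "QRESYNC"}), language="de")
commands = session.setup_commands()
```

The caller never mentions QRESYNC; it is used when available and silently skipped otherwise.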
>> A client may, depending on the capabilities returned, need to
>> perform various internal initialization tasks. For example - if
>> CONDSTORE/QRESYNC is listed, a client may have to then parse a
>> separate configuration file to grab the details of the local cache
>> where it is storing this information, and then connect to this
>> cache, etc.
>
> So you want to keep the cache information (among other things)
> inside some serialized client-side state storage. What prevents you
> from simply checking the capabilities against the previously
> recorded state and restoring the state when the capabilities match
> exactly? You can do that now, without waiting for this extension.
> Yes, it's ugly, but if your initialization is expensive...

Because there's still no guarantee it's the same server/connection:
that is the key to all of this. A server can "look" the same but that
doesn't prove anything.

What happens when the server is upgraded and UTF-8 searching now
works? The CAPABILITY string is exactly the same. But UTF-8 has been
marked as a bad charset so it will still not be available. And what
about those commands that have been determined to be broken previously
in the session? It is reasonable to expect the CAPABILITY string to
be the same between point releases of an IMAP server, but the server
may have fixed the bug that was causing bad command behavior.

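The weakness of matching on the CAPABILITY string alone can be sketched in a few lines of Python. The cache layout and the names used here (`CachedState`, `bad_charsets`, `can_reuse`) are hypothetical, chosen only to illustrate the point; they do not describe any existing client.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class CachedState:
    """Hypothetical client-side state saved from an earlier session."""
    capability: str                                       # raw CAPABILITY string seen earlier
    bad_charsets: Set[str] = field(default_factory=set)   # charsets the server rejected


def can_reuse(cached: CachedState, capability_now: str) -> bool:
    """Naive check: reuse the cached state if the CAPABILITY string matches."""
    return cached.capability == capability_now


cached = CachedState("IMAP4rev1 ESEARCH", bad_charsets={"UTF-8"})

# The server has been upgraded and UTF-8 searching now works, but the
# advertised capabilities are byte-for-byte identical.
capability_after_upgrade = "IMAP4rev1 ESEARCH"

reuse = can_reuse(cached, capability_after_upgrade)
utf8_still_avoided = "UTF-8" in cached.bad_charsets
```

The comparison succeeds, so the stale "UTF-8 is bad" entry is restored along with everything else, even though the server behavior behind the identical string has changed.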
>> - Even when pipelining commands, they still need to be sent, the
>> incoming command needs to be tokenized (server), the command is
>> performed (server), the response sent back, any untagged responses
>> are tokenized (client), the untagged responses are interpreted
>> (client), the tagged response is tokenized (client), and the
>> tagged response is processed (client). None of this is "free".
>> Pipelining eliminates none of this.
>
> Using the numbers you posted later on, we're speaking about parsing
> roughly 600 bytes of a well-structured text. For me, it's hard to
> believe that this has any measurable impact.

You are incorrect.

I went ahead and set up some rough/quick benchmarking using current
imapproxy behavior as a proxy for the SUSPEND behavior. In this
benchmark, the server and client are on the same machine so network
latency is assumed to be non-existent. The load on this machine is
also non-existent (this test is the only active IMAP process; disk I/O
is negligible).

Login without resuming session (connecting to a Dovecot 2.1 server):

C: 1 LOGIN [login credentials]
S: 1 OK User logged in
C: 2 CAPABILITY
S: * CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE
IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS
THREAD=ORDEREDSUBJECT MULTIAPPEND UNSELECT CHILDREN NAMESPACE UIDPLUS
LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES
WITHIN CONTEXT=SEARCH LIST-STATUS SPECIAL-USE ACL RIGHTS=texk
S: 2 OK Capability completed.
C: 3 ENABLE QRESYNC
S: * ENABLED QRESYNC
S: 3 OK Enabled.

Average elapsed time: 0.087 seconds

Login with resuming session:

C: 1 LOGIN [login credentials]
S: * OK [XPROXYREUSE] IMAP connection reused by squirrelmail-imap_proxy
S: 1 OK User logged in

Average elapsed time: 0.039 seconds

Difference: 0.048 seconds (~120% improvement, i.e. the full login
takes about 2.2x as long as the resumed one)

That is a ~120% improvement in a very common example. And a reminder
that this is WITHOUT any network latency; latency would only increase
the actual real-time difference between the benchmarks.

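For anyone who wants to check the arithmetic, the two averages above can be compared in two ways. The short Python snippet below uses only the two measured figures from this message; the variable names are just for illustration.

```python
full_login = 0.087     # seconds, average without resuming (measured above)
resumed_login = 0.039  # seconds, average when the session is reused (measured above)

difference = full_login - resumed_login

# How much longer the full login takes, relative to the resumed login.
extra_time_pct = difference / resumed_login * 100

# What share of the full login time is avoided by resuming.
time_saved_pct = difference / full_login * 100

print(f"difference: {difference:.3f} s")
print(f"full login takes {extra_time_pct:.0f}% longer than a resumed login")
print(f"resuming removes {time_saved_pct:.0f}% of the full login time")
```

The "~120%" figure corresponds to the first ratio (about 123%); expressed as time removed from the full login, the same measurements give about 55%.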
Caveats:
* imapproxy doesn't require you to provide the token before the auth
command, so that is admittedly not accounted for here.
* However, this RESUME could be easily pipelined with the
authentication command, so you are not adding a round-trip.
* Additionally, RESUME shouldn't result in much additional server
load/processing since it is doing nothing more than storing the token
in the server's memory - the server isn't going to process that token
until the authentication is complete.
* The above example is being routed through an additional proxy server
so there are small performance penalties there.
* Someone will probably say my code sucks, and that parsing shouldn't
take that long. That could very well be true. But I will note that I
am running this example on a totally unloaded IMAP server with a
single user. The reality is that most IMAP servers are not running on
a box that has 0.00 load.

So the gains for this very simple example are significant - initial
login is more than twice as fast. A savings of 0.10 seconds on a
given connection is easily possible once network latency is added,
from a mobile device for example. Given the old Amazon 100ms = 1%
study, the theory behind SUSPEND needs to at least be discussed.

For fun, I also took a look at the performance gain of a MOVE command
vs. the equivalent COPY/STORE/EXPUNGE sequence. Here I saw ~30%
improvement (0.13 seconds vs. 0.10 seconds). Granted, MOVE is being
implemented to allow for atomicity of the move action, but it is a
good comparison.

>> I would argue that the language of the RFC still controls despite
>> what an e-mail on this list says. A client shouldn't be punished
>> for interpreting it that way either.
>
> The RFC is a specification crafted by humans. It has errors, and all
> subsequent revisions will still have errors. (See the errata for a
> list of those which are known already.) If you choose to block and
> not pipeline ENABLE QRESYNC and SELECT ... QRESYNC, you hurt your
> users. (Also note that the clarification given on this list was by
> the original authors of the RFC.)

Yes, but you cited an e-mail message that said this should be the
case. I hardly feel an IMAP implementer is going to take someone's
opinion in an e-mail as canon.

If this shows up as an erratum to RFC 5161, I would tend to agree with
you. But it doesn't at this point.

>>> As for LANGUAGE -- how often do you expect to hit an error
>>> condition which is not described by an appropriate response code?
>>> I don't think that blocking for its result would be a good design
>>> choice.
>>
>> That could be your decision as a client author. I would vehemently
>> disagree.
>>
>>> And finally, what IMAP servers support the LANGUAGE extension?
>>
>> Why does this matter? RFC 5255 is a Standards Track extension. A
>> year from now, every IMAP server and 200 new ones may support it.
>
> I stand by my reasoning. In order for the block to be actually
> useful, you'll have to talk to a server which:
>
> 1) actually implements LANGUAGE,
> 2) executes all commands in parallel OR has the LANGUAGE command
> implemented in such a slow way that it enables parallel processing
> for it,
> 3) returns a failure for one of the first commands which you send
> *and* does not return an appropriate response code.
>
> But it's your client, do whatever you want to do :). I'm merely
> saying that adding an extension driven by the desire to eliminate
> issues like this is not something I support.

See benchmarks above. The LANGUAGE response is more complex than the
ENABLE response, so the floor of the performance increase is 120%.

>> It would be impossible to determine benchmarks since there is no
>> defined protocol yet. And, as mentioned above, any given
>> client/server interaction may provide different results based on
>> their own internal optimizations and extension support.
>
> Right. Well, based on how my client works, I don't expect any
> significant performance gains obtained through this proposal.

Sure - just like IDLE is completely useless for disconnected clients.
That doesn't make SUSPEND any less useful for at least some clients.

> I'm not the standards committee, but having decent numbers saying
> "see, this RESUME extension cuts 40% out of the 1300ms required to
> establish an IMAP session" is something which moves the discussion
> from the current, very vague stage of "this is good -- nope, this is
> worthless" into a stage where we can actually discuss what merits it
> really brings. As you're proposing the extension, you should IMHO
> provide these numbers.

MOVE is ~30% faster than the equivalent commands. SUSPEND, at
least for a simple example, is ~120% faster. (And see below re: NOTIFY
about something that CAN'T practically be done with current
disconnected clients).

> 1) You don't take the initial CAPABILITY into account, but you
> re-request CAPABILITY after login. (You need the initial capability
> to see whether the server supports RESUME at all.) This will change
> the numbers quite a lot.

What's the point of including benchmarking of the initial CAPABILITY?
Both clients need to do this, so there is no difference - it is no
more expensive for a SUSPEND client than a non-SUSPEND client.

And one of the reasons that I designed the RESUME command as I did is
precisely to address the second part of your comment: the need to
potentially send CAPABILITY pre-login. From a client implementer's
standpoint, it is quite likely that you DON'T need this CAPABILITY so
that is an additional advantage.

Let's assume that your client program has previously connected to a
given IMAP server and executed a successful SUSPEND command. The next
time it connects to the same IMAP server, it has no way of knowing
whether that server is identical pre-authentication. However:

1. Since the previously connected server supports the SUSPEND command,
and it is very likely (although not guaranteed) that the server hasn't
changed in the time since the client last connected, it can be assumed
to a high degree of probability that the server supports SUSPEND.
2. A client using SUSPEND information will know which authentication
method was successful the first time it connected to the server.
Following the logic in #1, it can be assumed that the server continues
to support this authentication method.
3. The RESUME command doesn't output any response that needs to be
parsed before authentication can occur.

If #1 happens to not be true, this is irrelevant - a client will just
do normal initialization when resuming (the RESUME command would
generate a BAD tagged response, but a client SHOULD ignore this).

If #2 is not true, a client would have sent 2 unnecessary commands but
otherwise, no harm done.

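The fallback behavior described above can be sketched as a small decision function in Python. This is only an illustration under stated assumptions: RESUME has no defined syntax yet, so the function works purely on the tagged response status of each pipelined command, and the function name and returned strings are hypothetical.

```python
def next_step(resume_status: str, auth_status: str) -> str:
    """Decide what to do after pipelining a (hypothetical) RESUME with authentication.

    Both arguments are the status of the tagged response: "OK", "NO" or "BAD".
    """
    if auth_status != "OK":
        # Assumption #2 failed: the remembered authentication method did not work.
        # Two commands were wasted; fall back to the normal pre-login steps.
        return "fetch CAPABILITY and authenticate again"
    if resume_status != "OK":
        # Assumption #1 failed (e.g. BAD because RESUME is unknown): ignore the
        # error and initialize the session the usual way.
        return "run normal initialization"
    # Both assumptions held: the suspended state can be used.
    return "restore suspended state"
```

In the common case both responses are OK and no pre-login CAPABILITY exchange was needed at all.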
Suppose #1 or #2 is an incorrect assumption in, say, 1 out of 100
connections (which is probably a tremendously conservative example: in
a large webmail installation, with 10,000+ concurrent users, you are
getting millions of connections a day on software that isn't being
touched for several months). Even at this rate, it still makes far
more sense to make these assumptions and, 1% of the time, send an
additional 2 round-trips.

So a client supporting RESUME will likely save ANOTHER entire
round-trip, so the 100%+ gain listed above is again shown to be a
conservative estimate.

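The trade-off can be put into numbers with a few lines of Python. This is a back-of-the-envelope model, not a measurement: it assumes that a correct guess saves exactly one round-trip (the pre-login CAPABILITY) and that a wrong guess costs exactly two extra round-trips, using the 1-in-100 failure rate from the text.

```python
failure_rate = 0.01      # 1 in 100 connections, the figure used above
saved_when_right = 1     # assumed: round-trips saved by skipping pre-login CAPABILITY
wasted_when_wrong = 2    # round-trips wasted when an assumption turns out wrong

# Expected round-trips saved per connection by guessing optimistically.
expected_saving = (1 - failure_rate) * saved_when_right - failure_rate * wasted_when_wrong

# Failure rate at which optimistic guessing stops paying off under this model.
break_even_rate = saved_when_right / (saved_when_right + wasted_when_wrong)

print(f"expected saving: {expected_saving:.2f} round-trips per connection")
print(f"break-even failure rate: {break_even_rate:.0%}")
```

Under these assumptions the optimistic client saves 0.97 round-trips per connection on average, and the guess would have to be wrong about a third of the time before it stopped being worthwhile.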
> 2) The sample token which Timo showed on the other list was way
> longer than base64("state token") you use. Just saying.

Sure. But as long as suspend tokens are not approaching 1000+ bytes,
they should comfortably fit into an IP packet so this is irrelevant.

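A quick Python calculation shows how much room there is. The figures are assumptions for illustration only: a 1500-byte Ethernet MTU and 40 bytes of IPv4 + TCP headers without options; the command text around the token and any TLS overhead are ignored.

```python
import base64
import math


def base64_len(raw_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


ETHERNET_MTU = 1500  # bytes; a common value, assumed here
HEADERS = 40         # bytes; minimal IPv4 + TCP headers, assumed here

payload_room = ETHERNET_MTU - HEADERS

# Largest raw token whose base64 form still fits into one such packet.
largest_raw_token = (payload_room // 4) * 3

# Sanity check of the formula against the real encoder, using the sample string.
sample = base64.b64encode(b"state token")
sample_matches = len(sample) == base64_len(len(b"state token"))
```

Under these assumptions a raw token of a bit over 1000 bytes still fits into a single packet once base64-encoded, which is consistent with the "not approaching 1000+ bytes" rule of thumb.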
> 3) Saving 600 bytes of transmitted data per connection is noise
> compared to what an actual session typically transfers.

A 50-100ms reduction in connection time is not noise. Maybe it is for
a single client connecting to a single server. But it most certainly
is not for large, distributed systems. This kind of savings can be
the difference between needing and not needing an additional server in
the backend farm, which may cost a significant amount of money in
hardware/installation/maintenance costs.

> 4) You could save even more bytes by converting IMAP to a binary
> protocol. That possibility in itself is, however, no reason to do so.

I'm not looking to write IMAP 5. I'm looking at a relatively
uncomplicated way to improve performance in IMAP 4.

> 5) You're taking advantage of eliminating NAMESPACE, but so far
> have ignored LIST and STATUS, even though a typical client will
> need them as well. When the LIST responses come into account,
> savings of 600 bytes start looking more and more like noise -- not
> to mention the mailbox synchronization or data transfers.

Mailbox listing is a very touchy spot for disconnected clients.
Historically, a disconnected client is pretty much stuck with listing
the mailboxes once with the understanding that if another client
changes the mailbox structure there's not much we can do about it
without allowing a user to manually refresh the mailbox list (or
possibly doing something like polling the mailbox list at a given time
interval).

However, as Timo noted, SUSPEND potentially allows disconnected
clients to take advantage of NOTIFY. Which would be a gigantic gain.
With the combination of the two, disconnected clients could
potentially have the equivalent of QRESYNC for mailbox lists, which is
a feature that doesn't currently exist. No amount of pipelining is
going to fix this.

Additionally, this behavior makes SUSPEND useful for connected clients
if such a client locally caches mailbox lists: a desktop client that
opens a second or two faster due to the fact that LISTs don't need to
be sent is a substantial UI improvement.

In other words, SUSPEND brings real-world performance improvements and
provides multiple features that are not possible with current IMAP
protocol/extensions.

Once again, thanks for the comments.

michael