MBOX-Line: From jkt at flaska.net  Tue Nov 27 07:50:57 2012
To: imap-protocol@u.washington.edu
From: =?iso-8859-1?Q?Jan_Kundr=E1t?= <jkt@flaska.net>
Date: Fri Jun  8 12:34:49 2018
Subject: [Imap-protocol] Re: Suspend/Restore feature proposal
In-Reply-To: <20121126173224.Horde.BbqbGly8D0JG4aqG7fxoMw1@bigworm.curecanti.org>
References: <20121115215854.Horde.zz6B0O0tmt3ylHiEXzXhQQ4@bigworm.curecanti.org>
	<CABa8R6tHP2My0k2LqT1RzHoLQZA+X_jwUMU0cydm2sAwo8f4Fg@mail.gmail.com>
	<20121116143137.Horde.7Kb3aEW5DhM6CnuA2hAx4Q8@bigworm.curecanti.org>
	<e122fa25-9110-4b36-855d-0e7e273c5805@flaska.net>
	<20121121155417.Horde.ZeW7JqTPNxTAI-hTtrAT-Q9@bigworm.curecanti.org>
	<68035860-387d-43a9-8bb7-00744a7868b9@flaska.net>
	<20121126173224.Horde.BbqbGly8D0JG4aqG7fxoMw1@bigworm.curecanti.org>
Message-ID: <441db95f-2452-4d0a-9688-ccb3df28fb3c@flaska.net>

On Tuesday, 27 November 2012 01:32:24 CEST, Michael M Slusarz wrote:
> I still maintain that writing an API that requires advanced
> knowledge  of IMAP is not that useful.  Things like QRESYNC and
> LIST-STATUS can  be entirely abstracted so a client coder does
> not need to know  anything about them to take advantage of.

Depends on who your "client coder" is. Yes, you can easily get your users a view of the mailbox and handle the whole IMAP complexity yourself, in your IMAP library -- I've done that and it works great. But it is important to realize that the user of your API is no longer an "IMAP client implementor" -- you are one, and you will have to deal with all of QRESYNC and LIST-STATUS inside your library which provides that nice IMAP-agnostic API to your users.

Clearly, a layer of abstraction is a good thing here.

> What happens when the server is upgraded and UTF-8 searching
> now  works?  The CAPABILITY string is exactly the same.  But
> UTF-8 has been  marked as a bad charset so it will still not be
> available.  And what  about those commands that have been
> determined to be broken previously  in the session?  It is
> reasonable to expect the CAPABILITY string to  be the same
> between point releases of an IMAP server, but the server  may
> have fixed the bug that was causing bad command behavior.

If I were in your situation, I would probably create a simple script which will test the functionality of your IMAP server and instruct your administrators to run it whenever they update their IMAP servers, or even make an infrastructure to execute the tests "once in a while". That way, you would not have to wait for servers to adopt RESUME.
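
For illustration only, such a check script could look roughly like the following Python sketch built on the standard imaplib module. The helper names (utf8_search_works, run_checks), the host name and the choice of a SUBJECT search are assumptions made up for this example; a real script would cover whatever features your client depends on.

```python
import imaplib

def utf8_search_works(conn):
    """Return True if a SEARCH with CHARSET UTF-8 completes with OK.

    `conn` is assumed to be a logged-in imaplib connection with a mailbox
    already selected (any object with a compatible search() method works).
    """
    try:
        typ, _data = conn.search("UTF-8", "SUBJECT", '"test"')
    except imaplib.IMAP4.error:
        # imaplib raises this when the server answers BAD.
        return False
    return typ == "OK"

def run_checks(conn, checks):
    """Run every named check against the connection and collect the results."""
    return {name: check(conn) for name, check in checks.items()}

# Hypothetical usage against a real server (not executed here):
#   conn = imaplib.IMAP4_SSL("imap.example.org")
#   conn.login("user", "password")
#   conn.select("INBOX", readonly=True)
#   print(run_checks(conn, {"utf8-search": utf8_search_works}))
```

The results could then be stored next to the server's version string, so a client knows when a previously recorded workaround needs to be re-tested.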

With regard to the situation of talking to different server versions (thus exhibiting different bugs) behind the same DNS alias -- how often do you expect that to happen? Is that a configuration you have to support, and support efficiently, i.e. not falling back to the lowest common denominator among the servers' features?

> I went ahead and setup some rough/quick benchmarking using
> current  imapproxy behavior as a proxy for the SUSPEND behavior.
>  In this  benchmark, the server and client are on the same
> machine so network  latency is assumed to be non-existent.  The
> load on this machine is  also non-existent (this test is the
> only active IMAP process; disk I/O  is negligible).
>
> Login without resuming session (connecting to a Dovecot 2.1 server)
>
> C: 1 LOGIN [login credentials]
> S: 1 OK User logged in
> C: 2 CAPABILITY
> S: * CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID
> ENABLE  IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS
> THREAD=ORDEREDSUBJECT MULTIAPPEND UNSELECT CHILDREN NAMESPACE
> UIDPLUS  LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH
> ESORT SEARCHRES  WITHIN CONTEXT=SEARCH LIST-STATUS SPECIAL-USE
> ACL RIGHTS=texk
> S: 2 OK Capability completed.
> C: 3 ENABLE QRESYNC
> S: * ENABLED QRESYNC
> S: 3 OK Enabled.
>
> Average elapsed time: 0.087 seconds
>
> Login with resuming session:
>
> C: 1 LOGIN [login credentials]
> S: * OK [XPROXYREUSE] IMAP connection reused by squirrelmail-imap_proxy
> S: 1 OK User logged in
>
> Average elapsed time: 0.039 seconds
>
> Difference: 0.048 seconds (~120% improvement)

Thanks for posting numbers. Now let's focus on possible issues with that. What are you measuring, exactly? Is that the total time your PHP script takes from the time it opens the TCP connection till parsing the last tagged OK? How much time is spent in your client's code which performs tokenization of the CAPABILITY response, i.e. one thing which you've already identified as a performance bottleneck? Can you make your CAPABILITY parsing faster? What about detailed traces showing *where* the time is really spent?

What happens when you put the imapproxy outside of the measurement setup and talk directly to your IMAP daemon?

I've done my own crude benchmark on my laptop with Dovecot 2.1.9 (Gentoo) using PAM (backed by /etc/shadow with some "pretty recent" hash) and the difference between running the following two commands:

a) time echo -en "2 LOGIN user password\r\n" | socat - TCP:localhost:imap

b) time echo -en "1 CAPABILITY\r\n2 LOGIN user password\r\n3 CAPABILITY\r\n4 ENABLE QRESYNC\r\n5 NAMESPACE\r\n" | socat - TCP:localhost:imap

...is indeed in the noise area -- for the first command which performs less work, bash's `time` builtin reports the following durations, in milliseconds:

a) [25, 82, 56, 78, 53, 74, 66, 52, 59, 53, 82, 74, 63, 61]

while for the other one, I get the following raw data:

b) [66, 62, 64, 59, 63, 63, 74, 65, 54, 37, 32, 40, 32, 53, 57, 33]

Which means that a's average is 62.7 ms with standard deviation of 14.7, while in b's case, the average runtime was 53.4 ms with standard deviation 13.5. It's a long time since my stats class, but it looks like neither Dovecot nor the actual I/O performed over TCP is the bottleneck here.
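
The averages and deviations can be re-derived from the raw data with a few lines of Python. This is just a convenience sketch; it assumes the population standard deviation (statistics.pstdev), which is what matches the rounded figures given here.

```python
import statistics

# Durations in milliseconds, copied from the two runs above.
a = [25, 82, 56, 78, 53, 74, 66, 52, 59, 53, 82, 74, 63, 61]
b = [66, 62, 64, 59, 63, 63, 74, 65, 54, 37, 32, 40, 32, 53, 57, 33]

def summarize(samples):
    """Return (mean, population standard deviation), rounded to 0.1 ms."""
    mean = statistics.mean(samples)
    deviation = statistics.pstdev(samples)
    return round(mean, 1), round(deviation, 1)

summary_a = summarize(a)
summary_b = summarize(b)
print("a:", summary_a)  # a: (62.7, 14.7)
print("b:", summary_b)  # b: (53.4, 13.5)
```

The two means differ by about 9 ms, which is less than one standard deviation of either run.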

Based on the above, I claim that when the following conditions are met:

1) one uses pipelining,
2) the client-side parser takes negligible time,

then the proposed extension will not save any measurable time.
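
To make condition 1) concrete, here is a rough Python sketch of what pipelining means on the wire: all commands go out in a single write, and the client then reads until every tag has received its completion. The function names are invented for this example, and the sketch deliberately ignores literals, TLS and error handling.

```python
import socket

def build_pipeline(commands):
    """Tag the commands 1..n and join them into one buffer for a single write."""
    tags = [str(i) for i in range(1, len(commands) + 1)]
    payload = "".join(f"{tag} {cmd}\r\n" for tag, cmd in zip(tags, commands))
    return payload.encode("ascii"), tags

def completed_tags(response_lines):
    """Return the tags of all tagged completion responses (OK, NO or BAD)."""
    done = []
    for line in response_lines:
        parts = line.split(" ", 2)
        if len(parts) < 2 or parts[0] in ("*", "+"):
            continue  # untagged data or continuation request
        if parts[1].upper() in ("OK", "NO", "BAD"):
            done.append(parts[0])
    return done

def run_pipelined(host, port, commands, timeout=5.0):
    """Send everything at once, then read until each command has completed."""
    payload, tags = build_pipeline(commands)
    lines = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        reader = sock.makefile("r", encoding="ascii", errors="replace")
        while set(completed_tags(lines)) != set(tags):
            line = reader.readline()
            if not line:
                break  # server closed the connection
            lines.append(line.rstrip("\r\n"))
    return lines

# Hypothetical usage, mirroring command b) above (not executed here):
#   run_pipelined("localhost", 143, ["CAPABILITY", "LOGIN user password",
#                                    "CAPABILITY", "ENABLE QRESYNC", "NAMESPACE"])
```

With this approach the whole sequence costs roughly one round trip plus server processing time, regardless of how many commands are in the batch.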

Now, 1) is possible -- these commands can be pipelined. What about #2? I took the liberty to add this particular benchmark to my client's test suite [2]; the parsing takes 0.13ms (not 0.13s, but 130 µs) when run on my laptop. The parser is pretty high-level C++ code using Qt's QByteArray with no optimization whatsoever aimed at reducing excess copying and whatnot. I'm sure that it can be optimized to take a fraction of that time, it's just that I cannot be bothered to optimize something which is not an issue. The actual data I'm parsing are visible in the test suite and represent a real-world output from Dovecot here.
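
Anyone can repeat this kind of measurement for their own parser. The following Python sketch is not the C++ benchmark from [2]; it is a naive stand-in which tokenizes the CAPABILITY response quoted earlier in this mail and times it with timeit. The function name and the number of runs are arbitrary choices for the example.

```python
import timeit

# The untagged CAPABILITY response quoted earlier in this mail (Dovecot 2.1),
# joined back into a single line.
CAPABILITY_LINE = (
    "* CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE "
    "SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT "
    "MULTIAPPEND UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED "
    "I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN "
    "CONTEXT=SEARCH LIST-STATUS SPECIAL-USE ACL RIGHTS=texk"
)

def parse_capability(line):
    """Split an untagged CAPABILITY response into a set of capability atoms."""
    tokens = line.split()
    if tokens[:2] != ["*", "CAPABILITY"]:
        raise ValueError("not an untagged CAPABILITY response")
    return set(tokens[2:])

caps = parse_capability(CAPABILITY_LINE)

# Average the cost over many runs; a single call is too short to time reliably.
runs = 10_000
per_call = timeit.timeit(lambda: parse_capability(CAPABILITY_LINE), number=runs) / runs
print(f"{len(caps)} capabilities, {per_call * 1e6:.2f} microseconds per parse")
```

The absolute figure depends on the machine and the language, so it should only be compared against the connection times measured on the same system.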

> Yes, but you cited to an e-mail message that said this should
> be the  case.  I hardly feel an IMAP implementer is going to
> take someone's  opinion in an email as canon.
>
> If this shows up as an errata to RFC 5161, I would tend to
> agree with  you.  But it doesn't at this point.

Errata #1365 [1], "held for document update", submitted in March 2008. I have a new draft version of RFC5162-bis in my INBOX and I'll make sure this gets in if it isn't there already.

>> Right. Well, based on how my client works, I don't expect any
>> significant performance gains obtained through this proposal.
>
> Sure - just like IDLE is completely useless for disconnected
> clients.   That doesn't make SUSPEND not very useful for at
> least some clients.

(To clarify, a client which does not keep its connection active is not usually called a "disconnected client", AFAIK. That is usually meant to identify clients which often work without the network connection, but will happily use it when it's available.)

What I'm saying here is that I suspect that your expectation of performance savings is based on your particular client's implementation details which make it inefficient when talking to current IMAP servers. As an example, let's take the CAPABILITY parsing/handling.

Your first option is to suggest a replacement which eliminates it more or less altogether. My first option is to make your CAPABILITY handling fast enough so that RESUME is not needed.

> A MOVE saves 30% performance off equivalent commands.  SUSPEND,
> at  least for a simple example, saves 120%.  (And see below re:
> NOTIFY  about something that CAN'T practically be done with
> current  disconnected clients).

I have two problems with this:

1) The motivation behind MOVE was not to save performance. In addition, your 30% quote in your mail does not come with any measurement results, doesn't clarify whether you used pipelining or not, and does not mention where that time was actually spent.

2) While I would welcome an extension of NOTIFY to notify me about events which have happened while I was offline, please note that there's nothing in your RESUME draft *and* the NOTIFY RFC actually mandating the server to remember the events since the last time. Remember, "will send updates as configured previously" is very different from "will do the same and also send updates on what has happened since that time".

>> 1) You don't take the initial CAPABILITY into account, but you
>>  re-request CAPABILITY after login. (You need the initial
>> capability  to see whether the server supports RESUME at all.)
>> This will change  the numbers quite a lot.
>
> What's the point of including benchmarking of the initial
> CAPABILITY?   Both clients need to do this, so there is no
> difference - it is no  more expensive for a SUSPEND client than
> a non-SUSPEND client.

If you are citing "improvement in speed by 30%", you have to base this 30% on something. The usual approach is to base it on the total duration.
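
As an illustration of why the base matters, take the two login timings quoted earlier in this mail (which exclude the initial CAPABILITY). The same 48 ms saving yields very different percentages depending on what it is divided by. The variable names below are made up for this sketch.

```python
# Average login timings quoted earlier in this mail, in seconds.
full_login = 0.087     # LOGIN + CAPABILITY + ENABLE QRESYNC
resumed_login = 0.039  # LOGIN answered over the reused proxy connection

saved = full_login - resumed_login

# Saving expressed as a share of the original (total) duration.
reduction_vs_total = saved / full_login * 100

# The same saving expressed relative to the faster, resumed login.
increase_vs_resumed = saved / resumed_login * 100

print(f"saved: {saved * 1000:.0f} ms")                              # saved: 48 ms
print(f"relative to the full login: {reduction_vs_total:.0f}%")     # 55%
print(f"relative to the resumed login: {increase_vs_resumed:.0f}%") # 123%
```

Any fixed cost which both variants share, such as the initial CAPABILITY, enlarges the total and therefore shrinks the first percentage further.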

> And one of the reasons that I designed the RESUME command as I
> did is  precisely to address the second part of your comment:
> the need to  potentially send CAPABILITY pre-login.  From an
> client implementer's  standpoint, it is quite likely that you
> DON'T need this CAPABILITY so  that is an additional advantage.

If you want to be strict, you need to parse CAPABILITY at least once per connection to know that you can actually send RESUME.

> Let's assume that your client program has previously connected
> to a  given IMAP server and executed a successful SUSPEND
> command.  The next  time it connects to the same IMAP server, it
> has no way of knowing  whether that server is identical
> pre-authentication.  However:
>
> 1. Since the previously connected server supports the SUSPEND
> command,  and it is very likely (although not guaranteed) that
> the server hasn't  changed in the time since the client last
> connected, it can be assumed  to a high degree of probability
> that the server supports SUSPEND.

Following that reasoning, you can easily "blindly" send ENABLE QRESYNC and LANGUAGE as well. Please be consistent -- either you allow that for none of (RESUME, ENABLE ..., LANGUAGE), or you allow that for all these.

[...]

> So a client supporting RESUME will likely save ANOTHER entire
> round-trip, so the 100%+ gain listed above is again shown to be
> a  conservative estimate.

If you're blindly sending RESUME, you can do the same with any other command shown so far (and including AUTHENTICATE). The risks are always the same (except what, maybe a kilobyte of wasted bandwidth in situations where "something has changed"? Who cares?)

>> 3) Saving 600 bytes of transmitted data per connection is
>> noise  compared to what an actual session typically transfers.
>
> A 50-100ms reduction in connection time is not noise.

We have not established yet that these 50-100ms cannot be addressed by improving your client's code.

> Mailbox listing is a very touchy spot for disconnected clients.
>   Historically, a disconnected client is pretty much stuck with
> listing  the mailboxes once with the understanding that if
> another client  changes the mailbox structure there's not much
> we can do about it  without allowing a user to manually refresh
> the mailbox list (or  possibly doing something like polling the
> mailbox list at a given time  interval).
>
> However, as Timo noted, SUSPEND potentially allows disconnected
>  clients to take advantage of NOTIFY.

I've mentioned it above -- there's nothing in the provided draft which makes it possible to use NOTIFY across sessions. Yep, I agree that it would be cool if it was possible *and* if someone actually implemented NOTIFY. But that's orthogonal to SUSPEND/RESUME.

> Additionally, this behavior makes SUSPEND useful for connected
> clients  if such client locally caches mailbox lists: a desktop
> client that  opens a second or two faster due to the fact that
> LIST's don't need to  be sent is a substantial UI improvement.
>
> In other words, SUSPEND brings real-world performance
> improvements and  provides multiple features that are not
> possible with current IMAP  protocol/extensions.

You've lost me here -- surely this benefit depends on a yet unwritten extension to NOTIFY which makes it work across sessions, right?

With kind regards,
Jan

[1] http://www.rfc-editor.org/errata_search.php?eid=1365
[2] http://commits.kde.org/trojita/92ae247ab69121fc3e8c886fe8c0e2da3e1740f7