MBOX-Line: From andris.reinman at gmail.com Wed Sep 9 13:39:28 2020
To: imap-protocol@u.washington.edu
From: Andris Reinman <andris.reinman@gmail.com>
Date: Wed Sep 9 13:39:52 2020
Subject: [Imap-protocol] Any valid use case for COPY besides moving messages?
In-Reply-To: <49C6D280-1110-4545-B7DE-303C98F98A71@gmail.com>
References: <CAPacwgy_1WJd5TLRDbykTnzfwv9hgcTLzBQZte5=bMRK-RUFLQ@mail.gmail.com>
 <49C6D280-1110-4545-B7DE-303C98F98A71@gmail.com>
Message-ID: <CAPacwgz-aLLPmM_5ubesXRrqjsCvkMtYLvmMV7nM2yAs_yWJcA@mail.gmail.com>

> What does the db data model look like? Is the email data in the database,
> are there pointers?
>

It is a sharded MongoDB cluster. Messages are parsed into structured
documents; attachments are decoded, deduplicated and stored separately.
All email documents are stored in the same large collection (currently at
250M documents) and include a "mailbox id" value that is used to select
the messages that belong to a given IMAP mailbox.

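To illustrate the layout described above, here is a minimal sketch in plain
Python, with no MongoDB driver involved. The field names (`mailbox`, `uid`,
`modseq`, `attachments`) and the helper `messages_in_mailbox` are hypothetical
and are not taken from the actual WildDuck schema; a list of dicts stands in
for the collection.

```python
# Hypothetical shape of the message documents. Field names are assumptions,
# not the real schema. Attachments are referenced by id because they are
# stored separately and deduplicated (two documents share "att-1").
collection = [
    {"_id": 1, "mailbox": "inbox-a", "uid": 1, "modseq": 10,
     "subject": "hello", "attachments": ["att-1"]},
    {"_id": 2, "mailbox": "inbox-a", "uid": 2, "modseq": 11,
     "subject": "report", "attachments": []},
    {"_id": 3, "mailbox": "archive-b", "uid": 1, "modseq": 4,
     "subject": "old", "attachments": ["att-1"]},
]


def messages_in_mailbox(docs, mailbox_id):
    """Select the documents that belong to one mailbox, ordered by UID."""
    matching = [d for d in docs if d["mailbox"] == mailbox_id]
    return sorted(matching, key=lambda d: d["uid"])


inbox = messages_in_mailbox(collection, "inbox-a")
print([d["uid"] for d in inbox])
```

In a real deployment this selection would presumably be a database query on
the mailbox id rather than a scan in application code.
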
Around 30M FETCH commands are run against that collection every day, so it
is quite busy as well.

Copying messages means:
1. getting a cursor for the result set of matching messages
2. processing the cursor one message at a time
3. reading the entire message entry from the DB into the application
   (without attachments, as these are stored separately)
4. modifying the "folder id", UID, MODSEQ etc. values of the record in
   memory
5. inserting the record into the collection as a new document

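The five steps above can be sketched roughly as follows. This is an in-memory
illustration only: the list `collection`, the `mailboxes` counters and the
function `copy_messages` are hypothetical names, and the way UID and MODSEQ
values are allocated here is an assumption, not the server's actual logic.

```python
import copy
import itertools

# In-memory stand-ins for the shared collection and per-mailbox counters.
# All field and function names here are hypothetical.
collection = [
    {"_id": 1, "mailbox": "src", "uid": 1, "modseq": 5, "subject": "a"},
    {"_id": 2, "mailbox": "src", "uid": 2, "modseq": 6, "subject": "b"},
]
mailboxes = {
    "src": {"uid_next": 3, "modseq": 6},
    "dst": {"uid_next": 1, "modseq": 0},
}
next_id = itertools.count(3)


def copy_messages(source, target):
    """Copy every message in `source` to `target`, one document at a time."""
    # Step 1: a "cursor" over the matching messages (a snapshot ordered by UID).
    cursor = sorted(
        (d for d in collection if d["mailbox"] == source),
        key=lambda d: d["uid"],
    )
    copied = []
    for doc in cursor:               # step 2: one message at a time
        record = copy.deepcopy(doc)  # step 3: read the whole entry
        box = mailboxes[target]
        # Step 4: rewrite the mailbox-specific values in memory.
        record["_id"] = next(next_id)
        record["mailbox"] = target
        record["uid"] = box["uid_next"]
        box["uid_next"] += 1
        box["modseq"] += 1
        record["modseq"] = box["modseq"]
        collection.append(record)    # step 5: insert as a new document
        copied.append((doc["uid"], record["uid"]))
    return copied


result = copy_messages("src", "dst")
print(result)
```

Each message costs one full read and one insert, so the total work grows with
the number of messages in the source mailbox.
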
Copying each message also triggers notifications about added messages to
IMAP clients, MODSEQ updates for the mailbox etc.
So this is not very fast and may take a lot of time (in computer
terms).

At first I wanted to use a different approach, where there would be just a
single email document containing a list of the mailbox ids where the
message is currently stored (along with its UID/MODSEQ values in each of
these mailboxes). I was not able to figure out proper database indexing for
that, so I went with the simpler approach where every email document
is dedicated to a specific mailbox.

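For comparison, the single-document approach described above might look
something like this. It is only a sketch under assumptions: the `locations`
list, its field names and the helper `copy_to_mailbox` are hypothetical, and
no database is involved.

```python
# Hypothetical single-document layout: one document per email, holding a
# list of per-mailbox entries. Field names are assumptions for illustration.
message = {
    "_id": 1,
    "subject": "hello",
    "locations": [
        {"mailbox": "src", "uid": 7, "modseq": 12},
    ],
}
mailboxes = {
    "src": {"uid_next": 8, "modseq": 12},
    "dst": {"uid_next": 1, "modseq": 0},
}


def copy_to_mailbox(doc, target):
    """'Copy' a message by adding one more mailbox entry to the same document."""
    box = mailboxes[target]
    box["modseq"] += 1
    entry = {"mailbox": target, "uid": box["uid_next"], "modseq": box["modseq"]}
    box["uid_next"] += 1
    doc["locations"].append(entry)
    return entry


new_entry = copy_to_mailbox(message, "dst")
print(new_entry)
print([loc["mailbox"] for loc in message["locations"]])
```

In this layout a copy does not duplicate the message document at all. The
hard part is the query side: listing one mailbox ordered by UID means indexing
entries inside the list, which this sketch does not attempt to model.
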
Regards,
Andris Reinman

Aaron Burrow (<burrows.labs@gmail.com>) wrote on Wed, 9 September 2020
at 22:45:

>
> On Sep 9, 2020, at 3:28 AM, Andris Reinman <andris.reinman@gmail.com>
> wrote:
>
> Hi,
>
> As the subject states, is there actually any valid use case these days for
> COPY to just copy messages, instead of being a poor substitute for MOVE
> (that is, COPY+EXPUNGE)?
>
> If an IMAP server marked COPYed messages with \Deleted and expunged them
> immediately after each message has been copied, would it break any
> real-world expectations?
>
> The reason I'm asking is that I'm building a database-backed email server
> (https://wildduck.email). We have a moderately sized cluster of emails
> (100k+ users, ~50TB+ of data, a few hundred million emails), and when an
> IMAP client tries to copy all messages from one large folder to another,
> copying takes a lot of time (e.g. 'COPY 1:* target' where * is 10 000), as
> listing the database entries and copying these around takes time. And as
> there is no response until the messages have been fully copied, the client
> might think that the TCP connection has been lost and retry the same
> action, ending up doing multiple COPY calls.
>
> So I was wondering if we could simply delete the already copied message
> from the source folder, as most probably the client would do it anyway once
> COPY is fully completed. Basically COPY would be an alias for MOVE.
> This is obviously non-standard behavior, but would we actually break
> something client side by doing this?
>
> Regards,
> Andris Reinman
> https://wildduck.email
>
> _______________________________________________
> Imap-protocol mailing list
> Imap-protocol@u.washington.edu
> http://mailman13.u.washington.edu/mailman/listinfo/imap-protocol
>