MBOX-Line: From tss at iki.fi Thu Dec 21 15:35:38 2006
To: imap-protocol@u.washington.edu
From: Timo Sirainen <tss@iki.fi>
Date: Fri Jun 8 12:34:38 2018
Subject: [Imap-protocol] Searching
In-Reply-To: <alpine.WNT.0.81.0612201412380.3856@Shimo-Tomobiki.panda.com>
References: <1166603479.22214.298.camel@hurina>
	<alpine.OSX.0.81.0612201018450.10225@pangtzu.panda.com>
	<1166643258.22214.350.camel@hurina>
	<alpine.WNT.0.81.0612201412380.3856@Shimo-Tomobiki.panda.com>
Message-ID: <1166744138.22214.523.camel@hurina>

On Wed, 2006-12-20 at 14:19 -0800, Mark Crispin wrote:
> On Wed, 20 Dec 2006, Timo Sirainen wrote:
> > Optimizing the string search would help some, but for large mailboxes
> > it's still a bit too slow. People want instant search results
> > nowadays. :)
>
> Please define what you mean by "large" and "instant".

Some people would want to see the results as they keep typing the search
keyword. For that kind of a user interface the search can't really take
much longer than 0.1 seconds or it'll look slow.

> It took some effort for me to construct a mailbox that was pathologically
> large enough for a search in UW imapd to take a whole 2 seconds.

But I guess that's for a mailbox that's already in the file cache? I'd
think that on a real mail server most users' mailboxes need to be read
from the disk as they're searched, and on a loaded server that can be
even slower. I've also heard of users whose INBOX is over 2 gigabytes.

> > Perhaps. I think it depends on how badly mail admins want it. If it's
> > only a small s/BODY/X-NONEXACT-BODY/ replace for their webmail code,
> > it'll get usage at least within Dovecot community.
>
> By the way, are you doing charset and i18n case-mapping in your
> "non-exact" search? That, and not the searching, is what takes time.

The "non-exact" name just means that the text/body searching can use
different search string matching rules than what the IMAP RFC defines.
I couldn't think of a good name for it. Maybe X-NONRFC-TEXT/BODY :)

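To make the s/BODY/X-NONEXACT-BODY/ idea concrete, here is a minimal
sketch of what the change could look like in webmail code. It is only an
illustration: the function names and the nonexact_supported flag are
hypothetical, X-NONEXACT-BODY is just the tentative key name from this
thread, and how a client would find out that the server supports it is
left open.

```python
def imap_quote(value: str) -> str:
    """Return value as an IMAP quoted string.

    Quoted strings cannot carry 8-bit data, CR, LF or NUL; a real client
    would send an IMAP literal for those. This sketch just rejects them.
    """
    if not value.isascii() or any(c in value for c in "\r\n\0"):
        raise ValueError("value needs an IMAP literal, not a quoted string")
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def body_search_criteria(text: str, nonexact_supported: bool) -> str:
    """Build the search criteria, swapping only the search key name."""
    # Hypothetical key from this thread; fall back to the standard BODY key.
    key = "X-NONEXACT-BODY" if nonexact_supported else "BODY"
    return f"{key} {imap_quote(text)}"
```

With this, body_search_criteria("hello", False) gives the standard
BODY "hello" criteria, and passing True changes nothing but the key.
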
But for a standard search, yes, I'm converting mails to UTF-8 before
doing any searching. I should also add support for case-insensitive
UTF-8 searches, but for now I'm doing it only for ASCII. No one's
complained yet, though :)

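As a rough illustration of the approach described above (not Dovecot's
actual code), the sketch below converts a body to UTF-8 and then does a
substring match that ignores case for ASCII letters only. The function
names and the Latin-1 fallback for unknown charsets are assumptions made
for the example.

```python
def to_utf8(raw: bytes, charset: str) -> bytes:
    """Decode raw bytes from the declared charset and re-encode as UTF-8."""
    try:
        text = raw.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset label (assumed fallback): Latin-1 decodes any byte.
        text = raw.decode("latin-1")
    return text.encode("utf-8")


# Translation table mapping only ASCII A-Z to a-z; all other bytes,
# including the bytes of multi-byte UTF-8 sequences, are left untouched.
_ASCII_LOWER = bytes(b + 32 if 65 <= b <= 90 else b for b in range(256))


def ascii_fold(data: bytes) -> bytes:
    """Lowercase ASCII letters without touching non-ASCII characters."""
    return data.translate(_ASCII_LOWER)


def body_matches(raw: bytes, charset: str, key: str) -> bool:
    """Substring search on UTF-8 data, case-insensitive for ASCII only."""
    haystack = ascii_fold(to_utf8(raw, charset))
    needle = ascii_fold(key.encode("utf-8"))
    return needle in haystack
```

With an ISO-8859-1 body containing "Hello Wörld", the key "wörld"
matches, but "WÖRLD" does not, because "Ö" is not folded to "ö". That is
the ASCII-only limitation mentioned above.
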
Anyway, yes, I could probably make my standard search code a lot faster
(UW-IMAP searches mboxes 2-3 times faster), but that won't help with
disk I/O usage. Usually there's enough CPU to go around, but not that
much available disk I/O. Indexing helps a lot with that. So it's not
just for bringing down search times from a few seconds to zero, but also
for lowering the system load in general.

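To show why indexing takes load off the disk, here is a minimal sketch
of a word-level inverted index. It is an assumption-laden toy, not a
description of any real server's index: the BodyIndex class and its
methods are hypothetical, and it matches whole words rather than the
substrings that a standard BODY search requires, which is the kind of
difference a "non-exact" search key would permit.

```python
import re
from collections import defaultdict
from typing import List

_WORD = re.compile(r"\w+")


class BodyIndex:
    """Toy inverted index mapping each word to the UIDs that contain it."""

    def __init__(self) -> None:
        self._postings = defaultdict(set)  # word -> set of message UIDs

    def add(self, uid: int, text: str) -> None:
        """Index a message once, when it is delivered or first seen."""
        for word in _WORD.findall(text.lower()):
            self._postings[word].add(uid)

    def search(self, query: str) -> List[int]:
        """Return UIDs containing every query word, without reading mail."""
        words = _WORD.findall(query.lower())
        if not words:
            return []
        result = set(self._postings.get(words[0], ()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return sorted(result)
```

The message bodies are read only once, in add(). A later search touches
only the index, so its cost no longer depends on reading the whole
mailbox from disk.
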
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 196 bytes
Desc: This is a digitally signed message part
URL: <http://mailman13.u.washington.edu/pipermail/imap-protocol/attachments/20061222/07ac7c51/attachment.sig>