MBOX-Line: From tss at iki.fi Wed May 10 01:39:50 2006
To: imap-protocol@u.washington.edu
From: Timo Sirainen <tss@iki.fi>
Date: Fri Jun 8 12:34:37 2018
Subject: [Imap-protocol] SEARCH
In-Reply-To: <Pine.OSX.4.64.0604141039280.15258@pangtzu.panda.com>
References: <1145029156.10727.81.camel@hurina>
	<Pine.OSX.4.64.0604141039280.15258@pangtzu.panda.com>
Message-ID: <1147250390.17524.39.camel@localhost.localdomain>

Sorry for the late reply, I just seem to be getting more and more behind
with replying to emails.

On Fri, 2006-04-14 at 11:30 -0700, Mark Crispin wrote:
> > If the search key is invalid for the given character set, should server
> > return BAD error to client? Are non-ASCII characters in search key
> > invalid for US-ASCII charset?
>
> I'm not certain what you mean by "invalid".
>
> Do you mean "contain a codepoint that is not in that charset"? If so, I
> think a failed match is better than a BAD, since it may be that the server
> has an obsolete version of that charset's definition.

I think most character sets don't change anymore (ASCII and ISO-8859-*
especially), but I guess it's nicer for clients to not get BAD replies.

> > What about if search key contains non-ASCII characters but no charset
> > parameter is given? Currently I assume this means just doing a substring
> > search from messages without doing any charset conversions (i;octet
> > comparator).
>
> It can mean whatever you want, although perhaps a failed match is best.
> Or maybe a BAD in this case, because the specification does denounce use
> of 8-bit strings without a charset identification in section 4.3.1

I understood that section to apply only to message bodies sent in reply
to FETCH.
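
To make the two charset cases above concrete, here is a minimal sketch in
Python. The function name and its parameters are hypothetical and this is
not taken from any real server. It only illustrates returning a failed
match (rather than BAD) when the key cannot be decoded in the given
charset, and comparing raw octets without conversion when no charset is
given at all.

```python
from typing import Optional


def key_matches(message_text: str, raw_message: bytes,
                key: bytes, charset: Optional[str]) -> bool:
    """Hypothetical helper: does one message match a SEARCH string key?

    message_text is the message already decoded to Unicode; raw_message
    is the same message as stored octets. Both are assumptions made for
    this illustration only.
    """
    if charset is None:
        # No CHARSET argument: plain octet substring comparison with no
        # charset conversion (the "i;octet"-style behaviour).
        return key in raw_message
    try:
        decoded_key = key.decode(charset)
    except UnicodeDecodeError:
        # The key contains octets that are not valid in this charset:
        # report "no match" instead of failing the command with BAD.
        return False
    # Case-insensitive substring match on the decoded text.
    return decoded_key.casefold() in message_text.casefold()
```

An unknown charset name is deliberately not handled here: bytes.decode()
raises LookupError for it, and a server would have to answer that case
separately.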

> > More interesting are MIME footer and trailer sections. Should they be
> > searched? UW-IMAP skips them.
>
> I consider these not to be part of a message at all for any MIME-savvy
> application.

OK, this is mostly what I was concerned about. The RFC doesn't say
anything about whether they should or shouldn't be searched.

> > What about MIME boundary lines? UW-IMAP
> > searches these, but not if you include its "--" prefix in search key.
>
> Are you certain that you aren't confusing BODY and TEXT searches? A TEXT
> search would find them, because they appear in the MIME header.

Right, sorry, that must be it.

> > Is "Header: value" searching required to work? I think it is, and works
> > with UW-IMAP.
>
> What do you mean by this? If you're talking about a TEXT search, then it
> may or may not work depending upon the octets in a message. You should be
> using a "HEADER Header: value" search instead.

Yes, I mean a TEXT search. I know HEADER is the correct way, but since I
was going to fix my SEARCH code, I thought I'd make it work correctly in
all cases (if there is a correct behavior for cases like this).

> > Is "line\r\nline2" (as literal of course with real CR+LF)
> > searching required to work in message body? Again, I think so and works
> > with UW-IMAP.
>
> Yes, it should in a TEXT search. But see below.
>
> > But then is "Header: value\r\nHeader2: value2" searching
> > required to work? I don't see why not, but this doesn't work anymore
> > with UW-IMAP.
>
> Once again, I'd like to understand what you mean by this.
>
> If you're talking about a TEXT search, I don't see why it shouldn't work,
> although it might be that you have a mailbox format that uses UNIX-style
> newlines and the data was not CRLF-converted.

Yes, a TEXT search. If it's supposed (required) to work, then I think it
shouldn't matter whether the data is in LF or CRLF format in the mailbox,
because the client always sees the mails CRLF-terminated.
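
A small Python sketch of that idea, assuming a mailbox that stores
messages with bare LF newlines. The helper names are made up for this
illustration. The stored data is converted to CRLF before matching, so a
key containing a real CR+LF matches regardless of the on-disk format.

```python
import re


def to_crlf(data: bytes) -> bytes:
    """Convert bare LF line endings to CRLF, leaving existing CRLF alone."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


def text_search(stored_message: bytes, key: bytes) -> bool:
    """Hypothetical TEXT-style substring match on the CRLF view.

    The match is done against the message as the client would see it,
    not against the raw on-disk representation. bytes.lower() only
    folds ASCII letters, which is enough for this illustration.
    """
    return key.lower() in to_crlf(stored_message).lower()


# A message stored with UNIX-style newlines.
stored = b"Header: value\nHeader2: value2\n\nline\nline2\n"

# Keys sent by the client as literals with real CR+LF.
header_key = b"Header: value\r\nHeader2: value2"
body_key = b"line\r\nline2"

found_headers = text_search(stored, header_key)
found_body = text_search(stored, body_key)
```

Converting the whole message for every search is wasteful, so a real
server would more likely convert while streaming. The point here is only
that the comparison is made on the CRLF form.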

> I don't think that it is useful for a client to have newlines in a search
> key. Some servers try to do fuzzy matching, so for example if you search
> for "Joe's trip to Paris" there will be a match even if it was broken by a
> newline.

And do you think this is still allowed by the RFC?

I was thinking about allowing some text search engines to be used with
my server, but I was considering creating a new extension for that, since
I thought using them with SEARCH would break the RFC (because e.g. they
couldn't find "imo" in the string "timo", and in general the matches
wouldn't be exact).
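
To illustrate the difference, here is a toy Python comparison of exact
substring matching against a word-based index. All names are hypothetical
and the index is far simpler than a real text search engine. It is only
meant to show why the two can return different results for the same key.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def words_of(text: str) -> List[str]:
    """Split text into lowercase word tokens (toy tokenizer)."""
    return re.findall(r"\w+", text.lower())


def build_word_index(messages: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each word to the set of message ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for uid, text in messages.items():
        for word in words_of(text):
            index[word].add(uid)
    return index


def index_search(index: Dict[str, Set[int]], key: str) -> Set[int]:
    """Messages containing every word of the key, as whole words only."""
    words = words_of(key)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def substring_search(messages: Dict[int, str], key: str) -> Set[int]:
    """Exact, case-insensitive substring match."""
    return {uid for uid, text in messages.items()
            if key.lower() in text.lower()}


messages = {1: "mail from timo", 2: "Joe's trip to\nParis"}
index = build_word_index(messages)
```

With this data, substring_search(messages, "imo") finds message 1 while
index_search(index, "imo") finds nothing. In the other direction,
index_search(index, "trip to Paris") finds message 2 even though the
phrase is broken by a newline, which the exact substring match misses.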
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 196 bytes
Desc: This is a digitally signed message part
URL: <http://mailman13.u.washington.edu/pipermail/imap-protocol/attachments/20060510/e22be140/attachment.sig>