MBOX-Line: From tss at iki.fi  Wed May 10 01:39:50 2006
To: imap-protocol@u.washington.edu
From: Timo Sirainen <tss@iki.fi>
Date: Fri Jun  8 12:34:37 2018
Subject: [Imap-protocol] SEARCH
In-Reply-To: <Pine.OSX.4.64.0604141039280.15258@pangtzu.panda.com>
References: <1145029156.10727.81.camel@hurina>
	<Pine.OSX.4.64.0604141039280.15258@pangtzu.panda.com>
Message-ID: <1147250390.17524.39.camel@localhost.localdomain>

Sorry for the late reply, I just seem to get more and more behind
with replying to emails nowadays.

On Fri, 2006-04-14 at 11:30 -0700, Mark Crispin wrote:
> > If the search key is invalid for the given character set, should server
> > return BAD error to client? Are non-ASCII characters in search key
> > invalid for US-ASCII charset?
>
> I'm not certain what you mean by "invalid".
>
> Do you mean "contain a codepoint that is not in that charset"?  If so, I
> think a failed match is better than a BAD, since it may be that the server
> has an obsolete version of that charset's definition.

I think most character sets don't change anymore (ASCII and ISO-8859-*
especially), but I guess it's nicer for clients to not get BAD replies.
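
To illustrate the "failed match instead of BAD" behaviour discussed above,
here is a minimal sketch. The function name and the simplified
case-insensitive substring match are assumptions made for the example, not
code from any actual server. A key that cannot be decoded in the given
charset simply matches nothing.

```python
def search_key_matches(key_bytes: bytes, charset: str, message_text: str) -> bool:
    """Hypothetical helper: does the search key occur in the message text?

    If the key contains octets that are not valid in `charset`, report a
    failed match rather than raising an error (the "no BAD" approach).
    """
    try:
        key = key_bytes.decode(charset)
    except UnicodeDecodeError:
        # Key is invalid for this charset: treat as "no match".
        return False
    # Simplified case-insensitive substring comparison.
    return key.casefold() in message_text.casefold()
```

With this sketch an 8-bit key searched as US-ASCII never matches, while the
same octets searched as ISO-8859-1 are decoded and compared normally. An
unknown charset name is deliberately not handled here.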

> > What about if search key contains non-ASCII characters but no charset
> > parameter is given? Currently I assume this means just doing a substring
> > search from messages without doing any charset conversions (i;octet
> > comparator).
>
> It can mean whatever you want, although perhaps a failed match is best.
> Or maybe a BAD in this case, because the specification does denounce use
> of 8-bit strings without a charset identification in section 4.3.1

I understood that section to apply only to message bodies sent in reply
to FETCH.

> > More interesting are MIME footer and trailer sections. Should they be
> > searched? UW-IMAP skips them.
>
> I consider these not to be part of a message at all for any MIME-savvy
> application.

OK, this is mostly what I was concerned about. The RFC doesn't say
anything about whether they should be searched.

> > What about MIME boundary lines? UW-IMAP
> > searches these, but not if you include its "--" prefix in search key.
>
> Are you certain that you aren't confusing BODY and TEXT searches?  A TEXT
> search would find them, because they appear in the MIME header.

Right, sorry, that must be it.

> > Is "Header: value" searching required to work? I think it is, and works
> > with UW-IMAP.
>
> What do you mean by this?  If you're talking about a TEXT search, then it
> may or may not work depending upon the octets in a message.  You should be
> using a "HEADER Header: value" search instead.

Yes, I mean TEXT search. I know HEADER is the correct way, but since I
was going to fix my SEARCH code, I thought I'd make it work correctly in
all cases (if there were correct ways for cases like this).

> > Is "line\r\nline2" (as literal of course with real CR+LF)
> > searching required to work in message body? Again, I think so and works
> > with UW-IMAP.
>
> Yes, it should in a TEXT search.  But see below.
>
> > But then is "Header: value\r\nHeader2: value2" searching
> > required to work? I don't see why not, but this doesn't work anymore
> > with UW-IMAP.
>
> Once again, I'd like to understand what you mean by this.
>
> If you're talking about a TEXT search, I don't see why it shouldn't work,
> although it might be that you have a mailbox format that uses UNIX-style
> newlines and the data was not CRLF-converted.

Yes, TEXT search. If it's supposed (required) to work, then I think it
shouldn't matter whether the data is stored in LF or CRLF format in the
mailbox, because the client always sees the messages CRLF-terminated.
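
One way to get that behaviour, sketched under the assumption that the
server normalizes line endings before matching (the helper names below are
made up for the example):

```python
import re

def to_crlf(data: bytes) -> bytes:
    """Convert bare LF line endings to CRLF, leaving existing CRLF alone."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)

def text_search(stored: bytes, key: bytes) -> bool:
    """Hypothetical TEXT search over the message as the client would see it.

    The stored message may use LF or CRLF; the key is expected to use CRLF.
    bytes.lower() only folds ASCII letters, which is enough for this sketch.
    """
    return key.lower() in to_crlf(stored).lower()
```

A message stored with UNIX-style newlines then matches a key such as
"Header: value\r\nHeader2: value2" the same way a CRLF-stored copy would.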

> I don't think that it is useful for a client to have newlines in a search
> key.  Some servers try to do fuzzy matching, so for example if you search
> for "Joe's trip to Paris" there will be a match even if it was broken by a
> newline.

And do you think this is still allowed by the RFC?

I was thinking about allowing some text search engines to be used with
my server, but I thought about creating a new extension for it, since
I thought using them with SEARCH would break the RFC (because, e.g., they
couldn't find "imo" in the string "timo", and in general the matches
wouldn't be exact).
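
The "imo" vs. "timo" difference can be shown with a small sketch. The
word-based matcher below is only an assumed stand-in for how a typical
full-text index behaves, not a description of any particular engine.

```python
import re

def substring_match(text: str, key: str) -> bool:
    """Exact substring semantics: the key may occur anywhere in the text."""
    return key.lower() in text.lower()

def token_match(text: str, key: str) -> bool:
    """Word-index semantics (assumed): the key must equal a whole word."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return key.lower() in tokens
```

Here substring_match("From: timo", "imo") succeeds, while
token_match("From: timo", "imo") fails because only the whole word "timo"
is indexed.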