From: tseaver at palladion.com (Tres Seaver)
Date: Wed, 07 Apr 1999 08:32:51 -0500
Subject: GadFly - MemoryError
References: <Pine.SOL2.3.96.SK.990403182611.18369K-100000@sun.med.ru>
Message-ID: <370B5E83.C796BFF3@palladion.com>
Content-Length: 1105
X-UID: 231

Oleg Broytmann wrote:
>
> Hello!
>
>    I tried to add yet another database backend to my project "Bookmarks
> database". My database now contains about 3000 URLs, not too many, I think.
> I subclassed my BookmarksParser to parse bookmarks.html into a gadfly database
> and got a database of 500 Kbytes - a very small database, I hope.
>    Then I tried to find duplicates (there are duplicates). I ran the query:
>
> SELECT b1.rec_no, b2.rec_no, b1.URL
>    FROM bookmarks b1, bookmarks b2
> WHERE b1.URL = b2.URL
> AND   b1.rec_no < b2.rec_no

How many duplicates are there?  Something like

  SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1

will produce the URLs that have duplicates; you could then do

  SELECT rec_no, URL FROM bookmarks
    WHERE URL IN
      (SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1)

or create a temp table first with the results of the subquery, then join it in a
separate query.
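
As a rough sketch of both approaches, here they are run against a tiny
made-up table. Note the assumptions: this uses Python's sqlite3 module as a
stand-in for gadfly (I have not checked which of these statements gadfly
accepts as written), and the sample rows and the temp table name
dup_url_tmp are hypothetical.

```python
import sqlite3

# ASSUMPTION: sqlite3 stands in for gadfly; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookmarks (rec_no INTEGER, URL TEXT)")
conn.executemany(
    "INSERT INTO bookmarks (rec_no, URL) VALUES (?, ?)",
    [
        (1, "http://www.python.org"),
        (2, "http://www.palladion.com"),
        (3, "http://www.python.org"),
        (4, "http://www.example.com"),
        (5, "http://www.python.org"),
    ],
)

# Step 1: the URLs that occur more than once.
dup_urls = [
    row[0]
    for row in conn.execute(
        "SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1"
    )
]

# Step 2, variant A: every record carrying one of those URLs, via a subquery.
dup_rows = conn.execute(
    "SELECT rec_no, URL FROM bookmarks"
    " WHERE URL IN"
    " (SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1)"
    " ORDER BY rec_no"
).fetchall()

# Step 2, variant B: store the duplicated URLs in a temp table (hypothetical
# name dup_url_tmp), then join it in a separate query.
conn.execute(
    "CREATE TEMP TABLE dup_url_tmp AS"
    " SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1"
)
dup_rows_joined = conn.execute(
    "SELECT b.rec_no, b.URL FROM bookmarks b, dup_url_tmp d"
    " WHERE b.URL = d.URL ORDER BY b.rec_no"
).fetchall()

print(dup_urls)
print(dup_rows)
print(dup_rows_joined)
```

Both variants should return the same records. The point of the temp table
is that the join then runs against only the handful of duplicated URLs
rather than pairing the whole table with itself.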
--
=========================================================
Tres Seaver         tseaver at palladion.com    713-523-6582
Palladion Software  http://www.palladion.com