From: aaron_watters at my-dejanews.com (aaron_watters at my-dejanews.com)
Date: Thu, 08 Apr 1999 15:15:30 GMT
Subject: GadFly - MemoryError
References: <370B5E83.C796BFF3@palladion.com>
Message-ID: <7eih6b$ck2$1@nnrp1.dejanews.com>

It would be interesting to try this, but my guess is that it
would be slower than the first query, since equalities are optimized
and "in" is not. I hope Oleg is using gadfly 1.0 too (not beta 0.2
or whatever). -- Aaron Watters

In article <370B5E83.C796BFF3 at palladion.com>,
  Tres Seaver wrote:
> Oleg Broytmann wrote:
> >
> > Hello!
> >
> > I tried to add yet another database backend to my project "Bookmarks
> > database". My database now contains about 3000 URLs, not too many, I think.
> > I subclassed my BookmarksParser to parse bookmarks.html into a gadfly database
> > and got a database of 500 Kbytes - a very small database, I hope.
> > Then I tried to find duplicates (there are duplicates). I ran the query:
> >
> > SELECT b1.rec_no, b2.rec_no, b1.URL
> >   FROM bookmarks b1, bookmarks b2
> >   WHERE b1.URL = b2.URL
> >   AND b1.rec_no < b2.rec_no
>
> How many duplicates are there? Something like
>
>   SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1
>
> will produce the URLs that have duplicates; you could then do
>
>   SELECT rec_no, URL FROM bookmarks
>     WHERE URL IN
>       (SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1)
>
> or create a temp table first with the results of the subquery, then join it in a
> separate query.
> --
> =========================================================
> Tres Seaver         tseaver at palladion.com       713-523-6582
> Palladion Software  http://www.palladion.com
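
For anyone who wants to compare the duplicate-finding queries from this thread side by side, here is a minimal sketch. It is an assumption-laden stand-in, not gadfly itself: it uses Python's standard sqlite3 module with an in-memory database, and the five sample rows and the temp table name dup_url_tmp are made up for illustration. It says nothing about gadfly's memory use or speed; it only shows what each query returns on the same data.

```python
import sqlite3

# NOTE: sqlite3 is only a stand-in engine here, NOT gadfly.
# The table layout follows the thread; the sample rows are invented.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE bookmarks (rec_no INTEGER, URL TEXT)")
cur.executemany(
    "INSERT INTO bookmarks VALUES (?, ?)",
    [
        (1, "http://example.com/a"),
        (2, "http://example.com/b"),
        (3, "http://example.com/a"),
        (4, "http://example.com/c"),
        (5, "http://example.com/a"),
    ],
)

# 1. The original self-join: one row per PAIR of duplicate records.
cur.execute("""
    SELECT b1.rec_no, b2.rec_no, b1.URL
      FROM bookmarks b1, bookmarks b2
     WHERE b1.URL = b2.URL
       AND b1.rec_no < b2.rec_no
     ORDER BY b1.rec_no, b2.rec_no
""")
pairs = cur.fetchall()

# 2. The grouped query: one row per URL that occurs more than once.
cur.execute("SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1")
dup_urls = cur.fetchall()

# 3. The IN-subquery form: every record whose URL is duplicated.
cur.execute("""
    SELECT rec_no, URL FROM bookmarks
     WHERE URL IN
           (SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1)
     ORDER BY rec_no
""")
dup_rows = cur.fetchall()

# 4. The temp-table variant: store the subquery result, then join on
#    equality in a separate query (dup_url_tmp is a hypothetical name).
cur.execute("""
    CREATE TEMP TABLE dup_url_tmp AS
    SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1
""")
cur.execute("""
    SELECT b.rec_no, b.URL
      FROM bookmarks b, dup_url_tmp d
     WHERE b.URL = d.URL
     ORDER BY b.rec_no
""")
dup_rows_via_temp = cur.fetchall()

print(pairs)
print(dup_urls)
print(dup_rows)
print(dup_rows_via_temp)
```

Note that the self-join reports pairs, so a URL stored three times yields three rows, while the grouped forms report each affected record once. The temp-table variant ends in a plain equality join, which is the kind of condition described above as optimized.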