From: aaron_watters at my-dejanews.com (aaron_watters at my-dejanews.com)
Date: Mon, 12 Apr 1999 12:42:05 GMT
Subject: GadFly - MemoryError
References: <7eih6b$ck2$1@nnrp1.dejanews.com> <Pine.SOL2.3.96.SK.990408202514.3496S-100000@sun.med.ru>
Message-ID: <7espms$idh$1@nnrp1.dejanews.com>
Content-Length: 2202
X-UID: 1233

In article <Pine.SOL2.3.96.SK.990408202514.3496S-100000 at sun.med.ru>,
phd at sun.med.ru wrote:
> I started playing with GadFly a few weeks ago, so I downloaded latest
> versions of GadFly and kjBuckets.
> Yesterday I found a way to use kjSet in my program.
>
> BTW, what are "kw" in "kwParsing" and "kj" in "kjBuckets"?

That's for me to know and you to guess.

BTW, I ran the following benchmark on my workstation, emulating
your query with artificial data:

===snip
fanout = 5
length = 3000

# create a table for self-join test
import gadfly
g = gadfly.gadfly()
g.startup("jtest", "dbtest") # dir ./dbtest should exist
c = g.cursor()
print "making table"
c.execute("create table test (a integer, b integer)")
def mapper(i): return (i, i/fanout)
data = map(mapper, range(length))
c.execute("insert into test(a,b) values (?,?)", data)

# do a self join with fanout
from time import time
print "doing query"
now = time()
c.execute("select * from test x, test y where x.b=y.b and x.a<y.a")
print "elapsed", time()-now
print len(c.fetchall()), "results generated from initial", length
===snip

On my machine (200MHz P5 with 64Meg, NT4.0WS) it prints

C:\gadfly>testjoin.py
making table
doing query
elapsed 2.39299988747
6000 results generated from initial 3000

This is actually reasonably fast, I think. Adding an index didn't
make much of a difference, because for this particular query the
join algorithm builds an index on the fly when none exists.
The optimized join builds an intermediate table of size
15000 before eliminating most of the intermediate entries with the
x.a<y.a predicate, I think.

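To see where the 15000 and 6000 figures come from, here is a minimal
sketch in plain Python (modern Python 3, no Gadfly involved). The
`index` dictionary and the two list comprehensions are illustrative
assumptions about what an on-the-fly index plus a residual filter
could look like; Gadfly's actual join code may well work differently.

```python
from collections import defaultdict

fanout = 5
length = 3000

# Same artificial data as the benchmark: rows (a, b) with b = a // fanout.
rows = [(i, i // fanout) for i in range(length)]

# Throwaway index on column b, built on the fly (hypothetical structure,
# not taken from Gadfly's source).
index = defaultdict(list)
for a, b in rows:
    index[b].append((a, b))

# Equality join x.b = y.b: probe the index once per row of x.
intermediate = [(x, y) for x in rows for y in index[x[1]]]

# Residual predicate x.a < y.a applied to the intermediate pairs.
results = [(x, y) for x, y in intermediate if x[0] < y[0]]

print(len(intermediate), "intermediate pairs")
print(len(results), "results generated from initial", length)
```

Each value of b is shared by 5 rows, so the equality join yields
600 * 5 * 5 = 15000 pairs, and x.a<y.a keeps 10 of every 25,
which gives the 6000 rows reported above.
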
I conclude that the problem you had is probably your data and
your query, with your machine contributing a bit if it has little
memory. See my previous remarks for alternative approaches that
will probably work better.
-- Aaron Watters

===
His leather jacket had chains that would jingle
They both met movie stars, partied and mingled
Their A&R man said "I don't hear a single"
The future was wide open -- Tom Petty

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own