From: aaron_watters at my-dejanews.com (aaron_watters at my-dejanews.com)
Date: Mon, 12 Apr 1999 12:42:05 GMT
Subject: GadFly - MemoryError
References: <7eih6b$ck2$1@nnrp1.dejanews.com> <Pine.SOL2.3.96.SK.990408202514.3496S-100000@sun.med.ru>
Message-ID: <7espms$idh$1@nnrp1.dejanews.com>
Content-Length: 2202
X-UID: 1233
In article <Pine.SOL2.3.96.SK.990408202514.3496S-100000 at sun.med.ru>,
phd at sun.med.ru wrote:
> I started playing with GadFly a few weeks ago, so I downloaded latest
> versions of GadFly and kjBuckets.
> Yesterday I found a way to use kjSet in my program.
>
> BTW, what are "kw" in "kwParsing" and "kj" in "kjBuckets"?
That's for me to know and you to guess.
BTW, I ran the following benchmark on my workstation, emulating
your query with artificial data:
===snip
fanout = 5
length = 3000
# create a table for self-join test
import gadfly
g = gadfly.gadfly()
g.startup("jtest", "dbtest") # dir ./dbtest should exist
c = g.cursor()
print "making table"
c.execute("create table test (a integer, b integer)")
def mapper(i): return (i, i/fanout)
data = map(mapper, range(length))
c.execute("insert into test(a,b) values (?,?)", data)
# do a self join with fanout
from time import time
print "doing query"
now = time()
c.execute("select * from test x, test y where x.b=y.b and x.a<y.a")
print "elapsed", time()-now
print len(c.fetchall()), "results generated from initial", length
====snip
On my machine (200 MHz P5 with 64 MB of RAM, NT 4.0 WS) it prints
C:\gadfly>testjoin.py
making table
doing query
elapsed 2.39299988747
6000 results generated from initial 3000
This is actually reasonably fast, I think. Adding an index didn't
make much of a difference, because for this particular query the join
algorithm builds an index on the fly when none exists.
The optimized join builds an intermediate table of 15000 rows
(600 distinct values of b, each giving 5 x 5 matching pairs) before
the x.a<y.a predicate eliminates all but 6000 of them, I think.
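To check those counts independently of Gadfly, here is a rough sketch of
a join that builds its index on the fly, in plain modern Python. The
function name self_join_counts and the dictionary used as the index are
made up for illustration; this is not Gadfly's actual join code, only the
same arithmetic on the same artificial data.

```python
from collections import defaultdict


def self_join_counts(length=3000, fanout=5):
    """Return (intermediate_rows, final_rows) for the benchmark's self join.

    Hypothetical helper: it imitates an equality join on b followed by
    the x.a < y.a filter, and makes no claim about Gadfly internals.
    """
    # Same artificial data as the benchmark: rows (a, b) with b = a // fanout.
    rows = [(i, i // fanout) for i in range(length)]

    # Build a throwaway index on column b, as a join might do when the
    # table has no declared index.
    index = defaultdict(list)
    for a, b in rows:
        index[b].append(a)

    # Equality join x.b = y.b: every pair of rows that share a b value.
    intermediate = [(xa, ya) for xa, b in rows for ya in index[b]]

    # Apply the remaining predicate x.a < y.a.
    final = [(xa, ya) for xa, ya in intermediate if xa < ya]
    return len(intermediate), len(final)


print(self_join_counts())  # (15000, 6000)
```

With length 3000 and fanout 5 there are 600 groups of 5 rows, so the
equality join yields 600 * 25 = 15000 pairs and the x.a<y.a filter keeps
600 * 10 = 6000, which matches the result count printed above.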
I conclude that the problem you had is probably your data and
your query, with your machine contributing a bit if it has little
memory. See my previous remarks for alternative approaches that
will probably work better.
-- Aaron Watters
===
His leather jacket had chains that would jingle
They both met movie stars, partied and mingled
Their A&R man said "I don't hear a single"
The future was wide open -- Tom Petty