From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 24 Apr 1999 13:58:57 GMT
Subject: Python too slow for real world
References: <372068E6.16A4A90@icrf.icnet.uk>
    <3720A21B.9C62DDB9@icrf.icnet.uk>
    <3720C4DB.7FCF2AE@appliedbiometrics.com>
    <3720C6EE.33CA6494@appliedbiometrics.com> <y0jaevznhha.fsf@vier.idi.ntnu.no>
Message-ID: <3721CE21.5B881A10@appliedbiometrics.com>
Content-Length: 3393
X-UID: 1423

"Magnus L. Hetland" wrote:
>
> Christian Tismer <tismer at appliedbiometrics.com> writes:
>
> > Just did a little more cleanup to the code.
> > This it is:
>
> Hm. This code is nice enough (although not very intuitive...) But
> isn't it a bit troublesome that this sort of thing (which in many ways
> is a natural application for Python) is so much simpler to implement
> (in an efficient enough way) in Perl?

Well, Python pays for its generality: the object protocols, the
stack machine and the name lookups all apply even to the simplest
problems like Arne's. This leads to non-intuitive optimization
tricks like the ones I showed, although my buffering technique
applies to other languages as well. The brain-damaging part is
running over big, partial chunks of memory, trying to process
them efficiently without much object creation, and making sure
that the parts glue together correctly, that the last record
isn't missing, and so on. The real work is hidden somewhere in
between, like a side effect.

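The chunk-gluing idea can be sketched roughly as follows. This is
only a minimal illustration in current Python, not the code from the
earlier post: the name iter_records, the default chunk size and the
use of str.split are assumptions made for the example.

```python
import io


def iter_records(stream, delimiter=">", chunk_size=1 << 16):
    """Yield delimiter-separated records from a text stream read in big chunks.

    Hypothetical helper for illustration; not part of any standard module.
    """
    read = stream.read      # bind the method once to save a lookup per loop
    tail = ""               # partial record carried over from the last chunk
    while True:
        chunk = read(chunk_size)
        if not chunk:
            break
        parts = (tail + chunk).split(delimiter)
        tail = parts.pop()  # the last piece may be incomplete: glue it later
        for part in parts:
            yield part
    if tail:
        yield tail          # the final record, which is easy to lose


# A tiny chunk size forces records to straddle chunk boundaries.
print(list(iter_records(io.StringIO("a>bb>ccc"), chunk_size=2)))
# prints ['a', 'bb', 'ccc']
```

Most of the function is bookkeeping for the leftover tail; the
per-record work happens wherever the caller consumes the generator.
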
> Can something be done about it? Perhaps a buffering parameter to
> fileinput? In that case, a lot of the code could be put in that
> module, as part of the standard distribution... Even so -- you would
> somehow have to be able to treat the buffers as blocks... Hm.

I think something can be done.
First, I think I can set up a framework for this class of
problems, which takes a line-oriented algorithm and spits
out a convoluted thing like mine which does the same.

Another thing which appears worthwhile is generalizing the
readline function. I did that in my own buffered files,
but it would be twice as fast if readline/readlines could
do this on their own.

What I need is a variable line delimiter which can be set
as a property of a file object. In this case, I would
use ">" as the delimiter. For a fast XML scanner (one which
just gets the partitioning of XML pieces right, nothing else),
I would use "<" as the delimiter, read such chunks and break
them on ">", with a little repair code for comments,
">" appearing in attributes, etc.

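The "<" / ">" partitioning could look roughly like the sketch below.
The function name partition_markup and the ("tag", ...) / ("text", ...)
tuples are made up for illustration, and the repair code for comments
and for ">" inside attribute values is deliberately left out.

```python
def partition_markup(text):
    """Split markup into ("tag", ...) and ("text", ...) pieces.

    Hypothetical sketch: no validation, no handling of comments or of
    ">" inside attribute values.
    """
    pieces = []
    chunks = text.split("<")            # first cut: one chunk per "<"
    if chunks[0]:
        pieces.append(("text", chunks[0]))   # text before the first tag
    for chunk in chunks[1:]:
        tag, sep, rest = chunk.partition(">")  # second cut: break on ">"
        if not sep:
            # No closing ">" found: keep the raw chunk as text.
            pieces.append(("text", "<" + chunk))
            continue
        pieces.append(("tag", tag))
        if rest:
            pieces.append(("text", rest))
    return pieces


print(partition_markup('<a href="x">hi</a>'))
# prints [('tag', 'a href="x"'), ('text', 'hi'), ('tag', '/a')]
```
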
Conclusion:
My readline would be parameterized by a delimiter string.
I would *not* leave it attached to the line (as is done with
the CRs); instead I would return the delimiter as the EOF
indicator.

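One possible reading of this, sketched in current Python: records
come back without the delimiter, and since a record can then never
contain the delimiter, the delimiter string itself can serve as an
unambiguous end-of-file marker. The class name DelimitedReader and the
method name readrecord are assumptions for illustration, not an
existing file-object API.

```python
import io


class DelimitedReader:
    """Hypothetical reader returning records split on a delimiter string."""

    def __init__(self, stream, delimiter, chunk_size=1 << 16):
        self._read = stream.read
        self._delimiter = delimiter
        self._chunk_size = chunk_size
        self._buffer = ""
        self._exhausted = False

    def readrecord(self):
        """Return the next record without its delimiter.

        At end of input, return the delimiter itself as the EOF marker.
        """
        delimiter = self._delimiter
        while True:
            head, sep, rest = self._buffer.partition(delimiter)
            if sep:                      # a complete record is buffered
                self._buffer = rest
                return head
            if self._exhausted:
                if self._buffer:         # last record had no delimiter
                    self._buffer = ""
                    return head
                return delimiter         # nothing left: signal EOF
            chunk = self._read(self._chunk_size)
            if chunk:
                self._buffer += chunk
            else:
                self._exhausted = True


reader = DelimitedReader(io.StringIO("a>>b"), ">", chunk_size=2)
print([reader.readrecord() for _ in range(4)])
# prints ['a', '', 'b', '>']
```

Note that the empty record between the two delimiters stays
distinguishable from end of file, which a plain empty-string EOF
marker would not allow once the delimiter is stripped.
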
> (And... How about builtin regexes in P2?)

No. Noo! Please never! :-)
I really hate their design, and they shouldn't influence
Python in any way. What I like much better is Marc Lemburg's
tagging engine, which could have been used for this problem.
One should think of a nicer interface which allows one to
build readable, efficient tagging engines from Python code,
since at the moment this is a little at the assembly level :-)
All in all, I'd like to express little engines in Python,
but not these ugly, undebuggable, unreadable fly-dirt strings
which they call "regexen".
But that's my private opinion, which should not be taken as
an attack on anybody. I just prefer little machines which can
interact with Python directly.

ciao - chris

--
Christian Tismer :^) <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home