From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 26 Apr 1999 15:07:42 GMT
Subject: Python too slow for real world
References: <372068E6.16A4A90@icrf.icnet.uk>
 <3720A21B.9C62DDB9@icrf.icnet.uk>
 <3720C4DB.7FCF2AE@appliedbiometrics.com>
 <3720C6EE.33CA6494@appliedbiometrics.com>
 <y0jaevznhha.fsf@vier.idi.ntnu.no>
 <glmvhemn4zx.fsf@caffeine.mitre.org>
 <y0j7lr0obcs.fsf@vier.idi.ntnu.no>
 <Pine.SUN.3.95-heb-2.07.990426013235.1500B-100000@sunset.ma.huji.ac.il>
Message-ID: <3724813E.ED53908F@appliedbiometrics.com>
Content-Length: 2666
X-UID: 453

Moshe Zadka wrote:
>
> On 25 Apr 1999, Magnus L. Hetland wrote:
...
> > Now, that's really simple -- because re.py is slow. I thought maybe
> > some of the slowness might be improved by a c-implementation, that's
> > all. Not too important to me...
>
> Um....two wrong assumptions here:
> 1. C implementation is /not/ the same as core status: C extension modules
> are numerous and wonderful, for example...

As Mark already pointed out, what's the difference?
You will not see any performance change, whether a module
is in the core or in an extra DLL. The calling mechanism
is always the same (well, the call instruction on x86 takes
1 byte more to reach the DLL :-) and it is not very fast.
So the fewer calls, the better.

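To illustrate the "fewer calls" point, here is a minimal sketch using the
present-day `re` module, not the 1999 re.py/pcre pair discussed in this
thread. The pattern, the sample lines, and the function names are all
made up for illustration. Both functions count the same matches; one
crosses into the regex engine once per line, the other once in total.

```python
import re
import timeit

# Hypothetical sample data, chosen only for illustration.
WORD = re.compile(r"\bspam\b")
LINES = ["eggs and spam", "no match here", "spam spam spam"] * 1000
TEXT = "\n".join(LINES)


def count_per_line():
    """One call into the regex engine (and one result list) per line."""
    return sum(len(WORD.findall(line)) for line in LINES)


def count_in_one_call():
    """A single call that scans the whole text at once."""
    return len(WORD.findall(TEXT))


def compare(number=20):
    """Return (per_line_seconds, one_call_seconds) for `number` runs each."""
    per_line = timeit.timeit(count_per_line, number=number)
    one_call = timeit.timeit(count_in_one_call, number=number)
    return per_line, one_call
```

Calling `compare()` returns the two timings. The actual numbers depend
on the machine and the Python version, so none are quoted here.
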
> 2. re.py is a (very thin) wrapper around pcre, a C extension module for
> Perl Compatible Regular Expressions.

Right, but it carries the class protocol overhead all the time.
Returning matches always involves creating an instance of
a match object, plus a number of tuple-building operations.
This is where unnecessary interpreter overhead
can be saved: a C extension could create results more efficiently,
since it is not forced to hold every intermediate result in a
Python object, which involves memory allocation and so on.

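The per-match objects described above can still be observed in the
present-day `re` module. The sketch below is not the 1999 re.py code,
and the pattern and input strings are invented for illustration. It
shows that every successful match yields a separate match object,
that `groups()` hands back a tuple, and that `findall` builds an
equivalent result without exposing match objects to the caller.

```python
import re

# Hypothetical pattern and inputs, chosen only for illustration.
PAIR = re.compile(r"(\w+)=(\w+)")

first = PAIR.match("key=value")
second = PAIR.match("key=value")

# Two matches of the same text are two distinct Python objects.
distinct_objects = first is not second

# groups() returns the captured substrings as a tuple.
groups = first.groups()

# Going through match objects: one object plus one tuple per hit.
via_match_objects = [m.groups() for m in PAIR.finditer("a=1 b=2 c=3")]

# findall assembles the list of tuples inside the regex module.
via_findall = PAIR.findall("a=1 b=2 c=3")
```

Both lists hold the same tuples; the difference lies only in how many
intermediate objects the calling code has to touch.
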
> Which just goes to say that while pcre can certainly be optimized, it
> can't be done by simply rewriting it in C.
> <0.5 wink>

Surely not, since it is written in C <1.5 wink>.
If you are referring to re.py, a (nearly) direct translation into
C would indeed not help much. P2C does that, but since it can
only remove the interpreter overhead, you will not save more
than 30-40 percent. A hand-coded C version would try to avoid
as much overhead as possible. The main difference is that you
know the data types you are dealing with, so you can optimize
for this case instead of having to take care of the general case
as Python does.

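A loose Python-level analogy to "optimize the case you know" (the
post itself is about hand-written C, so this is an illustration of the
idea, not of the author's method): when the thing being searched for
is known to be a fixed prefix, a special-purpose check can replace the
general regex machinery. The function names and sample lines below are
hypothetical.

```python
import re

# General-purpose route: a compiled pattern handles any regex.
GENERAL = re.compile(r"From: ")


def count_general(lines):
    """Count lines matching the pattern at the start, via the regex engine."""
    return sum(1 for line in lines if GENERAL.match(line))


def count_specialized(lines):
    """Same count, exploiting the knowledge that the pattern is a literal prefix."""
    return sum(1 for line in lines if line.startswith("From: "))


# Hypothetical sample input.
SAMPLE = [
    "From: someone",
    "Subject: Python too slow for real world",
    "From: someone else",
    "  From: indented, so not a match",
]
```

Both functions return the same count for `SAMPLE`; the specialized one
simply never has to consider patterns it will not be given.
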
But if Python already had an optional strong type concept, plus
some sealing option for modules and classes which would allow
compiled method tables to be used instead of attribute lookups,
things could change dramatically. Given that, re.py could
be made fast enough without involving C, I believe.

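Python has no sealing option of the kind imagined above. The nearest
thing that can be done by hand is to perform an attribute lookup once
and reuse the result, which is what the sketch below does. It is only
an approximation of the idea, and the function names and sample data
are made up for illustration.

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
DIGITS = re.compile(r"\d+")
LINES = ["abc 123", "no digits", "42", "x9y"]


def count_with_lookup(lines, pattern):
    """Looks up the `search` attribute on `pattern` in every iteration."""
    hits = 0
    for line in lines:
        if pattern.search(line):
            hits += 1
    return hits


def count_with_bound_method(lines, pattern):
    """Looks up `search` once, then reuses the bound method in the loop."""
    search = pattern.search
    hits = 0
    for line in lines:
        if search(line):
            hits += 1
    return hits
```

The two functions always agree on the result. Whether the second one
is measurably faster depends on the interpreter version, so no timing
is claimed here.
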
ciao - chris

--
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home