
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 26 Apr 1999 15:07:42 GMT
Subject: Python too slow for real world
References: <372068E6.16A4A90@icrf.icnet.uk>
		<3720A21B.9C62DDB9@icrf.icnet.uk>
		<3720C4DB.7FCF2AE@appliedbiometrics.com>
		<3720C6EE.33CA6494@appliedbiometrics.com>
		<y0jaevznhha.fsf@vier.idi.ntnu.no>
		<glmvhemn4zx.fsf@caffeine.mitre.org>
		<y0j7lr0obcs.fsf@vier.idi.ntnu.no> <Pine.SUN.3.95-heb-2.07.990426013235.1500B-100000@sunset.ma.huji.ac.il>
Message-ID: <3724813E.ED53908F@appliedbiometrics.com>
Content-Length: 2666
X-UID: 453


Moshe Zadka wrote:
>
> On 25 Apr 1999, Magnus L. Hetland wrote:
...
> > Now, that's really simple -- because re.py is slow. I thought maybe
> > some of the slowness might be improved by a c-implementation, that's
> > all. Not too important to me...
>
> Um....two wrong assumptions here:
> 1. C implementation is /not/ the same as core status: C extension modules
> are numerous and wonderful, for example...

As Mark already pointed out, what's the difference?
You will not see any performance change whether a module
is in the core or in an extra DLL. The calling mechanism
is always the same (well, the call instruction on x86 takes 1 byte more
for the DLL :-) and it is not very fast. So the fewer calls, the better.
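
To make the "fewer calls" point concrete, here is a minimal sketch. The
function names and the data set are made up for illustration, and the
timings depend on your machine and Python version, so treat compare()
as something to try rather than a benchmark result.

```python
import operator
import timeit

# Hypothetical test data, chosen only for illustration.
values = list(range(100_000))

def sum_with_many_calls(items):
    """Cross the Python -> C call boundary once per element."""
    total = 0
    for item in items:
        total = operator.add(total, item)  # one C function call per item
    return total

def sum_with_one_call(items):
    """Cross the boundary once; the loop itself runs inside C."""
    return sum(items)

def compare(repeat=20):
    """Return (seconds for many calls, seconds for one call)."""
    many = timeit.timeit(lambda: sum_with_many_calls(values), number=repeat)
    one = timeit.timeit(lambda: sum_with_one_call(values), number=repeat)
    return many, one

if __name__ == "__main__":
    print("many calls: %.3fs, one call: %.3fs" % compare())
```

Both functions compute the same sum; the only difference is how often
the call mechanism is exercised.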

> 2. re.py is a (very thin) wrapper around pcre, a C extension module for
> Perl Compatible Regular Expressions.

Right, but it carries the class protocol overhead all the time.
Returning matches always involves creating an instance of
a match object, plus a number of tuple-building operations.
This is where unnecessary interpreter overhead
can be saved: a C extension could create results more efficiently,
since it is not forced to hold every
intermediate result in a Python object, which involves memory
allocation and so on.
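
A rough way to see the cost of per-match objects, sketched with the
current standard `re` module (an assumption on my side: this is not the
re.py/pcre wrapper discussed here, only an analogy). finditer() hands
back one match object per hit, while findall() builds the list of
strings on the C side. The pattern, the text and the function names are
made up for illustration.

```python
import re
import timeit

# Hypothetical pattern and input, chosen only for illustration.
WORD = re.compile(r"\w+")
TEXT = "the quick brown fox jumps over the lazy dog " * 1000

def words_via_match_objects(text):
    """One match object is created per hit, then unpacked in Python."""
    return [m.group(0) for m in WORD.finditer(text)]

def words_via_findall(text):
    """The result list of strings is built inside the C code."""
    return WORD.findall(text)

def compare(repeat=20):
    """Return (seconds with match objects, seconds with findall)."""
    objs = timeit.timeit(lambda: words_via_match_objects(TEXT), number=repeat)
    flat = timeit.timeit(lambda: words_via_findall(TEXT), number=repeat)
    return objs, flat

if __name__ == "__main__":
    print("match objects: %.3fs, findall: %.3fs" % compare())
```

Both functions return the same list of words; only the number of
intermediate Python objects differs.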

> Which just goes to say that while pcre can certainly be optimized, it
> can't be done by simply rewriting it in C.
> <0.5 wink>

Surely not, since it is written in C <1.5 wink>.
If you are referring to re.py, a (nearly) direct translation into
C would indeed not help much. P2C does that, but since it can
only remove the interpreter overhead, you will not save more
than 30-40 percent. A hand-coded C version would try to avoid
as much overhead as possible. The main difference is that you
know the data types you are dealing with, so you can optimize
for that case instead of having to handle the general case
as Python does.

But if Python already had an optional strong typing concept, plus
some sealing option for modules and classes that would allow
compiled method tables to be used instead of attribute lookups,
things could change dramatically. Given that, I believe re.py could
be made fast enough without involving C.
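
Python has no such sealing option, so the following is only a hand-made
approximation of the idea: look a method up once and reuse the result,
instead of repeating the attribute lookup on every call. The Counter
class and both functions are hypothetical names for this sketch.

```python
class Counter:
    """Tiny class used only to have a method to call."""

    def __init__(self):
        self.count = 0

    def bump(self):
        self.count += 1

def bump_with_lookup(counter, n):
    """Resolve counter.bump through attribute lookup on every iteration."""
    for _ in range(n):
        counter.bump()
    return counter.count

def bump_with_cached_method(counter, n):
    """Resolve the bound method once, then reuse it in the loop."""
    bump = counter.bump  # single attribute lookup, done by hand
    for _ in range(n):
        bump()
    return counter.count
```

A sealed class could let a compiler do this binding automatically and
safely; done by hand, it is only correct as long as nobody rebinds the
method while the loop is running.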

ciao - chris

--
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home