From: dfan at harmonixmusic.com (Dan Schmidt)
Date: 10 May 1999 12:26:36 -0400
Subject: An efficient split function
References: <wy3e15x8sa.fsf@wiggum.dejanews.com> <14134.64897.771093.125780@amarok.cnri.reston.va.us>
Message-ID: <wk4slkevr7.fsf@turangalila.harmonixmusic.com>
Content-Length: 1301
X-UID: 1811

"Andrew M. Kuchling" <akuchlin at cnri.reston.va.us> writes:

| William S. Lear writes:
|
| >Surprisingly, to me, the Python version far outperformed the Perl
| >version. Running on 1 million lines of input of 9 fields each, the
| >Python version ran in just under 20 seconds, the Perl version in
| >just under 40 seconds (this on a 400Mhz Pentium Linux box).
|
| Note that your use of split(/\|/) in Perl requires using the
| regular expression engine, instead of a simple C splitting loop.
| Try using a literal string instead of a regex, as in split('|',
| ...); that will probably even out the speeds.

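The distinction drawn in the quoted text, a regex engine versus a plain
splitting loop, does exist on the Python side, where str.split() takes a
literal separator and re.split() takes a pattern. The following is only a
rough timing sketch under assumptions: the sample record and the names
split_plain and split_regex are made up for illustration, and the numbers
it prints will vary by machine and say nothing about Perl.

```python
import re
import timeit

# Hypothetical sample record with 9 bar-separated fields,
# loosely modelled on the input described in the quoted text.
line = "|".join(f"field{i}" for i in range(9))

# Precompiled pattern matching one literal bar.
bar = re.compile(r"\|")

def split_plain():
    # Literal-separator split; no regex engine involved.
    return line.split("|")

def split_regex():
    # Same split, routed through the regex engine.
    return bar.split(line)

# Both approaches must produce the same fields before timing them.
assert split_plain() == split_regex()

n = 100_000
t_plain = timeit.timeit(split_plain, number=n)
t_regex = timeit.timeit(split_regex, number=n)
print(f"str.split: {t_plain:.3f}s  re.split: {t_regex:.3f}s")
```

Timing both functions on the same record keeps the comparison about the
splitting mechanism alone, not about file I/O.
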
The first argument to Perl's split() is a regular expression. If
it's a string, it'll just get converted into a regexp (except for the
special case ' '; it's Perl, there had to be a special case). So

- You actually need to use '\|', not '|', if you're going to use a
  string instead of a regexp (try it and see);

- '\|' isn't actually any faster than /\|/ (I benchmarked it to
  check).

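The reason the bar needs escaping is that '|' is the alternation operator
in regex syntax, so an unescaped '|' is a pattern meaning "empty or empty"
and never matches a literal bar. This is a Python analogue using the re
module, not Perl itself; the sample string is an assumption chosen for
illustration.

```python
import re

line = "a|b|c"  # hypothetical sample input

# Unescaped, "|" is alternation between two empty patterns:
# it matches the empty string, not a literal bar.
assert re.fullmatch("|", "") is not None
assert re.fullmatch("|", "|") is None

# Escaped, the pattern matches exactly one literal bar.
assert re.fullmatch(r"\|", "|") is not None

fields_regex = re.split(r"\|", line)  # regex split on a literal bar
fields_plain = line.split("|")        # plain string split, no regex

print(fields_regex)
print(fields_plain)
```

Python keeps the two cases apart by offering str.split() for literal
separators, which is why no escaping is needed in the last call.
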
--
Dan Schmidt -> dfan at harmonixmusic.com, dfan at alum.mit.edu
Honest Bob & the                http://www2.thecia.net/users/dfan/
Factory-to-Dealer Incentives -> http://www2.thecia.net/users/dfan/hbob/
Gamelan Galak Tika -> http://web.mit.edu/galak-tika/www/