From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 6 Apr 1999 01:06:45 GMT
Subject: Possible regex match bug (re module)
In-Reply-To: <19990405084819.B802985@vislab.epa.gov>
References: <19990405084819.B802985@vislab.epa.gov>
Message-ID: <000001be7fc9$bdb0fec0$65a22299@tim>
Content-Length: 1550
X-UID: 247

[Randall Hopper]
> Re doesn't handle named groups in alternative patterns like it
> seems it should. Given an alternative pattern with a particular group
> name in each, it only assigns the match if the group name matches the
> last alternative.

re should raise an exception here -- it never intended to allow your
pattern. The deal is that symbolic group names are no more than that:
names for numbered groups. Like so:

>>> import re
>>> p = re.compile('(---(?P<id>[^-]*)---)|(===(?P<id>[^=]*)===)')
>>> p.groupindex
{'id': 4}
>>>

The groupindex member maps a symbolic name to the numeric group for which
it's an alias, and so in this pattern referring to group "id" is identical
to referring to group number 4. That explains everything you've seen. re
should instead notice that it already had a definition for name "id", and
complain about the redefinition.

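To make the alias point concrete, here is a minimal sketch written for a current Python 3 interpreter (an assumption; the session above is from 1999). It uses a cut-down, single-alternative version of the pattern, looks up the number that "id" stands for, and checks that fetching the group by name or by that number gives the same string. The last few lines assume that present-day versions of re reject a repeated group name at compile time, which is the complaint asked for above; variable names such as alias and redefinition_rejected are purely illustrative.

```python
import re

# Single-alternative pattern: "id" is just another name for group 2.
p = re.compile(r'(---(?P<id>[^-]*)---)')
m = p.match('---abc---')

alias = p.groupindex['id']      # the group number that "id" aliases
by_name = m.group('id')         # lookup through the symbolic name
by_number = m.group(alias)      # lookup through the plain number

# Assumption: a current re module refuses to define "id" twice.
try:
    re.compile(r'(---(?P<id>[^-]*)---)|(===(?P<id>[^=]*)===)')
    redefinition_rejected = False
except re.error:
    redefinition_rejected = True
```

Both lookups return the same text because the name carries no information beyond the group number.
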
Same as in Perl, you're going to have to write a hairier regexp with only
one interesting group, or give the interesting groups different names and
sort them out after the match (in an alternation involving named groups, at
most one will be non-None after a match). Here's a discouraging <wink>
example of the former approach:

>>> p = re.compile(r"([-=])\1\1(?P<id>((?!\1).)*)\1\1\1").match
>>> p("---abc---").group("id")
'abc'
>>> p("===def===").group("id")
'def'
>>> print(p("===ghi---"))
None
>>> p("------").group("id")
''
>>> p("---=---").group("id")
'='
>>> print(p("===a=b==="))
None
>>>

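For comparison, the latter approach (different names, sorted out after the match) might look roughly like the sketch below. The group names dash_id and eq_id and the helper extract_id are hypothetical, chosen only for illustration. The helper relies on the point made above that at most one of the named groups is non-None after a match, and it tests against None rather than truthiness so that an empty id still counts.

```python
import re

# Hypothetical names: one distinct group name per alternative.
pattern = re.compile(r'---(?P<dash_id>[^-]*)---|===(?P<eq_id>[^=]*)===')

def extract_id(text):
    """Return the delimited id at the start of text, or None if no match."""
    m = pattern.match(text)
    if m is None:
        return None
    # Only the alternative that matched sets its group; the other is None.
    # Compare with None explicitly, since '' is a legitimate id.
    for value in m.groupdict().values():
        if value is not None:
            return value
    return None
```

On the same inputs as the session above, this should give the same answers: 'abc', 'def', None for the mismatched delimiters, '' for "------", '=' for "---=---", and None for "===a=b===".
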
if-regexps-are-your-friends-you'd-hate-to-meet-your-enemies-ly y'rs - tim