import-bot (20211)
#1
Re: Named v. Numeric backreferences
[Originally posted by daryl harms]

Hi Fred,

On 1.5.2 I need to use raw strings to even get the named group example to work.

Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on win32
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
IDLE 0.5 -- press F1 for help
>>> import re
>>> a = '<element attribute="value"></element> <element></element>'
>>> re.sub("<(?P<element>[^ >]*)(?P<attr>[^>]*)> ?</1>",
"<g><element>g<attr>/>", a)
'<element attribute="value"></element> <element></element>'
>>> re.sub(r"<(?P<element>[^ >]*)(?P<attr>[^>]*)> ?<1>",
r"<g><element>g<attr>/>", a)
'<element attribute="value"/> <element/>'
>>>


The replacement (second) string wouldn't need this but it's safer to
always use raw strings when working with RE's.

Using raw strings also gets your numeric back reference example to
work (for me that is):

>>> re.sub("<([^ >]*)([^>]*)> ?</1>","<12/>",a)
'<element attribute="value"></element> <element></element>'
>>> re.sub(r"<([^ >]*)([^>]*)> ?</1>",r"<12/>",a)
'<element attribute="value"/> <element/>'
>>>

Just as a further note, I usually use the alternate g<1> syntax for numeric
back references and place them in {}'s to keep even safer i.e.:

>>> re.sub(r"<([^ >]*)([^>]*)> ?</1>","r<{g<1>}{g<2>}/>",a)
'r<{element}{ attribute="value"}/> r<{element}{}/>'
>>>

Please send a follow-up if I'm misinterpreting your example and what you are
trying to do.

Daryl





> I posted this to c.l.python w/o luck. Any help here?
>
> This regex doesn't work:
> MajourasString = re.sub("<([^ >]*)([^>]*)> ?</1>", "<12/>",
> MajourasString)
>
> This one does:
> MajourasString = re.sub("<(?P<element>[^ >]*)(?P<attr>[^>]*)> ?</1>",
> "<g><element>g<attr>/>", MajourasString)
>
> The object is to transform SGML '<element attribute="value"></element>' and
> '<element></element>' to XML '<element attribute="value"/>' and '<element/>'.
>
> According to the docs, named and numeric backreferences should work
> identically. So what gives?
>
> Thanks!
import-bot (20211)
#2
Re: Named v. Numeric backreferences
[Originally posted by daryl harms]

Hi Fred,

Sorry, but I forgot to escape the \'s in my response. Hopefully the following
will make more sense:

On 1.5.2 I need to use raw strings to even get the named group example to work.

Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on win32
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
IDLE 0.5 -- press F1 for help
>>> import re
>>> a = '<element attribute="value"></element> <element></element>'
>>> re.sub("<(?P<element>[^ >]*)(?P<attr>[^>]*)> ?</\1>",
"<\g<element>\g<attr>/>", a)
'<element attribute="value"></element> <element></element>'
>>> re.sub(r"<(?P<element>[^ >]*)(?P<attr>[^>]*)> ?</\1>",
r"<\g<element>\g<attr>/>", a)
'<element attribute="value"/> <element/>'
>>>


Actually, the replacement (second) string wouldn't need to be raw here,
because \g is not a string escape sequence, but it's safer to always use
raw strings when working with REs.
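
To make the difference concrete, here is a minimal sketch for a current
Python 3 interpreter (not the 1.5.2 session shown above). The variable names
plain, raw, pattern and result are only illustrative. It shows that a normal
string literal turns \1 into a single control character before re ever sees
it, while a raw string keeps the backslash so re can treat it as a
backreference.

```python
import re

a = '<element attribute="value"></element> <element></element>'

# In a normal string literal, "\1" is an octal escape: one character, chr(1).
plain = "</\1>"
# In a raw string literal the backslash survives, so re sees a backreference.
raw = r"</\1>"

print(len(plain), len(raw))            # 4 5
print(plain == "</" + chr(1) + ">")    # True

# With a raw pattern, \1 refers back to the first group (the element name).
pattern = r"<(?P<element>[^ >]*)(?P<attr>[^>]*)> ?</\1>"
result = re.sub(pattern, r"<\g<element>\g<attr>/>", a)
print(result)

# The non-raw numeric pattern contains chr(1), so it never matches.
unchanged = re.sub("<([^ >]*)([^>]*)> ?</\1>", r"<\1\2/>", a)
print(unchanged == a)                  # True
```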

Using raw strings also gets your numeric back reference example to
work (for me that is):

>>> re.sub("<([^ >]*)([^>]*)> ?</\1>","<\1\2/>",a)
'<element attribute="value"></element> <element></element>'
>>> re.sub(r"<([^ >]*)([^>]*)> ?</\1>",r"<\1\2/>",a)
'<element attribute="value"/> <element/>'
>>>

Just as a further note, I usually use the alternate \g<1> syntax for numeric
back references and place them in {}'s to be even safer, i.e.:

>>> re.sub(r"<([^ >]*)([^>]*)> ?</\1>","r<{\g<1>}{\g<2>}/>",a)
'r<{element}{ attribute="value"}/> r<{element}{}/>'
>>>
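
As a hedged sketch of the \g<number> form in a current Python 3 interpreter
(again not the 1.5.2 session above; the variable names are illustrative): the
explicit form behaves like \1 and \2, and it is mainly useful when a digit
follows the reference. Note that anything else in the replacement string,
such as braces, is copied to the output literally, which is why they appear
in the result above.

```python
import re

a = '<element attribute="value"></element> <element></element>'
pattern = r"<([^ >]*)([^>]*)> ?</\1>"

# \g<1> and \g<2> refer to the same groups as \1 and \2.
short_form = re.sub(pattern, r"<\1\2/>", a)
long_form = re.sub(pattern, r"<\g<1>\g<2>/>", a)
print(short_form == long_form)

# The explicit form matters when a digit follows the reference:
# r"\g<1>0" is group 1 followed by "0", whereas r"\10" would be
# read as a reference to group 10.
with_digit = re.sub(r"(\w+)", r"\g<1>0", "ab cd")
print(with_digit)

# Braces are not part of the syntax; re.sub copies them through unchanged.
braces = re.sub(pattern, r"<{\g<1>}{\g<2>}/>", a)
print(braces)
```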

Please send a follow-up if I'm misinterpreting your example and what you are
trying to do.

Daryl
import-bot (20211)
#3
[Originally posted by vepxistqaosani]

I posted this to c.l.python w/o luck. Any help here?

This regex doesn't work:
MajourasString = re.sub("<([^ >]*)([^>]*)> ?</\1>", "<\1\2/>",
MajourasString)

This one does:
MajourasString = re.sub("<(?P<element>[^ >]*)(?P<attr>[^>]*)> ?</\1>",
"<\g<element>\g<attr>/>", MajourasString)

The object is to transform SGML '<element attribute="value"></element>' and
'<element></element>' to XML '<element attribute="value"/>' and '<element/>'.

According to the docs, named and numeric backreferences should work
identically. So what gives?

Thanks!