Author: daadhkiw_267 | Source: Internet | 2023-10-12 11:53
I want to modify a string with the help of re.sub:
>>> re.sub("sparta", r"\1", "Here is Sparta.", flags=re.IGNORECASE)
I expect to get:
'Here is Sparta.'
But I get an error instead:
>>> re.sub("sparta", r"\1", "Here is Sparta.", flags=re.IGNORECASE)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/re.py", line 155, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/usr/lib/python2.7/re.py", line 291, in filter
    return sre_parse.expand_template(template, match)
  File "/usr/lib/python2.7/sre_parse.py", line 833, in expand_template
    raise error, "invalid group reference"
sre_constants.error: invalid group reference
How should I use re.sub to get the correct result?
2 Answers
When you use \x in the replacement string (the second argument to re.sub), where x is a number, Python replaces it with the contents of group x.
You can define a group in your regex by wrapping it with parentheses, like so:
re.sub(r"capture (['me]{2})", r'group 1: \1', 'capture me!') # => group 1: me!
re.sub(r"capture (['me]{2})", r'group 1: \1', "capture 'em!") # => group 1: 'em!
Nested captures? I've lost count!
It's the position of the opening parenthesis that determines a group's number:
(this is the first group (this is the second) (this is the third))
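As a minimal sketch of that numbering rule (assuming Python 3; the pattern and input string here are made up for illustration), the outer group gets number 1 because its opening parenthesis comes first:

```python
import re

# Groups are numbered by the position of their opening parenthesis,
# counted from left to right, regardless of nesting depth.
m = re.match(r"(one (two) (three))", "one two three")

outer = m.group(1)   # the outermost group opens first
second = m.group(2)  # first nested group
third = m.group(3)   # second nested group

print(outer)   # one two three
print(second)  # two
print(third)   # three
```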
Named groups
Named groups are pretty useful when you work with the match object returned by re.match or re.search, for example (refer to the docs for more), and also when you write complex regexes, because they bring clarity.
You can name a group with the following syntax:
(?P<name>your pattern)
So, for example:
re.sub(r"(?P<first>hello(?P<second>[test]+)) (?P<third>[a-z])", r"first: \g<first>", "hellotest x") # => first: hellotest
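To illustrate the match-object side mentioned above, here is a small sketch (assuming Python 3; the group names key and value and the input string are hypothetical) that reads named groups from re.search and reuses them in a replacement:

```python
import re

# Hypothetical pattern: a word, an equals sign, and a number.
pattern = r"(?P<key>\w+)=(?P<value>\d+)"

m = re.search(pattern, "retries=3")
key = m.group("key")      # look a group up by name
value = m.group("value")
as_dict = m.groupdict()   # all named groups at once

# In the replacement string, named groups are referenced with \g<name>.
swapped = re.sub(pattern, r"\g<value>=\g<key>", "retries=3")

print(key, value)  # retries 3
print(as_dict)     # {'key': 'retries', 'value': '3'}
print(swapped)     # 3=retries
```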
What is group 0?
Group 0 is the entire match. But you can't use \0, because it is interpreted as \x00 (the actual value of this escape code). The solution is to use the named-group syntax (regular groups are a kind of named group: their name is just an integer): \g<0>. So, for example:
re.sub(r'[hello]+', r'\g<0>', 'lehleo') # => lehleo
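The difference is easier to see when the replacement adds something around the match. A short sketch, assuming Python 3 and an arbitrary input string:

```python
import re

# \g<0> inserts the whole match, so each run of digits gets bracketed.
wrapped = re.sub(r"\d+", r"[\g<0>]", "a1b22")

# \0 is read as an octal escape, so each match becomes a NUL character.
nul = re.sub(r"\d+", r"\0", "a1b22")

print(repr(wrapped))  # 'a[1]b[22]'
print(repr(nul))      # 'a\x00b\x00'
```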
For your problem
This answer is just supposed to explain capturing, not really to answer your question, since @Wiktor Stribiżew's answer is perfect.
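For completeness, here is a sketch of how the explanation above applies to the call from the question. It assumes Python 3, where the error is raised as re.error, and it is not taken from the other answer. The pattern "sparta" defines no group, so \1 has nothing to refer to; either add a group or reference the whole match:

```python
import re

text = "Here is Sparta."

# The original call fails: the pattern contains no capturing group.
try:
    re.sub("sparta", r"\1", text, flags=re.IGNORECASE)
    failed = False
except re.error:
    failed = True

# Option 1: wrap the pattern in parentheses so that group 1 exists.
with_group = re.sub("(sparta)", r"\1", text, flags=re.IGNORECASE)

# Option 2: keep the pattern and refer to the whole match with \g<0>.
whole_match = re.sub("sparta", r"\g<0>", text, flags=re.IGNORECASE)

print(failed)       # True
print(with_group)   # Here is Sparta.
print(whole_match)  # Here is Sparta.
```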