1

I have a string like

sg_ts_feature_name_01_some_xyz

From this, I want to extract the two words that come after the prefix sg_ts_, keeping the underscore that separates them.

The result should be

feature_name

This regex

import re

st = 'sg_ts_my_feature_01'
a = re.match('sg_ts_([a-zA-Z_]*)_*', st)
print(a.group())

returns

sg_ts_my_feature_

whereas I expect

my_feature
3
  • Have a look at this demo. Commented Sep 26, 2015 at 9:19
  • stribizhev is too humble to put his best answer as just a comment and leave without traces.... Commented Sep 26, 2015 at 9:24
  • No, I just was looking after my 2 children, I have no time to write a full answer. Glad you could solve your issue with others' help. Have a great weekend. Commented Sep 26, 2015 at 9:59

2 Answers

2

The problem is that you are asking for the whole match, not just the capture group. From the manual:

group([group1, ...]) Returns one or more subgroups of the match. If there is a single argument, the result is a single string; if there are multiple arguments, the result is a tuple with one item per argument. Without arguments, group1 defaults to zero (the whole match is returned). If a groupN argument is zero, the corresponding return value is the entire matching string; if it is in the inclusive range [1..99], it is the string matching the corresponding parenthesized group.

You asked for a.group(), which is equivalent to a.group(0), i.e. the whole match. Asking for a.group(1) gives you only the capture group in the parentheses.
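As a minimal sketch of the difference, using the string and pattern from the question (the variable names m, whole_match and first_group are only illustrative):

```python
import re

st = 'sg_ts_my_feature_01'
m = re.match(r'sg_ts_([a-zA-Z_]*)_*', st)

whole_match = m.group()    # same as m.group(0): the entire matched text
first_group = m.group(1)   # only the text captured by the parentheses

print(whole_match)   # sg_ts_my_feature_
print(first_group)   # my_feature_
```

Note that group(1) still ends in an underscore here, because the character class [a-zA-Z_] also matches _ and the * quantifier is greedy.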



2

You can ask for just the parenthesized group with a.group(1), which returns

'my_feature_'

In addition, if your string always has this form, you could anchor the pattern with the end-of-string character $ and make the inner match lazy instead of greedy, so that it doesn't swallow the trailing _.

a = re.match(r'sg_ts_([a-zA-Z_]*?)[_0-9]*$', st)
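A rough sketch of how this behaves on both strings from the question. The lazy pattern is the one above. The TWO_WORDS pattern and the extract helper are assumptions added for illustration, not part of the answer: they presume that a "word" is a run of ASCII letters and that exactly two words are wanted.

```python
import re

# Lazy pattern from the answer: requires everything after the capture
# to be digits/underscores up to the end of the string.
LAZY = re.compile(r'sg_ts_([a-zA-Z_]*?)[_0-9]*$')

# Hypothetical alternative: exactly two alphabetic words joined by one underscore.
TWO_WORDS = re.compile(r'sg_ts_([a-zA-Z]+_[a-zA-Z]+)')

def extract(pattern, text):
    """Return capture group 1, or None when the pattern does not match."""
    m = pattern.match(text)
    return m.group(1) if m else None

short = 'sg_ts_my_feature_01'
long_ = 'sg_ts_feature_name_01_some_xyz'

print(extract(LAZY, short))       # my_feature
print(extract(LAZY, long_))       # None (letters follow the digits)
print(extract(TWO_WORDS, short))  # my_feature
print(extract(TWO_WORDS, long_))  # feature_name
```

The lazy, anchored version only fits strings that end in digits and underscores, which is why it fails on the longer example. The two-word version does not depend on what follows, but it will not match if either word contains a digit.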
