Find all rows whose column name contains a specific string

Question

I have a dataframe as shown below. Besides xvalues, it has 3 columns named "TTN_163_2.5_-40", "TTN_163_2.7_-40" and "TTN_163_3.6_-40".

I need to select the columns (with all their rows) whose names contain '2.5', '3.6' or '2.7'.

I also have some column names that contain 1.6, 1.62 and 1.656, and I need to select these separately. When I write df_psrr_funct_1V6.filter(regex='1\.6|^xvalues$'), I get the columns for 1.6, 1.65 and 1.62 as well, which I don't want. How can I select only the exact match?

I tried df_psrr_funct = df_psrr_funct.filter(regex='2.5'), but it does not capture the first column (xvalues).

Sample dataframe

xvalues  TTN_163_2.5_-40  TTN_163_2.7_-40  TTN_163_3.6_-40
23.0279         -58.7591         -58.5892         -60.0966
30.5284         -58.6903         -57.3153         -59.9111

Please see the image of my dataframe.

How can I do this?

If it only has those four columns, then why not df_psrr_funct = df_psrr_funct[["xvalues","TTN_163_2.5_-40"]]?
– Tim Roberts, Commented Sep 28, 2022 at 5:42

jezrael · Accepted Answer · 2022-09-28 09:32:32Z

Extend the regex with | for "or". ^ anchors the start of the string and $ anchors the end, so ^xvalues$ matches only the column named exactly xvalues and skips column names that merely contain it, such as xvalues 1 or aaa xvalues:

df_psrr_funct = df_psrr_funct.filter(regex=r'2\.5|^xvalues$')
print (df_psrr_funct)
   xvalues  TTN_163_2.5_-40
0  23.0279         -58.7591
1  30.5284         -58.6903

EDIT: If you need to match the value exactly between the underscores, include the _ on both sides:

print (df_psrr_funct)
   xvalues  TTN_163_1.6_-40  TTN_163_1.62_-40  TTN_163_1.656_-40
0  23.0279         -58.7591          -58.5892           -60.0966
1  30.5284         -58.6903          -57.3153           -59.9111

df_psrr_funct = df_psrr_funct.filter(regex=r'_1\.6_|^xvalues$')
print (df_psrr_funct)
   xvalues  TTN_163_1.6_-40
0  23.0279         -58.7591
1  30.5284         -58.6903
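To make the difference between the two patterns concrete, here is a minimal self-contained sketch. It rebuilds the sample frame from the printed output above (the variable names df, loose and exact are only for illustration) and compares the unanchored pattern from the question with the underscore-delimited one.

```python
import pandas as pd

# Sample frame rebuilt from the printed output above.
df = pd.DataFrame({
    "xvalues": [23.0279, 30.5284],
    "TTN_163_1.6_-40": [-58.7591, -58.6903],
    "TTN_163_1.62_-40": [-58.5892, -57.3153],
    "TTN_163_1.656_-40": [-60.0966, -59.9111],
})

# "1\.6" is found inside "1.62" and "1.656" too, so every column is kept.
loose = df.filter(regex=r"1\.6|^xvalues$")

# Requiring "_" on both sides keeps only the exact 1.6 column.
exact = df.filter(regex=r"_1\.6_|^xvalues$")

print(list(loose.columns))
print(list(exact.columns))
```

The raw-string prefix (r"...") keeps Python from treating \. as a string escape, so the backslash reaches the regex engine unchanged.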

Karthik S · Accepted Answer · 2022-09-28 05:54:25Z

1

Another approach:

df_psrr_funct.filter(regex=r'^\D+$|2.5')

   xvalues  TTN_163_2.5_-40
0  23.0279         -58.7591
1  30.5284         -58.6903
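One caveat with the pattern above: the dot in 2.5 is unescaped, so it matches any character. The following sketch shows the effect with plain re.search. The second and third column names are hypothetical, made up only to trigger the false matches.

```python
import re

# The first name is from the question; the other two are hypothetical.
names = ["TTN_163_2.5_-40", "TTN_163_12_5_-40", "TTN_245_3.6_-40"]

# Unescaped dot: "2.5" also matches "2_5" and "245".
unescaped = [n for n in names if re.search("2.5", n)]

# Escaped dot: only a literal "2.5" matches.
escaped = [n for n in names if re.search(r"2\.5", n)]

print(unescaped)
print(escaped)
```

On the sample data in the question both patterns select the same columns, so the difference only shows up with other column names.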


4 Comments

BeRT2me Over a year ago

Do keep in mind, . is the regex wildcard, so if you want a literal ., you must escape it: \.

Confused Over a year ago

I have some column names that contain 1.6, 1.62 and 1.656, and I need to select these separately. When I write df_psrr_funct_1V6.filter(regex='1\.6|^xvalues$'), I get the columns for 1.6, 1.65 and 1.62 as well, which I don't want. How can I select only the exact match?

Karthik S Over a year ago

@Hari, can you share the full column names? Without seeing the pattern, it is difficult to write a regex.

Confused Over a year ago

I don't think that is possible. I have 829 columns.
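With that many columns, one option is to let the code discover the values instead of writing a pattern per value. The sketch below is only an illustration: it assumes every data column follows the four-part PREFIX_ID_VALUE_TEMP layout seen in the question, and the names groups and frames are made up here.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame(
    [[23.0279, -58.7591, -58.5892, -60.0966]],
    columns=["xvalues", "TTN_163_1.6_-40", "TTN_163_1.62_-40", "TTN_163_1.656_-40"],
)

# Group column names by their third "_"-separated field (assumed layout).
groups = defaultdict(list)
for name in df.columns:
    parts = name.split("_")
    if len(parts) == 4:  # skips columns such as "xvalues"
        groups[parts[2]].append(name)

# One sub-frame per value, each keeping the shared xvalues column.
frames = {value: df[["xvalues"] + cols] for value, cols in groups.items()}

print(sorted(groups))
print(list(frames["1.6"].columns))
```

If some column names deviate from the assumed layout, the len(parts) == 4 check silently skips them, so it is worth comparing the grouped names against df.columns.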

Joran Beasley · Accepted Answer · 2022-09-28 06:02:24Z

1

Using regex for this doesn't make any sense... just do

columns_with_2point5 = [c for c in df.columns if "2.5" in c]
only_cool_cols = df[['xvalues'] + columns_with_2point5]

Don't overcomplicate it...

If you don't need the first column, you can just use filter with like instead of one of the regex solutions (see the first comment from @BeRT2me).
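The same no-regex idea can be extended to the 1.6 / 1.62 / 1.656 case from the question. This is a sketch, assuming the fields in each column name are separated by underscores as in the sample: a plain substring test keeps all three columns, while comparing against whole fields keeps only the exact one.

```python
import pandas as pd

df = pd.DataFrame(
    [[23.0279, -58.7591, -58.5892, -60.0966]],
    columns=["xvalues", "TTN_163_1.6_-40", "TTN_163_1.62_-40", "TTN_163_1.656_-40"],
)

# Substring test: "1.6" is contained in "1.62" and "1.656" as well.
substring_cols = [c for c in df.columns if "1.6" in c]

# filter(like=...) performs the same substring test on the column names.
like_cols = list(df.filter(like="1.6").columns)

# Whole-field test: split on "_" and compare complete fields.
token_cols = [c for c in df.columns if "1.6" in c.split("_")]

selected = df[["xvalues"] + token_cols]
print(list(selected.columns))
```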

edited Sep 28, 2022 at 6:02

answered Sep 28, 2022 at 5:47


1 Comment

BeRT2me Over a year ago

That list comprehension is essentially equivalent to df.filter(like='2.5')
