I have a dataframe as shown below. Besides xvalues, it has 3 columns named "TTN_163_2.5_-40", "TTN_163_2.7_-40" and "TTN_163_3.6_-40".

I need to select the columns whose names contain '2.5', '3.6' or '2.7'.

I also have column names that contain 1.6, 1.62 and 1.656, and I need to select these separately. When I write df_psrr_funct_1V6.filter(regex='1\.6|^xvalues$') I get the columns for 1.6, 1.656 and 1.62 together. I don't want this. How can I select each one uniquely?

I tried df_psrr_funct = df_psrr_funct.filter(regex='2.5'), but it does not capture the first column (xvalues).

Sample dataframe

xvalues  TTN_163_2.5_-40  TTN_163_2.7_-40  TTN_163_3.6_-40
23.0279         -58.7591         -58.5892         -60.0966
30.5284         -58.6903         -57.3153         -59.9111

May I know how to do this?

    If it only has those four columns, then why not df_psrr_funct = df_psrr_funct[["xvalues","TTN_163_2.5_-40"]]? Commented Sep 28, 2022 at 5:42

3 Answers

Extend the regex with | (or). ^ anchors the start of the string and $ the end, so ^xvalues$ matches only the column named exactly xvalues and skips names that merely contain it, such as xvalues 1 or aaa xvalues:

df_psrr_funct = df_psrr_funct.filter(regex=r'2\.5|^xvalues$')
print(df_psrr_funct)
   xvalues  TTN_163_2.5_-40
0  23.0279         -58.7591
1  30.5284         -58.6903

EDIT: If you need to match the exact value between the underscores, use:

print(df_psrr_funct)
   xvalues  TTN_163_1.6_-40  TTN_163_1.62_-40  TTN_163_1.656_-40
0  23.0279         -58.7591          -58.5892           -60.0966
1  30.5284         -58.6903          -57.3153           -59.9111

df_psrr_funct = df_psrr_funct.filter(regex=r'_1\.6_|^xvalues$')
print(df_psrr_funct)
   xvalues  TTN_163_1.6_-40
0  23.0279         -58.7591
1  30.5284         -58.6903
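
The same idea extends to several exact values at once (for example 2.5, 2.7 and 3.6 from the question) by building the alternation from a list. This is only a minimal sketch, assuming every column name follows the TTN_163_<value>_<temp> layout shown in the sample; the helper name select_values and its keep parameter are hypothetical, not part of pandas:

```python
import re

import pandas as pd


def select_values(df, values, keep=("xvalues",)):
    """Keep the columns named in `keep` plus every column whose name
    contains one of `values` as a complete underscore-delimited token."""
    # re.escape turns "1.6" into "1\.6", so the dot is matched literally.
    tokens = "|".join(re.escape(v) for v in values)
    parts = [f"_(?:{tokens})_"] + [f"^{re.escape(k)}$" for k in keep]
    return df.filter(regex="|".join(parts))


df = pd.DataFrame(
    {
        "xvalues": [23.0279, 30.5284],
        "TTN_163_1.6_-40": [-58.7591, -58.6903],
        "TTN_163_1.62_-40": [-58.5892, -57.3153],
        "TTN_163_1.656_-40": [-60.0966, -59.9111],
    }
)

only_1v6 = select_values(df, ["1.6"])
print(list(only_1v6.columns))
```

Because the value must sit between two underscores, "1.6" no longer picks up "1.62" or "1.656". If a value can appear at the very end of a column name, the trailing underscore in the pattern would need adjusting.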

Another approach:

df_psrr_funct.filter(regex=r'^\D+$|2\.5')

   xvalues  TTN_163_2.5_-40
0  23.0279  -58.7591
1  30.5284  -58.6903

4 Comments

Do keep in mind, . is the regex wildcard, so if you want a literal ., you must escape it: \.
I have some column names that contain 1.6, 1.62 and 1.656, and I need to select these separately. When I write df_psrr_funct_1V6.filter(regex='1\.6|^xvalues$') I get the columns for 1.6, 1.656 and 1.62 together. I don't want this. How can I select each one uniquely?
@Hari, can you share the full column names? Without seeing the pattern, it is difficult to write a regex.
I don't think that is possible. I have 829 columns.
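
To make the escaping point from the first comment concrete: DataFrame.filter(regex=...) runs re.search on each column label, so the effect can be checked on plain strings. A small sketch; the column name TTN_163_215_-40 is made up purely for illustration:

```python
import re

# The last name is hypothetical; it is not in the sample dataframe.
columns = ["xvalues", "TTN_163_2.5_-40", "TTN_163_215_-40"]

# Unescaped dot: "." matches any character, so "215" also matches "2.5".
unescaped = [c for c in columns if re.search(r"2.5", c)]

# Escaped dot: only a literal "2.5" matches.
escaped = [c for c in columns if re.search(r"2\.5", c)]

print(unescaped)
print(escaped)
```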

Using regex for this doesn't make any sense... just do:

columns_with_2point5 = [c for c in df.columns if "2.5" in c]
only_cool_cols = df[['xvalues'] + columns_with_2point5]

Don't overcomplicate it...

If you don't need the first column, you can just use filter with like instead of one of the regex solutions (see the first comment from @BeRT2me).

1 Comment

That list comprehension is essentially equivalent to df.filter(like='2.5')
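
Plain substring matching, whether through the list comprehension or like=, runs into the limitation raised in the question: "1.6" is also a substring of "1.62" and "1.656". One regex-free way around it is to compare whole underscore-separated tokens. A rough sketch, assuming the TTN_163_<value>_<temp> naming from the sample; the variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "xvalues": [23.0279, 30.5284],
        "TTN_163_1.6_-40": [-58.7591, -58.6903],
        "TTN_163_1.62_-40": [-58.5892, -57.3153],
        "TTN_163_1.656_-40": [-60.0966, -59.9111],
    }
)

wanted = "1.6"

# Substring test: also catches 1.62 and 1.656.
substring_cols = [c for c in df.columns if wanted in c]

# Token test: the value must equal one full piece of the split name.
exact_cols = [c for c in df.columns if wanted in c.split("_")]

only_1v6 = df[["xvalues"] + exact_cols]
print(list(only_1v6.columns))
```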
