Is there a way to find a string in a DataFrame and return the names of the columns where it matches?

In the example below, I am trying to find, for each row, the columns where "SRC" appears. I'm not sure if I am close, but my attempt returns the same column names for every row instead of only the relevant ones. I'm sure I am doing something silly.

import pandas as pd

df = pd.DataFrame({'col1':['foo SRC','bar','baz'], 'col2':['foo','bar','baz'],'col3':['SRC','bar','SRC'],
                  'col4':['SRC','SRC','SRC']})

df['col_list']= '/'.join(df.apply(lambda x : x.str.contains('SRC')).any().loc[lambda x : x].index)


Actual Result:

col1    |col2   |col3   |col4   |col_list
--------|-------|-------|-------|----------------
foo SRC |foo    |SRC    |SRC    |col1/col3/col4
bar     |bar    |bar    |SRC    |col1/col3/col4
baz     |baz    |SRC    |SRC    |col1/col3/col4

Expected:

col1    |col2   |col3   |col4   |col_list
--------|-------|-------|-------|----------------
foo SRC |foo    |SRC    |SRC    |col1/col3/col4
bar     |bar    |bar    |SRC    |col4
baz     |baz    |SRC    |SRC    |col3/col4 

1 Answer

Use applymap with df.dot() (in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map):

df['col_list']=df.applymap(lambda x: 'SRC' in x).dot(df.columns + '/').str[:-1]
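To see why the dot product yields column names, it may help to break the one-liner into steps. The following is only a sketch: the names `mask`, `labels`, and `col_list` are made up for illustration, and `Series.map` is used per column as a stand-in for `applymap`, on the assumption that every cell is a string.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['foo SRC', 'bar', 'baz'],
    'col2': ['foo', 'bar', 'baz'],
    'col3': ['SRC', 'bar', 'SRC'],
    'col4': ['SRC', 'SRC', 'SRC'],
})

# Step 1: element-wise boolean mask, True where the cell contains 'SRC'.
# (Assumes all cells are strings; `'SRC' in x` would fail on NaN.)
mask = df.apply(lambda col: col.map(lambda x: 'SRC' in x))

# Step 2: column labels with a trailing separator: 'col1/', 'col2/', ...
labels = df.columns + '/'

# Step 3: the dot product computes, per row, sum(flag * label).
# True * 'col1/' is 'col1/' and False * 'col1/' is '', so the sum
# concatenates only the labels of the matching columns.
joined = mask.dot(labels)

# Step 4: drop the trailing '/' left over from the last label.
col_list = joined.str[:-1]

print(col_list)
```

Because the mask is computed per cell and never reduced across rows, each row gets its own list of matching columns.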

Or apply with series.str.contains() and df.dot:

df['col_list']=df.apply(lambda x: 
                  x.str.contains('SRC',na=False)).dot(df.columns + '/').str[:-1]
print(df)

      col1 col2 col3 col4        col_list
0  foo SRC  foo  SRC  SRC  col1/col3/col4
1      bar  bar  bar  SRC            col4
2      baz  baz  SRC  SRC       col3/col4
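For comparison, the attempt in the question calls `.any()` on the mask, which reduces over the rows and leaves one boolean per column; the joined string is then a single scalar that gets broadcast to every row. The sketch below shows that, plus a row-wise `apply` alternative that avoids the dot product. The variable names here (`per_column`, `single`, `row_wise`) are hypothetical, not from the original post.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['foo SRC', 'bar', 'baz'],
    'col2': ['foo', 'bar', 'baz'],
    'col3': ['SRC', 'bar', 'SRC'],
    'col4': ['SRC', 'SRC', 'SRC'],
})

# Boolean mask: True where the cell contains 'SRC'.
mask = df.apply(lambda col: col.str.contains('SRC', na=False))

# What the question's code does: .any() collapses the rows, so the
# result says "does this column contain SRC anywhere?" ...
per_column = mask.any()

# ... and the join produces one string for the whole frame, which
# assignment to a column then repeats on every row.
single = '/'.join(per_column.loc[lambda s: s].index)

# Row-wise alternative: for each row, keep the column labels whose
# flag is True and join them.
row_wise = mask.apply(
    lambda row: '/'.join(mask.columns[row.to_numpy()]), axis=1
)

print(single)
print(row_wise)
```

The row-wise version is easier to read but calls a Python function once per row, so on large frames the dot-product form is likely to be faster.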

2 Comments

Thanks! That was fast :)
always nice to see dot solution :) +1
